[src] Rework error logging for safety and cleanliness #3064

kkm000 · 2019-03-02T11:02:38Z

Kaldi error logging was giving me fits in i-vector training, when ivector-extractor-sum-accs spawned multiple i-e-acc-stats processes. Training aborted randomly, and all I was getting in the logs was a flurry of std::runtime_errors with an empty message from the child processes (they all died together, apparently). At this point I came back to the logging code, and, while it was not exactly clear how I was getting the message, it was easy enough to figure out where it came from: the logging collector destructor. In other cases, I was getting double-logging, which was also kind of understandable: FatalMessageLogger calls HandleMessage then throws, then some magic in the catch clause allowed the destruction to complete, and its base class MessageLogger invoked logging again.

I feel responsible for messing up this code when I introduced the logging-function hook-up, along with some bugs that Karel @vesis84 later had to fix, for which I'm really grateful. I decided to clean up the code significantly and avoid throwing in the destructor entirely, as the behavior wrt the base class destructor is undefined, as far as I understand, and in fact proved compiler-dependent.

This is not a small changeset, so let me summarize it part-by-part. If any of these is not clear, each is detailed in a subsection below (some with additional links to M&M in separate gists).

The code has been compiled and tested cleanly with: gcc6, gcc7, gcc8, clang6, clang7, msvc19. A stand-alone test is available in a gist.

Short summary

No more throwing destructors. A separate class is tasked with it.
The result of logging expression e. g. KALDI_LOG << x is void.
A distinct exception of type KaldiFatalError is thrown, not the runtime_error.
All uses of the endl manipulator in log << pipes removed from codebase.
Minor fixes to stack-tracing, to prevent empty lines on demangling errors.
Commentary on KALDI_ASSERT branch prediction removed.

1. Throwing destructors are ugly

Hope this statement is not contentious. Also, the logic there, relying on std::uncaught_exception(), was unsound at best. You do not have to subscribe to Herb Sutter's view that the function is entirely useless as defined (C++20 improves it, FWIW), but I found his arguments quite compelling. So I studied boost::log, gLog and some other logging code to find a better solution.

The idea is to throw the exception from some operator on a special class, not from the destructor itself. This has the benefit of making the value of a logging expression (e. g., auto useless = KALDI_LOG << x), well, useless. This is similar to what gLog does to make such an expression void. I found their use of operator&() too confusing for the Kaldi codebase; operator=() works just as well and is less confusing to read. Two special small classes expose it: Log() = MessageLogger(....) << 42; just logs the message, and LogAndThrow() = .... also throws. The latter is [[noreturn]]; this one does not cause confusion between compilers. It was quite unsurprising that compilers were confused by a [[noreturn]] destructor; I would hardly expect this in their battery of tests.
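To make the shape of this concrete, here is a minimal stand-alone sketch of the operator=() idea. It is not the code from this PR: the `sketch` namespace, the `Emit()` helper and the `SKETCH_LOG` / `SKETCH_ERR` macros are names made up for illustration, and the sketch throws a plain std::runtime_error.

```cpp
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>

namespace sketch {

// Collects the message text. Its destructor does nothing special.
class MessageLogger {
 public:
  template <typename T>
  MessageLogger &operator<<(const T &value) {
    stream_ << value;
    return *this;
  }
  std::string str() const { return stream_.str(); }

 private:
  std::ostringstream stream_;
};

// Hypothetical stand-in for the real output path (stderr or a custom handler).
inline void Emit(const std::string &text) { std::cerr << text << '\n'; }

// Assigning a finished logger to Log just emits it. The result is void,
// so the whole logging expression cannot be used as a value.
struct Log {
  void operator=(const MessageLogger &logger) { Emit(logger.str()); }
};

// Assigning to LogAndThrow emits the message, then throws from an ordinary
// member function rather than from a destructor.
struct LogAndThrow {
  [[noreturn]] void operator=(const MessageLogger &logger) {
    Emit(logger.str());
    throw std::runtime_error(logger.str());
  }
};

}  // namespace sketch

// operator<< binds tighter than operator=, so the message is fully built
// before it is handed to Log or LogAndThrow.
#define SKETCH_LOG ::sketch::Log() = ::sketch::MessageLogger()
#define SKETCH_ERR ::sketch::LogAndThrow() = ::sketch::MessageLogger()
```

With these definitions, `SKETCH_LOG << "x = " << 42;` logs and continues, `SKETCH_ERR << "bad";` logs and throws, and `auto useless = SKETCH_LOG << 42;` should be rejected by the compiler because the expression has type void.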

2. `KALDI_LOG << x` is void

It makes no sense to use it as a value any more, as logging is done by the operator=() mentioned above. The return value of the operator is void. Logging libraries using the << inserter pipelining pattern commonly do that to prevent a "creative use" of the logging macros.

3. `KALDI_ERROR` throws `KaldiFatalError`

This exception class is derived from std::runtime_error, so current catch() handlers are compatible with it, as long as the object is caught by const&, which should normally be the only right way to catch exceptions. This potentially solves the IsKaldiError() problem previously addressed by Karel (see a4fff0d) in case an error source must be distinguished. I believe it's a good thing in general to have a dedicated exception class for Kaldi.
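The compatibility claim can be illustrated with a small stand-alone sketch. The names here (`sketch::FatalError`, `ClassifyLegacy`, `ClassifyNew`) are hypothetical and only mimic the relationship between the new exception type and std::runtime_error; unlike the class in this PR, the sketch leaves what() untouched.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical dedicated exception type, derived from std::runtime_error.
class FatalError : public std::runtime_error {
 public:
  explicit FatalError(const std::string &message)
      : std::runtime_error(message) {}
};

// An existing style of handler that only knows about std::runtime_error.
// It keeps working, because FatalError is-a std::runtime_error.
inline std::string ClassifyLegacy(void (*action)()) {
  try {
    action();
  } catch (const std::runtime_error &e) {
    return std::string("runtime_error: ") + e.what();
  }
  return "ok";
}

// A newer handler can tell the dedicated type apart from other runtime
// errors, which is what a helper like IsKaldiError() was working around.
inline std::string ClassifyNew(void (*action)()) {
  try {
    action();
  } catch (const FatalError &) {
    return "fatal";
  } catch (const std::runtime_error &) {
    return "other";
  }
  return "ok";
}

inline void ThrowFatal() { throw FatalError("bad input"); }
inline void ThrowOther() { throw std::runtime_error("library failure"); }

}  // namespace sketch
```

The more specific catch clause has to come first; otherwise the std::runtime_error handler would swallow the derived type as well.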

NEED FEEDBACK PLEASE: I do not like these two members of this class:

class KaldiFatalError : public std::runtime_error {
 public:
  //TODO(kkm): Temporary(?) hack. Do we really need it? I think a better
  // approach to avoid double-logging the message is catch the KaldiFatalError
  // in binaries explicitly (and print nothing).
  virtual const char *what() const noexcept override { return ""; }

  const char *KaldiMessage() const { return std::runtime_error::what(); }
};

Here, what() returns an empty string, to avoid double-logging, while there is a member to access a Kaldi-specific error message in case it is requested. This is an abuse of the semantics of the standard what() member. What I would actually find cleaner is the use of one of these patterns in each bin:

int main(int argc, char *argv[]) {
  try {
    /* ... */
  } catch(const KaldiFatalError&) {
    return -1;  // Error has been logged already.
  }
}

which just exits with a failure exit code because the error has already been logged, or

int main(int argc, char *argv[]) {
  try {
    /* ... */
  } catch(const KaldiFatalError&) {
    return -1;  // Error has been logged already.
  } catch(const std::exception &e) {
    std::cerr << "FATAL: " << e.what() << "\n";
    return -1;
  }
}

which also prints other standard library exceptions to stderr in case of a library error. Personally, I'd prefer the former, as it would puke a coredump on an unhandled exception, while the latter would lose the stack trace of the error, and is quite uninformative (what's the point of e.g. "FATAL: )

So what is the best approach, in your opinion?

Leave bins alone, keep the hack.
Update them all to the former pattern (one catch clause).
Update them all to the latter pattern (two catch clauses).

4. Most uses of `std::endl` in Kaldi codebase are incorrect

The manipulator does not equal a '\n'; it is a '\n' followed by flush()!
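The difference is easy to observe with a stream buffer that counts flush requests. This is a stand-alone sketch; `CountingBuf` and the two helper functions are made-up names, not anything from Kaldi.

```cpp
#include <ostream>
#include <sstream>

namespace sketch {

// A string buffer that counts how often the stream asks it to flush.
class CountingBuf : public std::stringbuf {
 public:
  int sync_calls = 0;

 protected:
  int sync() override {
    ++sync_calls;
    return 0;  // Report success; there is nothing to write out here.
  }
};

// Writes `lines` lines terminated by '\n' and returns the number of flushes.
inline int FlushesWithNewline(int lines) {
  CountingBuf buf;
  std::ostream out(&buf);
  for (int i = 0; i < lines; ++i) out << "line" << '\n';
  return buf.sync_calls;
}

// Same output, but each line is terminated by std::endl.
inline int FlushesWithEndl(int lines) {
  CountingBuf buf;
  std::ostream out(&buf);
  for (int i = 0; i < lines; ++i) out << "line" << std::endl;
  return buf.sync_calls;
}

}  // namespace sketch
```

On a plain std::ostream without unitbuf set, the '\n' version should request no flushes at all, while the std::endl version requests one per line. For a file stream each of those is a write to the OS.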

While an overload of operator << on the MessageLogger would have allowed it, I decided to bite the bullet and remove it (or replace with a '\n' if not at EOL) everywhere it is used with logging macros. This allowed me to get rid of the code that removed trailing linefeeds in the logger as well.

But generally speaking, I see many incorrect uses of this manipulator in the codebase. Here's a very relevant note from the comments in the libstdc++ ostream header:

This manipulator is often mistakenly used when a simple newline is desired, leading to poor buffering performance. See https://gcc.gnu.org/onlinedocs/libstdc++/manual/streambufs.html#io.streambuf.buffering for more on this subject.

I sanitized its use only in those files that I had to edit anyway. The only exception is a baffling loop starting at onlinebin/online-audio-client.cc:175, which seems to output one empty line to stdout for each line of header read from the server, except for lines starting with "PARTIAL:", for which the rest of the line is printed and flushed explicitly (twice, as it currently is); I do not understand how it is used, so I left the flushing version intact.

There are places where endl is used when actual files are being written; I changed two instances in the same onlinebin/online-audio-client.cc where the file was being flushed to disk on every write. There are more cases like this; they should probably be taken care of eventually.

It is also important that std::cerr is set to auto-flush after every insertion << by default; per N3337 §27.4.2, 5:

After the object cerr is initialized, cerr.flags() & unitbuf is nonzero [...]
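This is simple to check at runtime. The helper name below is made up for the sketch; the only library facts it relies on are std::ios_base::flags() and the unitbuf flag.

```cpp
#include <ios>
#include <iostream>
#include <sstream>

namespace sketch {

// True if the stream flushes its buffer after every output operation.
inline bool FlushesAfterEveryInsertion(const std::ostream &os) {
  return (os.flags() & std::ios_base::unitbuf) != 0;
}

}  // namespace sketch
```

For std::cerr this should return true, and for a freshly constructed std::ostringstream false, so an extra flush after writing to std::cerr buys nothing.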

5. Stack trace could contain empty strings on demangling failure

It actually did without -rdynamic with gcc (but not clang). I massaged the code a bit to print the original reported string from the stack trace in such a case.
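The fallback can be sketched at the level of a single symbol name. This is not the stack-tracing code from the PR, which has to parse whole backtrace lines first; `DemangleOrOriginal` is a hypothetical helper, and it assumes a GCC- or Clang-style toolchain that provides abi::__cxa_demangle in <cxxabi.h>.

```cpp
#include <cstdlib>
#include <cxxabi.h>  // GCC/Clang-specific: abi::__cxa_demangle
#include <string>

namespace sketch {

// Returns the demangled form of `symbol`, or `symbol` itself unchanged if
// demangling fails, so the caller never ends up printing an empty string.
inline std::string DemangleOrOriginal(const char *symbol) {
  int status = 0;
  char *demangled = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
  if (status != 0 || demangled == nullptr) {
    std::free(demangled);  // free(nullptr) is a no-op.
    return symbol;
  }
  std::string result(demangled);
  std::free(demangled);  // The buffer was malloc()ed by __cxa_demangle.
  return result;
}

}  // namespace sketch
```

For example, the mangled name `_Z3foov` comes back as `foo()`, while a string that is not a valid mangled name is returned as-is.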

6. Commentary on `KALDI_ASSERT` branch prediction was dubious

I have long been surprised by the statement that compilers prefer the if-true branch rather than the else branch. It always bothered me, as modern compilers are dealing with a very thoroughly cooked internal representation by the time the code reaches the optimizer. Doing this rework, I decided to spend some time checking whether that is still the case these days.

TL;DR: absolutely not, with no exceptions, for all 4 ways of writing an ASSERT-like statement that I tried, and all compilers I tested: gcc, clang, icc, msvc, multiple versions each where available. Full exposure is available in this gist, if you are interested in the details of the test.

In the end, I removed the comment, as it has become rather misleading.

It is also notable that [[noreturn]] seems to be a strong optimization hint, suggesting to the compiler that the code calling that function should be put on the conditional jump branch. All compilers/versions tested do that with -O2, and most but not all with -O1. Again, specific details are at the link above. It was a fascinating exercise, really.
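For readers who want to poke at this themselves, an ASSERT-like macro built on a [[noreturn]] failure function can look roughly like the following. This is only a sketch: `SKETCH_ASSERT`, `AssertFailure` and `CheckedDivide` are hypothetical names, and nothing here says how any particular compiler will lay out the branches.

```cpp
#include <cstdio>
#include <cstdlib>

namespace sketch {

// Never returns, and says so. Per the observations above, that attribute
// alone appears to tell optimizers that the call sits on the cold path.
[[noreturn]] inline void AssertFailure(const char *cond, const char *file,
                                       int line) {
  std::fprintf(stderr, "Assertion failed: (%s) at %s:%d\n", cond, file, line);
  std::fflush(nullptr);  // Flush all pending buffers before aborting.
  std::abort();
}

}  // namespace sketch

// Which of the two branches holds the failure call should not matter.
#define SKETCH_ASSERT(cond)                                      \
  do {                                                           \
    if (cond)                                                    \
      (void)0;                                                   \
    else                                                         \
      ::sketch::AssertFailure(#cond, __FILE__, __LINE__);        \
  } while (0)

namespace sketch {

// Small example of a function guarded by the assertion.
inline int CheckedDivide(int a, int b) {
  SKETCH_ASSERT(b != 0);
  return a / b;
}

}  // namespace sketch
```

Compiling something like `CheckedDivide` with -O2 and looking at the disassembly is enough to see where the call to the failure function ends up.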

danpovey · 2019-03-02T17:48:46Z

src/base/kaldi-error.h

+  explicit KaldiFatalError(const char *message)
+      : std::runtime_error(message) { }
+
+  //TODO(kkm): Temporary(?) hack. Do we really need it? I think a better


if this is going to be permanent, better to just explain the purpose,

Sure thing, I want to remove the TODO and finalize the interface before this merge. We need to decide how this will be used. My complete question is under the "NEED FEEDBACK PLEASE" heading in the description. Very briefly, I do not like the current approach, which is: log the error, then throw runtime_error(""). My understanding is that this empty string is there to avoid double-printing of the already logged message, because all bins do cerr << e.what();. The two member functions of KaldiFatalError exist solely to continue to support this hack: what() always returns "", but the additional member KaldiMessage() is available (just in case) to get the error message.

What I would like to do is change the catch clause in all binaries to catch (const KaldiFatalError&) { return -1; }, essentially. In this case, the what() hack becomes unnecessary, and the KaldiMessage() accessor won't be needed (so no non-inherited members in the class KaldiFatalError). It's a non-zero amount of work, naturally, and I wanted to run it by you first. As for the form of the catch clause wrt catching non-Kaldi std:: exceptions, see the full question for the two possibilities.

Or maybe I am not understanding the reason for the empty what() message at all?

I make use of catching exceptions with a custom logger and thus don't assume the message has gone to the output. Does this affect your thoughts on your "NEED FEEDBACK PLEASE" design?

I also do, although I know I should not; Kaldi has never been designed for that. At some points it leaks memory, for one.

I am not changing any existing behavior. The custom logger (or the default one, no matter) is called before throwing, and always has been. And if it's called with kError, you know it's going to throw. If it's called with kAssertFailed, Kaldi will abort, but you can throw your own exception at this point and bypass the abort. Whether to log or not is your choice in the logging function. Of course, having caught the exception, you should assume that all the state has been corrupted, memory leaked and gremlins are out to get you.
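The order of events described here can be mocked up in a few lines. This is a stand-alone model, not Kaldi's actual handler interface: the `Severity` enum, `CurrentHandler()`, `ReportError()`, `ReportAssertFailure()` and both handlers are invented for the sketch.

```cpp
#include <cstdlib>
#include <stdexcept>
#include <string>

namespace sketch {

enum class Severity { kInfo, kError, kAssertFailed };

using LogHandler = void (*)(Severity severity, const std::string &message);

// Hypothetical global hook; nullptr means "no custom handler installed".
inline LogHandler &CurrentHandler() {
  static LogHandler handler = nullptr;
  return handler;
}

// Error path: the handler runs first, and only then does the library throw.
[[noreturn]] inline void ReportError(const std::string &message) {
  if (CurrentHandler() != nullptr) CurrentHandler()(Severity::kError, message);
  throw std::runtime_error(message);
}

// Assertion path: the handler runs first, then the process aborts,
// unless the handler itself throws and thereby bypasses the abort.
[[noreturn]] inline void ReportAssertFailure(const std::string &message) {
  if (CurrentHandler() != nullptr)
    CurrentHandler()(Severity::kAssertFailed, message);
  std::abort();
}

// Last message seen by either example handler.
inline std::string &LastMessage() {
  static std::string last;
  return last;
}

// Example handler 1: record the message instead of printing it.
inline void RecordingHandler(Severity, const std::string &message) {
  LastMessage() = message;
}

// Example handler 2: additionally turn failed assertions into exceptions.
struct AssertionBypassed : std::logic_error {
  using std::logic_error::logic_error;
};

inline void ThrowOnAssertHandler(Severity severity,
                                 const std::string &message) {
  LastMessage() = message;
  if (severity == Severity::kAssertFailed) throw AssertionBypassed(message);
}

}  // namespace sketch
```

The second handler is the "throw around the abort" trick. As said above, after catching such an exception the program state should be treated as suspect.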

danpovey · 2019-03-02T17:54:44Z

Thanks for this. It seems reasonable to me; and thanks for testing on multiple compilers.
I'm thinking, if no-one objects, to just merge it and we'll deal with any problems from weird compilers (if any), as they come up.

danpovey · 2019-03-03T01:52:47Z

I think you are right about the reason for the empty what() message. I think it was added in a previous refactoring. I don't like the idea of catching the kaldi-fatal thing separately for all binaries, it would be a lot of work and too much boilerplate; I think it would be easiest just to make what() be "" or something short and standard like the class name, in this case.


kkm000 · 2019-03-03T02:10:38Z

it would be easiest just to make what() be "" or something short and standard like the class name, in this case.

SGTM! Yes, most std::exception derivatives just return their class name (e. g., "std::bad_alloc"); I will do the same, and leave KaldiMessage() for those library uses where it might be helpful.
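Roughly, the shape agreed on here could look like the sketch below. It is an illustration under assumed names (`sketch::FatalError`, `Message()`), not the class that ended up in kaldi-error.h.

```cpp
#include <stdexcept>
#include <string>

namespace sketch {

class FatalError : public std::runtime_error {
 public:
  explicit FatalError(const std::string &message)
      : std::runtime_error(message) {}

  // A short, fixed string: a generic handler that prints what() then does
  // not repeat a message that has already been logged.
  const char *what() const noexcept override { return "sketch::FatalError"; }

  // The full message, for callers that explicitly ask for it.
  const char *Message() const noexcept { return std::runtime_error::what(); }
};

}  // namespace sketch
```

A binary that catches `const std::exception &` and prints what() gets just the class name, while library users that catch the dedicated type can still reach the original text through the accessor.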

* Do not throw exceptions from the destructor in the logging code.
* Voidify the value of logging constructs.
* Add `class KaldiFatalError : public std::runtime_error`, and throw it from KALDI_ERROR instead of the unspecific `std::runtime_error`.
* Remove all uses of `std::endl` from logging code, and disallow it for the future (won't compile).
* Improve the handling of decode failures in stack tracing code.
* Clean-up obsolete commentary.
* Hide global `g_program_name` into kaldi-error.cc unit scope.

* Replace 'throw runtime_error' where KALDI_ERR was intended (everywhere except fsext/, basically).
* Catch KaldiFatalError instead of runtime_error in tree/build-tree-utils.cc, and add a sensible message.
* Document throwing KaldiFatalError where runtime_error was previously mentioned in comments.

kkm000 · 2019-03-03T10:15:27Z

Done, the exception class is documented. I also added doxygen comments for the public logging interfaces, and un-doxygened the MessageLogger class, as it is technically an implementation detail.

I also added a second commit, replacing a few throw runtime_error with KALDI_ERR, as that was intended (except in fsext/, naturally), and changed the thrown exception class name in all comments where runtime_exception was said to be thrown.

danpovey · 2019-03-03T20:22:25Z

Thanks. What kind of testing have you done, that the output looks similar to the current output?


btiplitz · 2019-03-04T13:28:39Z

src/base/kaldi-error.cc

+    MessageLogger (LogMessageEnvelope::kAssertFailed, func, file, line)
+      << "Assertion failed: (" << cond_str << ")";
+  fflush(NULL);  // Flush all pending buffers, abort() may not flush stderr.
+  std::abort();


within the c++ construct, is it better to rethrow a fatal exception vs a direct call to abort?

the abort breaks programs I have that catch errors and continue on the next task

This function only handles KALDI_ASSERT. It's reasonable to die at this point, because the macro says "it's impossible that the condition is false." You can throw around this abort in your logging function.

ok, I see that. Thanks

btiplitz · 2019-03-04T16:09:24Z

src/base/kaldi-error.cc

+// Trim filename to at most 1 trailing directory long. Given a filename like
+// "/a/b/c/d/e/f.cc", return "e/f.cc". Support both '/' and '\' as the path
+// separator.
+static const char *GetShortFileName(const char *path) {


Should this be an inline function ?

I am not following. Why?

never mind. Not that important. I was thinking of performance, but it should not get called that much

Ah, got it! Compilers take care of this these days. Look how the register keyword has gone from the language.

btiplitz · 2019-03-04T17:51:00Z

src/base/kaldi-error.h

-public:
-  /// Constructor stores the info,
+ public:
+  /// The constructor stores the message's "envelope", a set of data which


Since you cleaned up the destructor output, it would seem to make sense to make this a static function as to not re-construct the class on every log message

I'm sorry, I am a little bit dense today. The thing is constructed on the stack, and takes whopping 16 bytes of it. Why should I store a 16-byte buffer in TLS? Or am I totally missing your idea?

it's not a bunch of memory. But it seems like a waste to construct the class and destruct it. Wasted cpu time

Ah, I thought you were talking about the envelope, and you are talking about the whole MessageLogger thing. The class just contains an ostringstream. The logger itself lives on the stack, together with the stream (which in turn just encapsulates an std::string in the end). The stream allocates memory, of course. But we still need to log to something, so this is unavoidable.

It could have been more efficient if we did not pull the string from this logger to append to another stream. But compared to the other computational stuff which is going on, it is hardly worth an optimization. Verbose logging is for debugging, and normal logging is pretty insignificant.

if (GetVerboseLevel() >= 3) {
00007FF614FFA88D xor ebx,ebx
00007FF614FFA88F lea r12,[string "kaldi::nnet3::Optimize" (07FF61524E4A0h)]
00007FF614FFA896 lea r13,[string "ext\\kaldi\\@"... (07FF61524E6A0h)]
00007FF614FFA89D cmp dword ptr [kaldi::g_kaldi_verbose_level (07FF61531D448h)],3
00007FF614FFA8A4 jl kaldi::nnet3::Optimize+0E2h (07FF614FFA932h)
CheckComputation(nnet, *computation, true);
00007FF614FFA8AA mov r8b,1
00007FF614FFA8AD mov rdx,r9
00007FF614FFA8B0 mov rcx,r15
00007FF614FFA8B3 call kaldi::nnet3::CheckComputation (07FF614F88DA0h)
KALDI_LOG << "Before optimization, max memory use (bytes) = "
00007FF614FFA8B8 lea rcx,[rbp-50h]
00007FF614FFA8BC call std::basic_ostringstream<char,std::char_traits<char>,std::allocator<char> >::basic_ostringstream<char,std::char_traits<char>,std::allocator<char> > (07FF614E29680h)
00007FF614FFA8C1 mov dword ptr [rbp-70h],ebx
00007FF614FFA8C4 mov qword ptr [rbp-68h],r12
00007FF614FFA8C8 lea edx,[rbx+2Fh]
00007FF614FFA8CB mov rcx,r13

So this is all it boils down to. An ostringstream constructor call.

btiplitz · 2019-03-04T17:52:18Z

I get many compilation warnings saying not to throw in a destructor, so updating the code to solve this seems like a good idea

kkm000 · 2019-03-04T19:56:40Z

@danpovey

What kind of testing have you done, that the output looks similar to the current output?

Besides compiling and running the code snippet in the bash loop, I checked the output of kaldi-error-test and, rather surprisingly, nnetbin/cuda-gpu-available, because its output is formatted in an interesting way.

kkm000 · 2019-03-08T03:41:00Z

@danpovey, do you want me to ping somebody else to ask to review this (Karel, Yenda)? Or do you just need more time? I just want to make sure this does not fall off everyone's radar. I nearly forgot about my own MR... : ) To me, it should be good to go; I checked it very thoroughly, as I do with all base stuff.

I have a few little fixes that are based on top of this.

(kaldi-asr#2996) * [egs] Add scripts for yomdle Russian (OCR task) (kaldi-asr#2953) * [egs] Simplify lexicon preparation in Fisher callhome Spanish (kaldi-asr#2999) * [egs] Update GALE Arabic recipe (kaldi-asr#2934) * [egs] Remove outdated NN results from Gale Arabic recipe (kaldi-asr#3002) * [egs] Add RESULTS file for the tedlium s5_r3 (release 3) setup (kaldi-asr#3003) * [src] Fixes to grammar-fst code to handle LM-disambig symbols properly (kaldi-asr#3000) thanks: armando.muscariello@gmail.com * [src] Cosmetic change to mel computation (fix option string) (kaldi-asr#3011) * [src] Fix Visual Studio error due to alternate syntactic form of noreturn (kaldi-asr#3018) * [egs] Fix location of sequitur installation (kaldi-asr#3017) * [src] Fix w/ ifdef Visual Studio error from alternate syntactic form noreturn (kaldi-asr#3020) * [egs] Some fixes to getting data in heroico recipe (kaldi-asr#3021) * [egs] BABEL script fix: avoid make_L_align.sh generating invalid files (kaldi-asr#3022) * [src] Fix to older online decoding code in online/ (OnlineFeInput; was broken by commit cc2469e). (kaldi-asr#3025) * [script] Fix unset bash variable in make_mfcc.sh (kaldi-asr#3030) * [scripts] Extend limit_num_gpus.sh to support --num-gpus 0. (kaldi-asr#3027) * [scripts] fix bug in utils/add_lex_disambig.pl when sil-probs and pron-probs used (kaldi-asr#3033) bug would likely have resulted in determinization failure (only when not using word-position-dependent phones). 
* [egs] Fix path in Tedlium r3 rnnlm training script (kaldi-asr#3039) * [src] Thread-safety for GrammarFst (thx:armando.muscariello@gmail.com) (kaldi-asr#3040) * [scripts] Cosmetic fix to get_degs.sh (kaldi-asr#3045) * [egs] Small bug fixes for IAM and UW3 recipes (kaldi-asr#3048) * [scripts] Nnet3 segmentation: fix default params (kaldi-asr#3051) * [scripts] Allow perturb_data_dir_speed.sh to work with utt2lang (kaldi-asr#3055) * [scripts] Make beam in monophone training configurable (kaldi-asr#3057) * [scripts] Allow reverberate_data_dir.py to support unicode filenames (kaldi-asr#3060) * [scripts] Make some cleanup scripts work with python3 (kaldi-asr#3054) * [scripts] bug fix to nnet2->3 conversion, fixes kaldi-asr#886 (kaldi-asr#3071) * [src] Make copies occur in per-thread default stream (for GPUs) (kaldi-asr#3068) * [src] Add GPU version of MergeTaskOutput().. relates to batch decoding (kaldi-asr#3067) * [src] Add device options to enable tensor core math mode. (kaldi-asr#3066) * [src] Log nnet3 computation to VLOG, not std::cout (kaldi-asr#3072) * [src] Allow upsampling in compute-mfcc-feats, etc. (kaldi-asr#3014) * [src] fix problem with rand_r being undefined on Android (kaldi-asr#3037) * [egs] Update swbd1_map_words.pl, fix them_1's -> them's (kaldi-asr#3052) * [src] Add const overload OnlineNnet2FeaturePipeline::IvectorFeature (kaldi-asr#3073) * [src] Fix syntax error in egs/bn_music_speech/v1/local/make_musan.py (kaldi-asr#3074) * [src] Memory optimization for online feature extraction of long recordings (kaldi-asr#3038) * [build] fixed a bug in linux_configure_redhat_fat when use_cuda=no (kaldi-asr#3075) * [scripts] Add missing '. 
./path.sh' to get_utt2num_frames.sh (kaldi-asr#3076) * [src,scripts,egs] Add count-based biphone tree tying for flat-start chain training (kaldi-asr#3007) * [scripts,egs] Remove sed from various scripts (avoid compatibility problems) (kaldi-asr#2981) * [src] Rework error logging for safety and cleanliness (kaldi-asr#3064) * [src] Change warp-synchronous to cub::BlockReduce (safer but slower) (kaldi-asr#3080) * [src] Fix && and || uses where & and | intended, and other weird errors (kaldi-asr#3087) * [build] Some fixes to Makefiles (kaldi-asr#3088) clang is unhappy with '-rdynamic' in compile-only step, and the switch is really unnecessary. Also, the default location for MKL 64-bit libraries is intel64/. The em64t/ was explained already obsolete by an Intel rep in 2010: https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/285973 * [src] Fixed -Wreordered warnings in feat (kaldi-asr#3090) * [egs] Replace bc with perl -e (kaldi-asr#3093) * [scripts] Fix python3 compatibility issue in data-perturbing script (kaldi-asr#3084) * [doc] fix some typos in doc. (kaldi-asr#3097) * [build] Make sure expf() speed probe times sensibly (kaldi-asr#3089) * [scripts] Make sure merge_targets.py works in python3 (kaldi-asr#3094) * [src] ifdef to fix compilation failure on CUDA 8 and earlier (kaldi-asr#3103) * [doc] fix typos and broken links in doc. 
(kaldi-asr#3102) * [scripts] Fix frame_shift bug in egs/swbd/s5c/local/score_sclite_conf.sh (kaldi-asr#3104) * [src] Fix wrong assertion failure in nnet3-am-compute (kaldi-asr#3106) * [src] Cosmetic changes to natural-gradient code (kaldi-asr#3108) * [src,scripts] Python2 compatibility fixes and code cleanup for nnet1 (kaldi-asr#3113) * [doc] Small documentation fixes; update on Kaldi history (kaldi-asr#3031) * [src] Various mostly-cosmetic changes (copying from another branch) (kaldi-asr#3109) * [scripts] Simplify text encoding in RNNLM scripts (now only support utf-8) (kaldi-asr#3065) * [egs] Add "formosa_speech" recipe (Taiwanese Mandarin ASR) (kaldi-asr#2474) * [egs] python3 compatibility in csj example script (kaldi-asr#3123) * [egs] python3 compatibility in example scripts (kaldi-asr#3126) * [scripts] Bug-fix for removing deleted words (kaldi-asr#3116) The type of --max-deleted-words-kept-when-merging in segment_ctm_edits.py was a string, which prevented the mechanism from working altogether. * [scripts] Add fix regarding num-jobs for segment_long_utterances*.sh(kaldi-asr#3130) * [src] Enable allow_{upsample,downsample} with online features (kaldi-asr#3139) * [src] Fix bad assert in fstmakecontextsyms (kaldi-asr#3142) * [src] Fix to "Fixes to grammar-fst & LM-disambig symbols" (kaldi-asr#3000) (kaldi-asr#3143) * [build] Make sure PaUtils exported from portaudio (kaldi-asr#3144) * [src] cudamatrix: fixing a synchronization bug in 'normalize-per-row' (kaldi-asr#3145) was only apparent using large matrices * [src] Fix typo in comment (kaldi-asr#3147) * [src] Add binary that functions as a TCP server (kaldi-asr#2938) * [scripts] Fix bug in comment (kaldi-asr#3152) * [scripts] Fix bug in steps/segmentation/ali_to_targets.sh (kaldi-asr#3155) * [scripts] Avoid holding out more data than the requested num-utts (due to utt2uniq) (kaldi-asr#3141) * [src,scripts] Add support for two-pass agglomerative clustering. 
(kaldi-asr#3058) * [src] Disable unget warning in PeekToken (and other small fix) (kaldi-asr#3163) * [build] Add new nvidia tools to windows build (kaldi-asr#3159) * [doc] Fix documentation errors and add more docs for tcp-server decoder (kaldi-asr#3164)

danpovey reviewed Mar 2, 2019

kkm added 2 commits March 3, 2019 01:43

kkm000 force-pushed the 19-kaldi-error branch from 288adfd to 302d232 Compare March 3, 2019 09:58

btiplitz reviewed Mar 4, 2019

danpovey reviewed Mar 8, 2019

src/base/io-funcs-inl.h (resolved)

danpovey merged commit 2f95609 into kaldi-asr:master Mar 8, 2019

kkm000 deleted the 19-kaldi-error branch March 11, 2019 02:25

chenzhehuai mentioned this pull request Jun 4, 2019

update (#32) chenzhehuai/kaldi#33

Closed

kkm000 commented Mar 2, 2019

danpovey Mar 2, 2019

kkm000 Mar 3, 2019

kkm000 Mar 3, 2019

btiplitz Mar 4, 2019

kkm000 Mar 4, 2019

danpovey commented Mar 2, 2019

danpovey commented Mar 3, 2019 via email

kkm000 commented Mar 3, 2019

kkm000 commented Mar 3, 2019

danpovey commented Mar 3, 2019 via email

btiplitz Mar 4, 2019 (edited)

kkm000 Mar 4, 2019

btiplitz Mar 4, 2019

btiplitz Mar 4, 2019

kkm000 Mar 4, 2019

btiplitz Mar 4, 2019

kkm000 Mar 4, 2019

btiplitz Mar 4, 2019

kkm000 Mar 4, 2019

btiplitz Mar 4, 2019

kkm000 Mar 4, 2019

kkm000 Mar 4, 2019

btiplitz commented Mar 4, 2019

kkm000 commented Mar 4, 2019 (edited)

kkm000 commented Mar 8, 2019

Conversation

kkm000 commented Mar 2, 2019

Short summary

1. Throwing destructors are ugly

2. KALDI_LOG << x is void

3. KALDI_ERROR throws KaldiFatalError

4. Most uses of std::endl in Kaldi codebase are incorrect

5. Stack trace could contain empty strings on demangling failure

6. Commentary on KALDI_ASSERT branch prediction was dubious
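
Items 1 to 3 can be illustrated with a small stand-alone sketch. This is not the code from the PR: the names `sketch::MessageCollector`, `sketch::Log`, `sketch::LogAndThrow`, `sketch::FatalError`, `SKETCH_LOG`, `SKETCH_ERR` and `DemoFatal` are all hypothetical, chosen only to show one way a logging expression can have type void and throw a distinct exception type without any destructor throwing. It relies on `<<` binding tighter than `=`, so the whole message is built before the assignment operator runs.

```cpp
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>

namespace sketch {

// Hypothetical stand-in for a distinct fatal-error exception type.
class FatalError : public std::runtime_error {
 public:
  explicit FatalError(const std::string &msg) : std::runtime_error(msg) {}
};

// Collects the message text. Its destructor is the implicit one: it only
// releases the stream and never logs or throws.
class MessageCollector {
 public:
  template <typename T>
  MessageCollector &operator<<(const T &value) {
    stream_ << value;
    return *this;
  }
  std::string str() const { return stream_.str(); }

 private:
  std::ostringstream stream_;
};

// Sink for non-fatal messages. operator= returns void, so the whole
// logging expression is void and cannot be chained or assigned further.
struct Log {
  void operator=(const MessageCollector &m) const {
    std::cerr << "LOG: " << m.str() << '\n';
  }
};

// Sink for fatal messages: logs once, then throws from an ordinary member
// function, so no destructor is involved in raising the exception.
struct LogAndThrow {
  [[noreturn]] void operator=(const MessageCollector &m) const {
    std::cerr << "ERROR: " << m.str() << '\n';
    throw FatalError(m.str());
  }
};

}  // namespace sketch

// `<<` has higher precedence than `=`, so the collector on the right-hand
// side receives every streamed value before the sink's operator= is called.
#define SKETCH_LOG ::sketch::Log() = ::sketch::MessageCollector()
#define SKETCH_ERR ::sketch::LogAndThrow() = ::sketch::MessageCollector()

// Returns the text carried by the exception that SKETCH_ERR throws.
std::string DemoFatal() {
  try {
    SKETCH_ERR << "bad value " << 42;
  } catch (const sketch::FatalError &e) {
    return e.what();
  }
  return "";
}
```

In this sketch a caller would write `SKETCH_LOG << "frames: " << n;` for an ordinary message. Because `FatalError` derives from `std::runtime_error`, a handler written as `catch (const std::runtime_error &)` still catches it, while a handler for `sketch::FatalError` can tell a logged fatal error apart from other runtime errors.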
