no entry found for key #260

djstrong · 2020-05-06T10:01:18Z

I am fine-tuning a model, and for some hyperparameters (a different number of epochs or a different learning rate) I get an error:

thread '<unnamed>' panicked at 'no entry found for key', /rustc/6d0e58bff88f620c1a4f641a627f046bf4cde4ad/src/libstd/collections/hash/map.rs:1023:9
stack backtrace:
   0: backtrace::backtrace::libunwind::trace
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/libunwind.rs:86
   1: backtrace::backtrace::trace_unsynchronized
             at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.44/src/backtrace/mod.rs:66
   2: std::sys_common::backtrace::_print_fmt
             at src/libstd/sys_common/backtrace.rs:78
   3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
             at src/libstd/sys_common/backtrace.rs:59
   4: core::fmt::write
             at src/libcore/fmt/mod.rs:1052
   5: std::io::Write::write_fmt
             at src/libstd/io/mod.rs:1428
   6: std::sys_common::backtrace::_print
             at src/libstd/sys_common/backtrace.rs:62
   7: std::sys_common::backtrace::print
             at src/libstd/sys_common/backtrace.rs:49
   8: std::panicking::default_hook::{{closure}}
             at src/libstd/panicking.rs:204
   9: std::panicking::default_hook
             at src/libstd/panicking.rs:224
  10: std::panicking::rust_panic_with_hook
             at src/libstd/panicking.rs:470
  11: rust_begin_unwind
             at src/libstd/panicking.rs:378
  12: core::panicking::panic_fmt
             at src/libcore/panicking.rs:85
  13: core::option::expect_failed
             at src/libcore/option.rs:1203
  14: serde::ser::Serializer::collect_map
  15: <tokenizers::models::bpe::model::BPE as tokenizers::tokenizer::Model>::save
  16: tokenizers::models::__init2023689508296420652::__init2023689508296420652::__wrap
  17: _PyMethodDef_RawFastCallKeywords
             at Objects/call.c:694
  18: _PyCFunction_FastCallKeywords
             at Objects/call.c:734
  19: call_function
             at Python/ceval.c:4568
  20: _PyEval_EvalFrameDefault
             at Python/ceval.c:3139
  21: _PyEval_EvalCodeWithName
             at Python/ceval.c:3930
  22: _PyFunction_FastCallKeywords
             at Objects/call.c:433
  23: call_function
             at Python/ceval.c:4616
  24: _PyEval_EvalFrameDefault
             at Python/ceval.c:3110
  25: function_code_fastcall
             at Objects/call.c:283
  26: _PyFunction_FastCallKeywords
             at Objects/call.c:408
  27: call_function
             at Python/ceval.c:4616
  28: _PyEval_EvalFrameDefault
             at Python/ceval.c:3110
  29: function_code_fastcall
             at Objects/call.c:283
  30: _PyFunction_FastCallKeywords
             at Objects/call.c:408
  31: call_function
             at Python/ceval.c:4616
  32: _PyEval_EvalFrameDefault
             at Python/ceval.c:3110
  33: _PyEval_EvalCodeWithName
             at Python/ceval.c:3930
  34: _PyFunction_FastCallKeywords
             at Objects/call.c:433
  35: call_function
             at Python/ceval.c:4616
  36: _PyEval_EvalFrameDefault
             at Python/ceval.c:3139
  37: _PyEval_EvalCodeWithName
             at Python/ceval.c:3930
  38: _PyFunction_FastCallDict
             at Objects/call.c:376
  39: _PyObject_Call_Prepend
             at Objects/call.c:908
  40: PyObject_Call
             at Objects/call.c:245
  41: do_call_core
             at Python/ceval.c:4645
  42: _PyEval_EvalFrameDefault
             at Python/ceval.c:3191
  43: _PyEval_EvalCodeWithName
             at Python/ceval.c:3930
  44: _PyFunction_FastCallKeywords
             at Objects/call.c:433
  45: call_function
             at Python/ceval.c:4616
  46: _PyEval_EvalFrameDefault
             at Python/ceval.c:3139
  47: _PyEval_EvalCodeWithName
             at Python/ceval.c:3930
  48: PyEval_EvalCodeEx
             at Python/ceval.c:3959
  49: PyEval_EvalCode
             at Python/ceval.c:524
  50: run_mod
             at Python/pythonrun.c:1035
  51: PyRun_FileExFlags
             at Python/pythonrun.c:988
  52: PyRun_SimpleFileExFlags
             at Python/pythonrun.c:429
  53: pymain_run_file
             at Modules/main.c:427
  54: pymain_run_filename
             at Modules/main.c:1606
  55: pymain_run_python
             at Modules/main.c:2867
  56: pymain_main
             at Modules/main.c:3028
  57: _Py_UnixMain
             at Modules/main.c:3063
  58: __libc_start_main
  59: <unknown>
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
fatal runtime error: failed to initiate panic, error 5
/var/spool/slurmd/job18823086/slurm_script: line 29:  4069 Aborted                 $1

Version: 0.5.2


djstrong · 2020-05-10T10:30:21Z

It was a problem with vocab.json missing one entry.

ecchochan · 2020-05-19T08:49:09Z

~~I encountered this issue too because special_tokens contained tokens that do not exist in the corpus (I was testing with a small corpus).~~

I encountered this issue too because of a "<n>" in special_tokens.

:3

Why?

GarethAusten · 2020-06-04T00:24:27Z

I'm also having this issue. Is there a way to know which entry is missing? It seems to save the vocab.json file but fails on the merges.txt file.
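
One way to look for a missing entry by hand is to cross-check the two files. The sketch below is not part of the tokenizers library. It assumes the usual GPT-2 style layout: vocab.json is a JSON object mapping token strings to ids, and merges.txt holds one space-separated pair per line, optionally preceded by a "#version" header. The function name and the rule that each half of a merge and their concatenation should all appear in the vocab are assumptions made for illustration.

```python
import json
from pathlib import Path


def find_missing_merge_tokens(vocab_path, merges_path):
    """Return (line_number, text) pairs for merge entries not backed by the vocab.

    Hypothetical helper: for every merge "left right" it checks that left,
    right and left + right are all keys of the vocab mapping.
    """
    vocab = json.loads(Path(vocab_path).read_text(encoding="utf-8"))
    missing = []
    with open(merges_path, encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            line = line.rstrip("\n")
            # Skip blank lines and the optional "#version" header line.
            if not line or line.startswith("#version"):
                continue
            parts = line.split(" ")
            if len(parts) != 2:
                # Not a "left right" pair: report the raw line as-is.
                missing.append((line_number, line))
                continue
            left, right = parts
            for token in (left, right, left + right):
                if token not in vocab:
                    missing.append((line_number, token))
    return missing
```

Calling `find_missing_merge_tokens("vocab.json", "merges.txt")` (paths are placeholders) returns an empty list when every merge is covered; otherwise each tuple gives the merges.txt line number and the token that has no vocab entry.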

n1t0 · 2020-06-04T02:55:53Z

This should be fixed with the latest release 0.8.0.dev2, can you confirm?

GarethAusten · 2020-06-04T12:47:20Z

Yes, it's working in 0.8.0.dev2. Thanks for the quick response!

GarethAusten · 2020-06-04T15:16:59Z

tl;dr: use save_model instead of save in 0.8.0.dev2.

Leaving this comment here in case anyone else stumbles upon this issue: the behavior differs between 0.7.0 and 0.8.0.dev2. In 0.8.0.dev2, the tokenizer's save method seems to save the whole tokenizer object, while save_model seems to behave the way save did in 0.7.0.

This is a little confusing because, if you want to load a saved tokenizer, the tokenizer object expects a merges file and a vocab file. I don't see any way to load the whole object as saved, so the best bet seems to be save_model in 0.8.0.dev2.

n1t0 · 2020-06-04T18:42:35Z

If you want to load a model (BPE for example), then save_model will save the needed files to load a model directly. You can then load a saved BPE using BPE(vocab_file, merges_file).

But if you want to load the whole tokenizer, then you can do

tokenizer = Tokenizer.from_file(tokenizer_file)

This file will have been saved with save. Does it make sense?

djstrong changed the title ~~fatal runtime error~~ no entry found for key May 6, 2020

djstrong closed this as completed May 10, 2020

JTWang2000 mentioned this issue Jul 20, 2021

Cannot reproduce fine tuning on ChnSentiCorp ShannonAI/ChineseBert#13

Closed

gchhablani mentioned this issue Nov 2, 2021

Fast tokenizer converter leads to PanicException: no entry found for key huggingface/transformers#14252

Closed
