Unable to run BertChecker #77

ParadaCarleton · 2022-09-23T02:33:18Z

pc.py"""
       from neuspell import BertChecker
       
       checker = BertChecker(pretrained=True)
       """
loading vocab from path:/home/lime/.julia/conda/3/lib/python3.9/site-packages/neuspell/../data/checkpoints/subwordbert-probwordnoise/vocab.pkl
initializing model
Some weights of the model checkpoint at bert-base-cased were not used when initializing BertModel: ['cls.predictions.transform.LayerNorm.weight', 'cls.seq_relationship.weight', 'cls.predictions.decoder.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.weight', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
SubwordBert(
  (bert_dropout): Dropout(p=0.2, inplace=False)
  (bert_model): BertModel(
    (embeddings): BertEmbeddings(
      (word_embeddings): Embedding(28996, 768, padding_idx=0)
      (position_embeddings): Embedding(512, 768)
      (token_type_embeddings): Embedding(2, 768)
      (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
      (dropout): Dropout(p=0.1, inplace=False)
    )
    (encoder): BertEncoder(
      (layer): ModuleList(
        (0): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (1): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (2): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (3): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (4): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (5): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (6): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (7): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (8): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (9): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (10): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
        (11): BertLayer(
          (attention): BertAttention(
            (self): BertSelfAttention(
              (query): Linear(in_features=768, out_features=768, bias=True)
              (key): Linear(in_features=768, out_features=768, bias=True)
              (value): Linear(in_features=768, out_features=768, bias=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
            (output): BertSelfOutput(
              (dense): Linear(in_features=768, out_features=768, bias=True)
              (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
              (dropout): Dropout(p=0.1, inplace=False)
            )
          )
          (intermediate): BertIntermediate(
            (dense): Linear(in_features=768, out_features=3072, bias=True)
            (intermediate_act_fn): GELUActivation()
          )
          (output): BertOutput(
            (dense): Linear(in_features=3072, out_features=768, bias=True)
            (LayerNorm): LayerNorm((768,), eps=1e-12, elementwise_affine=True)
            (dropout): Dropout(p=0.1, inplace=False)
          )
        )
      )
    )
    (pooler): BertPooler(
      (dense): Linear(in_features=768, out_features=768, bias=True)
      (activation): Tanh()
    )
  )
  (dense): Linear(in_features=768, out_features=100002, bias=True)
  (criterion): CrossEntropyLoss()
)
185211810
loading pretrained weights from path:/home/lime/.julia/conda/3/lib/python3.9/site-packages/neuspell/../data/checkpoints/subwordbert-probwordnoise
Loading model params from checkpoint dir: /home/lime/.julia/conda/3/lib/python3.9/site-packages/neuspell/../data/checkpoints/subwordbert-probwordnoise
ERROR: PyError ($(Expr(:escape, :(ccall(#= /home/lime/.julia/packages/PyCall/ygXW2/src/pyeval.jl:38 =# @pysym(:PyEval_EvalCode), PyPtr, (PyPtr, PyPtr, PyPtr), o, globals, locals))))) <class '_pickle.UnpicklingError'>
UnpicklingError("invalid load key, '<'.")
  File "/home/lime/.julia/packages/PyCall/ygXW2/src/pyeval.jl", line 3, in <module>
    const Py_eval_input = 258
  File "/home/lime/.julia/conda/3/lib/python3.9/site-packages/neuspell/corrector_subwordbert.py", line 32, in __init__
    self.from_pretrained(self.ckpt_path)
  File "/home/lime/.julia/conda/3/lib/python3.9/site-packages/neuspell/corrector_subwordbert.py", line 49, in from_pretrained
    self.model = load_pretrained(self.model, self.weights_path, device=self.device)
  File "/home/lime/.julia/conda/3/lib/python3.9/site-packages/neuspell/seq_modeling/subwordbert.py", line 23, in load_pretrained
    checkpoint_data = torch.load(os.path.join(checkpoint_path, "model.pth.tar"), map_location=map_location)
  File "/home/lime/.julia/conda/3/lib/python3.9/site-packages/torch/serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/home/lime/.julia/conda/3/lib/python3.9/site-packages/torch/serialization.py", line 920, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)


nishan-chatterjee · 2022-09-28T00:37:08Z

Same issue as above:

grammar_correction.ipynb Cell 3 in <cell line: 6>()
4 """ select spell checkers & load """
5 checker = BertChecker()
----> 6 checker.from_pretrained()

File ~/.local/lib/python3.10/site-packages/neuspell/corrector_subwordbert.py:49, in CorrectorSubwordBert.from_pretrained(self, ckpt_path, vocab, weights)
47 self.weights_path = weights if weights else self.ckpt_path
48 print(f"loading pretrained weights from path:{self.weights_path}")
---> 49 self.model = load_pretrained(self.model, self.weights_path, device=self.device)
50 return

File ~/.local/lib/python3.10/site-packages/neuspell/seq_modeling/subwordbert.py:23, in load_pretrained(model, checkpoint_path, optimizer, device)
21 map_location = 'cpu'
22 print(f"Loading model params from checkpoint dir: {checkpoint_path}")
---> 23 checkpoint_data = torch.load(os.path.join(checkpoint_path, "model.pth.tar"), map_location=map_location)
24 # print(f"previously model saved at : {checkpoint_data['epoch_id']}")
26 model.load_state_dict(checkpoint_data['model_state_dict'])

File ~/miniconda3/envs/gnnVD/lib/python3.10/site-packages/torch/serialization.py:713, in load(f, map_location, pickle_module, **pickle_load_args)
711 return torch.jit.load(opened_file)
712 return _load(opened_zipfile, map_location, pickle_module, **pickle_load_args)
--> 713 return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
...
--> 920 magic_number = pickle_module.load(f, **pickle_load_args)
921 if magic_number != MAGIC_NUMBER:
922 raise RuntimeError("Invalid magic number; corrupt file?")

UnpicklingError: invalid load key, '<'.

pszemraj · 2022-10-10T02:05:34Z

same here

anjalibshah · 2022-10-13T21:24:34Z

Was this resolved? I am seeing the same error

richard-sobey · 2022-10-17T11:03:12Z

Also experiencing this error. Anyone figure out a workaround?

pszemraj · 2022-10-17T11:30:15Z

> Also experiencing this error. Anyone figure out a workaround?

@richard-sobey I've found that it works if you load the model files from somewhere other than their Google Drive. I uploaded one to Hugging Face under pszemraj/neuspell-scrnn-probwordnoise

Getting LFS to work on Colab can be a bit tricky so I would try from somewhere else, but if I have time later this week or next maybe I can put an example notebook in the model card
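For anyone debugging this locally: the load key `'<'` in the error means the first byte that pickle read was `<`. One plausible explanation (an assumption, not confirmed anywhere in this thread) is that the download saved an HTML page in place of the checkpoint. A minimal sketch for checking that is below. The helper names `looks_like_html` and `check_checkpoint_dir` are hypothetical; the file names `model.pth.tar` and `vocab.pkl` come from the log and traceback above.

```python
from pathlib import Path


def looks_like_html(path, probe_bytes=256):
    """Return True if the file starts with '<' after optional leading whitespace."""
    with open(path, "rb") as fh:
        head = fh.read(probe_bytes)
    return head.lstrip().startswith(b"<")


def check_checkpoint_dir(ckpt_dir):
    """Map each expected file name to 'missing', 'html', or 'ok'.

    The expected names are taken from the neuspell log output in this
    thread; other checkers may use different files (assumption).
    """
    report = {}
    for name in ("model.pth.tar", "vocab.pkl"):
        path = Path(ckpt_dir) / name
        if not path.is_file():
            report[name] = "missing"
        elif looks_like_html(path):
            report[name] = "html"
        else:
            report[name] = "ok"
    return report
```

Calling `check_checkpoint_dir(...)` on the `subwordbert-probwordnoise` directory printed in the log would show whether `model.pth.tar` is in that state. If it reports `"html"`, replacing the file with a copy from another source is the workaround described in this comment.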

richard-sobey · 2022-10-17T12:09:20Z

@pszemraj cheers mate this worked for me

Did you simply upload the model files from their Google Drive to HF? Keen to try to get the Bert implementation working

pszemraj · 2022-10-17T14:02:06Z

@richard-sobey if I remember correctly: I think so, or a previous version that was not corrupted. Something like that. Also, I uploaded this one which works with BertChecker()

The reason why 'if I remember' is because I've been using this workaround since January/February this year but used to have the backups on my dropbox 😂 since it's still not fixed, I figured I would post them

richard-sobey · 2022-10-17T23:41:17Z

@pszemraj you're a legend thank you!

michaelliangau · 2022-11-11T00:15:07Z

In case others are like me and looking for the checkpoints, the original Google Drive folder with all models is here.

KSSSenapati · 2023-06-25T08:54:40Z

#36 (comment)

Please refer to this comment to solve this issue.
Similarly, for other models you can use the checkpoints from here: https://drive.google.com/drive/folders/1jgNpYe4TVSF4mMBVtFh4QfB2GovNPdh7
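Once a working checkpoint has been downloaded by hand, one way to use it might look like the sketch below. It relies on the `from_pretrained(self, ckpt_path, vocab, weights)` signature visible in the traceback earlier in this thread. The wrapper `load_bert_checker` is a hypothetical helper, and requiring both `model.pth.tar` and `vocab.pkl` in the directory is an assumption based on the log output. It has not been verified against any particular neuspell release.

```python
from pathlib import Path


def load_bert_checker(ckpt_dir):
    """Load BertChecker from a manually downloaded checkpoint directory.

    Raises FileNotFoundError early if the files named in the log output
    are absent, so the failure is clearer than an UnpicklingError.
    """
    ckpt_dir = Path(ckpt_dir)
    expected = ("model.pth.tar", "vocab.pkl")  # names from the log (assumption)
    missing = [name for name in expected if not (ckpt_dir / name).is_file()]
    if missing:
        raise FileNotFoundError(f"{ckpt_dir} is missing: {', '.join(missing)}")

    # Imported lazily so the path check above works without neuspell installed.
    from neuspell import BertChecker

    checker = BertChecker()
    # ckpt_path is the parameter name shown in the traceback in this thread.
    checker.from_pretrained(ckpt_path=str(ckpt_dir))
    return checker
```

Usage would be something like `checker = load_bert_checker("/path/to/subwordbert-probwordnoise")`, where the path is a placeholder for wherever the files were saved.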

zehrackgl · 2024-01-03T19:29:52Z

> @richard-sobey if I remember correctly: I think so, or a previous version that was not corrupted. Something like that. Also, I uploaded this one which works with BertChecker()
>
> The reason why 'if I remember' is because I've been using this workaround since January/February this year but used to have the backups on my dropbox 😂 since it's still not fixed, I figured I would post them

Hi, I have the same error. How did you fix it? Could you explain in more detail, please? Thank you in advance.
