AssertionError: Input embedding matrix must match size: 250000 x 300, found torch.Size([100000, 300]) #1285

Closed
zhaolinlee opened this issue Sep 19, 2023 · 1 comment

zhaolinlee commented Sep 19, 2023

Describe the bug
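Creating a "zh" pipeline with ner_pretrain_path set fails while loading the NER model. The model expects a 250000 x 300 embedding matrix, but the loaded pretrain is 100000 x 300.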

To Reproduce

2023-09-19 16:26:47 INFO: Checking for updates to resources.json in case models have been updated.  Note: this behavior can be turned off with download_method=None or download_method=DownloadMethod.REUSE_RESOURCES
Downloading https://raw.githubusercontent.com/stanfordnlp/stanza-resources/main/resources_1.5.1.json: 328kB [00:00, 1.08MB/s]
2023-09-19 16:26:48 INFO: "zh" is an alias for "zh-hans"
2023-09-19 16:26:49 INFO: Loading these models for language: zh-hans (Simplified_Chinese):
===================================
| Processor    | Package          |
-----------------------------------
| tokenize     | gsdsimp          |
| pos          | gsdsimp_charlm   |
| lemma        | gsdsimp_nocharlm |
| constituency | ctb-51_charlm    |
| depparse     | gsdsimp_charlm   |
| sentiment    | ren              |
| ner          | ontonotes        |
===================================

2023-09-19 16:26:49 INFO: Using device: cpu
2023-09-19 16:26:49 INFO: Loading: tokenize
2023-09-19 16:26:49 INFO: Loading: pos
2023-09-19 16:26:49 INFO: Loading: lemma
2023-09-19 16:26:50 INFO: Loading: constituency
2023-09-19 16:26:50 INFO: Loading: depparse
2023-09-19 16:26:50 INFO: Loading: sentiment
2023-09-19 16:26:50 INFO: Loading: ner
Traceback (most recent call last):
  File "run_700_Stanford.py", line 8, in <module>
    nlp = stanza.Pipeline('zh', use_gpu=False, ner_pretrain_path=ner_pretrain_path)
  File "C:\Users\lizhaolin\AppData\Roaming\Python\Python38\site-packages\stanza\pipeline\core.py", line 296, in __init__
    self.processors[processor_name] = NAME_TO_PROCESSOR_CLASS[processor_name](config=curr_processor_config,
  File "C:\Users\lizhaolin\AppData\Roaming\Python\Python38\site-packages\stanza\pipeline\processor.py", line 193, in __init__
    self._set_up_model(config, pipeline, device)
  File "C:\Users\lizhaolin\AppData\Roaming\Python\Python38\site-packages\stanza\pipeline\ner_processor.py", line 52, in _set_up_model
    trainer = Trainer(args=args, model_file=model_path, pretrain=pretrain, device=device, foundation_cache=pipeline.foundation_cache)
  File "C:\Users\lizhaolin\AppData\Roaming\Python\Python38\site-packages\stanza\models\ner\trainer.py", line 66, in __init__
    self.load(model_file, pretrain, args, foundation_cache)
  File "C:\Users\lizhaolin\AppData\Roaming\Python\Python38\site-packages\stanza\models\ner\trainer.py", line 161, in load
    self.model = NERTagger(self.args, self.vocab, emb_matrix=emb_matrix, foundation_cache=foundation_cache)
  File "C:\Users\lizhaolin\AppData\Roaming\Python\Python38\site-packages\stanza\models\ner\model.py", line 52, in __init__
    self.init_emb(emb_matrix)
  File "C:\Users\lizhaolin\AppData\Roaming\Python\Python38\site-packages\stanza\models\ner\model.py", line 123, in init_emb
    assert emb_matrix.size() == (vocab_size, dim), \
AssertionError: Input embedding matrix must match size: 250000 x 300, found torch.Size([100000, 300])
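
For readers following the traceback, the failing check is a plain shape comparison between the pretrained embedding matrix and the size the saved NER model expects. The sketch below is a minimal pure-Python illustration of that comparison, not Stanza's code: the function name check_embedding_shape is hypothetical, and tuples stand in for the torch.Size that the real check compares.

```python
def check_embedding_shape(found_shape, vocab_size, dim):
    """Hypothetical helper: verify a pretrained matrix has the expected shape.

    Mirrors the idea of the assertion in the traceback, using plain tuples
    instead of torch tensors (an assumption made for illustration).
    """
    expected = (vocab_size, dim)
    if tuple(found_shape) != expected:
        raise AssertionError(
            "Input embedding matrix must match size: {} x {}, found {}".format(
                vocab_size, dim, tuple(found_shape)
            )
        )
    return True


# Sizes taken from the error message in this issue:
# the model expects 250000 x 300, the supplied pretrain is 100000 x 300.
try:
    check_embedding_shape((100000, 300), vocab_size=250000, dim=300)
except AssertionError as err:
    message = str(err)
    print(message)
```

A check like this passes only when the pretrain file supplied to the pipeline has the same number of rows and the same dimension as the one the model was trained with.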

Expected behavior

The pipeline should load without errors. "stanza_resources" was downloaded when running the code.

Environment (please complete the following information):

  • OS: Windows 11
  • Python version: 3.8.6
  • Stanza version: 1.5.1

Additional context
Thank you very much for your help and look forward to your reply.

zhaolinlee added the bug label Sep 19, 2023

AngledLuffa (Collaborator) commented

Sorry for the inconvenience. That should now be fixed.
