AttributeError: 'NoneType' object has no attribute 'tokenize' #41
Comments
Sometimes the embeddings cannot read the saved tokenizer correctly. I add some lines in |
Still have error using the updated train.py. Here is the error message:
I had |
Oops, this line is not needed in the code, fixed it.
Another error:
EDIT (debug info): self.name='/home/yongjiang.jy/.flair/embeddings/en-xlmr-first-docv2_10epoch_1batch_4accumulate_0.000005lr_10000lrrate_eng_monolingual_nocrf_fast_norelearn_sentbatch_sentloss_finetune_nodev_saving_ner5/roberta-large' and sentence.batch_pos={}, thus runs the else branch and sentence= |
I didn't consider the scenario for document-level ACE for prediction. I add some tricks in |
Sorry, I still have the same error as before.
Using the latest code and running
Hello,
Hit an error while running
python .\train.py --config .\config\doc_ner_best.yaml --batch_size 1 --parse --target_dir .\datasets\mytest --keep_order
on Windows 10, Python 3.7, no GPU.
Here is the error message:
2022-07-28 14:35:50,789 Reading data from datasets\mytest
2022-07-28 14:35:50,789 Train: datasets\mytest\doc_train.txt
2022-07-28 14:35:50,789 Dev: None
2022-07-28 14:35:50,791 Test: None
Traceback (most recent call last):
  File ".\train.py", line 345, in <module>
    train_eval_result, train_loss = student.evaluate(loader,out_path=Path('outputs/train.'+config.config['model_name']+'.'+tar_file_name+'.conllu'),embeddings_storage_mode="none",prediction_mode=True)
  File "C:\Users\ebb\ACE\flair\models\sequence_tagger_model.py", line 2218, in evaluate
    features = self.forward(batch,prediction_mode=prediction_mode)
  File "C:\Users\ebb\ACE\flair\models\sequence_tagger_model.py", line 818, in forward
    self.embeddings.embed(sentences,embedding_mask=self.selection)
  File "C:\Users\ebb\ACE\flair\embeddings.py", line 184, in embed
    embedding.embed(sentences)
  File "C:\Users\ebb\ACE\flair\embeddings.py", line 97, in embed
    self._add_embeddings_internal(sentences)
  File "C:\Users\ebb\ACE\flair\embeddings.py", line 2962, in _add_embeddings_internal
    self._add_embeddings_to_sentences(sentences)
  File "C:\Users\ebb\ACE\flair\embeddings.py", line 3051, in _add_embeddings_to_sentences
    subtokenized_sentence = self.tokenizer.tokenize(tokenized_string)
AttributeError: 'NoneType' object has no attribute 'tokenize'
The error is triggered by this line: https://github.com/Alibaba-NLP/ACE/blob/main/flair/embeddings.py#L3041
because self.tokenizer is None.
Any suggestions on how to debug this issue? Thanks.
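One way to narrow this down is to check, right before the failing call, whether the tokenizer was ever restored and to reload it if not. The sketch below is only a possible workaround, not the maintainers' fix. It assumes the embedding object exposes `name` and `tokenizer` attributes (as the traceback and the debug info in this thread suggest), and that the last component of the saved name (for example `roberta-large`) is a valid model identifier for `transformers.AutoTokenizer.from_pretrained`. The helper names `resolve_tokenizer_source` and `ensure_tokenizer` are hypothetical and do not exist in the ACE code.

```python
import os
from pathlib import PurePosixPath


def resolve_tokenizer_source(saved_name: str) -> str:
    """Pick where to load the tokenizer from (hypothetical helper).

    If the saved name is a directory on this machine, use it as-is.
    Otherwise assume it is a path from the training machine and fall
    back to its last component, e.g. 'roberta-large'.
    """
    if os.path.isdir(saved_name):
        return saved_name
    # Normalise Windows separators so the split works on any OS.
    return PurePosixPath(saved_name.replace("\\", "/")).name


def ensure_tokenizer(embedding, loader=None):
    """Reload embedding.tokenizer if it is None (hypothetical helper).

    `loader` is any callable that maps a name or path to a tokenizer.
    By default it is transformers.AutoTokenizer.from_pretrained.
    """
    if getattr(embedding, "tokenizer", None) is not None:
        return embedding.tokenizer
    if loader is None:
        # Imported lazily so the helper can be tested without transformers.
        from transformers import AutoTokenizer
        loader = AutoTokenizer.from_pretrained
    source = resolve_tokenizer_source(embedding.name)
    embedding.tokenizer = loader(source)
    return embedding.tokenizer
```

Calling `ensure_tokenizer(self)` just before `self.tokenizer.tokenize(...)` would at least show whether the problem is a missing local directory (the saved name pointing at a path that only exists on the training machine) or something else in how the model was loaded.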
btw, the content of doc_train.txt is the following gibberish:
-DOCSTART- O
Amazon O
predict O
Paypal O
and O
do O
7-11 O
for O
Canada O
and O
Hongkong O