saved models #43
Hi, thank you for reporting this potential issue. I will fix it soon, or feel free to create a PR.
I have a patch.
It involves invoking this method in class SubwordField:
and biaffine-parser also uses it to create the tokenizer, in class method build():
You may also want to avoid saving the tokenize function, by doing this in Parser.save():
What do you think? I cannot make a PR, since I have other changes in my code.
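The patch itself is not shown above, but the idea of not pickling the bound tokenize callable can be sketched as follows. This is a minimal illustration, not the actual supar code; the class and attribute names here mirror the discussion but the `__getstate__` hook is an assumption about how one might implement it:

```python
import pickle


class SubwordField:
    """Toy stand-in for supar's SubwordField, holding a tokenize callable."""

    def __init__(self, tokenize=None):
        # tokenize may be a bound method of an external object
        # (e.g. BertTokenizer.tokenize), which is what gets pickled
        # into the saved model and breaks across library versions.
        self.tokenize = tokenize

    def __getstate__(self):
        # Drop the tokenize function so the pickle contains no
        # reference to external (version-dependent) objects.
        state = self.__dict__.copy()
        state["tokenize"] = None
        return state
```

With this hook, even a field holding an unpicklable callable (like a lambda or a bound method) serializes cleanly, and the loader is expected to reattach a fresh tokenize function after loading.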
@attardi That might not be very elegant, I suppose.
@attardi As a temporary solution, I have re-uploaded the models trained with transformers 3.2.0.
@attardi Done. The newly uploaded models are trained with
It seems that the models saved with torch.save() include external objects, like BertTokenizer.
If you try to run the model on a machine where a newer version of transformers (e.g. 3.1.0) is installed, the program crashes.
This is a pity, since it makes all trained models unusable.
It would be better to avoid saving the whole tokenizer object and only save its class name, so that a new instance can be recreated when loading the model.
```
2020-09-18 17:00:10 INFO Loading the data
Traceback (most recent call last):
  File "/usr/lib64/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib64/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/homenfs/tempGPU/iwpt2020/supar/supar/cmds/biaffine_dependency.py", line 43, in <module>
    main()
  File "/homenfs/tempGPU/iwpt2020/supar/supar/cmds/biaffine_dependency.py", line 39, in main
    parse(parser)
  File "/homenfs/tempGPU/iwpt2020/supar/supar/cmds/cmd.py", line 35, in parse
    parser.predict(**args)
  File "/homenfs/tempGPU/iwpt2020/supar/supar/parsers/biaffine_dependency.py", line 125, in predict
    return super().predict(**Config().update(locals()))
  File "/homenfs/tempGPU/iwpt2020/supar/supar/parsers/parser.py", line 137, in predict
    dataset.build(args.batch_size, args.buckets)
  File "/homenfs/tempGPU/iwpt2020/supar/supar/utils/data.py", line 88, in build
    self.fields = self.transform(self.sentences)
  File "/homenfs/tempGPU/iwpt2020/supar/supar/utils/transform.py", line 39, in __call__
    pairs[f] = f.transform([getattr(i, f.name) for i in sentences])
  File "/homenfs/tempGPU/iwpt2020/supar/supar/utils/field.py", line 302, in transform
    for seq in sequences]
  File "/homenfs/tempGPU/iwpt2020/supar/supar/utils/field.py", line 302, in <listcomp>
    for seq in sequences]
  File "/homenfs/tempGPU/iwpt2020/supar/supar/utils/field.py", line 301, in <listcomp>
    sequences = [[self.preprocess(token) for token in seq]
  File "/homenfs/tempGPU/iwpt2020/supar/supar/utils/field.py", line 157, in preprocess
    sequence = self.tokenize(sequence)
  File "/homenfs/tempGPU/iwpt2020/.env/lib64/python3.6/site-packages/transformers/tokenization_utils.py", line 349, in tokenize
    no_split_token = self.unique_no_split_tokens
AttributeError: 'BertTokenizer' object has no attribute 'unique_no_split_tokens'
```