
UnicodeDecodeError when trying to load model saved locally #35

Closed
Verena96 opened this issue Jan 11, 2021 · 1 comment

Comments

@Verena96

Hello, I trained the model with my own parameters and saved it.
However, whenever I try to use it, I get the following error:

UnicodeDecodeError Traceback (most recent call last)
<ipython-input> in <module>
4 tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
5
----> 6 model = AutoModelForSequenceClassification.from_pretrained("C:/Users/Verena/Documents/finbert_new/models/classifier_model/finbert-sentiment.bin")
7 label_list = label_list=['positive','negative','neutral']

~\anaconda3\envs\finbert\lib\site-packages\transformers-4.0.1-py3.8.egg\transformers\models\auto\modeling_auto.py in from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
1237 if not isinstance(config, PretrainedConfig):
1238 config, kwargs = AutoConfig.from_pretrained(
-> 1239 pretrained_model_name_or_path, return_unused_kwargs=True, **kwargs
1240 )
1241

~\anaconda3\envs\finbert\lib\site-packages\transformers-4.0.1-py3.8.egg\transformers\models\auto\configuration_auto.py in from_pretrained(cls, pretrained_model_name_or_path, **kwargs)
339 {'foo': False}
340 """
--> 341 config_dict, _ = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
342
343 if "model_type" in config_dict:

~\anaconda3\envs\finbert\lib\site-packages\transformers-4.0.1-py3.8.egg\transformers\configuration_utils.py in get_config_dict(cls, pretrained_model_name_or_path, **kwargs)
387 )
388 # Load config dict
--> 389 config_dict = cls._dict_from_json_file(resolved_config_file)
390
391 except EnvironmentError as err:

~\anaconda3\envs\finbert\lib\site-packages\transformers-4.0.1-py3.8.egg\transformers\configuration_utils.py in _dict_from_json_file(cls, json_file)
470 def _dict_from_json_file(cls, json_file: str):
471 with open(json_file, "r", encoding="utf-8") as reader:
--> 472 text = reader.read()
473 return json.loads(text)
474

~\anaconda3\envs\finbert\lib\codecs.py in decode(self, input, final)
320 # decode input (taking the buffer into account)
321 data = self.buffer + input
--> 322 (result, consumed) = self._buffer_decode(data, self.errors, final)
323 # keep undecoded input until the next call
324 self.buffer = data[consumed:]

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 0: invalid start byte

The same happens when I try to load the language model, even though both models are downloaded locally. I was only able to use finbert through transformers.

Can you please help me? Thanks!

@doguaraci
Member

Hi, the problem is that you're passing the .bin file as the input to .from_pretrained.

There should be a folder containing both the model weights (the .bin file) and a config.json, and you should pass the path to that folder instead.
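For example, a minimal sketch based on the path from the traceback above (the directory layout is an assumption; in transformers 4.x, from_pretrained looks for a weights file named pytorch_model.bin next to config.json inside the folder, so the saved finbert-sentiment.bin would typically need to be renamed accordingly):

```python
# Assumed layout (illustrative, not confirmed by the issue):
# C:/Users/Verena/Documents/finbert_new/models/classifier_model/
# ├── config.json
# └── pytorch_model.bin   <- renamed from finbert-sentiment.bin

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_dir = "C:/Users/Verena/Documents/finbert_new/models/classifier_model"

tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
# Pass the directory, not the .bin file itself:
model = AutoModelForSequenceClassification.from_pretrained(model_dir)

label_list = ['positive', 'negative', 'neutral']
```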
