RuntimeError: Internal: could not parse ModelProto from /home/nlp/miniconda3/lib/python3.9/site-packages/inltk/models/hi/tokenizer.model #99

bezaisingh opened this issue Mar 5, 2024 · 2 comments


@bezaisingh

On Ubuntu 18 with Python 3.9, iNLTK installed without any issues, but when the language is set to hi with

```python
from inltk.inltk import setup
setup('hi')
```

we see the error message below:


```
RuntimeError Traceback (most recent call last)
Cell In[9], line 2
1 from inltk.inltk import setup
----> 2 setup('hi')

File ~/miniconda3/lib/python3.9/site-packages/inltk/inltk.py:33, in setup(language_code)
31 loop = asyncio.get_event_loop()
32 tasks = [asyncio.ensure_future(download(language_code))]
---> 33 learn = loop.run_until_complete(asyncio.gather(*tasks))[0]
34 loop.close()

File ~/miniconda3/lib/python3.9/asyncio/base_events.py:623, in BaseEventLoop.run_until_complete(self, future)
612 """Run until the Future is done.
613
614 If the argument is a coroutine, it is wrapped in a Task.
(...)
620 Return the Future's result, or raise its exception.
621 """
622 self._check_closed()
--> 623 self._check_running()
625 new_task = not futures.isfuture(future)
626 future = tasks.ensure_future(future, loop=self)

File ~/miniconda3/lib/python3.9/asyncio/base_events.py:583, in BaseEventLoop._check_running(self)
581 def _check_running(self):
582 if self.is_running():
--> 583 raise RuntimeError('This event loop is already running')
584 if events._get_running_loop() is not None:
585 raise RuntimeError(
586 'Cannot run the event loop while another loop is running')

RuntimeError: This event loop is already running
Downloading Model. This might take time, depending on your internet connection. Please be patient.
We'll only do this for the first time.
Downloading Model. This might take time, depending on your internet connection. Please be patient.
We'll only do this for the first time.
Done!
```

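A note on this first failure: "This event loop is already running" is what asyncio raises when run_until_complete is called from an environment that already has a running loop, which the Cell In[9] marker suggests was a Jupyter notebook. If that diagnosis is right, one common workaround (a general asyncio pattern, not something from the iNLTK docs) is the nest_asyncio package, which patches asyncio to allow nested run_until_complete calls:

```python
# Possible workaround, assuming setup() is being run inside Jupyter,
# where an event loop is already running. nest_asyncio is a third-party
# package (pip install nest_asyncio), not part of iNLTK.
import nest_asyncio
nest_asyncio.apply()  # allow nested loop.run_until_complete() calls

from inltk.inltk import setup
setup('hi')  # iNLTK's internal run_until_complete can now execute
```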

Since we saw the Done! message, we ignored the error and moved on to the next step:
```python
from inltk.inltk import tokenize

text = 'गीक्स फॉर गीक्स एक बेहतरीन टेक्नोलॉजी लर्निंग प्लेटफॉर्म है।'
tokenize(text, 'hi')
```

and we get the following error:


```
RuntimeError Traceback (most recent call last)
Cell In[14], line 4
1 from inltk.inltk import tokenize
3 text = 'गीक्स फॉर गीक्स एक बेहतरीन टेक्नोलॉजी लर्निंग प्लेटफॉर्म है।'
----> 4 tokenize(text ,'hi')

File ~/miniconda3/lib/python3.9/site-packages/inltk/inltk.py:62, in tokenize(input, language_code)
60 def tokenize(input: str, language_code: str):
61 check_input_language(language_code)
---> 62 tok = LanguageTokenizer(language_code)
63 output = tok.tokenizer(input)
64 return output

File ~/miniconda3/lib/python3.9/site-packages/inltk/tokenizer.py:14, in LanguageTokenizer.__init__(self, lang)
12 def __init__(self, lang: str):
13 self.lang = lang
---> 14 self.base = EnglishTokenizer(lang) if lang == LanguageCodes.english else IndicTokenizer(lang)

File ~/miniconda3/lib/python3.9/site-packages/inltk/tokenizer.py:63, in IndicTokenizer.__init__(self, lang)
61 self.sp = spm.SentencePieceProcessor()
62 model_path = path/f'models/{lang}/tokenizer.model'
---> 63 self.sp.Load(str(model_path))

File ~/miniconda3/lib/python3.9/site-packages/sentencepiece/__init__.py:961, in SentencePieceProcessor.Load(self, model_file, model_proto)
959 if model_proto:
960 return self.LoadFromSerializedProto(model_proto)
--> 961 return self.LoadFromFile(model_file)

File ~/miniconda3/lib/python3.9/site-packages/sentencepiece/__init__.py:316, in SentencePieceProcessor.LoadFromFile(self, arg)
315 def LoadFromFile(self, arg):
--> 316 return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)

RuntimeError: Internal: could not parse ModelProto from /home/nlp/miniconda3/lib/python3.9/site-packages/inltk/models/hi/tokenizer.model
```


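The ModelProto parse failure generally means the tokenizer.model file on disk is not a valid SentencePiece model, e.g. a truncated download or an error page saved in its place, which is plausible here given that the download step above raised an exception partway through. A quick diagnostic sketch, using the path reported in the traceback:

```python
# Diagnostic sketch: inspect the file SentencePiece could not parse.
# The path is the one reported in the traceback above.
from pathlib import Path

model_path = Path.home() / 'miniconda3/lib/python3.9/site-packages/inltk/models/hi/tokenizer.model'
print('exists:', model_path.exists())
if model_path.exists():
    print('size (bytes):', model_path.stat().st_size)
    with open(model_path, 'rb') as f:
        head = f.read(32)
    # An HTML error page saved in place of the model would start with b'<'
    print('first bytes:', head)
```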
Can anyone kindly suggest a way to resolve this issue?

@Cecilia-zwq

Have you solved it? I have this kind of issue too.
In this post: chatchat-space/Langchain-Chatchat#3103, I saw someone with the same issue. They tried downloading the model again, and it worked.
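If the file really is corrupt, a minimal sketch of that re-download approach (assuming, as the traceback paths show, that iNLTK stores its files under site-packages/inltk/models/<lang>/) is to delete the downloaded files and run setup again:

```python
# Sketch of the re-download fix: remove the possibly-corrupt files so
# that setup() fetches them again. Paths follow the traceback above.
import shutil
from pathlib import Path
import inltk

models_dir = Path(inltk.__file__).parent / 'models' / 'hi'
if models_dir.exists():
    shutil.rmtree(models_dir)  # removes tokenizer.model and the language model

from inltk.inltk import setup
setup('hi')  # triggers a fresh download
```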

@bezaisingh
Author

No, I couldn't resolve the issue. I tried to contact the developer, Gaurav Arora, but unfortunately got no reply from him.
Thanks for sharing the post. I'll try it, and let's see if it works.
