Memory leak #337
I was able to figure out that here, in argos-translate/argostranslate/translate.py (line 163, commit 478010b), `self.translator` is always `None` on every translation request. This means that for each translation event the program creates a new `PackageTranslation` instance. This can lead to a memory leak if ctranslate2 happens to keep any process-wide data around (globally).
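Since `self.translator` starts out as `None` on every request, a heavy translator object is rebuilt instead of reused. A minimal sketch of the caching idea that would avoid this; the names `Translator` and `get_translator` are illustrative stand-ins, not argos-translate's actual API:

```python
import functools

class Translator:
    """Stand-in for a heavyweight model wrapper (e.g. a ctranslate2 model)."""

    def __init__(self, model_path: str):
        # Imagine large model weights being loaded here on construction.
        self.model_path = model_path

    def translate(self, text: str) -> str:
        return f"<{self.model_path}> {text}"

@functools.lru_cache(maxsize=4)
def get_translator(model_path: str) -> Translator:
    # Repeated requests for the same model reuse one instance,
    # so per-request construction cannot accumulate memory.
    return Translator(model_path)

a = get_translator("en_fi")
b = get_translator("en_fi")
assert a is b  # the same cached instance is returned
```

Keeping one long-lived instance per model is the usual way to rule out per-request construction as the source of growth.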
Still, I was not able to completely eliminate the memory leak that way; it remains as long as you rely on the library's own tokenization. I have now settled on a self-made tokenization and this method of translation: https://github.com/kolserdav/ana/blob/6d3329403a38c67ff5f77336be937f9b4b58e2d7/packages/bottle/core/translate.py#L45-L94. I don't know how it will hold up in the future, but at least, as my tests show, this method completely eliminates the memory leak. To observe the leak, I ran similar request loops in multiple terminal windows:

```shell
for i in $(seq 1 10000); do
  curl -X POST \
    -d '{"q": "test", "source": "en", "target": "fi"}' \
    -H 'Content-Type: application/json' \
    http://127.0.0.1:8000/translate
done
```
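The "self-made tokenization" approach amounts to chunking input into sentence-sized pieces before handing them to the translator. A rough sketch of that idea, assuming a naive regex split (`split_sentences` is my illustrative name, not the code from the linked repository):

```python
import re

def split_sentences(text: str, max_len: int = 200) -> list[str]:
    """Split text into sentence-sized chunks without any ML tokenizer.

    Splits on whitespace that follows ., ! or ?, then hard-wraps any
    single sentence longer than max_len characters.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    for part in parts:
        while len(part) > max_len:
            chunks.append(part[:max_len])
            part = part[max_len:]
        if part:
            chunks.append(part)
    return chunks

print(split_sentences("Hello world. How are you? Fine."))
```

Each chunk can then be translated independently, which keeps any per-call buffers small and short-lived.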
Please re-open this. It is still an issue.
I have used a simple Django server for my application: https://github.com/kolserdav/ana/tree/1f9d445926f9b565010667dbd0daa21fea6a1080/packages/translate

- Django `urls.py` file: https://github.com/kolserdav/ana/blob/1f9d445926f9b565010667dbd0daa21fea6a1080/packages/translate/translate/urls.py#L1-L27
- Django `translate` handler file: https://github.com/kolserdav/ana/blob/1f9d445926f9b565010667dbd0daa21fea6a1080/packages/translate/translate/api/translate.py#L1-L16
- My `Translate` class: https://github.com/kolserdav/ana/blob/1f9d445926f9b565010667dbd0daa21fea6a1080/packages/translate/translate/core/translate.py#L1-L41

After I start the server, I start repeating the same request with curl (the same loop as above).
In another window, I watch the process with `top`, filtered by the name `python`:

```shell
top | grep python
```
After a certain number of repeated requests (depending on server resources), I see that the `python` process consumes a significant amount of memory. That memory is never freed, even if you stop repeating the requests; the consumption remains until the process is restarted. If you continue to make requests, the process will soon crash with status 247.
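As a cross-check for the `top` observations, the peak resident set size can also be read from inside the Python process itself. A small sketch, assuming a Unix system (the `resource` module is not available on Windows):

```python
import resource
import sys

def peak_rss_mb() -> float:
    """Peak resident set size of this process, in megabytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS but in kilobytes on Linux.
    return peak / (1024 * 1024) if sys.platform == "darwin" else peak / 1024

before = peak_rss_mb()
junk = [bytearray(1024) for _ in range(100_000)]  # hold roughly 100 MB alive
after = peak_rss_mb()
print(f"peak RSS grew from {before:.1f} MB to {after:.1f} MB")
```

Logging this value before and after each batch of requests makes never-freed growth visible without an external tool, including allocations made by native extensions such as ctranslate2.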
I will be grateful for any help.