This issue was moved to a discussion.

Offline mode tutorial #42

Closed
kucingkembar opened this issue Jan 1, 2022 · 6 comments

Comments

@kucingkembar

Hi, sorry for my bad English; I am quite a newbie.
I am confused by the offline tutorial, which says:
"Now, move everything in the dlt directory to your offline environment. Create a virtual environment:"
- Where is the "offline environment"?
- How do I create a "virtual environment"?
I am using Windows 11 and Python 3.9.

@xhluca
Owner

xhluca commented Jan 9, 2022

"Offline environment" means the machine where you want to use your model without internet access. For example, on PC A you have internet access, so you can download dl-translate and the other pip packages. You then transfer the contents of your folder via a USB key to PC B, which has no internet access.
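
The PC A / PC B workflow can be sketched with pip's own options. This is a minimal sketch, not the project's official procedure: the folder name `libraries` and both helper function names are assumptions made for illustration, while `pip download --dest` and `pip install --no-index --find-links` are standard pip options.

```python
import sys


def pip_download_cmd(package, dest_dir):
    """Command to run on the online machine (PC A): fetch the package and
    its dependencies into dest_dir without installing them."""
    return [sys.executable, "-m", "pip", "download", "--dest", dest_dir, package]


def pip_offline_install_cmd(package, wheel_dir):
    """Command to run on the offline machine (PC B): install only from the
    files copied into wheel_dir, never contacting the package index."""
    return [
        sys.executable, "-m", "pip", "install",
        "--no-index", "--find-links", wheel_dir, package,
    ]


if __name__ == "__main__":
    # "libraries" is a hypothetical folder name inside the directory you copy over.
    print(" ".join(pip_download_cmd("dl-translate", "libraries")))
    print(" ".join(pip_offline_install_cmd("dl-translate", "libraries")))
```

Each command list can be passed to `subprocess.run(cmd, check=True)` on the relevant machine. The downloaded files are specific to the operating system and Python version, so PC A and PC B should match in both.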

You can find instructions on virtualenvs with this quick search query. The first result shows you how to create a virtual env.
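
As one possible answer to the virtual environment question, Python's standard-library `venv` module can create one on Windows as well as on other systems. This is a sketch only: the function name `create_virtualenv` and the folder name `dlt-env` are hypothetical, not something taken from the dl-translate tutorial.

```python
import os
import venv
from pathlib import Path


def create_virtualenv(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return the path of its
    Python interpreter."""
    env_dir = Path(env_dir)
    # EnvBuilder is the programmatic equivalent of `python -m venv env_dir`.
    venv.EnvBuilder(with_pip=with_pip).create(env_dir)
    # Windows places the interpreter in Scripts\, other systems in bin/.
    if os.name == "nt":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


# Example (hypothetical folder name):
#   interpreter = create_virtualenv("dlt-env")
#   print(interpreter)
```

Packages are then installed into the environment by running pip through the returned interpreter, for example `dlt-env\Scripts\python.exe -m pip install ...` on Windows.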

@kucingkembar
Author

thank you for the reply xhlulu,

  1. Is there a default "virtual environment" in Windows 10, or must I create it?
  2. If I must create it, can you make a .bat or .ps1 script?
  3. Can you provide links to the offline data? My download via the Python script always gets stuck at a percentage in the teens.
  4. Sorry if this is off topic: if I still need to be online for translation, why does "mt = dlt.TranslationModel()" load so many resources into RAM?

@xhluca
Owner

xhluca commented Jan 10, 2022

> Is there a default "virtual environment" in Windows 10, or must I create it?
> If I must create it, can you make a .bat or .ps1 script?

You must create it. I don't usually use Windows, so I'm not aware of the best practices, but it's something you should be able to find on Google.

> Can you provide links to the offline data? My download via the Python script always gets stuck at a percentage in the teens.

The instructions in the README are the only way I am aware of to download the weights and libraries needed to run this offline. You might need a more stable or faster internet connection - there are many GBs of files to download.

> Sorry if this is off topic: if I still need to be online for translation, why does "mt = dlt.TranslationModel()" load so many resources into RAM?

This uses quite a large multilingual model, which takes a lot of memory whether it's running on CPU or GPU. If you can't afford that much memory locally, you might want to look into external hosting services (Huggingface, GCP, AWS, etc.).
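
A rough, back-of-the-envelope way to see why the memory use is large is to multiply the parameter count by the bytes stored per parameter. The sketch below assumes a model with about 418 million parameters; that figure is illustrative and not taken from this thread, and the function name is hypothetical. Real usage will be higher, since activations, the tokenizer, and framework overhead are not counted.

```python
def estimate_weight_memory_gib(num_params, bytes_per_param=4):
    """Approximate memory needed just to hold the model weights, in GiB.

    bytes_per_param is 4 for float32 weights and 2 for float16.
    """
    return num_params * bytes_per_param / 1024**3


if __name__ == "__main__":
    assumed_params = 418_000_000  # illustrative assumption, not a measured value
    print(f"float32: {estimate_weight_memory_gib(assumed_params):.2f} GiB")
    print(f"float16: {estimate_weight_memory_gib(assumed_params, 2):.2f} GiB")
```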

@kucingkembar
Author

thank you again for the reply xhlulu,
I will stick to online mode.

Anyway, is there any chance that in the future you will load only 2 languages instead of all 50 at once?
I think it would cut the data that has to be loaded into RAM by 48/50.
Something like this:

import dl_translate as dlt
mt = dlt.TranslationModel(source=dlt.lang.HINDI, target=dlt.lang.ENGLISH)
text_hi = "संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है"
mt.translate(text_hi)
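
Note that the constructor arguments above are the proposed interface, not an existing one. For comparison, the dl-translate README passes the language pair to `translate` on each call, either as `dlt.lang` constants or as plain language names. A minimal sketch of that call pattern follows; the wrapper name `translate_with_pair` is hypothetical, and it takes an already-loaded model so that it can be tried without downloading any weights.

```python
def translate_with_pair(mt, text, source="Hindi", target="English"):
    """Translate text with an already-loaded model object.

    The language pair is chosen per call; the model object itself stays
    multilingual, so picking one pair here does not reduce what is loaded.
    """
    return mt.translate(text, source=source, target=target)


# Usage with the real library (downloads the model on first run):
#   import dl_translate as dlt
#   mt = dlt.TranslationModel()
#   translate_with_pair(mt, text_hi, source=dlt.lang.HINDI, target=dlt.lang.ENGLISH)
```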

@xhluca
Owner

xhluca commented Jan 11, 2022

I've thought about adding bilingual models from MarianNMT (there are a few hundred of them, each covering a pair of languages, and they are much smaller). However, this is already covered by easynmt, which was created at the same time as this library and now also uses huggingface in the backend. It also has a neat implementation for bilingual models, so I definitely recommend checking it out.

@kucingkembar
Author

thank you xhlulu for the reply,
I am very satisfied with the answer

Repository owner locked and limited conversation to collaborators on Jan 13, 2022
xhluca converted this issue into discussion #46 on Jan 13, 2022
