Problem when run bitextor using document aligner NMT #205

Closed
lumosnysm opened this issue Mar 17, 2021 · 6 comments

Comments

@lumosnysm

I ran Bitextor with a config just like the one in the documentation:

...
initCorpusTrainPrefix: ['/home/user/Europarl.en-fr.train']
initCorpusDevPrefix: ['/home/user/Europarl.en-fr.dev']
initCorpusTestPrefix: ['/home/user/Europarl.en-fr.test']

marianDir: /home/user/marian-dev
mosesDir: /home/user/mosesdecoder
subwordNmtDir: /home/user/subword-nmt

nmtVocabSize: 50000

LANG2Detokenizer: "/home/user/mosesdecoder/scripts/tokenizer/detokenizer.perl -l fr"

gpuId: 0

marianArgs: [" --optimizer-delay 1", "--mini-batch-fit", "--mini-batch 1000", "--maxi-batch 1000", "--overwrite", "--keep-best", "--valid-metrics perplexity", "--valid-log valid.log", "--log train.log", "--dropout-rnn 0.2", "--dropout-src 0.2", "--dropout-trg 0.2 ", "--cost-type ce-mean-words", "--layer-normalization", "--exponential-smoothing", "--tied-embeddings", "--valid-metrics bleu"]
...

but it throws this error:

KeyError in line 23 of /workspace/thviet/bitextor/snakemake/nmt/Snakefile:
'LANG1Tokenizer'
  File "/workspace/thviet/bitextor/snakemake/nmt/Snakefile", line 23, in <module>

Can you help me fix this? Thank you.
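
A note for anyone reading this traceback: the quoted name under the `KeyError` line is the key that was looked up and not found. The sketch below reproduces that behaviour with a plain Python dictionary. It assumes the Snakefile reads its settings by direct indexing (`config[...]`), which the thread does not show, and `describe_lookup` is a hypothetical helper used only for illustration.

```python
# Minimal sketch: how a missing config key surfaces as a KeyError.
# Assumption: settings are read by direct indexing, e.g. config["LANG1Tokenizer"].

def describe_lookup(config: dict, key: str) -> str:
    """Return the value stored under `key`, or a message naming the missing key."""
    try:
        return str(config[key])
    except KeyError as err:
        # str(err) is the repr of the missing key, e.g. "'LANG1Tokenizer'"
        return f"missing config key: {err}"


# Only LANG2Detokenizer is set here, mirroring the config excerpt above.
config = {
    "LANG2Detokenizer": "/home/user/mosesdecoder/scripts/tokenizer/detokenizer.perl -l fr",
}
print(describe_lookup(config, "LANG1Tokenizer"))
```

If that assumption holds, any key the Snakefile indexes but the config omits would fail the same way, one key per run.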

@lpla
Member

lpla commented Mar 17, 2021

Hi. Which version of Bitextor are you using?

@lumosnysm
Author

lumosnysm commented Mar 17, 2021

@lpla I am using the master branch of this repo.

lpla added a commit that referenced this issue Mar 18, 2021
Not tested
@lpla
Member

lpla commented Mar 18, 2021

It is a bug. Just pushed a workaround to master. Pull these changes and try to run Bitextor again. Let me know if this works or not, as I couldn't test it (I don't have a Marian environment with GPU available right now).

Thank you for reporting!

@lumosnysm
Author

lumosnysm commented Mar 19, 2021

@lpla Thanks for your quick support.
Do I need to set marianModelFile in the config file? I don't see it in the documentation.
I got this error:

KeyError in line 98 of /workspace/thviet/bitextor/snakemake/nmt/Snakefile:
'marianModelFile'
  File "/workspace/thviet/bitextor/snakemake/nmt/Snakefile", line 98, in <module>

@lpla
Member

lpla commented Mar 23, 2021

Yes. We forgot to add that one to the README.md. Thank you for reporting it!
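
Until the README lists every key, a pre-flight check can report all missing settings at once instead of one `KeyError` per run. This is only a sketch: the key list below contains just the two names that came up in this thread and is not a complete list of what the NMT Snakefile needs, and `find_missing_keys` is a hypothetical helper, not part of Bitextor.

```python
from typing import Iterable, List, Mapping

# Keys reported as missing in this thread; an assumption, not an exhaustive list.
KEYS_SEEN_IN_THREAD = ("LANG1Tokenizer", "marianModelFile")


def find_missing_keys(config: Mapping[str, object], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent from config, in the order given."""
    return [key for key in required if key not in config]


if __name__ == "__main__":
    # Stand-in for a loaded config; values taken from the excerpt earlier in the thread.
    config = {"gpuId": 0, "nmtVocabSize": 50000}
    missing = find_missing_keys(config, KEYS_SEEN_IN_THREAD)
    if missing:
        print("Missing config keys: " + ", ".join(missing))
```

The `config` mapping would come from however the YAML file is loaded; that step is left out here.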

@lpla
Member

lpla commented Mar 26, 2021

Hi. We decided to remove the MarianNMT training settings and code in our latest push to master, on the way to the Bitextor 8.0 release, as they are deprecated. If you want to keep using them, you will need to use the Bitextor 7.x code. Closing issue.

@lpla lpla closed this as completed Mar 26, 2021