Error with pre-trained word embeddings #3

michellegiang · 2018-03-15T15:05:07Z

Hi,

When I run the test with your pre-trained word embeddings using the command below:
./run.sh "/home/michelle/mlc/mlconvgec2018/data/test/conll14st-test/conll14st-test.tok.src" "/home/michelle/mlc/test" 2 "/home/michelle/mlc/mlconvgec2018/models/mlconv_embed"

I get the error below. Could you please let me know how to solve it? Also, how can I get the M2 score instead of the BLEU score?

(michelle) michelle@k:~/mlc/mlconvgec2018$ ./run.sh "/home/michelle/mlc/mlconvgec2018/data/test/conll14st-test/conll14st-test.tok.src" "/home/michelle/mlc/test" 2 "/home/michelle/mlc/mlconvgec2018/models/mlconv_embed"
++ source paths.sh
+++++ dirname paths.sh
++++ cd .
++++ pwd
+++ BASE_DIR=/home/michelle/mlc/mlconvgec2018
+++ DATA_DIR=/home/michelle/mlc/mlconvgec2018/data
+++ MODEL_DIR=/home/michelle/mlc/mlconvgec2018/models
+++ SCRIPTS_DIR=/home/michelle/mlc/mlconvgec2018/scripts
+++ SOFTWARE_DIR=/home/michelle/mlc/mlconvgec2018/software
++ '[' 4 -ge 4 ']'
++ input_file=/home/michelle/mlc/mlconvgec2018/data/test/conll14st-test/conll14st-test.tok.src
++ output_dir=/home/michelle/mlc/test
++ device=2
++ model_path=/home/michelle/mlc/mlconvgec2018/models/mlconv_embed
++ '[' 4 -eq 6 ']'
++ '[' -d /home/michelle/mlc/mlconvgec2018/models/mlconv_embed ']'
+++ ls /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model1.pt /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model2.pt /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model3.pt /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model4.pt
+++ tr '\n' ' '
+++ sed 's| \([^$]\)| --path \1|g'
++ models='/home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model1.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model2.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model3.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model4.pt '
++ echo /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model1.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model2.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model3.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model4.pt
/home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model1.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model2.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model3.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model4.pt
++ FAIRSEQPY=/home/michelle/mlc/mlconvgec2018/software/fairseq-py
++ NBEST_RERANKER=/home/michelle/mlc/mlconvgec2018/software/nbest-reranker
++ beam=12
++ nbest=12
++ threads=12
++ mkdir -p /home/michelle/mlc/test
++ /home/michelle/mlc/mlconvgec2018/scripts/apply_bpe.py -c /home/michelle/mlc/mlconvgec2018/models/bpe_model/train.bpe.model
++ CUDA_VISIBLE_DEVICES=2
++ python3.6 /home/michelle/mlc/mlconvgec2018/software/fairseq-py/generate.py --no-progress-bar --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model1.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model2.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model3.pt --path /home/michelle/mlc/mlconvgec2018/models/mlconv_embed/model4.pt --beam 12 --nbest 12 --interactive --workers 12 /home/michelle/mlc/mlconvgec2018/models/data_bin
Traceback (most recent call last):
File "/home/michelle/mlc/mlconvgec2018/software/fairseq-py/generate.py", line 167, in <module>
main()
File "/home/michelle/mlc/mlconvgec2018/software/fairseq-py/generate.py", line 41, in main
models, dataset = utils.load_ensemble_for_inference(args.path, args.data)
File "/home/michelle/mlc/mlconvgec2018/software/fairseq-py/fairseq/utils.py", line 127, in load_ensemble_for_inference
model = build_model(args, dataset)
File "/home/michelle/mlc/mlconvgec2018/software/fairseq-py/fairseq/utils.py", line 31, in build_model
return getattr(models, args.model).build_model(args, dataset)
File "/home/michelle/mlc/mlconvgec2018/software/fairseq-py/fairseq/models/fconv.py", line 541, in build_model
dictionary=dataset.src_dict
File "/home/michelle/mlc/mlconvgec2018/software/fairseq-py/fairseq/models/fconv.py", line 100, in __init__
self.embed_tokens = load_embeddings(embed_path, dictionary, self.embed_tokens)
File "/home/michelle/mlc/mlconvgec2018/software/fairseq-py/fairseq/models/fconv.py", line 22, in load_embeddings
with open(embed_path) as f_embed:
FileNotFoundError: [Errno 2] No such file or directory: '/home.local/shamil/wiki/wiki.bpe.fasttext/model.vec'

michellegiang · 2018-03-15T15:18:19Z

Hi,

I have also posted another issue, linked below, about an error I get when training with my own data.

https://github.com/facebookresearch/fairseq-py/issues/129

shamilcm · 2018-03-15T15:21:13Z

Hi, this issue seems to be with our fork of fairseq-py, so you can close the issue you opened at facebookresearch/fairseq-py#129 and re-post it here.

shamilcm · 2018-03-15T16:50:02Z

The issue was due to some hardcoded paths in arguments.
It is now fixed here: shamilcm/fairseq-py@ceb2f12
Can you retry with this?
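For readers hitting the same FileNotFoundError: the traceback shows the model being rebuilt from arguments that still point to an embedding file on the training machine. The snippet below is only a rough sketch of one way to neutralize such stale paths before building a model for inference. It is not the code from the commit above. The attribute names `encoder_embed_path` and `decoder_embed_path` and the helper `drop_missing_paths` are hypothetical, and it assumes the trained embedding weights are restored from the checkpoint afterwards.

```python
import os
from argparse import Namespace


def drop_missing_paths(args, path_attrs=("encoder_embed_path", "decoder_embed_path")):
    """Return a copy of `args` with path attributes cleared when the file is absent.

    The attribute names are hypothetical placeholders, not confirmed names
    from the fairseq-py fork.
    """
    cleaned = Namespace(**vars(args))
    for name in path_attrs:
        path = getattr(cleaned, name, None)
        if path and not os.path.exists(path):
            # Assumption: the embedding file only initializes training, and the
            # trained weights are loaded from the checkpoint at inference time.
            setattr(cleaned, name, None)
    return cleaned


if __name__ == "__main__":
    saved = Namespace(encoder_embed_path="/no/such/dir/model.vec", beam=12)
    print(drop_missing_paths(saved))
```

The function works on a copy, so the arguments stored with the checkpoint are left unchanged.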

Regarding the training issue, can you close it at facebookresearch/fairseq-py#129 and open a new issue here? I will take a look at it.

Also, what data was it trained on? Is it a very small training set, with fewer than 30000 words in the vocabulary?

michellegiang · 2018-03-15T19:43:32Z

Hi, the training data is the Lang-8 Learner Corpus of English v1.0 and NUCLE.

michellegiang · 2018-03-15T19:46:28Z

Hi Shamil,

So I need to delete software/fairseq-py, download the new version, and reinstall it, right?

Regards,
Viet Anh

shamilcm · 2018-03-16T00:53:56Z

If you installed fairseq-py using setup.py, pull the new changes and run it again. Otherwise, you just need to pull the changes. The change touches only one file: fairseq/utils.py

michellegiang · 2018-03-16T02:01:35Z

Thanks, Shamil. Since I have already installed fairseq-py, could I just copy your new utils.py over the old one?

The reason is that I am using your version of fairseq-py with a newer version of PyTorch. I had some trouble with setup.py build and had to apply some manual patches. (The PyTorch version in your original README has some problems, so I need to use the newest version of PyTorch.)

https://github.com/facebookresearch/fairseq-py/issues/120

shamilcm · 2018-03-16T02:15:35Z

If you installed with python setup.py develop, just getting the new utils.py should work.
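Replacing the file in place works in development mode because `setup.py develop` links the source checkout into the environment instead of copying it. A quick way to confirm which copy Python will actually import is sketched below. The helper name `module_origin` is made up for illustration, and the snippet only reports a location without importing the package.

```python
import importlib.util


def module_origin(name):
    """Return the file a top-level module would be imported from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # If this prints a path inside software/fairseq-py, edits to utils.py in
    # that checkout take effect without reinstalling.
    print(module_origin("fairseq"))
```

If the printed path points into site-packages rather than the checkout, the package was copied at install time and needs to be reinstalled after pulling.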

shamilcm · 2018-03-16T03:15:10Z

Closing this issue as it has been resolved.

shamilcm closed this as completed Mar 16, 2018

Ibuki1129 mentioned this issue Sep 23, 2019

ImportError: cannot import name 'libbleu' #28

Closed
