How to use provided model #5

Open
marcelbra opened this issue Apr 20, 2020 · 22 comments

@marcelbra

If I understand correctly, you provide a ready-to-use model here? How do I use it? I am having trouble with this, as I'm quite new to torch models and fairseq.

Besides that, I ran the setup up to the fine-tuning step, but it takes forever even on Colab. So I figured there must be a quicker way. Thanks!

@yuyan2do
Member

Would it be helpful if we provided an example using Colab?

@marcelbra
Author

Yes, that would be amazing!!

@marcelbra
Author

I managed to run the inference script (however, evaluation is still throwing an error). You can find my workspace here to have a look. My question is: how can I now pass a new input to the model? I guess evaluation is not that important for now; I just want to see what the model's output looks like on my type of text. The --input parameter did not work. Thanks!

@yuyan2do
Member

I took a quick look. The error occurs during evaluation, which you said is not important for now.
The inference output should be in "cnndm/sort_hypo$SUFFIX.txt".

SUFFIX=_ck9_pelt1.2_test_beam5
BEAM=5
LENPEN=1.2
CHECK_POINT=cnndm/finetune_cnndm_checkpoints/checkpoint9.pt
OUTPUT_FILE=cnndm/output$SUFFIX.txt
SCORE_FILE=cnndm/score$SUFFIX.txt
INPUT=input/test.txt

fairseq-generate cnndm/processed --path $CHECK_POINT --user-dir prophetnet --task translation_prophetnet --batch-size 32 --gen-subset test --beam $BEAM --num-workers 4 --min-len 45 --max-len-b 110 --no-repeat-ngram-size 3 --lenpen $LENPEN 2>&1 > $OUTPUT_FILE

grep ^H $OUTPUT_FILE | cut -c 3- | sort -n | cut -f3- | sed "s/ ##//g" > cnndm/sort_hypo$SUFFIX.txt
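For reference, here is a rough Python equivalent of that grep/cut/sort/sed one-liner, in case the shell version is hard to follow. This is an untested sketch; it assumes fairseq-generate's default output format of H-<id><TAB><score><TAB><tokens>, and the file names simply reuse the $OUTPUT_FILE / sort_hypo$SUFFIX.txt values from the script above.

# Untested sketch: collect the hypothesis (H-*) lines from the fairseq-generate
# log, restore the original sample order, drop the score column, and undo the
# WordPiece " ##" splits, like the shell pipeline above.
hypos = []
with open("cnndm/output_ck9_pelt1.2_test_beam5.txt", encoding="utf-8") as f:
    for line in f:
        if line.startswith("H-"):
            sample_id, _score, tokens = line[2:].rstrip("\n").split("\t", 2)
            hypos.append((int(sample_id), tokens.replace(" ##", "")))

hypos.sort()  # sort numerically by sample id
with open("cnndm/sort_hypo_ck9_pelt1.2_test_beam5.txt", "w", encoding="utf-8") as f:
    for _, text in hypos:
        f.write(text + "\n")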

To avoid the error (FileNotFoundError: [Errno 2] No such file or directory: 'cnndm/original_data/test.summary'), remove the line below, which runs the evaluation and computes the ROUGE score:

python cnndm/eval/postprocess_cnn_dm.py --generated cnndm/sort_hypo$SUFFIX.txt --golden cnndm/original_data/test.summary > $SCORE_FILE

@marcelbra
Author

I managed to run inference on the command line using this fairly simple command:
fairseq-interactive cnndm/processed --path ../../checkpoint9.pt --user-dir prophetnet --task translation_prophetnet
It will prompt you to enter some text.
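In case anyone wants to pass a whole file of inputs instead of typing at the prompt, here is an untested sketch that pipes pre-tokenized lines into the same command via stdin; the file name my_tokenized_inputs.txt is just a placeholder, and each line in it is assumed to already be word-piece tokenized.

# Untested sketch: run fairseq-interactive non-interactively by feeding it a
# file of tokenized inputs on stdin and collecting its output.
import subprocess

cmd = [
    "fairseq-interactive", "cnndm/processed",
    "--path", "../../checkpoint9.pt",
    "--user-dir", "prophetnet",
    "--task", "translation_prophetnet",
]
with open("my_tokenized_inputs.txt", "rb") as f:  # placeholder file name
    result = subprocess.run(cmd, stdin=f, capture_output=True)

print(result.stdout.decode("utf-8"))  # the H-* lines contain the generated summaries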

I have two questions.

When creating the binary files, it replaced a large portion of the tokens with [UNK].

| [src] Dictionary: 30521 types
| [src] cnndm/prophetnet_tokenized/train.src: 287113 sents, 268357288 tokens, 16.9% replaced by [UNK]
| [src] Dictionary: 30521 types
| [src] cnndm/prophetnet_tokenized/valid.src: 13368 sents, 12065326 tokens, 16.3% replaced by [UNK]
| [src] Dictionary: 30521 types
| [src] cnndm/prophetnet_tokenized/test.src: 11490 sents, 10518620 tokens, 16.3% replaced by [UNK]
| [tgt] Dictionary: 30521 types
| [tgt] cnndm/prophetnet_tokenized/train.tgt: 287113 sents, 19659024 tokens, 16.2% replaced by [UNK]
| [tgt] Dictionary: 30521 types
| [tgt] cnndm/prophetnet_tokenized/valid.tgt: 13368 sents, 1019609 tokens, 16.9% replaced by [UNK]
| [tgt] Dictionary: 30521 types
| [tgt] cnndm/prophetnet_tokenized/test.tgt: 11490 sents, 833967 tokens, 16.8% replaced by [UNK]
| Wrote preprocessed data to cnndm/processed

Is that correct?

Also, when predicting, it seems like important words are being replaced by [UNK].

For example:

This paragraph

We investigate how perceived job riskiness and individual attitudes impact the vocational choice of business graduates. The hypotheses are tested with a sample of 182 similarly qualified students at two European business schools. Participants are randomly allocated to two conditions under which they receive a job-description that highlights job security or job risk. The findings indicate that risk negatively affects employer attractiveness and the inclination to apply. Besides that, the subjective person-job fit has a positive direct impact on employer attractiveness and the inclination to apply. Contrary to the expectations, risk had no significantly stronger effect on women.

is converted to

[UNK] investigate how perceived job [UNK] and individual attitudes impact the vocational choice of business [UNK] [UNK] [UNK] are tested with a sample of 182 similarly qualified students at two [UNK] business [UNK] [UNK] are randomly allocated to two conditions under which they receive a [UNK] that highlights job security or job [UNK] [UNK] findings indicate that risk negatively affects employer [UNK] and the inclination to [UNK] [UNK] [UNK] the subjective [UNK] fit has a positive direct impact on employer [UNK] and the inclination to [UNK] [UNK] to the [UNK] risk had no significantly stronger effect on [UNK]

yielding the hypothesis

students are randomly allocated to receive a [UNK] that highlights job security or job [UNK] . [X_SEP] the subjective fit has a positive direct impact on employer [UNK] . [X_SEP] the inclination to [UNK] [UNK] to the [UNK] risk had no significantly stronger effect on [UNK] .

This looks quite good; however, the frequency of these tokens seems weird. I guess it's related to the UniLM data, but I'm quite unsure how to proceed here.

@yuyan2do
Member

A preprocessing step was missed, which caused many tokens to be replaced by [UNK].

@qiweizhen Could you add a tutorial about "training/inference on own data"?

@marcelbra
Author

marcelbra commented Apr 23, 2020

Okay, I figured it out. In the UniLM data you linked there were only dev.src/dev.tgt. When creating the binaries, it threw an error that valid.src/valid.tgt were missing, so I renamed dev.src/dev.tgt to valid.src/valid.tgt, since those were the only files named differently. Was that correct?

I just checked preprocessing without applying any changes to the provided data. The high [UNK] percentage is still there, plus the error thrown because of the naming mismatch.

I might have missed running the Python script beforehand. I will try that later and report back.

@yuyan2do
Member

You need to run the script below to do the conversion, instead of renaming the files.
https://github.com/microsoft/ProphetNet/blob/master/src/cnndm/preprocess_cnn_dm.py

preocess('cnndm/original_data/dev.article', 'cnndm/prophetnet_tokenized/valid.src', keep_sep=False)
preocess('cnndm/original_data/dev.summary', 'cnndm/prophetnet_tokenized/valid.tgt', keep_sep=True)

@marcelbra
Author

Yes, I did that last night! Preprocessing now replaces 0.0%, so everything seems fine. But when inferring on many texts, there still seem to be replacements of (sometimes unusual, but sometimes quite common) words. Is that intentional?

@yuyan2do
Member


Could you give some examples for this?

@marcelbra
Author

marcelbra commented Apr 25, 2020

  1. Downloaded the data from here
  2. Used preprocess_cnn_dm.py to create train/test/valid .src/.tgt files
  3. Created the binaries using
     fairseq-preprocess \
       --user-dir ./prophetnet \
       --task translation_prophetnet \
       --source-lang src --target-lang tgt \
       --trainpref cnndm/prophetnet_tokenized/train --validpref cnndm/prophetnet_tokenized/valid --testpref cnndm/prophetnet_tokenized/test \
       --destdir cnndm/processed --srcdict ./vocab.txt --tgtdict ./vocab.txt \
       --workers 20
  4. Downloaded the checkpoint from here
  5. Ran the inference script
     fairseq-interactive cnndm/processed --path ../../checkpoint9.pt --user-dir prophetnet --task translation_prophetnet
  6. Two paragraphs I just tried:

I encountered this story—which is about Taylor Swift clones—when it won the Gulf Coast Barthelme Prize a couple of years ago. The judge was Steve Almond, who wrote, “I tried quite hard to resist choosing “Taylor Swift” as the winner of this year’s Barthelme Award. Why? Because all the stories I received were worthy and many were more technically ambitious when it came to language and form, by which I guess I mean experimental. . . . But what the hell. In the end, I just wanted to read this thing again and again.” Which is exactly right. Whatever you think of the actual Taylor Swift, this story is just plain fun.
S-0 [UNK] encountered this [UNK] is about [UNK] [UNK] [UNK] it won the [UNK] [UNK] [UNK] [UNK] a couple of years [UNK] [UNK] judge was [UNK] [UNK] who [UNK] [UNK] tried quite hard to resist choosing [UNK] [UNK] as the winner of this [UNK] [UNK] [UNK] [UNK] [UNK] all the stories [UNK] received were worthy and many were more technically ambitious when it came to language and [UNK] by which [UNK] guess [UNK] mean [UNK] . . . [UNK] what the [UNK] [UNK] the [UNK] [UNK] just wanted to read this thing again and [UNK] [UNK] is exactly [UNK] [UNK] you think of the actual [UNK] [UNK] this story is just plain [UNK]
H-0 -0.5018416047096252 the [UNK] [UNK] [UNK] [UNK] [UNK] won the [UNK] [UNK] [UNK] [UNK] a couple of years ago . [X_SEP] the [UNK] [UNK] [UNK] [UNK] judge was [UNK] [UNK] who tried quite hard to resist choosing [UNK] [UNK] as the winner .
P-0 -2.3202 -0.8823 -0.1841 -1.0270 -0.4661 -0.7258 -2.6165 -0.2648 -0.2291 -0.1468 -0.1247 -0.1055 -0.8802 -0.0800 -0.1379 -0.0884 -0.1111 -0.3555 -0.2556 -1.7888 -0.9244 -0.1134 -0.3385 -0.4988 -1.5235 -0.2080 -0.4371 -0.2541 -0.3722 -0.3488 -0.1160 -0.0461 -0.1520 -0.1284 -0.4877 -0.7384 -0.2469 -0.1625 -0.0919 -0.0594 -0.4204 -0.6187

Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context in the setting of language modeling. We propose a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence. It consists of a segment-level recurrence mechanism and a novel positional encoding scheme. Our method not only enables capturing longer-term dependency, but also resolves the context fragmentation problem.
S-0 [UNK] have a potential of learning [UNK] [UNK] but are limited by a [UNK] context in the setting of language [UNK] [UNK] propose a novel neural architecture [UNK] that enables learning dependency beyond a fixed length without [UNK] temporal [UNK] [UNK] consists of a [UNK] [UNK] mechanism and a novel [UNK] encoding [UNK] [UNK] method not only enables capturing [UNK] [UNK] but also [UNK] the context fragmentation [UNK]
H-0 -0.3602469563484192 a novel neural architecture [UNK] that enables learning dependency beyond a fixed length without [UNK] temporal [UNK] [UNK] consists of a [UNK] [UNK] mechanism and a novel [UNK] encoding [UNK] [UNK] method .
P-0 -2.6825 -1.4001 -0.0427 -0.0278 -0.4756 -0.5836 -0.1250 -0.0397 -0.1123 -0.0313 -0.0801 -0.0057 -0.0625 -0.0624 -0.2568 -0.1473 -0.6857 -0.4404 -0.5872 -0.1196 -0.1033 -0.3800 -0.2844 -0.3009 -0.1307 -0.1259 -0.2720 -0.3956 -0.1504 -0.3195 -0.3839 -0.0469 -0.2630 -1.1234

@qiweizhen
Contributor


It looks like you didn't tokenize the provided text into word pieces. Tokenizing whole words into word pieces is commonly used to alleviate out-of-vocabulary problems; you may refer to here.
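For anyone else hitting this, a minimal tokenization sketch is below. It is untested, and it assumes a BERT-style uncased WordPiece tokenizer (the "bert-base-uncased" vocabulary here is a guess, not necessarily the exact tokenizer preprocess_cnn_dm.py uses), so treat it as an illustration rather than the official preprocessing.

# Untested sketch: turn raw text into word pieces before feeding it to
# fairseq-interactive, so most words hit the vocabulary instead of [UNK].
# Assumption: a BERT-style uncased WordPiece tokenizer approximates the
# repo's preprocessing; the exact vocab may differ.
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

raw = "Transformers have a potential of learning longer-term dependency, but are limited by a fixed-length context."
pieces = tokenizer.tokenize(raw)  # e.g. ['transformers', 'have', 'a', ..., 'fixed', '-', 'length', 'context', '.']
print(" ".join(pieces))           # paste this line into the fairseq-interactive prompt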

@marcelbra
Author

@qiweizhen as you said, after

  1. Downloaded the data from here
  2. Used preprocess_cnn_dm.py to create train/test/valid .src/.tgt files

I ran this and the output is 100% identical.

It replaces words, but I feel like the summarization looks OK; another paragraph is shown below.

I'm noticing 2 types of replacements:

  1. Beginning of sentence
  2. Any type of punctuation

My wild guess: word+punctuation is observed -> not found in the dictionary -> replaced. For the beginning-of-sentence case I have no idea. (A quick check for this is sketched after the example paragraph below.)

Merkel was educated at Karl Marx University, Leipzig, where she studied physics from 1973 to 1978. While a student, she participated in the reconstruction of the ruin of the Moritzbastei, a project students initiated to create their own club and recreation facility on campus. Such an initiative was unprecedented in the GDR of that period, and initially resisted by the University; however, with backing of the local leadership of the SED party, the project was allowed to proceed. At school she learned to speak Russian fluently, and was awarded prizes for her proficiency in Russian and mathematics. She was the best in her class in mathematics and Russian, and completed her school education with the best possible average Abitur grade 1.0.
S-0 [UNK] was educated at [UNK] [UNK] [UNK] [UNK] where she studied physics from 1973 to [UNK] [UNK] a [UNK] she participated in the reconstruction of the ruin of the [UNK] a project students initiated to create their own club and recreation facility on [UNK] [UNK] an initiative was unprecedented in the [UNK] of that [UNK] and initially resisted by the [UNK] [UNK] with backing of the local leadership of the [UNK] [UNK] the project was allowed to [UNK] [UNK] school she learned to speak [UNK] [UNK] and was awarded prizes for her proficiency in [UNK] and [UNK] [UNK] was the best in her class in mathematics and [UNK] and completed her school education with the best possible average [UNK] grade [UNK]
H-0 -0.604485273361206 she studied physics from 1973 to [UNK] [UNK] a [UNK] . [X_SEP] she was the best in her class in mathematics and [UNK] .
P-0 -1.8327 -1.4006 -0.3280 -1.2938 -0.0892 -0.1472 -0.9993 -0.6959 -0.1060 -1.2664 -0.9566 -0.1618 -0.6522 -1.1117 -1.3683 -0.1477 -0.1564 -0.0790 -0.0344 -0.1136 -0.4401 -0.1032 -0.7496 -0.2032 -0.6752
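A quick way to test the guess above (untested sketch): load the same dictionary file that was passed to fairseq-preprocess as --srcdict/--tgtdict and list the surface tokens of a line that would map to <unk>. This assumes vocab.txt is in fairseq's dictionary format ("<token> <count>" per line), which --srcdict requires.

# Untested sketch: find which whitespace tokens of an untokenized line are
# out-of-vocabulary for the fairseq dictionary built from vocab.txt.
from fairseq.data import Dictionary

d = Dictionary.load("vocab.txt")
line = "Merkel was educated at Karl Marx University, Leipzig, where she studied physics from 1973 to 1978."
oov = [tok for tok in line.split() if d.index(tok) == d.unk()]
print(oov)  # capitalized words and word+punctuation forms like "University," are expected here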

@qiweizhen
Contributor


Hi, I tried your input sentence, whose tokenized text should be like this:
[screenshot of the tokenized word pieces]

@cpipi

cpipi commented May 14, 2020

Hi, has anyone faced a problem with numpy while generating the binary training files?
I have:

  • Windows 10
  • Python 3.7
  • Pip 20.0.2
  • numpy 1.18.4
  • torch 1.5.0

I am running this script, as shown here:

fairseq-preprocess \
  --user-dir prophetnet \
  --task translation_prophetnet \
  --source-lang src --target-lang tgt \
  --trainpref gigaword/prophetnet_tokenized/train --validpref gigaword/prophetnet_tokenized/dev --testpref gigaword/prophetnet_tokenized/test \
  --destdir gigaword/processed --srcdict vocab.txt --tgtdict vocab.txt \
  --workers 20

While generating the files I get warnings if I use numpy == 1.18 and errors if I use 1.17.

With numpy 1.18, execution never ends and only shows warnings:

(virtenv) C:\..\..\..\prophetnet\src>bash binary.sh
c:\..\..\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:\..\..\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:\..\..\anaconda3\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
  stacklevel=1)
Namespace(align_suffix=None, alignfile=None, bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='gigaword/processed', empty_cache_freq=0, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=False, optimizer='nag', padding_factor=8, seed=1, source_lang='src', srcdict='vocab.txt', target_lang='tgt', task='translation_prophetnet', tensorboard_logdir='', testpref='gigaword/prophetnet_tokenized/test', tgtdict='vocab.txt', threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenizer=None, trainpref='gigaword/prophetnet_tokenized/train', user_dir='prophetnet', validpref='gigaword/prophetnet_tokenized/dev', workers=20)
| [src] Dictionary: 30521 types
c:\..\..\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:\..\..\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:\..\..\anaconda3\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
  stacklevel=1)
OMP: Error #110: Memory allocation failed.
c:\users\\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:\users\\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:\users\\anaconda3\lib\site-packages\numpy\.libs\libopenblas.PYQHXLVVQ7VESDPUVUADXEVJOBGHJPAY.gfortran-win_amd64.dll
  stacklevel=1)
It gets stuck there until I interrupt it from the keyboard.
It did write some files to the gigaword/processed directory, but as far as I can tell, not all of them.

When I try numpy 1.17, it explicitly shows errors:

(virtenv) C:......\prophetnet\src>bash binary.sh
c:....\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:....\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:....\anaconda3\lib\site-packages\numpy\.libs\libopenblas.TXA6YQSD3GCQQC22GEQ54J2UDCXDXHWN.gfortran-win_amd64.dll
  stacklevel=1)
Namespace(align_suffix=None, alignfile=None, bpe=None, cpu=False, criterion='cross_entropy', dataset_impl='mmap', destdir='gigaword/processed', empty_cache_freq=0, fp16=False, fp16_init_scale=128, fp16_scale_tolerance=0.0, fp16_scale_window=None, joined_dictionary=False, log_format=None, log_interval=1000, lr_scheduler='fixed', memory_efficient_fp16=False, min_loss_scale=0.0001, no_progress_bar=False, nwordssrc=-1, nwordstgt=-1, only_source=False, optimizer='nag', padding_factor=8, seed=1, source_lang='src', srcdict='vocab.txt', target_lang='tgt', task='translation_prophetnet', tensorboard_logdir='', testpref='gigaword/prophetnet_tokenized/test', tgtdict='vocab.txt', threshold_loss_scale=None, thresholdsrc=0, thresholdtgt=0, tokenizer=None, trainpref='gigaword/prophetnet_tokenized/train', user_dir='prophetnet', validpref='gigaword/prophetnet_tokenized/dev', workers=20)
| [src] Dictionary: 30521 types
c:....\anaconda3\lib\site-packages\numpy\_distributor_init.py:32: UserWarning: loaded more than 1 DLL from .libs:
c:....\anaconda3\lib\site-packages\numpy\.libs\libopenblas.IPBC74C7KURV7CB2PKT5Z5FNR3SIBV4J.gfortran-win_amd64.dll
c:....\anaconda3\lib\site-packages\numpy\.libs\libopenblas.TXA6YQSD3GCQQC22GEQ54J2UDCXDXHWN.gfortran-win_amd64.dll
  stacklevel=1)
Process SpawnPoolWorker-11:
Traceback (most recent call last):
  File "c:\users\anaconda3\lib\multiprocessing\process.py", line 297, in _bootstrap
    self.run()
  File "c:\users\anaconda3\lib\multiprocessing\process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "c:\users\anaconda3\lib\multiprocessing\pool.py", line 110, in worker
    task = get()
  File "c:\users\anaconda3\lib\multiprocessing\queues.py", line 354, in get
    return ForkingPickler.loads(res)
  File "c:\users\anaconda3\lib\site-packages\fairseq_cli\preprocess.py", line 13, in <module>
    from fairseq import options, tasks, utils
  File "c:\users\anaconda3\lib\site-packages\fairseq\__init__.py", line 9, in <module>
    import fairseq.criterions  # noqa
  File "c:\users\anaconda3\lib\site-packages\fairseq\criterions\__init__.py", line 10, in <module>
    from fairseq.criterions.fairseq_criterion import FairseqCriterion
  File "c:\users\anaconda3\lib\site-packages\fairseq\criterions\fairseq_criterion.py", line 6, in <module>
    from torch.nn.modules.loss import _Loss
  File "c:\users\anaconda3\lib\site-packages\torch\__init__.py", line 136, in <module>
    from torch._C import *
ImportError: numpy.core.multiarray failed to import

I tried to go further with the first case, but I got errors again at inference time, and I think they might be caused by this step. Also, I need numpy 1.17 to run another package.

If you have questions, please ask. Any help would be appreciated!

@gireek

gireek commented Jun 8, 2020

@yuyan2do Can we get the Colab you mentioned earlier? It would be extremely helpful for PyTorch newbies exploring ProphetNet. Thanks!

@yuyan2do
Member

@gireek Creating a Colab tutorial is in our backlog. We will prioritize this work if more people ask for it.

@gouldju1

I think a Colab tutorial would be really valuable. There were a few unknowns, unanswered questions, and hurdles I had to work through to get everything running.

@anish-newzera

I also think a Colab tutorial would be really helpful for using the model, as the exact steps that need to be performed are slightly unclear.

@alexgaskell10

I would also like a Colab tutorial please!

@ryzhik22

ryzhik22 commented Jul 6, 2020

Is there any news about the Colab tutorial? It would be really helpful! =)

@bertagknowles

I too am overwhelmed by so many scripts... a working Colab notebook with the scripts in the correct order would make the task very easy to follow. About six users have already asked for this. Thanks for prioritizing it :)
