python scripts/evaluate.py #10

Closed
qiunlp opened this issue Mar 15, 2020 · 21 comments

Comments

@qiunlp

qiunlp commented Mar 15, 2020

Traceback (most recent call last):
  File "/home/qwh/acc/scripts/evaluate.py", line 32, in <module>
    evaluate_simplifier_on_turkcorpus(simplifier, phase='test')
  File "/home/qwh/acc/access/evaluation/general.py", line 30, in evaluate_simplifier_on_turkcorpus
    quality_estimation=True)
  File "/home/qwh/acc/easse/cli.py", line 115, in evaluate_system_output
    orig_sents, refs_sents = get_orig_and_refs_sents(test_set, orig_sents_path, refs_sents_paths)
  File "/home/qwhacc/easse/cli.py", line 39, in get_orig_and_refs_sents
    orig_sents = get_orig_sents(test_set)
  File "/home/qwh/acc/easse/utils/resources.py", line 76, in get_orig_sents
    return read_lines(TEST_SETS_PATHS[(test_set, 'orig')])
KeyError: ('turk', 'orig')

Process finished with exit code 1

@qiunlp qiunlp closed this as completed Mar 16, 2020
@qiunlp qiunlp changed the title from "Not Found for url: https://pypi.org/simple/easse/" to "python scripts/evaluate.py" Mar 16, 2020
@qiunlp qiunlp reopened this Mar 16, 2020
@louismartin
Contributor

louismartin commented Mar 16, 2020

Please add a more detailed description of your issue, and describe the steps that you already tried to fix the problem.
"Please" or "Thank you" won't hurt either.

@louismartin
Contributor

Your version of easse is most likely not up to date.

@qiunlp
Author

qiunlp commented Mar 17, 2020

Sorry, and thank you for your help and your patience. Here are the steps I followed:

1. git clone git@github.com:facebookresearch/access.git
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights and the repository exists.

2. Downloaded the zip instead, unzipped it, then: cd access

3. pip install -e .
Collecting easse@ git+git://github.com/feralvam/easse.git@5dce4474a72baa5a16e3764f1ed4225a1751dbf2 (from access==0.1)
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://pypi.org/simple/easse/

4. From https://github.com/feralvam/easse:
downloaded the zip, unzipped it, cd easse, then copied the directory "easse" into "access"

5. From https://github.com/facebookresearch/text-simplification-evaluation:
downloaded the zip, unzipped it, cd text-simplification, then copied the directory "tseval" into "access"

6. python scripts/evaluate.py
Evaluating pretrained model
Downloading...
... 100% - 622 MB - 1.33 MB/s - 468s
Extracting...
Downloading...
... 100% - 623 MB - 1.36 MB/s - 459s
Extracting...
Traceback (most recent call last):
  File "scripts/evaluate.py", line 28, in <module>
    evaluate_simplifier_on_turkcorpus(simplifier, phase='test')
  File "/home/qwh/桌面/access/access/evaluation/general.py", line 32, in evaluate_simplifier_on_turkcorpus
    quality_estimation=True)
  File "/home/qwh/桌面/access/easse/cli.py", line 115, in evaluate_system_output
    orig_sents, refs_sents = get_orig_and_refs_sents(test_set, orig_sents_path, refs_sents_paths)
  File "/home/qwh/桌面/access/easse/cli.py", line 39, in get_orig_and_refs_sents
    orig_sents = get_orig_sents(test_set)
  File "/home/qwh/桌面/access/easse/utils/resources.py", line 76, in get_orig_sents
    return read_lines(TEST_SETS_PATHS[(test_set, 'orig')])
KeyError: ('turk', 'orig')

@louismartin
Contributor

louismartin commented Mar 17, 2020

OK, the actual problem is that the version of EASSE you are using is too recent and introduced breaking changes (we'll fix this soon).
In the meantime, please install the version of EASSE that was used when ACCESS was released:
pip install git+git://github.com/feralvam/easse.git@5dce4474a72baa5a16e3764f1ed4225a1751dbf2
Make sure you have the latest pip installed: pip install pip --upgrade
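
For readers wondering why a version mismatch surfaces as a KeyError: the tracebacks above show a dictionary keyed on (test_set, 'orig') tuples, so a test-set name that one version does not know fails at lookup time. Below is a toy sketch of that failure mode with a friendlier message. The table contents, the paths, and the get_orig_path helper are made up for illustration and are not EASSE's actual code.

```python
# Hypothetical stand-in for a (test_set, kind) -> path table.
# The entry and its path are invented for this sketch; they are not EASSE's real data.
TEST_SETS_PATHS = {
    ("turkcorpus_test_legacy", "orig"): "resources/turkcorpus/legacy/test.orig",
}


def get_orig_path(test_set, paths=TEST_SETS_PATHS):
    """Look up the 'orig' file for a test set, listing known names on failure."""
    key = (test_set, "orig")
    try:
        return paths[key]
    except KeyError:
        known = sorted({name for name, _kind in paths})
        raise KeyError(
            f"unknown test set {test_set!r}; known test sets: {known}"
        ) from None


if __name__ == "__main__":
    try:
        # A caller written against a different version passes a name the table lacks.
        get_orig_path("turk")
    except KeyError as error:
        print("lookup failed:", error)
```

When the caller and the library disagree on the name, pinning both to versions that were released together is usually simpler than patching the lookup.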

@qiunlp
Author

qiunlp commented Mar 17, 2020

pip install git+git://github.com/feralvam/easse.git@5dce4474a72baa5a16e3764f1ed4225a1751dbf2
........
Successfully built easse
ERROR: access 0.1 has requirement nltk==3.4.5, but you'll have nltk 3.4.3 which is incompatible.
Installing collected packages: preshed, blis, plac, thinc
Attempting uninstall: preshed
Found existing installation: preshed 3.0.2
Uninstalling preshed-3.0.2:
Successfully uninstalled preshed-3.0.2
Rolling back uninstall of preshed
Moving to /home/qwh/.local/lib/python3.6/site-packages/preshed-3.0.2.dist-info/
from /home/qwh/.local/lib/python3.6/site-packages/~reshed-3.0.2.dist-info
Moving to /home/qwh/.local/lib/python3.6/site-packages/preshed/
from /home/qwh/.local/lib/python3.6/site-packages/~reshed
ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/preshed/__init__.pxd'
Consider using the --user option or check the permissions.

Before running this command, I had uninstalled nltk.
Thank you!

@louismartin
Contributor

I think the problem is not due to NLTK but rather to a permission issue (see the last part of the output):

ERROR: Could not install packages due to an EnvironmentError: [Errno 13] Permission denied: '/usr/local/lib/python3.6/dist-packages/preshed/__init__.pxd'
Consider using the --user option or check the permissions.
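
A quick way to confirm that kind of permission problem before reinstalling is to check whether the system-wide install directory is writable by the current user. This is a minimal sketch using only the standard library; is_writable and suggested_pip_flags are hypothetical helper names, and the assumption is that an unwritable site-packages directory is what triggers the [Errno 13] error above.

```python
import os
import site


def is_writable(directory):
    """Return True if the current user can create files inside `directory`."""
    return os.path.isdir(directory) and os.access(directory, os.W_OK | os.X_OK)


def suggested_pip_flags(directory):
    """Suggest '--user' when the target directory cannot be written to."""
    return [] if is_writable(directory) else ["--user"]


if __name__ == "__main__":
    # site.getsitepackages() may be missing in some old virtualenv setups,
    # so fall back to an empty list instead of failing.
    get_dirs = getattr(site, "getsitepackages", lambda: [])
    for directory in get_dirs():
        flags = suggested_pip_flags(directory)
        print(directory, "->", "pip install " + " ".join(flags + ["<package>"]))
```

Installing inside a virtual environment avoids the question entirely, since the environment's own site-packages directory belongs to the user.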

@louismartin
Contributor

I fixed the problem in ACCESS and EASSE. If you install them again from the latest GitHub version, it should work fine.
Thanks for bringing that up.

@qiunlp
Author

qiunlp commented Mar 17, 2020

/usr/bin/python3.6 /home/qwh/桌面/access/scripts/evaluate.py
[nltk_data] Downloading package stopwords to /home/qwh/nltk_data...
[nltk_data] Package stopwords is already up-to-date!
[nltk_data] Downloading package perluniprops to /home/qwh/nltk_data...
[nltk_data] Package perluniprops is already up-to-date!
[nltk_data] Downloading package punkt to /home/qwh/nltk_data...
[nltk_data] Unzipping tokenizers/punkt.zip.
Evaluating pretrained model
BLEU: 76.08
SARI: 41.87
FKGL: 7.22
Quality estimation: {'Compression ratio': 0.94, 'Sentence splits': 1.2, 'Levenshtein similarity': 0.87, 'Exact matches': 0.04, 'Additions proportion': 0.16, 'Deletions proportion': 0.17, 'Lexical complexity score': 7.93}

Process finished with exit code 0

SUCCESS!!THANK YOU VERY MUCH!

@louismartin
Contributor

Great, closing the issue :)

@Vinay-Chellwani

Hi @louismartin, I tried exploring the project on Google Colab and I am facing the same problem.
Am I missing something?
Traceback (most recent call last):
  File "/content/access/scripts/evaluate.py", line 27, in <module>
    print(evaluate_simplifier_on_turkcorpus(simplifier, phase='test'))
  File "/content/access/access/evaluation/general.py", line 31, in evaluate_simplifier_on_turkcorpus
    quality_estimation=True)
  File "/usr/local/lib/python3.6/dist-packages/easse/cli.py", line 105, in evaluate_system_output
    orig_sents, sys_sents, refs_sents = get_sents(test_set, orig_sents_path, sys_sents_path, refs_sents_paths)
  File "/usr/local/lib/python3.6/dist-packages/easse/cli.py", line 28, in get_sents
    orig_sents_path = TEST_SETS_PATHS[(test_set, 'orig')]
KeyError: ('turkcorpus_test_legacy', 'orig')

@louismartin
Contributor

Hi, can you run !pip freeze | grep easse please?
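
For anyone who prefers to check versions from inside the notebook rather than through the shell, here is a rough Python equivalent of that pip freeze | grep check. It is a sketch that assumes Python 3.8 or newer, where importlib.metadata is in the standard library (on the Python 3.6 shown in this thread, the importlib_metadata backport would be needed). installed_versions is a hypothetical helper name.

```python
from importlib import metadata  # standard library from Python 3.8 onward


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # The three packages whose versions come up in this thread.
    for name, version in installed_versions(["easse", "fairseq", "sacrebleu"]).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```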

@Vinay-Chellwani

fairseq==0.6.2

@louismartin
Contributor

I'm sorry, I would like to check the easse version; fairseq does not seem to be the problem here.

@Vinay-Chellwani

  1. !pip install -e /content/access/

  2. !pip install --force-reinstall easse@git+git://github.com/feralvam/easse.git@5dce4474a72baa5a16e3764f1ed4225a1751dbf2

  3. !pip install --force-reinstall fairseq@git+https://github.com/louismartin/fairseq.git@controllable-sentence-simplification

  4. import nltk
     nltk.download('all')

  5. !pip install nevergrad==0.2.3
     from nevergrad.instrumentation import var

  6. !pip install git+git://github.com/facebookresearch/text-simplification-evaluation.git

  7. !python /content/access/scripts/evaluate.py

  8. !pip freeze | grep fairseq

This is the sequence of commands I ran, in case it helps with debugging.

@Vinay-Chellwani

"I'm sorry, I would like to check the easse version, fairseq does not seem to be the problem here"

easse==0.1

@louismartin
Contributor

louismartin commented Apr 24, 2020

Ok thanks a lot.
I think I put the wrong version of easse on the README, my mistake.
Can you please try to run:
pip install --force-reinstall easse@git+git://github.com/feralvam/easse.git@580ec953e4742c3ae806cc85d867c16e9f584505
and try again?

@Vinay-Chellwani

Hi @louismartin
I used the command above but still got an error:

Traceback (most recent call last):
  File "/content/access/scripts/evaluate.py", line 27, in <module>
    print(evaluate_simplifier_on_turkcorpus(simplifier, phase='test'))
  File "/content/access/access/evaluation/general.py", line 31, in evaluate_simplifier_on_turkcorpus
    quality_estimation=True)
  File "/usr/local/lib/python3.6/dist-packages/easse/cli.py", line 130, in evaluate_system_output
    lowercase=lowercase)
  File "/usr/local/lib/python3.6/dist-packages/easse/bleu.py", line 22, in corpus_bleu
    sys_sents = [utils_prep.normalize(sent, lowercase, tokenizer) for sent in sys_sents]
  File "/usr/local/lib/python3.6/dist-packages/easse/bleu.py", line 22, in <listcomp>
    sys_sents = [utils_prep.normalize(sent, lowercase, tokenizer) for sent in sys_sents]
  File "/usr/local/lib/python3.6/dist-packages/easse/utils/preprocessing.py", line 12, in normalize
    normalized_sent = sacrebleu.tokenize_13a(sentence)
AttributeError: module 'sacrebleu' has no attribute 'tokenize_13a'

Also,
easse==0.2.1
fairseq==0.6.2

@Vinay-Chellwani

I also checked the sacrebleu version:
sacrebleu==1.4.7

@louismartin
Contributor

Ok thanks, I think that's a different problem.
It seems that the sacrebleu package was reorganized recently.
Can you try again with pip install sacrebleu==1.4.5?
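
To tell this kind of breakage apart from the earlier one, it can help to check whether the installed module still exposes the name the caller expects. The sketch below does that with the standard library only. missing_attributes is a hypothetical helper, and the only thing taken from this thread is that easse calls sacrebleu.tokenize_13a and that sacrebleu 1.4.7 no longer has it.

```python
import importlib


def missing_attributes(module_name, attribute_names):
    """Return the names from `attribute_names` that the imported module lacks."""
    module = importlib.import_module(module_name)
    return [name for name in attribute_names if not hasattr(module, name)]


if __name__ == "__main__":
    try:
        missing = missing_attributes("sacrebleu", ["tokenize_13a"])
    except ImportError:
        print("sacrebleu is not installed")
    else:
        if missing:
            # Per the thread, pinning sacrebleu==1.4.5 is the suggested workaround.
            print("sacrebleu lacks:", ", ".join(missing))
        else:
            print("sacrebleu exposes tokenize_13a")
```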

@Vinay-Chellwani

This worked well!!! Thank you so much for your help <3

Got the result:
Evaluating pretrained model
{'bleu': 76.07533495738832, 'sari': 41.24344083480672, 'sari_legacy': 41.866226081519535, 'fkgl': 7.224963716884172, 'quality_estimation': {'Compression ratio': 0.9402640450938302, 'Sentence splits': 1.2000928505106778, 'Levenshtein similarity': 0.86603988608316, 'Exact copies': 0.03899721448467967, 'Additions proportion': 0.15796500942038566, 'Deletions proportion': 0.16669992308336373, 'Lexical complexity score': 7.925871582450909}}
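
As a sanity check, the top-level scores can be rounded and compared with the run reported earlier in this thread (BLEU 76.08, SARI 41.87, FKGL 7.22). The sketch below copies the four top-level numbers from the output above; summarize is a hypothetical helper. The rounded sari_legacy value, not sari, is the one that matches the earlier SARI figure, which is only an observation from the numbers and not a statement about how either score is computed.

```python
# Top-level scores copied from the output above (quality_estimation omitted).
results = {
    "bleu": 76.07533495738832,
    "sari": 41.24344083480672,
    "sari_legacy": 41.866226081519535,
    "fkgl": 7.224963716884172,
}


def summarize(scores, ndigits=2):
    """Round every float score to `ndigits` decimal places."""
    return {
        name: round(value, ndigits)
        for name, value in scores.items()
        if isinstance(value, float)
    }


if __name__ == "__main__":
    for name, value in summarize(results).items():
        print(f"{name.upper()}: {value}")
```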

Also I wanted to ask whether we can try the model on custom data? If yes, then is there a guide to do so? This may sound a bit silly but I'm new to NLP and exploring text simplification projects so it would mean a lot if you can share some guidelines for the same.
Thanks a ton!

@louismartin
Contributor

You're welcome :)
Yes, you can do so by running python generate.py < my_data.txt.
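
For preparing my_data.txt, a small helper like the one below may be handy. It is a sketch under the assumption that the script reads one sentence per line from standard input; the thread does not state the expected format, so check the ACCESS README before relying on it. write_sentences is a hypothetical helper name.

```python
from pathlib import Path


def write_sentences(sentences, path="my_data.txt"):
    """Write one sentence per line (assumed input format) and return the line count."""
    # Collapse internal whitespace, including stray newlines, so each
    # sentence stays on a single line; drop entries that end up empty.
    lines = [" ".join(sentence.split()) for sentence in sentences]
    lines = [line for line in lines if line]
    text = "\n".join(lines) + "\n" if lines else ""
    Path(path).write_text(text, encoding="utf-8")
    return len(lines)
```

After calling write_sentences(my_sentences), the resulting file can be passed to the command above.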

louismartin added a commit that referenced this issue Apr 24, 2020