Fluency classifier issue #44

Open
martiansideofthemoon opened this issue Apr 20, 2022 · 0 comments

@martiansideofthemoon (Owner) commented:

Thanks to Anubhav Jangra for reporting this ---

Email #1

I am unable to replicate the fluency and style accuracy scores. Here are a few numbers I'm getting right now -

Fluency score for AAE Tweets - 10.77 (reported as 56.4 in paper)
Fluency score for Bible - 8.11 (reported as 87.5 in paper)
Fluency score for Poetry - 4.22 (reported as 87.5 in paper)
Fluency score for Coha-1810 - 12.33 (reported as 87.5 in paper)
Fluency score for Coha-1890 - 21.16 (reported as 87.5 in paper)
Fluency score for Coha-1990 - 24.04 (reported as 87.5 in paper)

Just to clarify, I'm using the following script to get the evaluation score -
python style_paraphrase/evaluation/scripts/acceptability.py --input datasets/bible/test.input0.txt

(FYI - I also tried getting results for bible/train.input0.txt; it gave a score of 7%.)
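For reference, my understanding is that the fluency score is the percentage of sentences that a CoLA-trained acceptability classifier labels as acceptable. Below is a minimal sketch of that idea; the HuggingFace-format checkpoint path is hypothetical, and the actual acceptability.py handles model loading and preprocessing itself, so this is only an approximation of what it computes.

```python
# Illustrative sketch only: score a text file as the percentage of lines that
# a CoLA-style acceptability classifier marks acceptable.
# "cola-classifier-hf" is a hypothetical HuggingFace-format checkpoint path;
# the repo's acceptability.py loads its own checkpoint and may preprocess
# the input differently (e.g. BPE handling, truncation).
import sys

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("cola-classifier-hf")
model = AutoModelForSequenceClassification.from_pretrained("cola-classifier-hf")
model.eval()

with open(sys.argv[1]) as f:
    sentences = [line.strip() for line in f if line.strip()]

num_acceptable = 0
with torch.no_grad():
    for sentence in sentences:
        inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
        logits = model(**inputs).logits
        # Assumes label index 1 means "acceptable"; check the checkpoint config.
        num_acceptable += int(logits.argmax(dim=-1).item() == 1)

print(f"Fluency score: {100.0 * num_acceptable / len(sentences):.2f}%")
```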

Also, I got some weird results for style accuracy -

Style Accuracy of "aae/test.input0.txt" against "bible" style - 86.22%
Style Accuracy of "aae/test.input0.txt" against "romantic-poetry" style - 5.15%
Style Accuracy of "aae/test.input0.txt" against "aae" style - 1.48% (reported as 87.6% in paper)

Also, I've checked the paths for the trained cds-classifier and cola-classifier directories, and they contain the same content as the ones you shared on Google Drive. (I currently suspect these models might be the issue, but I'm not sure.)

Can you tell me what the reason could be? I want to replicate the paper's results before moving on to anything else.

Emails #2 & #3

I got datasets/bible/test.input0.txt by running datasets/bpe2text.py on the datasets/bible/test.input0.bpe file. No, I'm not passing the BPEs to the script; I'm passing the text directly.
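For concreteness, here is roughly what that conversion step looks like, assuming each line of the .bpe file holds space-separated GPT-2 BPE token IDs (an assumption; the actual bpe2text.py may differ):

```python
# Rough sketch of the BPE-to-text conversion, assuming space-separated
# GPT-2 BPE token IDs per line; check datasets/bpe2text.py for the real logic.
from transformers import GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

with open("datasets/bible/test.input0.bpe") as fin, \
     open("datasets/bible/test.input0.txt", "w") as fout:
    for line in fin:
        token_ids = [int(tok) for tok in line.split()]
        fout.write(tokenizer.decode(token_ids).strip() + "\n")
```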

After fiddling with the fluency script a bit, we found that removing punctuation from the input text increases the score, so we were wondering whether we are missing some explicit preprocessing step on the text input before it gets converted to BPE by the evaluation script. (Currently we are feeding in the bpe2text version of the input0.bpe files.)
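Roughly, the punctuation check looked like this (a diagnostic sketch, not a documented preprocessing step from the repo); scoring the stripped file with acceptability.py is what produced the higher numbers:

```python
# Diagnostic sketch: strip punctuation from each line and write a second file
# that can then be scored with acceptability.py for comparison.
import string

strip_punct = str.maketrans("", "", string.punctuation)

with open("datasets/bible/test.input0.txt") as fin, \
     open("datasets/bible/test.input0.nopunct.txt", "w") as fout:
    for line in fin:
        fout.write(line.translate(strip_punct))
```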
