Awareness Evaluation

This repository implements the Adversarial Awareness Evaluation defined in Elliott (2018).

Dependencies:

Python 3.5+
scipy
numpy
seaborn
matplotlib

Reproducing Table 1 and Figure 2

Unzip the system output data:

for x in $(ls data/); do (cd "data/$x" && unzip "$x.zip"); done

This produces three new directories inside data: decinit, hierattn, and trgmul. Each directory contains the decoded sentences, the segment-level Meteor scores (.scores), and the segment-level language modelling scores (logps-).

Run the awareness.py script on the resulting .scores files. It reports, in order: the mean Meteor score of the model evaluated with the congruent image data, the mean Meteor score of the model evaluated with the incongruent image data, the average Awareness of the model (Eq. 1), and the result of a significance test that indicates whether the null hypothesis can be rejected.

python awareness.py --congruent data/decinit/decinit.val.tok.de.congruent.scores \
                    --incongruent data/decinit/decinit.*.random*.scores \
                    --meteor

Mean congruent score: 0.5852
Mean incongruent score: 0.5824 +- 0.00043

Average awareness: 0.0028 +- 0.00043
Fisher's method Chi-Squared = 32.79, p=0.0003
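
To make the numbers above easier to interpret, here is a minimal sketch of how such a summary could be computed from segment-level scores. It is not the code in awareness.py: the function names awareness_summary and fisher_combine are hypothetical, the spread is assumed to be the standard deviation across shuffled runs, and each shuffled run is assumed to yield its own p-value from a paired test against the congruent scores (which test is used is not stated here).

```python
import math
from statistics import mean, pstdev


def awareness_summary(congruent, incongruent_runs):
    """Summarise segment-level scores for one model (hypothetical helper).

    congruent:        per-segment scores obtained with the correct images
    incongruent_runs: one list of per-segment scores per shuffled-image run,
                      aligned segment-by-segment with `congruent`
    """
    # Mean score of each shuffled run.
    run_means = [mean(run) for run in incongruent_runs]
    # Awareness of a run: mean per-segment difference (congruent - incongruent).
    run_awareness = [
        mean(c - i for c, i in zip(congruent, run)) for run in incongruent_runs
    ]
    return {
        "mean_congruent": mean(congruent),
        "mean_incongruent": mean(run_means),
        "awareness": mean(run_awareness),
        # Assumption: the reported spread is the std. dev. across runs.
        "awareness_spread": pstdev(run_awareness),
    }


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method.

    Returns (chi_squared, combined_p). The statistic has 2k degrees of
    freedom for k p-values; for an even number of degrees of freedom the
    chi-squared survival function has the closed form used below.
    """
    k = len(p_values)
    chi_squared = -2.0 * sum(math.log(p) for p in p_values)
    half = chi_squared / 2.0
    combined_p = math.exp(-half) * sum(
        half ** i / math.factorial(i) for i in range(k)
    )
    return chi_squared, combined_p
```

As a sanity check on the sample output, and assuming five shuffled runs (10 degrees of freedom), a Chi-Squared value of 32.79 corresponds to a combined p-value of roughly 0.0003 under this closed form.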

Run the violin_plots.py script to generate Figure 2.

python violin_plots.py --model1 data/trgmul/trgmul.val*.scores \
                       --model2 data/decinit/decinit.val*.scores \
                       --model3 data/hierattn/hierattn.val*.scores \
                       --meteor

Evaluating your own model

Generate the shuffled image data. For example, python shuffle_images.py --image_order_file val.txt --features val-resnet50-avgpool.npy produces five new .npy files, along with text files that show the shuffled order of the images.
Generate translations for each shuffle of the image data using your model.
Score the translations at the sentence level.
Follow the awareness.py instructions in Reproducing Table 1 and Figure 2, using the scores files from the previous step.
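
The idea behind the shuffling step can be sketched as follows. This is an illustration only, not the contents of shuffle_images.py: the function shuffled_orders, its seed parameter, and the example file names are hypothetical, and whether the real script prevents an image from staying in its original position is not known from this README.

```python
import random


def shuffled_orders(image_names, n_shuffles=5, seed=0):
    """Return `n_shuffles` permutations of range(len(image_names)).

    Each permutation is a list of indices: position j of a shuffled run
    receives the image that was originally at index perm[j].
    """
    rng = random.Random(seed)  # fixed seed so the shuffles are reproducible
    orders = []
    for _ in range(n_shuffles):
        perm = list(range(len(image_names)))
        rng.shuffle(perm)
        orders.append(perm)
    return orders


# Hypothetical image list standing in for the contents of val.txt.
names = ["a.jpg", "b.jpg", "c.jpg", "d.jpg"]
for run, perm in enumerate(shuffled_orders(names)):
    shuffled_names = [names[i] for i in perm]
    # With a NumPy feature matrix, `features[perm]` reorders rows the same way.
    print(run, shuffled_names)
```

Keeping the permutation alongside the shuffled features makes it possible to check later which image each sentence was paired with in a given incongruent run.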
