form-meaning-associations

Code accompanying the paper "Finding Concept-specific Biases in Form--Meaning Associations" accepted at NAACL 2021.

Install Dependencies

Create a conda environment with

$ conda env create -f environment.yml

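Then activate the environment and install the PyTorch build that matches your hardware. The first command below targets a GPU with CUDA 10.1; the commented-out second command is the CPU-only alternative.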

$ conda install -y pytorch torchvision cudatoolkit=10.1 -c pytorch
$ # conda install pytorch torchvision cpuonly -c pytorch

Parse data

To preprocess the data, run:

$ make process_data SPLITS=<split-used>

As detailed in the paper, <split-used> can be either macroarea or family.
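As a minimal sketch, the wrapper below runs the preprocessing step for both documented splits and rejects any other value before make is called. The function name process_split and the RUN variable are hypothetical and not part of this repository; by default the helper only prints the command it would run.

```shell
# Hypothetical helper (not part of the repository): accept only the two
# split names documented above and build the matching make command.
process_split() {
  case "$1" in
    macroarea|family)
      # Dry run by default; set RUN=1 to actually invoke make.
      if [ "${RUN:-0}" = "1" ]; then
        make process_data "SPLITS=$1"
      else
        echo "make process_data SPLITS=$1"
      fi
      ;;
    *)
      echo "unknown split: $1 (expected macroarea or family)" >&2
      return 1
      ;;
  esac
}

process_split macroarea
process_split family
```

With RUN unset, this prints the two make commands, so the invocations can be checked before committing to the full preprocessing run.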

Train models

To train the models, run:

$ make train SPLITS=<split-used> CONTEXT=<context>

Or train all seeds in sequence with:

$ bash src/h02_learn/train_multi.sh

<context> can be one of:

none: no context is used
onehot: one-hot context (the model used in the paper)
word2vec: word2vec context
onehot-shuffle: one-hot context with shuffled meaning ids
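To train every combination of split and context listed above, a loop along the following lines could be used. This is a sketch, not a script shipped with the repository: run_cmd, train_all, and the RUN variable are hypothetical names, and the loop assumes that make train accepts each split and context pair independently.

```shell
# Hypothetical helpers (not part of the repository).
# run_cmd prints its arguments as a command line unless RUN=1 is set,
# in which case it executes them.
run_cmd() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$@"
  fi
}

# Iterate over the documented splits and context types.
train_all() {
  for split in macroarea family; do
    for context in none onehot word2vec onehot-shuffle; do
      run_cmd make train "SPLITS=${split}" "CONTEXT=${context}"
    done
  done
}

train_all
```

Left in dry-run mode, the loop lists the eight make train invocations; setting RUN=1 in the environment runs them one after another.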

Analyse the results

To produce the aggregate analysis files, run:

$ make analysis
$ make get_seed_results SPLITS=macroarea
$ make get_seed_results SPLITS=family

This produces all the result files for the macroarea splits and per-seed results for the family splits. With these in hand, you can run the scripts in src/h05_paper/ directly to generate the paper's plots and tables.

The family vs. macroarea analysis in the paper was done manually using these result files.

Extra Information

Citation

If this code or the paper was useful to you, consider citing the paper:

@inproceedings{pimentel-etal-2021-finding,
    title = "Finding Concept-specific Biases in Form--Meaning Associations",
    author = "Pimentel, Tiago  and
      Roark, Brian  and
      Wichmann, S{\o}ren  and
      Cotterell, Ryan  and
      Blasi, Dami\'{a}n",
    booktitle = "Proceedings of the 2021 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2021",
    address = "Virtual",
    publisher = "Association for Computational Linguistics",
}

Contact

To ask questions or report problems, please open an issue.
