Skip to content

tomsherborne/bootstrap

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Bootstrapping a Crosslingual Semantic Parser

Tom Sherborne, Yumo Xu and Mirella Lapata

In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings (pp. 499–517)

Read our paper here

This repository is structures as:

.
├── code 	
│   ├── metrics
│   ├── models
│   └── reader
├── config
└── data
    └── overnight

  • code: Codebase models for the paper (AllenNLP models, dataset readers and sequence_accuracy metric for training)
  • config: AllenNLP training commands for rerunning experiments from the paper (currently only for the MT-Ensemble Transformer model). They currently assume another top-level folder called big which is a local cache for BERT models (i.e. if your GPU machine does not have direct internet access).
  • data/overnight: Chinese and German versions of Overnight (Wang et al. 2015). Training data is machine translated and test/dev data is collected using Turk (see paper for details)

ATIS

The ATIS dataset is subject to LDC licensing requirements and we keep this in a separate repo called bootstrap_atis. If you require access to this then please email me (tom.sherborne@ed.ac.uk). We will ask for evidence of holding relevant LDC licenses (LDC93S5, LDC94S19, and LDC95S26).

Requirements

This project was completed using Python 3.5 with AllenNLP 0.9 (and associated dependencies) and PyTorch 1.2. Much of the relevant framework code is now contained within allennlp-models which can be downloaded as an additional dependency.

Citation

If you use any of the resources from this work - please cite us as follows:

@inproceedings{sherborne-etal-2020-bootstrapping,
address = {Online},
author = {Sherborne, Tom and Xu, Yumo and Lapata, Mirella},
booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: Findings},
month = {nov},
pages = {499--517},
publisher = {Association for Computational Linguistics},
title = {{Bootstrapping a Crosslingual Semantic Parser}},
url = {https://www.aclweb.org/anthology/2020.findings-emnlp.45},
year = {2020}
}

Licence

The software (code) within this repository is an extension of AllenNLP which is covered by the Apache 2.0 Licence (see here).

Data and artefacts created as a result of this project is subject to the University of Edinburgh Open Access Requirement Policy. The authors have applied a Creative Commons Attribution (CC BY) licence to data herewithin this repository.

About

"Bootstrapping a Crosslingual Semantic Parser" - Sherborne, Xu and Lapata - EMNLP2020 Findings

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages