Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
initial commit
Yoav Artzi committed Jul 3, 2013
0 parents
commit 382e20c
Showing 2,870 changed files with 1,294,910 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
# [_**Navi Corpus**_](http://yoavartzi.com/navi) | ||
|
||
**Developed and maintained by** [Yoav Artzi](http://yoavartzi.com). Based on data from Chen and Mooney 2011 and MacMahon et al. 2006. | ||
|
||
**It's highly recommended to use the data as available in the [Navi repository](http://yoavartzi.com/navi).** | ||
|
||
## Documentation | ||
|
||
The corpus includes two versions: | ||
1. The original segmented corpus as used by Chen and Mooney in 2011. This data is split into 3 folds for cross-validation over the 3 different maps. | ||
2. The cleaned-up Oracle version of the corpus (see Artzi and Zettlemoyer 2013 for details about the cleanup process). This data is divided into two randomly selected sets: one for testing and one for training and development. The development set is divided into random splits for cross-validation during development. | ||
|
||
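The three-fold split over maps described in item 1 can be sketched as follows. This is a minimal illustration, assuming each fold holds out one map for testing and trains on the other two; the function name `leave_one_map_out` and the placeholder map names and instruction strings are hypothetical and do not reflect the actual file layout of the corpus.

```python
from typing import Dict, List, Tuple


def leave_one_map_out(
    data_by_map: Dict[str, List[str]],
) -> List[Tuple[str, List[str], List[str]]]:
    """Build one fold per map: test on the held-out map, train on the rest.

    Assumption: the README's "3 folds over the 3 different maps" means
    leave-one-map-out cross-validation.
    """
    folds = []
    for held_out in sorted(data_by_map):
        # Training data is everything that does not come from the held-out map.
        train = [
            example
            for name in sorted(data_by_map)
            if name != held_out
            for example in data_by_map[name]
        ]
        test = list(data_by_map[held_out])
        folds.append((held_out, train, test))
    return folds


# Hypothetical placeholder data; real map names and instructions differ.
data = {
    "map_a": ["a1", "a2"],
    "map_b": ["b1"],
    "map_c": ["c1", "c2", "c3"],
}
folds = leave_one_map_out(data)
for held_out, train, test in folds:
    print(held_out, len(train), len(test))
```

With three maps this yields three folds, and every instruction appears in exactly one test set.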
The directory `tacl-data` includes the processed version of the two corpora above in the format used in Artzi and Zettlemoyer 2013. The original SAIL corpus is in the `sail` directory. The directory `navi` includes the development of the Oracle corpus. Finally, the directory `pysrc` includes various utilities. | ||
|
||
## Attribution | ||
|
||
When using this corpus, please acknowledge it by citing: | ||
|
||
Artzi, Yoav and Zettlemoyer, Luke. "Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions." In Transactions of the Association for Computational Linguistics (TACL), 2013. | ||
|
||
**Bibtex:** | ||
|
||
@article{artzi-zettlemoyer:2011:TACL, | ||
title={Weakly Supervised Learning of Semantic Parsers for Mapping Instructions to Actions}, | ||
author={Artzi, Yoav and Zettlemoyer, Luke}, | ||
journal={Transactions of the Association for Computational Linguistics}, | ||
volume={1}, | ||
number={1}, | ||
pages={49--62}, | ||
year={2013}, | ||
publisher={Association for Computational Linguistics} | ||
} | ||
|
||
**Also, please cite the original creators of the corpus:** | ||
|
||
@InProceedings{macmahon:aaai06, | ||
title = "Walk the Talk: Connecting Language, Knowledge, and Action in Route Instructions", | ||
author = "Matt MacMahon and Brian Stankiewicz and Benjamin Kuipers", | ||
booktitle = "Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-2006)", | ||
address = "Boston, MA, USA", | ||
month = "July", | ||
year = 2006 | ||
} | ||
|
||
@InProceedings{chen:aaai11, | ||
title = "Learning to Interpret Natural Language Navigation Instructions from Observations", | ||
author = "David L. Chen and Raymond J. Mooney", | ||
booktitle = "Proceedings of the 25th AAAI Conference on Artificial Intelligence (AAAI-2011)", | ||
address = "San Francisco, CA, USA", | ||
month = "August", | ||
year = 2011 | ||
} | ||
|
||
|
||
## License | ||
|
||
Navi Corpus | ||
|
||
Copyright (C) 2013 Yoav Artzi | ||
|
||
This program is free software; you can redistribute it and/or modify it under | ||
the terms of the GNU General Public License as published by the Free Software | ||
Foundation; either version 2 of the License, or any later version. | ||
|
||
This program is distributed in the hope that it will be useful, but WITHOUT | ||
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS | ||
FOR A PARTICULAR PURPOSE. See the GNU General Public License for more | ||
details. | ||
|
||
You should have received a copy of the GNU General Public License along with | ||
this program; if not, write to the Free Software Foundation, Inc., 51 | ||
Franklin Street, Fifth Floor, Boston, MA 02110-1301, USA. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
Segmentation included: | ||
Inserting new line characters after each action sequence instruction | ||
Often ',' characters were replaced with new line characters | ||
Obvious typos were corrected | ||
|
||
Stripping source and target position numbers was done by replacing all mentions of destination positions with X and all mentions of source positions with Y. |
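The position stripping described above can be sketched as follows. This is only an illustration, assuming positions appear in the text as standalone integers and that the source and destination numbers for each instruction are known; the function name `strip_positions` and the example sentences are hypothetical, and the procedure actually used to produce the corpus is not documented here.

```python
import re


def strip_positions(instruction: str, source: int, destination: int) -> str:
    """Replace destination position numbers with X and source ones with Y.

    Assumption: positions are written as standalone integers. Numbers that
    match neither position are left untouched. If source and destination
    are equal, the destination replacement (X) takes precedence.
    """

    def swap(match: "re.Match[str]") -> str:
        number = int(match.group())
        if number == destination:
            return "X"
        if number == source:
            return "Y"
        return match.group()

    # \b\d+\b matches whole numbers only, so "13" is not treated as "3".
    return re.sub(r"\b\d+\b", swap, instruction)


# Hypothetical example sentence, not taken from the corpus.
print(strip_positions("go from position 3 to position 7", source=3, destination=7))
```

Matching whole numbers rather than single digits avoids rewriting part of a larger number that happens to contain a position digit.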
716 changes: 716 additions & 0 deletions
navi/data/instructions/action+sets_annotation/corpus1-instructions.xgoal.dev.txt
4,089 changes: 4,089 additions & 0 deletions
navi/data/instructions/action+sets_annotation/corpus1-instructions.xgoal.train-dev.txt
4,805 changes: 4,805 additions & 0 deletions
navi/data/instructions/action+sets_annotation/corpus1-instructions.xgoal.train.txt