
EditSQL for Spider, SParC, CoSQL

This is a PyTorch implementation of the CD-Seq2Seq baseline and the EditSQL model from the papers cited below.

Please cite the papers if you use our data and code.

@InProceedings{yu2018spider,
    title = "Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task",
    author = "Tao Yu and Rui Zhang and Kai Yang and Michihiro Yasunaga and Dongxu Wang and Zifan Li and James Ma and Irene Li and Qingning Yao and Shanelle Roman and Zilin Zhang and Dragomir Radev",
    booktitle = "Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing",
    year = "2018",
    address = "Brussels, Belgium"
}

@InProceedings{yu2019sparc,
  author =      "Tao Yu and Rui Zhang and Michihiro Yasunaga and Yi Chern Tan and Xi Victoria Lin and Suyi Li and Heyang Er and Irene Li and Bo Pang and Tao Chen and Emily Ji and Shreya Dixit and David Proctor and Sungrok Shim and Jonathan Kraft and Vincent Zhang and Caiming Xiong and Richard Socher and Dragomir Radev",
  title =       "SParC: Cross-Domain Semantic Parsing in Context",
  booktitle =   "Proceedings of The 57th Annual Meeting of the Association for Computational Linguistics",
  year =        "2019",
  address =     "Florence, Italy"
}

@InProceedings{yu2019cosql,
  author =      "Tao Yu and Rui Zhang and He Yang Er and Suyi Li and Eric Xue and Bo Pang and Xi Victoria Lin and Yi Chern Tan and Tianze Shi and Zihan Li and Youxuan Jiang and Michihiro Yasunaga and Sungrok Shim and Tao Chen and Alexander Fabbri and Zifan Li and Luyao Chen and Yuwen Zhang and Shreya Dixit and Vincent Zhang and Caiming Xiong and Richard Socher and Walter Lasecki and Dragomir Radev",
  title =       "CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases",
  booktitle =   "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
  year =        "2019",
  address =     "Hong Kong, China"
}

@InProceedings{zhang2019editing,
  author =      "Rui Zhang and Tao Yu and He Yang Er and Sungrok Shim and Eric Xue and Xi Victoria Lin and Tianze Shi and Caiming Xiong and Richard Socher and Dragomir Radev",
  title =       "Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions",
  booktitle =   "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
  year =        "2019",
  address =     "Hong Kong, China"
}

Contact Rui Zhang with any questions.

Dependency

The model is tested with Python 3.6 and PyTorch 1.0. We recommend using conda and pip:

conda create -n editsql python=3.6
source activate editsql
pip install -r requirements.txt

The evaluation scripts use Python 2.7.

Download the pretrained BERT model from here and save it as model/bert/data/annotated_wikisql_and_PyTorch_bert_param/pytorch_model_uncased_L-12_H-768_A-12.bin.
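
Because the checkpoint has to sit at an exact path, a quick check before launching a run can catch a misplaced file. The sketch below is not part of the repository: the helper name check_bert_checkpoint is hypothetical, and it assumes the path above is relative to the repository root.

```python
from pathlib import Path

# Path taken from the README; assumed to be relative to the repository root.
BERT_CHECKPOINT = Path(
    "model/bert/data/annotated_wikisql_and_PyTorch_bert_param/"
    "pytorch_model_uncased_L-12_H-768_A-12.bin"
)


def check_bert_checkpoint(repo_root="."):
    """Return the checkpoint path if the file exists under repo_root, else raise."""
    path = Path(repo_root) / BERT_CHECKPOINT
    if not path.is_file():
        raise FileNotFoundError(f"Pretrained BERT weights not found at {path}")
    return path
```

Calling check_bert_checkpoint() from the repository root either returns the path or raises FileNotFoundError with the location it looked in.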

Run Spider experiment

First, download Spider. Then please follow

  • run_spider_editsql.sh. We saved our experimental logs at logs/logs_spider_editsql

This reproduces the Spider result in "Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions".

| Model | Dev | Test |
|---|---|---|
| EditSQL | 57.6 | 53.4 |

Run SParC experiment

First, download SParC. Then please follow

  • use cdseq2seq: run_sparc_cdseq2seq.sh. We saved our experimental logs at logs/logs_sparc_cdseq2seq
  • use cdseq2seq with segment copy: run_sparc_cdseq2seq_segment_copy.sh. We saved our experimental logs at logs/logs_sparc_cdseq2seq_segment_copy
  • use editsql: run_sparc_editsql.sh. We saved our experimental logs at logs/logs_sparc_editsql

This reproduces the SParC result in "Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions".

| Model | Question Match (Dev) | Question Match (Test) | Interaction Match (Dev) | Interaction Match (Test) |
|---|---|---|---|---|
| CD-Seq2Seq | 21.9 | - | 8.1 | - |
| CD-Seq2Seq+segment copy (use predicted query) | 21.7 | - | 9.5 | - |
| CD-Seq2Seq+segment copy (use gold query) | 27.3 | - | 10.0 | - |
| EditSQL (use predicted query) | 47.2 | 47.9 | 29.5 | 25.3 |
| EditSQL (use gold query) | 53.4 | 54.5 | 29.2 | 25.0 |
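
As a rough illustration of how the two metrics in the table relate, the sketch below assumes each question has already been judged correct or incorrect (in the SParC paper, by exact set match against the gold SQL). Question match is then the fraction of correct questions, and interaction match is the fraction of interactions in which every question is correct. The function names and example data are hypothetical; the official numbers come from the scripts in eval_scripts, not from this code.

```python
def question_match(interactions):
    """Fraction of individual questions judged correct."""
    flags = [ok for turns in interactions for ok in turns]
    return sum(flags) / len(flags)


def interaction_match(interactions):
    """Fraction of interactions in which every question is judged correct."""
    return sum(all(turns) for turns in interactions) / len(interactions)


# Hypothetical per-question correctness flags for three interactions.
example = [[True, True], [True, False, True], [False]]
print(question_match(example))     # 4 of 6 questions are correct
print(interaction_match(example))  # 1 of 3 interactions is fully correct
```

A single wrong question fails its whole interaction, which is why interaction match is much lower than question match in every row.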

Run CoSQL experiment

First, download CoSQL from here. Then please follow

  • run_cosql_cdseq2seq.sh. We saved our experimental logs at logs/logs_cosql_cdseq2seq

This reproduces the SQL-grounded dialog state tracking result in "CoSQL: A Conversational Text-to-SQL Challenge Towards Cross-Domain Natural Language Interfaces to Databases".

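| Model | Question Match (Dev) | Question Match (Test) | Interaction Match (Dev) | Interaction Match (Test) |
|---|---|---|---|---|
| CD-Seq2Seq | 13.8 | 13.9 | 2.1 | 2.6 |
| EditSQL | 39.9 | 40.8 | 12.3 | 13.7 |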

Run ATIS experiment

To run the ATIS experiment and evaluate result accuracy, get the ATIS data from here, set up a MySQL database for ATIS, and change --database_username and --database_password in parse_args.py.
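
For orientation, here is a minimal sketch of what such options might look like, assuming parse_args.py defines them with argparse (this page does not confirm that). Only the two flag names come from the README; the build_parser helper, the defaults, and the example values are placeholders.

```python
import argparse


def build_parser():
    """Build a parser with the two database options named in the README."""
    parser = argparse.ArgumentParser()
    # Flag names come from the README; the defaults are placeholders to replace.
    parser.add_argument("--database_username", type=str, default="YOUR_MYSQL_USER")
    parser.add_argument("--database_password", type=str, default="YOUR_MYSQL_PASSWORD")
    return parser


# Passing the values explicitly instead of editing the defaults.
args = build_parser().parse_args(
    ["--database_username", "atis_user", "--database_password", "secret"]
)
print(args.database_username)
```

If the options are defined this way, passing them on the command line would avoid writing credentials into the source file.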

Then follow run_atis.sh.

This reproduces the ATIS result in "Learning to map context dependent sentences to executable formal queries". We saved our experimental logs at logs/logs_atis.

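| Model | Dev Query | Dev Relaxed | Dev Strict | Test Query | Test Relaxed | Test Strict |
|---|---|---|---|---|---|---|
| Suhr et al., 2018 | 37.5(0.9) | 63.0(0.7) | 62.5(0.9) | 43.6(1.0) | 69.3(0.8) | 69.2(0.8) |
| Our Replication | 38.8 | 63.3 | 62.8 | 44.6 | 68.3 | 68.2 |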

Acknowledgement

This implementation is based on "Learning to map context dependent sentences to executable formal queries". Alane Suhr, Srinivasan Iyer, and Yoav Artzi. In NAACL, 2018.
