Neural Network for generating structured queries from natural language.

SQLNet

This repo provides an implementation of the SQLNet and Seq2SQL neural networks for predicting SQL queries on the WikiSQL dataset. The paper is available on arXiv (arXiv:1711.04436).

Citation

Xiaojun Xu, Chang Liu, Dawn Song. 2017. SQLNet: Generating Structured Queries from Natural Language Without Reinforcement Learning.

BibTeX

@article{xu2017sqlnet,
  title={SQLNet: Generating Structured Queries From Natural Language Without Reinforcement Learning},
  author={Xu, Xiaojun and Liu, Chang and Song, Dawn},
  journal={arXiv preprint arXiv:1711.04436},
  year={2017}
}

Installation

The data is in data.tar.bz2. Extract it by running

tar -xjvf data.tar.bz2

The code is written using PyTorch in Python 2.7. See the PyTorch website for instructions on installing PyTorch. You can install the other dependencies by running

pip install -r requirements.txt

Downloading the GloVe embedding.

Download the pretrained GloVe embedding using

bash download_glove.sh

Extract the GloVe embedding for training.

Run the following command to process the pretrained GloVe embedding for training the word embedding:

python extract_vocab.py
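
One plausible reading of this step is that it restricts the full GloVe table to the words the dataset actually uses. The following is a minimal sketch of that idea only; it is not the code in extract_vocab.py, and the function name `filter_glove` and the sample rows are hypothetical. The one assumption about the input is the standard GloVe text layout: each line holds a word followed by its space-separated vector components.

```python
def filter_glove(lines, vocab):
    """Keep only the embedding rows whose word appears in `vocab`.

    `lines` is any iterable of GloVe-style text lines ("word v1 v2 ...").
    Returns a dict mapping each kept word to its vector as a list of floats.
    """
    kept = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        word = parts[0]
        if word in vocab:
            kept[word] = [float(x) for x in parts[1:]]
    return kept


# Hypothetical sample rows standing in for a real GloVe file.
sample = ["the 0.1 0.2", "cat 0.3 0.4", "sql 0.5 0.6"]
subset = filter_glove(sample, {"the", "sql"})
print(sorted(subset))
```

With a real file, the `lines` argument could be an open file handle, so the full embedding table never has to sit in memory at once.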

Train

The training script is train.py. To see its detailed parameters, run:

python train.py -h

Some typical usages are listed below:

Train a SQLNet model with column attention:

python train.py --ca

Train a SQLNet model with column attention and trainable embedding (requires pretraining without training embedding, i.e., executing the command above):

python train.py --ca --train_emb

Pretrain a Seq2SQL model on the re-split dataset:

python train.py --baseline --dataset 1

Train a Seq2SQL model with reinforcement learning after pretraining:

python train.py --baseline --dataset 1 --rl
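
Two of the commands above depend on an earlier run: the trainable-embedding SQLNet run needs the plain column-attention run first, and the reinforcement learning run follows Seq2SQL pretraining. The sketch below records that ordering in a small driver. The flags are the ones listed above; the names `STAGES`, `training_commands`, and `run_all` are hypothetical helpers and not part of this repo.

```python
import subprocess

# Flag sets for each model, in the order the stages have to run.
STAGES = {
    "sqlnet": [
        ["--ca"],                 # column attention, fixed embedding
        ["--ca", "--train_emb"],  # then fine-tune the embedding
    ],
    "seq2sql": [
        ["--baseline", "--dataset", "1"],          # pretraining
        ["--baseline", "--dataset", "1", "--rl"],  # then reinforcement learning
    ],
}


def training_commands(model):
    """Return the train.py invocations for `model`, in dependency order."""
    return [["python", "train.py"] + flags for flags in STAGES[model]]


def run_all(model):
    """Run each stage in turn, stopping at the first one that fails."""
    for cmd in training_commands(model):
        subprocess.check_call(cmd)


# Print the plan without launching any training.
for cmd in training_commands("sqlnet"):
    print(" ".join(cmd))
```

Calling `run_all("sqlnet")` from the repository root would launch both stages back to back; printing the plan first is a cheap way to check the order.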

Test

The script for evaluation on the dev split and test split is test.py. The parameters for evaluation are roughly the same as those used for training. For example, the commands for evaluating the models trained with the commands above are:

Test a trained SQLNet model with column attention:

python test.py --ca

Test a trained SQLNet model with column attention and trainable embedding:

python test.py --ca --train_emb

Test a trained Seq2SQL model without RL on the re-split dataset:

python test.py --baseline --dataset 1

Test a trained Seq2SQL model with reinforcement learning:

python test.py --baseline --dataset 1 --rl
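
Each evaluation command listed here is its training counterpart with train.py replaced by test.py. A small sketch of that correspondence follows, assuming the flags carry over unchanged, which holds for the four pairs in this README but may not hold for every option. The function name `evaluation_command` is hypothetical.

```python
def evaluation_command(train_command):
    """Turn a train.py command line into the matching test.py command line.

    Assumes the same flags apply to both scripts, as in the examples above.
    """
    parts = train_command.split()
    if "train.py" not in parts:
        raise ValueError("expected a command that runs train.py")
    return " ".join("test.py" if p == "train.py" else p for p in parts)


print(evaluation_command("python train.py --baseline --dataset 1 --rl"))
# prints: python test.py --baseline --dataset 1 --rl
```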