SemiAuto_Data_Text_SQL

Intro

annotate.py: A command-line tool for manually annotating the generated canonical utterances.
examine_dataset.py: A Python script that reports details and statistics of the SQL queries in both the original SPIDER datasets and our generated datasets.
fetch_db_split.py: A Python script that finds which SPIDER databases are in the dev set and which are in the train set.
generator.py: The core script, which generates SQL queries paired with canonical natural language utterances in a rule-based random manner.
inspect_tables.py: A Python script that checks table statistics, retypes every column according to its actual type, and assigns the 'id' type to columns with 'id' in their names.
json2csv.py: A Python script that converts SPIDER-style generated query pairs into a CSV format ready for crowdsourcing.
run.sh: Run this script to generate queries on all databases.
run_dark.sh: Run this script to generate queries on the three put-aside ("dark") databases: concert_singer, pets_1 and car_1.
survey.html: The survey webpage to use.
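
The 'id' retyping rule described for inspect_tables.py can be illustrated with a minimal sketch. This is not the script's actual code: the function name retype_id_columns and the case-insensitive substring match are assumptions made for illustration, and the input mirrors the column_names_original / column_types layout of SPIDER's tables.json.

```python
def retype_id_columns(column_names, column_types):
    """Return a copy of column_types with 'id' assigned to id-like columns.

    column_names: list of [table_index, column_name] pairs (tables.json layout).
    column_types: list of type strings aligned with column_names.
    """
    retyped = []
    for (_, name), col_type in zip(column_names, column_types):
        # Assumed rule: any column whose name contains "id", case-insensitively.
        # A plain substring match also catches names such as "width".
        if "id" in name.lower():
            retyped.append("id")
        else:
            retyped.append(col_type)
    return retyped


# Made-up sample schema in the SPIDER layout.
columns = [[-1, "*"], [0, "Singer_ID"], [0, "Name"], [0, "Age"]]
types = ["text", "number", "text", "number"]
print(retype_id_columns(columns, types))
```

A stricter rule, such as matching 'id' only as a separate token of the column name, would avoid false positives at the cost of missing names like "stuid".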

First, download SPIDER from https://yale-lily.github.io/spider and put it in this directory. Then, on the command line, type:

conda create -n env3 python=3.7
conda activate env3
pip install -r requirements.txt
python inspect_tables.py

bash run.sh
python json2csv.py

After this, the SPIDER-style JSON files for each database and the CSV file for crowdsourcing can be found in the 'saved_results' folder.
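
For orientation, the JSON-to-CSV step can be sketched roughly as follows. This is a hedged illustration, not the code in json2csv.py: the function name entries_to_csv is hypothetical, and the field names db_id, question and query are assumed from the standard SPIDER entry layout; the generated files may use different keys.

```python
import csv
import io
import json


def entries_to_csv(entries, fields=("db_id", "question", "query")):
    """Serialize SPIDER-style entries (a list of dicts) to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(fields))
    writer.writeheader()
    for entry in entries:
        # Keep only the assumed fields; missing ones become empty cells.
        writer.writerow({field: entry.get(field, "") for field in fields})
    return buffer.getvalue()


# Made-up sample entry; real entries would be loaded from a generated JSON file.
sample = json.loads(
    '[{"db_id": "example_db", '
    '"question": "How many singers are there?", '
    '"query": "SELECT count(*) FROM singer", '
    '"extra": "ignored"}]'
)
print(entries_to_csv(sample))
```

The csv module handles quoting, so utterances or queries that contain commas or quotation marks still land in a single cell.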

bash run_dark.sh
python json2csv.py --dark

After this, the SPIDER-style JSON files and CSV files for each dark database can be found in the 'saved_results' folder.
