Latest commit: 5ae4e36 by longxud, Apr 14, 2022


unified_parser_text_to_sql


Introduction

This paper introduces UniSAr, which makes existing autoregressive language models structure-aware through three non-invasive extensions: (1) structure marks that encode the database schema, the conversation context, and their relationships; (2) constrained decoding, which produces well-structured SQL for a given database schema; and (3) SQL completion, which fills in potentially missing JOIN relationships in the SQL based on the database schema.
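To make the third extension more concrete, here is a minimal sketch of how a missing JOIN path could be recovered from a schema. It is not the repository's implementation: the function name find_join_path, the FOREIGN_KEYS layout, and the three-table schema are all assumptions made for illustration. The idea is to treat foreign keys as edges of a graph and search for the shortest chain of tables connecting two tables that the predicted SQL mentions.

```python
from collections import deque


def find_join_path(foreign_keys, start, goal):
    """Breadth-first search over the foreign-key graph.

    Returns the shortest list of tables linking `start` to `goal`,
    or None when the two tables are not connected.
    """
    # Build an undirected adjacency list: a foreign key links both tables.
    graph = {}
    for (table_a, _col_a), (table_b, _col_b) in foreign_keys:
        graph.setdefault(table_a, []).append(table_b)
        graph.setdefault(table_b, []).append(table_a)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for neighbour in graph.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


# Hypothetical schema (not taken from Spider): enrollment bridges two tables.
FOREIGN_KEYS = [
    (("enrollment", "student_id"), ("student", "id")),
    (("enrollment", "course_id"), ("course", "id")),
]

join_path = find_join_path(FOREIGN_KEYS, "student", "course")
no_path = find_join_path(FOREIGN_KEYS, "student", "teacher")
```

In this toy schema a query that mentions only student and course would be completed with the bridging table enrollment, since it is the only table on the path between them.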

Dataset and Model

Spider -> ./data/spider

Fine-tuned BART model -> ./models/spider_sl (Please download this model with git-lfs to avoid the issue.)

sudo apt-get install git-lfs
git lfs install
git clone https://huggingface.co/dreamerdeo/mark-bart
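The README does not say which issue git-lfs avoids. One common failure when a repository is cloned without git-lfs is that large files arrive as small text pointer stubs instead of the real weights. The sketch below, which assumes that this is the problem in question, checks whether a file is such a stub; the helper name is_lfs_pointer and the throwaway file names are hypothetical.

```python
import os
import tempfile

# Git LFS pointer files are small text files that begin with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like a Git LFS pointer stub."""
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Demonstration on two throwaway files (not real checkpoints).
with tempfile.TemporaryDirectory() as tmp:
    stub = os.path.join(tmp, "stub.pt")
    real = os.path.join(tmp, "real.pt")
    with open(stub, "wb") as handle:
        handle.write(LFS_POINTER_PREFIX + b"\noid sha256:0000\nsize 123\n")
    with open(real, "wb") as handle:
        handle.write(b"\x80\x02binary weights")
    stub_is_pointer = is_lfs_pointer(stub)
    real_is_pointer = is_lfs_pointer(real)
```

If a file in the downloaded model directory turns out to be a pointer stub, running git lfs pull inside the cloned repository should fetch the actual content.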

Main dependencies

  • Python version >= 3.6
  • PyTorch version >= 1.5.0
  • pip install -r requirements.txt
  • fairseq is going through changes without backward compatibility. Install fairseq from source and use this commit for reproducibility. See here for the current PR that should fix fairseq/master.
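The minimum versions above can be checked before running the pipeline. The following is a small sketch, not part of the repository; version_tuple and meets_minimum are hypothetical helpers, and the version strings are examples. In practice the installed PyTorch version would come from torch.__version__.

```python
import re
import sys


def version_tuple(version):
    """Turn a string such as '1.13.1+cu117' into (1, 13, 1).

    Local build tags after '+' and non-numeric suffixes are ignored.
    """
    core = version.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    while len(parts) < 3:  # pad '1.5' to (1, 5, 0)
        parts.append(0)
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    return version_tuple(installed) >= version_tuple(minimum)


python_ok = sys.version_info >= (3, 6)
torch_ok = meets_minimum("1.13.1+cu117", "1.5.0")  # example version string
```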

Evaluation Pipeline

Step 1: Preprocess the data by adding schema-linking and value-linking tags.

python step1_schema_linking.py
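As a rough illustration of what schema linking means, the sketch below tags question tokens that exactly match a table or column name. It is a simplified stand-in, not the logic of step1_schema_linking.py: the function tag_question and the tag names are assumptions, and value linking (matching question tokens against database cell values) is left out.

```python
def tag_question(question, table_names, column_names):
    """Tag each question token as 'table', 'column', or 'O' (no match)."""
    tables = {name.lower() for name in table_names}
    columns = {name.lower() for name in column_names}
    tagged = []
    # Lowercase, drop trailing punctuation, and split on whitespace.
    for token in question.lower().rstrip("?.").split():
        if token in tables:
            tag = "table"
        elif token in columns:
            tag = "column"
        else:
            tag = "O"
        tagged.append((token, tag))
    return tagged


# Hypothetical question and schema, for illustration only.
tags = tag_question(
    "Show the name of each student?",
    table_names=["student", "course"],
    column_names=["id", "name"],
)
```

Exact matching misses plurals and paraphrases ("students", "pupil"), so a real linker would typically add stemming or fuzzy matching on top of this.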

Step 2: Build the input and output sequences for BART.

python step2_serialization.py
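Serialization here means flattening the question and the database schema into one text sequence that a sequence-to-sequence model can read. The sketch below shows one possible layout; the separators and the function serialize are assumptions for illustration, and the format actually produced by step2_serialization.py (including its structure marks) may differ.

```python
def serialize(question, schema):
    """Join a question and a schema into a single input string.

    `schema` is a list of (table_name, [column_names]) pairs.
    """
    parts = [question.strip()]
    for table, columns in schema:
        parts.append(table + " : " + " , ".join(columns))
    return " | ".join(parts)


# Hypothetical schema, for illustration only.
source = serialize(
    "How many students are there?",
    [("student", ["id", "name"]), ("course", ["id", "title"])],
)
```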

Step 3: Run the evaluation script with or without constrained decoding (the --constrain flag enables it).

python step3_evaluate.py --constrain
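The effect of constrained decoding can be sketched without a model: at each step, candidate tokens that are neither SQL keywords nor items of the current schema are discarded before the best one is chosen. Everything below is a toy assumption (the keyword set, the scores, and the function names); the repository's decoder works on model token IDs inside beam search rather than on whole words.

```python
# A tiny, illustrative keyword set; a real grammar would be much larger.
SQL_KEYWORDS = {"select", "from", "where", "count", "(", ")", "*", "=", "<eos>"}


def allowed_tokens(schema_items):
    """Tokens the decoder may emit: SQL keywords plus schema names."""
    return SQL_KEYWORDS | set(schema_items)


def constrained_argmax(scores, allowed):
    """Pick the highest-scoring token among those that are allowed."""
    candidates = {tok: score for tok, score in scores.items() if tok in allowed}
    if not candidates:
        raise ValueError("no allowed token among the candidates")
    return max(candidates, key=candidates.get)


# Toy scores in which the model prefers a misspelled table name.
scores = {"select": 0.10, "studnet": 0.50, "student": 0.30, "from": 0.05}
allowed = allowed_tokens(["student", "name"])

unconstrained = max(scores, key=scores.get)
constrained = constrained_argmax(scores, allowed)
```

Without the constraint the misspelled "studnet" wins; with it, the decoder falls back to the valid schema item "student".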

Results

Prediction: 69.34

Prediction with Constrained Decoding: 70.02

Interactive

python interactive.py --logdir ./models/spider-sl --db_id student_1 --db-path ./data/spider/database --schema-path ./data/spider/tables.json

Reference Code

https://github.com/ryanzhumich/editsql

https://github.com/benbogin/spider-schema-gnn-global

https://github.com/ElementAI/duorat

https://github.com/facebookresearch/GENRE