
BehaviorCheckingSemPar

Welcome to the codebase for the Eval4NLP 2021 paper Testing Cross-Database Semantic Parsers With Canonical Utterances.

Overview

There are three main ways to use this repo:

  1. You can download the dataset used in the Eval4NLP paper
  2. You can generate a new dataset
  3. You can add more SCFG rules to our grammar to generate even more data!

Presently, this repo contains tests for the following SQL elements:

  • Basic Clauses
    • SELECT
    • DISTINCT
    • WHERE
    • ORDER BY
    • GROUP BY
    • HAVING
  • Aggregate Operators
    • MIN
    • MAX
    • SUM
    • COUNT
    • AVG
  • Conditional Operators
    • Less than / less than or equal to (<, <=)
    • Greater than / greater than or equal to (>, >=)
    • Equals (=)
    • Not equals (!=)
    • BETWEEN
  • Logical Operators
    • AND
    • OR
    • AND + OR together
  • Bonus features 🎉
    • LIKE
    • NOT LIKE
    • IN
    • NOT IN

Quick Start

If you want our test data from the paper, it is located under eval4nlp/scfg.

If you would like to create your own test data from scratch, or integrate your own SCFG rules, you can follow the instructions below!

Install Dependencies

First, set up a basic environment.

git clone https://github.com/hclent/BehaviorCheckingSemPar.git
cd BehaviorCheckingSemPar

conda create -n sempar python=3.6
conda activate sempar
pip install -r requirements.txt

Prepare Spider Data

Before you get started, unzip the databases in data/spider. Note that the data in this repo is NOT out-of-the-box Spider!

Our version of Spider follows the data pre-processing conventions introduced by TabularSemanticParsing. If you'd like to learn more, we recommend downloading that codebase and following its instructions for "Process Data".
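
If you prefer to script the unzipping step, here is a minimal sketch that extracts any .zip archives found under data/spider; it assumes the databases ship as zip files in that directory (adjust the path if your copy differs).

# A minimal sketch, assuming the databases are shipped as .zip archives under data/spider.
import zipfile
from pathlib import Path

spider_dir = Path("data/spider")
for archive in spider_dir.rglob("*.zip"):
    print(f"Extracting {archive} ...")
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(archive.parent)  # unpack next to the archive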

Generate Test Data

To generate test data, simply run:

python grammar.py spider data/spider all random

and then convert these outputs to match the complete Spider format:

python format_grammar.py

Our grammar.py uses the Spider databases (data/spider) to generate SCFG test pairs for all categories of SQL elements listed above. Columns and values in the resulting queries are chosen randomly (random seed 27). The code is structured such that, for each database and each of its tables, all applicable SCFG rules are run (e.g., if a given table only has <number> variables, the code will not return pairs for the LIKE operator).
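
Conceptually, that generation loop can be pictured as below. This is only a sketch of the control flow described above; the database objects, the applies_to check, and the rule registry are assumptions, not the actual functions in grammar.py.

import random

random.seed(27)  # matches the random seed used for the paper's data

def generate_all(databases, rules):
    # Hypothetical illustration: for each database, for each table,
    # run every SCFG rule that is applicable to that table.
    pairs = []
    for db in databases:
        for table in db.tables:                    # `db.tables` is an assumed attribute
            for rule in rules:
                if not rule.applies_to(table):     # e.g. LIKE needs a text-valued column
                    continue
                nlq, sql = rule(db, table=table)   # rules follow the (db, table, tableLUT) contract
                pairs.append((nlq, sql))
    return pairs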

How can I add new rules?

First, let's look at an example SCFG rule:

def distinct1(db, table=None, tableLUT=None):
    # pick a table and one of its columns, plus the natural-language alias for that column
    table1, col1 = get_table_k_columns(db, table=table)
    aliascol = getColAlias(db, table1, col1)

    if tableLUT is not None:  # if there is a tableLUT, update table1 to be the gt name
        lookup = tableLUT[table1]
        gt_alias_table = lookup[0]
        gt_table = lookup[1]
    else:
        gt_alias_table = table1
        gt_table = table1

    # canonical utterance uses alias names + global SCFG words; SQL uses ground-truth names
    nlq = f"{q1} {w_distinct} {aliascol} {w_from} {gt_alias_table}"
    sql = f"SELECT DISTINCT {col1} FROM {gt_table} ;"
    print("*****" * 10)
    print(f"/ {nlq}")
    print(f"\\ {sql}")
    print("*****" * 10)
    return nlq, sql

Each new SCFG rule should, at minimum, take the parameters db, table=None, tableLUT=None. Extra parameters can also be specified, such as k=3 for the number of columns. The most important part of writing a new rule is that it returns an (nlq, sql) pair, where the nlq uses elements from the global SCFG variables combined with alias names for tables and columns, and the sql uses the original table and column names joined by SQL syntax.
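
As an example, a hypothetical new rule for COUNT over a single column, written against this contract, might look like the sketch below. The helpers get_table_k_columns and getColAlias and the global SCFG words are reused from the distinct1 example; w_count is an assumed global, not necessarily one that exists in grammar.py.

def count1(db, table=None, tableLUT=None):
    # Hypothetical rule sketch following the contract described above.
    table1, col1 = get_table_k_columns(db, table=table)
    aliascol = getColAlias(db, table1, col1)

    if tableLUT is not None:  # look up the ground-truth table name, as in distinct1
        gt_alias_table, gt_table = tableLUT[table1][0], tableLUT[table1][1]
    else:
        gt_alias_table = gt_table = table1

    # nlq from global SCFG words + aliases; sql from original names
    nlq = f"{q1} {w_count} {aliascol} {w_from} {gt_alias_table}"  # w_count is an assumption
    sql = f"SELECT COUNT ( {col1} ) FROM {gt_table} ;"
    return nlq, sql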

A good way to sanity check your new rules is to plug them into the very bottom of grammar.py and run python grammar.py spider data/spider one random. This picks a random database on which to test your single rule. You may also substitute a database name for random to develop with a certain database in mind.

Once you are happy with your rule, you can add it to the pipeline that executes all SCFG rules. A rule's place in the pipeline is based on what table, column, and variable restrictions exist for a given SQL element. Start at L#1311 and follow along to see where you should plug in your rule.

Unit testing on downstream models?

Because this test suite outputs data in the Spider format, you can easily "plug and play" this test data into other models that already support Spider.
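
For example, once format_grammar.py has written a Spider-format JSON file, it can be read like any other Spider split. The file name below is a placeholder; substitute whatever file your run produced (or the released data under eval4nlp/scfg).

import json

# Placeholder path -- point this at the Spider-format file you generated.
with open("eval4nlp/scfg/test_data.json") as f:
    examples = json.load(f)

for ex in examples[:3]:
    # Spider-format entries include the database id, the natural-language
    # question (here, a canonical utterance), and the target SQL query.
    print(ex["db_id"], "|", ex["question"], "=>", ex["query"])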

Citation

@inproceedings{lent-etal-2021-testing,
    title = "Testing Cross-Database Semantic Parsers With Canonical Utterances",
    author = "Lent, Heather  and
      Yavuz, Semih  and
      Yu, Tao  and
      Niu, Tong  and
      Zhou, Yingbo  and
      Radev, Dragomir  and
      Lin, Xi Victoria",
    booktitle = "Proceedings of the 2nd Workshop on Evaluation and Comparison of NLP Systems",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.eval4nlp-1.8",
    pages = "73--83",
}
