Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errors in post-perturbation queries #1

Closed
saparina opened this issue Mar 13, 2023 · 6 comments
Closed

Errors in post-perturbation queries #1

saparina opened this issue Mar 13, 2023 · 6 comments

Comments

@saparina
Copy link

Thank you for releasing the dataset!

  1. I notice that some gold queries from DB_schema_abbreviation and DB_schema_synonym fail to execute. I used the official script for calculating test suite execution accuracy with values, which throws an exception on L228.
    For example, gold query #567 from the DB_schema_synonym post-perturbation set
SELECT count(*) FROM flights JOIN airports ON flights.DestAirport = airports.AirportCode JOIN airports ON flights.flight_from = airports.AirportCode WHERE airports.capital = "Ashley" and airports.capital = "Aberdeen"

fails because sqlite cannot resolve the ambiguous column names (airports.capital). The original SQL query uses aliases T1, T2, T3 and thus does not have this problem.

  1. Some integer values in the queries became float after the perturbations, which may lead to incorrect results. Gold query #1032 from the DB_schema_synonym post-perturbation set for the question How many museums were opened after 2013 or before 2008?
SELECT count(*) FROM museum WHERE museum.accessible_from > 2013.0 or museum.accessible_from < 2008.0

includes museums opened in 2013 and 2008 because the column accessible_from has the text format in the database (so here it is string comparison, not number).
The original query performs strict comparison as it uses the integer values 2013 and 2008 and executes with a different result.

@shuaichenchang
Copy link
Contributor

Hi Irina,
Thank you for pointing them out. We fixed the errors in the PR #2.

  1. Table aliases have been added to the self-join queries. This implementation was ad hoc since only three queries in Spider-dev had self-joins. However, it should be incorporated into the SQL parser in the future.
  2. Integer-value floats, such as 100.0, have been converted to integers, for example, 100. This has been done to ensure consistency in data types.

In addition, kindly use this taoyds/test-suite-sql-eval#13 for test suite evaluation prior to merging it into the main repository or remove https://github.com/taoyds/test-suite-sql-eval/blob/master/evaluation.py#L569 in the original test-suite script. The original test-suite scripts might contain inaccuracies for columns and tables that have "value" as a part of their names.

@saparina
Copy link
Author

Thanks for the quick fix! Do these changes affect the leaderboard?

@shuaichenchang
Copy link
Contributor

Thanks for the quick fix! Do these changes affect the leaderboard?

After testing the Picard model on the updated data, we found that the execution accuracy was affected by approximately 0.1 points, and the exact set match accuracy was affected by about 0.5 points for DB_schema_abbreviation and DB_schema_synonym. We will update the leaderboard with these new results, as well as the latest results from other models soon.

@saparina
Copy link
Author

Great, thanks a lot!

@saparina
Copy link
Author

Could you please check the databases 'world_1_0' , 'world_1_3', 'world_1_4' in the DB_schema_synonym post-perturbation set? I think the actual column names in the sqlite_sequence table slightly differ the version provided in tables_post_perturbation.json.

@shuaichenchang
Copy link
Contributor

shuaichenchang commented Mar 15, 2023

Could you please check the databases 'world_1_0' , 'world_1_3', 'world_1_4' in the DB_schema_synonym post-perturbation set? I think the actual column names in the sqlite_sequence table slightly differ the version provided in tables_post_perturbation.json.

Thank you for finding it! The sqlite_sequence table was automatically created by SQLite for its internal use. We will skip the perturbations to the sqlite_sequence table in this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants