Commit

Merge pull request #1267: filter: Expose internal Pandas errors from `--query`
victorlin committed Jul 28, 2023
2 parents f3dfd0e + d7ca649 commit 4f5559a
Showing 3 changed files with 22 additions and 7 deletions.
2 changes: 2 additions & 0 deletions CHANGES.md
@@ -7,10 +7,12 @@
 * export v2: Previously, when `strain` was not used as the metadata ID column, node attributes might have gone missing from the final Auspice JSON. This has been fixed. [#1260][], [#1262][] (@victorlin, @joverlee521)
 * export v1: Added a deprecation warning for this command. [#1265][] (@victorlin)
 * export v1: The recently introduced flag `--metadata-id-columns` did not work properly due to the same `export v2` bug that was fixed in this release. Instead of fixing it in `export v1`, drop the broken feature since this command is no longer being maintained. [#1265][] (@victorlin)
+* filter: Expose internal Pandas errors from `--query` which may be useful to users. [#1267][] (@victorlin)

 [#1260]: https://github.com/nextstrain/augur/issues/1260
 [#1262]: https://github.com/nextstrain/augur/issues/1262
 [#1265]: https://github.com/nextstrain/augur/pull/1265
+[#1267]: https://github.com/nextstrain/augur/pull/1267

## 22.1.0 (10 July 2023)

2 changes: 1 addition & 1 deletion augur/filter/include_exclude_rules.py
@@ -718,7 +718,7 @@ def apply_filters(metadata, exclude_by: List[FilterOption], include_by: List[Fil
                     UndefinedVariableError = pd.core.computation.ops.UndefinedVariableError # type: ignore
                 if isinstance(e, UndefinedVariableError):
                     raise AugurError(f"Query contains a column that does not exist in metadata.") from e
-                raise AugurError(f"Error when applying query. Ensure the syntax is valid per <https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-query>.") from e
+                raise AugurError(f"Internal Pandas error when applying query:\n\t{e}\nEnsure the syntax is valid per <https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-query>.") from e
             else:
                 raise
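
Note on the pattern in this hunk: the caught exception's text is interpolated into the user-facing message, and `from e` keeps the original exception attached as the cause. The following is a minimal, pandas-free sketch of that pattern, not augur's implementation. `QueryError`, `run_with_exposed_error`, and `fake_evaluate` are hypothetical names introduced only for illustration; `fake_evaluate` stands in for the real query call and fails with a plain Python `TypeError`.

```python
class QueryError(Exception):
    """Hypothetical stand-in for augur's AugurError."""


DOCS_URL = (
    "https://pandas.pydata.org/pandas-docs/stable/user_guide/"
    "indexing.html#indexing-query"
)


def run_with_exposed_error(evaluate, query):
    """Call evaluate(query); on failure, re-raise with the original text embedded."""
    try:
        return evaluate(query)
    except Exception as e:
        # Same layout as the message in the diff: the underlying error text on
        # its own tab-indented line, followed by the documentation link.
        raise QueryError(
            f"Internal Pandas error when applying query:\n\t{e}\n"
            f"Ensure the syntax is valid per <{DOCS_URL}>."
        ) from e


def fake_evaluate(query):
    # Assumption: stands in for the real query evaluation. Comparing a str
    # with a float raises TypeError in plain Python.
    return "Asia" >= 0.50


try:
    run_with_exposed_error(fake_evaluate, "region >= 0.50")
except QueryError as err:
    message = str(err)
    cause = err.__cause__  # the original TypeError, preserved by `from e`

print(message)
```

Because the original exception stays reachable through `__cause__`, a full traceback still shows both errors, while the one-line message gives users the detail they need without reading a stack trace.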

25 changes: 19 additions & 6 deletions tests/functional/filter/cram/filter-query-errors.t
@@ -12,22 +12,35 @@ Using a pandas query with a nonexistent column results in a specific error.
   [2]


-Using pandas queries with bad syntax results in a generic errors.
+Using pandas queries with bad syntax results in meaningful errors.

-This raises a ValueError internally (https://github.com/nextstrain/augur/issues/940):
+Some error messages from Pandas may be useful, so they are exposed:

   $ ${AUGUR} filter \
   > --metadata "$TESTDIR/../data/metadata.tsv" \
-  > --query "invalid = 'value'" \
+  > --query "region >= 0.50" \
   > --output-strains filtered_strains.txt > /dev/null
-  ERROR: Error when applying query. Ensure the syntax is valid per <https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-query>.
+  ERROR: Internal Pandas error when applying query:
+  	'>=' not supported between instances of 'str' and 'float'
+  Ensure the syntax is valid per <https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-query>.
   [2]

-This raises a SyntaxError internally (https://github.com/nextstrain/augur/issues/941):
+However, other Pandas errors are not so helpful, so a link is provided for users to learn more about query syntax.

+  $ ${AUGUR} filter \
+  > --metadata "$TESTDIR/../data/metadata.tsv" \
+  > --query "invalid = 'value'" \
+  > --output-strains filtered_strains.txt > /dev/null
+  ERROR: Internal Pandas error when applying query:
+  	cannot assign without a target object
+  Ensure the syntax is valid per <https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-query>.
+  [2]
+
   $ ${AUGUR} filter \
   > --metadata "$TESTDIR/../data/metadata.tsv" \
   > --query "some bad syntax" \
   > --output-strains filtered_strains.txt > /dev/null
-  ERROR: Error when applying query. Ensure the syntax is valid per <https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-query>.
+  ERROR: Internal Pandas error when applying query:
+  	invalid syntax (<unknown>, line 1)
+  Ensure the syntax is valid per <https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-query>.
   [2]
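
Note on the expected messages: two of them read like messages Python itself produces, which may explain why some exposed errors are more useful than others. The sketch below reproduces the same kinds of failure in plain Python. It rests on an assumption this diff does not confirm: that the query string is parsed with Python's `ast` module and that the comparison ends up comparing a Python `str` with a `float`. `message_of` is a hypothetical helper.

```python
import ast


def message_of(fn):
    """Run fn() and return (exception type name, message), or None on success."""
    try:
        fn()
    except Exception as e:
        return type(e).__name__, str(e)
    return None


# Comparing text with a number, as in `--query "region >= 0.50"`.
type_err = message_of(lambda: "Asia" >= 0.50)

# Parsing text that is not a valid Python expression, as in `--query "some bad syntax"`.
syntax_err = message_of(lambda: ast.parse("some bad syntax"))

print(type_err)
print(syntax_err)
```

The exact wording of the `SyntaxError` message can differ between Python versions, so a test that pins it, like the last case above, may need updating when the interpreter changes.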
