Merge pull request #1259: Update export v2 Cram tests
victorlin committed Jul 26, 2023
2 parents 9ef4711 + 030e95b commit da6a9f2
Showing 40 changed files with 737 additions and 189 deletions.
6 changes: 6 additions & 0 deletions CHANGES.md
@@ -2,6 +2,12 @@

## __NEXT__

### Bug fixes

* export v2: Previously, when `strain` was not used as the metadata ID column, node attributes might have gone missing from the final Auspice JSON. This has been fixed. [#1260][], [#1262][] (@victorlin, @joverlee521)

[#1260]: https://github.com/nextstrain/augur/issues/1260
[#1262]: https://github.com/nextstrain/augur/issues/1262

## 22.1.0 (10 July 2023)

18 changes: 10 additions & 8 deletions augur/export_v2.py
@@ -1016,11 +1016,11 @@ def parse_node_data_and_metadata(T, node_data, metadata):
     node_attrs = {clade.name: {} for clade in T.root.find_clades()}

     # first pass: metadata
-    for node in metadata.values():
-        if node["strain"] in node_attrs: # i.e. this node name is in the tree
+    for metadata_id, node in metadata.items():
+        if metadata_id in node_attrs: # i.e. this node name is in the tree
             for key, value in node.items():
                 corrected_key = update_deprecated_names(key)
-                node_attrs[node["strain"]][corrected_key] = value
+                node_attrs[metadata_id][corrected_key] = value
                 metadata_names.add(corrected_key)

     # second pass: node data JSONs (overwrites keys of same name found in metadata)
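
The hunk above switches the lookup key from each row's `strain` value to the metadata dictionary key itself. A minimal sketch of one way attributes could go missing under the old lookup, assuming metadata indexed by a column named `accession` (a hypothetical column name) that also carries a separate `strain` column. `attrs_old` and `attrs_new` are illustrative helpers, not augur functions, and the deprecated-name correction is omitted:

```python
def attrs_old(metadata, node_names):
    """Pre-fix lookup: match tree nodes on each row's "strain" value."""
    node_attrs = {name: {} for name in node_names}
    for row in metadata.values():
        if row["strain"] in node_attrs:
            node_attrs[row["strain"]].update(row)
    return node_attrs


def attrs_new(metadata, node_names):
    """Post-fix lookup: match tree nodes on the metadata ID (the dict key)."""
    node_attrs = {name: {} for name in node_names}
    for metadata_id, row in metadata.items():
        if metadata_id in node_attrs:
            node_attrs[metadata_id].update(row)
    return node_attrs


# Hypothetical metadata keyed by accession; the "strain" values differ
# from the tree's node names, which are the accessions.
metadata = {
    "ACC1": {"accession": "ACC1", "strain": "sample/1", "country": "A"},
    "ACC2": {"accession": "ACC2", "strain": "sample/2", "country": "B"},
}
node_names = ["ACC1", "ACC2"]

old = attrs_old(metadata, node_names)  # no row matches a node name
new = attrs_new(metadata, node_names)  # every row is attached to its node
```

In this sketch the old lookup leaves every node's attribute dict empty, because `sample/1` and `sample/2` are not node names, while the new lookup attaches each row to the node named after its ID.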
@@ -1074,13 +1074,15 @@ def run(args):

     if args.metadata is not None:
         try:
-            metadata_file = read_metadata(
+            metadata_df = read_metadata(
                 args.metadata,
                 delimiters=args.metadata_delimiters,
-                id_columns=args.metadata_id_columns).to_dict(orient="index")
-            for strain in metadata_file.keys():
-                if "strain" not in metadata_file[strain]:
-                    metadata_file[strain]["strain"] = strain
+                id_columns=args.metadata_id_columns)
+
+            # Add the index as a column.
+            metadata_df[metadata_df.index.name] = metadata_df.index
+
+            metadata_file = metadata_df.to_dict(orient="index")
         except FileNotFoundError:
             print(f"ERROR: meta data file ({args.metadata}) does not exist", file=sys.stderr)
             sys.exit(2)
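
The change in `run` copies the ID back into each record under the index's own name rather than under a hard-coded `strain` key. The following is a rough stdlib-only sketch of that difference, not the pandas code from the diff: `index_rows` is a hypothetical stand-in for `read_metadata(...).to_dict(orient="index")`, assuming the ID column is used as the key and is absent from each record, and the `accession` column name is made up for illustration.

```python
import csv
import io


def index_rows(tsv_text, id_column):
    """Hypothetical stand-in: rows keyed by ID, with the ID column dropped."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row[id_column]: {k: v for k, v in row.items() if k != id_column}
        for row in reader
    }


def add_id_old(records):
    """Pre-fix: copy the ID into a key literally named "strain" if absent."""
    for record_id, record in records.items():
        if "strain" not in record:
            record["strain"] = record_id
    return records


def add_id_new(records, id_column):
    """Post-fix: copy the ID into a key named after the ID column."""
    for record_id, record in records.items():
        record[id_column] = record_id
    return records


tsv = "accession\tcountry\nACC1\tA\nACC2\tB\n"
old = add_id_old(index_rows(tsv, "accession"))
new = add_id_new(index_rows(tsv, "accession"), "accession")
```

Under these assumptions the old path yields records with a `strain` key and no `accession` key, while the new path restores `accession` and leaves `strain` unset.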
181 changes: 0 additions & 181 deletions tests/functional/export_v2.t

This file was deleted.

1 change: 1 addition & 0 deletions tests/functional/export_v2/cram/_setup.sh
@@ -0,0 +1 @@
export AUGUR="${AUGUR:-$TESTDIR/../../../../bin/augur}"
34 changes: 34 additions & 0 deletions tests/functional/export_v2/cram/augur-version-mismatch.t
@@ -0,0 +1,34 @@
Setup

$ source "$TESTDIR"/_setup.sh

Node-data JSONs produced from a different major version of augur
are not allowed.

$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --node-data "$TESTDIR/../data/div_node-data.json" "$TESTDIR/../data/location_node-data2.json" \
> --auspice-config "$TESTDIR/../data/auspice_config3.json" \
> --output dataset.json
ERROR: Augur version incompatibility detected: the JSON .*location_node-data2\.json.* was generated by \{'program': 'augur', 'version': '13.1.2'\}, which is incompatible with the current augur version \([.0-9]+\). We suggest you rerun the pipeline using the current version of augur. (re)
[2]
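
The check exercised above rejects node-data JSONs written by a different major version of augur. A minimal sketch of such a comparison follows; the function names are hypothetical, the actual augur implementation is not shown in this diff, and `22.1.0` is used only as an example current version:

```python
def major_version(version):
    """Return the leading integer component of a dotted version string."""
    return int(version.split(".")[0])


def is_compatible(generated_by, current_version):
    """True when the JSON was produced by the same major version (assumed rule)."""
    return major_version(generated_by["version"]) == major_version(current_version)


# The "generated_by" block reported in the expected error message above.
generated_by = {"program": "augur", "version": "13.1.2"}
compatible = is_compatible(generated_by, "22.1.0")
```

Here `compatible` is `False`, which corresponds to the error path and the exit status `[2]` in the test.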

Skipping validation allows mismatched augur versions to be used without error.
(Note the stderr/stdout output is detailed here, including 2 empty lines)

$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --node-data "$TESTDIR/../data/div_node-data.json" "$TESTDIR/../data/location_node-data2.json" \
> --auspice-config "$TESTDIR/../data/auspice_config2.json" \
> --output dataset.json \
> --skip-validation
WARNING: You didn't provide information on who is maintaining this analysis.
\s{0} (re)
Skipping validation of produced JSON due to --validation-mode=skip or --skip-validation.
\s{0} (re)
Check the output from the above command against its expected contents
$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" "$TESTDIR/../data/dataset2.json" dataset.json \
> --exclude-paths "root['meta']['updated']"
{}
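
The tests compare the exported dataset with a stored expectation while ignoring `root['meta']['updated']`, and an output of `{}` means no differences were found. The sketch below illustrates that idea with the standard library only. It is not the code of `scripts/diff_jsons.py`, and `drop_path` and `differing_keys` are hypothetical helpers that take excluded paths as key tuples:

```python
import copy


def drop_path(document, keys):
    """Delete the nested key addressed by `keys`, if it exists."""
    node = document
    for key in keys[:-1]:
        if not isinstance(node, dict) or key not in node:
            return
        node = node[key]
    if isinstance(node, dict):
        node.pop(keys[-1], None)


def differing_keys(expected, actual, exclude=()):
    """Map each differing top-level key to its (expected, actual) values."""
    expected, actual = copy.deepcopy(expected), copy.deepcopy(actual)
    for keys in exclude:
        drop_path(expected, keys)
        drop_path(actual, keys)
    return {
        key: (expected.get(key), actual.get(key))
        for key in sorted(set(expected) | set(actual))
        if expected.get(key) != actual.get(key)
    }


# Two made-up datasets that differ only in the volatile "updated" field.
expected = {"version": "v2", "meta": {"updated": "2023-07-10", "title": "t"}}
actual = {"version": "v2", "meta": {"updated": "2023-07-26", "title": "t"}}

ignoring_updated = differing_keys(expected, actual, exclude=[("meta", "updated")])
strict = differing_keys(expected, actual)
```

With the exclusion, `ignoring_updated` is an empty dict, mirroring the `{}` in the tests; without it, `strict` reports `meta` as different.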
28 changes: 28 additions & 0 deletions tests/functional/export_v2/cram/auspice_config1.t
@@ -0,0 +1,28 @@
Setup

$ source "$TESTDIR"/_setup.sh

Export with auspice config JSON which defines scale & legend settings

$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --node-data "$TESTDIR/../data/div_node-data.json" "$TESTDIR/../data/location_node-data.json" \
> --auspice-config "$TESTDIR/../data/auspice_config1.json" \
> --output dataset.json &>/dev/null

$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" "$TESTDIR/../data/dataset1.json" dataset.json \
> --exclude-paths "root['meta']['updated']"
{}

...same but with repeated --node-data options instead of a single multi-valued option

$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --node-data "$TESTDIR/../data/div_node-data.json" \
> --node-data "$TESTDIR/../data/location_node-data.json" \
> --auspice-config "$TESTDIR/../data/auspice_config1.json" \
> --output dataset.json &>/dev/null

$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" "$TESTDIR/../data/dataset1.json" dataset.json \
> --exclude-paths "root['meta']['updated']"
{}
15 changes: 15 additions & 0 deletions tests/functional/export_v2/cram/auspice_config2.t
@@ -0,0 +1,15 @@
Setup

$ source "$TESTDIR"/_setup.sh

Export with auspice config JSON with an extensions block

$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --node-data "$TESTDIR/../data/div_node-data.json" "$TESTDIR/../data/location_node-data.json" \
> --auspice-config "$TESTDIR/../data/auspice_config2.json" \
> --output dataset.json &>/dev/null

$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" "$TESTDIR/../data/dataset2.json" dataset.json \
> --exclude-paths "root['meta']['updated']"
{}
27 changes: 27 additions & 0 deletions tests/functional/export_v2/cram/auspice_config3.t
@@ -0,0 +1,27 @@
Setup

$ source "$TESTDIR"/_setup.sh

# auspice_config3.json is the same as auspice_config2.json but with an extra key which the schema does not allow.
# Running without --skip-validation should result in an error
# Message printed: "Validation of "$TESTDIR/../data/auspice_config3.json" failed."

$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --node-data "$TESTDIR/../data/div_node-data.json" "$TESTDIR/../data/location_node-data.json" \
> --auspice-config "$TESTDIR/../data/auspice_config3.json" \
> --output dataset.json &>/dev/null
[2]

# Skipping validation gives us the same results as `auspice_config2.json`

$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --node-data "$TESTDIR/../data/div_node-data.json" "$TESTDIR/../data/location_node-data.json" \
> --auspice-config "$TESTDIR/../data/auspice_config3.json" \
> --output dataset.json \
> --skip-validation &>/dev/null

$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" "$TESTDIR/../data/dataset2.json" dataset.json \
> --exclude-paths "root['meta']['updated']"
{}
17 changes: 17 additions & 0 deletions tests/functional/export_v2/cram/auspice_config4.t
@@ -0,0 +1,17 @@
Setup

$ source "$TESTDIR"/_setup.sh

Run export with metadata and external colors TSV that contains zero values.

$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --node-data "$TESTDIR/../data/div_node-data.json" "$TESTDIR/../data/location_node-data.json" \
> --auspice-config "$TESTDIR/../data/auspice_config4.json" \
> --metadata "$TESTDIR/../data/zero_value_metadata.tsv" \
> --colors "$TESTDIR/../data/zero_value_colors.tsv" \
> --output dataset.json &> /dev/null

$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" "$TESTDIR/../data/dataset3.json" dataset.json \
> --exclude-paths "root['meta']['updated']" "root['meta']['maintainers']"
{}
16 changes: 16 additions & 0 deletions tests/functional/export_v2/cram/branch_attrs.t
@@ -0,0 +1,16 @@
Setup

$ source "$TESTDIR"/_setup.sh

Test that attributes are correctly exported as branch_attrs. Currently this includes branch labels (node_data→branches),
mutations (node_data→nodes) and a historical node_data→nodes→<name>→clade_annotation branch label.

$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --node-data "$TESTDIR/../data/div_node-data.json" "$TESTDIR/../data/nt_muts_1.json" "$TESTDIR/../data/aa_muts_1.json" "$TESTDIR/../data/branch-labels.json" \
> --maintainers "Nextstrain Team" \
> --output dataset.json > /dev/null

$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" "$TESTDIR/../data/dataset-with-branch-labels.json" dataset.json \
> --exclude-paths "root['meta']['updated']"
{}
