export v2: "accession" missing from node attributes when used as the metadata ID column #1260

Closed
victorlin opened this issue Jul 24, 2023 · 0 comments · Fixed by #1261
Labels: bug

victorlin commented Jul 24, 2023

Originally noticed in nextstrain/rsv#29 (comment).

Current Behavior

In augur export v2, "accession" is exported as a node attribute as long as it is available as one. The problem arises when "accession" is used as the metadata ID column: in the current implementation it is then no longer available as a node attribute, so it is dropped from the output.

I think this is an oversight in 3be4d18.
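
To illustrate how this kind of loss can happen, here is a minimal sketch. It is not Augur's actual code: it assumes the metadata is loaded into a mapping keyed by the ID column, with that column removed from each row's remaining fields. The helper names `index_rows` and `index_rows_keep_id` and the sample values are hypothetical.

```python
import csv
import io

# Hypothetical two-row metadata table; values are made up for illustration.
METADATA_TSV = "strain\taccession\tcountry\nA\tACC001\tX\nB\tACC002\tY\n"


def index_rows(tsv_text, id_column):
    """Key rows by id_column, dropping that column from each row's fields."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    indexed = {}
    for row in reader:
        key = row.pop(id_column)  # the ID value now lives only in the key
        indexed[key] = row
    return indexed


def index_rows_keep_id(tsv_text, id_column):
    """Same as index_rows, but keep the ID column among the row's fields."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[id_column]: dict(row) for row in reader}


lossy = index_rows(METADATA_TSV, "accession")
kept = index_rows_keep_id(METADATA_TSV, "accession")

# With the lossy variant, "accession" is no longer a field of the row,
# so anything that exports row fields as node attributes will skip it.
print("accession" in lossy["ACC001"])
print("accession" in kept["ACC001"])
```

Under that assumption, any fix needs to make the ID value available again as a regular field before node attributes are assembled.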

Run export with metadata that contains "accession", and use "accession" as the ID column.
Currently, this results in losing the accession from the node attributes.
$ ${AUGUR} export v2 \
> --tree "$TESTDIR/../data/tree.nwk" \
> --metadata "$TESTDIR/../data/dataset1_metadata_with_strain_and_accession.tsv" \
> --metadata-id-columns accession \
> --node-data "$TESTDIR/../data/div_node-data.json" "$TESTDIR/../data/location_node-data.json" \
> --auspice-config "$TESTDIR/../data/auspice_config1.json" \
> --maintainers "Nextstrain Team" \
> --output dataset.json > /dev/null
$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" "$TESTDIR/../data/dataset_with_accession.json" dataset.json \
> --exclude-paths "root['meta']['updated']" "root['meta']['maintainers']"
{'dictionary_item_removed': [root['tree']['children'][0]['node_attrs']['accession'], root['tree']['children'][1]['children'][0]['node_attrs']['accession'], root['tree']['children'][1]['children'][1]['node_attrs']['accession'], root['tree']['children'][2]['children'][0]['node_attrs']['accession'], root['tree']['children'][2]['children'][1]['node_attrs']['accession'], root['tree']['children'][2]['children'][2]['node_attrs']['accession']]}

Expected behavior

The "accession" column can both (1) be used as the metadata ID column and (2) be exported as a node attribute.

$ python3 "$TESTDIR/../../../../scripts/diff_jsons.py" "$TESTDIR/../data/dataset_with_accession.json" dataset.json \
> --exclude-paths "root['meta']['updated']" "root['meta']['maintainers']"
{}
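
The same check can be made without a reference dataset by walking the exported tree directly. The sketch below assumes the layout visible in the diff output above: a single `tree` object with nested `children` lists and a `node_attrs` mapping on each node. The function name `tips_missing_attr` and the sample tree are hypothetical.

```python
def tips_missing_attr(node, attr, path="root['tree']"):
    """Return the paths of tip nodes whose node_attrs lack `attr`."""
    children = node.get("children", [])
    if not children:
        # A tip: report it only if the attribute is absent.
        return [] if attr in node.get("node_attrs", {}) else [path]
    missing = []
    for i, child in enumerate(children):
        child_path = f"{path}['children'][{i}]"
        missing.extend(tips_missing_attr(child, attr, child_path))
    return missing


# Made-up tree for illustration: one tip has an accession, one does not.
sample_tree = {
    "name": "ROOT",
    "node_attrs": {},
    "children": [
        {"name": "tipA", "node_attrs": {"accession": {"value": "ACC001"}}},
        {"name": "tipB", "node_attrs": {}},
    ],
}

print(tips_missing_attr(sample_tree, "accession"))
```

For a real export, the input would be the `tree` entry of the parsed `dataset.json`; an empty list would correspond to the expected behavior.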

Possible solution

0ec4cc6
