Plotting interface lacks an option to prune chart strains to those present in the tree #6

@huddlej

Description

While the tree-annotated-plot flag --prune-tree-to-chart allows strains to exist in the tree that don't exist in the chart, there is no flag to allow strains to exist in the chart that don't exist in the tree. Some older titer plots include data for multiple subtypes in the chart JSON even when the plots are subtype-specific. When I try to plot a subtype-specific tree with these older chart JSONs, the plotting tool errors because strains exist in the chart that do not exist in the tree.

Steps to recreate

  1. Open H1N1pdm individual titers from 2025: https://jbloomlab.github.io/flu-seqneut-2025/human_sera_titers_H1N1_recent_individual_sera.html
  2. Select Altair “…” menu in the top right
  3. Select “Open in Vega Editor”
  4. Select “Export” from the Vega Editor
  5. Download JSON in Vega-lite format as titers.json
  6. In the same directory as the titers.json file, download the corresponding H1N1pdm HA tree from Nextstrain with the following command:
curl https://nextstrain.org/groups/blab/kikawa-seqneut-2025-VCM/h1n1pdm \
    --header 'Accept: application/vnd.nextstrain.dataset.main+json' \
    --compressed > h1n1pdm_sep_2025_vcm.json
  7. Run the following tree-annotated-plot command:
tree-annotated-plot \
    --no-strict-version \
    --prune-tree-to-chart \
    --tree h1n1pdm_sep_2025_vcm.json \
    --chart titers.json \
    --output combined.html \
    --chart-strain-field virus \
    --tree-strain-field name \
    --branch-length div

This command produces the following error:

Traceback (most recent call last):
  File "/Users/jhuddlesfredhutch.org/miniconda3/envs/tree-annotated-plot/bin/tree-annotated-plot", line 6, in <module>
    sys.exit(main())
             ~~~~^^
  File "/Users/jhuddlesfredhutch.org/miniconda3/envs/tree-annotated-plot/lib/python3.13/site-packages/click/core.py", line 1514, in __call__
    return self.main(*args, **kwargs)
           ~~~~~~~~~^^^^^^^^^^^^^^^^^
  File "/Users/jhuddlesfredhutch.org/miniconda3/envs/tree-annotated-plot/lib/python3.13/site-packages/click/core.py", line 1435, in main
    rv = self.invoke(ctx)
  File "/Users/jhuddlesfredhutch.org/miniconda3/envs/tree-annotated-plot/lib/python3.13/site-packages/click/core.py", line 1298, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/jhuddlesfredhutch.org/miniconda3/envs/tree-annotated-plot/lib/python3.13/site-packages/click/core.py", line 853, in invoke
    return callback(*args, **kwargs)
  File "/Users/jhuddlesfredhutch.org/miniconda3/envs/tree-annotated-plot/lib/python3.13/site-packages/tree_annotated_plot/cli.py", line 291, in main
    out = _build(tree_path, chart_path, config)
  File "/Users/jhuddlesfredhutch.org/miniconda3/envs/tree-annotated-plot/lib/python3.13/site-packages/tree_annotated_plot/_plot.py", line 171, in _build
    _reconcile_tips_and_strains(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        tree_strains=tip_names,
        ^^^^^^^^^^^^^^^^^^^^^^^
    ...<5 lines>...
        tree_source=tree,
        ^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/jhuddlesfredhutch.org/miniconda3/envs/tree-annotated-plot/lib/python3.13/site-packages/tree_annotated_plot/_plot.py", line 742, in _reconcile_tips_and_strains
    raise ValueError(
    ...<10 lines>...
    )
ValueError: 101 chart strain(s) are not present in the tree (these would be silently dropped if we pruned, so this is always fatal).

Tried: chart_strain_field='virus', tree_strain_field='name'
Sample chart_strain_field values: ['A/Amapa/021563-IEC/2024_H3N2', 'A/Badajoz/18680568/2025_H3N2', 'A/Bangkok/P176/2025_H1N1', 'A/Brisbane/02/2018_H1N1', 'A/BurkinaFaso/3131/2023_H3N2']
Sample tree_strain_field values:  ['A/Abudhabi/13621/2025', 'A/Abudhabi/13978/2024', 'A/Abudhabi/14069/2024', 'A/Abudhabi/15586/2024', 'A/Abudhabi/15664/2024']
Sample chart-only values: ['A/Amapa/021563-IEC/2024_H3N2', 'A/Badajoz/18680568/2025_H3N2', 'A/Brisbane/02/2018_H1N1', 'A/BurkinaFaso/3131/2023_H3N2', 'A/Busan/461/2025_H3N2']

The error occurs because the titers chart JSON file contains H3N2 records in the chart datasets field, even though the original chart displayed only H1N1pdm records.

For backward compatibility with these older charts, it would be nice if the tool provided an option for users to disable the strict strain checking (e.g., --prune-chart-to-tree).
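
Until such an option exists, a possible workaround is to filter the chart JSON before plotting. The sketch below is not part of tree-annotated-plot, and the function names are hypothetical. It assumes the chart is a Vega-Lite spec whose top-level datasets field maps dataset names to lists of records, and that the tree is an Auspice JSON with nested children under its tree key. Records that lack the strain field are kept as they are. Anything else in the spec that refers to the dropped strains (for example, explicit scale domains) is left untouched.

```python
import copy
import json


def collect_tip_names(tree, field="name"):
    """Return the set of `field` values found on the leaves of an Auspice tree.

    `tree` is the value of the "tree" key: a single root node or a list of roots.
    """
    stack = list(tree) if isinstance(tree, list) else [tree]
    names = set()
    while stack:  # iterative walk, so deep trees cannot hit the recursion limit
        node = stack.pop()
        children = node.get("children") or []
        if children:
            stack.extend(children)
        else:
            names.add(node[field])
    return names


def prune_chart_to_tree(chart, tree_strains, strain_field="virus"):
    """Return a copy of `chart` whose dataset records are limited to `tree_strains`.

    Assumption: chart["datasets"] maps names to lists of dict records.
    Datasets in any other shape are copied through unchanged.
    """
    pruned = copy.deepcopy(chart)
    datasets = pruned.get("datasets", {})
    for name in list(datasets):
        records = datasets[name]
        if not isinstance(records, list):
            continue
        datasets[name] = [
            record
            for record in records
            if not isinstance(record, dict)
            or strain_field not in record
            or record[strain_field] in tree_strains
        ]
    return pruned


def write_pruned_chart(chart_path, tree_path, output_path,
                       chart_strain_field="virus", tree_strain_field="name"):
    """Read a chart and a tree from disk and write the pruned chart."""
    with open(chart_path) as handle:
        chart = json.load(handle)
    with open(tree_path) as handle:
        tree = json.load(handle)["tree"]
    tips = collect_tip_names(tree, tree_strain_field)
    pruned = prune_chart_to_tree(chart, tips, chart_strain_field)
    with open(output_path, "w") as handle:
        json.dump(pruned, handle)
```

With the files from the steps above, calling write_pruned_chart("titers.json", "h1n1pdm_sep_2025_vcm.json", "titers_pruned.json") would write a chart that could then be passed to --chart in place of titers.json. Whether the resulting plot matches what a built-in --prune-chart-to-tree option would produce is untested.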
