# Pango X lineage subgraphs

In this document we display subgraphs for all the main Pango X lineages that have samples present in the _sc2ts_ ARG. Pango designations for both samples and internal nodes were assigned using <a href="https://cov-lineages.org/resources/pangolin/">Pangolin 4.3.1</a> (see main text). For nodes with large numbers of descendants, only a selection of roughly 20-50 Pango X samples is shown in these subgraphs. Where a node has further immediate children that are not plotted, these are indicated with dotted lines (note that the number of additional immediate children is not the same as the number of descendant samples). In some cases, additional descendant nodes of different Pango designations (e.g. BA.2) are shown for context.

Sample nodes are shown as squares; internal inferred nodes as circles. Recombination nodes are presented as larger circular nodes, labelled with a Pango designation followed by the breakpoint position(s) surrounded by slashes, e.g. a breakpoint at position 1234 bp is indicated as **/1234/** (but note that Pango X lineages that are not of recombinant origin in _sc2ts_ will not have a clear recombination node). Nodes of the focal Pango type are plotted in pink, or in a set of alternative colours if multiple Pango designations are plotted in a single subgraph.

Mutations within each subgraph (tickmarks along edges) are coloured pink if they are flagged as consensus mutations of the focal Pango lineage(s); often such mutations occur in lineages above the Pango X origination node. Alternatively, if there are multiple mutations at the same site within a subgraph (indicating problematic reversions or recurrent mutations), they are plotted in a colour unique to that site. For example, two green mutation tickmarks represent mutations at the same site. If one is a reversion of a previous mutation (often indicating an unparsimonious reconstruction of topology), then it is emphasised with a solid black outline. Deletion mutations are filled in black, and insertions are given strong emphasis using a magenta fill; where a site experiences multiple deletions or insertions, the outline is given a site-specific colour. Note that reinsertions of a previous large deletion are biologically implausible, and are likely to represent ARG reconstruction artefacts.
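
The same-site colouring and reversion outline described above can be illustrated with a small, self-contained sketch. It is not the notebook's plotting code: the `Mut` record and the `classify` helper are hypothetical names, and the `parent` field is assumed to point at the mutation directly above at the same site (similar in spirit to the `parent` column of a tskit mutation table).

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mut:
    # Hypothetical record for one mutation tickmark in a subgraph
    position: int
    inherited: str
    derived: str
    parent: Optional[int] = None  # index of the mutation above at the same site


def classify(muts):
    """Return (sites with more than one mutation, indices of reversions)."""
    by_site = defaultdict(list)
    for i, m in enumerate(muts):
        by_site[m.position].append(i)
    # Sites hit more than once would share a site-specific colour
    multi_site = {pos: ids for pos, ids in by_site.items() if len(ids) > 1}
    # A reversion restores the state that its parent mutation changed
    reversions = {
        i
        for i, m in enumerate(muts)
        if m.parent is not None and m.derived == muts[m.parent].inherited
    }
    return multi_site, reversions


muts = [
    Mut(1234, "A", "T"),
    Mut(1234, "T", "A", parent=0),  # reverts the mutation above it
    Mut(5678, "C", "T"),
    Mut(5678, "C", "T"),  # recurrent mutation on a separate branch
    Mut(9000, "G", "A"),
]
multi_site, reversions = classify(muts)
```

In this toy input, sites 1234 and 5678 would each receive their own colour, and only the second mutation at site 1234 would be drawn with the black reversion outline.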

In the PDF version of this document, hovering over node names will reveal the sample_id of a node, and hovering over a mutation will reveal the position of the mutation and the inherited vs derived state. For example, a mouseover label of <code>mut:A1234T</code> denotes a mutation from an A to a T at position 1234 in the genome. Technically this is implemented by faking a URL (this leads to the slightly annoying behaviour that actually clicking on the hover-over text will attempt to open a non-existent URL).
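
For readers who want to process these mouseover labels programmatically, the sketch below splits a label of the form <code>mut:A1234T</code> into its inherited state, position, and derived state. The function and class names are hypothetical, and the pattern assumes single-character states (allowing `-` for a gap is an assumption, not something documented here).

```python
import re
from typing import NamedTuple


class MutationLabel(NamedTuple):
    inherited: str
    position: int
    derived: str


# Assumed label shape: "mut:" + inherited state + position + derived state
_LABEL = re.compile(r"^mut:([A-Z-])(\d+)([A-Z-])$")


def parse_mutation_label(label):
    """Parse a label such as 'mut:A1234T'; raise ValueError if it does not match."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"Unrecognised mutation label: {label!r}")
    inherited, position, derived = match.groups()
    return MutationLabel(inherited, int(position), derived)


parsed = parse_mutation_label("mut:A1234T")
```

Here `parsed.position` is the integer 1234, with `parsed.inherited` and `parsed.derived` holding the A and T states respectively.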

For Pango X lineages that have a recombinant origin in the ARG, the summary copying pattern is also displayed below the subgraph. 

In [None]:
## NOTE: this notebook is not saved with the output cells completed, as this makes the file uncomfortably large
#
# Export only the output cells to PDF via:
# jupyter nbconvert --to webpdf --no-prompt --no-input --PDFExporter.scale_factor=0.8  --TagRemovePreprocessor.remove_cell_tags='{"remove_cell"}' --PDFExporter.margin_left=0.2cm --PDFExporter.margin_right=0.2cm Viridian-PangoX.ipynb
# A fancier option is to use the src/makepdf.py script, which creates hover-over labels, e.g. to save figures/supp_pdf-subgraphs-PangoX.pdf:
#  python src/makepdf.py notebooks/supp_pdf-subgraphs-PangoX.ipynb -o figures 


In [None]:
import collections

import numpy as np
import pandas as pd
import sc2ts
import tskit
import tszip
import warnings
from IPython.display import HTML

import nb_utils
from nb_utils import DATA_DIR

HTML("""<style>@page {margin: 0.5cm;}</style>""")  # Allow space for copying patterns in the pdf

In [None]:
# Get the Viridian ARG
ts = tszip.load(DATA_DIR / "sc2ts_viridian_v1.2.trees.tsz")
df = sc2ts.node_data(ts).set_index("sample_id")

# Join with the associated data
ds = nb_utils.load_dataset()
df = df.join(ds.metadata.as_dataframe(["Viridian_pangolin"]))

hide_progress = True  # Set to `True` and rerun the notebook to get a nicer version for PDF output

In [None]:
# Set which pango designation to use:
# Use "pango" to get the pango designations for all nodes computed by postprocessing the ARG.
# Use "Viridian_pangolin" to use the sample designations provided by Viridian.
Pango = "pango"

In [None]:
dfX = pd.read_csv("../data/pango_x_events.csv")
pango_lineage_samples = df[df.is_sample].groupby(Pango)['node_id'].apply(list).to_dict()
pangoX = np.unique(dfX.root_pango)
display(HTML(
    f'<table><tr><th>{len(pangoX)} pango-X lineages</th></tr>'
    f'<tr><td>{", ".join(pangoX)}</td></tr></table>'
))

In [None]:
print("Consensus mutations for each lineage taken from https://covidcg.org")
lineage_consensus_muts = nb_utils.read_in_mutations("../data/consensus_mutations.json.bz2")

In [None]:
# Load in the ARG to the visualizer - can take a few minutes
arg = nb_utils.D3ARG_viz(ts, df, lineage_consensus_muts, pangolin_field=Pango, progress=not hide_progress)

In [None]:
arg.set_sc2ts_node_labels(progress=not hide_progress)

arg.d3arg.nodes.loc[arg.d3arg.nodes.id == 86456, 'label'] = "Alpha-root"
arg.d3arg.nodes.loc[arg.d3arg.nodes.id == 200039, 'label'] = "Delta-root"
arg.d3arg.nodes.loc[arg.d3arg.nodes.id == 851246, 'label'] = "BA.1-root"
arg.d3arg.nodes.loc[arg.d3arg.nodes.id == 822854, 'label'] = "BA.2-root"
arg.d3arg.nodes.loc[arg.d3arg.nodes.id == 1265302, 'label'] = "BA.4-root"
arg.d3arg.nodes.loc[arg.d3arg.nodes.id == 1189192, 'label'] = "BA.5-root"
arg.set_sc2ts_node_styles()

In [None]:
# Scale all the viz versions for print, so that a standard 750 x 1000 subgraph fits onto one side of A4
display(HTML(
    """<style>
    @media print {.d3arg {zoom: 0.8}}
    .big table.copying-table {font-size: 8px; margin-left: auto !important; margin-right: auto !important; @media print {zoom: 0.8}}
    .small table.copying-table {font-size: 8px; margin-left: auto !important; margin-right: auto !important; @media print {zoom: 0.6}}
    table.copying-table .pattern td {font-size: 0.5em; width:0.7em}
    </style>"""
));

def txt(html, right="33em", top="15em", width="275px"):
    return (
        f'<div style=\"position: absolute; z-index:1; right:{right}; top:{top}; width:{width};'
        f'border:1px solid black; padding: 0.5em;\">{html}</div>'
    )

def copypattern(u, css_class="big", hide_extra_rows=True, hide_labels=True, show_bases=None, **kwargs):
    display(HTML(
        f'<div class="{css_class}">' +
        sc2ts.info.CopyingTable(ts, u).html(hide_extra_rows=hide_extra_rows, hide_labels=hide_labels, show_bases=show_bases, **kwargs) +
        '</div>' 
    ))

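class RecordPango:
    """Record every Pango designation passed to pango(), so coverage can be checked at the end."""
    def __init__(self):
        self.pangos = []

    def pango(self, pango_string):
        # Accept either a single designation or a list of designations
        if isinstance(pango_string, str):
            self.pangos.append(pango_string)
        else:
            self.pangos.extend(pango_string)
        return pango_string
rec = RecordPango()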

In [None]:
arg.plot_pango_subgraph(
    rec.pango("XA"),
    y_axis_scale="rank",
    parent_pangos=("B.1.1.7", "B.1.177.18"),
    tree_highlighting=False,
)
copypattern(122444)

In [None]:
# XB has too many samples so we remove the immediate children of the XB root
exclude = np.unique(ts.edges_child[ts.edges_parent==223239])
exclude = exclude[exclude != 223230]

keep_ids = list(df.loc[["SRR19258389", "SRR16145215", "SRR14453168", "SRR16733818", "SRR14898852"], 'node_id'])

arg.plot_pango_subgraph(
    rec.pango("XB"),
    child_levels=0,
    include=keep_ids,
    restrict_to_first=20,
    exclude=exclude,
    parent_pangos=["B.1.631", "B.1.634", "B.1.627"],
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XB", 'node_id']},
    tree_highlighting=False,
)


In [None]:
arg.plot_pango_subgraph(
    rec.pango("XC"),
    parent_levels=5,
    y_axis_scale="rank",
    parent_pangos=["AY.29", "B.1.1.7"],
    oldest_y_label="2020-09",
    tree_highlighting=False,
)
copypattern(414488)

In [None]:
pangoX = ["XE", "XH"]
keep_ids = list(df.loc[["SRR17712953", "ERR9656085", "ERR9657718"], 'node_id']) + [1177107]
cmap = {'lightgrey': keep_ids}
cmap.update({c: pango_lineage_samples[pX] for c, pX in zip(arg.colours, pangoX)})


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    arg.plot_pango_subgraph(
        rec.pango(pangoX),
        include=keep_ids,
        restrict_to_first=20,
        parent_pangos=["BA.1.17.2", "BA.2"],
        child_levels=0,
        parent_levels=4,
        y_axis_scale="rank",
        oldest_y_label="2021-11",
        highlight_nodes=cmap,
        tree_highlighting=False,
    )
copypattern(965353)


In [None]:
arg.plot_pango_subgraph(
    rec.pango("XF"),
    parent_pangos=["BA.1", "AY.4"],
    parent_levels=11,
    y_axis_scale="rank",
    oldest_y_label="2020-04",
    tree_highlighting=False,
)
copypattern(946761, css_class="small")

In [None]:
keep_ids = list(df.loc[["ERR9124067"], 'node_id'])


arg.plot_pango_subgraph(
    rec.pango("XG"),
    parent_pangos=["BA.1.17", "BA.2"],
    include=keep_ids,
    y_axis_scale="rank",
    parent_levels=9,
    oldest_y_label="2021-08",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XG", 'node_id']},
    tree_highlighting=False,
)
copypattern(1083412)

In [None]:
keep_ids = [1090786, 2258352]


arg.plot_pango_subgraph(
    rec.pango("XJ"),
    include=keep_ids,
    y_axis_scale="rank",
    parent_pangos=["BA.1.17.2", "BA.2"],
    parent_levels=5,
    oldest_y_label="2020-06",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XJ", 'node_id']},
    tree_highlighting=False,
)
copypattern(966905)

In [None]:
keep_ids = list(df.loc[["SRR18781053"], 'node_id'])

arg.plot_pango_subgraph(
    rec.pango("XL"),
    include=keep_ids,
    y_axis_scale="rank",
    parent_pangos=["BA.1.17.2", "BA.2"],
    parent_levels=5,
    oldest_y_label="2020-06",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XL", 'node_id']},
    tree_highlighting=False,
)
copypattern(1034619)

In [None]:
pangos = rec.pango(["XM", "XAL"])
arg.plot_pango_subgraph(
    pangos,
    parent_pangos=["BA.1.1", "BA.2"],
    child_levels=0,
    parent_levels=7,
    highlight_nodes={c: pango_lineage_samples[pX] for c, pX in zip(arg.colours, pangos)},
    y_axis_scale="rank",
    tree_highlighting=False,
)

copypattern(1003220)

In [None]:
pangoX = rec.pango(["XN", "XAU"])
keep_ids = [1230542, 1235915, 1250403, 2546319, 2508083]
cmap = {'lightgrey': keep_ids}
cmap.update({c: pango_lineage_samples[pX] for c, pX in zip(arg.colours, pangoX)})

arg.plot_pango_subgraph(
    pangoX,
    include=keep_ids,
    parent_levels=7, child_levels=0,
    parent_pangos=["BA.2"],
    highlight_nodes=cmap,
    oldest_y_label="2020-07",
    y_axis_scale="rank",
    tree_highlighting=False,
)

In [None]:
keep_ids = list(df.loc[["ERR8688510", "ERR8992715", "ERR8758930", "ERR8992748"], 'node_id'])

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    arg.plot_pango_subgraph(
        rec.pango("XP"),
        parent_levels=11,
        include=keep_ids,
        child_levels=0,
        parent_pangos=["BA.1.1"],
        highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XP", 'node_id']},
        y_axis_scale="rank",
        oldest_y_label="2021-06",
        tree_highlighting=False,
    )

In [None]:
pangoX = rec.pango(["XQ", "XR", "XU", "XAA", "XAG", "XAM"])
keep_ids = [1216524, 1240312, 1105611, 2534291, 2534290, 1202063, 1158324, 1080162]
cmap = {'lightgrey': keep_ids}
cmap.update({c: pango_lineage_samples[pX] for c, pX in zip(arg.colours, pangoX)})

arg.plot_pango_subgraph(
    pangoX,
    include=keep_ids,
    restrict_to_first=10,
    parent_levels=10,
    child_levels=0,
    parent_pangos=["BA.1.1.15", "BA.2.9"],
    highlight_nodes=cmap,
    y_axis_scale="rank",
    oldest_y_label="2021-10",
    tree_highlighting=False,
)

copypattern(1058654)

In [None]:
keep_ids = [881625]
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    arg.plot_pango_subgraph(
        rec.pango("XS"),
        include=keep_ids,
        parent_levels=6,
        parent_pangos=["AY.103", "BA.1.1"],
        y_axis_scale="rank",
        oldest_y_label="2021-09",
        highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XS", 'node_id']},
        tree_highlighting=False,
    )

copypattern(1000242, css_class="small")
copypattern(1014313, css_class="small")

In [None]:
arg.plot_pango_subgraph(
    rec.pango("XW"),
    parent_levels=6,
    parent_pangos=["BA.1.1.15", "BA.2"],
    oldest_y_label="2021-11",
    y_axis_scale="rank",
    tree_highlighting=False,
)

copypattern(1159411)

In [None]:
keep_ids = list(df.loc[["ERR8627048"], 'node_id'])


arg.plot_pango_subgraph(
    rec.pango("XY"),
    include=keep_ids,
    parent_levels=6,
    parent_pangos=["BA.1.1", "BA.2"],
    oldest_y_label="2021-11",
    y_axis_scale="rank",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XY", 'node_id']},
    tree_highlighting=False,
)

copypattern(1187989)

In [None]:
pangoX = rec.pango(["XZ", "XAC", "XAD", "XAE", "XAP"])
keep_ids = list(df.loc[["SRR19689888", "ERR8146303", "ERR8163061", "SRR19689888", "SRR17712953"], 'node_id'])

cmap = {'lightgrey': keep_ids}
cmap.update({c: pango_lineage_samples[pX] for c, pX in zip(arg.colours, pangoX)})


arg.plot_pango_subgraph(
    pangoX,
    y_axis_scale="rank",
    include=keep_ids,
    parent_levels=3,
    highlight_nodes=cmap,
    oldest_y_label="2021-11",
    tree_highlighting=False,
)

copypattern(964555)

In [None]:
keep_ids = [1177107]

arg.plot_pango_subgraph(
    rec.pango("XAF"),
    include=keep_ids,
    parent_levels=1,
    child_levels=10,
    parent_pangos=["BA.2", "BA.1"],
    y_axis_scale="rank",
    oldest_y_label="2022-01",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XAF", 'node_id']},
    tree_highlighting=False,
)

copypattern(1177107)

In [None]:
keep_ids = [1149781, 1149782, 1272603]


with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    arg.plot_pango_subgraph(
        rec.pango("XAJ"),
        include=keep_ids,
        parent_levels=6,
        child_levels=0,
        parent_pangos=["BA.2.12"],
        y_axis_scale="rank",
        oldest_y_label="2021-11",    
        highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XAJ", 'node_id']},
        tree_highlighting=False,
    )

In [None]:
keep_ids = list(df.loc[["ERR9932299", "ERR9615608", "SRR20779844"], 'node_id'])
keep_ids += [1363957, 887654, 863361, 1185810, 2289990]
pangoX = rec.pango(["XAN", "XAV"])

cmap = {'lightgrey': keep_ids}
cmap.update({c: pango_lineage_samples[pX] for c, pX in zip(arg.colours, pangoX)})

arg.plot_pango_subgraph(
    pangoX,
    txt(
        'In this non-recombinant subgraph we also plot the separate recent ancestry of a BA.2.5 sample '
        'of the sort suggested as a left parent in Pango designation issues '
        '<a href="https://github.com/cov-lineages/pango-designation/issues/771">#771</a> and '
        '<a href="https://github.com/cov-lineages/pango-designation/issues/911">#911</a>.'
    ),
    parent_levels=2,
    child_levels=0,
    include=keep_ids,
    parent_pangos=["BA.5.1", "BA.5.1.24"],
    highlight_nodes=cmap,
    oldest_y_label="2021-11",
    tree_highlighting=False,
)

In [None]:
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    arg.plot_pango_subgraph(
        rec.pango("XAS"),
        include=[1275206],
        parent_levels=5,
        parent_pangos=["BA.4"],
        oldest_y_label="2021-11",
        y_axis_scale="rank",
        tree_highlighting=False,
    )

In [None]:
keep_ids = list(df.loc[["ERR9810090", "SRR20430410", "SRR21294070", "ERR9615610", "ERR8974195"], 'node_id'])

arg.plot_pango_subgraph(
    rec.pango("XAZ"),
    txt(
        'In this non-recombinant subgraph we also plot the separate recent ancestry of a BA.2.5 sample '
        'of the sort suggested as a left parent (<= position 3358) in Pango designation issue '
        '<a href="https://github.com/cov-lineages/pango-designation/issues/797">#797</a>. '
        'It can be seen that the 3 mutations above the originating XAZ node are shared with this lineage.'
    ),
    include=keep_ids,
    restrict_to_first=20,
    parent_levels=2,
    child_levels=0,
    parent_pangos=["BA.5"],
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XAZ", 'node_id']},
    oldest_y_label="2021-11",
    tree_highlighting=False,
)

In [None]:
keep_ids = list(df.loc[["SRR22136847"], 'node_id'])


arg.plot_pango_subgraph(
    rec.pango("XBB"),
    restrict_to_first=20,
    #include=[1408964, 1396838, 1404568, 1423196, 1398292, 2681617, 1409763],
    include=[1396843] + keep_ids,
    parent_levels=5,
    child_levels=2,
    parent_pangos=["BA.2.10", "BM.1.1.1"],
    oldest_y_label="2022-02",
    y_axis_scale="rank",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XBB", 'node_id']},
    tree_highlighting=False,
)
copypattern(1396207)

In [None]:
# This is a weird one: XBB.1 is not a recombinant, but there *is* a recombinant associated with samples
# ERR10937584 & ERR10839902 both of which are labelled XBB.1. We are denoting this XBB.x
# There is another recombination node under XBB which is labelled BA.2, which we show here for contrast

keep_ids = list(df.loc[["ERR10839848", "ERR10839902", "ERR10937584", "ERR10792363", "SRR22059625"], 'node_id'])


arg.plot_pango_subgraph(
    rec.pango("XBB.1"),
    txt(
        "XBB.1 is not a novel Pango recombinant, but a few XBB.1 samples give rise to novel recombinants. "
        "The one to the left with 2 XBB.1 sample descendants, we denote as XBB.x. Its copying pattern is below.",
        right="17em",
        top="70em",
    ),
    restrict_to_first=20,
    include=keep_ids,
    parent_levels=4,
    child_levels=0,
    parent_pangos=["BA.2.10", "BM.1.1.1"],
    oldest_y_label="2022-02",
    y_axis_scale="rank",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XBB.1", 'node_id']},
    tree_highlighting=False,
)
copypattern(1429711)

In [None]:
keep_ids = list(df.loc[["ERR9825609", "SRR21382561"], 'node_id'])

arg.plot_pango_subgraph(
    rec.pango("XBD"),
    include=keep_ids,
    y_axis_scale="rank",
    parent_levels=6,
    parent_pangos=["BA.5.2.1", "BA.2.75.2"],
    oldest_y_label="2022-02",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XBD", 'node_id']},
    tree_highlighting=False,
)
copypattern(1378208)

In [None]:
keep_ids = list(df.loc[["SRR20747959"], 'node_id'])

arg.plot_pango_subgraph(
    rec.pango("XBE"),
    include=keep_ids + [1348953],
    y_axis_scale="rank",
    parent_levels=5,
    child_levels=0,
    parent_pangos=["BA.5.2"],
    oldest_y_label="2022-01",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XBE", 'node_id']},
    tree_highlighting=False,
)

In [None]:
keep_ids = list(df.loc[["SRR21948068"], 'node_id'])

arg.plot_pango_subgraph(
    rec.pango("XBF"),
    include=keep_ids,
    parent_levels=6,
    child_levels=0,
    parent_pangos=["BA.5.2.1", "CJ.1"],
    oldest_y_label="2022-03",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XBF", 'node_id']},
    tree_highlighting=False,
)
copypattern(1420385)

In [None]:
arg.plot_pango_subgraph(
    rec.pango("XBG"),
    parent_levels=7,
    parent_pangos=["BA.2.76", "BA.5.2"],
    oldest_y_label="2022-03",
    tree_highlighting=False,
)
copypattern(1291970)

In [None]:
keep_ids = list(df.loc[["SRR21796989"], 'node_id'])

arg.plot_pango_subgraph(
    rec.pango("XBH"),
    parent_levels=5,
    child_levels=3,
    include=keep_ids,
    y_axis_scale="rank",
    parent_pangos=["BA.2.1", "BA.2.75.2"],
    oldest_y_label="2021-12",
    tree_highlighting=False,
)
copypattern(1379419)

In [None]:
pangoX = rec.pango(["XBK", "XBK.1", "XBQ"])
keep_ids = list(df.loc[["ERR10797708", "ERR10791552", "ERR10791704", "ERR10797699", "SRR22136847", "SRR21948068", "ERR10431237", "SRR21886699", "SRR22136951"], 'node_id'])
cmap = {'lightgrey': keep_ids}
cmap.update({c: pango_lineage_samples[pX] for c, pX in zip(arg.colours, pangoX)})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    arg.plot_pango_subgraph(
        pangoX,
        include=keep_ids,
        parent_levels=11,
        child_levels=0,
        parent_pangos=["BA.2", "BA.2.5", "BA.2.75", "BA.2.75.3", "BM.2", "BM.1.1.1", "BM.1.1", "CJ.1"],
        highlight_nodes=cmap,
        oldest_y_label="2021-11",
        tree_highlighting=False,
    )

In [None]:
keep_ids = list(df.loc[["SRR21672613", "ERR10770184", "SRR20303898"], 'node_id'])

arg.plot_pango_subgraph(
    rec.pango("XBM"),
    parent_levels=10,
    child_levels=0,
    include=keep_ids,
    y_axis_scale="rank",
    parent_pangos=["BA.2.76", "BF.3"],
    oldest_y_label="2021-11",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XBM", 'node_id']},
    tree_highlighting=False,
)
copypattern(1348822)

In [None]:
keep_ids = list(df.loc[["SRR22317465"], 'node_id'])
arg.plot_pango_subgraph(
    rec.pango("XBR"),
    parent_levels=5,
    child_levels=0,
    include=[1405941],
    parent_pangos=["BN.3.1", "BQ.1.25"],
    oldest_y_label="2022-04",
    highlight_nodes={"lightgrey": keep_ids, arg.highlight_colour: df.loc[df.pango=="XBR", 'node_id']},
    tree_highlighting=False,
)
copypattern(1420166)

In [None]:
# Check we have plotted all the expected Pango X lineages
assert set(dfX.root_pango) - set(rec.pangos) == set()