In [None]:
%matplotlib inline

Plotting Sankey diagrams
========================


In this notebook, we will showcase how to use
`moscot.plotting.sankey`. We use the HSPC
dataset to demonstrate the usage.


In [None]:
from moscot.datasets import hspc
from moscot.problems.time import TemporalProblem
import moscot.plotting as mpl

adata = hspc()

First, we need to prepare and solve the problem. Here, we set the
`threshold` parameter to a relatively high value to speed up
convergence at the cost of lower quality.


In [None]:
tp = TemporalProblem(adata).prepare(time_key="day").solve(epsilon=1e-2, threshold=1e-2)

As with all plotting functionality in moscot, we first call the method
of the problem class, which stores the results of the computation in the
`anndata.AnnData` instance. Let us assume we want to plot the Sankey
diagram across all time points 2, 3, 4, and 7. Moreover, we want the
Sankey diagram to visualize flows between cell types. In general, we can
visualize the flow defined by any column in `anndata.AnnData.obs` via
the `source_groups` and `target_groups` parameters, respectively. In
this example, we are interested in descendants as opposed to ancestors,
which is why we set `forward` to `True`. The information required to
plot the Sankey diagram is contained in transition matrices, which we
would obtain by setting `return_data` to `True`. Here, we are only
interested in the visualization.


In [None]:
tp.sankey(source=2, target=7, source_groups="cell_type", target_groups="cell_type", forward=True, return_data=False)
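To make the idea of a group-level transition matrix more concrete, the following is a minimal sketch using only NumPy and pandas. It is an illustration of the concept and not moscot's internal implementation: the toy coupling values, the labels, and all variable names are made up for this example.

```python
import numpy as np
import pandas as pd

# Toy coupling between 4 source cells (rows) and 3 target cells (columns).
# The values are invented for illustration and sum to 1.
coupling = np.array(
    [
        [0.10, 0.05, 0.00],
        [0.05, 0.10, 0.05],
        [0.00, 0.05, 0.20],
        [0.10, 0.00, 0.30],
    ]
)
source_labels = ["HSC", "HSC", "MkP", "MkP"]  # cell type of each source cell
target_labels = ["HSC", "MkP", "MkP"]  # cell type of each target cell
groups = ["HSC", "MkP"]

# One-hot membership matrices: entry (i, g) is 1 if cell i belongs to group g.
S = np.array([[lab == g for g in groups] for lab in source_labels], dtype=float)
T = np.array([[lab == g for g in groups] for lab in target_labels], dtype=float)

# Sum the transported mass within each (source group, target group) pair.
flows = S.T @ coupling @ T

# Forward view: normalize each row so that it describes how the mass of a
# source group is distributed over the target groups.
forward = flows / flows.sum(axis=1, keepdims=True)

forward_df = pd.DataFrame(forward, index=groups, columns=groups)
print(forward_df)
```

Each row of `forward_df` sums to one and corresponds to the widths of the bands leaving one source group in a Sankey diagram. An ancestor-oriented view would analogously normalize the columns instead of the rows.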

Having called the `sankey` method of the problem instance, we now pass
the result to the `moscot.plotting` module. To do so, we can pass either
the `anndata.AnnData` instance or the problem instance. We can set the
resolution of the figure via `dpi` and set a title via `title`.


In [None]:
mpl.sankey(tp, dpi=100, title="Cell type evolution over time")

By default, the result of the `sankey` method of a problem instance is
saved in `anndata.AnnData.uns['moscot_results']['sankey']['sankey']` and
is overwritten every time the method is called. To prevent this, we can
specify the parameter `key_added`, which we will do to store the results
of the following use case.
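The overwrite behavior described above can be sketched with a plain dictionary standing in for `anndata.AnnData.uns`. This is a toy model only: `store_sankey` is a hypothetical helper written for this illustration and is not part of moscot, and the stored strings are placeholders for actual results.

```python
# Plain dictionary standing in for `adata.uns`; no moscot objects are involved.
uns = {}


def store_sankey(uns, result, key_added="sankey"):
    """Hypothetical helper mimicking the nested storage layout described above."""
    uns.setdefault("moscot_results", {}).setdefault("sankey", {})[key_added] = result


store_sankey(uns, "first result")
store_sankey(uns, "second result")  # same default key: replaces the first result
store_sankey(uns, "subset result", key_added="subset_sankey")  # stored separately

stored = uns["moscot_results"]["sankey"]
print(sorted(stored))
```

Under these assumptions, only the most recent result survives under the default key, while the result written with a custom key sits alongside it.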


We can also visualize flows of only a subset of categories of an
`anndata.AnnData.obs` column by passing a dictionary for
`source_groups` or `target_groups`.


In [None]:
new_key = "subset_sankey"
tp.sankey(
    source=2,
    target=7,
    source_groups={"cell_type": ["HSC", "MasP", "MkP"]},
    target_groups={"cell_type": ["HSC", "MasP", "MkP"]},
    forward=True,
    return_data=False,
    key_added=new_key,
)
mpl.sankey(tp, dpi=100, title="Cell type evolution over time", uns_key=new_key)