# Stability Rank Selection
The methods in this notebook are alternative strategies for rank selection.
We found bicrossvalidation to perform better (see `rank_analysis.html` for those results).

These methods run multiple decompositions from random initialisations and measure how similar the results are.
As such, if you ran with a low `random_starts` value, these results will not be very robust.
We typically used `random_starts=100`, which seems to be sufficient.

Descriptions of the different criteria are available in the package documentation for the `cophenetic_correlation`, `dispersion` and `signature_similarity` functions.
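
To give a feel for what a stability criterion measures, here is a minimal sketch of a dispersion-style score computed from a consensus matrix. It assumes each run assigns every sample one cluster label (for NMF, commonly the signature with the largest weight) and uses one common textbook definition of the dispersion coefficient. The helper names `consensus_matrix` and `dispersion_coefficient` are hypothetical and are not part of `cvanmf`, whose own implementation may differ.

```python
import numpy as np


def consensus_matrix(labels_per_run):
    """Fraction of runs in which each pair of samples shares a cluster label."""
    labels = np.asarray(labels_per_run)  # shape: (runs, samples)
    # connectivity[r, i, j] is True when samples i and j share a label in run r
    connectivity = labels[:, :, None] == labels[:, None, :]
    return connectivity.mean(axis=0)


def dispersion_coefficient(consensus):
    """Mean of 4 * (c - 0.5)^2; equals 1.0 when every entry is exactly 0 or 1."""
    consensus = np.asarray(consensus, dtype=float)
    return float(np.mean(4.0 * (consensus - 0.5) ** 2))


# Toy labels for 3 runs over 4 samples (illustrative only).
# Label names may swap between runs; only co-membership matters.
stable_runs = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]
unstable_runs = [[0, 0, 1, 1], [0, 1, 0, 1], [0, 1, 1, 0]]

stable_score = dispersion_coefficient(consensus_matrix(stable_runs))
unstable_score = dispersion_coefficient(consensus_matrix(unstable_runs))
print(stable_score, unstable_score)
```

The consistent runs score 1.0, while the runs that group the samples differently each time score about 0.33. With few runs the consensus matrix is a noisy estimate, which is why a low `random_starts` gives less robust results.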

In [1]:
from cvanmf import denovo
import plotnine as pn
import pandas as pd

In [None]:
stability_values = pd.read_csv(
    "stability_rank_analysis.tsv",
    sep="\t",
    index_col="rank"
)
# Sort by rank, highest rank first
stability_values = stability_values.sort_index(ascending=False)

The dashed line indicates the automatically selected suitable rank.
In all three measures, a higher value is better.
The automatic suggestion is not guaranteed to be the correct rank.
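
Because the suggestion is only a guide, it can help to check whether the criteria agree with each other. The sketch below uses a small table of made-up values shaped like `stability_values` (ranks as the index, one column per criterion). The column names and numbers are assumptions for illustration, not output from any analysis.

```python
import pandas as pd

# Toy values for illustration only; not results from a real decomposition.
toy_values = pd.DataFrame(
    {
        "cophenetic_correlation": [0.99, 0.97, 0.90, 0.85],
        "dispersion": [0.95, 0.96, 0.80, 0.70],
        "signature_similarity": [0.98, 0.93, 0.94, 0.75],
    },
    index=pd.Index([2, 3, 4, 5], name="rank"),
)

# Rank with the highest value for each criterion (higher is better)
best_rank = toy_values.idxmax()

# True only when every criterion points at the same rank
criteria_agree = best_rank.nunique() == 1

print(best_rank)
print("Criteria agree:", criteria_agree)
```

When the criteria disagree, as they do in this toy table, it is worth comparing the candidate ranks by eye in the plot rather than relying on a single number.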

To customise the plot below, open the notebook `stability_rank_analysis.ipynb` in Jupyter and edit using `plotnine` methods.

In [None]:
fig = (
    denovo.plot_stability_rank_selection(
        series=[stability_values[c] for c in stability_values.columns]
    ) 
    + pn.theme(figure_size=(10, 3))
    + pn.guides(color="none")
)
fig