
# Accessing Cached Results in ParTIpy

This tutorial walks through how archetypal analysis (AA) results are cached inside an `AnnData` object and shows how to retrieve those artifacts using the public accessor functions provided by ParTIpy.



## Setup

We start by importing the dependencies and preparing an `AnnData` object named `adata`. The examples in later sections assume that the embedding to use has been registered via `set_obsm`, as done at the end of this cell.


In [1]:
import anndata as ad
import numpy as np
import partipy as pt
import scanpy as sc

from partipy.datasets import load_hepatocyte_data_2

adata = load_hepatocyte_data_2()

sc.pp.normalize_total(adata)
sc.pp.log1p(adata)
sc.pp.highly_variable_genes(adata)
sc.pp.pca(adata, mask_var="highly_variable")
adata.layers["z_scaled"] = sc.pp.scale(adata.X, max_value=10, copy=True)

pt.compute_shuffled_pca(adata, mask_var="highly_variable")
pt.set_obsm(adata=adata, obsm_key="X_pca", n_dimensions=3)

adata

OMP: Info #276: omp_set_nested routine deprecated, please use omp_set_max_active_levels instead.
100%|██████████| 50/50 [00:09<00:00,  5.04it/s]


AnnData object with n_obs × n_vars = 1999 × 8354
    obs: 'cell_type', 'zone', 'run_id', 'time_point', 'UMAP_X', 'UMAP_Y'
    var: 'highly_variable', 'means', 'dispersions', 'dispersions_norm'
    uns: 'log1p', 'hvg', 'pca', 'AA_pca', 'AA_config'
    obsm: 'X_pca'
    varm: 'PCs'
    layers: 'z_scaled'

## Computing and Caching Results

The high-level ParTIpy routines both perform the computation *and* persist their outputs to `adata.uns`. Each cache entry is keyed by the full `ArchetypeConfig`, so repeated calls with the same settings reuse what was already computed.

| Function | Cached location | Notes |
| --- | --- | --- |
| `compute_archetypes` | `adata.uns['AA_results'][ArchetypeConfig]` | Stores weights (`A`/`B`), archetypes (`Z`), RSS traces, and variance explained when `archetypes_only=False`. |
| `compute_selection_metrics` | `adata.uns['AA_selection_metrics'][ArchetypeConfig]`<br>`adata.uns['AA_results'][ArchetypeConfig]` | Evaluates multiple archetype counts; each fit is cached or reused via `compute_archetypes`. Use `pt.summarize_aa_metrics` to combine them for plotting. |
| `compute_archetype_weights` | `adata.uns['AA_cell_weights'][ArchetypeConfig]` | Saves the cell-by-archetype weight matrix. |
| `compute_bootstrap_variance` | `adata.uns['AA_bootstrap'][ArchetypeConfig]`<br>`adata.uns['AA_results'][ArchetypeConfig]` | Aligns bootstrap archetypes to the cached reference fit, reusing or populating AA results as needed. |
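
To make the caching behaviour concrete, here is a minimal sketch of a config-keyed cache. It does not use ParTIpy's code: `ToyConfig`, `toy_compute`, and the plain `uns` dictionary are hypothetical stand-ins, and modelling the key as a frozen dataclass is an assumption made only so that equal settings hash to the same entry.

```python
from dataclasses import dataclass


# Hypothetical stand-in for ArchetypeConfig: frozen, so instances are hashable
# and two configs with identical fields compare (and hash) equal.
@dataclass(frozen=True)
class ToyConfig:
    obsm_key: str
    n_archetypes: int
    delta: float = 0.0
    seed: int = 42


uns = {"AA_results": {}}  # plays the role of adata.uns
n_fits = 0                # counts how often the "expensive" step runs


def toy_compute(config, force_recompute=False):
    """Return the cached payload for `config`, fitting only when needed."""
    global n_fits
    cache = uns["AA_results"]
    if force_recompute or config not in cache:
        n_fits += 1
        cache[config] = {"fit_id": n_fits}  # placeholder for A, B, Z, RSS, ...
    return cache[config]


first = toy_compute(ToyConfig("X_pca", 3))
again = toy_compute(ToyConfig("X_pca", 3))              # same fields: cache hit
other = toy_compute(ToyConfig("X_pca", 3, delta=0.1))   # one field differs: new entry
fresh = toy_compute(ToyConfig("X_pca", 3), force_recompute=True)  # overwrite entry
```

In this sketch the second call returns the stored payload without refitting, changing any single field creates a separate entry, and `force_recompute=True` replaces the entry under the same key.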

A few practical tips:

- Use `force_recompute=True` on any compute function to refresh a cached entry.
- Keep track of the configuration you ran: filters passed to the getter utilities must uniquely identify one `ArchetypeConfig`.
- Once results are cached, the getter functions (`get_aa_result`, `get_aa_metrics`, `get_aa_cell_weights`, `get_aa_bootstrap`, `summarize_aa_metrics`) provide the recommended read-only interface.

The following cell runs a compact example that populates each cache so you can experiment with the accessors in later sections.

In [2]:

# Run archetypal analysis for three archetypes and cache the full result payload
pt.compute_archetypes(
    adata=adata,
    n_archetypes=3,
    save_to_anndata=True,
    archetypes_only=False,
)

# Evaluate a small grid of archetype counts; metrics are saved in adata.uns["AA_selection_metrics"]
pt.compute_selection_metrics(
    adata=adata,
    n_archetypes_list=[2, 3, 4],
)

# Cache cell weights and bootstrap variance for later inspection
pt.compute_archetype_weights(adata, result_filters={"n_archetypes": 3})

pt.compute_bootstrap_variance(
    adata=adata,
    n_bootstrap=5,
    n_archetypes_list=[3],
    save_to_anndata=True,
)


Applied length scale is 3.12.



## Retrieving Cached AA Results

`get_aa_result` returns the payload that was stored by `compute_archetypes`. You can optionally pass filters to disambiguate between multiple cached configurations. Filters accept any field of `ArchetypeConfig`, for example `n_archetypes`, `delta`, or `optim`.
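
The filter mechanism can be pictured as matching keyword arguments against the fields of each cached configuration. The sketch below is illustrative only: `ToyConfig` and `resolve_config` are hypothetical names, not ParTIpy functions, and raising `ValueError` for the ambiguous case is an assumption.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ToyConfig:  # hypothetical stand-in for ArchetypeConfig
    n_archetypes: int
    delta: float = 0.0
    optim: str = "projected_gradients"


def resolve_config(cache, **filters):
    """Return the single cached config whose fields equal all given filters."""
    matches = [
        cfg for cfg in cache
        if all(asdict(cfg)[name] == value for name, value in filters.items())
    ]
    if not matches:
        raise ValueError(f"No cached configuration matches {filters}.")
    if len(matches) > 1:
        raise ValueError(
            f"{len(matches)} configurations match {filters}; add more filters."
        )
    return matches[0]


cache = {
    ToyConfig(3): "payload_a",
    ToyConfig(4): "payload_b",
    ToyConfig(4, delta=0.1): "payload_c",
}

unique = resolve_config(cache, n_archetypes=3)
narrowed = resolve_config(cache, n_archetypes=4, delta=0.1)

try:
    resolve_config(cache, n_archetypes=4)  # two entries share n_archetypes=4
except ValueError as err:
    ambiguous_message = str(err)
```

The point of the sketch is that a filter set is only useful when it narrows the cache down to exactly one entry; adding a second field such as `delta` resolves the ambiguity.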


In [3]:
# Several configurations are cached by now (2, 3, and 4 archetypes), so filter explicitly.
# Retrieve the four-archetype result and inspect the archetype coordinates.
result_payload = pt.get_aa_result(adata, n_archetypes=4)
A = result_payload["A"]
B = result_payload["B"]
Z = result_payload["Z"]
print("Archetypes shape:", Z.shape)
print(Z)

Archetypes shape: (4, 3)
[[-4.1829314   4.8940554  -0.42909184]
 [ 3.4212031  -2.0040917  -5.227825  ]
 [-3.1794775  -3.8219402   1.0898299 ]
 [ 7.8230276   0.9535061   3.1575007 ]]


## Accessing Selection Metrics

Selection diagnostics (variance explained, RSS, etc.) are stored per configuration in `adata.uns["AA_selection_metrics"]`. Use `get_aa_metrics` to retrieve a specific table, or call `pt.summarize_aa_metrics` to concatenate the entries that share the same optimization settings (aside from the number of archetypes).

In [4]:
metrics_df = pt.summarize_aa_metrics(adata)
metrics_df

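|  | k | n_archetypes | n_restarts | seed | varexpl | IC | RSS |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 2 | 2 | 5 | 42 | 0.505511 | 4112.441228 | 2965.446045 |
| 1 | 3 | 3 | 5 | 42 | 0.78019 | 4045.360678 | 1318.203125 |
| 2 | 4 | 4 | 5 | 42 | 0.933783 | 4357.642608 | 397.105713 |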
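
One way to read the summary table above is to look at how much variance explained each additional archetype buys and at which archetype count the IC column is smallest. The snippet below is a plain-Python sketch on values copied from that table. It is not a ParTIpy selection rule, and treating a lower IC as preferable is an assumption based on how information criteria are commonly used.

```python
# Values copied from the summary table above: k -> (varexpl, IC, RSS)
metrics = {
    2: (0.505511, 4112.441228, 2965.446045),
    3: (0.780190, 4045.360678, 1318.203125),
    4: (0.933783, 4357.642608, 397.105713),
}

ks = sorted(metrics)

# Gain in variance explained from adding one more archetype
gains = {
    k: round(metrics[k][0] - metrics[k_prev][0], 6)
    for k_prev, k in zip(ks, ks[1:])
}

# Archetype count with the smallest IC value (assumed: lower is better)
best_ic_k = min(ks, key=lambda k: metrics[k][1])

print(gains)
print(best_ic_k)
```

Here the step from two to three archetypes adds more variance explained than the step from three to four, and the IC column is lowest at three archetypes.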



## Accessing Bootstrap Results

Bootstrap runs are stored in `adata.uns["AA_bootstrap"]` as tidy DataFrames. Use `get_aa_bootstrap` to read them back for plotting or downstream analysis. As with other getters, filters ensure the correct configuration is selected.


In [5]:
bootstrap_df = pt.get_aa_bootstrap(adata, n_archetypes=3)
bootstrap_df.head()

|  | X_pca_0 | X_pca_1 | X_pca_2 | archetype | iter | reference | mean_variance | variance_per_archetype |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | -4.239301 | 4.869498 | -0.41505 | 0 | 1 | False | 0.026822 | 0.028961 |
| 1 | 7.639462 | 0.079809 | -0.067928 | 1 | 1 | False | 0.026822 | 0.017955 |
| 2 | -2.752828 | -3.9509 | 0.186937 | 2 | 1 | False | 0.026822 | 0.03355 |
| 0 | -3.541122 | 4.984828 | -0.393629 | 0 | 2 | False | 0.026822 | 0.028961 |
| 1 | 7.669269 | -0.178665 | -0.022822 | 1 | 2 | False | 0.026822 | 0.017955 |
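
Because the layout is tidy (one row per archetype and bootstrap iteration), grouping by `archetype` is enough to summarize how far each archetype moves between iterations. The sketch below does this in plain Python on four of the rows shown above. The summed per-coordinate variance is an illustrative measure only and is not claimed to match how the `variance_per_archetype` column is defined.

```python
from collections import defaultdict
from statistics import pvariance

# (archetype, iter, X_pca_0, X_pca_1, X_pca_2) copied from the rows shown above
rows = [
    (0, 1, -4.239301, 4.869498, -0.415050),
    (1, 1, 7.639462, 0.079809, -0.067928),
    (0, 2, -3.541122, 4.984828, -0.393629),
    (1, 2, 7.669269, -0.178665, -0.022822),
]

# Collect the coordinates of each archetype across iterations
coords_by_archetype = defaultdict(list)
for archetype, _iteration, *coords in rows:
    coords_by_archetype[archetype].append(coords)

# Population variance of each coordinate across iterations, summed per archetype
spread = {
    archetype: sum(pvariance(axis) for axis in zip(*points))
    for archetype, points in coords_by_archetype.items()
}

print(spread)
```

With the full DataFrame, a pandas `groupby("archetype")` over the coordinate columns expresses the same grouping.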



## Accessing Cached Cell Weights

All cell weight matrices computed via `compute_archetype_weights` live in `adata.uns["AA_cell_weights"]`. To retrieve one, call `get_aa_cell_weights`. Setting `return_config=True` also returns the `ArchetypeConfig` key that was matched.


In [6]:
config, weights = pt.get_aa_cell_weights(adata, return_config=True)
print(config)
print("Weights shape:", weights.shape)

obsm_key='X_pca' n_dimensions=(0, 1, 2) n_archetypes=3 init='plus_plus' optim='projected_gradients' weight=None max_iter=500 rel_tol=0.0001 early_stopping=True coreset_algorithm=None coreset_fraction=0.1 coreset_size=None delta=0.0 seed=42 optim_kwargs=()
Weights shape: (1999, 3)



## Tips for Working with Cached Results

- Every getter raises a descriptive `ValueError` if the requested configuration is missing or the cache is empty. Handle these exceptions to provide actionable messages in your pipelines.
- To refresh a cached entry, rerun the corresponding compute function with `force_recompute=True`.
- When you plan to keep several configurations, pass explicit filters (for example, `delta=0.1`, `optim="frank_wolfe"`) to the getter utilities to avoid ambiguity.
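
The first tip can be sketched as a small wrapper that turns a getter's `ValueError` into a message that names the next step. `toy_getter` and `load_or_explain` are hypothetical helpers that only mimic the documented behaviour of raising `ValueError` for a missing entry; they are not part of ParTIpy.

```python
def toy_getter(cache, n_archetypes):
    """Hypothetical stand-in for a getter such as get_aa_result."""
    if n_archetypes not in cache:
        raise ValueError(f"No cached result for n_archetypes={n_archetypes}.")
    return cache[n_archetypes]


def load_or_explain(cache, n_archetypes):
    """Return (payload, None) on success or (None, message) with a next step."""
    try:
        return toy_getter(cache, n_archetypes), None
    except ValueError as err:
        hint = f"{err} Run the matching compute step first, then retry."
        return None, hint


cache = {3: {"Z": "archetype coordinates"}}

payload, problem = load_or_explain(cache, 3)   # entry exists
missing, message = load_or_explain(cache, 5)   # entry was never computed
```

In a pipeline, the same pattern would wrap the real getter call, so that a missing cache entry produces a clear instruction rather than a bare traceback.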

With these helpers you can treat the `AnnData` object as the single source of truth for all AA artifacts, keeping reproducible analyses compact and self-contained.
