# Server-Side Coexpression Analysis

This notebook demonstrates how to use the Malva coexpression API to identify
correlated genes, visualise metacell UMAP projections, and explore GO
enrichment. Everything is computed server-side, so scanpy is not needed locally.

## 1. Setup

Connect to the Malva API. If you have configured your token via
`malva_client config`, no arguments are needed.

In [None]:
from malva_client import MalvaClient

client = MalvaClient()

## 2. Discover Datasets

Browse the dataset hierarchy to find a dataset ID for coexpression analysis.

In [None]:
hierarchy = client.get_datasets_hierarchy()
client.print_dict_summary(hierarchy)

## 3. Load UMAP Coordinates

Fetch the base UMAP embedding for a dataset. Replace `"DATASET_ID"`
with a dataset ID from the hierarchy above.

In [None]:
DATASET_ID = "DATASET_ID"  # <-- replace with your dataset

umap = client.get_umap_coordinates(DATASET_ID)
umap.to_dataframe().head()

In [None]:
umap.plot(color_by='cluster')

## 4. Search for a Gene

Run a standard gene search and inspect the result.

In [None]:
result = client.search("FOXP3")
print(result)

## 5. Run Coexpression Analysis

Pass the search job ID and the dataset ID to the coexpression API. The
response contains correlated genes, UMAP scores, GO enrichment, cell-type
enrichment, and a tissue breakdown, all computed on the server.

In [None]:
coexpr = client.get_coexpression(result.job_id, DATASET_ID)
print(coexpr)
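
The notebook does not say which statistic the server uses to rank correlated
genes. As a rough mental model only, the sketch below computes a Pearson
correlation between a query gene and a few candidate genes across metacells.
The expression values, the gene names, and the `pearson` helper are all made
up for illustration and are not part of `malva_client`.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)

# Hypothetical expression of the query gene, one value per metacell.
query = [0.0, 1.0, 2.0, 3.0, 4.0]

# Hypothetical candidate genes measured over the same metacells.
candidates = {
    "GENE_A": [0.1, 1.1, 2.1, 3.1, 4.1],  # tracks the query
    "GENE_B": [4.0, 3.0, 2.0, 1.0, 0.0],  # anti-correlated
    "GENE_C": [1.0, 1.0, 1.0, 1.0, 1.0],  # flat
}

scores = {gene: pearson(query, values) for gene, values in candidates.items()}

# Rank candidates from most to least correlated with the query.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The ranking puts genes that rise and fall with the query first. The server may
use a different measure or additional filtering.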

## 6. UMAP with Expression Scores

Colour the UMAP by the fraction of cells expressing the queried gene.

In [None]:
coexpr.plot_umap(color_by='positive_fraction')
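
As a toy illustration of what a "fraction of cells expressing the gene" score
could look like, the sketch below computes it per metacell from raw counts.
The grouping of cells into metacells, the counts, and the `positive_fraction`
helper are hypothetical. How the server derives the value is not described in
this notebook.

```python
def positive_fraction(counts):
    """Fraction of cells with a non-zero count for the gene."""
    if not counts:
        return 0.0  # an empty metacell has no expressing cells
    return sum(1 for c in counts if c > 0) / len(counts)

# Hypothetical per-cell counts of the query gene, grouped by metacell.
metacells = {
    "mc_0": [0, 0, 3, 1],
    "mc_1": [0, 0, 0, 0],
    "mc_2": [2, 5, 1, 4],
}

fractions = {name: positive_fraction(cells) for name, cells in metacells.items()}
for name, frac in fractions.items():
    print(f"{name}: {frac:.2f}")
```

A value near 1 means almost every cell in the metacell expresses the gene. A
value near 0 means almost none do.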

## 7. Correlated Genes

Inspect the most strongly correlated genes.

In [None]:
genes_df = coexpr.genes_to_dataframe()
genes_df.head(20)

In [None]:
coexpr.plot_top_genes(n=20)

## 8. GO Enrichment

Explore Gene Ontology enrichment for the correlated gene set.

In [None]:
go_df = coexpr.go_to_dataframe()
go_df.head(10)

In [None]:
coexpr.plot_go_enrichment(n=10)
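
GO enrichment is commonly assessed with an over-representation test. The
notebook does not state which test Malva applies, so the following is only a
generic sketch of a one-sided hypergeometric test. The counts are made up and
`enrichment_pvalue` is a hypothetical helper, not a `malva_client` function.

```python
from math import comb

def enrichment_pvalue(overlap, sample_size, annotated, background):
    """One-sided hypergeometric p-value: P(X >= overlap).

    background  : number of genes in the background set
    annotated   : background genes annotated with the GO term
    sample_size : size of the correlated gene set
    overlap     : correlated genes annotated with the term
    """
    if not (0 <= annotated <= background and 0 <= sample_size <= background):
        raise ValueError("annotated and sample_size must lie in [0, background]")
    max_overlap = min(sample_size, annotated)
    if not 0 <= overlap <= max_overlap:
        raise ValueError("overlap must lie in [0, min(sample_size, annotated)]")
    total = comb(background, sample_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(annotated, i) * comb(background - annotated, sample_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total

# Made-up counts: 5 of 50 correlated genes carry a term that annotates
# 200 of 20,000 background genes (about 0.5 expected by chance).
p_value = enrichment_pvalue(overlap=5, sample_size=50, annotated=200, background=20_000)
print(f"p = {p_value:.3g}")
```

In practice a correction for multiple testing is applied across all GO terms.
The sketch leaves that step out.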

## 9. Cell Type & Tissue Breakdown

See which cell types and tissues are enriched for the query.

In [None]:
coexpr.cell_type_enrichment_to_dataframe()

In [None]:
coexpr.tissue_breakdown_to_dataframe()

## 10. Lightweight Query (Genes Only)

If you only need the correlated gene list (no UMAP scores, no GO), use
the lightweight endpoint for a faster response.

In [None]:
quick = client.get_coexpression_genes(result.job_id, DATASET_ID)
quick.get_top_genes(5)

## 11. Compare Two Genes

Search for a second gene and compare the top correlated genes of the two queries.

In [None]:
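# Run the same search and coexpression steps for a second gene
result_cd4 = client.search("CD4")
coexpr_cd4 = client.get_coexpression(result_cd4.job_id, DATASET_ID)

# Top 50 correlated genes for each query, as sets for easy comparison
foxp3_genes = set(coexpr.get_top_genes(50))
cd4_genes = set(coexpr_cd4.get_top_genes(50))

# Genes that appear in both top-50 lists
shared = foxp3_genes & cd4_genes
print(f"Shared correlated genes (top 50): {len(shared)}")
print(sorted(shared))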
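
A raw overlap count depends on the list length, so a normalised measure such
as the Jaccard index can be easier to compare across queries. The sketch below
is self-contained: the gene lists are invented placeholders for what
`get_top_genes` returns (assumed here to be a list of gene symbols), and
`jaccard` is a hypothetical helper.

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # two empty lists share nothing
    return len(a & b) / len(union)

# Placeholder gene lists standing in for two top-N results.
genes_x = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]
genes_y = ["GENE_C", "GENE_D", "GENE_E", "GENE_F"]

similarity = jaccard(genes_x, genes_y)
only_x = sorted(set(genes_x) - set(genes_y))  # specific to the first query
only_y = sorted(set(genes_y) - set(genes_x))  # specific to the second query

print(f"Jaccard index: {similarity:.2f}")
print("Only in first list:", only_x)
print("Only in second list:", only_y)
```

The index ranges from 0 (no shared genes) to 1 (identical sets). The two
difference lists show which genes are specific to each query.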

## Summary

In this notebook we:

1. Connected to the Malva API and browsed the dataset hierarchy
2. Loaded UMAP coordinates for a dataset
3. Searched for a gene and ran a full server-side coexpression analysis
4. Visualised the UMAP coloured by expression scores
5. Explored correlated genes and GO enrichment
6. Inspected cell-type and tissue breakdown
7. Used the lightweight genes-only endpoint
8. Compared correlated gene sets between two queries

All analysis ran server-side, so no local single-cell processing was
required.