# Ozette Abundance Metric Examples

In this notebook we're going to use the _Abundance_ metric on three Ozette-embedded studies to find differentially abundant phenotypes.
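As a rough intuition for "differentially abundant", the sketch below compares the fraction of cells carrying each phenotype label in two samples. It is only an illustration under stated assumptions: the helper names `relative_abundance` and `abundance_difference` and the toy labels are hypothetical, and the way `cev` actually computes its abundance metric may differ.

```python
from collections import Counter


def relative_abundance(labels):
    """Return the fraction of cells carrying each phenotype label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


def abundance_difference(labels_a, labels_b):
    """Per-label difference in relative abundance (sample B minus sample A)."""
    freq_a = relative_abundance(labels_a)
    freq_b = relative_abundance(labels_b)
    all_labels = sorted(set(freq_a) | set(freq_b))
    # A label missing from one sample counts as zero abundance there.
    return {
        label: freq_b.get(label, 0.0) - freq_a.get(label, 0.0)
        for label in all_labels
    }


# Toy data (hypothetical): one phenotype label per cell.
sample_a = ["CD4+CD8-"] * 6 + ["CD4-CD8+"] * 4
sample_b = ["CD4+CD8-"] * 2 + ["CD4-CD8+"] * 8

differences = abundance_difference(sample_a, sample_b)
for label, diff in differences.items():
    print(f"{label}: {diff:+.2f}")
```

A positive value means the label makes up a larger share of sample B than of sample A. Working with fractions rather than raw counts keeps samples with different cell totals comparable.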

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
import pathlib
import pandas as pd
from cev.widgets import Embedding, EmbeddingComparisonWidget, compare


def get_embedding(folder: str, sample: str):
    """Load one Ozette sample from ../data/<folder>/<sample>.parquet as an Embedding."""
    return Embedding.from_ozette(
        df=pd.read_parquet(
            pathlib.Path.cwd() / ".." / "data" / folder / f"{sample}.parquet"
        )
    )

# Melanoma Study

### Distinct predictive biomarker candidates for response to anti-CTLA-4 and anti-PD-1 immunotherapy in melanoma patients

Subrahmanyam et al., 2018. https://pubmed.ncbi.nlm.nih.gov/29510697/

In this example we're going to compare phenotypes between a pair of unstimulated Pembrolizumab responder and non-responder samples.

In [None]:
non_responder_embedding = get_embedding("subrahmanyam-2018", "OZEXPSMPL_782")
responder_embedding = get_embedding("subrahmanyam-2018", "OZEXPSMPL_804")

melanoma_comparison = EmbeddingComparisonWidget(
    non_responder_embedding,
    responder_embedding,
    titles=["Non-Responder", "Responder"],
    metric="abundance",
    selection="phenotype",
    auto_zoom=True,
    row_height=360,
)
melanoma_comparison

**Phenotype 1:** should be more abundant in `responder` (right) compared to `non-responder` (left)

In [None]:
melanoma_comparison.select(
    "CD8-GranzymeB-CD27+CD3+CD28+CD19-CD57-CD127+CD33-CD45RA-CD4+CD14-HLADR-CD20-CCR7+CD56-IL2-CD16-TNFa-MIP1b-CD154+GMCSF-PDL1-CD107a-IL17-Perforin-CD69+CTLA4-PDL2-PD1-TCRgd-IFNg-CD38-CD25-IL10-IL4-"
)

**Phenotype 2:** should be more abundant in `responder` (right) compared to `non-responder` (left)

In [None]:
melanoma_comparison.select(
    "CD8-GranzymeB+CD27-CD3-CD28-CD19-CD57+CD127-CD33-CD45RA+CD4-CD14-HLADR-CD20-CCR7-CD56+IL2-CD16+TNFa-MIP1b+CD154-GMCSF-PDL1-CD107a-IL17-Perforin+CD69+CTLA4-PDL2+PD1-TCRgd-IFNg-CD38+CD25-IL10-IL4-"
)

**Phenotype 3:** should be more abundant in `responder` (right) compared to `non-responder` (left)

In [None]:
melanoma_comparison.select(
    "CD8-GranzymeB+CD27-CD3-CD28-CD19-CD57+CD127-CD33-CD45RA+CD4-CD14-HLADR-CD20-CCR7-CD56+IL2-CD16+TNFa-MIP1b+CD154-GMCSF-PDL1-CD107a-IL17-Perforin+CD69-CTLA4-PDL2+PD1-TCRgd-IFNg-CD38+CD25-IL10-IL4-"
)

**Phenotype 4:** should be more abundant in `responder` (right) compared to `non-responder` (left)

In [None]:
melanoma_comparison.select(
    "CD8-GranzymeB+CD27-CD3-CD28-CD19-CD57+CD127-CD33-CD45RA+CD4-CD14-HLADR-CD20-CCR7-CD56+IL2-CD16+TNFa-MIP1b-CD154-GMCSF-PDL1-CD107a-IL17-Perforin+CD69-CTLA4-PDL2-PD1-TCRgd-IFNg-CD38+CD25-IL10-IL4-"
)
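The phenotype labels passed to `select` above are long strings of marker names, each followed by `+` or `-`, so it can be hard to see by eye how two labels differ. The following is a minimal sketch of one way to parse and compare them, assuming every marker name contains no `+` or `-` characters. The helpers `parse_phenotype` and `differing_markers` are hypothetical and not part of `cev`, and the short labels are made up for illustration.

```python
import re

# One marker is a run of characters other than "+" or "-", followed by its sign.
MARKER_PATTERN = re.compile(r"([^+-]+)([+-])")


def parse_phenotype(label):
    """Split a phenotype label into a {marker: sign} dictionary."""
    return {marker: sign for marker, sign in MARKER_PATTERN.findall(label)}


def differing_markers(label_a, label_b):
    """Return the markers whose sign differs between the two labels."""
    signs_a = parse_phenotype(label_a)
    signs_b = parse_phenotype(label_b)
    markers = sorted(set(signs_a) | set(signs_b))
    return {
        marker: (signs_a.get(marker), signs_b.get(marker))
        for marker in markers
        if signs_a.get(marker) != signs_b.get(marker)
    }


# Shortened, made-up labels; the real labels above list many more markers.
first = "CD8-GranzymeB+CD69+"
second = "CD8-GranzymeB+CD69-"

print(parse_phenotype(first))
print(differing_markers(first, second))
```

Each differing marker maps to a pair of signs, first label then second, with `None` for a marker that is absent from one label.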

# Cancer Study

### Extricating human tumour immune alterations from tissue inflammation

Mair et al., 2022. https://www.nature.com/articles/s41586-022-04718-w

In this example we're going to compare phenotypes between a pair of tumor and tissue samples.

In [None]:
tissue_embedding = get_embedding("mair-2022", "OZEXPSMPL_26155")
tumor_embedding = get_embedding("mair-2022", "OZEXPSMPL_26146")

cancer_comparison = EmbeddingComparisonWidget(
    tissue_embedding,
    tumor_embedding,
    titles=["Tissue (Mucosa)", "Tumor"],
    metric="abundance",
    selection="phenotype",
    auto_zoom=True,
    row_height=360,
)
cancer_comparison

**CD8 T-Cell Phenotype** should be more abundant in `tissue` (left) compared to `tumor` (right)

In [None]:
cancer_comparison.select(
    "CD4-CD8+CD3+CD45RA+CD27+CD19-CD103-CD28-CD69+PD1+HLADR-GranzymeB-CD25-ICOS-TCRgd-CD38-CD127-Tim3-"
)

**CD4 T-Cell Phenotype** should be more abundant in `tumor` (right) compared to `tissue` (left)

In [None]:
cancer_comparison.select(
    "CD4+CD8-CD3+CD45RA-CD27+CD19-CD103-CD28+CD69+PD1+HLADR-GranzymeB-CD25+ICOS+TCRgd-CD38-CD127-Tim3+"
)

# ICS Study

### IFN-γ-independent immune markers of Mycobacterium tuberculosis exposure

Lu et al., 2019. https://www.nature.com/articles/s41591-019-0441-3

In this example we're going to compare phenotypes between a pair of diseased (LTBI) and resister (RSTR) samples.

In [None]:
diseased_embedding = get_embedding("lu-2019", "OZEXPSMPL_2105")
resister_embedding = get_embedding("lu-2019", "OZEXPSMPL_2136")

comparison = EmbeddingComparisonWidget(
    diseased_embedding,
    resister_embedding,
    titles=["Diseased (LTBI)", "Resister (RSTR)"],
    metric="abundance",
    selection="phenotype",
    auto_zoom=True,
    row_height=360,
)
comparison

**Phenotype 5 from [Fig 3c](https://www.nature.com/articles/s41591-019-0441-3/figures/3)** should be more abundant in `diseased` (left) compared to `resister` (right)

In [None]:
comparison.select("CD4+CD3+CD8-TNF+CD107a-IL4-IFNg+IL2+CD154+IL17a-")