# Metient-calibrate

### This tutorial is for those who want to run Metient in calibrate mode and have Metient infer (1) the proportion of each clone in each anatomical site and (2) the labeled clone tree.

### To run this notebook, you'll need metient installed:

```bash
mamba create -n "met" python=3.8.8 ipython
mamba activate met
pip install metient
```

### Import libraries and set up paths

In [1]:
import os
from metient import metient as met

### Set up paths
# Path to the directory containing our input clone trees and TSVs
input_dir = os.path.join(os.getcwd(), "inputs")
# Path to save outputs
output_dir = os.path.join(os.getcwd(), "1_outputs")

CUDA GPU: False


### An example of the expected tsv file format for melanoma patient A
**The required columns are:**
`anatomical_site_index, anatomical_site_label, cluster_index, character_index, character_label, ref, var, var_read_prob, site_category` (see `../README.md` for description of each column)

In [2]:
import pandas as pd
df = pd.read_csv(os.path.join(input_dir, "A_SNVs.tsv"), sep="\t")
df

| | anatomical_site_index | anatomical_site_label | cluster_index | character_index | character_label | ref | var | var_read_prob | site_category |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Primary, forehead | 1 | 0 | ADCY5 | 213 | 118 | 0.380 | primary |
| 1 | 1 | Parotid metastasis | 1 | 0 | ADCY5 | 382 | 27 | 0.105 | metastasis |
| 2 | 2 | Locoregional skin metastasis 1, forehead | 1 | 0 | ADCY5 | 319 | 67 | 0.210 | metastasis |
| 3 | 3 | Locoregional skin metastasis 2, angle jaw | 1 | 0 | ADCY5 | 188 | 66 | 0.310 | metastasis |
| 4 | 0 | Primary, forehead | 1 | 1 | ZNF148 | 117 | 58 | 0.380 | primary |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 535 | 3 | Locoregional skin metastasis 2, angle jaw | 2 | 133 | TECTA | 120 | 55 | 0.310 | metastasis |
| 536 | 0 | Primary, forehead | 0 | 134 | ANKRD17 | 160 | 116 | 0.380 | primary |
| 537 | 1 | Parotid metastasis | 0 | 134 | ANKRD17 | 302 | 33 | 0.105 | metastasis |
| 538 | 2 | Locoregional skin metastasis 1, forehead | 0 | 134 | ANKRD17 | 278 | 68 | 0.210 | metastasis |
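
Before running Metient, it can help to confirm that a TSV contains every required column listed above. The sketch below reads only the header row using Python's standard `csv` module. `REQUIRED_COLUMNS` copies the list from this tutorial; `missing_columns` is a hypothetical helper written for illustration and is not part of Metient.

```python
import csv
import io

# Required columns, copied from the list earlier in this tutorial
REQUIRED_COLUMNS = [
    "anatomical_site_index",
    "anatomical_site_label",
    "cluster_index",
    "character_index",
    "character_label",
    "ref",
    "var",
    "var_read_prob",
    "site_category",
]


def missing_columns(tsv_file):
    """Return the required columns absent from the header of an open TSV file.

    Hypothetical helper for illustration; not part of the Metient API.
    """
    # Read only the first row; an empty file yields an empty header
    header = next(csv.reader(tsv_file, delimiter="\t"), [])
    return [col for col in REQUIRED_COLUMNS if col not in header]


# In-memory stand-in for a TSV whose header lacks `site_category`
incomplete = io.StringIO("\t".join(REQUIRED_COLUMNS[:-1]) + "\n")
print(missing_columns(incomplete))
```

For a file on disk, pass `open(path, newline="")` in place of the `io.StringIO` object.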


## Step 1: Load filepaths to clone trees and tsv files for each patient

In [3]:
patients = ["A", "C", "E", "G"]

clone_tree_fns = [os.path.join(input_dir, f"{patient}_tree.txt") for patient in patients]
ref_var_fns = [os.path.join(input_dir, f"{patient}_SNVs.tsv") for patient in patients]
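
Because these paths are built from a naming pattern, a quick existence check can catch a misnamed input before a long calibration run starts. This is a minimal sketch: `missing_files` is a hypothetical helper, and the temporary directory only stands in for `input_dir`.

```python
import os
import tempfile


def missing_files(paths):
    """Return the paths that do not point to an existing file."""
    return [p for p in paths if not os.path.isfile(p)]


# Demo in a temporary directory that stands in for input_dir
with tempfile.TemporaryDirectory() as demo_dir:
    demo_patients = ["A", "C"]
    demo_fns = [os.path.join(demo_dir, f"{p}_tree.txt") for p in demo_patients]
    # Create only patient A's file so that patient C's is reported as missing
    open(demo_fns[0], "w").close()
    demo_missing = [os.path.basename(p) for p in missing_files(demo_fns)]

print(demo_missing)
```

In this notebook, the equivalent check would be `missing_files(clone_tree_fns + ref_var_fns)`, which should return an empty list when every input is present.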

## Step 2: Run Metient-calibrate

In [4]:
print_config = met.PrintConfig(visualize=True, verbose=False, k_best_trees=5)
met.calibrate(clone_tree_fns, ref_var_fns, print_config, output_dir, patients, solve_polytomies=True)


Saving results to /lila/data/morrisq/divyak/projects/metient/tutorial/1_outputs/calibrate
Overwriting existing directory at /lila/data/morrisq/divyak/projects/metient/tutorial/1_outputs/calibrate

*** Calibrating for patient: A ***
ordered_sites ['Primary, forehead', 'Parotid metastasis', 'Locoregional skin metastasis 1, forehead', 'Locoregional skin metastasis 2, angle jaw']
calculate_batch_size 320


  6%|▋         | 29/450 [00:20<09:44,  1.39s/it]
KeyboardInterrupt



## Step 3: Use the pickle file outputs for downstream analysis

### In addition to the visualizations, Metient saves a pkl.gz file for each run that contains all of the run's results.

In [None]:
import gzip
import pickle

with gzip.open(os.path.join(output_dir, "calibrate", "A_Primary, forehead.pkl.gz"), "rb") as f:
    pckl = pickle.load(f)
print(pckl.keys())

# V is the best ancestral labeling
V = pckl['clone_tree_labelings'][0]
# A is the adjacency matrix that is the input clone tree + inferred leaf nodes
A = pckl['full_adjacency_matrices'][0]

# G represents the migration graph
G = met.migration_graph(V, A)
print("\nmigration graph:\n", G)

# Show other information about this patient's inferred migration history
print("\nseeding pattern:", met.seeding_pattern(V, A))
print("seeding clusters:", met.seeding_clusters(V, A))
print("phyleticity:", met.phyleticity(V, A))
print("site clonality:", met.site_clonality(V, A))
print("genetic clonality:", met.genetic_clonality(V, A))
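
The cell above reads a `.pkl.gz` file by combining `gzip.open` with `pickle.load`. As a self-contained illustration of that pattern, the sketch below writes a small placeholder dictionary and reads it back. The two keys match the ones accessed above, but the values are made-up stand-ins and do not represent real Metient output or its exact file layout.

```python
import gzip
import os
import pickle
import tempfile

# Placeholder payload: keys mirror those used in this tutorial,
# values are invented 2x2 nested lists for demonstration only
placeholder = {
    "clone_tree_labelings": [[[1, 0], [0, 1]]],
    "full_adjacency_matrices": [[[0, 1], [0, 0]]],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.pkl.gz")
    # Write: pickle the object into a gzip-compressed file
    with gzip.open(path, "wb") as f:
        pickle.dump(placeholder, f)
    # Read: decompress and unpickle in one step
    with gzip.open(path, "rb") as f:
        restored = pickle.load(f)

print(sorted(restored.keys()))
```

Since unpickling can execute arbitrary code, only load `.pkl.gz` files from runs you trust.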