# Interoperability with `scirpy`

Data can now be converted between the `dandelion>=0.1.1` and `scirpy>=0.6.2` [[Sturm2020]](https://academic.oup.com/bioinformatics/article/36/18/4817/5866543) formats, which makes it easier to use the two analysis toolkits together.

We will download the *airr_rearrangement.tsv* file from here:
```bash
# bash
wget https://cf.10xgenomics.com/samples/cell-vdj/4.0.0/sc5p_v2_hs_PBMC_10k/sc5p_v2_hs_PBMC_10k_b_airr_rearrangement.tsv
```

Gene expression data can also be obtained here:

```bash
# bash 
wget https://cf.10xgenomics.com/samples/cell-vdj/4.0.0/sc5p_v2_hs_PBMC_10k/sc5p_v2_hs_PBMC_10k_filtered_feature_bc_matrix.h5
```
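
The `wget` commands above save into the current directory, while the cells further down read both files from a `sc5p_v2_hs_PBMC_10k/` subfolder. As a minimal sketch (the helper names `local_path` and `fetch` are made up for this example, and the subfolder name is taken from the read paths used later), the same downloads could be scripted in Python so the files land where the tutorial expects them:

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlretrieve

BASE_URL = "https://cf.10xgenomics.com/samples/cell-vdj/4.0.0/sc5p_v2_hs_PBMC_10k/"
FILES = [
    "sc5p_v2_hs_PBMC_10k_b_airr_rearrangement.tsv",
    "sc5p_v2_hs_PBMC_10k_filtered_feature_bc_matrix.h5",
]


def local_path(url: str, folder: str = "sc5p_v2_hs_PBMC_10k") -> Path:
    """Map a download URL to a path inside `folder`, keeping the file name."""
    return Path(folder) / Path(urlparse(url).path).name


def fetch(url: str, folder: str = "sc5p_v2_hs_PBMC_10k") -> Path:
    """Download `url` into `folder` unless the file is already there."""
    target = local_path(url, folder)
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists():
        urlretrieve(url, str(target))
    return target


# Usage (needs network access); run from the tutorial working directory:
# for name in FILES:
#     fetch(BASE_URL + name)
```

The download loop is left commented out so that nothing is fetched by accident; uncomment it once you are in the directory you want to work from.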


<b>Import dandelion module</b>

In [None]:
import os

import dandelion as ddl


# change directory to somewhere more workable

os.chdir(os.path.expanduser("~/Downloads/dandelion_tutorial/"))

ddl.logging.print_versions()

In [None]:
import scirpy as ir
import scanpy as sc


ir.__version__
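
The conversion relies on the minimum versions stated at the top (`dandelion>=0.1.1`, `scirpy>=0.6.2`). The sketch below shows one way to check them numerically, since comparing version strings directly would rank `"0.13.1"` below `"0.6.2"`. The helpers `as_tuple` and `meets_minimum` are illustrative names, and `sc-dandelion` is assumed to be the distribution name `dandelion` is installed under.

```python
from importlib.metadata import PackageNotFoundError, version


def as_tuple(v: str) -> tuple:
    """Turn '0.13.1' into (0, 13, 1); stops at the first non-numeric part."""
    parts = []
    for piece in v.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare release numbers numerically rather than as strings."""
    return as_tuple(installed) >= as_tuple(minimum)


# Distribution names are assumptions; adjust them to match your environment.
MINIMUMS = {"sc-dandelion": "0.1.1", "scirpy": "0.6.2"}

for name, minimum in MINIMUMS.items():
    try:
        found = version(name)
    except PackageNotFoundError:
        print(f"{name}: not installed")
        continue
    status = "ok" if meets_minimum(found, minimum) else f"needs >= {minimum}"
    print(f"{name} {found}: {status}")
```

This simple parser ignores pre-release and development suffixes, which is usually enough for a quick sanity check.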

## `dandelion`

In [None]:
# read in the airr_rearrangement.tsv file
file_location = (
    "sc5p_v2_hs_PBMC_10k/sc5p_v2_hs_PBMC_10k_b_airr_rearrangement.tsv"
)

# read in gene expression data
adata = sc.read_10x_h5(
    "sc5p_v2_hs_PBMC_10k/sc5p_v2_hs_PBMC_10k_filtered_feature_bc_matrix.h5"
)
adata.var_names_make_unique()

vdj = ddl.read_10x_airr(file_location)
vdj

The test file contains a blank `clone_id` column so we run `find_clones` to populate it first.
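
If you want to confirm that a column really is empty before repopulating it, a quick check with the standard library is enough. The sketch below uses a tiny made-up AIRR-style table rather than the real file, and `column_is_blank` is a hypothetical helper, not part of `dandelion`.

```python
import csv
import io

# Toy AIRR-style table (made-up rows); a real file has many more columns.
toy_tsv = (
    "sequence_id\tcell_id\tlocus\tclone_id\n"
    "contig_1\tAAAC-1\tIGH\t\n"
    "contig_2\tAAAC-1\tIGK\t\n"
)


def column_is_blank(handle, column: str) -> bool:
    """Return True if every row has an empty value in `column`."""
    reader = csv.DictReader(handle, delimiter="\t")
    if column not in (reader.fieldnames or []):
        raise KeyError(f"{column!r} is not a column in this file")
    return all((row[column] or "").strip() == "" for row in reader)


print(column_is_blank(io.StringIO(toy_tsv), "clone_id"))  # True
```

For the downloaded file, the same function can be called on an open file handle, for example `with open(file_location, newline="") as fh: column_is_blank(fh, "clone_id")`.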

In [None]:
ddl.tl.find_clones(vdj)

### `ddl.to_scirpy` : Converting `dandelion` to `scirpy`

In [None]:
mudata = ddl.to_scirpy(vdj)
mudata

Conversion to `AnnData` in `scirpy` format is also available:

In [None]:
irdata = ddl.to_scirpy(vdj, to_mudata=False)
irdata

If you have gene expression data, pass it as an `AnnData` object through the `gex_adata` parameter.

Please note that the gene expression data will be sliced to the `cell_id`s that are also present in the AIRR data.

In [None]:
irdata = ddl.to_scirpy(vdj, to_mudata=False, gex_adata=adata)
irdata

In [None]:
mudata = ddl.to_scirpy(vdj, to_mudata=True, gex_adata=adata)
mudata
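
The slicing mentioned above can be pictured as an intersection of cell barcodes. This is only a sketch of the idea with made-up barcodes, not `dandelion`'s implementation; in particular, keeping the gene expression order is an assumption of the example, and how barcodes found only in the AIRR data are handled is not covered here.

```python
# Toy barcodes (made up for illustration)
gex_barcodes = ["AAAC-1", "AAAG-1", "AACT-1", "AAGG-1"]
airr_barcodes = ["AAAG-1", "AAGG-1", "TTTT-1"]

airr_set = set(airr_barcodes)
# Keep the gene expression order and drop cells without AIRR data
shared = [bc for bc in gex_barcodes if bc in airr_set]

print(shared)
print(f"{len(gex_barcodes) - len(shared)} cells dropped from the gene expression data")
```

Comparing the number of cells before and after conversion is a quick way to see how many cells lacked AIRR data.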

Use `scirpy`'s `get` functions to retrieve the relevant AIRR fields (https://scirpy.scverse.org/en/latest/generated/scirpy.get.airr.html).

In [None]:
ir.get.airr(irdata, "clone_id")

In [None]:
ir.get.airr(mudata, "clone_id")

Alternatively, pass `transfer=True`, which will also perform dandelion's `tl.transfer`.

In [None]:
irdatax = ddl.to_scirpy(vdj, transfer=True)
irdatax

In [None]:
irdatax = ddl.to_scirpy(vdj, transfer=True, to_mudata=False)
irdatax

### `ddl.from_scirpy` : Converting `scirpy` to `dandelion`

Converting `MuData` back to `Dandelion`

In [None]:
vdjx = ddl.from_scirpy(mudata)
vdjx

Converting `AnnData` back to `Dandelion`

In [None]:
vdjx = ddl.from_scirpy(irdata)
vdjx

In [None]:
vdjx.metadata

This time, find clones with `scirpy`'s method.

In [None]:
ir.tl.chain_qc(irdata)
ir.pp.ir_dist(irdata)
ir.tl.define_clonotypes(irdata, receptor_arms="all", dual_ir="primary_only")
irdata

### Visualising with `scirpy`'s plotting tools

You can now also plot `dandelion` networks using `scirpy`'s functions.

In [None]:
ddl.tl.generate_network(vdj, key="junction")

In [None]:
irdata.obs["scirpy_clone_id"] = irdata.obs["clone_id"]  # stash it
ddl.tl.transfer(
    irdata, vdj, overwrite=True
)  # overwrite scirpy's clone_id definition

In [None]:
ir.tl.clonotype_network(irdata, min_cells=2)
ir.pl.clonotype_network(irdata, color="clone_id", panel_size=(7, 7))

To swap to a shorter `clone_id` name (ordered by size):

In [None]:
ddl.tl.transfer(irdata, vdj, clone_key="clone_id_by_size")
ir.tl.clonotype_network(irdata, clonotype_key="clone_id_by_size", min_cells=2)
ir.pl.clonotype_network(irdata, color="clone_id_by_size", panel_size=(7, 7))
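
To make the idea of size-ordered clone names concrete, here is a minimal sketch that relabels clones by rank, with the largest clone becoming `"1"`. The labels are made up, and the tie-breaking rule is an assumption of this example; how `dandelion` itself numbers the clones is not stated here.

```python
from collections import Counter

# Toy per-cell clone labels (made up for illustration)
clone_ids = ["clone_a", "clone_b", "clone_a", "clone_c", "clone_a", "clone_b"]

counts = Counter(clone_ids)
# Rank clones from largest to smallest; ties are broken by label for determinism
ranked = sorted(counts, key=lambda c: (-counts[c], c))
short_name = {clone: str(rank) for rank, clone in enumerate(ranked, start=1)}

clone_id_by_size = [short_name[c] for c in clone_ids]
print(clone_id_by_size)
```

Short rank-based labels like these are easier to read in a plot legend than long clone identifiers.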

You can also collapse each network to a single node and plot by size:

In [None]:
ddl.tl.transfer(irdata, vdj, clone_key="clone_id_by_size", collapse_nodes=True)
ir.tl.clonotype_network(irdata, clonotype_key="clone_id_by_size", min_cells=2)
ir.pl.clonotype_network(irdata, color="scirpy_clone_id", panel_size=(7, 7))