# Visualize transcriptomes using SWAN

In this notebook, I will show how to generate SWAN reports to visualize long-read transcriptomes, find isoform-switching genes, and discover novel exon skipping and intron retention events.

**note**: Since `swan_vis` only works with pandas < 2.0, create a separate conda environment for this part of the analysis.

```
conda env create -f environment.yml
```
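Once the environment is active, a quick sanity check can confirm that the installed pandas falls in the supported range. This is a minimal sketch: the helper name `pandas_is_supported` is hypothetical, and the only constraint it encodes is the "pandas < 2.0" requirement stated above.

```python
def pandas_is_supported(version: str, max_major: int = 2) -> bool:
    """Return True if the major version of `version` is below `max_major`."""
    major = int(version.split(".")[0])
    return major < max_major

# Examples with literal version strings
ok_old = pandas_is_supported("1.5.3")
ok_new = pandas_is_supported("2.1.0")
print(ok_old, ok_new)
```

In the notebook itself, the check could be called as `pandas_is_supported(pd.__version__)` after `import pandas as pd`.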

Author: Narges Rezaie

## Preparing SWAN object

In [1]:
## Load library

import pandas as pd
import numpy as np
import anndata as ad
import swan_vis as swan

import warnings
warnings.filterwarnings("ignore")

To create a SWAN object, you need two different files:

1. GTF file (`gtf`): a [GENCODE GTF file](https://www.gencodegenes.org/) can be used as-is if the annotation was not modified in the previous step(s).
2. Transcript expression h5ad file (`tpm_adata`): the transcript-level TPM matrix generated in the previous step.

Depending on your mouse model, you may need to change the values of these two parameters.

**note**: You may also need to adjust the metadata columns and colors to match your own data.
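To see which metadata columns and values need colors, it can help to summarize the sample metadata first. The sketch below works on any pandas DataFrame shaped like the `.obs` table of an AnnData object. The helper names `summarize_metadata` and `uncolored_values` are hypothetical, and the toy table only stands in for real metadata.

```python
import pandas as pd

def summarize_metadata(obs: pd.DataFrame, columns):
    """Return {column: sorted unique values} for the requested columns."""
    return {col: sorted(obs[col].astype(str).unique()) for col in columns}

def uncolored_values(obs: pd.DataFrame, column, color_map):
    """Return the values in `column` that have no entry in `color_map`."""
    return sorted(set(obs[column].astype(str)) - set(color_map))

# Toy stand-in for the sample metadata; with real data one might use
# obs = anndata.read_h5ad(tpm_adata).obs
obs = pd.DataFrame({
    "Sex": ["F", "M", "F"],
    "Genotype": ["5xFADWT", "5xFADHEMI", "5xFADWT"],
})

summary = summarize_metadata(obs, ["Sex", "Genotype"])
missing = uncolored_values(obs, "Sex", {"F": "green"})
print(summary)
print(missing)
```

Any value reported by `uncolored_values` is either missing from the color dictionary or spelled differently there than in the metadata.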

In [2]:
## Create SWAN pickle file

gtf = '../data/gencode.vM32.chr_patch_hapl_scaff.annotation.gtf'
tpm_adata = '../data/transcript_exp_tpm.h5ad'


sg = swan.SwanGraph()

sg.add_annotation(gtf)

sg.add_adata(tpm_adata)

sg.set_metadata_colors('Sex', {'F': 'green',
                               'M': 'yellow'})
sg.set_metadata_colors('Age', {'4 months': 'thistle'})
sg.set_metadata_colors('Tissue', {'hippocampus': 'red'})  # key must match the metadata value exactly
sg.set_metadata_colors('Genotype', {'5xCLU-h2kbKI-HO': '#FFB6C1', 
                                    '5xFADHEMI':'#FF8DA1', 
                                    '5xFADWT': '#FF7782', 
                                    'CLU-h2kbKI-HO': '#e75480'})

sg.save_graph("SWAN")


Adding annotation to the SwanGraph

Adding abundance for datasets ad003_11616_lig-blk, ad003_11617_lig-blk, ad003_11625_lig-blk, ad003_11627_lig-blk, ad003_11628_lig-blk... (and 7 more) to SwanGraph
Calculating TPM...
Calculating PI...
Calculating edge usage...
Calculating TSS usage...
Calculating TES usage...
Saving graph as SWAN.p


Once you have created the SWAN object, you can generate a SWAN report for any list of genes.

To learn more about how `gen_report()` works, see the [SwanGraph documentation](https://freese.gitbook.io/swan/code-documentation/swangraph).

**note**: Make sure each gene name matches exactly what is in the GTF file.

**note**: You also need to modify `datasets` and `metadata_cols` to match your own dataset.
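Because gene names must match the GTF exactly, including case, checking the requested names against the annotation before generating reports can save a failed run. The following is a minimal sketch that reads the `gene_name` attribute from `gene` records in GTF-formatted lines. The helpers `gene_names_in_gtf` and `missing_genes` are hypothetical and not part of `swan_vis`, and the toy records use made-up coordinates.

```python
import re

GENE_NAME_RE = re.compile(r'gene_name "([^"]+)"')

def gene_names_in_gtf(lines):
    """Collect gene_name values from the 'gene' records of a GTF."""
    names = set()
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue  # keep only gene-level records
        match = GENE_NAME_RE.search(fields[8])
        if match:
            names.add(match.group(1))
    return names

def missing_genes(requested, lines):
    """Return the requested gene names that are absent from the GTF."""
    available = gene_names_in_gtf(lines)
    return [g for g in requested if g not in available]

# Toy GTF records (coordinates are placeholders, not real annotation)
toy_gtf = [
    "##description: toy example\n",
    'chr7\ttoy\tgene\t100\t200\t.\t-\t.\tgene_id "ENSMUSG00000002985"; gene_name "Apoe";\n',
    'chr1\ttoy\tgene\t300\t400\t.\t+\t.\tgene_id "TOYG2"; gene_name "ToyGene2";\n',
    'chr1\ttoy\texon\t300\t350\t.\t+\t.\tgene_id "TOYG2"; gene_name "ToyGene2";\n',
]

not_found = missing_genes(["Apoe", "APOE"], toy_gtf)
print(not_found)
```

With a real annotation, the same function can be given an open file handle, for example `with open(gtf) as fh: missing_genes(gene_names, fh)`.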

In [6]:
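# Create a SWAN report for each gene of interest
import os

sg = swan.read("SWAN.p")

gene_names = ["Apoe"]

# Make sure the output directory used in the report prefix exists
os.makedirs("figures", exist_ok=True)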

for gene_name in gene_names:
    sg.gen_report(gene_name,
                  f'figures/{gene_name}',
                  datasets = {'Genotype': ['5xFADHEMI', '5xCLU-h2kbKI-HO', '5xFADWT', 'CLU-h2kbKI-HO']},
                  metadata_cols=['Age', 'Tissue', 'Sex', 'Genotype'],
                  cmap='viridis',
                  transcript_col='tname',
                  novelty=True,
                  indicate_novel=True,
                  layer='tpm')

Read in graph from SWAN.p

Plotting transcripts for ENSMUSG00000002985
Saving transcript path graph for ENSMUST00000174064.9 as figures/Apoe_novel_ENSMUST00000174064.9_path.png
Saving transcript path graph for ENSMUST00000174355.8 as figures/Apoe_novel_ENSMUST00000174355.8_path.png
Saving transcript path graph for ENSMUST00000173739.8 as figures/Apoe_novel_ENSMUST00000173739.8_path.png
Saving transcript path graph for ENSMUST00000167646.9 as figures/Apoe_novel_ENSMUST00000167646.9_path.png
Saving transcript path graph for ENSMUST00000174144.8 as figures/Apoe_novel_ENSMUST00000174144.8_path.png
Saving transcript path graph for ENSMUST00000003066.16 as figures/Apoe_novel_ENSMUST00000003066.16_path.png
Saving transcript path graph for ENSMUST00000172983.8 as figures/Apoe_novel_ENSMUST00000172983.8_path.png
Generating report for ENSMUSG00000002985
