## m6Am call from CROWN-Seq

Jianheng Liu (Fox) @ Jaffrey Lab, May 31st, 2023

Contact: jil4026@med.cornell.edu

**Usage:** Modify the `Preparation` section to fit your computer. Modify the `Variable` section before each run. All outputs will be saved in the `workpath`. Logs will be saved in the notebook. Don't close the notebook before it has finished (at least keep the backend running)!

**Note 1:** In a Jupyter notebook, a leading `!` runs the rest of the line in the shell (`bash`), and `$name` inserts the value of the Python variable `name`.
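
To make Note 1 concrete, here is a minimal sketch of what a cell line such as `!$python $script -o $out` amounts to outside Jupyter: each `$name` is replaced by the value of the Python variable, and the resulting line is handed to the shell. The helper `expand_shell_line` and the example values are hypothetical, written only for illustration; IPython's real expansion handles more cases (for example `{expr}`).

```python
import re
import shlex

def expand_shell_line(line, variables):
    """Replace each $name in `line` with str(variables[name]).

    Hypothetical helper for illustration; unknown names are left untouched.
    """
    def substitute(match):
        key = match.group(1)
        return str(variables[key]) if key in variables else match.group(0)
    return re.sub(r"\$([A-Za-z_][A-Za-z0-9_]*)", substitute, line)

# Placeholder values, not the paths used in this notebook
variables = {"python": "python2", "script": "fetch.py", "out": "A549_WT.CROWN.csv"}
command = expand_shell_line("$python $script -o $out", variables)
print(command)               # the line the shell would receive
print(shlex.split(command))  # the same command as an argument list
```

The argument-list form is what you would pass to `subprocess.run` if you wanted to call the scripts from a plain Python program instead of a notebook.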

**Note 2:** This notebook itself should be compatible with `Python 3`, but please use `Python 2` to run the analysis scripts.

**Note 3:** I used Gencode v34 in the article for compatibility. Here I use Ensembl release 104 for illustration. Newer annotations are normally better.

**Note 4:** This notebook is for paired-end reads. If you are using single-end reads, switch to the scripts ending with `_SE.py`, and don't forget to change the parameters.

**Important:** We are still using the original folder from the de novo TSS call, so QC and alignment are skipped.

In [1]:
# local time
from datetime import datetime
now = datetime.now()
current_time = now.strftime("%D %H:%M:%S")
print("Started:", current_time)

# Show notebook directory
!pwd -P

Started: 05/31/23 16:52:54
/home/fox/Projects/CROWN-Seq_example


### 0. Variable here

In [2]:
name = "A549_WT" # file name prefix
folder = "A549_run/"
path = "./" # output path: workpath=path/folder
import os
workpath = os.path.join(path, folder) # avoids doubled slashes such as ".//A549_run//"
read1 = "A549_rep1_example.R1.fastq.gz" # fastq or fastq.gz
read2 = "A549_rep1_example.R2.fastq.gz"
site_list = "../merge/CROWN_sites.txt"

### 1. Preparation

In [3]:
ref_genome = "/home/fox/Database/GLORI_index/GRCh38_r104/hisat2_index/Homo_sapiens.GRCh38.dna_sm.primary_assembly.fa"

**Software** (For python, use the fixed version)

In [4]:
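python = '/home/fox/Software/bin/python' # interpreter for the analysis scripts; should be Python 2 (see Note 2)
bowtie2 = '/home/fox/Software/bin/python2' # NOTE: this path points to a python2 binary, not bowtie2; check it before use (bowtie2 is not called in this notebook)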
samtools = '/home/fox/Software/samtools/1.16/bin/samtools'
cutadapt = '/home/fox/Software/bin/cutadapt'
hisat2_path = '/home/fox/Software/hisat2/2.1.0/'
umitools = '/home/fox/Software/bin/umi_tools'
bedtools = '/home/fox/Software/bedtools/2.29.1/bedtools'

**Scripts**

In [5]:
get_ACGU_counts = "../fetch_caps_from_CROWN-seq_PE.py"
get_control_sites = "../find_next_A_as_control.py"
get_control_ACGU_counts = "../fetch_caps_from_CROWN-seq_control_PE.py"

**Create the workpath if it does not exist, then move into it**

In [6]:
import os
# makedirs also creates missing parent directories
if not os.path.isdir(workpath):
    os.makedirs(workpath)
os.chdir(workpath)
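
Because QC and alignment are skipped here, the following steps expect the deduplicated BAM from the earlier run to already be in the workpath. A minimal sketch of a pre-flight check follows. The helper name `missing_inputs` is hypothetical, and the assumption that a `.bai` index sits next to the BAM is based only on the directory listing near the end of this notebook.

```python
import os

def missing_inputs(directory, bam_name="hisat2_genome.sorted.umi.sorted.bam"):
    """Return the expected files (BAM and its .bai index) that are absent from `directory`."""
    expected = [bam_name, bam_name + ".bai"]
    return [f for f in expected if not os.path.isfile(os.path.join(directory, f))]

# After os.chdir(workpath), "." is the workpath
missing = missing_inputs(".")
if missing:
    print("Missing inputs: " + ", ".join(missing))
else:
    print("All inputs found.")
```

Running this before the calling steps gives a clearer message than a failure inside the analysis scripts.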

### 1. Get m6Am fractions from the BAM file

In [7]:
output_file = "{name}.CROWN.csv".format(name=name)

!$python $get_ACGU_counts -r $ref_genome -l $site_list -o $output_file -b hisat2_genome.sorted.umi.sorted.bam
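
For readers unfamiliar with the readout, the sketch below shows how an m6Am fraction could be derived from per-site base counts. It assumes GLORI-style chemistry, in which unmethylated A is converted and read as G while methylated A is still read as A, so the fraction is A / (A + G). The function name, its arguments, and the `min_coverage` cutoff are hypothetical; the columns actually written by `fetch_caps_from_CROWN-seq_PE.py` may differ.

```python
def m6am_fraction(a_count, g_count, min_coverage=1):
    """Return A / (A + G), or None when coverage is zero or below `min_coverage`.

    Assumption: A reads are unconverted (methylated), G reads are converted (unmethylated).
    """
    coverage = a_count + g_count
    if coverage == 0 or coverage < min_coverage:
        return None
    return a_count / float(coverage)  # float() forces true division under Python 2

print(m6am_fraction(90, 10))                   # 90 A reads, 10 G reads
print(m6am_fraction(0, 0))                     # no coverage
print(m6am_fraction(3, 1, min_coverage=20))    # below the cutoff
```

Sites with low coverage give unstable fractions, which is why the sketch returns `None` below a cutoff instead of a number.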

### 2. Get downstream controls

In [8]:
controls_list = "{name}.control".format(name=name)
controls_pairs = "{name}.control.pairs".format(name=name)
output_controls = "{name}.CROWN.control.csv".format(name=name)

# get control sites
!$python $get_control_sites $site_list $ref_genome $controls_list $controls_pairs

# get control data
!$python $get_control_ACGU_counts -r $ref_genome -l $controls_pairs -o $output_controls -b hisat2_genome.sorted.umi.sorted.bam
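
The script name `find_next_A_as_control.py` suggests that each site is paired with the nearest downstream A, which can serve as a within-transcript control. The sketch below illustrates that idea on a plain string read in transcript orientation. `next_downstream_a`, the 0-based coordinates, and the `max_distance` limit are assumptions for illustration, not the script's actual logic.

```python
def next_downstream_a(sequence, site, max_distance=50):
    """Return the 0-based index of the first 'A' after `site`, or None if none
    is found within `max_distance` bases. `sequence` is read 5'->3'."""
    end = min(len(sequence), site + 1 + max_distance)
    for pos in range(site + 1, end):
        if sequence[pos].upper() == "A":
            return pos
    return None

seq = "ACGTTAGCA"
print(next_downstream_a(seq, 0))  # first A after position 0
print(next_downstream_a(seq, 5))  # first A after position 5
print(next_downstream_a(seq, 8))  # nothing downstream of the last base
```

For minus-strand genes the same search would run on the reverse complement, which this sketch leaves out.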

### 3. When and where am I

In [9]:
!pwd -P
!ls -thrl

/home/fox/Projects/CROWN-Seq_example/A549_run
total 328M
-rw-rw-r-- 1 fox fox  50M May 31 16:12 read2.cutadapt.fastq
-rw-rw-r-- 1 fox fox  55M May 31 16:12 read1.cutadapt.fastq
-rw-rw-r-- 1 fox fox  54M May 31 16:13 read2.UMI.fastq
-rw-rw-r-- 1 fox fox  53M May 31 16:13 read1.UMI.fastq
-rw-rw-r-- 1 fox fox 7.7M May 31 16:13 hisat2_genome.multimappers.bam
-rw-rw-r-- 1 fox fox  21M May 31 16:13 read2.UMI.unmapped.fastq
-rw-rw-r-- 1 fox fox  20M May 31 16:13 read1.UMI.unmapped.fastq
-rw-rw-r-- 1 fox fox  14M May 31 16:13 hisat2_genome.bam
-rw-rw-r-- 1 fox fox 9.9M May 31 16:13 hisat2_genome.sorted.bam
-rw-rw-r-- 1 fox fox 1.7M May 31 16:13 hisat2_genome.sorted.bam.bai
-rw-rw-r-- 1 fox fox 5.9M May 31 16:13 hisat2_genome.sorted.umi.bam
-rw-rw-r-- 1 fox fox 3.7K May 31 16:13 umi.logs
-rw-rw-r-- 1 fox fox 5.9M May 31 16:13 hisat2_genome.sorted.umi.sorted.bam
-rw-rw-r-- 1 fox fox 1.7M May 31 16:13 hisat2_genome.sorted.umi.sorted.bam.bai
-rw-rw-r-- 1 fox fox 976K May 31 16:14 A549_WT.bed
-rw-r

In [10]:
now = datetime.now()
current_time = now.strftime("%D %H:%M:%S")
print("Finished:", current_time)

Finished: 05/31/23 16:55:25


## Don't forget to save the Notebook.