# DRS Mapping
Use IsoQuant to align direct RNA sequencing (DRS) data to the genome, calibrate the alignments against second-generation (short-read) sequencing data, and then extract transcript boundary information.


In [None]:
import os
import subprocess

from spider_silkome_module import (
    RAW_DATA_DIR,
    INTERIM_DATA_DIR,
    PROCESSED_DATA_DIR,
    PROJ_ROOT,
    run_shell_command_with_check,
    GeneralGFF,
)

2025-10-15 18:54:02.334 | INFO     | spider_silkome_module.config:<module>:11 - PROJ_ROOT path is: /home/gyk/project/spider_silkome


## Create Genome STAR index

Note: STAR needs to be installed in advance. Download link: https://github.com/alexdobin/STAR

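As a quick pre-flight check for the note above, the following minimal sketch reports whether the required executables can be found on `PATH`. The helper name `find_missing_tools` is hypothetical and not part of `spider_silkome_module`; it assumes the STAR executable is named `STAR`, as in the indexing command used in this notebook.

```python
import shutil


def find_missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


# "STAR" is the executable name used by the indexing command in this notebook.
missing = find_missing_tools(["STAR"])
if missing:
    print(f"Not found on PATH: {', '.join(missing)}")
else:
    print("All required tools were found on PATH.")
```

The same helper could be given other executable names (for example `nextflow`) before the later steps.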

In [4]:
genome_file = f"{RAW_DATA_DIR}/spider_genome/Trichonephila_clavata.fa"
spider = "Trichonephila_clavata"
genome_index_dir = f"{INTERIM_DATA_DIR}/star_index/{spider}"
os.makedirs(genome_index_dir, exist_ok=True)
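# Build the STAR genome index; adjust --runThreadN to the number of cores available.
star_index_cmd = f"STAR --runThreadN 70 --runMode genomeGenerate --genomeDir {genome_index_dir} --genomeFastaFiles {genome_file}"
# check=True raises CalledProcessError if STAR exits with a non-zero status.
subprocess.run(star_index_cmd, shell=True, check=True)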

	STAR --runThreadN 70 --runMode genomeGenerate --genomeDir /home/gyk/project/spider_silkome/data/interim/star_index/Trichonephila_clavata --genomeFastaFiles /home/gyk/project/spider_silkome/data/raw/spider_genome/Trichonephila_clavata.fa
	STAR version: 2.7.11b   compiled: 2024-01-25T16:12:02-05:00 :/home/dobin/data/STAR/STARcode/STAR.master/source
Oct 15 18:56:54 ..... started STAR run
Oct 15 18:56:54 ... starting to generate Genome files
Oct 15 18:58:02 ... starting to sort Suffix Array. This may take a long time...
Oct 15 18:58:19 ... sorting Suffix Array chunks and saving them to disk...
Oct 15 19:08:40 ... loading chunks from disk, packing SA...
Oct 15 19:10:04 ... finished generating suffix array
Oct 15 19:10:04 ... generating Suffix Array index
Oct 15 19:14:07 ... completed Suffix Array index
Oct 15 19:14:07 ... writing Genome to disk ...
Oct 15 19:14:09 ... writing Suffix Array to disk ...
Oct 15 19:14:28 ... writing SAindex to disk
Oct 15 19:14:30 ..... finished successfully


CompletedProcess(args='STAR --runThreadN 70 --runMode genomeGenerate --genomeDir /home/gyk/project/spider_silkome/data/interim/star_index/Trichonephila_clavata --genomeFastaFiles /home/gyk/project/spider_silkome/data/raw/spider_genome/Trichonephila_clavata.fa', returncode=0)
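The indexing command above hard-codes the thread count and is passed to the shell as a single string. A possible variant, sketched here, builds the same command as an argument list so that paths containing spaces need no quoting, and derives the thread count from the machine. The function name `build_star_index_cmd` and the example paths are hypothetical; only the STAR flags already used above appear in it.

```python
import os
import shlex


def build_star_index_cmd(genome_dir, fasta, threads=None):
    """Build the STAR genomeGenerate command as an argument list."""
    if threads is None:
        # Assumption: use every CPU the machine reports, falling back to 1.
        threads = os.cpu_count() or 1
    return [
        "STAR",
        "--runThreadN", str(threads),
        "--runMode", "genomeGenerate",
        "--genomeDir", str(genome_dir),
        "--genomeFastaFiles", str(fasta),
    ]


# Placeholder paths for illustration only.
cmd = build_star_index_cmd("star_index/example", "example.fa", threads=8)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)  (no shell=True needed for a list)
```

Passing a list avoids shell interpretation of the paths; whether that matters depends on how the data directories are named.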

## Create BGI RNA-seq BAM files

**Note:** Make sure Nextflow is installed on your system.

In this section, we use nf-core/rnaseq to create the BAM files used in the DRS mapping step.

Please prepare the `nf-params.json` and `samplesheet.csv` files in the `RNA-seq_workflow` directory according to the [nf-core/rnaseq documentation](https://nf-co.re/rnaseq).

Then run `nextflow run nf-core/rnaseq -r 3.19.0 -name BGI_RNA-seq -profile docker -params-file nf-params.json` in the `RNA-seq_workflow` directory.
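The two input files could be generated from Python along the following lines. This is only a sketch: the sample names, FASTQ paths, and genome path are hypothetical placeholders, the helper `write_workflow_inputs` is not part of this project, and the parameter set is deliberately incomplete (the pipeline documentation lists further parameters, such as the annotation file). The samplesheet columns `sample`, `fastq_1`, `fastq_2`, and `strandedness` follow the nf-core/rnaseq documentation.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical samples; replace the names and paths with your own data.
samples = [
    {"sample": "sample_1", "fastq_1": "/path/to/sample_1_R1.fq.gz",
     "fastq_2": "/path/to/sample_1_R2.fq.gz", "strandedness": "auto"},
    {"sample": "sample_2", "fastq_1": "/path/to/sample_2_R1.fq.gz",
     "fastq_2": "/path/to/sample_2_R2.fq.gz", "strandedness": "auto"},
]

# Placeholder parameters; not a complete configuration.
params = {
    "input": "samplesheet.csv",
    "outdir": "results",
    "fasta": "/path/to/Trichonephila_clavata.fa",
}


def write_workflow_inputs(workflow_dir, samples, params):
    """Write samplesheet.csv and nf-params.json into workflow_dir."""
    workflow_dir = Path(workflow_dir)
    workflow_dir.mkdir(parents=True, exist_ok=True)
    sheet = workflow_dir / "samplesheet.csv"
    with sheet.open("w", newline="") as handle:
        writer = csv.DictWriter(
            handle, fieldnames=["sample", "fastq_1", "fastq_2", "strandedness"]
        )
        writer.writeheader()
        writer.writerows(samples)
    params_file = workflow_dir / "nf-params.json"
    params_file.write_text(json.dumps(params, indent=2) + "\n")
    return sheet, params_file


# Written to a temporary directory here so that no real file is overwritten.
out_dir = tempfile.mkdtemp()
sheet, params_file = write_workflow_inputs(out_dir, samples, params)
print(sheet.read_text())
```

In practice `workflow_dir` would be the `RNA-seq_workflow` directory, so that `-params-file nf-params.json` resolves as in the command above.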

## DRS Mapping

In [None]:
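# NOTE: IsoQuant's --bam option expects sorted, indexed BAM file(s), not a directory;
# point bam_file at the BAM file produced by the workflow before running.
bam_file = f"{PROJ_ROOT}/RNA-seq_workflow/results/"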
isoquant_output_dir = f"{INTERIM_DATA_DIR}/03.DRS_mapping/isoquant"
os.makedirs(isoquant_output_dir, exist_ok=True)
isoquant_cmd = f"isoquant --reference {genome_file} --bam {bam_file} --data_type nanopore -o {isoquant_output_dir}"