# StringTie Assemblies: Ribo Elong, RNA

Assembly of reads into transcripts: to capture the complete transcriptome, including both annotated and unannotated transcripts, we generated sample-specific transcriptome assemblies. To this end, we ran StringTie v1.3.6 (Pertea et al., 2015), guided by a reference annotation (Ensembl release 88), on the RNA-seq and Ribosome Profiling Elongation BAM files.

Input Files
```bash
"""
inputFiles : path
    Path to the deduplicated BAM file of Ribo-Elong aligned reads
outputFile : path
    Path to save the GTF file output by StringTie
library : int
    Library type, fr or rf (must be 1 for Ribo-seq reads and 2 for RNA)
gtfGuide : path
    Path to the GTF annotation guide file (GRCh38_Gencode26/gencode.v26.primary_assembly.annotation.gtf)
numberMinReads : int
    Minimum number of reads for the assembly
minimunLenght : int
    Sets the minimum length allowed for the predicted transcripts. Default: 30
minimunGap : int
    Minimum locus gap separation value. Reads that are mapped closer than this distance are merged together in the same processing bundle
nameTask : string
    Name of the qsub task
saveOutputQsub : path
    Path to save the qsub output
logPath : path
    Path to save the log output
gene_abund : path
    Path to save the gene_abund.tab file from StringTie
cov_refs : path
    Path to save the cov_refs.gtf file from StringTie
"""
```
Output Files
```bash
"""
AssembledTranscripts.gtf : gtf-like file
    GTF file that contains all the assembled transcripts detected by StringTie

gene_abund.tab : tab file
    Gene abundance file generated by StringTie

cov_refs.gtf : gtf-like file
    File that contains the coverage information generated by StringTie
"""
```
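
The wrapper script `stringTie_transcriptAssembly.sh` is not shown in this notebook, so the following is only a sketch of how the parameters described above could map onto StringTie's command-line options. The helper `build_stringtie_cmd` is hypothetical, and two mappings are assumptions: `library` 1 selects `--fr` and 2 selects `--rf`, and `numberMinReads` is passed as `-c`. The three qsub-related arguments (`nameTask`, `saveOutputQsub`, `logPath`) are left out. The function only prints the command; it does not run StringTie.

```bash
# Hypothetical helper (not the actual wrapper script): print the StringTie
# command that the documented parameters could translate to.
# The body runs in a subshell, so its variables stay local to the function.
build_stringtie_cmd() (
    inputFiles=$1; outputFile=$2; library=$3; gtfGuide=$4
    numberMinReads=$5; minimunLenght=$6; minimunGap=$7
    gene_abund=$8; cov_refs=$9

    # Assumption: 1 selects --fr and 2 selects --rf.
    case "$library" in
        1) strandFlag='--fr' ;;
        2) strandFlag='--rf' ;;
        *) echo "unknown library type: $library" >&2; exit 1 ;;
    esac

    # Assumption: numberMinReads is passed as -c (minimum read coverage).
    # -m, -g, -G, -o, -A and -C are StringTie's options for minimum length,
    # locus gap, guide annotation, output GTF, gene abundances and
    # covered reference transcripts.
    echo stringtie "$inputFiles" "$strandFlag" -G "$gtfGuide" -o "$outputFile" \
        -c "$numberMinReads" -m "$minimunLenght" -g "$minimunGap" \
        -A "$gene_abund" -C "$cov_refs"
)

# Example with placeholder file names
build_stringtie_cmd reads.bam out.gtf 1 guide.gtf 3 30 30 gene_abund.tab cov_refs.gtf
```

Printing the command before submitting it is a cheap way to check that each positional argument ends up behind the intended flag.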


In [1]:
%%bash

echo 'Ribo Elong Assemblies'
inputFiles='.../Alignment_Reads_Genome/Ribo/Elong/...DD.bam'
outputFile='.../StringTieAssemblies/RiboElong/RiboElong_AssembledTranscripts.gtf'
library=1
gtfGuide='../../Data_Input_Scripts/GRCh38_Gencode26/gencode.v26.primary_assembly.annotation.gtf'
numberMinReads=3
minimunLenght=30
minimunGap=30
nameTask='AssemblingTranscripts_RiboElong'
saveOutputQsub='.../qsub_outputs/'
logPath='.../logs/'
gene_abund='.../StringTieAssemblies/RiboElong/RiboElong_gene_abund.tab'
cov_refs='.../StringTieAssemblies/RiboElong/RiboElong_cov_refs.gtf'

sh ../../Scripts/3_Trascriptome_Assembly/stringTie_transcriptAssembly.sh "$inputFiles" "$outputFile" "$library" "$gtfGuide" "$numberMinReads" "$minimunLenght" "$minimunGap" "$nameTask" "$saveOutputQsub" "$logPath" "$gene_abund" "$cov_refs"


echo 'RNA Assemblies'
inputFiles='.../Alignment_Reads_Genome/RNA/RNA_Aligned.sortedByCoord.out.bam'
outputFile='.../StringTieAssemblies/RNA/RNA_AssembledTranscripts.gtf'
library=2
gtfGuide='../../Data_Input_Scripts/gencode.v26.primary_assembly.annotation.gtf'
numberMinReads=3
minimunLenght=30
minimunGap=50
nameTask='AssemblingTranscripts_RNA'
saveOutputQsub='.../qsub_outputs/'
logPath='.../logs/'
gene_abund='.../StringTieAssemblies/RNA/RNA_gene_abund.tab'
cov_refs='.../StringTieAssemblies/RNA/RNA_cov_refs.gtf'

sh ../../Scripts/3_Trascriptome_Assembly/stringTie_transcriptAssembly.sh "$inputFiles" "$outputFile" "$library" "$gtfGuide" "$numberMinReads" "$minimunLenght" "$minimunGap" "$nameTask" "$saveOutputQsub" "$logPath" "$gene_abund" "$cov_refs"


Ribo Elong Assemblies
RNA Assemblies


# Separation of RiboElong and RNA Assembled Transcripts by Strand, to Intersect with the Start Codon Candidates

In [4]:
%%bash

echo 'Separation Ribo Elong -'
grep -w - .../StringTieAssemblies/RiboElong/RiboElong_AssembledTranscripts.gtf > .../StringTieAssemblies/RiboElong/RiboElong_AssembledTranscripts-.gtf

echo 'Separation Ribo Elong +'
grep -w + .../StringTieAssemblies/RiboElong/RiboElong_AssembledTranscripts.gtf  > .../StringTieAssemblies/RiboElong/RiboElong_AssembledTranscripts+.gtf

echo 'Separation RNA -'
grep -w - .../StringTieAssemblies/RNA/RNA_AssembledTranscripts.gtf > .../StringTieAssemblies/RNA/RNA_AssembledTranscripts-.gtf

echo 'Separation RNA +'
grep -w + .../StringTieAssemblies/RNA/RNA_AssembledTranscripts.gtf > .../StringTieAssemblies/RNA/RNA_AssembledTranscripts+.gtf



Separation Ribo Elong -
Separation Ribo Elong +
Separation RNA -
Separation RNA +
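
The `grep -w` calls above match a standalone `-` or `+` anywhere on a line, not only in the strand column. A stricter variant, sketched here under the assumption that the assemblies are standard tab-separated GTF files with the strand in column 7, tests that column directly with `awk`. The helper `split_gtf_by_strand` and the demo records are hypothetical and made up for illustration.

```bash
# Hypothetical helper (not part of the pipeline): split a GTF by strand,
# testing only column 7. Writes <name>-.gtf and <name>+.gtf next to the input.
split_gtf_by_strand() {
    gtf=$1
    prefix=${gtf%.gtf}
    # Skip comment lines, then keep records whose strand column matches.
    awk -F'\t' '!/^#/ && $7 == "-"' "$gtf" > "${prefix}-.gtf"
    awk -F'\t' '!/^#/ && $7 == "+"' "$gtf" > "${prefix}+.gtf"
}

# Demo on made-up records: one header line and one record per strand value.
tmpdir=$(mktemp -d)
{
    printf '# made-up header line\n'
    printf 'chr1\tStringTie\ttranscript\t100\t500\t1000\t+\t.\tgene_id "STRG.1";\n'
    printf 'chr1\tStringTie\ttranscript\t700\t900\t1000\t-\t.\tgene_id "STRG.2";\n'
    printf 'chr1\tStringTie\ttranscript\t950\t990\t1000\t.\t.\tgene_id "STRG.3";\n'
} > "$tmpdir/demo.gtf"

split_gtf_by_strand "$tmpdir/demo.gtf"
wc -l "$tmpdir/demo-.gtf" "$tmpdir/demo+.gtf"
```

Records whose strand is `.` (unstranded) end up in neither output file, as with the `grep` version.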
