Updating docstring and setting up sqanti rules
skchronicles committed Sep 15, 2023
1 parent caab797 commit 5eb3ede
Showing 2 changed files with 47 additions and 1 deletion.
2 changes: 1 addition & 1 deletion workflow/rules/quant.smk
@@ -124,7 +124,7 @@ rule flair_collapse:
     be concatenated prior to running flair-collapse.
     Github: https://github.com/BrooksLabUCSC/flair
     @Input:
-        FLAIR Correct Genomic Alignments in BED12 (scatter)
+        FLAIR Correct Genomic Alignments in BED12 (gather)
     @Output:
         High-confidence Isoforms (BED),
         High-confidence Isoforms (GTF),
46 changes: 46 additions & 0 deletions workflow/rules/sqanti.smk
@@ -0,0 +1,46 @@
# Sqanti-related quality-control and filtering rules.
# Sqanti is being used to annotate/characterize novel
# isoforms and to build an even higher-confidence,
# filtered set of unique transcripts from flair.
# The resulting annotation/transcriptome will be
# used to quantify known/novel isoforms.
rule sqanti_qc:
    """
    Data-processing step to characterize the input transcriptome
    by computing a series of attributes per transcript, which are
    written to the classification file, and a series of attributes
    per junction, which are written to the junctions file. Please
    note that although the tool is named SQANTI3, the release we
    are using is 'v5.1.2'. For more information, please read
    through sqanti3's documentation:
    https://github.com/ConesaLab/SQANTI3/wiki/
    Github: https://github.com/ConesaLab/SQANTI3
    @Input:
        High-confidence Isoforms (FASTA) from flair collapse
    @Output:
        Sqanti Classification file (TSV),
        Corrected Annotation (GTF),
        Corrected Transcriptome (FASTA)
    """
    pass
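
The classification file described in the docstring above is a tab-separated table with one row per transcript. As a rough illustration of how it might be inspected after this rule runs, here is a minimal Python sketch that tallies isoforms per structural category. It assumes the table has `isoform` and `structural_category` columns; the helper name `count_structural_categories` and the toy rows are hypothetical and are not part of this commit.

```python
import csv
import io
from collections import Counter


def count_structural_categories(handle):
    """Tally rows per structural category in a SQANTI-style classification TSV.

    Assumes a header row that contains a 'structural_category' column.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return Counter(row["structural_category"] for row in reader)


# Toy rows for illustration only (made up, not real pipeline results).
toy_classification = io.StringIO(
    "isoform\tstructural_category\n"
    "tx1\tfull-splice_match\n"
    "tx2\tnovel_in_catalog\n"
    "tx3\tfull-splice_match\n"
)

counts = count_structural_categories(toy_classification)
for category, n in sorted(counts.items()):
    print(f"{category}\t{n}")
```

In practice the in-memory toy table would be replaced by an open file handle on the rule's classification output.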


rule sqanti_ml_filter:
    """
    Data-processing step to filter the sqanti qc output. The author
    of sqanti highly recommends filtering its output before using
    it in down-stream analysis. Sqanti has a new filtering method
    that employs a random forest to discriminate potential artifacts
    from true isoforms without the need for user-defined rules or
    manually-set thresholds (i.e. the previous method). For more
    info, please read through sqanti3's documentation:
    https://github.com/ConesaLab/SQANTI3/wiki/
    Github: https://github.com/ConesaLab/SQANTI3
    @Input:
        Sqanti Classification file (TSV)
    @Output:
        ML Filtered Sqanti Classification file (TSV),
        ML Filtered Corrected Annotation (GTF),
        ML Filtered Corrected Transcriptome (FASTA)
    """
    pass
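
The filtering step described in the docstring above labels each transcript as a true isoform or a likely artifact. As a hedged illustration of how a downstream step might use that result, the Python sketch below collects the IDs of transcripts that passed the filter. It assumes the filtered classification table has a `filter_result` column with the values `Isoform` and `Artifact`; that column name, those values, the helper `keep_isoforms`, and the toy rows are all assumptions for illustration and should be checked against the SQANTI3 wiki for the release in use.

```python
import csv
import io


def keep_isoforms(handle, column="filter_result", keep_value="Isoform"):
    """Return isoform IDs whose filter column equals keep_value.

    The default column name and value are assumptions about the filtered
    classification table, not something defined in this commit.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [row["isoform"] for row in reader if row[column] == keep_value]


# Toy rows for illustration only (made up, not real pipeline results).
toy_filtered = io.StringIO(
    "isoform\tstructural_category\tfilter_result\n"
    "tx1\tfull-splice_match\tIsoform\n"
    "tx2\tnovel_not_in_catalog\tArtifact\n"
    "tx3\tnovel_in_catalog\tIsoform\n"
)

kept = keep_isoforms(toy_filtered)
print(kept)
```

A list like this could then be used to subset the corrected GTF and FASTA so that all three filtered outputs describe the same set of transcripts.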
