No worries, I asked in 2 different places and got the same answer months ago.
In fact, I just presented using TAMA to all Agriculture and Agri-Food Canada bioinformaticians today!
Great software, thank you!
Cathy
That's great that you were able to get some answers! And thank you so much for using TAMA! I'll try to be quicker with my responses next time but feel free to email me if it is urgent.
Thanks for providing TAMA, I'm enjoying the flexibility, filtering options, and ability to track what's happening. This is my first time working with IsoSeq data. My goal is to improve the accuracy of our transcriptome by merging the IsoSeq-derived transcriptome with our existing short-read derived transcriptome.
I'm starting with FLNC reads from IsoSeq3.
When I collapse reads into transcripts with the no_cap option and partially 5'-degraded reads are present, the shorter degraded reads can end up assigned to a transcript model supported by only a single full-length read, whenever the feature that distinguishes the models lies 5' of where the degraded read starts. This is the command I ran:
python ~/bin/tama/tama_collapse.py -s tama_split_20_1.sam -f genome.fa -p tama_split_20_1 -i 99 -x no_cap -a 100 -z 100 -sj sj_priority -lde 1 -sjt 20
Here's an alignment to show you the problem. All the reads prefixed with "2" (at the top of the alignment) were assigned to model 2. All the reads prefixed with "3" were assigned to model 3. Model 2 is represented by the majority of the reads. In model 3, the first intron has not been spliced out. The shorter reads could have been assigned to either model with equal confidence.
Model 3 is actually only supported by a single read (m64128_230204_024757/124062102/ccs).
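The ambiguity described above can be illustrated with a minimal sketch. It is not TAMA's algorithm: the function `is_compatible`, the plus-strand assumption, the exact-match junction comparison, and all coordinates are invented for illustration. Each model is reduced to its list of introns, and a read is treated as compatible with a model if its introns equal the model introns that lie downstream of the read's 5' start.

```python
# Hypothetical intron chains (start, end) on the plus strand; coordinates are invented.
MODEL_2 = [(100, 200), (300, 400), (500, 600)]  # first intron spliced out
MODEL_3 = [(300, 400), (500, 600)]              # first intron retained


def is_compatible(read_start, read_introns, model_introns):
    """Return True if a possibly 5'-truncated read fits the model.

    Assumptions (not taken from TAMA): plus strand, exact junction matches,
    and the 3' end is ignored.
    """
    # A read that starts inside a model intron contradicts that intron.
    for intron_start, intron_end in model_introns:
        if intron_start < read_start < intron_end:
            return False
    # Only introns downstream of the read start can be checked against the read.
    expected = [intron for intron in model_introns if intron[0] >= read_start]
    return list(read_introns) == expected


# A degraded read starting after the first intron fits both models.
degraded = (250, [(300, 400), (500, 600)])
print(is_compatible(*degraded, MODEL_2), is_compatible(*degraded, MODEL_3))

# A full-length read with the retained intron fits model 3 only.
full_length = (50, [(300, 400), (500, 600)])
print(is_compatible(*full_length, MODEL_2), is_compatible(*full_length, MODEL_3))
```

In this toy setup the degraded read carries no information about the first intron, so either assignment is equally defensible, while only reads that reach past the first intron can discriminate between the two models.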
Assigning the partially degraded reads to model 3 makes it very difficult to automatically remove the model. It is not removed by remove_single_read_models.py, because it appears to be supported by 17 reads. This problem occurs for many high-read-depth genes in my dataset (note that model 2 was supported by over 1000 reads).
Can you recommend a way to remove partially-degraded reads prior to collapsing, or a setting in TAMA_collapse which would assign ambiguous reads (which could map to more than one model) to the model with the most reads?
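The second option, assigning ambiguous reads to the model with the most reads, could look roughly like the sketch below. This is a hypothetical post-processing step, not a TAMA_collapse setting. It assumes you have already worked out, for each read, the set of models it is compatible with; the function name `reassign_ambiguous` and the read IDs are invented, and the model labels are borrowed from the attachment names. Only reads compatible with exactly one model count as evidence, and every ambiguous read goes to the compatible model with the most such evidence.

```python
from collections import Counter


def reassign_ambiguous(compatible):
    """Assign each read to one model, favouring well-supported models.

    compatible: dict mapping read ID -> non-empty set of model IDs the read
    could belong to (assumed to be computed beforehand).
    Returns (assignment, unambiguous_support).
    """
    # Reads that fit exactly one model are the only discriminating evidence.
    unambiguous = Counter(
        next(iter(models)) for models in compatible.values() if len(models) == 1
    )
    assignment = {}
    for read_id, models in compatible.items():
        # Sorting makes ties deterministic: the smallest model ID wins.
        assignment[read_id] = max(sorted(models), key=lambda m: unambiguous[m])
    return assignment, unambiguous


# Invented example: two full-length reads for G4714.2, one for G4714.3,
# and two degraded reads that fit both models.
compatible = {
    "full_a": {"G4714.2"},
    "full_b": {"G4714.2"},
    "full_c": {"G4714.3"},
    "short_1": {"G4714.2", "G4714.3"},
    "short_2": {"G4714.2", "G4714.3"},
}
assignment, support = reassign_ambiguous(compatible)
final_counts = Counter(assignment.values())
print(final_counts)
```

With this reassignment the degraded reads no longer inflate the retained-intron model, so its read count reflects the single full-length read behind it and a single-read filter could then flag it.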
I've attached fasta files containing the reads in the alignment, in case they would be useful.
Thanks again,
Cathy
G4714.2 reads subset.txt
G4714.3 reads.txt