filter_away_subset.py error #15
Hi @markopetek , Your count file ... The file ... --Liz
Hi Liz, I thought this incompleteness of the abundance file might be caused by some glitch in the GMAP-produced SAM file, so I mapped the reads using STARlong, sorted the alignments, and re-ran the collapse and filter scripts, but I get the same error. Would you suggest any parameter change when running the collapse script to avoid this outcome? Maybe -c 0.99 -i 0.95 are too strict, since I mapped cultivated potato transcripts to a reference from a different subspecies - that might be why the GFF has more isoforms than the abundance file.
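A minimal sketch of one way to check which isoforms appear in the collapsed GFF but not in the abundance file, assuming the GFF attribute column carries `transcript_id "PB.x.y"` and the abundance file has `#` comment lines followed by tab-separated rows with the isoform ID in the first column (file names in the usage comment are placeholders):

```python
# Hedged diagnostic sketch: list isoform IDs present in the collapsed GFF but
# absent from the abundance file.
# Assumed formats: GFF attributes contain transcript_id "PB.X.Y"; the
# abundance file has '#' comment lines, then tab-separated rows whose first
# column is the isoform ID.
import re
import sys

def gff_isoform_ids(gff_path):
    """Collect transcript_id values from the collapsed GFF."""
    pattern = re.compile(r'transcript_id "([^"]+)"')
    ids = set()
    with open(gff_path) as f:
        for line in f:
            m = pattern.search(line)
            if m:
                ids.add(m.group(1))
    return ids

def abundance_isoform_ids(abundance_path):
    """Collect isoform IDs from the first column of the abundance file."""
    ids = set()
    with open(abundance_path) as f:
        for line in f:
            if line.startswith("#") or not line.strip():
                continue
            ids.add(line.split("\t")[0].strip())
    return ids

if __name__ == "__main__":
    # usage: python check_missing.py sample.collapsed.gff sample.abundance.txt
    missing = sorted(gff_isoform_ids(sys.argv[1]) - abundance_isoform_ids(sys.argv[2]))
    print("isoforms in GFF but not in abundance file: %d" % len(missing))
    for pbid in missing:
        print(pbid)
```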
Hi @markopetek , Can you email me the input FASTQ file? I need it to run collapse so I can see if it's a bug in collapse. Whether you use GMAP or STAR or change parameters should have no effect.
Dear Liz,
these are the FASTQ and the cluster report file that you requested:
https://www.dropbox.com/sh/kyn9j8txg6vqlnd/AAB9nLr8_GybALY5N1NKCo9Ia?dl=0
Let me know what you find out.
With kind regards,
Marko
Hi @markopetek , I believe I have identified the issue. Somehow some of your IDs had an extra underscore in your FASTQ file, but they were not there in the cluster report. In the group.txt you can see that some of the IDs had two underscores instead of one. Once I changed the group file to remove the extra underscore and re-ran filter_away_subset.py, it went through. I've put the fixed group file and filtered results here:
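A minimal sketch of that kind of group-file fix, assuming the collapse `group.txt` has one tab-separated line per collapsed isoform with a comma-separated list of member read IDs in the second column, and assuming the defect is a doubled underscore in some member IDs (adapt `fix_id` to whatever the extra underscore actually looks like in your reads):

```python
# Hypothetical sketch: normalize read IDs in a collapse group.txt file.
# Assumes each line is "<isoform_id>\t<member_id1>,<member_id2>,..." and that
# the defect is a doubled underscore in some member IDs; adjust fix_id if the
# extra underscore looks different in your data.
import sys

def fix_id(read_id):
    """Collapse a doubled underscore into a single one (assumed defect)."""
    return read_id.replace("__", "_")

def fix_group_file(in_path, out_path):
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line in fin:
            isoform_id, members = line.rstrip("\n").split("\t")
            fixed = ",".join(fix_id(m) for m in members.split(","))
            fout.write(isoform_id + "\t" + fixed + "\n")

if __name__ == "__main__":
    # usage: python fix_group.py in.collapsed.group.txt out.collapsed.group.txt
    fix_group_file(sys.argv[1], sys.argv[2])
```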
Thanks for the help, Liz. I see that in the FASTQ file the sequences starting with ...
Fixing the HQ FASTQ file header itself is easier because that's the "root" of the problem. Fixing the group.txt is also fine; you just have to remember to do it every time you re-run collapse. What's not clear to me is why for ...
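A minimal sketch of fixing the HQ FASTQ headers directly, assuming plain 4-line FASTQ records and the same doubled-underscore assumption as above; only the ID part of each header (before the first space) is touched:

```python
# Hypothetical sketch: rewrite FASTQ header lines so the read IDs match the
# cluster report. Assumes plain 4-line FASTQ records and that the defect is a
# doubled underscore in the ID before the first space; adapt as needed.
import sys

def fix_fastq_headers(in_path, out_path):
    with open(in_path) as fin, open(out_path, "w") as fout:
        for line_no, line in enumerate(fin):
            if line_no % 4 == 0 and line.startswith("@"):
                parts = line.rstrip("\n").split(" ", 1)
                parts[0] = parts[0].replace("__", "_")
                line = " ".join(parts) + "\n"
            fout.write(line)

if __name__ == "__main__":
    # usage: python fix_fastq_headers.py hq_isoforms.fastq hq_isoforms.fixed.fastq
    fix_fastq_headers(sys.argv[1], sys.argv[2])
```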
For the S1 sample, filter_away_subset.py finished without a problem, but for the second sample (S2) I get this error:
This is the command I used for running the script (in the anaCogent environment):
The input and generated output files are available here:
https://www.dropbox.com/sh/z4wps7y6gvyg4w2/AACDiAk4X-SyyoeWYwTqL0i2a?dl=0