pychopper? #43
Comments
Dear @fburdet Personally, I have never used pychopper on my datasets. IsoQuant does not strongly depend on read direction or adapters; it only checks for polyA tails. I also presume that alignment is not significantly affected by adapters. However, if you happen to run IsoQuant on pychopper-processed data, it would be interesting to compare the results. 40% does seem a bit low. I cannot recall the exact numbers, but ONT reads do tend to have a lot of secondary alignments, some of which appear to be correct. Do the IsoQuant results seem reasonable, though? You can also send me the log if you'd like. Best
Dear Andrey,
Thanks for your answer!
Yes, the results seem reasonable. But it may indeed be that more reads could be used.
Or I didn’t count the uniquely mapped reads correctly. Unfortunately, the counts generated by IsoQuant seem to include only the ambiguous and no_feature numbers. Would it be possible to add more of the stats that are in the log?
For now, I didn’t keep all the logs, so the only way I found to calculate the number of uniquely mapped reads without re-running IsoQuant was to count the reads in read_assignments using
grep unique ***.fq.read_assignments.tsv | wc -l
I’ll have to run pychopper anyhow, so I can re-run IsoQuant and let you know whether it increases any of the mapping stats or changes the results.
Best,
Fred
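A note on the grep command above: `grep unique` matches any line containing the substring `unique`, which would include an assignment type such as `unique_minor_difference` (a category that appears in the IsoQuant log), so it can overcount. Below is a minimal Python sketch of a stricter tally. It assumes the assignment type sits in the sixth tab-separated column of `read_assignments.tsv`; that layout is an assumption, so check the header of your own file and adjust `type_column`. The function name and the demo rows are hypothetical. Reads are deduplicated by ID in case a read is listed on more than one line.

```python
from collections import defaultdict
from typing import Dict, Iterable


def count_assignment_types(lines: Iterable[str], type_column: int = 5) -> Dict[str, int]:
    """Count distinct read IDs per assignment type in a tab-separated file.

    type_column is 0-based; the default of 5 is an assumption about the layout.
    """
    reads_by_type = defaultdict(set)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= type_column:
            continue  # skip malformed or truncated lines
        # The first column is taken to be the read ID.
        reads_by_type[fields[type_column]].add(fields[0])
    return {kind: len(ids) for kind, ids in reads_by_type.items()}


# Hypothetical demo rows (not real data).
demo = [
    "#read_id\tchr\tstrand\tisoform_id\tgene_id\tassignment_type",
    "r1\tchr1\t+\tT1\tG1\tunique",
    "r2\tchr1\t+\tT1\tG1\tunique_minor_difference",
    "r3\tchr1\t+\tT1\tG1\tambiguous",
    "r3\tchr1\t+\tT2\tG1\tambiguous",
]
print(count_assignment_types(demo))
```

For a real file, pass an open file handle instead of the demo list, for example `with open(path) as fh: count_assignment_types(fh)`.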
Ah, and here’s the log!
Cheers
Dear @fburdet
Thank you for the suggestion, will do.
The grep command counts unique read-to-isoform assignments. A uniquely assigned read is not necessarily uniquely mapped, e.g. if its secondary alignment maps to an intergenic region. Vice versa, ambiguous assignments may come from uniquely mapped reads, e.g. when the read covers only part of the gene and it is not clear which isoform it belongs to (quite typical for truncated ONT reads). Thus, 40% uniquely assigned reads seems to be OK. I still don't see the log (maybe it was not attached through email), but anyway, I don't think it contains a lot of useful information, except maybe the read assignment statistics at the end. Don't hesitate to ask other questions if needed. Best
Hello Andrey,
Thanks for looking into it!
So what would be a better statistic for the reads that can actually be used for counting?
Weird that the log didn’t go through. I’ll paste the end of it below.
Thanks in advance!
Best,
Fred
2022-07-02 14:22:10,311 - INFO - Read assignment statistics
2022-07-02 14:22:10,311 - INFO - ambiguous: 1405194
2022-07-02 14:22:10,312 - INFO - inconsistent: 3144351
2022-07-02 14:22:10,312 - INFO - noninformative: 410271
2022-07-02 14:22:10,312 - INFO - unique: 3820241
2022-07-02 14:22:10,312 - INFO - unique_minor_difference: 559669
2022-07-02 14:22:10,312 - INFO - Transcript model file ./00_WT_pod_p17.fq/00_WT_pod_p17.fq.transcript_models.gtf
2022-07-02 14:22:10,312 - INFO - Transcript model statistics
2022-07-02 14:22:10,312 - INFO - known: 23951
2022-07-02 14:22:10,312 - INFO - novel_in_catalog: 4884
2022-07-02 14:22:10,312 - INFO - novel_not_in_catalog: 2439
2022-07-02 14:22:13,158 - INFO - Processed sample 00_WT_pod_p17.fq
2022-07-02 14:22:13,158 - INFO - Processed 1 sample
2022-07-02 14:22:13,158 - INFO - === IsoQuant pipeline finished ===
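To put the read assignment statistics in this log into proportions, here is a small Python sketch that parses the five category lines and prints each category's share of the total. The regular expression and the function name are illustrative assumptions about the log format shown above, not part of IsoQuant. For these counts, `unique` comes to roughly 40.9% of the categorized reads.

```python
import re
from typing import Dict

# The five read-assignment lines copied from the log above.
LOG_TAIL = """\
2022-07-02 14:22:10,311 - INFO - ambiguous: 1405194
2022-07-02 14:22:10,312 - INFO - inconsistent: 3144351
2022-07-02 14:22:10,312 - INFO - noninformative: 410271
2022-07-02 14:22:10,312 - INFO - unique: 3820241
2022-07-02 14:22:10,312 - INFO - unique_minor_difference: 559669
"""

# Matches "... INFO - <category>: <count>" at the end of a line.
STAT_LINE = re.compile(r"INFO - (\w+): (\d+)\s*$")


def parse_assignment_stats(text: str) -> Dict[str, int]:
    """Extract category -> read count from log lines."""
    stats = {}
    for line in text.splitlines():
        match = STAT_LINE.search(line)
        if match:
            stats[match.group(1)] = int(match.group(2))
    return stats


stats = parse_assignment_stats(LOG_TAIL)
total = sum(stats.values())
shares = {name: 100.0 * count / total for name, count in stats.items()}

for name, share in shares.items():
    print(f"{name:>24}: {stats[name]:>9} ({share:.1f}%)")
print(f"{'total':>24}: {total:>9}")
```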
To count uniquely mapped reads, it is best to use the original BAM file and samtools.
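As one way to act on this, the sketch below counts reads from SAM-formatted text, such as the output of `samtools view` on the BAM file. It is only an illustration under a stated assumption: a read is called "uniquely mapped" here if it has a primary alignment and no secondary alignment (SAM flag 0x100). Other definitions, for example a MAPQ threshold, are equally plausible. Supplementary records (flag 0x800) are ignored, and the function name and demo records are hypothetical.

```python
from typing import Dict, Iterable

UNMAPPED = 0x4         # SAM flag: segment unmapped
SECONDARY = 0x100      # SAM flag: secondary alignment
SUPPLEMENTARY = 0x800  # SAM flag: supplementary alignment


def summarize_sam(lines: Iterable[str]) -> Dict[str, int]:
    """Classify reads by the flags of their SAM records."""
    primary, with_secondary, unmapped = set(), set(), set()
    for line in lines:
        if line.startswith("@"):
            continue  # skip SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            continue  # skip malformed lines
        name, flag = fields[0], int(fields[1])
        if flag & UNMAPPED:
            unmapped.add(name)
        elif flag & SUPPLEMENTARY:
            continue  # chimeric part of a read; ignored in this sketch
        elif flag & SECONDARY:
            with_secondary.add(name)
        else:
            primary.add(name)
    return {
        "mapped_reads": len(primary),
        "uniquely_mapped": len(primary - with_secondary),
        "unmapped": len(unmapped),
    }


# Hypothetical demo records (not real data); only the first two fields matter here.
demo_sam = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r1\t256\tchr2\t500\t0\t10M\t*\t0\t0\t*\t*",
    "r2\t16\tchr1\t200\t60\t10M\t*\t0\t0\tACGTACGTAC\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\t*",
    "r4\t0\tchr3\t300\t60\t6M4S\t*\t0\t0\tACGTACGTAC\t*",
    "r4\t2048\tchr5\t900\t60\t6H4M\t*\t0\t0\tACGT\t*",
]
print(summarize_sam(demo_sam))
```

On real data, the same function could read from standard input, with the alignments piped in from `samtools view` and `sys.stdin` passed as `lines`.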
The log looks normal; the proportions of unique/ambiguous/inconsistent reads seem reasonable for ONT datasets. By default, unique and ambiguous reads are used for quantification. You may also set a different quantification strategy (e.g. unique only) if needed. Best
I'll close this issue for now, please re-open or open a new one if any other questions arise. Best
Hi~ Andrey,
Is it possible to output statistics on the minimap results in the log? Best
Hi @zpliu1126 I will add it to my TODO list; this could be informative for the user. At the moment you can use, for example,
P.S. You can create new issues even for minor questions, since comments in closed topics might be missed. Best
Hello,
I successfully ran IsoQuant using the ONT reads directly from the sequencing facility.
For many other similar programs, it is recommended to run pychopper first. It seems to remove the primers and correct the direction of the reads.
Is it needed for IsoQuant? About 40% of my reads seem to be uniquely mapped, which sounds a bit low compared to short reads, where I have more experience. Is it?
Thanks in advance!