-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
3 things: iteritems() vs items(); rect to polar; sorting errors #26
Comments
@dmcgoldrick , |
Thanks Kelly --
The current sorting of the data is not working for us ether :-/ i.e. when
you return sorted(foo...). Also is this code robust to encoding for
latin-1 vs utf-8? We get issues with that in our pipelines. So we just had
a conversion failure with a batch of gtc's that completed with the gcf2vcf
tool (which then can't handle indels and made outdated vcf). 1300 samples.
https://github.com/freeseek/gtc2vcf
We ran yours and found that it was having issues with modern gtc data using
AutoConvert directly from Illumina. We don't like miniconda. Also version
3+ is fairly standard now - time has come - 2.7+ has problems with
interfaces/syntax/encodings and interactions with htslib, pysam, pyvcf...
Weird but the python imterpreter doesn't like your tuple syntax but it
should be able to take a tuple :-)
Your competitor tool gtc2vcf makes vcf version 3 - (yes yuck). Then the
output says run vcf-convert to get to the current 4.0 spec which is using
perl module Vcf.pm so then we need to put in a perl dependency... geesh
We need a *simple* and robust backward compatibile and up-to-date solution
we can rely on.
So none of these solutions are quite there for us for a real genome
facility process? If we were processing 100,000 genomes and chips both code
bases would be a nightmare for our pipeline with all the post processing,
dependency webs and lack of informative errors.
The latest issue was the vcf Writer class failed and did not dump to vcf
with a new batch of gtc files and the entry objects are not correct for the
gtc we have - i put in some code to show me what it made and one batch is
parsing and the other batch of gtc is not. So it is unstable depending on
the manifest and gtc but *it does not help us find or tell us the issues.*
I will try to look at the pull request if you like?
So.... we don't yet have a complete up-to-date solution implemented.
best,
Daniel
…On Tue, Nov 27, 2018 at 8:04 PM KelleyRyanM ***@***.***> wrote:
@dmcgoldrick <https://github.com/dmcgoldrick> ,
At the moment, this tool is only compatible with python 2.7.x. The three
issues you've noted above are caused by Python 2 vs 3 differences. There is
an open pull request (#24 <#24>)
with fixes for python 3 compatibility; however, I was still following up on
the syntax error for the rect_to_polar call. As currently, this is a
function that takes a single argument, which is a tuple.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#26 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ARa6I4nUc__uaLH3aZvA77bNZW3E7mTgks5uzgsYgaJpZM4YeX7h>
.
--
*Daniel J McGoldrick Ph.D*
*CMG and GRC *
*UW Genome Sciences Center*
*Box 355065Seattle, WA 98195(206) 685-7342*
|
If python 3 compatibility is a critical need, it should be possible to support. I created a new branch at https://github.com/Illumina/GTCtoVCF/tree/python3_compatibility and currently all regression and unit tests are passing with an install of python 3.5.6. It may be worthwhile to see if the changes in that branch address your issues before it is merged back into develop. This tool supports GTC files written by the latest version of AutoConvert. If have some files where that is not the case, I could try to help troubleshoot. |
Got it ! --
I really appreciate that new branch - great news - I will check it out
immediately and get back to you with what we are getting :-).
best,
Daniel
…On Wed, Nov 28, 2018 at 8:16 PM KelleyRyanM ***@***.***> wrote:
If python 3 compatibility is a critical need, it should be possible to
support. I created a new branch at
https://github.com/Illumina/GTCtoVCF/tree/python3_compatibility and
currently all regression and unit tests are passing with an install of
python 3.5.6.
It may be worthwhile to see if the changes in that branch address your
issues before it is merged back into develop. This tool supports GTC files
written by the latest version of AutoConvert. If have some files where that
is not the case, I could try to help troubleshoot.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#26 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ARa6I3bmiaY0fukBqzMyhihxKWoRtPwoks5uz19fgaJpZM4YeX7h>
.
--
*Daniel J McGoldrick Ph.D*
*CMG and GRC *
*UW Genome Sciences Center*
*Box 355065Seattle, WA 98195(206) 685-7342*
|
Hi Ryan --
This new update on your branch ran to completion on a repeat of your
initial regression test, and 1300 samples and the output is being evaluated
against a control. We are not getting the errors we had before! :-) and so
far so good.
Outstanding work! I think the issue we had in not getting a parse at all
has been fixed with the update to python 3.5.6.
We will let you know if we discover any other issues - but I think you
slam-dunked this one.
best,
Daniel
…On Wed, Nov 28, 2018 at 8:16 PM KelleyRyanM ***@***.***> wrote:
If python 3 compatibility is a critical need, it should be possible to
support. I created a new branch at
https://github.com/Illumina/GTCtoVCF/tree/python3_compatibility and
currently all regression and unit tests are passing with an install of
python 3.5.6.
It may be worthwhile to see if the changes in that branch address your
issues before it is merged back into develop. This tool supports GTC files
written by the latest version of AutoConvert. If have some files where that
is not the case, I could try to help troubleshoot.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#26 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ARa6I3bmiaY0fukBqzMyhihxKWoRtPwoks5uz19fgaJpZM4YeX7h>
.
--
*Daniel J McGoldrick Ph.D*
*CMG and GRC *
*UW Genome Sciences Center*
*Box 355065Seattle, WA 98195(206) 685-7342*
|
Thank you for your feedback in improving the tool. |
LocusEntryFactory.py: for _, value in position2record.iteritems():
2.1)LocusEntryFactory.py: return sorted(result, key=lambda entry: (self._chrom_sort_function(entry.vcf_record.CHROM), entry.vcf_record.POS, entry.vcf_record.REF))
2.2) ReaderTemplateFactory.py: sorted(contigs.items(), key=lambda x: self._chrom_sort_function(x[0])))
e.g.:
Traceback (most recent call last):
if limit >= 0:
TypeError: unorderable types: AttributeError() >= int()
def rect_to_polar((x,y)):
this should be def rect_to_polar(x,y):? does the code even use this function ? Anyhow it's a syntax error for me
The text was updated successfully, but these errors were encountered: