Sequence Complement does not return the same results on Python3.3.3 as Python3.4 #518
Comments
This sounds very serious, but likely very difficult to identify without more information. Have you compared with Python 2.6 or 2.7? That might help work out if this is related to the string/unicode changes. You said it is not random? Otherwise I might have guessed it was faulty RAM to blame.
Not random in that multiple runs always yield the same result. My code will not run on Python 2.x, so I cannot test that without some gymnastics. I have been chasing this for about a week now. When I run the code block below with a ref_seq of, say, 10,000 nucleotides and an alt_seq of 10,000 as well, filtered so that I know ref != alt, I might get a count of 1,000 on Python 3.3.3 and 1,300 on Python 3.4. If I make the complement myself, as in the last code block, the results are the same. Python 3.3 and 3.4 should both treat strings/unicode the same way.
I did some checking to try to find the root of the problem. I copied your code (shown below) to do the complementing and tested it outside of Biopython with both versions of Python. It worked like a charm. I also checked the version test in your code, and it is working fine. I am stumped as to what else to check at this point.
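For readers following the thread, here is a minimal sketch of the kind of standalone check described above: a hand-written complement plus a helper that reports which inputs two complement implementations disagree on. The names `simple_complement` and `find_disagreements` are hypothetical and are not taken from the reporter's code or from Biopython; only the unambiguous bases A, C, G, T are handled.

```python
import sys

# Hand-written DNA complement, independent of Biopython (illustrative only).
# Only unambiguous bases are mapped; any other character passes through unchanged.
_COMPLEMENT_TABLE = str.maketrans("ACGTacgt", "TGCAtgca")


def simple_complement(seq):
    """Return the complement of a plain DNA string."""
    return seq.translate(_COMPLEMENT_TABLE)


def find_disagreements(sequences, complement_a, complement_b):
    """Return the input sequences for which the two functions give different results."""
    return [s for s in sequences if complement_a(s) != complement_b(s)]


if __name__ == "__main__":
    samples = ["ACGT", "GATTACA", "ttagc"]
    # The same function is passed twice here so the sketch runs without Biopython.
    # In practice complement_b would wrap the library call, for example
    # lambda s: str(Seq(s).complement()).
    mismatches = find_disagreements(samples, simple_complement, simple_complement)
    print(sys.version_info[:3], mismatches)
```

Running the same script under both interpreters and comparing the printed lists would show whether any specific sequence is complemented differently, which is the piece of information the thread is missing.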
I was suggesting comparing Python 2 and Python 3 (just in case this is Python 3 specific, which would be a useful clue).
Can you verify that your Python 3.3.3 and Python 3.4 installations are both using the same version of Biopython? The Seq equality operation in Biopython recently changed semantics to compare string values instead of object identity. Assuming that's all fine, then if
The Biopython versions are the same (1.65). I was using "is", did not know
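As an aside on the `is` comparison mentioned here: in Python, `is` tests object identity while `==` tests value, and whether two equal strings happen to be the same object is an implementation detail of the interpreter. The sketch below is illustrative only; `count_differences` is a hypothetical helper, not the reporter's code, and nothing in the thread confirms that this was the actual cause.

```python
# Two equal strings built at runtime; whether they share one object is up to the interpreter.
ref = "".join(["AC", "GT"])
alt = "".join(["AC", "GT"])

print(ref == alt)  # value comparison: compares the characters
print(ref is alt)  # identity comparison: outcome is an implementation detail


def count_differences(pairs):
    """Count (ref, alt) pairs whose string values differ, using != rather than `is not`."""
    return sum(1 for a, b in pairs if a != b)
```

A count based on `!=` depends only on the sequence contents, so it should give the same number on any Python version.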
The program is woven together in a way that cutting it down would be a big
I'm going to close this issue now. I hope the problem went away, but without more information I don't see what else we can do.
When complementing DNA during a search for mutations, I found that running on Python 3.3.3 gave about 3% fewer hits than on Python 3.4 when using Seq(str(ref_seq+alt_seq)).complement(). When I wrote my own simple code to generate the complement, this difference went away. The error is rare and not random. I have not been able to capture the actual sequence that triggers it. The difference between the two results is significant.