Fix inconsistent subtype function #10
Comments
When you mean inconsistent, is that because in |
Yes, so the inconsistent subtypes would be 2.1 and 2.2. I originally had the function try to find the highest resolution subtype (i.e. longest string of numbers) and use that as the reference to compare against, but in cases where the highest resolution subtypes are equal length, the function fails to report inconsistencies.
I think I found an issue. If both strings are same length, then >= 1 would add both to the |
Sample SRR1958005 returns multiple subtype calls `2.1; 2.2`, with all subtype calls being `2; 2.1; 2.2`. Subtypes `2.1` and `2.2` are not consistent with each other.

Testing `bio_hansel -s enteritids` on the Enterobase genome assembly of SRR1958005 (`46121.fasta`) produced the following results:

Other samples which should have been `are_subtypes_consistent == False` include:

- `all_subtypes == "2; 2.1; 2.2.2.1.1"`
- `all_subtypes == "2; 2.1; 2.2.2.1.1"`
- `all_subtypes == "2; 2.1; 2.2.2"`
- `all_subtypes == "2; 2.1; 2.2.2"`

See function here:
https://github.com/phac-nml/bio_hansel/blob/master/bio_hansel/utils.py#L80
Also, would be a good idea to add a test for that function.
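
One possible way to make the check robust to equal-length subtypes is sketched below. This is not the code in `utils.py`; the function name `are_subtypes_consistent_sketch` and its signature are hypothetical, chosen only for illustration. The idea is to sort the distinct subtypes by depth and require each one to be a prefix of the next deeper one, so two different subtypes at the same depth (such as `2.1` and `2.2`) are always reported as inconsistent.

```python
from typing import Iterable


def are_subtypes_consistent_sketch(subtypes: Iterable[str]) -> bool:
    """Hypothetical helper (not bio_hansel's actual function).

    Returns True only if the subtypes form a single lineage,
    e.g. 2 -> 2.1 -> 2.1.3, where each subtype is a prefix of the next.
    """
    # Split on "." so that "2.10" is not mistaken for a descendant of "2.1".
    # The set removes exact duplicates; sorting by length orders by depth.
    levels = sorted({tuple(s.strip().split(".")) for s in subtypes}, key=len)
    # Compare each subtype with the next deeper (or equally deep) one.
    for shorter, longer in zip(levels, levels[1:]):
        if longer[:len(shorter)] != shorter:
            return False
    return True


def test_are_subtypes_consistent_sketch():
    # Cases taken from this issue: all should be flagged as inconsistent.
    assert not are_subtypes_consistent_sketch("2; 2.1; 2.2".split(";"))
    assert not are_subtypes_consistent_sketch("2; 2.1; 2.2.2.1.1".split(";"))
    assert not are_subtypes_consistent_sketch("2; 2.1; 2.2.2".split(";"))
    # A single lineage should be consistent.
    assert are_subtypes_consistent_sketch("2; 2.1; 2.1.1".split(";"))
```

Because duplicates are removed before sorting, any two entries of equal depth that reach the comparison are necessarily different, which is the case the longest-subtype-as-reference approach misses. The test function can be run with `pytest` and could serve as a starting point for the test suggested above.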