Github issue 843 - Genebank parser #919
Thanks @MaxGreil! This is great progress.
Adding the full 14MB GenBank file is a bit too much for a unit test. We should avoid adding a lot of data when it is not needed. The ideal unit test goes through only a few lines and asserts that the behaviour is as expected. For instance, you can take a few-line excerpt from the GenBank record to demonstrate the issue. You can even keep those few lines as a string constant inside the test.
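A minimal sketch of the string-constant idea above, in plain Java with no BioJava dependency. The excerpt text, the class name `InlineExcerptSketch`, and both helper methods are hypothetical and fabricated for illustration; only the anticodon value comes from this thread. In the real test the stream would presumably be handed to BioJava's GenBank reader, which is left out here because its usage is not shown in this conversation.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class InlineExcerptSketch {
    // Fabricated few-line excerpt (not copied from NC_018080.gb); the tRNA
    // location is made up, the wrapped anticodon qualifier mirrors the case
    // discussed in this PR.
    static final String EXCERPT = String.join("\n",
            "FEATURES             Location/Qualifiers",
            "     tRNA            complement(1123519..1123603)",
            "                     /product=\"tRNA-Leu\"",
            "                     /anticodon=(pos:complement(1123552..1123554),aa:Leu,",
            "                     seq:caa)",
            "//") + "\n";

    // Wraps the constant in a stream, so a parser that expects an InputStream
    // can read it without any file on disk.
    static InputStream asStream(String text) {
        return new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8));
    }

    // Stand-in consumer for this sketch: returns the first line containing
    // the needle (trimmed), or null if no line matches.
    static String firstLineContaining(InputStream in, String needle) {
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    return line.trim();
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstLineContaining(asStream(EXCERPT), "/anticodon="));
    }
}
```

Keeping the excerpt next to the assertions makes the failing case visible in the test itself and avoids committing a large resource file.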
Hi @josemduarte, I reduced the size of file NC_018080.gb to contain only one relevant case with qualifier anticodon.
Is the fix for transl_except as simple? Though there aren't any examples of these kinds of failing lines in the linked issue.
(edit) Here is the doc; it appears that the syntax is similar to that for
I added a fix for the qualifier. @heuermh, thank you for posting the documentation for the qualifier. @josemduarte, does it make sense to split this PR for issue #843 into two separate PRs, i.e. one PR for the GenBank parser and one PR for the EMBL parser?
for (Iterator<FeatureInterface<AbstractSequence<NucleotideCompound>, NucleotideCompound>> it = dna.getFeaturesByType("tRNA").iterator(); it.hasNext();) {
    FeatureInterface<AbstractSequence<NucleotideCompound>, NucleotideCompound> tRNAFeature = it.next();
    String anticodon = tRNAFeature.getQualifiers().get("anticodon").get(0).getValue();
    assertFalse(anticodon.contains(" "));
How about also asserting that the string is exactly as expected? If I understand it right, it should be: (pos:complement(1123552..1123554),aa:Leu,seq:caa)
I would simply fabricate an example for transl_except with wrapping in the same gb file excerpt text you have added.
Do we already have an EMBL file parser somewhere in BioJava? Sorry, I don't know this part of BioJava well. If we do and it's only a matter of fixing it, then please go ahead and make it part of this PR. It will only need an additional unit test (again, it's OK to use fabricated data).
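To illustrate the two suggestions above (assert the exact value, and fabricate a wrapped transl_except case), here is a hedged, self-contained sketch. It is not BioJava's parser: `WrappedQualifierSketch` and `parseQualifiers` are hypothetical names, the transl_except line is fabricated, and the joining rule (concatenate continuation lines with no separator) is an assumption that suits structured values such as anticodon and transl_except.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class WrappedQualifierSketch {
    // Collects the qualifiers of one feature. A line starting with '/' opens
    // a new qualifier; any other line is treated as a continuation and is
    // appended to the current value WITHOUT an inserted space.
    static Map<String, String> parseQualifiers(List<String> lines) {
        Map<String, String> qualifiers = new LinkedHashMap<>();
        String currentKey = null;
        StringBuilder currentValue = new StringBuilder();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("/")) {
                if (currentKey != null) {
                    qualifiers.put(currentKey, currentValue.toString());
                }
                int eq = line.indexOf('=');
                currentKey = eq < 0 ? line.substring(1) : line.substring(1, eq);
                currentValue = new StringBuilder(eq < 0 ? "" : line.substring(eq + 1));
            } else if (currentKey != null) {
                currentValue.append(line);
            }
        }
        if (currentKey != null) {
            qualifiers.put(currentKey, currentValue.toString());
        }
        return qualifiers;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
                "                     /anticodon=(pos:complement(1123552..1123554),aa:Leu,",
                "                     seq:caa)",
                // Fabricated wrapped transl_except example, for illustration only.
                "                     /transl_except=(pos:213..215,",
                "                     aa:Trp)");
        Map<String, String> q = parseQualifiers(lines);
        System.out.println(q.get("anticodon"));
        System.out.println(q.get("transl_except"));
    }
}
```

Comparing against the full expected string catches both a stray space and a dropped fragment, which `assertFalse(anticodon.contains(" "))` alone would not. Note that joining without a separator would be wrong for free-text qualifiers such as /note, where wrapped lines normally need a space between them, and this sketch does not handle a continuation line that itself begins with '/'.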
I added a line in file However, I have an additional question: I had a look at the EMBL file parser with test @josemduarte, should I open a new issue for this problem?
Great, thanks @MaxGreil! I think this looks good now. @heuermh, could you double-check it too?
Thanks for investigating. Indeed, that deserves another issue separate from #843.
I'll merge this by the end of the week if there are no other comments.
Reference Issue
Partly fixes #843
What does this implement/fix? Explain your changes.
Additional comments
@josemduarte, can you please tell me where I can get NC_018080 in .embl file format? I couldn't find it.