Transcript.coding_sequence didn't work for non-coding transcripts #136
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -82,7 +82,12 @@ def _parse_fasta_dictionary(self): | |
fasta_dictionary = {} | ||
sequence_type = self.sequence_type | ||
for seq_entry in read(self.fasta_path, format="fasta"): | ||
seq_id = seq_entry.metadata["id"] | ||
# annoyingly Ensembl83 reformatted the transcript IDs of its | ||
# cDNA FASTA to include sequence version numbers | ||
# e.g. | ||
# "ENST00000448914.1" instead of "ENST00000448914" | ||
# So now we have to parse out the identifier | ||
seq_id = seq_entry.metadata["id"].split(".")[0] | ||
Can you add a test that would break if you didn't have this?
I did, check out
Oops, somehow didn't think that would cover it, cool |
||
fasta_dictionary[seq_id] = sequence_type(seq_entry) | ||
return fasta_dictionary | ||
|
||
|
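For reference, the ID normalization in the hunk above can be sketched in isolation. This is a minimal sketch, not the project's code: `strip_version` and `build_fasta_dictionary` are hypothetical names, the entries are plain `(id, sequence)` tuples rather than the objects yielded by `read(...)`, and the sequences are made-up placeholders.

```python
def strip_version(seq_id):
    """Drop everything from the first '.' onward, e.g. a trailing '.1' version."""
    return seq_id.split(".")[0]


def build_fasta_dictionary(entries):
    """Map version-free IDs to sequences.

    `entries` is assumed to be an iterable of (id, sequence) pairs;
    this stands in for the parsed FASTA records in the real code.
    """
    return {strip_version(seq_id): sequence for seq_id, sequence in entries}


# Placeholder data: one versioned ID (from the comment in the diff) and one
# hypothetical unversioned ID, to show both forms end up version-free.
entries = [("ENST00000448914.1", "ACGT"), ("ENST_EXAMPLE", "GGCC")]
fasta_dictionary = build_fasta_dictionary(entries)
```

Note that `split(".")[0]` cuts at the first dot, so it assumes the identifier itself never contains one.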
Remind me what happens/breaks if you don't have this?
If we search for a Transcript whose biotype isn't in these lists, then we get an error.
Totally open to doing away with this "validation", since Ensembl keeps expanding the lists of valid biotypes.
👍 Fine for now
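To make the trade-off in this thread concrete, here is a rough sketch of strict versus lenient biotype checking. Everything in it is an assumption for illustration: `KNOWN_BIOTYPES`, `check_biotype`, and the `strict` flag are hypothetical names, and the set is a deliberately tiny stand-in for the lists under review, which are not shown here.

```python
import warnings

# Hypothetical, intentionally incomplete stand-in for the reviewed lists.
KNOWN_BIOTYPES = {"protein_coding", "lincRNA", "pseudogene"}


def check_biotype(biotype, strict=True):
    """Return True if `biotype` is known.

    With strict=True an unknown biotype raises ValueError (the behaviour
    described in the thread); with strict=False it only warns and returns
    False, so newly added Ensembl biotypes do not break lookups.
    """
    if biotype in KNOWN_BIOTYPES:
        return True
    if strict:
        raise ValueError("Invalid biotype: %r" % (biotype,))
    warnings.warn("Unrecognized biotype: %r" % (biotype,))
    return False
```

The lenient mode is one way to keep some validation without having to update the lists every time Ensembl adds a biotype.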