Bugfix that causes warning messages when input exons are full UTR int… #404

duartemolha

…roduced in "ENSCORESW-2545"

Description

The commit here:
74e499a

was trying to correct some discrepancies in the soft-masking of non-coding sequences.
However, completely non-coding exons have an undefined $ex->coding_region_start, which results in warning messages:

Use of uninitialized value in numeric gt (>) at ...EnsEMBL/Transcript.pm

At line 837, $exon_seq = lc($exon_seq) sets the whole sequence to lower case for any exon that does not have a defined coding start.

For such exons, the if statements
if ($ex->coding_region_start($self) > $ex->start()) {
and
if ($ex->coding_region_end($self) < $ex->end()) {
should never be evaluated, since coding_region_start and/or coding_region_end will be undefined whenever
if (!defined ($ex->coding_region_start($self))) is true.
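The warning itself can be reproduced outside Ensembl. The following is a minimal sketch, not the Transcript.pm code: $coding_region_start and $exon_start are hypothetical plain scalars standing in for the values returned by the exon methods, and the undefined value is assumed to behave as it would for a fully UTR exon.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so the effect can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical stand-ins: a fully UTR exon has no coding start.
my $coding_region_start = undef;
my $exon_start          = 100;

# Unguarded comparison: undef is treated as 0 and Perl emits a warning.
my $unguarded = $coding_region_start > $exon_start;

# Guarded comparison: defined() short-circuits before the numeric test runs.
my $guarded = defined($coding_region_start)
    && $coding_region_start > $exon_start;

printf "warnings captured: %d\n", scalar @warnings;
print $warnings[0] if @warnings;
```

Only the unguarded comparison contributes a warning here, which is why wrapping the comparisons in a defined() check silences the message without changing the result.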

Use case

This bug produces warning messages for completely UTR exons when the optional soft-masking has been requested,

for example for gene DDR2, transcript ID ENST00000367921.

The soft-masked output is correct both before and after my code change, but with the updated code we no longer get warning messages such as
Use of uninitialized value in numeric gt (>) at .../ensembl/modules/Bio/EnsEMBL/Transcript.pm line XXX.

Benefits

With this change, when soft-masking is requested, a completely UTR exon is set to all lowercase and the comparisons of the coding start and coding end with the start and end of the exon are skipped.

Those comparisons are only done if $ex->coding_region_start is defined:
if (!defined ($ex->coding_region_start($self))) {
  $exon_seq = lc($exon_seq);
}else{
  if ($ex->coding_region_start($self) > $ex->start()) {
    ...
  }

  if ($ex->coding_region_end($self) < $ex->end()) {
    ...
  }
}
$seq_string .= $exon_seq;

Possible Drawbacks

none that I can see

Testing

No, I have not created tests for this. The currently available tests check that the boundaries between lowercase and uppercase match.
For exons that are completely UTR, both the previous code and the new code make the entire exon sequence lowercase. The only difference is that after my change there are no invalid if comparisons against undefined values, and therefore no warning messages.

@coveralls

Coverage Status

Coverage remained the same at 81.467% when pulling 85116e3 on duartemolha:hotfix/Bug_on_softmaskig_issueENSCORESW-2545 into 3eda8b4 on Ensembl:master.

$exon_seq = lc (substr($exon_seq, 0, $forward_length)) . substr($exon_seq, $forward_length);
} else {
$exon_seq = substr($exon_seq, 0, $reverse_length+1) . lc(substr($exon_seq, $reverse_length+1));
}else{
Contributor

As long as we are making changes here, it might be nice to flip the if-else case to make this more readable. Make the test if (defined ($ex->coding_region_start($self))) and switch the code in the if and else cases.
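For illustration, the flipped structure could look roughly like the sketch below. It is not the code from this pull request: soft_mask_exon is a hypothetical helper that takes plain coordinates instead of an exon object, it handles the forward strand only, and it assumes the coding end is defined whenever the coding start is.

```perl
use strict;
use warnings;

# Hypothetical helper, not the Ensembl API: plain 1-based coordinates replace
# the exon object, and only the forward strand is handled.
sub soft_mask_exon {
    my ($exon_seq, $start, $end, $coding_start, $coding_end) = @_;

    if (defined $coding_start) {
        # Partly coding exon: lowercase only the flanking UTR bases.
        # Assumption: $coding_end is defined whenever $coding_start is.
        if ($coding_start > $start) {
            my $utr_length = $coding_start - $start;
            $exon_seq = lc(substr($exon_seq, 0, $utr_length))
                      . substr($exon_seq, $utr_length);
        }
        if ($coding_end < $end) {
            my $kept_length = $coding_end - $start + 1;
            $exon_seq = substr($exon_seq, 0, $kept_length)
                      . lc(substr($exon_seq, $kept_length));
        }
    }
    else {
        # Fully UTR exon: no coding coordinates, so lowercase everything.
        $exon_seq = lc($exon_seq);
    }
    return $exon_seq;
}

print soft_mask_exon('AAACCCGGGT', 1, 10, 4, 9), "\n";          # aaaCCCGGGt
print soft_mask_exon('AAACCCGGGT', 1, 10, undef, undef), "\n";  # aaacccgggt
```

Putting the defined() branch first keeps the common, partly coding case at the top and leaves the fully UTR fallback as a short else.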

my $forward_length = $ex->coding_region_start($self) - $ex->start();
my $reverse_length = $ex->end() - $ex->coding_region_start($self);
if ($ex->strand == 1) {
$exon_seq = lc (substr($exon_seq, 0, $forward_length)) . substr($exon_seq, $forward_length);
Contributor

Coding style issues: please add a space between brackets and else, also there's some trailing whitespace.

Contributor

@mkszuba mkszuba left a comment

I insist on adding a test case confirming we get no unexpected warnings.
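One possible shape for such a test, using only core Test::More and a local $SIG{__WARN__} handler, is sketched below. mask_leading_utr is a hypothetical stand-in for the soft-masking code path; a real test would presumably fetch a fully UTR exon from the test database and call the actual transcript method instead. The CPAN module Test::Warnings would be an alternative way to assert the absence of warnings.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical stand-in for the code under test: lowercases the whole
# sequence when there is no coding start, otherwise only the leading UTR.
sub mask_leading_utr {
    my ($exon_seq, $coding_start, $exon_start) = @_;
    return lc($exon_seq) if !defined $coding_start;
    my $utr_length = $coding_start - $exon_start;
    return lc(substr($exon_seq, 0, $utr_length)) . substr($exon_seq, $utr_length);
}

# Capture any warnings raised while masking a fully UTR exon.
my @warnings;
my $masked;
{
    local $SIG{__WARN__} = sub { push @warnings, @_ };
    $masked = mask_leading_utr('ACGT', undef, 1);
}

is($masked, 'acgt', 'fully UTR exon is lowercased');
is(scalar @warnings, 0, 'no warnings for an undefined coding start')
    or diag(@warnings);

done_testing();
```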

@duartemolha
Author

duartemolha commented Aug 7, 2019 via email

@duartemolha
Author

duartemolha commented Aug 7, 2019 via email

Contributor

@tgrego tgrego left a comment

I manually tried the method with ENST00000367921 and could not replicate the error.
A test case exposing the error is required, or a bug report so that it can be properly fixed.

@mkszuba
Contributor

mkszuba commented Sep 16, 2019

Closing due to inactivity. By all means do go ahead and reopen if you can provide a test case which will allow us to replicate the error.

@mkszuba mkszuba closed this Sep 16, 2019