Features which cover an entire circular sequence are not correctly shifted #92

Closed
amijalis opened this issue Dec 13, 2022 · 4 comments


@amijalis
Contributor

amijalis commented Dec 13, 2022

I noticed today that a feature spanning an entire circular sequence is not properly shifted with Dseqrecord.shifted().

Reproduce with:

from pydna.dseqrecord import Dseqrecord
test_seq = 'ATCGATCGATCGATCGATCGATCGATCGATCGATCG'

# length of test_seq is 36.

test_seq_dseqrecord = Dseqrecord(test_seq).looped()
test_seq_dseqrecord.add_feature(0,36,type_='test', label='test')

print(f'Features before shifting: {test_seq_dseqrecord.features}')

test_seq_shifted = test_seq_dseqrecord.shifted(10)

print(f'Features after shifting: {test_seq_shifted.features}')

Output:

Features before shifting: [SeqFeature(SimpleLocation(ExactPosition(0), ExactPosition(36), strand=1), type='test', qualifiers=...)]

Features after shifting: []

The expected behavior is observed if the feature is annotated on the interval from [1: len(seq)]:

from pydna.dseqrecord import Dseqrecord
test_seq = 'ATCGATCGATCGATCGATCGATCGATCGATCGATCG'

# length of test_seq is 36.

test_seq_dseqrecord = Dseqrecord(test_seq).looped()
test_seq_dseqrecord.add_feature(1,36,type_='test', label='test')

print(f'Features before shifting: {test_seq_dseqrecord.features}')

test_seq_shifted = test_seq_dseqrecord.shifted(10)

print(f'Features after shifting: {test_seq_shifted.features}')

Output:

Features before shifting: [SeqFeature(SimpleLocation(ExactPosition(1), ExactPosition(36), strand=1), type='test', qualifiers=...)]

Features after shifting: [SeqFeature(CompoundLocation([SimpleLocation(ExactPosition(27), ExactPosition(36), strand=1), SimpleLocation(ExactPosition(0), ExactPosition(26), strand=1)], 'join'), type='test', location_operator='join', qualifiers=...)]

The issue seems to be in dseqrecord.py where the newstart and newend variables are computed. If the feature length is equal to the sequence length, newstart and newend end up being the same, and the feature isn't added.

pydna/src/pydna/dseqrecord.py

Lines 1092 to 1094 in f8d4850

for location in shiftedparts:
    newstart = location.start % ln
    newend = location.end % ln
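To make the arithmetic concrete, here is a minimal standalone sketch of the modulo step. It does not use pydna: `shift_interval` and `shift_interval_guarded` are hypothetical helpers written only for illustration, and the assumption that each feature part is moved by `shift` before the modulo is taken is mine, not something I checked against the full method. The guarded variant is just one possible way to handle the full-length case, not necessarily how it should be fixed in pydna.

```python
def shift_interval(start, end, shift, ln):
    """Mimic the modulo step for one feature part (hypothetical helper).

    Returns a list of (start, end) tuples on the shifted circular sequence.
    """
    newstart = (start - shift) % ln
    newend = (end - shift) % ln
    if newstart < newend:
        return [(newstart, newend)]
    if newstart > newend:
        # The part wraps around the origin; drop a zero-length second piece.
        parts = [(newstart, ln), (0, newend)]
        return [p for p in parts if p[0] != p[1]]
    # newstart == newend: indistinguishable from an empty interval.
    return []


def shift_interval_guarded(start, end, shift, ln):
    """Same as above, but treats a full-length part explicitly (assumption)."""
    if end - start == ln:
        newstart = (start - shift) % ln
        if newstart == 0:
            return [(0, ln)]
        # Keep the feature's own start point by wrapping around the origin.
        return [(newstart, ln), (0, newstart)]
    return shift_interval(start, end, shift, ln)


ln = 36
print(shift_interval(0, 36, 10, ln))          # [] (start and end both map to 26)
print(shift_interval(1, 36, 10, ln))          # [(27, 36), (0, 26)]
print(shift_interval_guarded(0, 36, 10, ln))  # [(26, 36), (0, 26)]
```

With `ln = 36` and a shift of 10, both ends of the 0..36 feature map to 26, so the part looks empty, while the 1..36 feature maps to 27 and 26 and wraps as in the second output above.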

@BjornFJohansson
Owner

Thank you for your input. There are a couple of open bugs around features at the moment. I'll look into these next week.

@BjornFJohansson
Owner

I think I have this solved. Look out for a new release.

@amijalis
Contributor Author

Thank you! Looking forward to it!

@BjornFJohansson
Owner

Solved in 5.2.0, I hope. Feel free to reopen if not.
