annotations crossing overhangs lost #8

Open
aaroncooper opened this issue Jan 24, 2021 · 7 comments
Comments

@aaroncooper

Just started playing around with DnaCauldron. Super cool module! One thing that isn't a showstopper but would be nice to have resolved: annotations crossing overhangs seem to be lost. It'd be nice if all of those were kept. Is this a possibility? I looked for an option in the source but didn't see anything.

@veghp
Member

veghp commented Jan 24, 2021

Hi, thanks for the interest in DnaCauldron! Yes, this is not an option at the moment. One could argue that an annotation spanning a site&overhang gets "destroyed" during restriction and is not valid anymore, but I see why this feature would be useful. A possible solution would be a script that preprocesses the seq records:

  • search for site&overhang patterns and note direction and start/stop positions
  • find annotations spanning these positions
  • for each of these, add new annotations that start from the overhang
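The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions: it assumes BsaI (site GGTCTC, 4-nt overhang starting 1 nt after the site), and the `Feature` and `Overhang` tuples and both function names are hypothetical stand-ins, not part of DnaCauldron or Biopython.

```python
import re
from typing import List, NamedTuple


class Feature(NamedTuple):  # hypothetical stand-in for a record annotation
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive
    label: str


class Overhang(NamedTuple):  # hypothetical helper type
    start: int
    end: int
    direction: int  # +1: site lies upstream of the overhang, -1: downstream


def find_bsai_overhangs(sequence: str) -> List[Overhang]:
    """Step 1: locate site+overhang patterns (BsaI assumed)."""
    seq = sequence.upper()
    overhangs = []
    for match in re.finditer("GGTCTC", seq):  # forward site: GGTCTC N NNNN
        start = match.end() + 1
        if start + 4 <= len(seq):
            overhangs.append(Overhang(start, start + 4, +1))
    for match in re.finditer("GAGACC", seq):  # reverse site: NNNN N GAGACC
        end = match.start() - 1
        if end - 4 >= 0:
            overhangs.append(Overhang(end - 4, end, -1))
    return sorted(overhangs)


def clip_features_to_overhangs(features, overhangs) -> List[Feature]:
    """Steps 2 and 3: for each feature crossing an overhang, build a copy
    that starts (or ends) at the overhang instead of reaching into the site."""
    clipped = []
    for feature in features:
        start, end = feature.start, feature.end
        for overhang in overhangs:
            overlaps = feature.start < overhang.end and overhang.start < feature.end
            if not overlaps:
                continue
            if overhang.direction == +1:
                start = max(start, overhang.start)
            else:
                end = min(end, overhang.end)
        if (start, end) != (feature.start, feature.end) and start < end:
            clipped.append(Feature(start, end, feature.label))
    return clipped


# Toy part: site, spacer, overhang, insert, overhang, spacer, reverse site.
toy = "AA" + "GGTCTC" + "A" + "ATGC" + "T" * 10 + "CGTA" + "T" + "GAGACC" + "AA"
toy_features = [Feature(0, 20, "promoter"), Feature(15, 36, "cds"), Feature(14, 20, "inner")]
new_features = clip_features_to_overhangs(toy_features, find_bsai_overhangs(toy))
```

On real records the clipped copies would be appended to the record's feature list before the assembly is simulated; features that already sit entirely inside the part are left untouched.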

@Zulko
Member

Zulko commented Jan 24, 2021

Destroying a feature when it is getting cut is actually a "feature" of Biopython, but DnaCauldron should fix this via its crop_record_with_saddling_features method, which was specifically written to conserve overhang-crossing features and is used in both StickyEndFragment.list_from_record_digestion and HomologousFragment. From memory it used to work well, but I never wrote a unit test for list_from_record_digestion, so we don't even know if it is currently broken, my bad 😬. @aaroncooper do you have any minimal example you could provide?

> One could argue that an annotation spanning a site&overhang gets "destroyed" during restriction and is not valid anymore

That's true and probably a good call from Biopython, but in most assemblies the overhangs will be flanking the part in the final construct, so if the feature was just limited to the part and its overhangs it will appear identical in the assembly record.

For all the assembly constructs that you've already made and which have "dropped" part features, there is actually an a-posteriori remediation via the copy_features_between_common_blocks method from the Geneblocks library. This feature was written exactly for the purpose of adding back part features at the time when DnaCauldron was dropping cross-overhang part features (which it shouldn't be doing anymore, grmbl grmbl). Just copying the README example:

from geneblocks import CommonBlocks, load_record

# Load the annotated part and the assembled plasmid that lost its features
part = load_record('part.gb', name='insert')
plasmid = load_record('plasmid.gb', name='plasmid')

# Find the blocks shared by both records, then copy features across them
blocks = CommonBlocks.from_sequences([part, plasmid])
new_records = blocks.copy_features_between_common_blocks(inplace=False)
annotated_plasmid = new_records['plasmid']  # Biopython record

@aaroncooper
Author

Thanks for the quick response. I just walked through the code myself to see what you were describing, and it makes sense to me. I made a small example that shows what I'm seeing.

test_dnacauldron.zip

@veghp
Member

veghp commented Jan 24, 2021

Thanks for the clarification and correction. I'll have a look into this method and test it.

@veghp
Member

veghp commented Jan 25, 2021

I tested the fragments with CUBA and can confirm that annotations that overlap with an overhang are lost. This was tested using added annotations that span or partially overlap with the overhang from either side.
I will have a look into the code to find the problem.

@AubinF

AubinF commented Jun 2, 2022

Hi, super cool package indeed, although I just noticed the same issue as Aaron with version 2.0.6. The workaround provided by Zulko works perfectly though 👌 Any ETA for the fix @veghp ? Thanks!

@veghp
Member

veghp commented Jun 2, 2022

Thanks for the feedback and the patience. I'm currently updating it to work on Python 3.9, but the latest Biopython causes an issue with BioBrickStandardAssembly (there is an extra B' in the sticky sequence, possibly due to how the Seq class stores the data now). Once that's fixed, I can look into this.
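One way a stray B' could end up in a sequence string is sketched below. This is an assumption for illustration only: the thread does not confirm the actual cause, and the variable names are hypothetical. It only shows plain Python behaviour when sequence data held as bytes is converted with str() instead of being decoded.

```python
raw = b"aatt"  # sequence data held as bytes (assumed storage, for illustration)

# str() on bytes returns the repr, so the b'...' wrapper leaks into the text,
# and a later upper() turns the prefix into B'.
leaky = str(raw).upper()

# Decoding first yields the bare sequence.
clean = raw.decode("ascii").upper()

print(leaky)  # B'AATT'
print(clean)  # AATT
```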
