Palindromes in Golden Gate assembly, only want one construct #22

Open
stephenturner opened this issue Jun 5, 2024 · 5 comments

@stephenturner

Hello. I have a palindrome overhang in one of my parts. This is causing two constructs to get produced. I have to use this palindrome for my application.

This gives me two assemblies: a correct one with the vector and all the parts, and an incorrect one with the vector placed twice with several parts repeated. I tried creating a simple assembly plan, but I still get the same result.

I can set max_constructs=1, and at least for this example I get the shorter/simpler assembly that I want. But I'm not sure I can guarantee this is how it will behave all the time. Or does it?

Thanks for any help you can provide!

image

@Zulko
Member

Zulko commented Jun 5, 2024

Yeah, DNACauldron was made to catch quirks such as palindromic overhangs and be very vocal and stubborn about them.

Just thinking out loud, but I can't be certain that max_constructs=1 will return the valid construct every time (although it could be possible - might depend on how networkx lists cycles in a graph).

I think your best shot (most explicit and robust) would be to simply inspect the constructs returned:

def pick_the_one_valid_record(records, expected_parts_list):
    """Return the first record whose parts match expected_parts_list, else None."""
    for record in records:
        # Collect the "source" qualifier of every feature that has one.
        parts_list = [f.qualifiers["source"] for f in record.features if "source" in f.qualifiers]
        if parts_list == expected_parts_list:
            return record
    return None

records = ...  # compute the records based on your assembly
valid_record = pick_the_one_valid_record(records, expected_parts_list)

There might also be a way to do this through the DnaCauldron API (using mix.compute_circular_assemblies with fragments_set_filter=), but I'm not sure you want to go there (and it won't work with assembly plans).
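As a sketch of the "inspect the constructs returned" approach, the variant below raises an error unless exactly one construct matches, so a silent wrong pick is not possible. It assumes, as in the snippet above, that each feature stores its part name in a "source" qualifier. `pick_exactly_one` and `fake_record` are hypothetical names, and the records are simple stand-ins rather than real simulation output.

```python
from types import SimpleNamespace


def parts_of(record):
    """List the "source" qualifier of each feature that has one, in feature order."""
    return [f.qualifiers["source"] for f in record.features if "source" in f.qualifiers]


def pick_exactly_one(records, expected_parts_list):
    """Return the single record matching expected_parts_list, or raise ValueError."""
    matches = [r for r in records if parts_of(r) == expected_parts_list]
    if len(matches) != 1:
        raise ValueError(f"expected exactly 1 matching construct, found {len(matches)}")
    return matches[0]


def fake_record(name, sources):
    """Hypothetical stand-in for a construct record, for demonstration only."""
    features = [SimpleNamespace(qualifiers={"source": s}) for s in sources]
    return SimpleNamespace(id=name, features=features)


records = [
    fake_record("double", ["vector", "partA", "partB", "vector", "partA", "partB"]),
    fake_record("single", ["vector", "partA", "partB"]),
]
chosen = pick_exactly_one(records, ["vector", "partA", "partB"])
print(chosen.id)  # prints "single"
```

Raising instead of returning None makes an unexpected result (no match, or several) visible immediately when running many assemblies in a batch.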

@stephenturner
Author

Many thanks for the quick response! This gets the records I want. Excuse the naive question here, but now that I've filtered down to valid records, how can I filter down the simulation object I use to write the report? Following the workflow in the docs:

simulation = assembly.simulate(sequence_repository=repository)

# valid record picking here
# something else here

# show stats
simulation.compute_summary_dataframe()

# Write output
report_writer = dc.AssemblyReportWriter(
    include_fragment_plots='on_error',
    include_assembly_plots=True,
    include_mix_graphs=True, 
    include_part_plots=False,
    include_pdf_report=True
)
simulation.write_report(
    target="output-group1",
    report_writer=report_writer,
)

@stephenturner
Author

Never mind, I think I answered my own question. Simply replacing construct_records with a single-element list containing the valid record does the trick.

simulation.construct_records = [valid_record]
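A slightly more defensive version of that replacement might look like the sketch below. It only relies on the `construct_records` attribute used above. `keep_only` is a hypothetical helper, and the simulation object here is a stand-in, not a real DnaCauldron simulation.

```python
from types import SimpleNamespace


def keep_only(simulation, valid_record):
    """Restrict simulation.construct_records to valid_record, refusing unknown records."""
    if not any(r is valid_record for r in simulation.construct_records):
        raise ValueError("valid_record is not one of the simulation's constructs")
    simulation.construct_records = [valid_record]
    return simulation


# Hypothetical stand-ins, for demonstration only.
good, bad = SimpleNamespace(id="single"), SimpleNamespace(id="double")
simulation = SimpleNamespace(construct_records=[bad, good])
keep_only(simulation, good)
print([r.id for r in simulation.construct_records])  # prints ['single']
```

The identity check guards against accidentally assigning a record that came from a different simulation.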

@veghp
Member

veghp commented Jun 6, 2024

Thanks for posting the code. What do you have in mind for valid record picking? One option I can think of is filtering by expected length (maybe using the sizes of the valid fragments, i.e. the ones with two overhangs and no enzyme sites); the other is checking for the presence of exactly one feature annotation ("From X", etc.) from each part.
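Both options could be sketched roughly as follows. This assumes the part name is available in each feature's "source" qualifier (as in the earlier snippet) and that the expected construct length is already known. The function names, stand-in records, and the length of 30 are all made up for illustration.

```python
from collections import Counter
from types import SimpleNamespace


def has_each_part_once(record, expected_parts):
    """True if each expected part annotates the record exactly once, and no other part does."""
    counts = Counter(
        f.qualifiers["source"] for f in record.features if "source" in f.qualifiers
    )
    return counts == Counter(expected_parts)


def has_expected_length(record, expected_length):
    """True if the record's sequence has exactly the expected length."""
    return len(record.seq) == expected_length


def fake_record(name, sources, length):
    """Hypothetical stand-in for a construct record, for demonstration only."""
    features = [SimpleNamespace(qualifiers={"source": s}) for s in sources]
    return SimpleNamespace(id=name, seq="A" * length, features=features)


parts = ["vector", "partA", "partB"]
records = [
    fake_record("double", parts + parts, 60),
    fake_record("single", parts, 30),
]
valid_ids = [
    r.id for r in records
    if has_each_part_once(r, parts) and has_expected_length(r, 30)
]
print(valid_ids)  # prints ['single']
```

Unlike a direct list comparison, the count-based check does not depend on the order in which the parts are annotated.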

@stephenturner
Author

In my case, I'm performing many simple GG assemblies with a known number of parts and a vector backbone. I've designed overhangs such that the only valid record is the one that includes all the parts. That code above solves my issue. Thanks again!


3 participants