Feature: Output Pangenome sequence #49

josiahseaman · 2019-10-08T15:11:03Z

Would you be so kind as to modify ODGI bin so that, at 1bp bins, it outputs a pangenome sequence? It would be the concatenation of all node sequences in the order in which you already sorted them: a single long string.

# Python pseudocode for pangenome matrix sequence
with open(bin_output_file, 'w') as out:
	# Existing bin output, written in the current sort order
	out.write(regular_bin_output(my_sort_order))
	if bin_size == 1:
		out.write("'Pangenome Sequence':")
		# Concatenate all node sequences in sort order into one string
		pangenome = ''.join(node.seq for node in my_sort_order)
		out.write(pangenome)

This could be triggered at every bin size or only when bin_size = 1bp. With real data, the sequence length tends to be 120% of the size of the starting genome, so 100s of MB, not 100s of GB. Perhaps a command line flag? --emit_sequence
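To make the request concrete, here is a minimal sketch of how an opt-in flag could gate the sequence output. It is not ODGI's implementation: `Node`, `write_pangenome_sequence`, `build_parser`, and `--bin_size` are hypothetical names made up for illustration, and only `--emit_sequence` comes from the suggestion above. It streams each node sequence to the output instead of joining them first, which avoids holding a second copy of a string that may run to hundreds of MB.

```python
import argparse
import io
from dataclasses import dataclass
from typing import Iterable, TextIO


@dataclass
class Node:
    """Hypothetical stand-in for a graph node; only the sequence matters here."""
    seq: str


def write_pangenome_sequence(nodes: Iterable[Node], out: TextIO) -> int:
    """Write node sequences to `out` in the given order; return bases written."""
    total = 0
    for node in nodes:
        out.write(node.seq)
        total += len(node.seq)
    return total


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI: --emit_sequence is the flag proposed in this issue."""
    parser = argparse.ArgumentParser(description="Sketch of a binning CLI")
    parser.add_argument("--bin_size", type=int, default=1,
                        help="bin width in bp (assumed name)")
    parser.add_argument("--emit_sequence", action="store_true",
                        help="append the concatenated pangenome sequence")
    return parser


def emit_if_requested(nodes: Iterable[Node], args: argparse.Namespace,
                      out: TextIO) -> None:
    """Append the labelled sequence only for 1bp bins with the flag set."""
    if args.emit_sequence and args.bin_size == 1:
        out.write("'Pangenome Sequence':")
        write_pangenome_sequence(nodes, out)
        out.write("\n")


if __name__ == "__main__":
    demo_nodes = [Node("ACG"), Node("T"), Node("GGA")]
    demo_args = build_parser().parse_args(["--emit_sequence"])
    buffer = io.StringIO()
    emit_if_requested(demo_nodes, demo_args, buffer)
    print(buffer.getvalue(), end="")
```

Whether the sequence belongs behind a flag or in the default output is a design choice for the maintainers; the sketch only shows that either option is a small change once the sort order is available.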


ekg · 2019-10-08T16:24:17Z

I think this should be the default behavior. I can make it so that the sequence ends up in the bin output.


write bin sequences in bin -j output to resolve #49

ekg closed this as completed in 38b8535 Oct 16, 2019

ekg added a commit that referenced this issue Oct 16, 2019

Merge pull request #53 from vgteam/bin

463ba5b

write bin sequences in bin -j output to resolve #49

josiahseaman mentioned this issue Mar 19, 2020

Special case for bin_width=1 sequence output #88

Closed
