Annotation file: Label (in numeric form) not displaying default color options #53

jrdickey9 · 2023-04-17T22:35:55Z

Howdy there,

I am using VizBin to visualize and manually bin MAGs for host-associated bacterial populations. I have a single .fasta file that contains scaffolds from multiple samples: the per-sample .fasta files were combined into one so that I can bin genomes across all samples of interest. My goal was to create an annotation file that reflects this, with each sample, or label (1-9), corresponding to a color. Any color; it really doesn't matter. I created the annotation file from the combined fasta file to preserve the scaffold order and to add other useful properties such as length and GC content.

The issue is that the annotation file seems to be only partly working. The size of each point appears to change with length, which is somewhat helpful, but the labels are not being read.

In the future I would like to add a reference genome of my "bacteria of interest" as a marker to aid my binning (to obtain more complete bins and avoid contamination as much as possible). I have not added this to my annotation file yet, since the labels aren't being read.

Beyond that, I have MANY scaffolds that I am inputting into VizBin (more than 4 million). I have set the minimum contig length to 2 kb or 3 kb. The annotation file and the fasta contain the same number of entries prior to input into VizBin. The minimum contig length does toss out plenty of scaffolds, but not so many that only one label is left.

Below is the head and tails of both my annotation file and the .fasta file.

A) head annotation file
label,length,gc
1,134077,45.42
1,87175,45.16
1,65686,45.71
1,52865,45.92
1,44948,34.86
1,42530,45.30
1,42475,46.38
1,40293,45.94
1,29404,48.00

B) tail annotation file
9,200,56.50
9,200,55.50
9,200,60.00
9,200,53.00
9,200,52.50
9,200,53.00
9,200,42.00
9,200,42.50
9,200,34.00
9,200,57.00

C) head fasta file

>D0_SEK2_2_scaffold_1_c1
CAATCGATACGACCCCGGAGAGCGGCTTTTGCTAAAACTCGAGCAGTTTCTTGAAAACTT
GCTTCTGATATGAAACTTTGAGTATTTAGAGATGCTTTCGTTATTCCCAATAAGATGGCT
CGATAACAGATCGCTTCTTCCAAAGCACGCCCTGTTCGTTCCGCCCGCAACAATTCAATT
AGTTCTCCGGGTGAAAAAACATTAGACATTCTATCTTCTGAAACCAACACTTTTGATGTT
ATTTGACGCACAATAATCTCTATATGTCGATTATGAATCTGCACTCCCTGAGATCGATAA
ACTTTTTGGATCTTATTAACCAAAGAGATACGACTTTGCACTATAGTTAGCTCAGCACCA
ATCAAGAATCCCCAAGGAATTCCAAGAATTTTTGCTATACGCTCGTTCCAACCCTCAATC
CTCTTTTCTAGGTTCATCGATATTGAATCAATCGAACGAACTTCTAACACTTGTTCCACT
TTTGGAAGACCTTGCGTTATATCTCCAGATCTCTATTTTTCATATATAAATGTAACTAAC

D) tail fasta file

>W_SEK2_D15_scaffold_211474_c1
ATATTCCACTCCCATATCTGTGTTATTATTCGTTGCAATAAACCTTCATTACACTTTATA
TGATAGCACGACCATATTCCACTCCCATATCTGTGTTATTATTCGTTGCAATAAACCTTC
ATTACACTTTATATGATAGCACGACCATATTCCACTCCCATATCTGTGTTATTATTCGTT
GCAATAAACCTTCATTACAC
>W_SEK2_D15_scaffold_113313_c1
CAGCAGACCGTGATGTCTTACGCCTGTGTTGCCCTCTACCGCTATGCGGTTGGTAAGCCA
GTGCCAGGGTTCGACCCAACGGCTATGCAGGGAGCGTTCCGAGTGAAGAAGCAGAAGTTC
ACCGGACAAGCCGGAGCCTAATTAGCGCCTAGGGCCACTCCGCGAACGAGAGCCTTCTGG
AAGTTCAGGTAAATGAACAC

note: D0_SEK2_2 is a sample name that I replaced with 1 in the label column of the annotation file. I thought potentially this software wasn't reading the labels correctly due to the underscores or the combination of letters and numbers. However, when it is just numbers, I am failing to get anything.
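For reference, the relabeling described in the note can be sketched roughly as follows. This assumes every header follows the `<sample>_scaffold_<n>_c<k>` pattern seen in the excerpts above; `sample_of` and `numeric_labels` are hypothetical helper names used only for illustration and are not part of VizBin.

```python
def sample_of(header):
    """Return the sample prefix of a scaffold name, e.g. 'D0_SEK2_2'."""
    # Drop a leading '>' if present and keep only the first whitespace-delimited token.
    name = header.lstrip(">").split()[0]
    # Assumption: the sample name is everything before '_scaffold_'.
    return name.split("_scaffold_")[0]


def numeric_labels(headers):
    """Map each header to an integer label, numbered by first appearance."""
    sample_to_label = {}
    labels = []
    for header in headers:
        sample = sample_of(header)
        if sample not in sample_to_label:
            sample_to_label[sample] = len(sample_to_label) + 1
        labels.append(sample_to_label[sample])
    return labels, sample_to_label
```

Numbering by first appearance keeps the labels in the same order as the samples occur in the combined fasta, so the label column lines up with the scaffold order.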

Any help would be great,

Jonathan
Post Doc, UCSD

jrdickey9 · 2023-04-17T22:55:59Z

PS: Here is the png from the output with a csv formatted annotation file and the fasta input.

jrdickey9 · 2023-04-20T19:02:51Z

Resolved: size-filter the fasta file before loading it into VizBin, build the annotation file from the size-filtered fasta, then select the same minimum contig length in VizBin and proceed.
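A minimal sketch of that workflow, in case it helps someone else. It writes the filtered fasta and the annotation file in a single pass so the two stay in the same order with the same number of entries. The function names and the `label_of` callback are hypothetical and not part of VizBin; the `label,length,gc` header is simply the one from my annotation file above.

```python
def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def gc_percent(seq):
    """Return GC content as a percentage of the sequence length."""
    if not seq:
        return 0.0
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def filter_and_annotate(fasta_in, fasta_out, annot_out, min_len, label_of):
    """Keep sequences of at least min_len bases and write one annotation row each.

    label_of is a caller-supplied function mapping a header to its label.
    Returns the number of sequences kept.
    """
    annot_out.write("label,length,gc\n")
    kept = 0
    for header, seq in read_fasta(fasta_in):
        if len(seq) < min_len:
            continue
        fasta_out.write(f">{header}\n{seq}\n")
        annot_out.write(f"{label_of(header)},{len(seq)},{gc_percent(seq):.2f}\n")
        kept += 1
    return kept
```

The handles can be ordinary files opened with `open(...)`. Use the same `min_len` here as the minimum contig length chosen in VizBin, so that VizBin does not drop any further scaffolds after the annotation file has been written.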

claczny · 2023-04-24T09:40:40Z

Hi Jonathan,

thank you for the issue and great to see that you have been able to resolve it.
I was off for a week, so could only reply now.

Indeed, this is the way that I'd have suggested to you too.
It is a point where the UX could surely be improved, but, unfortunately, I currently do not have resources available that I could dedicate to this.

As you mentioned this to be host-associated, my "suspicion" is that the big cluster in the middle might be genomic fragments from the host. Or maybe the distorted "C" shape cluster at 12 o'clock 🤔
Unless you already filtered out reads beforehand, in which case this is a different story 😉

Should you have further questions, please do not hesitate to ask.

Best wishes and stay safe,

Cedric

jrdickey9 · 2023-04-24T16:18:49Z

Thanks Cedric! Cheers, J


jrdickey9 closed this as completed Apr 20, 2023
