Annotation file #42

Closed
ShrutiKetan opened this issue Apr 25, 2019 · 7 comments

@ShrutiKetan

Annotation file
After MaxBin, the binned files were concatenated into one file to serve as input for VizBin, with the contig IDs as headers (e.g., >k123, >k99, etc.).
The annotation file was made with labels as the first line, followed by 1-11 as categorical variables. But after multiple tries, VizBin still shows only blue and red. Can you please help me figure it out?

Thank you very much

@claczny
Owner

claczny commented Apr 25, 2019

Hi @ShrutiKetan,

could you please paste the first 20-or-so lines of your annotation file?

Thank you!

@claczny claczny self-assigned this Apr 25, 2019
@ShrutiKetan
Author

ShrutiKetan commented Apr 25, 2019 via email

@claczny
Owner

claczny commented Apr 25, 2019

Hmmm... 🤔

The first lines look about right...

Some thoughts:

  • Is the number of lines in the annotation file equal to number-of-contigs-in-contig-fa + 1?
  • If you do something like cat annotation.csv | sort | uniq, how many entries do you see? I would expect:
001
002
003
...
...
011
label
  • Are all your contigs >= minimum length you specify in VizBin (default: >= 1,000 nt)?
  • If you right-click in the visualization and show the legend, what do you see there? This should list 001, 002, ..., 011 in your case.
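The first two checks above can also be scripted. The following is a minimal sketch, assuming a plain-text FASTA file and a one-column annotation file with a header line; the function names and the file names in the usage note are hypothetical and not part of VizBin.

```python
from collections import Counter


def count_fasta_records(fasta_path):
    """Count sequences by counting header lines that start with '>'."""
    with open(fasta_path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def summarize_annotation(annotation_path):
    """Return (header, Counter of labels) for a one-column annotation file."""
    with open(annotation_path) as handle:
        lines = [line.strip() for line in handle if line.strip()]
    if not lines:
        raise ValueError("annotation file is empty")
    return lines[0], Counter(lines[1:])


def check_annotation(fasta_path, annotation_path):
    """Compare the number of contigs with the number of label lines."""
    n_contigs = count_fasta_records(fasta_path)
    header, label_counts = summarize_annotation(annotation_path)
    n_labels = sum(label_counts.values())
    return {
        "contigs": n_contigs,
        "header": header,
        "label_lines": n_labels,
        "distinct_labels": sorted(label_counts),
        # True when there is exactly one label line per contig
        "line_count_matches": n_labels == n_contigs,
    }
```

Calling `check_annotation("contigs.fa", "annotation.csv")` returns a small dictionary; `line_count_matches` should be true and `distinct_labels` should list one entry per bin.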

Alternatively, I would be happy to have a look at your files, if it would be possible for you to share them confidentially.

Best,

Cedric

@ShrutiKetan
Author

ShrutiKetan commented Apr 25, 2019 via email

@claczny
Owner

claczny commented Apr 25, 2019

> Are all your contigs >= minimum length you specify in VizBin (default: >= 1,000 nt)?
>
> About this, not really! I have varied lengths. I am sorry, I assumed that VizBin would just ignore contigs below the set threshold.

VizBin does ignore contigs below the threshold. However, when an annotation file is used, only the short sequences are dropped, not their annotation lines, so the remaining annotations may no longer line up with the remaining sequences. This could certainly be improved in VizBin.

Hence, your issue might be due to this.
To overcome this, you might

  1. filter the individual MaxBin bins according to length (e.g., >= 1,000 nt)
  2. create per-bin annotation files, i.e., containing your categorical label (no header) as often as there are sequences in the filtered bin
  3. concatenate the filtered bins -> contigs.size_selected.fa
  4. concatenate the annotation files -> annotation.csv
  5. put a "label" header into annotation.csv
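The five steps above could be sketched roughly as follows. This is only an illustration under a few assumptions: each bin is a FASTA file, labels are generated as zero-padded bin numbers (001, 002, ...), and the per-bin annotation files are skipped by writing the labels straight into the combined annotation file. The function names and the `min_length` parameter are hypothetical.

```python
def read_fasta(path):
    """Yield (header, sequence) pairs; sequences may span several lines."""
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line, []
            elif header is not None:
                chunks.append(line.strip())
    if header is not None:
        yield header, "".join(chunks)


def build_vizbin_input(bin_paths, fasta_out, annotation_out, min_length=1000):
    """Filter each bin by length and write matching FASTA and annotation files."""
    counts = {}
    with open(fasta_out, "w") as fa, open(annotation_out, "w") as ann:
        ann.write("label\n")  # step 5: header line
        for index, path in enumerate(bin_paths, start=1):
            label = f"{index:03d}"  # assumed label format, one label per bin
            kept = 0
            for header, seq in read_fasta(path):
                if len(seq) >= min_length:  # step 1: length filter
                    fa.write(f"{header}\n{seq}\n")  # step 3: concatenated bins
                    ann.write(f"{label}\n")  # steps 2 and 4: one label per kept contig
                    kept += 1
            counts[label] = kept
    return counts
```

Because each label is written at the same moment as its sequence, the annotation file always has exactly one line per retained contig plus the header.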

> I have attached an image of what I see in the visualization window.

I could unfortunately not see it attached. I assume this is because you replied via email and maybe GitHub's Issue system does not support this. Feel free to add the image to this issue (#42).

Depending on the size of contig.fa you might also give BusyBee Web a try. It complements some of the shortcomings of VizBin, it is web-based and thus has no user-level dependencies, but it provides less flexibility and has a maximum upload limit.

Hope this helps.

If your issue is solved, please do not forget to close it.
Should you have further questions, kindly let me know.

Best,

Cedric

@ShrutiKetan
Author

ShrutiKetan commented Apr 26, 2019 via email

@claczny
Owner

claczny commented Apr 26, 2019

Hi @ShrutiKetan,

that's great to hear!

Could you please share briefly how you solved your issue?
I will also close the issue, but feel free to post related questions after that, if needed.

Regarding the binning of fungal genomes, I have to admit that I do not have much experience with this.
I know from experiences shared by others that fungal bins frequently separate rather clearly from bacterial bins, e.g., in the case of bacterial "symbionts" in/on fungi, but in these cases, the fungal organism was the host.

I assume that in your case, you have some microbiome derived from some "complex" host/environment (e.g., from an insect, mammal, or waste water)?
It also depends on the complexity/diversity of the microbiome.
Unfortunately (or maybe fortunately, depending on one's point-of-view), there exists currently no single-best binning solution, i.e., one should try multiple tools.
This is also why "binning consolidators", such as DAS Tool (https://www.nature.com/articles/s41564-018-0171-1), have been developed.

Best,

Cedric

@claczny claczny closed this as completed Apr 26, 2019