Jp documentation #1470

Merged
merged 177 commits into from
Aug 23, 2020

Jessica-Pan
Contributor

Feel free to edit anything and everything, or suggest feedback!

This PR adds anvi'o documentation (for v6.2), written over the course of summer 2020.

@@ -0,0 +1,3 @@
This file contains **the frequency of each amino acid in each gene in your %(contigs-db)s.**

Contributor

I would suggest that any artifact describing text files (especially input text files) includes an example of just a few lines in the file, since it is easier and faster to understand the format by looking at it rather than reading the description.

It is not really necessary for output files, but could still be useful so that people understand what to expect (and perhaps plan how they might use the file downstream).

### Pangenomic Workflows
You can also use bins to group together gene clusters. This is useful if you want a specific group of contigs to remain together through your entire analysis. Just provide your %(internal-genomes)s file to %(anvi-gen-genomes-storage)s.

Wow, this binning thing seems BINcredible! (not sorry)

Contributor

LOL

Member

😂

--full-report kinase_information.txt
--include-sequences
--verbose
{{ codestop }}

Member

Most of these code blocks are missing the line-continuation character (a backslash: \) at the end of each line :)

No one will do this, but in theory, if someone were to copy-paste these lines into their terminal and press enter:

anvi-search-functions -c %(contigs-db)s \
            --search-terms kinase
            --full-report kinase_information.txt
            --include-sequences
            --verbose

The shell would run them as four separate commands:

anvi-search-functions -c %(contigs-db)s --search-terms kinase

--full-report kinase_information.txt

--include-sequences

--verbose

But if it looked like this:

anvi-search-functions -c %(contigs-db)s \
            --search-terms kinase \
            --full-report kinase_information.txt \
            --include-sequences \
            --verbose

Then copy-pasting it into the terminal would run it as a single command:

anvi-search-functions -c %(contigs-db)s --search-terms kinase --full-report kinase_information.txt --include-sequences --verbose

which is the desired behavior :)
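
For anyone who wants to see the effect without running anvi'o, here is a minimal sketch. It assumes a POSIX-style shell; `count_args` is a hypothetical helper (not part of anvi'o) that only reports how many arguments it received, and `contigs.db` is a placeholder file name that is never opened.

```shell
# Hypothetical helper (not part of anvi'o): prints how many arguments it got.
count_args() { echo "$#"; }

# A trailing backslash on every line but the last makes the shell join
# the lines, so all five words reach a single command.
joined=$(count_args -c contigs.db \
            --search-terms kinase \
            --verbose)

# With only the first backslash, the command ends after the second line
# and receives four words. The orphaned --verbose line is left out here,
# because the shell would try to run it as a command of its own.
truncated=$(count_args -c contigs.db \
            --search-terms kinase)

echo "joined: $joined, truncated: $truncated"
```

The same thing happens with the real command: any flag on a line that is not joined by a backslash never reaches anvi-search-functions.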

Contributor Author

Oh whoops! I'll be sure to fix that.


Clustal programs do a great job of visualizing this data, by color coding it. Here is an example from Anvi'o's pangenome display:

A lovely clustal-like alignment from the anvi'o pangenome display

Contributor

Is this a placeholder for an image? :)

Contributor Author

Yep! Andrea said she'd send me an example from the pangenome display soon, so I put that there as a placeholder for the time being.

@@ -0,0 +1,11 @@
This artifact **contains information about the functions of the contigs in your %(contigs-db)s.**

It is less a separate file, and more an extension of your %(contigs-db)s that contains this information.

Contributor

Annotated functions are part of the contigs database; specifically, they go in the gene functions table of the database. So I wouldn't describe them as being in a file, per se; that might confuse some people into thinking they could find said file somewhere on their computer, but they cannot because this information is internal to the database.

This is definitely one of the more abstract artifacts :)

Perhaps a better way to describe functions is "a table in your contigs database that contains functional annotations for genes".

Contributor Author

Oh sorry! This was one of the first ones I wrote, when I thought that all of the artifacts were file types. I'll fix that now.


To get one of these for your %(contigs-db)s, you can either import it (using %(anvi-import-functions)s) or make one yourself by running your contigs against one of two databases available in anvi'o:
* NCBI [COGs database](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC102395/) -- see %(anvi-run-ncbi-cogs)s for instructions
* EBI's [Pfam database](https://pfam.xfam.org/) -- see %(anvi-run-pfams)s for instructions

Contributor

I think anvi-run-kegg-kofams should be in this list, too. It also updates the gene functions table with annotations; in fact, it is essentially the same code as anvi-run-pfams, except that it uses the KOfam database (downloadable from https://www.genome.jp/tools/kofamkoala/ or, more specifically, ftp://ftp.genome.jp/pub/db/kofam/).

Contributor

I guess that means that functions should be in the provides list for anvi-run-kegg-kofams 🤔

Contributor Author

Done :)


This is used to run %(anvi-analyze-synteny)s.

You can also use %(anvi-export-functions)s to view the contents of this file through a %(functions-txt)s artifact.

Contributor

Instead of 'view the contents of this file', I would say 'obtain a file containing functional annotation information' :)

Member

meren commented Aug 23, 2020

@Jessica-Pan, thank you very much for this monumental effort. I am merging this to master and will update the web page to see how it goes.

I will also add you as a collaborator on the anvi'o project, so you will be able to edit the master branch directly or start new PRs within the anvi'o project.

meren merged commit 0729df4 into merenlab:master on Aug 23, 2020