Commit

added collapse use cases
qiyunzhu committed Aug 3, 2021
1 parent 634535d commit 347301f
Showing 1 changed file with 12 additions and 10 deletions.
22 changes: 12 additions & 10 deletions doc/collapse.md
@@ -19,16 +19,18 @@ woltka tools collapse -i input.biom -m mapping.txt -o output.biom

With this tool one can achieve the following goals:

-1. Translate feature IDs into names or descriptions.
-- Example: Translate taxonomic IDs to taxon names.
-- Example: Translate [UniRef](https://www.uniprot.org/help/uniref) IDs to protein names, while **merging** same names.
-
-2. Group lower features into higher categories.
-- Example: Convert genera to families.
-
-3. Convert lower features into higher ones, where each lower feature may correspond to **multiple** higher features.
-- Example: Convert KEGG [orthologs](https://www.genome.jp/kegg/ko.html) to [pathways](https://www.genome.jp/kegg/pathway.html).
-- Example: Convert [GO](http://geneontology.org/docs/ontology-documentation/) terms to [GO Slim](http://www-legacy.geneontology.org/GO.slims.shtml) terms.
+1. Translate feature IDs into names or descriptions. Examples:
+- Translate taxonomic IDs to taxon names.
+- Translate ORF IDs to gene IDs, while **dropping** the unannotated.
+- Translate [UniRef](https://www.uniprot.org/help/uniref) IDs to protein names, while **merging** same names.
+
+2. Group lower features into higher categories. Examples:
+- Group genera into families, then into orders.
+- Group chemical structures by chemical ontology.
+
+3. Convert lower features into higher ones, where each lower feature may correspond to **multiple** higher features. Examples:
+- Convert KEGG [orthologs](https://www.genome.jp/kegg/ko.html) to [pathways](https://www.genome.jp/kegg/pathway.html).
+- Convert [GO](http://geneontology.org/docs/ontology-documentation/) terms to [GO Slim](http://www-legacy.geneontology.org/GO.slims.shtml) terms.

The last usage is an important complement to the main classification workflow, which currently relies on a tree structure and does not support one-to-many mapping. This can be achieved by using the profile collapsing function (although one can only move up one level per run).
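
To illustrate what a one-to-many collapse means, here is a minimal Python sketch. It is not Woltka's implementation: the `collapse` function, its `divide` parameter, the dict-based profile layout, and all feature, pathway, and sample IDs are hypothetical and chosen only for illustration. The sketch assumes that a feature's count is added to every target it maps to (or split evenly when `divide=True`), and that unmapped features are dropped.

```python
from collections import defaultdict


def collapse(profile, mapping, divide=False):
    """Collapse a profile using a one-to-many mapping (illustrative only).

    profile: dict of feature ID -> {sample ID: count}
    mapping: dict of feature ID -> list of target IDs
    divide:  if True, split each count evenly among the feature's targets
             (an assumption of this sketch)
    """
    result = defaultdict(lambda: defaultdict(float))
    for feature, counts in profile.items():
        targets = mapping.get(feature)
        if not targets:
            continue  # features without a mapping are dropped
        share = 1 / len(targets) if divide else 1
        for target in targets:
            for sample, count in counts.items():
                result[target][sample] += count * share
    # convert nested defaultdicts to plain dicts
    return {target: dict(samples) for target, samples in result.items()}


# Hypothetical data: ko_A belongs to two pathways, ko_B to one, orf_C to none.
profile = {
    "ko_A": {"S1": 4, "S2": 2},
    "ko_B": {"S1": 6, "S2": 0},
    "orf_C": {"S1": 1, "S2": 1},
}
mapping = {
    "ko_A": ["path_X", "path_Y"],
    "ko_B": ["path_X"],
}

collapsed = collapse(profile, mapping)
```

In this toy case `path_X` receives counts from both `ko_A` and `ko_B`, while `path_Y` receives counts from `ko_A` only. A tree-based classification cannot express this, since each node has a single parent.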
