Fix documentation #7

Merged
merged 7 commits into from
Aug 31, 2022
3 changes: 2 additions & 1 deletion .gitignore
@@ -1,5 +1,6 @@
/Manifest.toml
/docs/build/
/docs/Manifest.toml
/db
/.ipynb_checkpoints
/notebook
/notebook
1 change: 1 addition & 0 deletions .travis.yml
@@ -24,6 +24,7 @@ jobs:
julia --project=docs -e '
using Pkg;
Pkg.develop(PackageSpec(path=pwd()));
pkg"add Taxonomy#master";
Pkg.instantiate();
using Documenter;
using Taxonomy;
116 changes: 0 additions & 116 deletions docs/Manifest.toml

This file was deleted.

1 change: 1 addition & 0 deletions docs/make.jl
@@ -32,6 +32,7 @@ makedocs(;
),
pages=[
"Home" => "index.md",
"API Reference" => "man/api.md"
],
)

56 changes: 45 additions & 11 deletions docs/src/index.md
@@ -8,16 +8,23 @@ Taxonomy.jl is a Julia package to handle NCBI-formatted taxonomic databases.

Currently, this package supports only `scientific name` entries.

Installation
------------
## Installation
Install Taxonomy.jl as follows:
```
julia -e 'using Pkg; Pkg.add("https://github.com/banhbio/Taxonomy.jl")'
julia -e 'using Pkg; Pkg.add("Taxonomy")'
```

Usage
-----
First, you need to download taxonomic data from NCBI's servers (ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz) and store this data in a `Taxonomy.DB` object.
## Usage

### Download database
First, you need to download taxonomic data from NCBI's servers.
```
wget ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz
tar xzvf taxdump.tar.gz
```

### Create `Taxonomy.DB` object
You can create a `Taxonomy.DB` object to store the data.

```julia
# Load the package
@@ -28,6 +35,7 @@ julia> db = Taxonomy.DB("db/nodes.dmp","db/names.dmp") # Create a Taxonomy.DB ob
julia> db = Taxonomy.DB("/your/path/to/db","nodes.dmp","names.dmp") # Alternatively, create the object from the path to the directory and the names of the files
```
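
As a rough illustration of what the two dump files contain, the sketch below parses a few lines in the NCBI dump layout, where fields are separated by `\t|\t` and each line ends with `\t|`. This is not how `Taxonomy.DB` is implemented; `split_dmp`, `parents`, `ranks` and `sci_names` are hypothetical names, and the sample lines are toy data (the `nodes.dmp` lines are truncated to their first three columns).

```julia
# Split one NCBI dump line into its fields.
# Fields are separated by "\t|\t"; each line ends with "\t|".
function split_dmp(line::AbstractString)
    body = endswith(line, "\t|") ? chop(line; tail=2) : line
    return split(body, "\t|\t")
end

# Toy nodes.dmp lines: tax_id, parent tax_id, rank (real lines have more columns).
nodes_lines = [
    "1\t|\t1\t|\tno rank\t|",
    "9605\t|\t207598\t|\tgenus\t|",
    "9606\t|\t9605\t|\tspecies\t|",
]

# Toy names.dmp lines: tax_id, name, unique name, name class.
names_lines = [
    "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|",
    "9606\t|\thuman\t|\t\t|\tgenbank common name\t|",
]

parents = Dict{Int,Int}()
ranks = Dict{Int,Symbol}()
for line in nodes_lines
    f = split_dmp(line)
    id = parse(Int, f[1])
    parents[id] = parse(Int, f[2])
    ranks[id] = Symbol(f[3])
end

# Keep only the "scientific name" class, mirroring the restriction stated above.
sci_names = Dict{Int,String}()
for line in names_lines
    f = split_dmp(line)
    f[4] == "scientific name" || continue
    sci_names[parse(Int, f[1])] = String(f[2])
end

println(parents[9606], " ", ranks[9606], " ", sci_names[9606])
```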

### Taxon
You can construct a `Taxon` object from its taxonomic identifier and the `Taxonomy.DB` object.


@@ -41,7 +49,8 @@ julia> gorilla = Taxon(9593, db) # species Gorilla gorilla
julia> bacillus = Taxon(1386,db) # genus Bacillus
1386 [genus] Bacillus
```
Each `Taxon` object has four fields: `taxid`, `name`, `rank` and `db`. The field `db` is hidden by `print()` and similar functions.

Each `Taxon` object has four fields: `taxid`, `name`, `rank` and `db`.

```julia
julia> @show human
@@ -59,14 +68,17 @@ human.rank = :species
julia> @show human.db
human.db = Taxonomy.DB("db/nodes.dmp","db/names.dmp")
```

Functions such as `rank`, `parent` and `children` return information about a taxon.

```julia
julia> rank(gorilla)
:species

julia> parent(gorilla)
9592 [genus] Gorilla
```

```julia
julia> children(bacillus)
249-element Array{Taxon,1}:
@@ -83,9 +95,17 @@ julia> children(bacillus)
1522308 [species] Bacillus niameyensis
324767 [species] Bacillus infantis
```

Also, you can get the lowest common ancestor (LCA) of taxa.
```julia
julia> lca(human, gorilla)
207598 [subfamily] Homininae

julia> lca(human, gorilla, bacillus) # You can input as many as you want.
131567 [no rank] cellular organisms

julia> lca([human, gorilla, bacillus]) # Vector of taxon is also ok.
131567 [no rank] cellular organisms
```
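
The idea behind an LCA computed from lineages can be sketched without the package: build each root-to-taxon path, intersect the paths, and take the deepest shared entry. The code below is a minimal sketch under that assumption, not the package's implementation; `PARENT`, `toy_lineage` and `toy_lca` are hypothetical names, and the parent map is a heavily collapsed toy tree that only reuses the taxids shown above.

```julia
# Toy parent map (child taxid => parent taxid); the root points to itself.
# Many intermediate ranks are collapsed for brevity.
const PARENT = Dict(
    1 => 1,            # root
    131567 => 1,       # cellular organisms
    1386 => 131567,    # Bacillus
    9604 => 131567,    # Hominidae
    207598 => 9604,    # Homininae
    9606 => 207598,    # Homo sapiens
    9592 => 207598,    # Gorilla
    9593 => 9592,      # Gorilla gorilla
)

# Walk up to the root and return the path ordered from root to taxon.
function toy_lineage(taxid::Int)
    path = [taxid]
    while PARENT[path[end]] != path[end]
        push!(path, PARENT[path[end]])
    end
    return reverse(path)
end

# The shared entries of root-to-taxon paths form a common prefix,
# so the last shared entry is the lowest common ancestor.
function toy_lca(taxids::Int...)
    lineages = [toy_lineage(t) for t in taxids]
    overlap = intersect(lineages...)
    return last(overlap)
end

println(toy_lca(9606, 9593))
println(toy_lca(9606, 9593, 1386))
```

`intersect` keeps the order of its first argument, which is why taking the last shared element is enough in this sketch.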

Functions from `AbstractTrees.jl` can also be used.
@@ -118,7 +138,11 @@ julia> print_tree(homininae)
├─ 406788 [subspecies] Gorilla gorilla diehli
└─ 9595 [subspecies] Gorilla gorilla gorilla
```

### Lineage

Lineage information can be acquired by using `Lineage()`.

```julia
julia> lineage = Lineage(gorilla)
32-element Lineage:
@@ -138,7 +162,9 @@ julia> lineage = Lineage(gorilla)
9592 [genus] Gorilla
9593 [species] Gorilla gorilla
```

The `Lineage` struct stores lineage information in a `Vector`-like format.

```julia
julia> lineage[1]
1 [no rank] root
@@ -149,7 +175,9 @@ julia> lineage[9]
julia> lineage[end]
9593 [species] Gorilla gorilla
```
You can also access `Lineage` using a `Symbol`, such as `:superkingdom`, `:family`, `:genus` or `:species` (only symbols in `CanonicalRank` can be used).

You can also access a `Taxon` in the `Lineage` using a `Symbol`, such as `:superkingdom`, `:family`, `:genus` or `:species` (only symbols in `CanonicalRank` can be used).

```julia
julia> CanonicalRank
10-element Array{Symbol,1}:
@@ -170,7 +198,9 @@ julia> lineage[:order]
julia> lineage[:genus]
9592 [genus] Gorilla
```
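
One way to picture positional and rank-based access side by side is a vector paired with a rank-to-position dictionary. The sketch below assumes that layout and adds a `get` with a fallback value; `ToyLineage` and its fields are hypothetical stand-ins that hold plain strings instead of `Taxon` objects, and the error types caught in `get` are an assumption rather than the package's behavior.

```julia
# A vector of names plus a rank => position lookup table.
struct ToyLineage
    line::Vector{String}
    index::Dict{Symbol,Int}
end

# Positional access, as with a Vector.
Base.getindex(l::ToyLineage, i::Int) = l.line[i]
# Rank access: translate the rank to a position first.
Base.getindex(l::ToyLineage, r::Symbol) = l.line[l.index[r]]

# Return `default` when the position or rank is absent.
function Base.get(l::ToyLineage, idx::Union{Int,Symbol}, default)
    try
        return l[idx]
    catch err
        # Only swallow "missing entry" errors; anything else is re-raised.
        err isa Union{KeyError,BoundsError} || rethrow()
        return default
    end
end

toy = ToyLineage(
    ["root", "Primates", "Hominidae", "Gorilla", "Gorilla gorilla"],
    Dict(:order => 2, :family => 3, :genus => 4, :species => 5),
)

println(toy[1])
println(toy[:genus])
println(get(toy, :phylum, "no phylum in this toy lineage"))
```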

You can use `Between`, `From`, `Until`, `Cols` and `All` selectors in more complex rank selection scenarios.

```julia
julia> lineage[Between(:order,:genus)]
8-element Lineage:
@@ -208,7 +238,9 @@ julia> lineage[Until(:class)]
32524 [clade] Amniota
40674 [class] Mammalia
```
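
Range-style selectors can be thought of as thin wrappers that are translated into an integer range before slicing. The following is a sketch of that idea only; `ToyFrom`, `ToyUntil` and `RankedPath` are hypothetical names, and unlike the real selectors this version returns a plain `Vector{String}` rather than a `Lineage`.

```julia
# Selector wrappers holding either a position (Int) or a rank (Symbol).
struct ToyFrom{T}; first::T; end
struct ToyUntil{T}; last::T; end

struct RankedPath
    line::Vector{String}
    index::Dict{Symbol,Int}
end

# Plain range slicing is the base case.
Base.getindex(p::RankedPath, r::UnitRange{Int}) = p.line[r]
# Int selectors become ranges; Symbol selectors are first mapped to positions.
Base.getindex(p::RankedPath, s::ToyFrom{Int}) = p[s.first:length(p.line)]
Base.getindex(p::RankedPath, s::ToyFrom{Symbol}) = p[ToyFrom(p.index[s.first])]
Base.getindex(p::RankedPath, s::ToyUntil{Int}) = p[1:s.last]
Base.getindex(p::RankedPath, s::ToyUntil{Symbol}) = p[ToyUntil(p.index[s.last])]

path = RankedPath(
    ["root", "Mammalia", "Primates", "Hominidae", "Gorilla"],
    Dict(:class => 2, :order => 3, :family => 4, :genus => 5),
)

println(path[ToyUntil(:class)])
println(path[ToyFrom(:family)])
```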

You can reformat a lineage to a chosen set of ranks with `reformat()`.

```julia
julia> myrank = [:superkingdom, :phylum, :class, :order, :family, :genus, :species]

@@ -221,8 +253,10 @@ julia> reformat(lineage, myrank)
9604 [family] Hominidae
9592 [genus] Gorilla
9593 [species] Gorilla gorilla
```
If the lineage has no taxon for one of your ranks, an `UnclassifiedTaxon` is stored in its place.
```

If the lineage has no taxon for one of your ranks, an `UnclassifiedTaxon` is stored in its place.

```julia
julia> uncultured_bacillales = Taxon(157472,db)
157472 [species] uncultured Bacillales bacterium
@@ -236,4 +270,4 @@ julia> reformated = reformat(Lineage(uncultured_bacillales), myrank)
Unclassified [family] unclassified Bacillales family
Unclassified [genus] unclassified Bacillales genus
157472 [species] uncultured Bacillales bacterium
```
```
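
The placeholder behavior can be illustrated with a small standalone sketch: walk the requested ranks in order, copy the taxon when the lineage has one, and otherwise emit a placeholder named after the last rank that was found. This is an assumption about the naming rule inferred from the output above, not the package's code; `toy_reformat` is a hypothetical function and the lineage is toy data using rank/name pairs.

```julia
# Reformat a lineage (rank => name pairs) to the requested ranks,
# filling gaps with "unclassified <last known name> <rank>" placeholders.
function toy_reformat(lineage::Vector{Pair{Symbol,String}}, ranks::Vector{Symbol})
    by_rank = Dict(lineage)
    out = Pair{Symbol,String}[]
    last_known = ""
    for r in ranks
        if haskey(by_rank, r)
            last_known = by_rank[r]
            push!(out, r => last_known)
        else
            push!(out, r => "unclassified $(last_known) $(r)")
        end
    end
    return out
end

# Toy lineage with no family or genus entry.
toy_lineage_pairs = [
    :superkingdom => "Bacteria",
    :phylum => "Firmicutes",
    :class => "Bacilli",
    :order => "Bacillales",
    :species => "uncultured Bacillales bacterium",
]
toy_ranks = [:superkingdom, :phylum, :class, :order, :family, :genus, :species]

for (rank, name) in toy_reformat(toy_lineage_pairs, toy_ranks)
    println("[", rank, "] ", name)
end
```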
13 changes: 13 additions & 0 deletions docs/src/man/api.md
@@ -0,0 +1,13 @@
# API Reference

## Public
```@autodocs
Modules = [Taxonomy]
Private = false
```

## Internal
```@autodocs
Modules = [Taxonomy]
Public = false
```
1 change: 0 additions & 1 deletion src/Taxonomy.jl
@@ -9,7 +9,6 @@ export CanonicalRank,
Lineage,
taxid, name, rank, parent, children, lca,
reformat, print_lineage, isdescendant, isancestor,
children, print_tree,
All, Between, Cols,
From, Until

1 change: 0 additions & 1 deletion src/database.jl
@@ -21,7 +21,6 @@ DB(db_path::String, nodes_dmp::String, names_dmp::String)
Create a `DB` (taxonomy database) object from nodes.dmp and names.dmp files.
You can specify the paths of the nodes.dmp and names.dmp files, or the directory where they exist and the names.
"""

function DB(nodes_dmp::String, names_dmp::String)
@assert isfile(nodes_dmp)
@assert isfile(names_dmp)
2 changes: 0 additions & 2 deletions src/lca.jl
@@ -4,8 +4,6 @@

Return the `Taxon` object that is the lowest common ancestor of the given set of `Taxon`s
"""


function lca(taxa::Vector{Taxon})
lineages = [Lineage(taxon) for taxon in taxa]
overlap = intersect(lineages...)
10 changes: 6 additions & 4 deletions src/lineage.jl
@@ -63,6 +63,11 @@ Base.getindex(l::Lineage, idx::From{Symbol}) = getindex(l, From(l.index[idx.firs
Base.getindex(l::Lineage, idx::Until{Int}) = l[1:idx.last]
Base.getindex(l::Lineage, idx::Until{Symbol}) = getindex(l, Until(l.index[idx.last]))

"""
get(l::Lineage, idx::Union{Int,Symbol}, default)

Return the `Taxon` stored at the given position or rank (e.g. `:phylum`) of the lineage, or the given default value if the lineage has no such entry.
"""
function Base.get(l::Lineage, idx::Union{Int,Symbol}, default::Any)
try
return getindex(l,idx)
@@ -76,7 +81,6 @@ end

Return the `Lineage` object reformatted according to the given ranks.
"""

function reformat(l::Lineage, ranks::Vector{Symbol})
line = AbstractTaxon[]
idx = Dict{Symbol,Int}()
@@ -112,7 +116,6 @@ Print a formatted representation of the lineage to the given `IO` object.
* `fill::Bool = false` - If true, prints `UnclassifiedTaxon`. Only available when `skip` is false.
* `skip::Bool = false` - If true, skip printing `UnclassifiedTaxon` and delimiter.
"""

function print_lineage(io::IO, lineage::Lineage; delim::AbstractString=";", fill::Bool=false, skip::Bool=false)
name_line = String[]
for taxon in lineage
Expand Down Expand Up @@ -150,14 +153,13 @@ Base.show(io::IO, lineage::Lineage) = print_lineage(io, lineage)
isdescendant(descendant::Taxon, ancestor::Taxon)

Return true if the former taxon is a descendant of the latter taxon.
This function is overloaded because native AbstractTrees.isdescendant is too slow
"""
# overload because native AbstractTrees.isdescendant is too slow
AbstractTrees.isdescendant(descendant::Taxon, ancestor::Taxon) = ancestor in Lineage(descendant)

"""
isancestor(ancestor::Taxon, descendant::Taxon)

Return true if the former taxon is an ancestor of the latter taxon.
"""

isancestor(ancestor::Taxon, descendant::Taxon) = AbstractTrees.isdescendant(descendant, ancestor)
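
The membership test used above can be tried on a toy tree without the package. This is a sketch under the assumption that a lineage is the root-to-taxon path including the taxon itself; `TOY_PARENT`, `toy_path`, `toy_isdescendant` and `toy_isancestor` are hypothetical names.

```julia
# Toy parent map (child => parent); the root points to itself.
const TOY_PARENT = Dict(1 => 1, 9604 => 1, 9592 => 9604, 9593 => 9592, 1386 => 1)

# Root-to-taxon path, including the taxon itself.
function toy_path(taxid::Int)
    path = [taxid]
    while TOY_PARENT[path[end]] != path[end]
        push!(path, TOY_PARENT[path[end]])
    end
    return reverse(path)
end

# A taxon descends from `ancestor` if `ancestor` appears on its path.
toy_isdescendant(descendant::Int, ancestor::Int) = ancestor in toy_path(descendant)
toy_isancestor(ancestor::Int, descendant::Int) = toy_isdescendant(descendant, ancestor)

println(toy_isdescendant(9593, 9604))
println(toy_isancestor(1386, 9593))
```

One path walk per query replaces a traversal of the whole subtree under the ancestor. Because the path includes the taxon itself, this sketch also treats every taxon as its own descendant.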