binomen post
sckott committed Dec 8, 2015
1 parent 7ba2a9b commit f465910
Showing 410 changed files with 24,650 additions and 14,629 deletions.
186 changes: 186 additions & 0 deletions _drafts/2015-12-08-binomen-taxonomy-tools.Rmd
@@ -0,0 +1,186 @@
---
name: binomen-taxonomy-tools
layout: post
title: binomen - Tools for slicing and dicing taxonomic names
date: 2015-12-08
author: Scott Chamberlain
sourceslug: _drafts/2015-12-08-binomen-taxonomy-tools.Rmd
tags:
- R
- taxonomy
- split-apply-combine
---

```{r echo=FALSE}
knitr::opts_chunk$set(
  comment = "#>",
  collapse = TRUE,
  warning = FALSE,
  message = FALSE
)
```

The first version of `binomen` is now up on [CRAN][binomencran]. It provides taxonomic classes for defining a single taxon, multiple taxa, and a taxonomic data.frame. It is designed as a companion to `taxize`, which retrieves data for taxonomic names from the web.

The classes (S3):

* `taxon`
* `taxonref`
* `taxonrefs`
* `binomial`
* `grouping` (i.e., classification - used different term to avoid conflict with classification in `taxize`)

For example, the `binomial` class is defined by a genus, epithet, authority, and optional full species name and canonical version.

binomial("Poa", "annua", authority="L.")

genus: Poa
epithet: annua
authority: L.
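
To make the idea of an S3 class for a binomial concrete, here is a minimal base-R sketch. It is not how `binomen` implements `binomial`; the names `binomial_sketch()` and `print.binomial_sketch()` are made up for illustration, and the sketch only covers genus, epithet, and authority.

```r
# Hypothetical constructor: an illustration of an S3 class in base R,
# NOT the binomen implementation.
binomial_sketch <- function(genus, epithet, authority = NULL) {
  stopifnot(is.character(genus), length(genus) == 1,
            is.character(epithet), length(epithet) == 1)
  structure(
    list(genus = genus, epithet = epithet, authority = authority),
    class = "binomial_sketch"
  )
}

# Print method: one "field: value" line per non-NULL field.
print.binomial_sketch <- function(x, ...) {
  cat("<binomial_sketch>\n")
  for (field in names(x)) {
    if (!is.null(x[[field]])) {
      cat("  ", field, ": ", x[[field]], "\n", sep = "")
    }
  }
  invisible(x)
}

poa <- binomial_sketch("Poa", "annua", authority = "L.")
print(poa)
```

Keeping the object a plain named list means each part can be reached with `$`, e.g. `poa$genus`.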

The package has a suite of functions to work on these taxonomic classes:

* `gethier()` - get hierarchy from a `taxon` class
* `scatter()` - make each row in taxonomic data.frame (`taxondf`) a separate `taxon` object within a single `taxa` object
* `assemble()` - make a `taxa` object into a `taxondf` data.frame
* `pick()` - pick out one or more taxonomic groups
* `pop()` - pop out (drop) one or more taxonomic groups
* `span()` - pick a range between two taxonomic groups (inclusive)
* `strain()` - filter by taxonomic groups, like dplyr's filter
* `name()` - get the taxon name for each `taxonref` object
* `uri()` - get the reference uri for each `taxonref` object
* `rank()` - get the taxonomic rank for each `taxonref` object
* `id()` - get the reference id for each `taxonref` object

The approach in this package is, I suppose, loosely analogous to the split-apply-combine strategy of `plyr`/`dplyr`, but it aims to make that workflow easy to do with taxonomic names.
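
For readers who have not met split-apply-combine before, here is a small base-R sketch of the pattern on a toy taxonomic data.frame. It uses only `split()`, `lapply()`, and `rbind()`, not `binomen`, `plyr`, or `dplyr`; the data and the variable names (`taxa_df`, `by_family`, `counts`, `result`) are made up for illustration.

```r
# Toy data: one row per record, with family and genus columns.
taxa_df <- data.frame(
  family = c("Asteraceae", "Asteraceae", "Fagaceae", "Poaceae", "Poaceae"),
  genus  = c("Helianthus", "Helianthus", "Quercus", "Poa", "Festuca"),
  stringsAsFactors = FALSE
)

# split: one data.frame per family
by_family <- split(taxa_df, taxa_df$family)

# apply: count the distinct genera within each family
counts <- lapply(by_family, function(d) {
  data.frame(
    family = d$family[1],
    n_genera = length(unique(d$genus)),
    stringsAsFactors = FALSE
  )
})

# combine: bind the per-family results back into one data.frame
result <- do.call(rbind, counts)
rownames(result) <- NULL
result
```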

## Install

For examples below, you'll need the development version:

```{r eval=FALSE}
devtools::install_github("ropensci/binomen")
```


## Make a taxon

Make a taxon object

(obj <- make_taxon(genus="Poa", epithet="annua", authority="L.",
family='Poaceae', clazz='Poales', kingdom='Plantae', variety='annua'))

Index to various parts of the object

The binomial

obj$binomial

The authority

obj$binomial$authority

The classification

obj$grouping

The family

obj$grouping$family

## Subset taxon objects

Get one or more ranks via `pick()`

obj %>% pick(family)
obj %>% pick(family, genus)

Drop one or more ranks via `pop()`

obj %>% pop(family)
obj %>% pop(family, genus)

Get a range of ranks via `span()`

obj %>% span(kingdom, family)
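
To illustrate what picking, dropping, and spanning ranks mean, here is a base-R sketch that works on a plain named character vector rather than a `taxon` object. The helpers `pick_ranks()`, `drop_ranks()`, and `span_ranks()` are hypothetical names for this illustration and are not part of `binomen`; the sketch assumes the vector is ordered from the highest rank to the lowest.

```r
# A classification as a named character vector, highest rank first.
hier <- c(kingdom = "Plantae", order = "Poales", family = "Poaceae",
          genus = "Poa", species = "Poa annua")

# Hypothetical helpers (illustrative only, not binomen functions).

# Keep only the named ranks.
pick_ranks <- function(x, ranks) x[names(x) %in% ranks]

# Remove the named ranks.
drop_ranks <- function(x, ranks) x[!names(x) %in% ranks]

# Keep everything between two ranks, endpoints included.
span_ranks <- function(x, from, to) {
  idx <- match(c(from, to), names(x))
  stopifnot(!anyNA(idx))
  x[seq(min(idx), max(idx))]
}

pick_ranks(hier, c("family", "genus"))
drop_ranks(hier, "family")
span_ranks(hier, "kingdom", "family")
```

Because `span_ranks()` selects by position, its result depends on the ranks being stored in hierarchical order.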

Extract classification as a `data.frame`

gethier(obj)

## Taxonomic data.frames

Make one

df <- data.frame(order = c('Asterales','Asterales','Fagales','Poales','Poales','Poales'),
family = c('Asteraceae','Asteraceae','Fagaceae','Poaceae','Poaceae','Poaceae'),
genus = c('Helianthus','Helianthus','Quercus','Poa','Festuca','Holodiscus'),
stringsAsFactors = FALSE)
(df2 <- taxon_df(df))

Parse - get rank order via `pick()`

df2 %>% pick(order)

Get ranks order, family, and genus via `pick()`

df2 %>% pick(order, family, genus)

Get a range of names via `span()`, from rank `X` to rank `Y`

df2 %>% span(family, genus)

Separate each row into a `taxon` class (many `taxon` objects are a `taxa` class)

```{r output.lines=1:20}
scatter(df2)
```

And you can re-assemble a data.frame from the output of `scatter()` with `assemble()`

out <- scatter(df2)
assemble(out)
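
The scatter-then-assemble round trip can be mimicked in base R. This is a sketch of the concept only, not of `binomen` internals: each row becomes a plain named list instead of a `taxon` object, and the names `ranks_df`, `rows`, and `rebuilt` are made up for illustration.

```r
ranks_df <- data.frame(
  order  = c("Asterales", "Fagales", "Poales"),
  family = c("Asteraceae", "Fagaceae", "Poaceae"),
  genus  = c("Helianthus", "Quercus", "Poa"),
  stringsAsFactors = FALSE
)

# "scatter": one list element per row, each a named list of rank -> name
rows <- lapply(seq_len(nrow(ranks_df)), function(i) as.list(ranks_df[i, ]))

# "assemble": rebuild a data.frame from the row-wise records
rebuilt <- do.call(rbind, lapply(rows, function(r) {
  as.data.frame(r, stringsAsFactors = FALSE)
}))
rownames(rebuilt) <- NULL

# Check that the round trip preserved the data
all.equal(rebuilt, ranks_df)
```

Working row by row like this is convenient when each taxon needs its own processing before the results are recombined.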

## Thoughts?

I'm really curious what people think of `binomen`. I'm not sure how useful this will be in the wild. Try it. Let me know. Thanks much :)

