binomen post

sckott · Dec 8, 2015 · f465910 · f465910
1 parent 7ba2a9b
commit f465910
Show file tree

Hide file tree

Showing 410 changed files with 24,650 additions and 14,629 deletions.
diff --git a/_drafts/2015-12-08-binomen-taxonomy-tools.Rmd b/_drafts/2015-12-08-binomen-taxonomy-tools.Rmd
@@ -0,0 +1,186 @@
+---
+name: binomen-taxonomy-tools
+layout: post
+title: binomen - Tools for slicing and dicing taxonomic names
+date: 2015-12-08
+author: Scott Chamberlain
+sourceslug: _drafts/2015-12-08-binomen-taxonomy-tools.Rmd
+tags:
+- R
+- taxonomy
+- split-apply-combine
+---
+
+```{r echo=FALSE}
+knitr::opts_chunk$set(
+  comment = "#>",
+  collapse = TRUE,
+  warning = FALSE,
+  message = FALSE
+)
+```
+
+The first version of `binomen` is now up on [CRAN][binomencran]. It provides various taxonomic classes for defining a single taxon, multiple taxa, and a taxonomic data.frame. It is designed as a companion to [taxize](https://github.com/ropensci/taxize), where you can get taxonomic data on taxonomic names from the web.
+
+The classes (S3):
+
+* `taxon`
+* `taxonref`
+* `taxonrefs`
+* `binomial`
+* `grouping` (i.e., classification - used different term to avoid conflict with classification in `taxize`)
+
+For example, the `binomial` class is defined by a genus, epithet, authority, and optional full species name and canonical version.
+
+```r
+binomial("Poa", "annua", authority="L.")
+```
+
+```r
+<binomial>
+  genus: Poa
+  epithet: annua
+  canonical:
+  species:
+  authority: L.
+```
+
+The package has a suite of functions to work on these taxonomic classes:
+
+* `gethier()` - get hierarchy from a `taxon` class
+* `scatter()` - make each row in taxonomic data.frame (`taxondf`) a separate `taxon` object within a single `taxa` object
+* `assemble()` - make a `taxa` object into a `taxondf` data.frame
+* `pick()` - pick out one or more taxonomic groups
+* `pop()` - pop out (drop) one or more taxonomic groups
+* `span()` - pick a range between two taxonomic groups (inclusive)
+* `strain()` - filter by taxonomic groups, like dplyr's filter
+* `name()` - get the taxon name for each `taxonref` object
+* `uri()` - get the reference uri for each `taxonref` object
+* `rank()` - get the taxonomic rank for each `taxonref` object
+* `id()` - get the reference uri for each `taxonref` object
+
+The approach in this package I suppose is sort of like `split-apply-combine` from `plyr`/`dplyr`, whereas this is aims to make it easy to do with taxonomic names.
+
+## Install
+
+For examples below, you'll need the development version:
+
+```{r eval=FALSE}
+install.packages("binomen")
+```
+
+```{r}
+library("binomen")
+```
+
+## Make a taxon
+
+Make a taxon object
+
+```{r}
+(obj <- make_taxon(genus="Poa", epithet="annua", authority="L.",
+  family='Poaceae', clazz='Poales', kingdom='Plantae', variety='annua'))
+```
+
+Index to various parts of the object
+
+The binomial
+
+```{r}
+obj$binomial
+```
+
+The authority
+
+```{r}
+obj$binomial$authority
+```
+
+The classification
+
+```{r}
+obj$grouping
+```
+
+The family
+
+```{r}
+obj$grouping$family
+```
+
+## Subset taxon objects
+
+Get one or more ranks via `pick()`
+
+```{r}
+obj %>% pick(family)
+obj %>% pick(family, genus)
+```
+
+Drop one or more ranks via `pop()`
+
+```{r}
+obj %>% pop(family)
+obj %>% pop(family, genus)
+```
+
+Get a range of ranks via `span()`
+
+```{r}
+obj %>% span(kingdom, family)
+```
+
+Extract classification as a `data.frame`
+
+```{r}
+gethier(obj)
+```
+
+## Taxonomic data.frame's
+
+Make one
+
+```{r}
+df <- data.frame(order = c('Asterales','Asterales','Fagales','Poales','Poales','Poales'),
+  family = c('Asteraceae','Asteraceae','Fagaceae','Poaceae','Poaceae','Poaceae'),
+  genus = c('Helianthus','Helianthus','Quercus','Poa','Festuca','Holodiscus'),
+  stringsAsFactors = FALSE)
+(df2 <- taxon_df(df))
+```
+
+Parse - get rank order via `pick()`
+
+```{r}
+df2 %>% pick(order)
+```
+
+get ranks order, family, and genus via `pick()`
+
+```{r}
+df2 %>% pick(order, family, genus)
+```
+
+get range of names via `span()`, from rank `X` to rank `Y`
+
+```{r}
+df2 %>% span(family, genus)
+```
+
+Separate each row into a `taxon` class (many `taxon` objects are a `taxa` class)
+
+```{r output.lines=1:20}
+scatter(df2)
+```
+
+And you can re-assemble a data.frame from the output of `scatter()` with `assemble()`
+
+```{r}
+out <- scatter(df2)
+assemble(out)
+```
+
+## Thoughts?
+
+I'm really curious what people think of `binomen`. I'm not sure how useful this will be in the wild. Try it. Let me know. Thanks much :)
+
+[binomencran]: https://cran.rstudio.com/web/packages/binomen