compiled pdf
cboettig committed Oct 12, 2012
1 parent 3628943 commit 04bb5ec
Showing 9 changed files with 1,007 additions and 86 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -1,4 +1,4 @@
Package: treebase
Package: rtreebase
Type: Package
Title: An R package for discovery, access and manipulation of online phylogenies
Version: 0.0-6
61 changes: 30 additions & 31 deletions inst/doc/treebase/appendix.md
@@ -9,19 +9,19 @@ Appendix
Reproducible computation: A diversification rate analysis
---------------------------------------------------------

This appendix illustrates the diversification rate analysis discussed in the text.
For completeness we begin by executing the code discussed in the manuscript which
locates, downloads, and imports the relevant data:
This appendix illustrates the diversification rate analysis discussed in
the main text. For completeness we begin by executing the code discussed
in the manuscript which locates, downloads, and imports the relevant data:





Different diversification models make different assumptions
about the rate of speciation, extinction, and how these rates may be changing
over time. The authors consider eight different models, implemented in the
laser package [@Rabosky2006b]. This code fits each of the eight models to that
data:
Different diversification models make different assumptions about the rate
of speciation, extinction, and how these rates may be changing over time.
The original authors consider eight different models, implemented in the
laser package [@Rabosky2006b]. This code fits each of the eight models
to that data:



@@ -42,9 +42,10 @@ models <- list(



Each of the model estimate includes an AIC score indicating the goodness of
fit, penalized by model complexity (lower scores indicate better fits)
We ask R to tell us which model has the lowest AIC score,
Each of the model estimate includes an Akaike Information Criterion
(AIC) score indicating the goodness of fit, penalized by model complexity
(lower scores indicate better fits) We ask R to tell us which model has
the lowest AIC score,



@@ -56,19 +57,17 @@ best_fit <- names(models[which.min(aics)])



and confirm the result presented in @Derryberry2011;
that the yule.2.rate model is the best fit to the data.
and confirm the result presented in @Derryberry2011; that the best-fit
model in the laser analysis was a Yule (net diversification rate) model
with two separate rates.
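
As a rough illustration of the AIC comparison described above, the following sketch ranks three models by AIC using base R only. The log-likelihoods and parameter counts are made up for the example, and the names `fits`, `logLik`, and `k` are hypothetical; none of this is output from the `laser` package or a result from @Derryberry2011.

```r
# Hypothetical fitted models: log-likelihood and number of free parameters.
# These numbers are illustrative only, not results from the paper.
fits <- list(
  pureBirth = list(logLik = -120.4, k = 1),
  bd        = list(logLik = -120.1, k = 2),
  yule2rate = list(logLik = -114.9, k = 3)
)

# AIC = 2k - 2 * logLik; lower scores indicate better fits
aics <- sapply(fits, function(f) 2 * f$k - 2 * f$logLik)

# Name of the model with the lowest AIC score
best_fit <- names(fits[which.min(aics)])
print(aics)
print(best_fit)
```

The last assignment mirrors the `names(models[which.min(aics)])` idiom visible in the hunk header above: `which.min` returns the position of the smallest score, and indexing the named list by that position recovers the model name.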


The best-fit model in the laser analysis was a Yule (net diversification
rate) model with two separate rates. We can ask ` TreePar ` to see if
a model with more rate shifts is favoured over this single shift,
a question that was not possible to address using the tools provided in
`laser`. The previous analysis also considers a birth-death model that
allowed speciation and extinction rates to be estimated separately, but
did not allow for a shift in the rate of such a model. In the main text
we introduced a model from @Stadler2011 that permitted up to 3 change-points
in the speciation rate of the Yule model,
We can ask ` TreePar ` to see if a model with more rate shifts is favoured
over this single shift, a question that was not possible to address using
the tools provided in `laser`. The previous analysis also considers a
birth-death model that allowed speciation and extinction rates to be
estimated separately, but did not allow for a shift in the rate of such
a model. In the main text we introduced a model from @Stadler2011 that
permitted up to 3 change-points in the speciation rate of the Yule model,



@@ -100,10 +99,10 @@ birth_death_models <- bd.shifts.optim(x, sampling = c(1,1,1,1),



The models output by these functions are ordered by increasing number of shifts.
We can select the best-fitting model by AIC score, which is slightly cumbersome
in `TreePar` syntax. First compute the AIC scores of both the `yule_models` and the
`birth_death_models` we fitted above,
The models output by these functions are ordered by increasing number
of shifts. We can select the best-fitting model by AIC score, which is
slightly cumbersome in `TreePar` syntax. First, we compute the AIC scores
of both the `yule_models` and the `birth_death_models` we fitted above,



@@ -119,8 +118,9 @@ sapply(birth_death_models, function(pars)



And then generate a list identifying which model has the best (lowest) AIC score among the Yule models and
which has the best AIC score among the birth-death models,
Then we generate a list identifying which model has the best (lowest)
AIC score among the Yule models and which has the best AIC score among
the birth-death models,



@@ -144,8 +144,7 @@ best_model <- which.min(c(min(yule_aic), min(birth_death_aic)))



which confirms that the Yule 2-rate
model is still the best choice based on AIC score. Of the eight models
which still confirms that the Yule 2-rate model is still the best choice based on AIC score. Of the eight models
in this second analysis, only three were in the original set considered
(Yule 1-rate and 2-rate, and birth-death without a shift), so we could by
no means have been sure ahead of time that a birth death with a shift, or
2 changes: 1 addition & 1 deletion inst/doc/treebase/elsarticle.latex
@@ -106,7 +106,7 @@ $endfor$
\title{$title$}
\author[cpb]{Carl Boettiger\corref{cor1}}
\ead{cboettig@ucdavis.edu}
\cortext[cor1]{Corresponding author, cboettig@ucdavis.edu}
\cortext[cor1]{Corresponding author}
\address[cpb]{Center for Population Biology, University of California, Davis, California 95616}
\author[stats]{Duncan Temple Lang}
\address[stats]{Department of Statistics, University of California, Davis, California 95616}
Binary file added inst/doc/treebase/figure1.pdf
Binary file not shown.
Binary file added inst/doc/treebase/figure2.pdf
Binary file not shown.
13 changes: 9 additions & 4 deletions inst/doc/treebase/treebase.Rmd
@@ -299,20 +299,25 @@ look at trends in the submission patterns of publishers over time:
````

Many journals have only a few submissions, so we will label any not
in the top ten contributing journals as Other:
in the top ten contributing journals as "Other":


``` {r top_journals}
topten <- sort(table(pub), decreasing=TRUE)[1:10]
meta[["publisher"]] <- as.character(meta[["publisher"]])
meta[["publisher"]][!(pub %in% names(topten))] <- "Other"
meta[["publisher"]] <- as.factor(meta[["publisher"]])
````
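
To see what the relabelling step in the chunk above does, here is a small self-contained sketch on a toy vector. The data are invented, the cutoff is the top two rather than the top ten, and the names `pub_demo`, `top_demo`, and `publisher_demo` are hypothetical stand-ins for `pub`, `topten`, and `meta[["publisher"]]`.

```r
# Toy publisher labels (invented for illustration)
pub_demo <- c("A", "A", "A", "B", "B", "C", "D")

# Names of the two most frequent publishers
top_demo <- names(sort(table(pub_demo), decreasing = TRUE)[1:2])

# Convert to character first so that a new label can be assigned,
# then relabel everything outside the top set and restore the factor
publisher_demo <- as.character(pub_demo)
publisher_demo[!(pub_demo %in% top_demo)] <- "Other"
publisher_demo <- as.factor(publisher_demo)

print(table(publisher_demo))
```

The round trip through `as.character` matters because assigning a value that is not already a level of a factor yields `NA` with a warning.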

We plot the distribution of publication years for phylogenies deposited
in TreeBASE, color coding by publisher in Figure 1.

``` {r dates, fig.width=8, fig.height=3.5, fig.cap="Histogram of publication dates by year, with the code required to generate the figure.", dev.opts=list(pointsize=8) }
library(ggplot2)
ggplot(meta) + geom_bar(aes(date, fill = publisher))
library(ggplot2)
library(reshape2)
df <- acast(meta, date ~ publisher, value.var='publisher', length)
df <- melt(df, varnames=c("date", "publisher"))
ggplot(df) + geom_area(aes(x=date,y=value, fill = publisher))
````
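
The `acast`/`melt` step in the chunk above builds a table of counts per year and publisher. A rough base R equivalent, shown here on invented data, may help clarify the shape of the result; `meta_demo` and `counts` are hypothetical names, and this is a sketch of the idea rather than the code used for Figure 1.

```r
# Invented metadata: one row per deposited phylogeny
meta_demo <- data.frame(
  date      = c(2001, 2001, 2002, 2002, 2002, 2003),
  publisher = c("A", "Other", "A", "A", "Other", "A")
)

# Cross-tabulate, then flatten to long format: one row per
# (date, publisher) pair, including pairs with a count of zero
counts <- as.data.frame(
  table(date = meta_demo$date, publisher = meta_demo$publisher),
  responseName = "value",
  stringsAsFactors = FALSE
)

# table() stores the years as labels, so convert them back to numbers
counts$date <- as.numeric(counts$date)
print(counts)
```

Keeping the explicit zero rows means every publisher has a value at every year, which is the kind of complete grid a stacked area plot generally expects.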

Typically we are interested in the metadata describing the phylogenies
@@ -348,7 +353,7 @@ Reproducible research has become a topic of increasing interest in
recent years, and facilitating access to data and using scripts that
can replicate analyses can help lower barriers to the replication of
statistical and computational results [@Schwab2000; @Gentleman2004;
@Peng2011b]. The `treebase` package facilitates this process, as we
@Peng2011a]. The `treebase` package facilitates this process, as we
illustrate in a simple example.

Consider the shifts in speciation rate identified by @Derryberry2011
