compiled pdf

ropensci · Oct 12, 2012 · commit 04bb5ec
1 parent 3628943
Showing 9 changed files with 1,007 additions and 86 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,4 +1,4 @@
-Package: treebase
+Package: rtreebase
 Type: Package
 Title: An R package for discovery, access and manipulation of online phylogenies
 Version: 0.0-6

diff --git a/inst/doc/treebase/appendix.md b/inst/doc/treebase/appendix.md
@@ -9,19 +9,19 @@ Appendix
 Reproducible computation: A diversification rate analysis
 ---------------------------------------------------------
 
-This appendix illustrates the diversification rate analysis discussed in the text. 
-For completeness we begin by executing the code discussed in the manuscript which 
-locates, downloads, and imports the relevant data: 
+This appendix illustrates the diversification rate analysis discussed in
+the main text.  For completeness we begin by executing the code discussed
+in the manuscript which locates, downloads, and imports the relevant data:
 
 
 
 
 
-Different diversification models make different assumptions 
-about the rate of speciation, extinction, and how these rates may be changing
-over time.  The authors consider eight different models, implemented in the 
-laser package [@Rabosky2006b]. This code fits each of the eight models to that
-data:
+Different diversification models make different assumptions about the rate
+of speciation, extinction, and how these rates may be changing over time.
+The original authors consider eight different models, implemented in the
+laser package [@Rabosky2006b]. This code fits each of the eight models
+to that data:
 
 
 
@@ -42,9 +42,10 @@ models <- list(
 
 
 
-Each of the model estimate includes an AIC score indicating the goodness of
-fit, penalized by model complexity (lower scores indicate better fits)
-We ask R to tell us which model has the lowest AIC score,
+Each model estimate includes an Akaike Information Criterion
+(AIC) score indicating the goodness of fit, penalized by model complexity
+(lower scores indicate better fits). We ask R to tell us which model has
+the lowest AIC score,
 
 
 
@@ -56,19 +57,17 @@ best_fit <- names(models[which.min(aics)])
 
 
 
-and confirm the result presented in @Derryberry2011; 
-that the yule.2.rate model is the best fit to the data.  
+and confirm the result presented in @Derryberry2011: that the best-fit
+model in the laser analysis was a Yule (net diversification rate) model
+with two separate rates.
 
-
-The best-fit model in the laser analysis was a Yule (net diversification
-rate) model with two separate rates.  We can ask ` TreePar ` to see if
-a model with more rate shifts is favoured over this single shift,
-a question that was not possible to address using the tools provided in
-`laser`. The previous analysis also considers a birth-death model that 
-allowed speciation and extinction rates to be estimated separately, but 
-did not allow for a shift in the rate of such a model.  In the main text
-we introduced a model from @Stadler2011 that permitted up to 3 change-points
-in the speciation rate of the Yule model,
+We can use `TreePar` to test whether a model with more rate shifts is
+favoured over this single shift, a question that could not be addressed
+using the tools provided in `laser`. The previous analysis also considered
+a birth-death model that allowed speciation and extinction rates to be
+estimated separately, but did not allow for a shift in the rates of such
+a model.  In the main text we introduced a model from @Stadler2011 that
+permitted up to 3 change-points in the speciation rate of the Yule model,
 
 
 
@@ -100,10 +99,10 @@ birth_death_models <- bd.shifts.optim(x, sampling = c(1,1,1,1),
 
 
 
-The models output by these functions are ordered by increasing number of shifts.  
-We can select the best-fitting model by AIC score, which is slightly cumbersome 
-in `TreePar` syntax.  First compute the AIC scores of both the `yule_models` and the 
-`birth_death_models` we fitted above,
+The models output by these functions are ordered by increasing number
+of shifts.  We can select the best-fitting model by AIC score, which is
+slightly cumbersome in `TreePar` syntax.  First, we compute the AIC scores
+of both the `yule_models` and the `birth_death_models` we fitted above,
 
 
 
@@ -119,8 +118,9 @@ sapply(birth_death_models, function(pars)
 
 
 
-And then generate a list identifying which model has the best (lowest) AIC score among the Yule models and 
-which has the best AIC score among the birth-death models, 
+Then we generate a list identifying which model has the best (lowest)
+AIC score among the Yule models and which has the best AIC score among
+the birth-death models,
 
 
 
@@ -144,8 +144,7 @@ best_model <- which.min(c(min(yule_aic), min(birth_death_aic)))
 
 
 
-which confirms that the Yule 2-rate  
-model is still the best choice based on AIC score.  Of the eight models 
+which confirms that the Yule 2-rate model is still the best choice based on AIC score.  Of the eight models 
 in this second analysis, only three were in the original set considered 
 (Yule 1-rate and 2-rate, and birth-death without a shift), so we could by
 no means have been sure ahead of time that a birth death with a shift, or
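
The AIC-based selection used throughout this appendix can be sketched in isolation. The example below is a minimal illustration only: the model labels, log-likelihoods, and parameter counts are invented and are not output from `laser` or `TreePar`. It applies the standard definition AIC = 2k - 2 log L and then the same `which.min()` idiom the appendix uses to pick the best fit.

```r
# Hypothetical fits: log-likelihood and number of free parameters per model.
# All numbers are invented for illustration; they are not laser results.
fits <- list(
  yule_1_rate = list(logLik = -120.4, k = 1),
  birth_death = list(logLik = -120.1, k = 2),
  yule_2_rate = list(logLik = -112.7, k = 3)
)

# AIC = 2k - 2 * logLik; lower scores indicate better fits.
aics <- sapply(fits, function(f) 2 * f$k - 2 * f$logLik)

# Name of the model with the lowest AIC score.
best_fit <- names(fits)[which.min(aics)]

print(aics)
print(best_fit)
```

Because the penalty term grows with `k`, a model with more parameters is preferred only when its log-likelihood improves by more than one unit per added parameter.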

diff --git a/inst/doc/treebase/elsarticle.latex b/inst/doc/treebase/elsarticle.latex
@@ -106,7 +106,7 @@ $endfor$
   \title{$title$}
   \author[cpb]{Carl Boettiger\corref{cor1}}
   \ead{cboettig@ucdavis.edu}
-  \cortext[cor1]{Corresponding author, cboettig@ucdavis.edu}
+  \cortext[cor1]{Corresponding author}
   \address[cpb]{Center for Population Biology, University of California, Davis, California 95616}
   \author[stats]{Duncan Temple Lang}
   \address[stats]{Department of Statistics, University of California, Davis, California 95616}

diff --git a/inst/doc/treebase/figure1.pdf b/inst/doc/treebase/figure1.pdf
diff --git a/inst/doc/treebase/figure2.pdf b/inst/doc/treebase/figure2.pdf
diff --git a/inst/doc/treebase/treebase.Rmd b/inst/doc/treebase/treebase.Rmd
@@ -299,20 +299,25 @@ look at trends in the submission patterns of publishers over time:
 ````
 
 Many journals have only a few submissions, so we will label any not
-in the top ten contributing journals as “Other”:
+in the top ten contributing journals as "Other":
 
 
 ``` {r top_journals}
     topten <- sort(table(pub), decreasing=TRUE)[1:10]
+    meta[["publisher"]] <- as.character(meta[["publisher"]])
     meta[["publisher"]][!(pub %in% names(topten))] <- "Other"
+    meta[["publisher"]] <- as.factor(meta[["publisher"]])
 ````
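The `as.character()` / `as.factor()` round trip added to this chunk matters when `publisher` is stored as a factor: assigning a value that is not an existing level produces `NA` (with a warning) rather than the new label. A small sketch with a toy vector, where the publisher names are invented and `pub`, `bad`, and `good` are illustrative names only:

```r
# Toy stand-in for meta[["publisher"]]; the values are invented.
pub <- factor(c("A", "A", "B", "C", "A", "B"))

# Keep the two most frequent publishers ("A" and "B" here).
top <- names(sort(table(pub), decreasing = TRUE)[1:2])

# Assigning an unknown level directly to a factor yields NA.
bad <- pub
suppressWarnings(bad[!(pub %in% top)] <- "Other")

# Converting to character first lets the new label through;
# converting back restores a factor with "Other" as a level.
good <- as.character(pub)
good[!(pub %in% top)] <- "Other"
good <- as.factor(good)

print(bad)
print(good)
```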
 
 We plot the distribution of publication years for phylogenies deposited
 in TreeBASE, color coding by publisher in Figure 1.
 
 ``` {r dates, fig.width=8, fig.height=3.5, fig.cap="Histogram of publication dates by year, with the code required to generate the figure.", dev.opts=list(pointsize=8) }
-  library(ggplot2) 
-  ggplot(meta) + geom_bar(aes(date, fill = publisher)) 
+library(ggplot2) 
+library(reshape2)
+df <- acast(meta, date ~ publisher, value.var='publisher', length)
+df <- melt(df, varnames=c("date", "publisher"))
+ggplot(df) + geom_area(aes(x=date,y=value, fill = publisher)) 
 ````
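The `acast()` / `melt()` step in the new plotting code builds a long table of counts for every combination of date and publisher, so that each publisher has a value (possibly zero) at every date before `geom_area()` stacks them. The sketch below shows the same idea in base R, swapping in `table()` for reshape2; the toy `meta` data frame is invented, and this is not the code used in the manuscript.

```r
# Toy metadata: one row per deposited phylogeny (values are invented).
meta <- data.frame(
  date = c(2001, 2001, 2002, 2003, 2003, 2003),
  publisher = c("A", "B", "A", "A", "B", "B")
)

# Cross-tabulate, then flatten to long form with one row per
# (date, publisher) pair, including pairs that never occur.
counts <- as.data.frame(
  table(date = meta$date, publisher = meta$publisher),
  responseName = "value",
  stringsAsFactors = FALSE
)
counts$date <- as.numeric(counts$date)

print(counts)
```

The resulting `counts` data frame has the same three columns (`date`, `publisher`, `value`) that the `ggplot()` call above expects.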
 
 Typically we are interested in the metadata describing the phylogenies
@@ -348,7 +353,7 @@ Reproducible research has become a topic of increasing interest in
 recent years, and facilitating access to data and using scripts that
 can replicate analyses can help lower barriers to the replication of
 statistical and computational results [@Schwab2000; @Gentleman2004;
-@Peng2011b].  The `treebase` package facilitates this process, as we
+@Peng2011a].  The `treebase` package facilitates this process, as we
 illustrate in a simple example.
 
 Consider the shifts in speciation rate identified by @Derryberry2011