# Total Energy Expenditure by Mass Scatter Plot

The plot that will be reproduced in this example is a log-log scatter plot of total energy expenditure (kcal/day) against mass (kg) as published by Pontzer et al. (2014) in PNAS (doi: [10.1073/pnas.1316940111](http://dx.doi.org/10.1073/pnas.1316940111)). Log-log plots of this sort are fairly common in biological anthropology, and this one has some of the common complications: multiple fitted lines, confidence intervals, different colored points for different groups, and some annotations.

![](images/energyexpenditure_plot.png)

The data underlying this plot are included in the supplementary material of the paper and have been extracted and made available in my bioanth datasets repository.

## Some preliminaries

First, I always change the R option for importing strings (I find that the default convert to factor causes more problems than it solves).

In [1]:
options(stringsAsFactors=F)

Next, import the data. The resulting data frame has five variables: species, order, captive, mass (in kg), and tee (total energy expenditure in kcal/day).

In [16]:
dset = read.csv(url("https://raw.githubusercontent.com/ryanraaum/bioanth-datasets/master/raw/energyexpenditure.csv"))
str(dset)

'data.frame':	86 obs. of  5 variables:
 $ species: chr  "Microcebus_murinus" "Lepilemur_ruficaudatus" "Eulemur" "Lemur_catta" ...
 $ order  : chr  "Primates" "Primates" "Primates" "Primates" ...
 $ captive: chr  "no" "no" "no" "no" ...
 $ mass   : num  0.064 0.77 1.84 2.24 2.21 4.9 7.12 12 72.2 46.6 ...
 $ tee    : num  28 121 146 146 217 ...


Because the target plot shows primates in red and nonprimates in black, it will be useful to have a variable for this color difference. In addition, the nonprimates and the non-captive primates have filled circles while the captive primates have non-filled ones, so a variable for this will be useful as well.

In [13]:
point_color = ifelse(dset$order == "Primates", "red", "black")
table(point_color)

point_color
black   red 
   67    19 

Because there are missing data in the `captive` variable (unknown/missing for the nonprimates), the fill factor has to be constructed a little differently.

In [15]:
point_fill = rep("fill", nrow(dset))
point_fill[dset$captive == "no"] = "open"
table(point_fill)

point_fill
fill open 
  78    8 