tdhock/idioms

This page catalogs a set of idioms that I have found useful in R.

Accumulating data.frames

To build a large data.frame from many smaller data.frames, you may be tempted to do

big.df <- NULL
for(i in i.vec){
  for(j in j.vec){
    small.df <- data.frame(i, j)
    big.df <- rbind(big.df, small.df)
  }
}

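It works, but each call to rbind copies the entire accumulated data.frame, so the total time grows roughly quadratically with the number of small data.frames. It is faster to store the pieces in a list and combine them once at the end: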

big.df.list <- list()
for(i in i.vec){
  for(j in j.vec){
    small.df <- data.frame(i, j)
    big.df.list[[paste(i, j)]] <- small.df
  }
}
big.df <- do.call(rbind, big.df.list)
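To see the difference on your own machine, the two approaches can be timed side by side. The following is a minimal sketch: the index vectors i.vec and j.vec are hypothetical values chosen only for illustration, and the timings will vary with their length and with your hardware.

```r
# Hypothetical index vectors, chosen only for illustration.
i.vec <- 1:30
j.vec <- 1:30

# Approach 1: grow the data.frame with rbind inside the loop.
grow.time <- system.time({
  big.df.grow <- NULL
  for(i in i.vec){
    for(j in j.vec){
      big.df.grow <- rbind(big.df.grow, data.frame(i, j))
    }
  }
})

# Approach 2: store the pieces in a list, then rbind once at the end.
list.time <- system.time({
  big.df.list <- list()
  for(i in i.vec){
    for(j in j.vec){
      big.df.list[[paste(i, j)]] <- data.frame(i, j)
    }
  }
  big.df.bind <- do.call(rbind, big.df.list)
})

# The list approach uses the list names as row names, so drop the
# row names before comparing the two results.
rownames(big.df.grow) <- NULL
rownames(big.df.bind) <- NULL
print(all.equal(big.df.grow, big.df.bind))

# Elapsed seconds for each approach.
print(rbind(grow=grow.time, list=list.time)[, "elapsed"])
```

Increasing the length of i.vec and j.vec should make the gap between the two approaches more visible.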

Recording package dependencies

Because the install.packages function does not let you request a specific package version, it is difficult to conduct truly reproducible research using R. In one of your scripts you may be tempted to write

library(ggplot2)

to indicate that your code uses the ggplot2 package. But there were major backwards-incompatible changes to the ggplot2 package in 2015. How will the future users of your code (including your future self) know which version to use?

Instead, I recommend writing the following at the top of your R script. It records the version of R, followed by the version of each package that comes from a CRAN-like repository.

works_with_R("3.2.3", ggplot2="1.0.1")
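If you are not sure which version strings to record, base R can report the versions you are currently running. This is a minimal sketch; it uses the stats package only because it is always installed, and you would substitute the packages your script actually loads.

```r
# Version of the running R, as a string such as "3.2.3".
r.vers <- as.character(getRversion())

# Version of an installed package; "stats" is only a stand-in here.
pkg.vers <- as.character(utils::packageVersion("stats"))

# Print a line in the same form as the call shown above.
cat(sprintf('works_with_R("%s", stats="%s")\n', r.vers, pkg.vers))
```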

Even better, if the package can be found on GitHub you can indicate the repository that it comes from, and the specific commit that you used.

works_with_R("3.2.3", 
             "tdhock/ggplot2@a8b06ddb680acdcdbd927773b1011c562134e4d2")

I recommend defining works_with_R in your ~/.Rprofile.
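The definition of works_with_R is not shown on this page. To give an idea of what such a function could do, here is a minimal sketch under a different name, works_with_R_sketch, which is hypothetical and is not the actual works_with_R. It assumes the simplest possible behavior: compare the recorded versions with the installed ones, warn about any mismatch, and attach the packages. It does not install anything, and it skips unnamed arguments such as GitHub repository specifications.

```r
# Hypothetical sketch, NOT the actual works_with_R: it only checks
# versions, warns on mismatch, and attaches the named packages.
works_with_R_sketch <- function(Rvers, ...){
  recorded <- list(...)
  mismatches <- character()
  # Compare the running R version with the recorded one.
  current.R <- as.character(getRversion())
  if(current.R != Rvers){
    mismatches <- c(mismatches, sprintf(
      "R is %s but %s was recorded", current.R, Rvers))
  }
  for(pkg in names(recorded)){
    # Unnamed arguments (e.g. "user/repo@commit") are skipped here.
    if(pkg == "") next
    if(!requireNamespace(pkg, quietly=TRUE)){
      mismatches <- c(mismatches, sprintf("%s is not installed", pkg))
      next
    }
    # utils:: is spelled out because utils may not be attached yet
    # when ~/.Rprofile is read.
    current.pkg <- as.character(utils::packageVersion(pkg))
    if(current.pkg != recorded[[pkg]]){
      mismatches <- c(mismatches, sprintf(
        "%s is %s but %s was recorded",
        pkg, current.pkg, recorded[[pkg]]))
    }
    library(pkg, character.only=TRUE)
  }
  if(length(mismatches)){
    warning(paste(mismatches, collapse="; "))
  }
  # Return the mismatch messages invisibly so callers can inspect them.
  invisible(mismatches)
}
```

A script would then start with a call such as works_with_R_sketch("3.2.3", ggplot2="1.0.1"), which warns if the running versions differ from the recorded ones.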
