How to get at R source. I am sick of Googling this. I am writing it down this time.

Accessing R Source

2017-07-31 update: Since I wrote this @jimhester has created the lookup package to automate this process. So if all you want is the result, just use that! If you want a bit more context, then keep reading. AFAIK this info is still fundamentally sound.

How to get at R source. I am sick of Googling this. I am writing it down this time.

TL;DR

methods(<S3_GENERIC>)
<S3_GENERIC>.default
<S3_GENERIC>.<CLASS>
getAnywhere(<S3_GENERIC>.<CLASS>)
getS3method("<S3_GENERIC>", "<CLASS>")
<NAMESPACE>:::<S3_GENERIC>.<CLASS>

References

The definitive reference is this classic R News article:

Accessing the Sources

Uwe Ligges

https://cran.r-project.org/doc/Rnews/Rnews_2006-4.pdf

Volume 6/4, October 2006. Go to page 43.

Another good reference is the help file for methods():

https://stat.ethz.ch/R-manual/R-patched/library/utils/html/methods.html

Just print it

If you are lucky, just printing the function will work.

setNames
#> function (object = nm, nm) 
#> {
#>     names(object) <- nm
#>     object
#> }
#> <bytecode: 0x7fce93c98920>
#> <environment: namespace:stats>

But there are many ways this can fail.

vector             # .Internal
#> function (mode = "logical", length = 0L) 
#> .Internal(vector(mode, length))
#> <bytecode: 0x7fce94835c40>
#> <environment: namespace:base>
class              # .Primitive
#> function (x)  .Primitive("class")
subset             # S3 generic
#> function (x, ...) 
#> UseMethod("subset")
#> <bytecode: 0x7fce933c0178>
#> <environment: namespace:base>

What then?
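One way to answer that without squinting at the printout is to inspect the function object itself. This is a minimal sketch, not anything built into R: the helper name why_no_source() is made up for illustration, and it only checks for the markers discussed in this document.

```r
# Hypothetical helper (not part of base R): say why printing a function
# may not show the code you are after.
why_no_source <- function(f) {
  stopifnot(is.function(f))
  if (is.primitive(f)) return("primitive")         # e.g. class
  calls <- all.names(body(f))                      # every symbol used in the body
  if (".Internal" %in% calls) return("internal")   # e.g. vector
  if ("UseMethod" %in% calls) return("S3 generic") # e.g. subset
  compiled <- c(".C", ".Call", ".Fortran", ".External")
  if (any(compiled %in% calls)) return("compiled")
  "plain R"
}

why_no_source(class)
#> [1] "primitive"
why_no_source(vector)
#> [1] "internal"
why_no_source(subset)
#> [1] "S3 generic"
why_no_source(setNames)
#> [1] "plain R"
```

The order of the checks matters: primitives have no body, so that test has to come first.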

Function is an S3 generic

These are characterized by UseMethod() in the printed result:

subset
#> function (x, ...) 
#> UseMethod("subset")
#> <bytecode: 0x7fce933c0178>
#> <environment: namespace:base>

Print the default method by appending .default:

subset.default
#> function (x, subset, ...) 
#> {
#>     if (!is.logical(subset)) 
#>         stop("'subset' must be logical")
#>     x[subset & !is.na(subset)]
#> }
#> <bytecode: 0x7fce935f1228>
#> <environment: namespace:base>

Or list the methods for this generic:

methods(subset)
#> [1] subset.data.frame subset.default    subset.matrix    
#> see '?methods' for accessing help and source code

Then print the method you seek:

subset.matrix
#> function (x, subset, select, drop = FALSE, ...) 
#> {
#>     if (missing(select)) 
#>         vars <- TRUE
#>     else {
#>         nl <- as.list(1L:ncol(x))
#>         names(nl) <- colnames(x)
#>         vars <- eval(substitute(select), nl, parent.frame())
#>     }
#>     if (missing(subset)) 
#>         subset <- TRUE
#>     else if (!is.logical(subset)) 
#>         stop("'subset' must be logical")
#>     x[subset & !is.na(subset), vars, drop = drop]
#> }
#> <bytecode: 0x7fce93314388>
#> <environment: namespace:base>

Sometimes the method definition is not exported from the package namespace. That's indicated by an asterisk * in the listing (pardon the way I do this, but I have to send through capture.output() if the asterisks are to survive tail()):

mout <- capture.output(methods(print))
tail(mout)
#> [1] "[195] print.vignette*                                   "
#> [2] "[196] print.warnings                                    "
#> [3] "[197] print.xgettext*                                   "
#> [4] "[198] print.xngettext*                                  "
#> [5] "[199] print.xtabs*                                      "
#> [6] "see '?methods' for accessing help and source code"
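If you would rather not parse asterisks out of printed text, the object returned by methods() also carries the same information as data. A sketch, assuming a reasonably recent R (I believe 3.2.0 or later) where that object has an "info" attribute with a visible column:

```r
# The "info" attribute is a data frame with one row per method.
info <- attr(methods(print), "info")

# Methods flagged with * in the printed listing are the ones
# where visible is FALSE.
hidden <- rownames(info)[!info$visible]
tail(hidden)

# The 'from' column records where each hidden method was found.
head(info[!info$visible, c("generic", "from")])
```

The exact methods listed depend on which packages are loaded in your session, so I show no output here.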

Good news: you've found the function you need. Bad news: you still can't read the source.

print.xgettext
#> Error in eval(expr, envir, enclos): object 'print.xgettext' not found

Use getAnywhere() or getS3method() to close the deal:

getAnywhere(print.xgettext)
#> A single object matching 'print.xgettext' was found
#> It was found in the following places
#>   registered S3 method for print from namespace tools
#>   namespace:tools
#> with value
#> 
#> function (x, ...) 
#> {
#>     cat(x, sep = "\n")
#>     invisible(x)
#> }
#> <bytecode: 0x7fce9787e150>
#> <environment: namespace:tools>

This printed the source AND we learned the associated namespace. If you know the namespace, you can also use ::: to see source:

tools:::print.xgettext
#> function (x, ...) 
#> {
#>     cat(x, sep = "\n")
#>     invisible(x)
#> }
#> <bytecode: 0x7fce9787e150>
#> <environment: namespace:tools>
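For completeness, here is the getS3method() route, which takes the generic and the class as two separate strings and returns just the function. A small sketch: the optional = TRUE argument is a real argument of getS3method(), while the class name "no_such_class_hopefully" is made up for the demonstration.

```r
# Make sure the tools namespace is loaded so its S3 methods are registered.
loadNamespace("tools")

# optional = TRUE returns NULL instead of throwing an error when the
# method does not exist in your version of R.
m <- getS3method("print", "xgettext", optional = TRUE)
if (!is.null(m)) print(environmentName(environment(m)))

# A class that (presumably) has no print method at all.
none <- getS3method("print", "no_such_class_hopefully", optional = TRUE)
is.null(none)
#> [1] TRUE
```

Unlike getAnywhere(), this gives you a plain function object, which is handy if you want to pass it to body() or formals().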

Compiled code

Whenever you see .C(), .Call(), .Fortran(), .External(), .Internal(), or .Primitive(), the source you seek is in underlying compiled code.

You need to locate the source code of R or the associated add-on package on the internet or download it locally. Then browse around or search.

Visit the source of R on the internet:

Download source of R itself:

  • Download source for current release from https://cran.r-project.org, e.g. R-3.2.2.tar.gz, and unpack it.

    tar xvf R-3.2.2.tar.gz
    
  • See more info there about getting the development version.

  • Or check out from the official Subversion repository https://svn.r-project.org/R/.

How to find what you need in the R source (paraphrasing Ligges, in places):

  • For R and standard R packages, look in subdirs of $R_HOME/src/, most especially $R_HOME/src/main/.
  • If the calling R function is .Primitive() or .Internal(), find the entry point in $R_HOME/src/main/names.c. Then try to find that function. Example below.
  • Use the GitHub search capabilities.

Example: I want the source for levels<-.

`levels<-`    # .Primitive()
#> function (x, value)  .Primitive("levels<-")

Search for levels<- in $R_HOME/src/main/names.c and we find this line:

{"levels<-", do_levelsgets, 0, 1, 2, {PP_FUNCALL, PREC_LEFT, 1}}

which tells us we're looking for do_levelsgets. Now use your search capability (GitHub? grep?) to look for that within the files below $R_HOME/src/. I choose the GitHub option and use this query:

 do_levelsgets path:src/main

And finally arrive at my destination: lines 1242 through 1261 in $R_HOME/src/main/attrib.c.
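If you have the R source unpacked locally, the names.c step of this hunt can be scripted. This is a rough sketch: find_entry_point() is a hypothetical helper of my own, and it assumes each table entry sits on one line shaped like the one quoted above.

```r
# Hypothetical helper: given the lines of names.c and an R-level function
# name, return the C function named in the second field of its entry.
find_entry_point <- function(names_c, r_name) {
  key <- paste0('{"', r_name, '",')
  hits <- names_c[startsWith(trimws(names_c), key)]
  if (length(hits) == 0) return(NA_character_)
  fields <- strsplit(hits[1], ",", fixed = TRUE)[[1]]
  trimws(fields[2])
}

# Stand-in for readLines() on your local copy of src/main/names.c;
# this one-element vector is just the entry quoted above.
names_c <- '{"levels<-", do_levelsgets, 0, 1, 2, {PP_FUNCALL, PREC_LEFT, 1}}'

find_entry_point(names_c, "levels<-")
#> [1] "do_levelsgets"
find_entry_point(names_c, "not_in_the_table")
#> [1] NA
```

The result is the name to search for under src/main.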

Visit the source of an R package on the internet:

  • If developed on GitHub, go to the package's official repo. Ideally, this will be provided as a URL on the package's CRAN page, but sadly that is not always the case.
  • If the package is on CRAN but not on GitHub, go to the read-only mirror of its source from the METACRAN project.

Download source of an add-on package:

How to find what you need in R package source:

  • Look in the src directory. If you are lucky, there will be a file that obviously contains the function of interest.
  • Otherwise, search with grep, your editor/IDE, or GitHub queries and follow the trail to rainbow's end.
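If the package source is on disk and you want to stay inside R, a small search function covers the "follow the trail" part. A minimal sketch: grep_source() is a hypothetical helper, the file extensions it scans are my own guess at what is useful, and the toy file written below is made up (it is not dplyr's real code).

```r
# Hypothetical helper: find a literal string in source files under 'dir'.
grep_source <- function(dir, needle) {
  files <- list.files(dir, pattern = "\\.(R|r|c|cpp|h|f)$",
                      recursive = TRUE, full.names = TRUE)
  hits <- lapply(files, function(path) {
    lines <- readLines(path, warn = FALSE)
    idx <- grep(needle, lines, fixed = TRUE, useBytes = TRUE)
    if (length(idx) == 0) return(NULL)
    data.frame(file = path, line = idx, text = trimws(lines[idx]),
               stringsAsFactors = FALSE)
  })
  do.call(rbind, hits)  # NULL when nothing matched anywhere
}

# Toy package tree in a temp directory, just to exercise the function.
pkg <- file.path(tempdir(), "demo-pkg")
dir.create(file.path(pkg, "src"), recursive = TRUE, showWarnings = FALSE)
writeLines(c("// toy file", "List rbind__impl(List dots) {", "}"),
           file.path(pkg, "src", "bind.cpp"))

res <- grep_source(pkg, "rbind__impl")
res[, c("line", "text")]
```

Each row of the result gives a file, a line number, and the matching text, which is enough to jump to the right spot in an editor.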

Example: I want the source for dplyr::bind_rows. dplyr is developed on GitHub.

First I search within the R directory with this GitHub search query:

bind_rows path:R

GitHub search only shows you the first one or two matches within a file, but I gather that R/bind.r is where I want to look. I visit it in the browser and use the browser to search for bind_rows, which reveals the function definition. That reveals I actually need bind_rows_.

So I do another GitHub search with this query:

bind_rows_

which reveals hits in

R/bind.r
R/RcppExports.R
src/bind.cpp
src/RcppExports.cpp

Reading the relevant bit of src/bind.cpp reveals I really need rbind__impl.

So I do another GitHub search with this query:

rbind__impl

And finally arrive at my destination: lines 7 through 119 of src/bind.cpp.

Things I haven't covered

An incomplete list:

  • S4
  • Other places to put code, e.g. R-Forge or BitBucket