Clean up your code
- Have you ever written a long script in R that conducts oodles of analyses and wished that someone would come along and make it all clearer to understand and use?
- Well, you’re not alone.
- A recent survey of over 1500 scientists reported a crisis of reproducibility, with "selective reporting" being the most cited contributing factor and 80% saying code availability is playing a role.
- We created Rclean to help scientists more easily write "cleaner" code.
- Rclean provides a simple way to get the code you need to produce a specific result.
- Rclean uses data provenance to capture what your code actually does when it’s running and then allows you to pull out the essential code that produces specific outputs.
- By focusing on the specific results you want, Rclean lets you spend more energy on your science and less time figuring out your code.
Install and Setup
You can install Rclean from CRAN:
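As a minimal sketch, the CRAN route is the standard install.packages call; it assumes a CRAN mirror is configured for your R session:

```r
# Install the released version of Rclean from CRAN
install.packages("Rclean")
```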
You can install the most up to date version easily with devtools:
You will also need to be able to generate data provenance from your script. This can be done using provR:
Once installed, per usual R practice, just load the Rclean and provR packages:
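Loading both packages looks like this (a sketch that assumes both are already installed as described above):

```r
library(Rclean)  # provides clean() and write.code(), used in the example
library(provR)   # provides prov.capture() and prov.json(), used in the example
```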
Once your script and workspace are set up, you can use Rclean to extract clean chunks of a larger script that produce the specific results you want. We'll use the micro.R script, which is in the exec directory of the package repo. The following example assumes that your current working directory is exec.
First, you'll need to record information about the script you would like to parse. Rclean uses data provenance to determine which lines of code depend on each other inside the larger script. We can use the provR package to generate provenance. The next bit of code runs our script and keeps the provenance in memory, which we then pass to the options function so that Rclean has access to it:
prov.capture("micro.R")
options(prov.json = prov.json())
Or, if you have provenance saved as a text file, you can load it in like this:
options(prov.json = readLines("prov_micro.json"))
Now that we have the provenance loaded, we can start cleaning. Rclean will give us a list of possible values we can get code for:
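The call that prints this list is not shown here. A sketch, assuming that calling clean with no arguments is what lists the available results (treat this as an assumption rather than the definitive interface):

```r
# Assumption: with provenance stored via options(prov.json = ...),
# calling clean() with no arguments lists the results you can request.
clean()
```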
You can then pick and choose from among these results and get the essential code to produce the output, like so:
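For example, assuming x is one of the results available for micro.R (it is the result used in the save example later on):

```r
# Get the essential code that produces the result named x.
# Assumes provenance has already been loaded with options(prov.json = ...).
clean(x)
```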
Notice that the 'clean' function doesn't require you to quote your results; it interprets all inputs as names of results.
In many cases, it's handy just to take a look at the isolated code, but you can also save the code for later use or sharing.
my.code <- clean(x)
write.code(my.code, file = "x.R")
If you would like to copy your code to the clipboard, you can do that by not specifying a file path.
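A sketch of the clipboard case, assuming my.code already holds the output of clean(x) and that your system has a clipboard available to R:

```r
# No file argument is given, so the code is copied to the clipboard
# instead of being written to disk.
write.code(my.code)
```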
Contributing: if you would like to contribute, please read [[CONTRIBUTING.md]].