Creating living documents and reproducible reports with R markdown and Jupyter notebooks
Curious about how living documents and reproducible reports could help your research? This repo contains a workshop walkthrough about how R markdown and Jupyter notebooks can enrich your research workflow.
What are "living documents" and "reproducible reports"?
While everyone seems to have their own take on what these two terms mean and how they differ from one another, "living documents" and "reproducible reports" are ways for researchers to share code, images, and text in a single document.
For my part, I view "living documents" as work-in-progress documents. They're great for keeping notes on your data cleaning, data processing, and data analysis by allowing you to add plain text, plots, and live code in a single place. As researchers, we might spend months away from a project (while we're busy with something else or while we're waiting for reviewers to get back to us). When it comes time to start up the project again, living documents can help us jump back into the project quickly: Taking good notes about what our code does -- right next to real code -- can help us remember exactly what we were thinking and why we made the choices we did.
"Reproducible reports," on the other hand, I see as documents that are meant to publicly accompany your research outputs (e.g., talks, posters, journal articles). These are ways for other researchers (and maybe even your reviewers) to see all of the work that you did when handling your data and creating your analyses. Given ongoing concerns about transparency and reproducibility in a variety of fields (including psychology and cognitive science), using reproducible reports can provide vital information about the data cleaning, processing, and analysis that supported your conclusions.
Why should I care?
Researchers -- especially within cognitive science and psychology -- are increasingly interested in promoting transparency and reproducibility. Researchers can now earn badges for sharing their data and materials, which raises the visibility of open science, and some journals even require data and code sharing.
Providing an explicit accounting of your data and code practices can help demonstrate the value of your work. As an added bonus, if there's future interest in replicating your work, providing your code openly can help those future replication efforts use methods as close to your original work as possible.
For your sanity
Think of this as your present self doing something nice for your future self. If your present self takes a few minutes to add some explanatory text, code comments, or a useful plot, you'll be saving your future self headaches and time. Present-you knows what you're doing because present-you is ankle-deep in things. Future-you, on the other hand, will probably have spent weeks or months away from the problem and will have to spend valuable time puzzling through the traces that then-past-you created. Do future-you a favor!
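As a small illustration of the kind of note-taking described above, here is a minimal sketch of a commented data-cleaning step as it might appear in a Jupyter notebook cell. Everything in it is hypothetical and made up for this example: the reaction-time values, the 200 ms cutoff, and the `drop_fast_responses` helper are assumptions, not part of the workshop materials.

```python
# Hypothetical reaction times (in ms) for one participant; values are made up.
raw_rts = [512, 498, 150, 623, 47, 580]

# NOTE TO FUTURE ME: responses faster than the cutoff are treated as
# anticipatory and dropped. The 200 ms cutoff is an assumption for this sketch.
MIN_RT_MS = 200


def drop_fast_responses(rts, min_rt=MIN_RT_MS):
    """Return only the reaction times at or above the cutoff."""
    return [rt for rt in rts if rt >= min_rt]


clean_rts = drop_fast_responses(raw_rts)
n_dropped = len(raw_rts) - len(clean_rts)

# Report what the cleaning step did, so the notebook output documents it too.
print(f"Kept {len(clean_rts)} of {len(raw_rts)} trials ({n_dropped} dropped).")
```

The point is less the code itself than the habit: the comment records why the cutoff exists, and the printed summary leaves a trace of what the step actually did.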
For your sanity and science
Better still, the effort you put into helping your future self can be equally valuable for helping the broader research community engage with and make sense of your research.
With just a little bit of additional effort, you can tweak your living documents into reproducible reports. If you're taking good notes and adding comments to your code in your living document while you're doing the research, all you need to do is publish the document after you're done!
To run the workshop materials, just click on the Binder button above. From there, you'll be served your own Jupyter instance in the cloud.
If you'd like something more permanent, feel free to fork the repository or download the files! The beauty of R markdown and Jupyter notebooks lies in their flexibility -- so experiment until you find what works best for you!
- A few that I've done:
  - An early one for a research paper: https://github.com/a-paxton/emotion-dynamics
  - One from a poster in 2016: https://github.com/a-paxton/explaining-mechanisms-of-global-warming
  - One for a methods manuscript that's under review: https://github.com/nickduran/align-linguistic-alignment
  - One for a 2017 paper: https://github.com/a-paxton/dual-conversation-constraints
- A gallery of interesting Jupyter notebooks: https://github.com/jupyter/jupyter/wiki/A-gallery-of-interesting-Jupyter-Notebooks