Reproducible Research using Jupyter Notebooks: Curriculum Development Hackathon
- Goal: Develop the content of a two-day Data Carpentry workshop teaching how to conduct research reproducibly using the Jupyter notebook
- Location: Berkeley Institute for Data Science (BIDS), Berkeley, CA
- Dates and times: January 9 - 11, 2017; 9 am - 5 pm each day
- To apply: https://goo.gl/aPO71f; deadline December 5, 2016. Successful applicants will be notified by December 12.
Making science more reproducible has enormous potential to accelerate research advances, including for individual researchers. Despite this, the tools and approaches that are already available are rarely taught. To address this, we are organizing a 3-day hands-on workshop aimed at developing, and later teaching, a short-course curriculum on using Jupyter notebooks for reproducible research practices. The event will be held January 9 - 11, 2017, in Berkeley, CA, at the Berkeley Institute for Data Science (BIDS). We aim to assemble a diverse and interdisciplinary group of participants, and invite those interested to apply by December 5, 2016, at https://goo.gl/aPO71f.
As science becomes increasingly data- and computation-intensive, maintaining the ability to build on our own or others' prior work requires that the process that takes data and other inputs all the way to the results presented in a paper be documented and made available in full detail. The concept of reproducible research means that someone else should either be able to obtain the same results given all the documented inputs and the published instructions for processing them, or, if not, the reasons why should be apparent from comparing the executed processing steps to the documented ones.
Aside from its role as a cornerstone of the scientific process, reproducible research holds tremendous benefits for individual researchers themselves. Knowing and applying reproducibility-promoting practices and tools could enable researchers to revisit and build on their own work more effectively and efficiently, which in turn has the potential to accelerate the progress of science as a whole. Yet, such practices and tools are only rarely taught, and many scientists remain unaware of them. In response to this gap, participants of the Reproducible Science Curriculum initiative developed an initial 2-day bootcamp-style workshop curriculum around best practices for reproducible research in R. This curriculum has since been adopted by Data Carpentry.
The objective of this hackathon is to continue this effort by developing a workshop curriculum around the Jupyter notebook as a tool promoting best practices for reproducible research. Jupyter notebooks are increasingly widely adopted, and have been the main method of displaying detailed results in a number of high-profile scientific papers. As a tool promoting reproducible practices, Jupyter notebooks allow users to interleave text, code, and output into a single, interactive document that includes features facilitating research exploration, interactive learning, and sharing over the internet. Their dynamic nature is ideally suited to sharing all steps of the research workflow in a reproducible manner. Although creating a curriculum around the Python programming language is part of the motivation for this hackathon, Jupyter notebooks can be used with over 40 common programming languages, and the notebook document format is programming language-agnostic.
Goals of the event
The event is designed as a 3-day hackathon to develop a short-course curriculum on using Jupyter notebooks for reproducible research practices. The goal is to develop concrete material that can subsequently be taught at a two-day training workshop. The workshop would encompass the overall goals of the existing Data Carpentry workshops, teach people the skills and value of working reproducibly, and demonstrate the role of reproducibility in publishing.
To accomplish this, we aim to assemble a diverse and interdisciplinary group of researchers, educators, and developers, encompassing various levels of experience and a broad set of skills. This includes participants who are involved or familiar with Jupyter notebooks, or with developing tools and technologies under the reproducible science rubric that use, interact with, or build upon Jupyter notebooks. More broadly, it also includes researchers and others who are enthusiastic to learn more about reproducible research and interested in helping to create and teach this curriculum.
Space is limited, and hence onsite participation is by invitation only and requires submitting an application. Applications are due December 5, 2016. Travel support is available. Members of groups underrepresented in computational science are especially encouraged to apply.
Organizers & Contact
If you have any questions, feel free to contact any member of the organizing committee:
Hilmar Lapp, Duke University
François Michonneau, University of Florida
Jasmine Nirody, UC Berkeley
Kellie Ottoboni, UC Berkeley
Tracy Teal, Data Carpentry
Jamie Whitacre, Project Jupyter, UC Berkeley Institute for Data Science