Reproducible Research Workflows and Hosting a Technical Blog
"A slightly schizophrenic, jupyter-centric stroll through some useful tricks I’ve picked up for better organizing my
shit work and sharing with colleagues and the wider world."
Materials for my talk to University of Toronto Coders on Python- and GitHub-based tools for organizing and sharing code, data, and ideas.
The code demonstrations in the video can be found in the misc folder. The two primary notebooks are:
- workdocs - cell tag filtering and nbconvert
- cloudfiles - GitHub-hosted Pelican static HTML websites, and replacing binary objects in notebooks with URLs to AWS S3 buckets
The talk and demos cover two major components of the scientific workflow that I've developed over the years, and the software tools I've assembled to support it.
The first uses notebook parsing, cell tag filtering, and nbconvert operations so that all the major output formats (e.g. PDF, slideshow, HTML) can be produced directly from a single 'master' document, without copying and pasting parts of it into separate files for different purposes. This helps avoid a proliferation of files and maximizes the transparency of things like results and figure generation. Check out the ipynb-workdocs library for more info on this. The code in the misc folder represents initial sketches of code for updating ipynb-workdocs to the world of jupyter 5.3+.
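As a rough illustration of the cell tag filtering idea (not the ipynb-workdocs implementation itself), here is a minimal sketch that treats a notebook as plain JSON and keeps only the cells carrying a given tag. The function name `keep_cells_with_tag` and the tag names `slides` and `paper` are hypothetical; the only thing taken from the notebook format is that tags live under each cell's `metadata.tags`.

```python
import copy


def keep_cells_with_tag(notebook, tag):
    """Return a copy of a notebook dict that keeps only cells tagged `tag`.

    Hypothetical helper: in the version 4 notebook format, cell tags are
    stored as a list of strings under cell["metadata"]["tags"].
    """
    filtered = copy.deepcopy(notebook)  # leave the master document untouched
    filtered["cells"] = [
        cell
        for cell in filtered.get("cells", [])
        if tag in cell.get("metadata", {}).get("tags", [])
    ]
    return filtered


# A tiny in-memory 'master' notebook; the tag names are made up for this example.
master = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {"tags": ["slides", "paper"]},
         "source": ["# Results"]},
        {"cell_type": "markdown", "metadata": {"tags": ["paper"]},
         "source": ["Detailed methods."]},
        {"cell_type": "markdown", "metadata": {},
         "source": ["Scratch notes."]},
    ],
}

slides_nb = keep_cells_with_tag(master, "slides")
paper_nb = keep_cells_with_tag(master, "paper")
print(len(slides_nb["cells"]), len(paper_nb["cells"]))
```

Each filtered dict could then be written out with `json.dump` as its own .ipynb file and passed to nbconvert for the target format. nbconvert also ships a `TagRemovePreprocessor`, which works the other way round by dropping cells that carry listed tags.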
The second major component is how to turn the outputs of jupyter notebook cell tag filtering into slick GitHub-hosted static HTML websites and reveal.js HTML slideshows. I also discuss a few nice tricks, like using Amazon S3 buckets to host large media files.
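The S3 trick can be sketched roughly as follows, assuming embedded PNG outputs are the binary objects being replaced. This is not the code from the cloudfiles notebook: the function names, the bucket name `my-example-bucket`, and the key prefix `talk-figures` are all hypothetical, and the upload step itself is left out (the returned payloads would still need to be decoded and uploaded, e.g. with an S3 client).

```python
import copy


def s3_url(bucket, key):
    """Build a virtual-hosted-style S3 URL for a bucket and object key."""
    return f"https://{bucket}.s3.amazonaws.com/{key}"


def replace_png_outputs(notebook, bucket, prefix):
    """Swap embedded PNG outputs for <img> tags pointing at S3 URLs.

    Returns the slimmed-down notebook copy plus a dict mapping each
    object key to the base64 PNG payload that was removed.
    """
    nb = copy.deepcopy(notebook)
    extracted = {}
    counter = 0
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        for output in cell.get("outputs", []):
            data = output.get("data", {})
            if "image/png" not in data:
                continue
            key = f"{prefix}/output_{counter}.png"
            extracted[key] = data.pop("image/png")  # base64 string, upload separately
            data["text/html"] = f'<img src="{s3_url(bucket, key)}">'
            counter += 1
    return nb, extracted


# Minimal example notebook with one embedded (truncated, fake) PNG payload.
example = {
    "nbformat": 4,
    "nbformat_minor": 2,
    "metadata": {},
    "cells": [
        {"cell_type": "code", "metadata": {}, "execution_count": 1,
         "source": ["plot()"],
         "outputs": [
             {"output_type": "display_data", "metadata": {},
              "data": {"image/png": "iVBORw0KGgo=",
                       "text/plain": ["<Figure>"]}},
         ]},
    ],
}

slim, payloads = replace_png_outputs(example, "my-example-bucket", "talk-figures")
print(slim["cells"][0]["outputs"][0]["data"]["text/html"])
```

Keeping the extracted payloads in a separate dict means the notebook stays small in the git history, while the uploaded files remain traceable back to the outputs they came from.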
For more info, check out some LabNotebook entries (see the 'OpenAccess' tag), e.g. about the notebook, about workdocs-cloudfiles, and some notes on modelling brain stimulation and macaque brain visualization.