Stored data for the 2026 class on VSCode, GitHub, Snakemake, and Biowulf. Additional information is included in the Read the Docs documentation.
Computational biology is quickly becoming essential to biomedical science. With big data, machine learning, and everyone building their own packages, writing anything relevant can feel daunting, even insurmountable. But coding is attainable: anyone can work their way up to writing advanced and original code. All that is needed is practice.
"A journey of a thousand miles begins with a single step" -Laozi
As publicly funded researchers, we have an obligation to make our results as transparent and accessible as possible. At the NIH we should strive for Open Science, where data and associated methods are accessible. For data analysis, this means we want our pipelines to be as accessible as possible: essentially a turn-key system in which anyone with the input data (which should be deposited in a public archive) can reproduce the results. Fully implementing the following framework is a concrete step in the fight against the reproducibility crisis.