---
layout: default
title: Reproducible Research
---
<h2>Reproducible Research Tutorial Series</h2>

<p>This is a series of tutorials on improving the reproducibility of data analysis for those doing microbial ecology research. Although the materials focus on issues in microbial ecology, the principles are broadly applicable. The series is not designed to teach you R or mothur; although the tutorials use both, you could use other tools (e.g. Python, QIIME) to achieve the same goals. The series focuses on the importance of command line practices (e.g. bash), scripting languages (e.g. mothur, R), version control (e.g. git), automation (e.g. make), and literate programming (e.g. Rmarkdown). These are the tools that are used in the Schloss lab to help improve the reproducibility of our manuscripts. By completing the activities in the tutorials you will be listed on the <a href="honor_roll">Reproducible Research Tutorial Honor Roll</a>, which provides a certification of your training.</p>

<p>To get started, use the outline in the Tutorials section below, which provides links to slides that correspond to each lesson. Hovering your mouse over each tutorial title will reveal links for the slide deck and blue YouTube icons, which will take you to videos of Pat Schloss leading learners through the slide decks. Some tutorials are rather lengthy, so links to "chapters" within the slides and video are provided. Within each of the slide decks, if you press "p", your browser should open the presenter notes, which are a transcription of the audio from the videos. The videos range in length from 30 to 90 minutes; however, you will likely need longer to ensure that you are comfortable with the material and to do the exercises for yourself. The tutorials are directed primarily towards researchers who are active in microbiome-based research. Throughout the materials there are numerous discussion points and activities that ask learners to engage their research group and group director in conversations and activities. Furthermore, group directors will likely find the videos useful as a broad overview of how to handle issues of reproducibility in microbiome research. Although the tutorial series is directed towards microbiome researchers, the topics outlined in the series should be of general interest to most microbiologists and scientists.</p>

<div class="row double-column">
	<div class="col-md-5">

		<h3>Tutorials</h3>
		<div id="accordion" class="tutorials">
			<h6>Introduction</h6>
			<div>
				<ul>
					<li>
						<a href="introduction">Full tutorial</a>
						<a href="https://youtu.be/CfO_f6a3XSo"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>Issues in reproducible research</h6>
			<div>
				<ul>
					<li>
						<a href="reproducibility">Full tutorial</a>
						<a href="https://youtu.be/Is_12ws11GQ"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>First steps towards reproducibility</h6>
			<div>
				<ul>
					<li>
						<a href="first_steps">Full tutorial</a>
						<a href="https://youtu.be/KUWSXTavIhw"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>Using high performance computers</h6>
			<div>
				<ul>
				<li>
					<a href="hpc">Full tutorial</a>
					<a href="https://youtu.be/5RgRS5VPX1g"><i class="fab fa-youtube"></i></a>
				</li>
				<li>
					<a href="hpc/#14">Using a local high performance computer</a>
					<a href="https://youtu.be/5RgRS5VPX1g?t=876"><i class="fab fa-youtube"></i></a>
				</li>
				<li>
					<a href="hpc/#24">Using AWS</a>
					<a href="https://youtu.be/5RgRS5VPX1g?t=936"><i class="fab fa-youtube"></i></a>
				</li>
				<li>
					<a href="hpc/#39">Setting up FileZilla</a>
					<a href="https://youtu.be/5RgRS5VPX1g?t=2710"><i class="fab fa-youtube"></i></a>
				</li>
				</ul>
			</div>


			<h6>The importance of documentation</h6>
			<div>
				<ul>
					<li>
						<a href="documentation">Full tutorial</a>
						<a href="https://youtu.be/llOrbyj0rp8"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>Organizational skills for reproducibility</h6>
			<div>
				<ul>
					<li>
						<a href="organization">Full tutorial</a>
						<a href="https://youtu.be/kUnDnmBBGuU"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="organization/#10">Organizing a messy project</a>
						<a href="https://youtu.be/kUnDnmBBGuU?t=1348"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="organization/#15">Setting up a new project with a template</a>
						<a href="https://youtu.be/kUnDnmBBGuU?t=1855"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>Using version control</h6>
			<div>
				<ul>
					<li>
						<a href="version_control">Full tutorial</a>
						<a href="https://youtu.be/299Anq-Fc4w"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="version_control/#22">Setting up and using git</a>
						<a href="https://youtu.be/299Anq-Fc4w?t=1090"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="version_control/#66">Using git with a microbiome project</a>
						<a href="https://youtu.be/299Anq-Fc4w?t=3270"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>Automating reproducible analyses</h6>
			<div>
				<ul>
					<li>
						<a href="automation">Full tutorial</a>
						<a href="https://youtu.be/57pDlPCodkc"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="automation/#23">Writing bash scripts</a>
						<a href="https://youtu.be/57pDlPCodkc?t=748"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>Scripting analyses</h6>
			<div>
				<ul>
					<li>
						<a href="scripting_analyses">Full tutorial</a>
						<a href="https://youtu.be/FiN3GiCAupo"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="scripting_analyses/#7">R features that improve reproducibility</a>
						<a href="https://youtu.be/FiN3GiCAupo?t=652"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="scripting_analyses/#64">Scripting the generation of a figure</a>
						<a href="https://youtu.be/FiN3GiCAupo?t=2840"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>Generating reproducible documents</h6>
			<div>
				<ul>
					<li>
						<a href="literate_programming">Full tutorial</a>
						<a href="https://youtu.be/AKvUqJ98zwI"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="literate_programming/#17">Writing a reproducible report</a>
						<a href="https://youtu.be/AKvUqJ98zwI?t=763"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="literate_programming/#40">Output formats that are available</a>
						<a href="https://youtu.be/AKvUqJ98zwI?t=3143"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="literate_programming/#46">Generating a PDF of our report</a>
						<a href="https://youtu.be/AKvUqJ98zwI?t=3306"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>Automation with makefiles</h6>
			<div>
				<ul>
					<li>
						<a href="make">Full tutorial</a>
						<a href="https://youtu.be/eWHE2RIGrWo"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="make/#23">Reproducing an analysis from 538 w/ make</a>
						<a href="https://youtu.be/eWHE2RIGrWo?t=695"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="make/#70">Using make to process microbiome data</a>
						<a href="https://youtu.be/eWHE2RIGrWo?t=3171"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>How to collaborate with yourself</h6>
			<div>
				<ul>
					<li>
						<a href="collaboration_with_yourself">Full tutorial</a>
						<a href="https://youtu.be/wE5AYmIoWBk"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="collaboration_with_yourself/#23">Implementing Gitflow</a>
						<a href="https://youtu.be/wE5AYmIoWBk?t=814"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="collaboration_with_yourself/#41">Resolving conflicts</a>
						<a href="https://youtu.be/wE5AYmIoWBk?t=1667"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>How to collaborate with others</h6>
			<div>
				<ul>
					<li>
						<a href="collaboration_with_others">Full tutorial</a>
						<a href="https://youtu.be/c4fkCtHWCEo"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="collaboration_with_others/#10">Filing a pull request</a>
						<a href="https://youtu.be/c4fkCtHWCEo?t=588"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="collaboration_with_others/#41">Guidelines for making contributions</a>
						<a href="https://youtu.be/c4fkCtHWCEo?t=2265"><i class="fab fa-youtube"></i></a>
					</li>
					<li>
						<a href="collaboration_with_others/#64">Code review of contributions</a>
						<a href="https://youtu.be/c4fkCtHWCEo?t=2407"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>


			<h6>Doing open science</h6>
			<div>
				<ul>
					<li>
						<a href="open_science">Full tutorial</a>
						<a href="https://youtu.be/lXnetUbbGIc"><i class="fab fa-youtube"></i></a>
					</li>
				</ul>
			</div>

		</div>
	</div>

	<div class="col-md-7 blurb">
		<h3>Reading</h3>
		<p>Much has been written on reproducibility over the past few years. These short papers provide a useful background for the overall scope of these materials and should be read before starting:</p>
		<ul>
			<li>Schloss PD. Identifying and overcoming threats to reproducibility, replicability, robustness, and generalizability in microbiome research. mBio. 2018 Jun 5;9(3). pii: e00525-18. <a href="https://dx.doi.org/10.1128/mBio.00525-18">doi: 10.1128/mBio.00525-18</a>.</li>
			<li>Collins FS, Tabak LA. NIH plans to enhance reproducibility. Nature. 2014 Jan;505:612-613. <a href="https://dx.doi.org/10.1038/505612a">doi: 10.1038/505612a</a>.</li>
			<li>Casadevall A, Ellis LM, Davies EW, McFall-Ngai M, Fang FC. A Framework for Improving the Quality of Research in the Biological Sciences. mBio. 2016 Aug 30;7(4). pii: e01256-16. <a href="https://dx.doi.org/10.1128/mBio.01256-16">doi: 10.1128/mBio.01256-16</a>.</li>
			<li>Ravel J, Wommack KE. All hail reproducibility in microbiome research. Microbiome. 2014 Mar 7;2(1):8. <a href="https://dx.doi.org/10.1186/2049-2618-2-8">doi: 10.1186/2049-2618-2-8</a>.</li>
			<li>Garijo D, Kinnings S, Xie L, Xie L, Zhang Y, Bourne PE, Gil Y. Quantifying reproducibility in computational biology: The case of the tuberculosis drugome. PLoS ONE. 2013 Nov;8(11):e80278. <a href="https://dx.doi.org/10.1371/journal.pone.0080278">doi: 10.1371/journal.pone.0080278</a>.</li>
			<li>Noble WS. A quick guide to organizing computational biology projects. PLoS Comput Biol. 2009 Jul;5(7):e1000424. <a href="https://dx.doi.org/10.1371/journal.pcbi.1000424">doi: 10.1371/journal.pcbi.1000424</a>.</li>

		</ul>
	</div>
</div>

<h3 style="padding-top:20px;">Dependencies</h3>
<p>A big pain in making your analysis reproducible is being explicit about the methods that are used to perform the analysis. The same goes for this series of tutorials! I would strongly recommend either setting up an AWS account and launching an instance from an AMI or learning to use your local high performance computer (HPC) facility so that you can more easily transition from this tutorial to your future analyses. This is covered in the fourth tutorial, <a href="hpc">Using high performance computers</a>. Regardless of the operating system you are using, here's what you'll need to work through the tutorials...</p>

<ul>
	<li><a href="https://www.r-project.org">R</a></li>
	<li><a href="https://www.gnu.org/software/make/">make</a></li>
	<li><a href="https://git-scm.com">git</a></li>
	<li><a href="https://www.gnu.org/software/wget/">wget</a></li>
	<li><a href="https://atom.io">Atom</a> with <a href="http://flight-manual.atom.io/getting-started/sections/atom-basics/">command line tools</a> or <a href="https://www.nano-editor.org">nano</a> will work well</li>
</ul>

<p>Part of the justification for my recommendation to use either AWS or your local HPC is that these will likely already have everything installed. Most of these tools will already be installed if you are running Linux or Mac OS X. If you are using Mac OS X, <a href="https://brew.sh">homebrew</a> is your friend for installing various Linux-based programs. For Windows users, running <a href="https://www.howtogeek.com/249966/how-to-install-and-use-the-linux-bash-shell-on-windows-10/">Windows 10 with bash</a> or installing <a href="https://git-scm.com">git bash</a> and then installing R and make will likely get you where you need to be.</p>
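
<p>If you are not sure which of the command line tools listed above are already on your system, a quick check from a bash prompt can help. The following is only a minimal sketch and not part of the tutorial materials: the function name <code>check_tools</code> is made up for illustration, and it assumes each tool is available under its usual executable name (<code>R</code>, <code>make</code>, <code>git</code>, <code>wget</code>). It does not check for a text editor.</p>

```shell
# Report which of the named command line tools are on the PATH.
# The return status is the number of tools that were not found
# (0 means everything was found).
check_tools() {
  local missing=0
  local tool
  for tool in "$@"; do
    if command -v "$tool" > /dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}

# Check the tools from the list above; R is looked up by its executable name.
check_tools R make git wget || echo "Install the missing tools before starting."
```

<p>Anything reported as missing can then be installed using the approach described above for your operating system.</p>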

<h3 style="padding-top:20px;">Contributing</h3>
<p>We love to get feedback and improvements from others. That's the idea behind Riffomonas - that we riff on each other's work to make it better! If you would like to contribute to the project or even ask a question, please feel free to <a href="https://github.com/riffomonas/reproducible_research/issues">file an issue</a> on our <a href="https://github.com/riffomonas/reproducible_research">GitHub repository</a>.</p>

<a href="https://github.com/riffomonas/reproducible_research/blob/gh-pages/LICENSE.md"><img src="https://licensebuttons.net/l/by-sa/3.0/88x31.png" /></a>