This course will teach core computing skills as well as project-specific approaches. Each student will develop and complete a research project, targeting journal article submission by the end of the Quarter. There will be an emphasis on developing habits that increase automation, which in turn will facilitate reproducibility. The primary course platform will be GitHub, with each student creating their own repository.
T 3:00-4:20 FSH 213 Th 9:30-11:20 FSH 213
| Week | Topic | Reading | Questions |
|---|---|---|---|
| zero | Biology, Course Framework, Getting set-up | Preface xiii-xxv; How to Learn Bioinformatics 1-18 | Questions |
| one | Bash, version control, Project Set-up | Setting Up and Managing a Bioinformatics Project 21-35; Remedial Unix Shell 37-54 | Questions |
| two | Jupyter, Annotation | Retrieving Bioinformatics Data 109-124; Unix Data Tools 125-168 | Questions |
| three | Projects | Working with Sequence Data 339-354 | Questions |
| four | RNA-seq | Git for Scientists 67-83 | Questions |
| five | lncRNA miRNA | Working with Remote Machines 57-66 | Questions |
| six | DNA methylation | Gavery Slides | Questions |
| seven | Genome Browser | Working with Alignment Data 355-383; Working with Range Data 329-338 | Questions |
| eight | Holiday | 🌽 🍰 💻 | |
| nine | SNP | Bioinformatics Shell Scripting, Writing Pipelines 395-423 | Questions |
🔺 subject to change based on guest availability
Bioinformatics Data Skills: Reproducible and Robust Research with Open Source Tools
By Vince Buffalo
Publisher: O'Reilly Media
Final Release Date: July 2015
- Quizzes (10✖️3) = 30 DUE Friday Midnight
- Project Progress (10✖️3) = 30 DUE Friday Midnight
- Draft Product (Week 5) = 15
- Final Product (Week 10) = 25
### Please get an account with the following online services:
### Please review these webpages:
## Computing Environments and Software

This course will be taught using students' personal computers. This approach (as opposed to using virtual machines in the cloud) has its disadvantages and advantages.
Any modern laptop should work fine. There will be some analyses that we will not complete during the course (given time constraints); however, students should come away with a clear understanding of how to carry out each analysis. We will also introduce students to cloud-based options. Generally speaking, what we will be doing is more straightforward on Unix-based machines (Linux and Mac OS X), though we will also show students Windows-centric solutions.
A good text editor will be very useful. There are several built-in options, with nano recommended by Software Carpentry. For this course I suggest standalone applications.
Mac OS X
We will use Markdown. Below are some recommended editors. Text editors above would also work.
- Jupyter will work - see below.
Mac OS X
We will be using the command line, specifically the Bash shell. Below are setup instructions for different operating systems, taken from the Software Carpentry website.
The Bash Shell
Bash is a commonly-used shell that gives you the power to do simple tasks more quickly.
Download the Git for Windows installer. Run the installer. Important: on the 6th page of the installation wizard (the page titled `Configuring the terminal emulator...`) select `Use Windows' default console window`. This will provide you with both Git and Bash in the Git Bash program.
Mac OS X
The default shell in all versions of Mac OS X is bash, so no need to install anything. You access bash from the Terminal (found in `/Applications/Utilities`). You may want to keep Terminal in your dock for this workshop.
The default shell is usually Bash, but if your machine is set up differently you can run it by opening a terminal and typing `bash`.
Note that once Jupyter is installed, you should be able to run a Bash shell within it on any platform.
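To confirm that the shell you have opened really is Bash, a minimal sketch like the following may help. The function name `current_shell` is a hypothetical helper introduced here for illustration; it is not part of the Software Carpentry instructions. It relies only on the `BASH_VERSION` variable, which Bash sets for itself.

```shell
#!/usr/bin/env bash
# Report whether this script is being run by Bash, and which version.
# current_shell is a hypothetical helper name used only for illustration.

current_shell() {
  # BASH_VERSION is set only when the interpreter is Bash.
  if [ -n "${BASH_VERSION:-}" ]; then
    echo "bash ${BASH_VERSION}"
  else
    echo "not bash"
  fi
}

current_shell
```

If this prints `not bash`, typing `bash` at the prompt and running the check again should show a version string.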
GitHub Local Clients
We will be using GitHub, a web-based Git repository hosting service. It offers the distributed revision control of Git and adds its own features.
- GitHub Desktop is available for Mac and Windows
Formerly IPython Notebook
On a Mac, there is a standalone version of the notebook: Pineapple.
The newest version of BLAST+ for all operating systems is available @ ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
R is a programming language that is especially powerful for data exploration, visualization, and statistical analysis.
Below are installation instructions for different operating systems, taken from the Software Carpentry website.
Mac OS X
You can download the binary files for your distribution from CRAN. Or you can use your package manager (e.g. for Debian/Ubuntu run `sudo apt-get install r-base` and for Fedora run `sudo yum install R`). Also, please install the RStudio IDE.
"Collectively, the bedtools utilities are a swiss-army knife of tools for a wide-range of genomics analysis tasks"
Available for download @ https://github.com/arq5x/bedtools2/releases (likely only available for Linux and Mac OS).
"high-performance visualization tool for interactive exploration of large, integrated genomic datasets"
To download the software you will need to register. See https://www.broadinstitute.org/software/igv/log-in.
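Once the software above is installed, it can be useful to check which command-line tools the shell can actually find. The sketch below is one way to do that, not an official course script. The function names `have_tool` and `report_tools` are hypothetical, and the tool list is an assumption based on the programs named on this page (`blastn` is one of the BLAST+ executables); edit it to match your setup.

```shell
#!/usr/bin/env bash
# Check which course tools are on the PATH.
# have_tool and report_tools are hypothetical helper names;
# the tool list at the bottom is an assumption drawn from this page.

have_tool() {
  # command -v succeeds only if the shell can find the named command.
  command -v "$1" >/dev/null 2>&1
}

report_tools() {
  local tool
  for tool in "$@"; do
    if have_tool "$tool"; then
      echo "found:   $tool"
    else
      echo "missing: $tool"
    fi
  done
}

report_tools bash git jupyter blastn R bedtools
```

A tool reported as missing may still be installed but not on your `PATH`, so treat the output as a starting point for troubleshooting.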
A number of programs will be used that we might not actually run at full production level during the course, given time and/or processor constraints. It would be fine to install these to get familiar with their parameters.
Here is a list of free web services we will likely use during the course