BMMB554 | Introduction to Data Driven Life Sciences | Spring 2021

Course notes are here

Place and Time

Virtual using Zoom | Tuesday, Thursday 10:35am - 11:50am EST

Instructor

Anton Nekrutenko | aun1@psu.edu | Wartik 505 | Office hours by appointment only

When contacting the instructor, use the above e-mail address and include "BMMB554" in the subject line (simply click on the e-mail address).

Course logistics

This course does not use Canvas. Canvas is a convoluted system with too many features and an undefined purpose. Instead, this course is served from GitHub.

Do not contact me through Canvas! I will not check my Inbox there. Instead, contact me via email as described above.

Grading

Attendance (33.3%) + Homework (33.3%) + Final Project (33.3%) ≈ 100%
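
As a rough illustration of the formula above, the sketch below combines the three components with equal weights. The 0-100 scale, the function name `course_score`, and the example scores are assumptions made for illustration only; they are not part of the course policy.

```python
# Equal weights taken from the grading formula above.
WEIGHTS = {"attendance": 1 / 3, "homework": 1 / 3, "final_project": 1 / 3}


def course_score(scores):
    """Return the weighted course score.

    `scores` maps each component name to a score; a 0-100 scale is assumed.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores, for illustration only.
example = {"attendance": 100, "homework": 90, "final_project": 80}
print(round(course_score(example), 1))
```

Because the weights are equal, the result is simply the mean of the three component scores.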

Topics

The course will be divided into several blocks:

Block 1: Tooling

Tools for working with data.

| # | What | Why |
|---|------|-----|
| 1️⃣ | BLOCK 1 | ESSENTIAL TOOLS |
| 1 | Colaboratory | Getting to know our main tool for this course |
| 2 | Python | Fundamental concepts of the Python programming language |
| 3 | NumPy | Working with arrays |
| 4 | Pandas | Dataframes for data-intensive computing |
| 5 | Seaborn/Matplotlib | Looking at the data |
| 6 | Galaxy basics | Introduction to the Galaxy platform |
| 7 | Advanced Galaxy | Complex analyses with large datasets |
| 2️⃣ | BLOCK 2 | DATA GENERATION |
| 8 | History of sequencing | From Miescher to Sanger |
| 9 | Array amplification + reversible termination (Illumina) | Short reads and their applications |
| 10 | Single molecule sequencing with ZMWs (PacBio) | Long reads and their applications |
| 11 | Single molecule sequencing with nanopores (ONT) | Longer reads and their applications |
| 12 | Genome sequencing and assembly | From reads to genomes |
| 13 | Transcriptomics: Bulk and SC | What's expressed? |
| 14 | Variation | Detecting differences between populations and individuals |
| 3️⃣ | BLOCK 3 | COMPUTATIONAL BIOLOGY |
| 15 | Alignment | Aligning sequences |
| 16 | Mapping | Mapping sequences |
| 17 | Assembly | Assembling sequences |
| 18 | Stat I | Statistical fundamentals I |
| 19 | Stat II | Statistical fundamentals II |
| 4️⃣ | BLOCK 4 | PUTTING IT ALL TOGETHER |
| 20 | Biology of coronaviruses | nCoV has very sophisticated machinery |
| 21 | Where is the data | Accessing nCoV assemblies and raw reads |
| 22 | Variation analysis | What is the extent of nCoV variation |
| 23 | Reproducibility | Making your analyses reproducible by others |
| 24 | Transparency | Conveying your strategy and results |

Final Project

We will get into specifics of the final project in early March.
