January 13: Introduction and overview of course
Read Chapters 1-3 for Monday Jan 22
Bioinformatics Data Skills by Vince Buffalo
January 15: Set up tools and practice examples in Chapters 1-3
First we will install the following tools, and after that work through some command line exercises.
- iTerm will work for this!
- preferred: PuTTY
- preferred: FileZilla
Command line exercises
January 17: ENCODE data reproducibility and Example datasets
Overview of ENCODE / ChIP and transposons as missing regulatory regions
What is a promoter and transcription factor?
ENCODE data portal
Go through each category and get familiar — we will specifically be looking at:
DNA binding / TF ChIP-seq / K562 / paired-ended
Note that the data can be put in carts
Exploring metadata in Unix
Print out metadata
Select samples, click columns, add control, click on table, and then download the .tsv
ls, head, tail, cat, awk / grep intro
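The commands listed above can be tried on a small stand-in table before touching the real download. This is only a sketch: the column names and accessions below are made up for illustration and do not reproduce the real ENCODE .tsv layout.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Build a tiny stand-in metadata table. Column names and accessions are
# hypothetical; the real ENCODE .tsv has many more columns.
meta="$(mktemp)"
printf 'accession\ttarget\tcell_line\trun_type\n'  >  "$meta"
printf 'EXAMPLE001\tCTCF\tK562\tpaired-ended\n'    >> "$meta"
printf 'EXAMPLE002\tPOLR2A\tK562\tsingle-ended\n'  >> "$meta"
printf 'EXAMPLE003\tCTCF\tHepG2\tpaired-ended\n'   >> "$meta"

ls -lh "$meta"         # file size and permissions
head -n 1 "$meta"      # header line only
tail -n 2 "$meta"      # last two records
cat "$meta"            # whole file

# Number each column name so awk commands can refer to $1, $2, ...
head -n 1 "$meta" | tr '\t' '\n' | cat -n

# grep keeps whole lines that match; awk can test a single column.
k562_lines=$(grep -c 'K562' "$meta")
paired_k562=$(awk -F'\t' '$3 == "K562" && $4 == "paired-ended" {print $1}' "$meta")
echo "K562 rows: $k562_lines"
echo "paired-ended K562 accession: $paired_k562"
```

Numbering the header fields first is a handy habit: it tells you which `$N` to use in awk without counting tabs by eye.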
Let’s figure out the best way to take notes in Markdown!
Evernote, TXT, Atom
January 22: Start Organizing Meta Data into “Sample” File
We will need to make a file with all the sample information we want (hint: the ENCODE portal has it all)
First try in Excel to understand the column numbers
Then let’s try putting together a sample file using awk and grep
Get FASTQ URLs -- Make a sample sheet !!
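One possible way to go from a metadata table to a sample sheet and a URL list is sketched below. The table, its column order, and the example.org URLs are all hypothetical stand-ins; with the real ENCODE file, check the column numbers first (as in the Excel step) and adjust the `$N` references.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
meta="$work/metadata.tsv"

# Stand-in metadata table (hypothetical columns and URLs).
printf 'file_id\tfile_format\ttarget\tcell_line\turl\n'                        >  "$meta"
printf 'FILE001\tfastq\tCTCF\tK562\thttps://example.org/FILE001.fastq.gz\n'    >> "$meta"
printf 'FILE002\tbam\tCTCF\tK562\thttps://example.org/FILE002.bam\n'           >> "$meta"
printf 'FILE003\tfastq\tPOLR2A\tK562\thttps://example.org/FILE003.fastq.gz\n'  >> "$meta"
printf 'FILE004\tfastq\tCTCF\tHepG2\thttps://example.org/FILE004.fastq.gz\n'   >> "$meta"

# Sample sheet: keep the header plus K562 FASTQ rows, and only three columns.
awk -F'\t' -v OFS=',' \
    'NR == 1 || ($2 == "fastq" && $4 == "K562") {print $1, $3, $4}' \
    "$meta" > "$work/samples.csv"

# URL list: grep narrows to K562 lines, awk keeps FASTQ rows and prints the URL.
grep 'K562' "$meta" | awk -F'\t' '$2 == "fastq" {print $5}' > "$work/file.txt"

cat "$work/samples.csv"
cat "$work/file.txt"

n_samples=$(( $(wc -l < "$work/samples.csv") - 1 ))   # minus the header
n_urls=$(wc -l < "$work/file.txt" | tr -d ' ')
echo "samples: $n_samples, urls: $n_urls"
```

A plain list with one URL per line is the format that `wget -i` reads, so the same file can drive the download step.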
Read chapters 4-5
Jan 24: Let’s go get data !
We will each go retrieve an ENCODE data file from our sample sheet.
SFTP, SSH, SCP
wget -i file.txt
Lecture on SFTP / Servers / FIJI
Tour with Michael on Fiji
cloud computing of the future
Class Exercise: Each group presents their favorite 'programs' (e.g., grep, awk, sort, ls) and the five most useful options from the man page.
Jan 27: GITHUB
Brief Overview of Git
Everyone set up a Git account !! 🙂 !!
Class exercise: make a design file by Feb 3
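A minimal sketch of the basic Git cycle (add, commit, push) for a design file follows. To keep it runnable without an account, a local bare repository stands in for GitHub; the file contents, name, and email are placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"

# A local bare repository stands in for the GitHub remote in this sketch.
git init -q --bare "$work/remote.git"
git init -q "$work/project"

# Hypothetical design file; the real column layout is defined by the pipeline.
printf 'group,replicate\nCTCF,1\n' > "$work/project/design.csv"

git -C "$work/project" add design.csv
# Identity is passed inline so the sketch does not depend on global git config.
git -C "$work/project" -c user.name="Student" -c user.email="student@example.org" \
    commit -q -m "Add first draft of design file"

# Point the project at the remote and push the current branch.
git -C "$work/project" remote add origin "$work/remote.git"
git -C "$work/project" push -q origin HEAD

git -C "$work/project" log --oneline
n_remote_commits=$(git -C "$work/remote.git" rev-list --count --all)
echo "commits on remote: $n_remote_commits"
```

With a real GitHub repository the only change is the remote URL given to `git remote add origin`.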
Jan 29: IT lecture on Fiji
Please take notes on the key rules and regulations: the dos and don'ts!
Jan 31: Connect to Fiji
SFTP / SSH: connect via terminal and FileZilla
Set up SSH key
Git push/pull
Precooked class
Feb 3: Regroup and round off basic unix and git exercises
Design File presentations
Discuss and catch up on what we have learned about unix and commands etc
Feb 5: Nextflow / nf-core chipseq
Read the basic documentation and install Nextflow on your PATH!
Feb 7: Nextflow / nf-core chipseq
Set up design and sample files, folder structures, and run.sh
sbatch run.sh
squeue -u X000
scancel jobid
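For orientation, a run.sh could look roughly like the sketch below, which writes a Slurm batch script to a temporary folder and checks its syntax. Everything specific here is an assumption: the resource values, the file names, sending Slurm output to nextflow.out, and the pipeline options (option names such as `--design` differ between nf-core/chipseq releases, so check the documentation for the version you run and the Fiji rules for resource limits).

```shell
#!/usr/bin/env bash
set -euo pipefail

workdir="$(mktemp -d)"
script="$workdir/run.sh"

# Write a minimal Slurm batch script. All values below are placeholders.
cat > "$script" <<'EOF'
#!/bin/bash
#SBATCH --job-name=chipseq
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --mem=8G
#SBATCH --time=24:00:00
#SBATCH --output=nextflow.out
#SBATCH --error=nextflow.err

# Assumed invocation: option names differ between nf-core/chipseq releases.
nextflow run nf-core/chipseq \
    -profile singularity \
    --design design.csv \
    --genome GRCh38 \
    -resume
EOF

# Syntax check only; nothing is submitted from this sketch.
bash -n "$script" && echo "run.sh parses cleanly"

# On the cluster (not run here):
#   sbatch run.sh        # submit; prints the job id
#   squeue -u "$USER"    # list your queued and running jobs
#   scancel <jobid>      # cancel a job by id
```

The `#SBATCH` lines are comments to bash but directives to Slurm, which is why they must come before the first command in the script.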
Familiarize yourself and take notes on file types
Read the Nextflow documentation and the example nextflow.out
Homework: look up the programs used in nextflow.out: FastQC, TrimGalore, BWAMem, SortBAM, MergeBAM, BigWig, MACSCallPeak, PeakQC
Goal: plan and get ready to run the nf-core ChIP-seq pipeline
Time to do our first data analysis! Let's go over some handy principles: sbatch, screen, squeue -u, scancel, grep nextflow.out
Class Exercise: each group presents a file format, where in the pipeline it is produced, and what columns and information the files contain
Feb 12: Intuitive statistics I
Let's cover some of the basic statistics used in the nf-core ChIP-seq pipeline.
Parametric vs. nonparametric data
Probability distributions: Poisson, binomial, hypergeometric, negative binomial, logarithmic
T-test, ANOVA, Wilcoxon/Fisher, Kolmogorov–Smirnov test
Scan statistics, false discovery rate, effect size
Recommended reading: Biometry Chapter 4
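As a worked example of one distribution from the list: peak callers such as MACS compare the read count observed in a window with the count expected from background under a Poisson model. The sketch below computes the upper-tail probability P(X >= k) with awk. The numbers (5 reads observed where 2 are expected) are made up, and the helper name `poisson_tail` is hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Upper-tail Poisson probability P(X >= k) for an expected count lambda.
# Usage: poisson_tail <lambda> <k>
poisson_tail() {
    awk -v lambda="$1" -v k="$2" 'BEGIN {
        term = exp(-lambda)              # P(X = 0)
        cdf = 0
        for (i = 0; i < k; i++) {
            cdf += term                  # add P(X = i)
            term *= lambda / (i + 1)     # step to P(X = i + 1)
        }
        printf "%.4f\n", 1 - cdf         # P(X >= k) = 1 - P(X <= k - 1)
    }'
}

# Made-up numbers: 5 reads observed where 2 are expected by chance.
p_value=$(poisson_tail 2 5)
echo "P(X >= 5 | lambda = 2) = $p_value"   # prints 0.0527
```

Seeing 5 or more reads when 2 are expected happens about 5% of the time by chance, which is why a single window at this level is weak evidence once thousands of windows are tested; that is where the false discovery rate comes in.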
Feb 14: Intuitive Statistics II
We will go over the most used statistics in the nf-core ChIP-seq container. Go over the files and designs and make sure all groups are ready for run.sh!
Class exercise: each group presents a statistical principle and how it is used in nf-core ChIP-seq
Feb 17: Regroup on Nextflow, Git, Sample sheet and Design files
What happened and now what
Feb 19: What is happening during the run and what are the output files
BAM BIGWIG IGV PEAKs -- let's explore the result output !
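Peak files are plain text, so the same Unix tools apply. The sketch below uses a made-up three-line file in the 10-column narrowPeak layout (chrom, start, end, name, score, strand, signalValue, pValue, qValue, summit offset; the pValue and qValue columns hold -log10 values). The coordinates and scores are invented for illustration.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in narrowPeak file; all values are invented.
peaks="$(mktemp)"
printf 'chr1\t1000\t1400\tpeak_1\t250\t.\t8.5\t12.3\t9.8\t180\n'  >  "$peaks"
printf 'chr1\t5000\t5200\tpeak_2\t90\t.\t3.1\t2.4\t0.9\t95\n'     >> "$peaks"
printf 'chr2\t300\t900\tpeak_3\t400\t.\t12.0\t20.1\t17.2\t310\n'  >> "$peaks"

# Peaks per chromosome
cut -f1 "$peaks" | sort | uniq -c

# Peak name and width (end - start)
awk -F'\t' -v OFS='\t' '{print $4, $3 - $2}' "$peaks"

# Peaks whose qValue column is >= 2, i.e. q <= 0.01
n_sig=$(awk -F'\t' '$9 >= 2' "$peaks" | wc -l | tr -d ' ')

# Name of the widest peak
widest=$(awk -F'\t' '($3 - $2) > max {max = $3 - $2; name = $4} END {print name}' "$peaks")

echo "significant peaks: $n_sig"
echo "widest peak: $widest"
```

The same pattern (pick a column, filter, count) works for BED files in general; BAM and bigWig are binary and need dedicated tools or IGV instead.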
Feb 21: Data analysis planning and brainstorming questions to address
Let’s figure out what we want to know about TF regulation on mRNA promoters, lncRNA promoters and LTR promoters
IGV install and visualization of ChIP-seq results !
Feb 24: R and Data analysis visualization
Intro to R
Install R, discuss some basic commands.
Intro to Rviz
ChIPseeker install
Good R tutorial: https://www.youtube.com/watch?v=fDRa82lxzaU
Feb 26: ChIPseeker I -- Setting up R to make first plots of ChIP-seq results
Meta plots of all promoters
Read ChIPseeker documentation
Feb 28: ChIPseeker II -- Setting up R to make first plots of ChIP-seq results
Meta plots of all promoters
Do all DNA binding proteins have the same pattern?
Digging into ChIPseeker
March 2: ChIPseeker III -- Polishing results
Finalizing first analyses
Class exercise: how to compare mRNA, lncRNA and LTR promoter binding properties
March 4: Group Data presentation : suggestions for future experiments
March 6: Make plan to divide up planned analyses
March 9-20 : Run R analyses for TF regulation of mRNA
March 31- April 20: pRactical
Can we use data standards and reproducibility to write a paper on our findings? Let's set up the Paper-Pository on Git