
Full programme

Andrea Telatin edited this page Aug 4, 2021 · 3 revisions

Day 1: The first tour

A brief introduction to the core concepts of the command line as an environment for bioinformatics analysis. In a hands-on tutorial, each participant will log in to a remote Linux server from their own laptop and try the first commands. Our main goal is to set up your clients so that you can access the workshop off-site!

  • What is a terminal: the "terminal prompt"
  • Accessing a remote server using ssh (Mac or Linux) or the program PuTTY (from Windows)
  • Using screen to set up a persistent session on a remote server
  • The filesystem: using pwd and ls to interact with it from the terminal. Tab completion.
  • mkdir to create directories
  • A command-line text editor: nano
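
The filesystem commands listed above can be tried safely in a throwaway directory. The sketch below is only an illustration: the directory names (workshop, day1, data) are made up for this example. Logging in with ssh and starting screen are left out because they need a real server and account.

```shell
# Work inside a temporary directory so nothing in your home is touched.
workdir=$(mktemp -d)
cd "$workdir"

pwd                           # print the current (working) directory
mkdir workshop                # create a new directory
mkdir -p workshop/day1/data   # -p also creates any missing parent directories
ls                            # list the contents of the current directory
ls -l workshop                # long listing of a specific directory

# Keep the listing in a variable so it can be inspected later.
listing=$(ls workshop)
echo "$listing"
```

Typing the first letters of a name and pressing Tab (for example `cd wor` followed by Tab) lets the shell complete the rest, which saves typing and avoids spelling mistakes.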

Day 2: Basic commands

Understanding the "file system", relative and absolute paths, and the commands to organize the directories, to list files, to copy and move files and directories. Introduction to commands to interact with text files and view them. We will use the FASTA and FASTQ file formats as examples.

  • ls (with some parameters), wildcards, cd, rmdir, find
  • Interactive visualization of text files: less
  • Viewing text files with cat, head and tail
  • Counting lines and characters with wc
  • Selecting lines with patterns using grep
  • Redirecting a command output into a text file
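
As a minimal sketch of how these commands fit together, the example below builds a tiny FASTA file and inspects it. The file names and the sequences are invented for illustration and are not part of the workshop datasets.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Create a small FASTA file (made-up sequences) to practise on.
printf '>seq1\nACGTACGT\n>seq2\nGGGTTTAA\n>seq3\nTTGACA\n' > example.fasta

head -n 2 example.fasta     # show the first two lines
tail -n 2 example.fasta     # show the last two lines
wc -l example.fasta         # count the lines in the file

# Quote the pattern: an unquoted > would be read by the shell as a redirection.
grep '>' example.fasta      # print only the header lines
grep -c '>' example.fasta   # count the header lines, i.e. the sequences

# Redirect the output of a command into a new text file.
grep '>' example.fasta > headers.txt

n_seqs=$(grep -c '>' example.fasta)
n_lines=$(wc -l < example.fasta)
echo "$n_seqs sequences in $n_lines lines"
```

Note that `>` overwrites the destination file if it already exists, while `>>` appends to it.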

Day 3: Extracting data from text files

Using terminal commands to interact with bioinformatics files. We'll introduce the SAM file format used to store NGS mapping, the VCF format (for SNP calling), and some tabular annotation files (GFF, GTF).

  • Recap of previous commands, applied to SAM files
  • Downloading datasets with wget
  • The sort, uniq, cut commands
  • Joining tabular datasets with join
  • Command pipes: combining multiple commands
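
The sketch below shows one way these commands can be combined on tabular data. The two tables (gene names, chromosomes, descriptions) are invented for the example; downloading with wget is left out because it needs a real URL.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Two made-up, tab-separated tables sharing the gene name in column 1.
printf 'geneB\tchr2\ngeneA\tchr1\ngeneC\tchr1\n' > genes.tsv
printf 'geneA\tkinase\ngeneB\ttransporter\ngeneC\tprotease\n' > annot.tsv

# A pipe: take column 2, sort it, then count how often each value occurs.
# uniq only collapses adjacent duplicates, so the input must be sorted first.
cut -f 2 genes.tsv | sort | uniq -c

# join expects both inputs to be sorted on the joining field.
sort genes.tsv > genes.sorted.tsv
sort annot.tsv > annot.sorted.tsv

# -t sets the field separator to a tab character.
join -t "$(printf '\t')" genes.sorted.tsv annot.sorted.tsv > joined.tsv
cat joined.tsv

n_chr=$(cut -f 2 genes.tsv | sort | uniq | wc -l)
echo "$n_chr distinct chromosomes"
```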

Day 4: Installing packages and parsing FASTA/FASTQ files

  • Installing Miniconda to manage packages on Linux or macOS
  • The FASTA and FASTQ formats
  • Common manipulations, conversions and statistics of sequence files
  • Quality filtering and QC using fastp
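
As a rough illustration of a simple statistic and a conversion, the sketch below counts the reads in a FASTQ file and converts it to FASTA using only standard tools. It assumes every record spans exactly four lines (no wrapped sequences); the reads are made up, and awk is an assumption of this example rather than a tool listed in the programme. Dedicated packages are usually a more robust choice for real data.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# A made-up FASTQ file with two reads: header, sequence, separator, qualities.
printf '@read1\nACGT\n+\nIIII\n@read2\nGGCCTT\n+\nIIIIII\n' > example.fastq

# With four lines per record, the number of reads is the line count divided by 4.
n_lines=$(wc -l < example.fastq)
n_reads=$((n_lines / 4))
echo "$n_reads reads"

# FASTQ to FASTA: keep lines 1 and 2 of each record,
# replacing the leading @ of the header with >.
awk 'NR % 4 == 1 { print ">" substr($0, 2) }
     NR % 4 == 2 { print }' example.fastq > example.fasta
cat example.fasta
```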

Day 5: The bioinformatician's perspective

Using short-read alignment as a theme, we'll introduce how to install new software for use from the command line, users and file permissions, and of course the alignment program bwa and samtools, the Swiss Army knife for manipulating SAM files.

  • Introduction to the SAM/BAM formats
  • Using bwa to align short reads
  • Using samtools to operate with SAM/BAM files
  • Introduction to version-controlled repositories and git