Presentation material for PyCon Sweden 2015
Python in Life Sciences: How Python Drives the Analysis of Billions of DNA Sequences
A genomics research center produces large amounts of data every day; a single one of the new Illumina sequencing machines can produce around 2 TB of data, spread across millions of files, in under 3 days.
The first part will focus on how Python manages the preprocessing and analysis of billions of DNA sequences in a completely automated way. We will also cover how sequencing results are visualized using Flask and MongoEngine to solve medical mysteries in the clinic today.
Any Python programmer with an interest in how Python is applied to genomics, a growing field within the life sciences.
Any scientist with an interest in how other labs manage the complex data flow and analysis of a genomics facility.
Attendees will learn about a state-of-the-art genomics pipeline and how we use Python to manage, store, and analyze large amounts of biologically significant data.
They will also get a peek behind the website that allows clinicians to productively review results from DNA sequencing and make life-changing diagnoses.
The presentation is structured into three parts.
Each part has a slides.md file that accompanies the slides for that part, which are located in the same directory under the name
The complete slide set is located here.