---
site: sandpaper::sandpaper_site
---

Introduction to Data Analysis in Python

This is a 4-session introductory Python workshop hosted by the Center for Computational Biomedicine at Harvard Medical School.

Learning Objectives:

  • Apply fundamentals of Python programming including variables, expressions, loops, and conditional statements.
  • Create modular Python programs which handle file I/O, string manipulation, and object manipulation.
  • Manage Conda environments and look up documentation to use built-in and third-party Python packages.
  • Manipulate data and objects using Python and Pandas.
  • Visualize data in Python using Matplotlib and Seaborn.

Participants will also be introduced to popular packages for machine learning and bioinformatics analyses in Python, and pointed towards resources to continue their learning.

Week 1 Schedule

Session 1: April 18 from 2:00pm - 4:00pm

Self-Guided Sessions 1

Session 2: April 20 from 2:00pm - 4:00pm

Self-Guided Sessions 2

Week 2 Schedule

Session 3: April 25 from 2:00pm - 4:00pm

Self-Guided Sessions 3

Session 4: April 27 from 2:00pm - 4:00pm

::::::::::::::::::: instructor

Day Schedule

Current status: Days 2 and 3 lessons still need a final pass for polish and small fixes.

  • We need to add instructions for using Conda environments in Jupyter.
  • The plotting lesson (not pushed yet) covers Matplotlib but still needs Seaborn and more biological examples, such as heatmaps.
  • The application example lessons on days 3 and 4 have not been started, though I have existing notebooks I'm planning to base them on.

Day 1

  • Introduction (20 min)
  • Variables (25 min)
  • Break (10 min)
  • Types (25 min)
  • Built-in functions (30 min)
  • Wrap-up (10 min)

At-home Lessons 1

  • Strings
  • Objects

Day 2

  • Lists (45 min)
  • Loops (25 min)
  • Break (10 min)
  • Libraries (20 min)
  • Loading Data (20 min)

At-home Lessons 2

  • Conda
  • Dictionaries

Day 3

  • Conditionals (35 min)
  • Dataframes (40 min)
  • Break (10 min)
  • Writing Functions (25 min)
  • Using scipy for statistics (10 min)

At-home Lessons 3

  • Data Wrangling Practice
  • Preparing data for plotting

Day 4

  • Plotting with Seaborn and Matplotlib (60 min): half-done
  • Break (10 min)
  • Using scikit-learn for machine learning (20 min): half-done
  • Using BioMart to get gene annotations (15 min): not started
  • Wrap-up and next steps (15 min)

::::::::::::::::::::::::::::::

:::::::::: prereq

  • Please follow the steps found here to install the needed software and data for this workshop.

:::::::::::::::::

Materials for this workshop are based on the following lessons:

Learn to Discover Basic Python

Programming with Python

Plotting and Programming with Python

Introduction to Conda for (Data) Scientists

Introduction to data analysis with R and Bioconductor

This lesson is made from The Carpentries Workbench template.