rvenkat82 edited this page Nov 11, 2012 · 96 revisions

PyCon Canada - November 10th & 11th 2012

Sign Up For Tutorials

The tutorials at PyCon Canada have very limited capacity. If you're interested in attending one of them, please sign up in advance below. Note that you must already have a ticket to PyCon Canada 2012 to attend.

We will also be using this list to send out any tutorial requirements (things that you should have installed on your laptop before arriving at the tutorial).

HDF5 is for Lovers

by Anthony Scopatz (90 minute tutorial)

Saturday, November 10th 2012 at 10:25 am


HDF5 is a hierarchical, binary database format that has become a de facto standard for scientific computing. While the specification may be used in a relatively simple way (persistence of static arrays) it also supports several high-level features that prove invaluable. These include chunking, ragged data, extensible data, parallel I/O, compression, complex selection, and in-core calculations.

This tutorial will discuss tools, strategies, and hacks for really squeezing every ounce of performance out of HDF5 in new or existing projects. It will also go over fundamental limitations in the specification and provide creative and subtle strategies for getting around them. Overall, this tutorial will show how HDF5 plays nicely with all parts of an application making the code and data both faster and smaller. With such powerful features at the developer's disposal, what is not to love?!

This tutorial is targeted at a more advanced audience with prior knowledge of Python and NumPy. Knowledge of C or C++ and basic HDF5 is recommended but not required. This tutorial should be taken in conjunction with David Warde-Farley's Introduction to Numerical and Scientific Computing with Python tutorial below.

Software requirements:

Recent versions of PyTables and its dependencies (HDF5, NumPy, and numexpr) are required for this tutorial. Matplotlib and IPython are also highly recommended. For browsing HDF5 files, either ViTables (http://vitables.org/) or HDFView (http://www.hdfgroup.org/hdf-java-html/hdfview/) is quite useful.

To aid in the installation process, there are several free distributions of Python that include all or most of the packages covered in a one-click installer format. Anaconda CE (Community Edition) from Continuum Analytics is a good choice, as it includes PyTables (and its dependencies) out of the box. EPD Free contains NumPy, IPython and Matplotlib, but the other packages will need to be installed by the user. The full EPD is also available for free for academics and contains everything needed for this tutorial. Windows users may also have some luck with Python(x, y), though the instructor has no experience with this distribution.

Add your name here:

  1. Anthony Scopatz (Instructor)
  2. Stefan Wiechula
  3. Nasser M. Abukhdeir
  4. Christopher Ing
  5. Greg Wilson
  6. Fernando Perez
  7. Scott Rostrup
  8. David Warde-Farley
  9. Ram Venkat
  10. Ramsey D'silva
  11. Zhuyi Xue
  12. Eric Anderson
  13. Edwin Frondozo
  14. Taavi Burns


Introduction to Numerical and Scientific Computing with Python

by David Warde-Farley (3 hour tutorial)

Saturday, November 10th 2012 at 1:05 pm


Intended audience: beginning to intermediate Python programmers, and anyone who has written Python applications and feels they might need number-crunching facilities.

This tutorial will offer an introduction to numerical data processing and scientific computing using Python. The tutorial will revolve around NumPy, the fundamental package for scientific computing with Python, and introduce users to its use not only in implementing numerical algorithms but in interfacing with legacy systems and C libraries. We will touch on scientific visualization with matplotlib as well as other "general interest" packages such as SciPy and scikit-learn, and discuss and demonstrate strategies for writing high-performance numerical code that can be easily integrated into larger Python-based applications.
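As a rough illustration of the kind of numerical code the abstract above refers to, here is a minimal sketch comparing an explicit Python loop with the equivalent vectorized NumPy expression. It is not taken from the tutorial materials; the function names and the sample data are made up for this example.

```python
import numpy as np


def rms_loop(values):
    """Root-mean-square computed with an explicit Python loop."""
    total = 0.0
    for v in values:
        total += v * v
    return (total / len(values)) ** 0.5


def rms_numpy(values):
    """Same quantity computed with vectorized NumPy operations."""
    arr = np.asarray(values, dtype=np.float64)
    # arr * arr squares every element at once; np.mean reduces in C.
    return float(np.sqrt(np.mean(arr * arr)))


# Illustrative data only (hypothetical, not from the tutorial).
samples = [3.0, 4.0, 12.0, 84.0]
loop_result = rms_loop(samples)
numpy_result = rms_numpy(samples)
```

Both functions should agree to within floating-point rounding. On large arrays the vectorized version is usually much faster because the per-element work happens inside NumPy rather than in the interpreter.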

Software requirements:

Recent versions of NumPy and Matplotlib will suffice for most of the tutorial. Cython will also be covered and will be useful for the later advanced portions (note that this requires a C compiler). IPython and its optional dependencies pyzmq and libzmq are highly recommended.

There are several free distributions of Python that include all or most of the packages covered in a one-click installer format. The most comprehensive of these is Anaconda CE (Community Edition) from Continuum Analytics: it includes all the packages touched on in this tutorial as well as PyTables, required for Anthony Scopatz's HDF5 tutorial above. It is available for Windows, Mac and Linux.

EPD Free, a free community edition of the Enthought Python Distribution, contains everything in this tutorial except for Cython. High school and postsecondary students and staff are also eligible for a free academic license for the full version of EPD courtesy of Enthought. Both options are available for Windows, Mac and Linux.

Windows users may also have some luck with Python(x, y), though the instructor has no experience with this distribution.

Add your name here:

  1. David Warde-Farley (Instructor)
  2. Ashwin Panchapakesan
  3. Andrey Paramonov
  4. Gerrat Rickert
  5. Vid Ayer
  6. Chris Cooper
  7. Jonathan Dobson
  8. Mike Pettypiece
  9. Nasser M. Abukhdeir
  10. Annika Hillebrandt
  11. Yanshuai Cao
  12. David Kua
  13. Chris Fournier
  14. Jeremy Banks
  15. Fernando Perez
  16. Jason Cornell
  17. Simon Ditner
  18. Ram Venkat


  1. Edwin Frondozo
  2. Olivier Yiptong
  3. Ramsey D'silva
  4. Mahmoud Hashim
  5. Terence Lo

Fast, Faster, Fastest: Getting the Best Performance From Python

by Greg Ward (2 hour tutorial)

Sunday, November 11th 2012 at 12:55 pm


Find and fix your performance bottlenecks. Where should you spend your time so your users don't have to spend theirs waiting for your code? Topics covered: algorithmic complexity ("big O" notation); using the right algorithm for the job; profiling to find the hot spots; micro-optimization tricks; caching vs. computing; storage hierarchies; and when/how you should turn to C.
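A small sketch of three of the topics listed above (choosing the right algorithm, profiling to find hot spots, and caching vs. computing), using only the standard library. This is an illustration under assumptions, not the instructor's material: the function names, the `profile` helper, and the input data are all hypothetical, and the code targets Python 3.2 or later (`functools.lru_cache` is not in 2.7).

```python
import cProfile
import io
import pstats
from functools import lru_cache


def count_common_slow(a, b):
    """O(len(a) * len(b)): every 'in' test scans the list b."""
    return sum(1 for x in a if x in b)


def count_common_fast(a, b):
    """O(len(a) + len(b)): set membership is O(1) on average."""
    lookup = set(b)
    return sum(1 for x in a if x in lookup)


@lru_cache(maxsize=None)
def fib(n):
    """Naive recursion made linear-time by caching earlier results."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def profile(func, *args):
    """Hypothetical helper: run func under cProfile, return (result, report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


# Illustrative inputs: even numbers and multiples of three below 2000.
a = list(range(0, 2000, 2))
b = list(range(0, 2000, 3))

slow_result, slow_report = profile(count_common_slow, a, b)
fast_result, fast_report = profile(count_common_fast, a, b)
fib_30 = fib(30)
```

Printing `slow_report` and `fast_report` shows where each version spends its time. The two functions return the same count, but the list-based version does far more work per element, which is the kind of difference a profiler makes visible before any micro-optimization is attempted.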

Software requirements:

  • Python interpreter (2.7 or 3.2; bonus points if you have both)
  • text editor
  • Cython (optional)

Add your name here:

  1. Greg Ward (Instructor)
  2. Javier de la Rosa (@versae)
  3. Cameron Davidson-Pilon
  4. Todd Whiteman
  5. Martine Vong
  6. Julien Vong
  7. Matt Okura
  8. Mike Pettypiece
  9. Yanshuai Cao
  10. Ashwin Panchapakesan
  11. Chris Boothe
  12. Stefan Wiechula
  13. Trevor Bekolay
  14. Steve Singer
  15. Fernando Perez
  16. Alan Boudreault
  17. Jason Cornell
  18. David Warde-Farley
  19. Evan Hicks


  1. Anthony Scopatz
  2. Ramsey D'silva
  3. Zach Aysan
  4. Russell Warren
  5. Amrik Singh
  6. Matt Ruten
  7. Zhuyi Xue
  8. Terence Lo
  9. Ye Liu