Python Computing for Data Science
A Graduate Seminar Course at UC Berkeley (AY 250)
Campbell Hall: Monday 2 - 5 PM SPRING 2018
Python has become the de facto superglue language for modern scientific computing. In this course we will learn Pythonic interactions with databases, image processing, advanced statistical and numerical packages, web frameworks, machine learning, and parallelism. Each week will involve lectures and coding projects. In the final project, students will build a working codebase useful for their own research domain.
This class is for any student working in a quantitative discipline who is familiar with Python. Those who completed the Python Bootcamp or equivalent will be eligible. You should follow the steps to install the Anaconda 3.6.X distribution as well as
|Jan 22||Advanced Python Language Concepts (decorators, OrderedDict,
Generators, Iterables, Context Managers)
- scipy §2.1
|Jan 29||Pandas, SciPy, & NumPy||- scipy §§ 1.3, 1.5, 2.2
- skim chap 4/5 of McKinney
|Feb 5||Data visualization (Matplotlib, Bokeh, Altair, Plotly, mayavi)||- Skim Tufte's visualization book
- colormap talk (Scipy 2015)
|Feb 12||Interacting with the world (requests, email, IoT/pyserial)||None||Josh|
|Feb 19||Holiday (no class)|
|Feb 26||Parallelism (asyncio, dask, IPython cluster)||- [ipyparallel docs](http://ipyparallel.readthedocs.io/en/latest/intro.html)||Josh|
|Mar 5||Database interaction (sqlite, postgres, SQLAlchemy, peewee),
Large datasets (xarray, HDF5)
|Mar 12||Machine Learning I (sklearn, NLP)
NOTE: 3:10pm start!
|Mar 19||Machine Learning II (keras [tensorflow])||None||Josh|
|Mar 26||Spring Break|
|Apr 2||Image processing (OpenCV, skimage)||None||Stefan van der Walt|
|Apr 9||Web frameworks & RESTful APIs, Flask||None||Josh|
|Apr 16||Bayesian programming & Symbolic math||Probabilistic Programming eBook
pip install pymc3
|Apr 23||Speeding it up (Numba, Cython, wrapping legacy code)||TBD||Josh|
|Apr 30/Onward||Final project work|
Throughout these lectures we will be peppering in sidebar knowledge concepts:
- Jupyter & JupyterLab
- using git & github
- Data science workflows
- reproducible research
- application building
Each Monday we will introduce a reasonably self-contained topic in two back-to-back lectures, separated by a short (~20 minute) breakout coding session. Homeworks will require you to write a large (several hundred line) codebase.
Help sessions will be conducted interactively on the Piazza site for the course. There is also an in-person help session every Tuesday from 11am-noon at BIDS (in Doe Library). Email Josh with any questions.
Email us at firstname.lastname@example.org or contact the professor directly (email@example.com). You can also contact the GSI, Chelsea Harris, at firstname.lastname@example.org. Auditing is not permitted by the University, but those wishing to sit in on a class or two should contact the professor before attending.