Going with Python

François Briatte edited this page May 3, 2024 · 11 revisions

As a useR, you will at least hear about Python from time to time, and sometimes (not always) with good reason: there are things that are worth doing in Python rather than in R. Here is a list of resources to get you started. (It does not cover setting up your Python environment.)

For an overview of functions that achieve the same thing as the tidyverse ones, see this conversion guide.

Currently missing:

  • An introduction to Web scraping with BeautifulSoup, lxml, requests etc.

Handbooks

Ani Adhikari, John DeNero and David Wagner, Computational and Inferential Thinking: The Foundations of Data Science, 2022

Textbook for the Data 8: Foundations of Data Science course at UC Berkeley. Use both the course and the textbook as a starting point for the 'Data 100' course (and textbook) mentioned below on this page.

Allen B. Downey, Elements of Data Science, 2022

Quoting from the homepage: "an introduction to data science for people with no programming experience. My goal is to present a small, powerful subset of Python that allows you to do real work in data science as quickly as possible."

Allen B. Downey, Think Stats. Exploratory Data Analysis in Python, 2014

Free book. Almost ten years old, but still useful.

Dirk Hovy, Text Analysis in Python for Social Scientists. Prediction and Classification, 2022

A recent book that should help with the specific areas of text mining and text analysis with Python, although I am adding it to this list without having had a chance to take a proper look at its actual contents.

Jacqueline Kazil and Katharine Jarmul, Data Wrangling with Python, 2016

A book that covers data import, data wrangling, and Web scraping using the Scrapy library, as well as APIs. The list of appendices looks very useful.

Sam Lau, Joey Gonzalez, and Deb Nolan, Learning Data Science, f. 2023

Forthcoming book based on the Principles and Techniques of Data Science ('Data 100') course at UC Berkeley. Assumes you already know Python, which you will if you take the 'Data 8' course mentioned above before taking this one.

Andreas C. Müller, Sarah Guido, Introduction to Machine Learning with Python, 2016

All the essentials, in just one book.

Sebastian Raschka and Vahid Mirjalili, Python Machine Learning, 2019

Extensive book. The first author also has a full course for you: Machine Learning (University of Wisconsin-Madison, 2018).

Jake VanderPlas, Python Data Science Handbook, 2016

This book covers essential Python data science modules: NumPy, Pandas, Matplotlib, and machine learning with Scikit-learn. The only other module that you will actually need is statsmodels.

Tutorials

Tim Hopper, Python Plotting for Exploratory Data Analysis, 2020

Basic plots, using matplotlib or other libraries built on top of it.

Rafe Kettler, A Guide to Python's Magic Methods, 2012

Python has this __thing__ called magic methods. Check out how they work.
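To give a rough idea of what the guide covers, here is a minimal sketch of a few magic methods at work. The `Vector` class is a made-up example for illustration, not something taken from the guide itself; the method names (`__init__`, `__repr__`, `__eq__`, `__add__`, `__len__`) are standard Python.

```python
class Vector:
    """A tiny 2D vector, used only to illustrate magic methods."""

    def __init__(self, x, y):
        # Called when you write Vector(1, 2)
        self.x = x
        self.y = y

    def __repr__(self):
        # Called by repr(), and by print() when __str__ is not defined
        return f"Vector({self.x}, {self.y})"

    def __eq__(self, other):
        # Called by the == operator
        if not isinstance(other, Vector):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        # Called by the + operator
        if not isinstance(other, Vector):
            return NotImplemented
        return Vector(self.x + other.x, self.y + other.y)

    def __len__(self):
        # Called by len(); here, the number of coordinates
        return 2


v = Vector(1, 2) + Vector(3, 4)
print(v)  # Vector(4, 6)
```

If you are coming from R, this is loosely comparable to defining S3 methods such as `print.myclass` or `"+.myclass"`, except that the methods live inside the class definition.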

Lev Maximov, Pandas Illustrated: The Definitive Visual Guide to Pandas

Data wrangling with the Pandas package. If you are coming from R, see also Conor MM's R to Python [pandas] useful data wrangling snippets.

Ines Montani, Advanced NLP with spaCy (n.d.)

A full course on text mining and natural language processing with one of the best Python libraries around.

Guillaume Plique et al., minet: Web Mining Library and Command Line Tool Written in Python

A tool made by the médialab at Sciences Po, Paris. Check the GitHub repository for the full documentation.

Jake VanderPlas, A Whirlwind Tour of Python, 2016

This tutorial covers the essentials of the Python language. It is intended for people familiar with another language.

Courses

Ethan Swan and Bradley Boehmke, Intro to Python for Data Science Workshop, c. 2022

Very easy to follow. View it as Binder notebooks, or follow the slides.

Stefan McCabe, Programming with Data

Harder to follow, but lists many interesting examples for replication.

Thomas J. Sargent and John Stachurski, Quantitative Economics with Python

Three courses for the price of one free.

Courses in French

Courses with certificates

The following list was sent by a student (thanks Urjasvi) who was looking for Python courses from providers that deliver completion certificates:

More might be available via e.g. DataCamp and edX.


Based on a few bookmarks dating back to 2017-03-15 (I did not dig into the older ones).