Going with Python
As a useR, you will at least hear about Python from time to time, and sometimes (not always) with good reason: some things are worth doing in Python rather than in R. Here is a list of resources to get you started. (It does not cover setting up your Python environment.)
For an overview of functions that achieve the same thing as the tidyverse ones, see this conversion guide.
Currently missing:
- An introduction to Web scraping with BeautifulSoup, lxml, requests etc.
Ani Adhikari, John DeNero and David Wagner, Computational and Inferential Thinking: The Foundations of Data Science, 2022
Textbook for the Data 8: Foundations of Data Science course at UC Berkeley. Use both the course and the textbook as a starting point for the 'Data 100' course (and textbook) mentioned below on this page.
Allen B. Downey, Elements of Data Science, 2022
Quoting from the homepage: "an introduction to data science for people with no programming experience. My goal is to present a small, powerful subset of Python that allows you to do real work in data science as quickly as possible."
Allen B. Downey, Think Stats. Exploratory Data Analysis in Python, 2014
Free book. Almost ten years old, but still useful.
Dirk Hovy, Text Analysis in Python for Social Scientists. Prediction and Classification, 2022
A recent book that should help with the specific areas of text mining and text analysis with Python, although I am adding it to this list without having had a chance to take a proper look at its actual contents.
Jacqueline Kazil and Katharine Jarmul, Data Wrangling with Python, 2016
A book that covers data import, data wrangling, and Web scraping using the Scrapy library, as well as APIs. The list of appendices looks very useful.
Sam Lau, Joey Gonzalez, and Deb Nolan, Learning Data Science, f. 2023
Forthcoming book based on the Principles and Techniques of Data Science ('Data 100') course at UC Berkeley. Assumes that you already know Python, which you will if you first take the 'Data 8' course mentioned above.
Andreas C. Müller, Sarah Guido, Introduction to Machine Learning with Python, 2016
All the essentials, in just one book.
Sebastian Raschka and Vahid Mirjalili, Python Machine Learning, 2019
Extensive book. The first author also has a full course for you: Machine Learning (University of Wisconsin-Madison, 2018).
Jake VanderPlas, Python Data Science Handbook, 2016
This book covers essential Python data science modules: NumPy, Pandas, Matplotlib, and machine learning with Scikit-learn. The only other module that you will actually need is statsmodels.
Tim Hopper, Python Plotting for Exploratory Data Analysis, 2020
Basic plots, using matplotlib or other libraries built on top of it.
Rafe Kettler, A Guide to Python's Magic Methods, 2012
Python has this __thing__ called magic methods. Check out how they work.
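To give a rough idea of what magic methods do before reading the guide, here is a minimal sketch. The `Vector2D` class is hypothetical and exists only for illustration; the double-underscore method names themselves are part of the standard Python data model.

```python
class Vector2D:
    """Hypothetical 2-D vector, used only to illustrate magic methods."""

    def __init__(self, x, y):
        # Called when you write Vector2D(1, 2).
        self.x = x
        self.y = y

    def __repr__(self):
        # Called by repr(), and by print() when __str__ is not defined.
        return f"Vector2D({self.x}, {self.y})"

    def __eq__(self, other):
        # Called by the == operator.
        if not isinstance(other, Vector2D):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __add__(self, other):
        # Called by the + operator.
        if not isinstance(other, Vector2D):
            return NotImplemented
        return Vector2D(self.x + other.x, self.y + other.y)

    def __len__(self):
        # Called by len().
        return 2

    def __getitem__(self, index):
        # Called by square-bracket indexing, e.g. v[0].
        return (self.x, self.y)[index]


v = Vector2D(1, 2) + Vector2D(3, 4)
print(v)       # Vector2D(4, 6)
print(len(v))  # 2
print(v[0])    # 4
```

If you are coming from R, this is loosely comparable to defining S3 methods such as `print.myclass` or `Ops.myclass`, except that the methods live inside the class definition.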
Lev Maximov, Pandas Illustrated: The Definitive Visual Guide to Pandas
Data wrangling with the Pandas package. If you are coming from R, see also Conor MM's R to Python [pandas] useful data wrangling snippets.
Ines Montani, Advanced NLP with spaCy (n.d.)
A full course on text mining and natural language processing with one of the best Python libraries around.
Guillaume Plique et al., minet: Web Mining Library and Command Line Tool Written in Python
A tool made by the médialab at Sciences Po, Paris. Check the GitHub repository for the full documentation.
Jake VanderPlas, A Whirlwind Tour of Python, 2016
This tutorial covers the essentials of the Python language. It is intended for people familiar with another language.
Ethan Swan and Bradley Boehmke, Intro to Python for Data Science Workshop, c. 2022
Very easy to follow. View it as Binder notebooks, or follow the slides.
Stefan McCabe, Programming with Data
Harder to follow, but lists many interesting examples for replication.
Thomas J. Sargent and John Stachurski, Quantitative Economics with Python
Three courses for the price of one (free).
- Kim Antunez, Lino Galiana, Python pour la data science, 2023
- Ewen Gallic, Python pour les économistes, 2018
The following list was sent by a student (thanks Urjasvi) who was looking for Python courses from providers that deliver completion certificates:
- Coursera: Charles Russell Severance, Python for Everybody Specialization
- Coursera: Paul Resnick and Steve Oney, Python Basics
More might be available via e.g. DataCamp and edX.
Based on a few bookmarks dating back to 2017-03-15 (I did not dig into the older ones).