This is a fork of the pandas-cookbook modified to use the polars library instead of pandas.
polars is a Python library for doing data analysis. It's really fast and lets you do exploratory work incredibly quickly.
The goal of this cookbook is to give you some concrete examples for getting started with polars. The docs are really comprehensive. However, I've often had people tell me that they have some trouble getting started, so these are examples with real-world data, and all the bugs and weirdness that entails.
It uses 3 datasets:
- 311 calls in New York
- How many people were on Montréal's bike paths in 2012
- Montréal's weather for 2012, hourly
It comes with batteries (data) included, so you can try out all the examples right away.
- Chapter 1: Reading from a CSV
Reading your data into polars is pretty much the easiest thing. Even when the encoding is wrong!
- Chapter 2: Selecting data & finding the most common complaint type
It's not totally obvious how to select data from a polars dataframe. Here I explain the basics (how to take slices and get columns).
- Chapter 3: Which borough has the most noise complaints? (or, more selecting data)
Here we get into serious slicing and dicing and learn how to filter dataframes in complicated ways, really fast.
- Chapter 4: Find out on which weekday people bike the most with groupby and aggregate
The groupby/aggregate is seriously my favorite thing about polars and I use it all the time. You should probably read this.
- Chapter 5: Combining dataframes and scraping Canadian weather data
This chapter has been omitted due to inactive web URLs.
- Chapter 6: String operations! Which month was the snowiest?
Strings with polars are great. It has all these vectorized string operations and they're the best. We will turn a bunch of strings containing "Snow" into vectors of numbers in a trice.
- Chapter 7: Cleaning up messy data
Cleaning up messy data is never a joy, but with polars it's easier <3
- Chapter 8: Parsing Unix timestamps
This is basically a quick trick that took me 2 days to figure out.
- Chapter 9: Loading data from SQL databases
How to load data from an SQL database into polars, with examples using SQLite3, PostgreSQL, and MySQL.
The easiest way is to try it out instantly online using Binder's awesome service. Start by clicking here, wait for it to launch, then click on "cookbook", and you'll be off to the races! It will let you run all the code interactively without having to install anything on your computer.
To install it locally, you'll need Jupyter notebook and polars on your computer.
You can get these using pip (you may want to do this inside a virtual environment to avoid conflicts with your other libraries):
pip install -r requirements.txt
This can be difficult to set up and may require you to compile a whole bunch of things. I instead use and recommend Anaconda, a Python distribution that will give you everything you need. It's free and open source.
Once you have polars and Jupyter, you can get going!
git clone https://github.com/escobar-west/polars-cookbook.git
A tab should open up in your browser at
This repository contains a Dockerfile and can be built into a Docker image. To build the image, run the following command from inside the repository directory:
docker build -t escobar-west/polars-cookbook -f Dockerfile-Local .
Then run the container:
docker run -d -p 8888:8888 -e "PASSWORD=MakeAPassword" <IMAGE ID>
You can find the ID of the image by checking
After starting the container, you can access the Jupyter notebook with the cookbook on port 8888.
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License