This repository contains material for the tutorial I presented at the EuroSciPy 2024 conference in Szczecin.
The tutorial covers the following topics:
- DataFrames as Panels of Data
- Create DataFrames
- Work With Tidy Data
- Manipulate DataFrames
- Share Results and Insights
- Lazy DataFrames
The workshop consists of 90 minutes of live code demonstrations and hands-on exercises.
The demos and examples use three public datasets:
- data/postal_codes.csv: Postal codes in Poland
- data/billboard_songs.csv and data/billboard_ranks.csv: Top 100 songs on Billboard in 2000
- data/schedule.csv: Workshop schedule
See data/ for more information, including licenses and links to the original datasets.
You should create a virtual environment and install Polars and other necessary dependencies.
Note: The demonstrations were run on Ubuntu Linux with Python 3.12.5 and the package versions specified in requirements.txt.
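If you want to compare your own setup with the one mentioned in the note above, a small script like the following may help. It is only a sketch that uses the standard library: the helper names `installed_version` and `describe_environment` are made up for this example and are not part of the tutorial material, and checking only `polars` is an assumption (pass other package names as needed).

```python
from __future__ import annotations

import sys
from importlib import metadata

# Python version used for the demonstrations, according to the note above
DEMO_PYTHON = (3, 12, 5)


def installed_version(package: str) -> str | None:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def describe_environment(packages: tuple[str, ...] = ("polars",)) -> dict[str, str | None]:
    """Collect the running Python version and the versions of the given packages."""
    report: dict[str, str | None] = {
        "python": ".".join(str(part) for part in sys.version_info[:3])
    }
    for package in packages:
        report[package] = installed_version(package)
    return report


if __name__ == "__main__":
    for name, version in describe_environment().items():
        print(f"{name}: {version or 'not installed'}")
    if tuple(sys.version_info[:3]) != DEMO_PYTHON:
        # A different version is usually fine; this is only informational
        print("Note: the demos were run with Python 3.12.5")
```

Run it inside your activated environment. A different Python patch version is unlikely to matter, so treat the output as information only.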
If you're running Anaconda or Miniconda, you should set up a separate environment for this tutorial. You can use conda to do so:
$ conda env create -n euroscipy-polars -f environment.yml
$ conda activate euroscipy-polars
Remember to activate your Conda environment.
If you're using a plain Python distribution, then you can use venv to create a virtual environment:
$ python -m venv venv
$ source venv/bin/activate
(venv) $ python -m pip install -r requirements.in
On Windows, you don't need source when activating your virtual environment. You can type venv\Scripts\activate instead.
The workshop mostly consists of live code demonstrations. You can find simple notes from the demos in the file polars_introduction.py. Use jupytext to convert the notes to a Jupyter Notebook if you prefer.
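One possible way to do the conversion from Python is sketched below. It assumes that jupytext is installed in your active environment and available on your PATH, and it calls the jupytext command-line tool with its `--to notebook` option. The function names `jupytext_command` and `convert_to_notebook` are hypothetical helpers written for this example.

```python
import shutil
import subprocess
from pathlib import Path


def jupytext_command(script: str) -> list[str]:
    """Build the jupytext call that converts a script to a notebook."""
    return ["jupytext", "--to", "notebook", script]


def convert_to_notebook(script: str = "polars_introduction.py") -> Path:
    """Convert the notes to a Jupyter Notebook and return the expected output path."""
    if shutil.which("jupytext") is None:
        raise RuntimeError("jupytext was not found; install it in your environment first")
    # check=True raises CalledProcessError if jupytext reports a failure
    subprocess.run(jupytext_command(script), check=True)
    # jupytext writes the notebook next to the script, with an .ipynb suffix
    return Path(script).with_suffix(".ipynb")
```

Calling `convert_to_notebook()` from the repository root should leave a polars_introduction.ipynb file next to the notes, which you can then open in Jupyter.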
Demonstration code, exercises, and solutions are licensed under an MIT license.