Material for my PyData Jupyter & Pandas workshops. I'm also available for in-house training on request.

Analytics with Pandas and Jupyterlab

Follow-Along tutorial to get you started.

Pandas is the Swiss multipurpose knife for data analysis in Python. With Pandas, data analysis is easy and simple, but there are some things you need to get your head around first, such as Data-Frames and Data-Series.

The tutorial provides a compact introduction to Pandas for beginners, covering I/O, data visualisation, statistical data analysis and aggregation within Jupyter notebooks.

Binder

Run Jupyterlab in the cloud (requires internet access).

Installation

Local Installation

Copy this repository to your computer:

# get this repository
git clone https://github.com/alanderex/pydata-pandas-workshop.git
cd pydata-pandas-workshop

Make sure to update to the latest version just before the training starts:

git pull

With Anaconda installed, simply create your environment with:

# install working environment with conda
conda env create -n pydata-pandas-workshop -f environment.yml

# conda does not activate the new environment automatically, so activate it:
source activate pydata-pandas-workshop
# on conda 4.4 or newer, use: conda activate pydata-pandas-workshop

If the installation from the environment file fails, create the environment manually:

# "conda env create" only accepts an environment file; package specs need "conda create"
conda create -n pydata-pandas-workshop python=3.6 -y
source activate pydata-pandas-workshop
conda install pandas jupyterlab xlrd xlsxwriter dask seaborn -y

Alternatively, you can create a Python virtual environment and run:

pip install -r requirements.txt

If you don't want to use Anaconda, you can use Python 3 and run:

pip install pandas jupyter barnum numpy matplotlib xlsxwriter seaborn bokeh jupyterlab parquet dask

(at your own risk)

Start Jupyterlab

jupyter lab
# if your browser doesn't open automatically, paste the URL displayed in the terminal into it:
# http://localhost:8888/lab

A Practical Start: Reading and Writing Data Across Multiple Formats

  • CSV

  • Excel

  • JSON

  • Clipboard

  • data

    • .info
    • .describe
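
To give a feel for the I/O topics listed above, here is a minimal sketch of a CSV and JSON round trip followed by `.info` and `.describe`. The table and its column names are made up for illustration and are not the workshop's data; in-memory buffers stand in for files so the snippet runs anywhere.

```python
import io

import pandas as pd

# Hypothetical example table; the workshop uses its own files from the data folder.
df = pd.DataFrame(
    {"city": ["Berlin", "Basel", "Bilbao"], "visitors": [120, 85, 60]}
)

# CSV round trip through an in-memory text buffer
csv_text = df.to_csv(index=False)
df_csv = pd.read_csv(io.StringIO(csv_text))

# JSON round trip, one JSON object per row
json_text = df.to_json(orient="records")
df_json = pd.read_json(io.StringIO(json_text))

# Excel and the clipboard follow the same reader/writer pattern
# (pd.read_excel / df.to_excel, pd.read_clipboard / df.to_clipboard),
# but need an Excel engine or a desktop clipboard, so they are left out here.

df_csv.info()                # column names, dtypes and non-null counts
summary = df_csv.describe()  # count, mean, std, min, quartiles, max of numeric columns
print(summary)
```

With real files, the buffers are simply replaced by paths, for example `pd.read_csv("some_file.csv")`.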

DataSeries & DataFrames / NumPy

  • Ode to NumPy
  • Data-Series
  • Data-Frames
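
As a rough illustration of how the three building blocks in this section relate, the sketch below wraps a NumPy array in a Series and combines two Series into a DataFrame. All names and values are invented for the example.

```python
import numpy as np
import pandas as pd

# A NumPy array: homogeneous values, addressed by position only
values = np.array([1.5, 2.0, 3.5])

# A Series wraps such an array and adds a labelled index
s = pd.Series(values, index=["a", "b", "c"], name="price")

# A DataFrame is a collection of Series that share one index
quantity = pd.Series([10, 20, 30], index=["a", "b", "c"])
df = pd.DataFrame({"price": s, "quantity": quantity})

# Vectorised NumPy functions work on pandas objects and keep the labels
doubled = np.multiply(s, 2)

print(type(s.to_numpy()))  # the data as a plain NumPy array
print(df.dtypes)           # one dtype per column
```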

Data selection & Indexing

  • Data-Series:
    • Slicing
    • Access by label
    • Index
  • Data-Frames:
    • Slicing
    • Access by label
    • Peek into joining data
  • Returns a copy / inplace
  • Boolean indexing
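
The selection topics above can be sketched in a few lines. This is an illustrative example with made-up data, not an excerpt from the workshop notebooks.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=["a", "b", "c", "d"])

by_position = s.iloc[1:3]  # positional slice, end excluded: b, c
by_label = s.loc["b":"d"]  # label slice, end included: b, c, d

df = pd.DataFrame(
    {"name": ["Ann", "Bob", "Cy"], "age": [31, 25, 47]},
    index=["x", "y", "z"],
)

cell = df.loc["y", "age"]      # access by row and column label
rows = df.iloc[0:2]            # first two rows by position
over_30 = df[df["age"] > 30]   # boolean indexing keeps rows where the mask is True

# A peek into joining: align a second table on the index (left join by default)
other = pd.DataFrame({"city": ["Oslo", "Rome"]}, index=["x", "z"])
joined = df.join(other)        # row "y" has no match and gets NaN

# Most methods return a new object and leave the original untouched ...
sorted_copy = df.sort_values("age")
# ... unless inplace=True is passed, which modifies df and returns None
result = df.sort_values("age", inplace=True)
```

Note the asymmetry between `.iloc` and `.loc` slices: positional slices exclude the end, label slices include it.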

Operations

  • add/subtract
  • multiply
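
Arithmetic in Pandas aligns on index labels rather than on position, which is worth seeing once. A small sketch with invented values:

```python
import pandas as pd

a = pd.Series([1, 2, 3], index=["x", "y", "z"])
b = pd.Series([10, 20, 30], index=["y", "z", "w"])

total = a + b                          # aligned on labels; "x" and "w" have no partner -> NaN
total_filled = a.add(b, fill_value=0)  # treat a missing partner as 0 instead
difference = a.sub(b, fill_value=0)    # subtraction with the same alignment rule
scaled = a * 2                         # a scalar is applied to every element
product = a.mul(b)                     # element-wise multiplication, aligned on labels

print(total)
print(total_filled)
```

The method forms (`add`, `sub`, `mul`) behave like the operators but accept `fill_value` for labels that exist on only one side.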

Data Visualisation

  • plot your data directly into your notebook
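
A minimal plotting sketch, assuming matplotlib is installed (it appears in the pip install list above). The data is hypothetical. In a notebook the figure shows up directly below the cell; the `Agg` line is only there so the snippet also runs outside a notebook and should be left out in Jupyterlab.

```python
import matplotlib

matplotlib.use("Agg")  # off-screen backend; omit this line inside a notebook

import pandas as pd

# Hypothetical monthly figures
df = pd.DataFrame({"month": ["Jan", "Feb", "Mar"], "sales": [150, 100, 180]})

# DataFrame.plot returns a matplotlib Axes that can be customised further
ax = df.plot(x="month", y="sales", kind="bar", title="Sales per month")
ax.set_ylabel("sales")
```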

Peek Into Statistical Data Analysis & Aggregation

  • Merging
  • Multi-Index
  • DateTime Index
  • Resampling
  • Pivoting
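
The aggregation topics above fit together in one small pipeline. The following is a sketch on two invented tables (`orders` and `customers` are hypothetical names), not the workshop's own example.

```python
import pandas as pd

orders = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2019-01-05", "2019-01-20", "2019-02-03", "2019-02-17"]
        ),
        "customer_id": [1, 2, 1, 2],
        "amount": [100.0, 50.0, 70.0, 30.0],
    }
)
customers = pd.DataFrame({"customer_id": [1, 2], "region": ["north", "south"]})

# Merging: SQL-style join on a shared key column
merged = orders.merge(customers, on="customer_id", how="left")
merged["month"] = merged["date"].dt.month

# Multi-Index: grouping by two keys yields a hierarchical index
by_region_customer = merged.groupby(["region", "customer_id"])["amount"].sum()

# DateTime index and resampling: totals per calendar month ("MS" = month start)
monthly = merged.set_index("date")["amount"].resample("MS").sum()

# Pivoting: regions as rows, months as columns
pivot = merged.pivot_table(
    index="region", columns="month", values="amount", aggfunc="sum"
)

print(by_region_customer)
print(monthly)
print(pivot)
```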

Scaling and Optimizing

  • Faster file I/O with Parquet
  • Scaling and Distributing with Dask
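
For the Parquet topic, a hedged sketch of a round trip. `DataFrame.to_parquet` and `pd.read_parquet` need a Parquet engine such as pyarrow or fastparquet; whether your environment has one is an assumption, so the hypothetical helper `parquet_roundtrip` returns `None` when no engine is found. Dask is not shown here.

```python
import os
import tempfile

import pandas as pd

# Hypothetical numeric table
df = pd.DataFrame({"id": [1, 2, 3], "value": [0.5, 1.5, 2.5]})


def parquet_roundtrip(frame):
    """Write frame to a temporary Parquet file and read it back.

    Returns None if no Parquet engine (pyarrow or fastparquet) is installed.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.parquet")
        try:
            frame.to_parquet(path)
        except ImportError:
            return None
        return pd.read_parquet(path)


restored = parquet_roundtrip(df)
print(restored)
```

Unlike CSV, Parquet is a binary, column-oriented format that stores the dtypes with the data, so nothing has to be re-parsed on load.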