CERN CSC 2019 Visualization Exercises


The overall goal of these exercises is to put into practice some of the concepts covered in the lectures, and to get hands-on experience with tools that data scientists use on a daily basis.


Remember to select the 96 Python3 option from the software stack drop-down menu when you spawn the machine on SWAN.


git clone

Tutorial Contents

In these exercises we look at:

  • Visual Exploration of a Dataset - using visualization to explore a dataset and tell a story about the interesting insights it contains. This will be performed using:
  • How to create visualizations using these tools for visualization of distributions, correlations, identifying outliers, etc.
  • How to customize visualizations to make them more coherent by removing noise from plots, such as distracting lines, axes boundaries, and so on.
  • For Altair, how to build a complex dashboard-like visualization in Jupyter.
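As a rough illustration of what "removing noise" can look like in practice, the sketch below hides the top and right axes boundaries and the tick marks on a Matplotlib histogram. It is a minimal example under stated assumptions, not the exercise solution: the `declutter` helper is a hypothetical name, and the `ratings` values are synthetic numbers generated on the spot rather than the FIFA dataset used in the notebooks.

```python
import matplotlib.pyplot as plt
import numpy as np


def declutter(ax):
    """Hide the top/right axes boundaries, the tick marks, and the grid."""
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)
    ax.tick_params(length=0)  # keep tick labels, drop the tick marks
    ax.grid(False)
    return ax


# Synthetic values for illustration only (not the FIFA data).
rng = np.random.default_rng(0)
ratings = rng.normal(loc=66, scale=7, size=500)

fig, ax = plt.subplots(figsize=(6, 4))
ax.hist(ratings, bins=20, color="steelblue")
ax.set_xlabel("Overall rating (synthetic)")
ax.set_ylabel("Number of players")
declutter(ax)
```

Keeping the clean-up in one small function makes it easy to apply the same style to every plot in a notebook. Seaborn offers a comparable shortcut, `seaborn.despine()`, for removing the top and right boundaries.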

The core exercises are all in the static visualization section, since static plots are what most people use when producing figures. Static visualizations are also generally more scalable, which is of particular importance when dealing with huge datasets.

The interactive visualization section is aimed at those who are already well versed in Matplotlib and Seaborn and who want to extend their knowledge.


Thanks to the creator of the FIFA Kaggle dataset, and to the SWAN team at CERN for helping me prepare this tutorial!