jpivarski-talks/2022-08-03-codas-hep-columnar-tutorial

Materials for CoDaS-HEP 2022 Columnar Data Analysis tutorial

This repository contains three notebooks on columnar data analysis, presented at CoDaS-HEP at 12:30pm on August 3, 2022 by Jim Pivarski and Ioana Ifrim.

How to participate

You don't need to install anything on your computer to participate; we encourage everyone to join on Binder.

Launch Binder

Binder tips:

If your notebook becomes unresponsive, reconnect to the kernel or restart the kernel from the "Kernel" menu.

While working on exercises, keep a copy of your work-in-progress in a text editor, so that you don't lose it if the web page reloads. "Run → Run All Above Selected Cell" and "Kernel → Restart Kernel and Run up to Selected Cell" will rerun all of the code to get your Python session back to the state it was in before a page reload or kernel restart.

If you want to install and run on your computer

We use the libraries and versions listed in environment.yml. You also need to install JupyterLab. You don't have to install the libraries with pip (you can use conda, for instance), and you don't need to use the exact versions that this tutorial has been pinned to, but you should probably use at least those versions.
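If you install locally, one way to confirm that your versions are at least the pinned ones is a small check like the sketch below. It uses only the Python standard library (`importlib.metadata`). The package names and version numbers in `example_minimums` are hypothetical placeholders, not the actual contents of environment.yml, so substitute the real pins from that file. The version parsing is deliberately simple (leading digits of each dot-separated part) and is an assumption of this sketch, not a full version-specifier implementation.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '1.21.0' into (1, 21, 0), keeping only the leading digits of each part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if digits == "":
            break  # stop at the first part with no leading digits, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def at_least(installed, minimum):
    """True if the installed version is greater than or equal to the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '1.21' compares equal to '1.21.0'.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_minimums(minimums):
    """Return {package: status}, where status is 'ok', 'too old', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        report[name] = "ok" if at_least(installed, minimum) else "too old"
    return report


# Hypothetical pins: replace with the names and versions from environment.yml.
example_minimums = {"numpy": "1.21.0", "awkward": "1.8.0"}

if __name__ == "__main__":
    for name, status in check_minimums(example_minimums).items():
        print(f"{name}: {status}")
```

Any package reported as "missing" or "too old" is a candidate for reinstalling or upgrading before the session.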

We won't spend any time in the tutorial session on installing libraries. If an installation on your computer doesn't work, switch to Binder by pressing the button above.

Browsing the material online without Binder

If you want to see the notebooks online but don't want to execute them in Binder, the order is

  • part-1.ipynb: overview of programming paradigms, and array-oriented programming in particular
  • part-2.ipynb: overview of tools for columnar data analysis of particle physics data
  • project.ipynb: set-up and challenge problems—discovering the Higgs boson

The default notebooks are unevaluated. To see static outputs from a previous run, look in the evaluated directory.

Related tutorials

At SciPy 2022, Jim presented array-oriented programming for a non-physics audience: materials (GitHub and Binder), video (YouTube).

The focus and the problems are different, so if you'd like more practice, take a look!