This repo uses the CMS ML HATS tutorial as a base for the examples shown at the ML Basic Tools and Resources sessions of the COFI 2023 Winter School.
To run locally, run commands like these from your terminal:
This environment focuses on having pytorch, torchvision, pytorch_lightning, wandb, tritonclient, jupyter, and a few other niceties installed.
This lighter environment should be slightly easier to solve, even with the older conda installer.
# Download the setup bash file for your machine from here https://github.com/conda-forge/miniforge#mambaforge
# e.g. wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
# Install: (the mamba directory can end up taking O(1-10GB) so make sure the directory you're using allows that quota)
chmod u+x Mambaforge-Linux-x86_64.sh
./Mambaforge-Linux-x86_64.sh # follow instructions in the installation
git clone https://github.com/nickmanganelli-sr/cofi-2023.git
cd cofi-2023
mamba env create -f environment.yml
conda activate cofi-2023
jupyter lab # this will create a JupyterLab instance from which you can run all the notebooks.
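Once the environment is activated, a quick sanity check (optional, not part of the tutorial itself) can confirm that the core packages resolved; it only inspects the import machinery, so it is safe to run anywhere:

```python
import importlib.util

# Core packages this streamlined environment is expected to provide.
# Run inside the activated `cofi-2023` environment; any package listed
# as missing indicates the environment did not solve as intended.
packages = ["torch", "torchvision", "pytorch_lightning", "wandb", "tritonclient"]
missing = [p for p in packages if importlib.util.find_spec(p) is None]
print("missing packages:", missing or "none")
```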
This environment focuses on installing as much as possible, capable of running pytorch, keras, tensorflow, pytorch_geometric, and many other packages (in addition to the ones in the streamlined environment). The yaml file and environment name differ.
WARNING: This is a very heavy installation, and solving dependencies on different operating systems may take unreasonably long with an old conda installation. Please ensure you use one with the mamba solver (e.g. via the Miniforge/Mambaforge installation described below).
# Download the setup bash file for your machine from here https://github.com/conda-forge/miniforge#mambaforge
# e.g. wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
# Install: (the mamba directory can end up taking O(1-10GB) so make sure the directory you're using allows that quota)
chmod u+x Mambaforge-Linux-x86_64.sh
./Mambaforge-Linux-x86_64.sh # follow instructions in the installation
git clone https://github.com/nickmanganelli-sr/cofi-2023.git
cd cofi-2023
mamba env create -f environment-ext.yml
conda activate cofi-2023-ext
jupyter lab # this will create a JupyterLab instance from which you can run all the notebooks.
This is a set of tutorials for the CMS Machine Learning Hands-on Advanced Tutorial Session (HATS).
They are intended to show you how to build machine learning models in Python, using Keras, TensorFlow, and PyTorch, and use them in your ROOT-based analyses.
We will build event-level classifiers for differentiating VBF Higgs events from standard-model background in four-muon events, and jet-level classifiers for differentiating boosted W boson jets from QCD jets, using dense and convolutional neural networks.
We will also explore more advanced models such as graph neural networks (GNNs), variational autoencoders (VAEs), and generative adversarial networks (GANs) on simple datasets.
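To give a flavor of what the event-level classifiers above compute, here is a minimal sketch of a dense binary classifier's forward pass in NumPy. The architecture, layer sizes, and feature count are purely illustrative assumptions, not the tutorial's actual models (which are built in Keras/PyTorch):

```python
import numpy as np

rng = np.random.default_rng(0)

n_features = 8  # e.g. kinematic variables of the four muons (illustrative)
W1 = rng.normal(size=(n_features, 16))  # input -> hidden weights
b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1))           # hidden -> output weights
b2 = np.zeros(1)

def forward(x):
    # One hidden layer with ReLU, sigmoid output giving a signal probability
    h = np.maximum(x @ W1 + b1, 0.0)
    return 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))

batch = rng.normal(size=(4, n_features))  # four toy "events"
probs = forward(batch)
print(probs.shape)  # one probability in (0, 1) per event
```

Training (loss, backpropagation, optimizer) is what the framework-based notebooks add on top of this forward pass.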
The recommended method for running the tutorials live is the Vanderbilt JupyterHub; follow the instructions here.
Not as well supported, but instructions are here.
All these notebooks can be run on your local machine as well. It can often be useful to test your models and pipelines locally, but it is not recommended to run full trainings as these can be resource-intensive.
To run locally, run these commands from your terminal:
# Download the setup bash file for your machine from here https://github.com/conda-forge/miniforge#mambaforge
# e.g. wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-Linux-x86_64.sh
# Install: (the mamba directory can end up taking O(1-10GB) so make sure the directory you're using allows that quota)
chmod u+x Mambaforge-Linux-x86_64.sh
./Mambaforge-Linux-x86_64.sh # follow instructions in the installation
git clone https://github.com/FNALLPC/machine-learning-hats/
cd machine-learning-hats
mamba env create -f environment.yml
mamba activate machine-learning-hats
jupyter lab # this will create a JupyterLab instance from which you can run all the notebooks.
You can launch this repository in a "Binder" instance using the Binder badge, or launch a specific notebook by navigating to the rocket icon on the website and clicking the Binder option.
This may be more convenient, but it has not been well tested and the set-up time can be slow.
Each notebook can also be launched in a Google Colab instance by clicking "Google Colab" option in the menu bar above. To use this, you will have to install any extra libraries needed for the tutorial yourself and re-download the relevant datasets each time.
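A small helper like the following (run in a Colab cell) can report which extra packages still need installing; the package list is an example assumption, so check each notebook's imports for the real requirements:

```python
import importlib.util

# Example extras a notebook might need beyond Colab's defaults
# (hypothetical list; replace with the packages the notebook imports).
extras = ["uproot", "awkward"]
missing = [p for p in extras if importlib.util.find_spec(p) is None]
if missing:
    print("In a Colab cell, run: !pip install " + " ".join(missing))
else:
    print("All example packages already installed.")
```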
The indico page is: https://indico.cern.ch/event/1311559/
The Mattermost for live support is: https://mattermost.web.cern.ch/cms-exp/channels/hatslpc-2023
The datasets we will use are located here:
This project was created using the excellent open-source Jupyter Book project and the executablebooks/cookiecutter-jupyter-book template.