
cemac/LIFD_ENV_ML_NOTEBOOKS


Leeds Institute for Fluid Dynamics Machine Learning For Earth Sciences

Jupyter Notebooks


Leeds Institute for Fluid Dynamics (LIFD) has teamed up with the Centre for Environmental Modelling and Computation (CEMAC) team to create Jupyter notebook tutorials on the following topics.

  1. ConvolutionalNeuralNetworks
  2. Physics_Informed_Neural_Networks
  3. GaussianProcesses
  4. RandomForests
  5. GenerativeAdversarialNetworks
  6. AutoEncoders
  7. DimensionalityReduction
  8. XGBoost

PLEASE NOTE YOU MUST CLONE RECURSIVELY (SEE BELOW)

These notebooks require very little prior knowledge of each topic and include links to further reading where necessary. Each notebook takes about two hours to work through and should run out of the box on a home installation of Jupyter. The notebooks are designed with automatic checking of Python environment files so that they remain easy to set up in the future.

To keep this repository from becoming unwieldy as the resource grows, it is made up of submodules that can each be cloned individually.

How do I get started?

Some tutorials are lightweight enough to run on Binder or Colab (see the lists below). We recommend running the others on your local machine. To get started, either clone this whole repository (LARGE SIZE) or pick a single tutorial to clone and run on its own.

Colab enabled tutorials

  1. GaussianProcesses
  2. RandomForests
  3. GenerativeAdversarialNetworks
  4. DimensionalityReduction

Binder enabled tutorials

  1. GaussianProcesses
  2. RandomForests
  3. GenerativeAdversarialNetworks
  4. AutoEncoders

Cloning the whole repository

git clone --recursive git@github.com:cemac/LIFD_ENV_ML_NOTEBOOKS.git

then follow the individual README.md instructions.

Cloning individual tutorials

  1. git clone --recursive git@github.com:cemac/LIFD_ConvolutionalNeuralNetworks.git
  2. git clone --recursive git@github.com:cemac/LIFD_RandomForests.git
  3. git clone --recursive git@github.com:cemac/LIFD_GaussianProcesses.git
  4. git clone --recursive git@github.com:cemac/LIFD_Physics_Informed_Neural_Networks.git
  5. git clone --recursive git@github.com:cemac/LIFD_GenerativeAdversarialNetworks.git
  6. git clone --recursive git@github.com:cemac/LIFD_AutoEncoders.git
  7. git clone --recursive git@github.com:cemac/LIFD_DimensionalityReduction.git
  8. git clone --recursive git@github.com:cemac/XGBoost-notebook.git
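The clone commands above use SSH URLs, which require an SSH key registered with GitHub. As a minimal sketch, assuming you would rather clone over HTTPS, the snippet below rewrites one of the listed URLs into its HTTPS equivalent and builds the matching command. The helper names `ssh_to_https` and `clone_command` are hypothetical and not part of this repository.

```python
def ssh_to_https(ssh_url):
    """Rewrite 'git@github.com:owner/repo.git' as 'https://github.com/owner/repo.git'."""
    prefix = "git@github.com:"
    if not ssh_url.startswith(prefix):
        raise ValueError(f"not a GitHub SSH URL: {ssh_url}")
    return "https://github.com/" + ssh_url[len(prefix):]


def clone_command(ssh_url, use_https=True):
    """Return the recursive clone command as an argument list (hypothetical helper)."""
    url = ssh_to_https(ssh_url) if use_https else ssh_url
    return ["git", "clone", "--recursive", url]


if __name__ == "__main__":
    # Print the command rather than running it, so nothing is downloaded here.
    print(" ".join(clone_command("git@github.com:cemac/LIFD_RandomForests.git")))
```

The argument list can be passed to `subprocess.run` if you want to script the clone; printing it first makes it easy to check the URL.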

How to Run

These notebooks can be run with the resources provided once the Anaconda environment is set up. If you are familiar with Anaconda, Jupyter notebooks and GitHub, simply clone this repository and run it within your Jupyter notebook setup. Otherwise, please read the how-to-run guide. Individual notebooks have bespoke instructions.

git clone --recursive git@github.com:cemac/LIFD_ENV_ML_NOTEBOOKS.git
cd LIFD_ENV_ML_NOTEBOOKS

Requirements

Python

We recommend using Anaconda to manage the required Python packages. Some machine-learning libraries are large, so if you only wish to run one notebook, consider installing just the environment provided for that notebook. Otherwise, you can install all required packages by running the following commands.

conda env create -f <env-file>.yml
conda activate <env-name>
# save yourself some space with one extra command
conda clean -a
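After activating the environment, it can help to confirm that the packages a notebook needs are importable before launching Jupyter. The sketch below is one possible check using only the standard library. The function name `missing_packages` is hypothetical, and the package names in the example are assumptions; the real list for each tutorial is in its own environment file.

```python
from importlib.util import find_spec


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in the active environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in import_names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed example names; replace with the packages from the notebook's .yml file.
    wanted = ["numpy", "matplotlib"]
    missing = missing_packages(wanted)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Note that import names can differ from conda package names (for example, the `scikit-learn` package is imported as `sklearn`), so pass the names you would use in an `import` statement.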

What if I forgot to clone recursively?

Not to worry. In your cloned folder simply run:

git submodule init
git submodule update --init --recursive
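If you are unsure whether the submodules were fetched, `git submodule status` reports each submodule on its own line and prefixes uninitialised ones with `-`. As a rough sketch, the snippet below parses that output and lists the paths that still need `git submodule update --init --recursive`. The function names are hypothetical, and the parser assumes submodule paths contain no spaces.

```python
import subprocess


def uninitialised_submodules(status_output):
    """Return paths of submodules whose status line starts with '-' (not initialised)."""
    paths = []
    for line in status_output.splitlines():
        if line.startswith("-"):
            # Line layout: '-<commit hash> <path>'; assumes no spaces in the path.
            parts = line[1:].split()
            if len(parts) >= 2:
                paths.append(parts[1])
    return paths


def submodule_status(repo_dir="."):
    """Run 'git submodule status --recursive' in repo_dir and return its text output."""
    result = subprocess.run(
        ["git", "submodule", "status", "--recursive"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    pending = uninitialised_submodules(submodule_status())
    print("Uninitialised submodules:", pending if pending else "none")
```

Run it from inside the cloned folder; an empty result suggests every submodule has been initialised.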

Hardware

These notebooks are designed to run on a personal computer. However, the techniques demonstrated can be very computationally intensive, so some notebooks offer options to skip steps depending on the hardware available, e.g. by using pre-trained models.

Knowledge

No prior knowledge of the specific environmental-science or machine-learning concepts is required. Some foundational knowledge is assumed, but links are provided to in-depth information on the fundamentals of each concept.

Contributions

We hope that this resource can be built upon to provide a wealth of training material for Earth-science machine-learning topics at Leeds.

Licence information

Creative Commons License
LIFD_ENV_ML_NOTEBOOKS by cemac is licensed under a Creative Commons Attribution 4.0 International License.

Acknowledgements

Leeds Institute for Fluid Dynamics, CEMAC, Helen Burns, Matthew Gaddes, Oliver Pollard, Chetan Deva, Fergus Shone, Michael MacRaild, Phil Livermore, Giulia Fedrizzi, Eszter Kovacs, Ana Reyna Flores, Francesca Morris, Emma Pearce, Maeve Murphy Quinlan, Sara Osman, Jonathan Coney, Eilish O'grady, Leif Denby, Sandra Piazolo, Caitlin Howarth, Claire Bartholomew, Anna Hogg, Ali Gooya, Tamora James and Donald Cummins.