Machine Learning, Big Data, and Deep Learning in Astronomy
A Severo Ochoa School of the Instituto de Astrofísica de Andalucía (CSIC)
- School materials
- Execution of the tutorials
- Credits and acknowledgements
This repository hosts the materials for the school and instructions on how to run the tutorials. It also contains a conda environment needed to execute the Python notebooks, either locally or on a cloud platform.
The main web page of the school is https://www.granadacongresos.com/somachine2021. It contains general information, registration, topics, tutors and the daily schedule.
- Tutorials 01+02: Practical ML: Scikit-learn (Juan Antonio Cortés, UGR)
- Tutorials 03+04: Big Data: Algorithms and Spark, Data Analysis with Spark (Diego García, UGR)
- Tutorial 05: Practical DL: A Quick Glance (Francisco Pérez, UGR)
- Placing AI and ML in Context - Jorge Casillas, UGR
- Theoretical Foundations of ML: Classical Problems, Algorithms and Validation - Julián Luengo, UGR
- Data Preprocessing in ML - Julián Luengo, UGR
- Singular Problems in ML - Salvador García, UGR
- ML in Astronomy: An Overview - Kyle Boone, University of Washington
- Normalizing flows for the HR diagram
- Examples of autoencoder applications
- Galaxy classification in surveys - Helena Domínguez Sánchez, ICE-CSIC
- Big Data: Foundations and Frameworks - Alberto Fernández, UGR
- Big Data: Algorithms and Spark (Theoretical and Practical) - Diego García, UGR
- BD in Astronomy: An Overview - Federica Bianco, University of Delaware
- Vera C. Rubin Observatory: A Big Data Machine for the 21st Century - Meredith Rawls, Vera Rubin Observatory
- Theoretical Foundations of DL and CNNs - Anabel Gómez, UGR
- Autoencoders: An Overview and Applications - David Charte, UGR
- Successful case studies of DL - Siham Tabik, UGR
- An Overview of Deep Learning in Astronomy - Ashish Mahabal, Caltech
- Emulators and their application to supernova data - Wolfgang Kerzendorf, Michigan State University
- The SKA Telescope Data Deluge - Javier Moldón, IAA
- The SKA Telescope Data Challenges - Anna Bonaldi, SKAO
- Applications of unsupervised learning to astronomical datasets - Dalya Baron, Tel Aviv University
- Deep Learning and Image Reconstruction - Andrés Asensio, IAC
Execution of the tutorials
Tutorials 01, 02 and 05 can be followed as Jupyter notebooks using Python. The information below shows how to run those notebooks on cloud services or on your local machine. Tutorials 03 and 04 use Spark, which you can install on your machine (see instructions here) or run using the Virtual Machine (VM) provided above.
There are three options to execute the Jupyter notebook tutorials using Python (01, 02 and 05). Choose whichever suits you best:
- Execute tutorials on the cloud using myBinder. A temporary virtual machine will be created on myBinder.org containing JupyterLab and the corresponding notebooks. No login is required, just follow the link. This service is temporary, so nothing stored there will persist, and the machine will be removed automatically and without warning after some time of inactivity.
- Execute tutorials on the JupyterHub instance at IAA-CSIC. Similar to the first option, but the virtual machines are served by a JupyterHub deployed at the host institution (IAA). You need to log in to your dedicated machine (see below for credentials), and the Jupyter instance will be available for two weeks from the start of the school. Your progress will be stored and can be retrieved every time you access the service. You can use this service to experiment and work on your own files.
- Use your own machine. This repository contains a conda environment to help you install all the required software.
Option 1. Execute notebook tutorials on the cloud
Interactive myBinder link to execute the Python notebooks:
or follow this link
myBinder.org is a free and open organization providing free cloud resources. Resources may therefore be limited, and the changes you make to the notebooks or the system are not persistent. Please always keep a local copy of any file you want to keep, because Binder will automatically delete the virtual machine assigned to you after some time of inactivity.
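Since nothing on Binder persists, one way to keep a local copy is to bundle your notebooks into a single archive and download it from the Jupyter file browser. The following is a minimal sketch, not part of the school materials: the function name `archive_notebooks` and the archive name `my_notebooks.zip` are hypothetical, and it assumes the notebooks live under the directory you pass as `root`.

```python
from pathlib import Path
import zipfile


def archive_notebooks(root=".", archive_name="my_notebooks.zip"):
    """Zip every .ipynb file under root and return (archive_path, count).

    Files inside .ipynb_checkpoints folders are skipped because they are
    autosave copies rather than notebooks you edited directly.
    """
    root = Path(root)
    archive_path = root / archive_name
    notebooks = sorted(
        p for p in root.rglob("*.ipynb")
        if ".ipynb_checkpoints" not in p.parts
    )
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for nb in notebooks:
            # Store paths relative to root so the archive unpacks cleanly.
            zf.write(nb, nb.relative_to(root).as_posix())
    return archive_path, len(notebooks)


# Example call from a notebook cell (hypothetical usage):
#   path, count = archive_notebooks(".")
#   print(f"Saved {count} notebooks to {path}")
```

After running it, the archive appears in the file browser, where it can be downloaded before the session expires.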
Option 2. Execute notebook tutorials in the JupyterHub instance at IAA-CSIC
The IAA-CSIC Severo Ochoa Center provides a prototype JupyterHub instance available here:
Login with user:
It will take a few minutes to create the instance (especially the first time you access it). You can access your instance at
https://spsrc-jupyter.iaa.csic.es/user/<username>/lab/ and you can start by using the navigation bar on the left to open the file
A lightweight desktop is also available; you can access it immediately by changing lab to desktop in the path. For example, go to
https://spsrc-jupyter.iaa.csic.es/user/<username>/desktop/ and you will have a desktop environment with a graphical interface in your browser.
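The two addresses above differ only in the last path component. As a small illustration, the sketch below builds either URL from a username. The helper name `instance_url` and the example username `alice` are hypothetical; only the host and the `lab`/`desktop` path pattern come from the text above.

```python
from urllib.parse import quote

BASE_URL = "https://spsrc-jupyter.iaa.csic.es"


def instance_url(username, view="lab"):
    """Return the JupyterLab ("lab") or desktop ("desktop") URL for a user."""
    if view not in ("lab", "desktop"):
        raise ValueError("view must be 'lab' or 'desktop'")
    # Percent-encode the username so unusual characters stay valid in a URL.
    return f"{BASE_URL}/user/{quote(username, safe='')}/{view}/"


# 'alice' is a placeholder; use the username you were given for the school.
print(instance_url("alice"))
print(instance_url("alice", view="desktop"))
```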
Unlike Option 1, these instances offer persistent storage for the duration of the school. All virtual machines and their contents will be removed by the 2nd of May, 2021.
In case of problems using this JupyterHub instance please file an issue at https://github.com/spsrc/somachine2021/issues
Option 3. Execute notebook tutorials in your local machine
We recommend using conda to manage the dependencies. Miniconda is a lightweight version of Anaconda. First we show how to install Miniconda if you don't have it already. More details here
Miniconda for Linux:
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash ./Miniconda3-latest-Linux-x86_64.sh
rm ./Miniconda3-latest-Linux-x86_64.sh
Miniconda for macOS:
curl -O https://repo.anaconda.com/miniconda/Miniconda3-latest-MacOSX-x86_64.sh
bash Miniconda3-latest-MacOSX-x86_64.sh
rm Miniconda3-latest-MacOSX-x86_64.sh
Note that the installer will suggest modifying your bashrc so that conda is always available, which is a good idea in general. Alternatively, if you want the Miniconda installation to be encapsulated in your working directory without affecting the rest of your system, you can install it with the following options. The first command only needs to be run once, and the second one needs to be run every time you open a new terminal.
bash ./Miniconda3-latest-Linux-x86_64.sh -b -p my_conda_env
source my_conda_env/etc/profile.d/conda.sh
Get the contents of the school
Download this repository and create the conda environment with the dependencies:
git clone https://github.com/spsrc/somachine2020.git
cd somachine2020
conda env create -f environment.yml
conda activate somachine
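Once the environment is active, a quick check that the main libraries can be imported may save time before opening the notebooks. This is a minimal sketch: the module list below is an assumption based on the tutorial topics, not a copy of `environment.yml`, so adjust it to what the file actually installs. The helper name `missing_packages` is hypothetical.

```python
import importlib.util


def missing_packages(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed list for illustration only; edit it to match environment.yml.
ASSUMED_MODULES = ["numpy", "sklearn", "matplotlib"]

missing = missing_packages(ASSUMED_MODULES)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules are importable.")
```

If a module is reported as missing, check that `conda activate somachine` was run in the same terminal.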
If you want to use JupyterLab:
conda install -c conda-forge jupyterlab
jupyter lab
Credits and acknowledgements
This repository and the JupyterHub service for the tutorials are provided by the SKA Regional Centre Prototype, which is funded by the State Agency for Research of the Spanish MCIU through the "Center of Excellence Severo Ochoa" award to the Instituto de Astrofísica de Andalucía (SEV-2017-0709), by the European Regional Development Funds (EQC2019-005707-P), by the Junta de Andalucía (SOMM17_5208_IAA), and by project RTI2018-096228-B-C31 (MCIU/AEI/FEDER, UE) and PTA2018-015980-I.