This repository contains a set of tutorials that introduce the topics covered in the subject Machine Learning II of the Master in Artificial Intelligence taught at the three universities of Galicia, i.e., University of A Coruña (UDC), University of Santiago de Compostela (USC), and University of Vigo (UVigo).
- David Mera Pérez (coordinator, USC)
- Enrique Fernández Blanco (UDC)
- David Olivieri Cecchi (UVigo)
The practices will be developed via Notebooks. To run them you will need to install Python (>3.8), a Notebook server (e.g. Jupyter), and all the packages required to execute the programs. Several configurations are possible, and you can use whichever suits you best. This manual describes some alternatives:
- Google Colab environment to run the Notebooks. Colab allows you to load Notebooks and also create new ones. It is also possible to install new libraries using commands such as !pip install xxx. The free account has some limitations, most of them linked to the computing resources.
- Python (>3.8) and pip (pip is installed automatically if you are working with virtual environments or if you installed Python from the official web page). Once Python is installed, Jupyter and the rest of the necessary packages can be installed using pip install xxx.
- Anaconda is a development framework focused on Data Science and Machine Learning, available on Windows, Linux and macOS. It is composed of different packages and software, including Jupyter and conda. The latter is a package and environment management system that allows us to keep different virtual package environments on the same system. While pip installs packages at system level and may cause conflicts, conda keeps each package configuration in a separate virtual environment without conflicts. All the dependencies are managed by conda. Users can activate and deactivate virtual environments at their discretion. Packages are managed with the command conda install xxx instead of pip.
- Miniconda is a free minimal installer for conda, available on Windows, Linux and macOS. It is a small, bootstrap version of Anaconda that includes only conda, Python, the packages they depend on, and a small number of other useful packages. Anaconda is a large framework with many packages you may not need, whereas Miniconda lets the user install only the necessary ones. Note that Jupyter is not included in Miniconda. It must be installed as a new package with conda install jupyter.
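Whichever option you choose, it can help to confirm that the interpreter meets the Python (>3.8) requirement before installing anything. The sketch below assumes the interpreter is reachable as python3 (on some systems it is python) and reads the requirement as "3.8 or newer", which is an interpretation rather than something this manual states.

```shell
# Report the major.minor version of the interpreter found on PATH.
# Assumption: the interpreter is called "python3".
py_version="$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')"

# Exit status 0 means the version is 3.8 or newer.
if python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 8) else 1)'; then
    py_status="ok"
else
    py_status="too old"
fi

echo "Python ${py_version}: ${py_status}"
```

Run it again after activating a conda environment to check the interpreter that the environment provides.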
Conda commands for managing environments
https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
Conda cheatsheet
All these commands must be run in a terminal. These commands have been checked in a Linux system. Some differences may appear in other distributions.
Create an environment
conda create --name environment_name
It is possible to establish a specific Python version for the environment.
conda create --name environment_name python=x.y.z
Activate the environment
The activation of the environment allows us to work with the packages and features of that environment. It is important to keep in mind that outside the environment these libraries do not necessarily exist.
conda activate environment_name
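To confirm which environment is active, one option is to read the CONDA_DEFAULT_ENV variable, which conda activate sets in the current shell. The helper name active_conda_env below is hypothetical and only wraps that lookup.

```shell
# Hypothetical helper: print the name of the active conda environment,
# or "none" when CONDA_DEFAULT_ENV is unset or empty.
active_conda_env() {
    printf '%s\n' "${CONDA_DEFAULT_ENV:-none}"
}

echo "Active conda environment: $(active_conda_env)"
```

If the output is not the environment you expect, activate it again before installing or listing packages.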
Deactivate an environment
This command allows us to exit the environment.
conda deactivate
Remove an environment
This operation cannot be reversed.
conda remove --name environment_name --all
List all the available environments in our system
conda info --envs
Alternative option to list all the available environments:
conda env list
Install a new package in the environment
To install a package in an environment, the environment must be active (conda activate environment_name).
conda install package_name
List all the installed packages in an environment
The environment must be active to list all its packages (conda activate environment_name).
conda list
Export an environment file
conda env export > environment_file_name.yml
Export an environment file across platforms
conda env export --from-history > environment_file_name.yml
Create a new environment from a .yml file
conda env create -f environment_file_name.yml
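For reference, an environment file is a short YAML document with a name, a list of channels, and a list of dependencies. The sketch below writes an illustrative one to the current directory. The environment name, the pinned Python version (3.10), and the package selection are assumptions chosen for the example, not the exact contents that conda env export would produce.

```shell
# Write a minimal, hand-made environment file to the current directory.
# All values below are illustrative assumptions.
cat > environment_file_name.yml <<'EOF'
name: ml2
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.10
  - jupyter
  - scikit-learn
  - pandas
  - matplotlib
EOF

# Show the file that was written.
cat environment_file_name.yml
```

A file like this can then be passed to conda env create -f environment_file_name.yml.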
conda create --name ml2
Make sure to note the installation directory of the virtual environment if you intend to use pip from within it.
Packages linked to the Online learning practices with River
conda install jupyter scikit-learn pandas matplotlib python-graphviz
conda install -c conda-forge rich
Note: Installing packages from conda-forge tends to be slow.
Important! There are two alternatives at this point (we recommend the second option):
- To acquire the River package from a conda repository, install it from conda-forge. Note, however, that the latest version of River available in conda-forge is 0.13.
conda install -c conda-forge river
- To acquire the most recent version of the River package (0.21.0), install it using pip within your virtual environment. Ensure that you utilize the pip version located specifically within your virtual environment, not the global one. Locate your virtual environment directory, typically found at a path similar to /anaconda/envs/virtual_env_name/.
/home/user/anaconda/envs/virtual_env_name/bin/pip install river
Note: To locate the directory of your virtual environment, you can run the following commands:
conda activate virtual_env_name
echo $CONDA_PREFIX
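The two steps above can be combined in a small helper that builds the path of the environment's own pip. This is only a sketch: the function name env_pip_path is hypothetical, and it assumes the Linux/macOS layout in which pip lives under bin/ inside the environment directory.

```shell
# Hypothetical helper: print the path of the pip that belongs to a conda environment.
# $1 is the environment directory; it defaults to CONDA_PREFIX, which
# "conda activate" sets. Assumes the Linux/macOS layout (<env>/bin/pip).
env_pip_path() {
    prefix="${1:-${CONDA_PREFIX:-}}"
    if [ -z "$prefix" ]; then
        echo "No environment given and none is active" >&2
        return 1
    fi
    printf '%s/bin/pip\n' "$prefix"
}

# Show the resolved path for the example directory used above.
env_pip_path /home/user/anaconda/envs/virtual_env_name
```

With the environment active, "$(env_pip_path)" install river would call that pip. Running python -m pip install river inside the active environment is a common alternative, since it uses the pip of whichever Python is currently active.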
Packages linked to the Federated Learning practices with Flower
There is a Flower package available in the conda-forge repository:
conda install -c conda-forge flwr
However, the official web page recommends installing it with pip in order to get the newest, most stable version. Note that you should use the pip located specifically within your virtual environment.
pip install flwr
Note: To locate the directory of your virtual environment, you can run the following commands:
conda activate virtual_env_name
echo $CONDA_PREFIX
For simulations that use the Virtual Client Engine, flwr should be installed with the simulation extra:
pip install "flwr[simulation]"
Note: This is our scenario.
It's important to note that Flower is not a learning framework in itself, and as such, it wraps other machine learning frameworks like TensorFlow, PyTorch, or Scikit-learn in the communication layer to enable federated learning.
In the laboratory practices of this subject we are going to use TensorFlow:
conda install tensorflow
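Once the packages are installed, a quick sanity check is to ask the environment's Python whether each module can be found. The sketch below is one way to do that. The helper name check_module is hypothetical, the interpreter is assumed to be reachable as python3, and the import names river, flwr, and tensorflow are those of the packages installed above.

```shell
# Hypothetical helper: report whether a Python module can be located
# by the interpreter on PATH, without actually importing it.
check_module() {
    if python3 -c "import importlib.util, sys; sys.exit(0 if importlib.util.find_spec('$1') else 1)"; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

# Check the main packages used in the practices.
for module in river flwr tensorflow; do
    check_module "$module"
done
```

A module reported as missing usually means the package was installed in a different environment from the one that is currently active.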
To obtain the notebooks for the laboratory practices, you can either download the ZIP file from GitHub or clone the repository using Git via HTTPS or SSH. Please note that the SSH connection requires an SSH key registered in your GitHub account.
git clone git@github.com:ennanco/MIA_ML2.git
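The command above uses SSH. For HTTPS, GitHub serves the same repository path under https://github.com/. The sketch below derives that URL from the SSH one and prints the matching clone command without running it. The variable names are illustrative.

```shell
# SSH address of the repository, as given above.
ssh_url="git@github.com:ennanco/MIA_ML2.git"

# Rewrite "git@host:path" as "https://host/path".
https_url="$(printf '%s\n' "$ssh_url" | sed -E 's#^git@([^:]+):#https://\1/#')"

# Print the HTTPS clone command (not executed here).
echo "git clone $https_url"
```

Cloning over HTTPS does not require an SSH key.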
Important: The examples located within the initial three working units (online ML+Concept Drift) have been specifically tailored for compatibility with River 0.21.
In order to run Jupyter, the following command must be executed (the appropriate conda environment must be activated if necessary).
jupyter notebook
Once it is running, open a browser and go to http://localhost:8888/.
The security token can be found in the terminal where the command was executed.