# Introduction

Machine learning is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.

There are several types of machine learning problems:


* **Supervised learning**: This type of problem involves training a model on labeled data, so that it can make predictions on new and unseen data. Examples include regression and classification problems.


* **Unsupervised learning**: This type of problem involves training a model on unlabeled data, so that it can discover patterns or relationships in the data. Examples include clustering and dimensionality reduction problems.


* **Semi-supervised learning**: This type of problem is a hybrid of the above two, where the model is trained on a dataset that contains a small amount of labeled data and a large amount of unlabeled data.


* **Reinforcement learning**: This type of problem focuses on training models to make a sequence of decisions. The model is trained by learning from the consequences of its previous decisions.


* **Generative models**: In this type of problem, the model is trained to generate new examples that are similar to the ones in the training set. Examples of such models include Generative Adversarial Networks (GANs) and Variational Autoencoders (VAEs).



## Supervised learning 

Supervised learning is a type of machine learning in which a model is trained on labeled data, so that it can make predictions on unseen samples. The goal of supervised learning is to build a model that captures the relationship between the input and output data, i.e., one that accurately predicts the output for a given input.

There are two main types of supervised learning problems: regression and classification.


1. **Regression**: In a regression problem, the goal is to predict a continuous value for the output. For example, predicting the price of a house based on its size, location, and other features. Linear regression and polynomial regression are some examples of regression algorithms that we will discuss in the lecture.


2. **Classification**: In a classification problem, the goal is to predict a discrete value for the output. For example, classifying an email as spam or not spam, or classifying an image as a picture of a dog or a cat. Logistic regression, decision trees, and support vector machines (SVMs) are some examples of classification algorithms.
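The two problem types above can be illustrated with a minimal NumPy sketch. The data, variable names, learning rate, and iteration count below are made up for illustration: the regression part fits a straight line by least squares, and the classification part trains a one-feature logistic regression with plain gradient descent.

```python
import numpy as np

# Regression: predict a continuous value.
# Hypothetical data: house size in m^2, with a price that follows 3 * size + 2.
sizes = np.array([50.0, 80.0, 100.0, 120.0])
prices = 3.0 * sizes + 2.0

# Design matrix with a column of ones so the model can learn an intercept.
A = np.column_stack([sizes, np.ones_like(sizes)])
(slope, intercept), *_ = np.linalg.lstsq(A, prices, rcond=None)
predicted_price = slope * 90.0 + intercept  # prediction for an unseen house

# Classification: predict a discrete label (0 or 1).
# Hypothetical one-dimensional feature with two separable classes.
x = np.array([-2.0, -1.0, -0.5, 0.5, 1.0, 2.0])
labels = np.array([0, 0, 0, 1, 1, 1])


def sigmoid(z):
    """Map a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))


w, b = 0.0, 0.0
for _ in range(200):
    p = sigmoid(w * x + b)                # predicted probability of class 1
    w -= 0.5 * np.mean((p - labels) * x)  # gradient step on the weight
    b -= 0.5 * np.mean(p - labels)        # gradient step on the bias

predicted_labels = (sigmoid(w * x + b) > 0.5).astype(int)
```

In both cases the labels (`prices` and `labels`) are what make the problem supervised: the model parameters are adjusted until the predictions match the known outputs.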



## Unsupervised learning 

Unsupervised learning is a type of machine learning in which a model is trained on unlabeled data, so that it can discover patterns or relationships in the data. Unlike supervised learning, unsupervised learning does not have a specific output or target that the model is trying to predict. Instead, the goal is to find structure or hidden patterns in the data. The results of unsupervised learning are often visualized as scatter plots to make it easier to understand the underlying structure of the data.

There are several types of unsupervised learning problems, for instance:


1. **Clustering**: The goal of clustering is to group similar data points together. Clustering algorithms such as k-means can be used to group data points based on their similarities.


2. **Dimensionality reduction**: The goal of dimensionality reduction is to reduce the number of features or dimensions in the data while preserving as much information as possible. Principal component analysis (PCA) is a common unsupervised method for this. Linear discriminant analysis (LDA) also reduces dimensionality, but it requires class labels and is therefore a supervised method.
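To make the clustering item in the list above concrete, here is a minimal k-means sketch in NumPy. The helper names `assign` and `kmeans`, the toy data, and the choice to initialise the centroids with the first `k` points are simplifications assumed for this example; library implementations typically use more careful initialisation.

```python
import numpy as np


def assign(X, centroids):
    """Return, for each point, the index of its closest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=100):
    """Alternate between assigning points and recomputing centroids."""
    centroids = X[:k].copy()  # simplistic initialisation (assumption)
    for _ in range(n_iter):
        labels = assign(X, centroids)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments no longer change the centroids
        centroids = new_centroids
    return assign(X, centroids), centroids


# Two hypothetical, well-separated groups of points; no labels are given.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
cluster_labels, cluster_centers = kmeans(X, k=2)
```

The algorithm never sees a target value; the grouping emerges purely from the distances between the data points.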


Note that unsupervised learning can also be used as a pre-processing step before supervised learning, for example to reduce the number of input features.
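As a sketch of such a pre-processing step, PCA can be written in a few lines with NumPy's singular value decomposition. The function name `pca` and the toy data are assumptions for this example; the third feature is deliberately the sum of the first two, so two components are enough to keep all of the variance.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its first principal components.

    Returns the projected data and the fraction of the total variance
    explained by each kept component.
    """
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained = S**2 / np.sum(S**2)
    return X_centered @ components.T, explained[:n_components]


# Hypothetical data with three features, where feature 3 = feature 1 + feature 2.
X = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [2.0, 1.0, 3.0],
              [1.0, 3.0, 4.0],
              [3.0, 2.0, 5.0]])
X_reduced, explained_ratio = pca(X, n_components=2)
```

`X_reduced` has two columns instead of three and could be passed to a supervised model in place of `X`.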



## Semi-supervised learning

Semi-supervised learning is a type of machine learning that is a hybrid of supervised and unsupervised learning. It is used when a dataset contains a small amount of labeled data and a large amount of unlabeled data. One common approach, known as self-training or pseudo-labeling, is to train a model on the labeled data, use that model to label the unlabeled data, and then retrain on both.

The main difference between semi-supervised and supervised learning is how much of the training data is labeled. In supervised learning, every training example has a label. In semi-supervised learning, only a few examples are labeled, and the model additionally learns from the many unlabeled examples.

One of the main advantages of semi-supervised learning is that it can make use of a large amount of unlabeled data to improve the performance of the model. This can be especially useful when labeled data is scarce or expensive to obtain.
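The idea of training on the labeled data and then labeling the unlabeled data can be sketched as follows. This is only an illustration under simple assumptions: the classifier is a nearest-centroid model written in NumPy, and the helper names `fit_centroids` and `predict` as well as the data are hypothetical.

```python
import numpy as np


def fit_centroids(X, y):
    """Return the class labels and one mean vector per class."""
    classes = np.unique(y)
    return classes, np.array([X[y == c].mean(axis=0) for c in classes])


def predict(X, classes, centroids):
    """Assign each point the label of its closest class centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]


# A few labeled examples and a larger pool of unlabeled ones.
X_labeled = np.array([[0.0, 0.0], [10.0, 10.0]])
y_labeled = np.array([0, 1])
X_unlabeled = np.array([[1.0, 0.0], [0.0, 2.0], [9.0, 10.0], [11.0, 9.0]])

# Step 1: train on the labeled data only.
classes, centroids = fit_centroids(X_labeled, y_labeled)

# Step 2: use that model to assign pseudo-labels to the unlabeled data.
pseudo_labels = predict(X_unlabeled, classes, centroids)

# Step 3: retrain on labeled and pseudo-labeled data together.
X_all = np.vstack([X_labeled, X_unlabeled])
y_all = np.concatenate([y_labeled, pseudo_labels])
classes, centroids = fit_centroids(X_all, y_all)
```

After step 3 the centroids are estimated from six points instead of two, which is how the unlabeled data can improve the model. In practice, pseudo-labels can be wrong, so often only confident predictions are kept.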



## Reinforcement learning

Reinforcement learning (RL) is a type of machine learning that focuses on training models to make a sequence of decisions. The model, called an agent, learns by interacting with an environment in order to maximize its cumulative reward. RL is particularly useful when the goal is to train an agent to make decisions in a dynamic and uncertain environment. It is, for instance, widely used in fields such as robotics, gaming, and finance.
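The agent-environment loop can be illustrated with tabular Q-learning, one of many RL algorithms, on a toy problem. Everything in this sketch is an assumption made for illustration: the environment is a corridor of five cells with a reward of 1 for reaching the rightmost cell, the agent explores with purely random actions, and the hyperparameters are arbitrary.

```python
import numpy as np

N_STATES = 5        # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)  # action 0 moves left, action 1 moves right


def step(state, action):
    """Environment: apply the action and return (next_state, reward, done)."""
    next_state = min(max(state + ACTIONS[action], 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(n_episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    """Learn a table of action values Q[state, action] from experience."""
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, len(ACTIONS)))
    for _ in range(n_episodes):
        state = 0
        for _ in range(max_steps):
            action = int(rng.integers(len(ACTIONS)))  # random exploration
            next_state, reward, done = step(state, action)
            # Target: immediate reward plus discounted value of the next state.
            target = reward + (0.0 if done else gamma * q[next_state].max())
            q[state, action] += alpha * (target - q[state, action])
            state = next_state
            if done:
                break
    return q


q_table = q_learning()
greedy_policy = q_table[:-1].argmax(axis=1)  # best action in each non-goal cell
```

The agent is never told that moving right is correct. It only observes rewards as consequences of its actions, and the learned values end up favouring the action that leads to the goal.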



## Generative models

Generative models are trained to generate new data points that are similar to the ones in the training set. The goal is to learn the underlying probability distribution of the data, and then to use it to generate new samples.
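The two steps, learning a distribution and then sampling from it, can be shown with a deliberately simple stand-in for a deep generative model: a multivariate Gaussian fitted with NumPy. The training data here is synthetic and the variable names are assumptions for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "training set": 1000 two-dimensional points.
train = rng.normal(loc=[2.0, -1.0], scale=[1.0, 0.5], size=(1000, 2))

# Step 1: learn the distribution, here by estimating mean and covariance.
mean = train.mean(axis=0)
cov = np.cov(train, rowvar=False)

# Step 2: generate new data points by sampling from the learned distribution.
new_samples = rng.multivariate_normal(mean, cov, size=5)
```

The new samples are not copies of training points, but they follow the same estimated distribution. More expressive models replace the Gaussian with a neural network so that complex data such as images can be modelled.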

There are several types of generative models:


1. **Generative Adversarial Networks (GANs)**: GANs consist of two networks, a generator and a discriminator. The generator is trained to generate new samples that are similar to the training data, and the discriminator is trained to distinguish the generated samples from the real samples. Through this adversarial process, the generator learns to generate more realistic samples.


2. **Variational Autoencoders (VAEs)**: VAEs consist of two networks, an encoder and a decoder. The encoder maps the input data to a distribution over a lower-dimensional representation, called the latent space. The decoder then maps points in the latent space back to the original data space. VAEs can generate new samples by sampling points from the latent space and passing them through the decoder.


Generative models can be used for a wide range of applications, such as image synthesis or text generation, and will be discussed in this lecture.



## Important Python packages

There are several useful Python packages for machine learning providing functionalities to load, process, visualize, model and evaluate data.


* **NumPy** is a package for scientific computing with Python. It provides support for large arrays and matrices along with a useful library of mathematical functions.


* **SciPy** is a package for scientific computing in Python. It provides advanced functionality for optimization and statistical methods.


* **Scikit-learn** is a machine learning library for Python. It provides a wide range of learning algorithms.


* **TensorFlow and Keras**: TensorFlow is a library for machine learning. Keras is a high-level neural network API that runs on top of TensorFlow and allows easy prototyping.


* **PyTorch** is a deep learning library built around a dynamic computational graph, which allows neural networks to be built and trained more flexibly than with the static graphs of TensorFlow 1.x. (TensorFlow 2.x also executes eagerly by default.)


* **Pandas** is a package for structuring and managing data. 


* **Matplotlib** is a plotting library.
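As a rough illustration of how these packages fit together, the following sketch uses scikit-learn to load a small built-in dataset, split it, train a classifier, and evaluate it. The choice of dataset, model, and split parameters is arbitrary and only meant as an example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load: 150 iris flowers, 4 numeric features each, 3 classes.
X, y = load_iris(return_X_y=True)

# Process: hold out a quarter of the data to test on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Model: fit a logistic regression classifier on the training part.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Evaluate: fraction of correctly classified test samples.
accuracy = accuracy_score(y_test, model.predict(X_test))
```

`X` and `y` are NumPy arrays, which is why NumPy underlies most of the other packages in this list.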


