# What is "AI"?

Terms such as *artificial intelligence* (AI), *machine learning* (ML) and *deep learning* (DL) are often used interchangeably, but represent different concepts in the field of intelligent machines and computer systems.

## AI

*Artificial intelligence* (AI) is the overarching field that encompasses everything to do with the development of machines or systems that can mimic human intelligence. AI can be thought of as the general goal of making machines intelligent and enabling them to perform tasks that normally require human intelligence, such as understanding natural language or recognizing images. AI can be achieved through various techniques, and one of the most well-known methods is *machine learning*.

John McCarthy, one of the "founding fathers" of this field, defines AI as follows:

> AI is the science and engineering of making intelligent machines, especially intelligent computer programs. It is related to the similar task of using computers to understand human intelligence, but AI does not have to confine itself to methods that are biologically observable.
>
> -- John McCarthy

## Machine Learning

*Machine learning* (ML) is a branch of AI that focuses on a specific approach to achieving artificial intelligence. At its core, machine learning is about developing algorithms that enable computers to learn from data. Instead of relying on an explicitly programmed, predefined set of rules, these algorithms adapt to the underlying data and improve their performance over time by analyzing and learning from the information provided to them.

It's like teaching a computer to recognize cats in photos by showing it thousands of cat pictures and then letting it recognize patterns on its own. Machine learning plays an important role in a variety of applications, from recommendation systems for products to predictive analytics and image recognition.

<br />

<figure>
<img src="https://github.com/bbirke/ml-python/blob/main/images/ml-traditional.png?raw=true" alt="prog-ml" width="500" style="float: right;"/>
<figcaption>Fig.1 - Traditional Programming in contrast to Machine Learning.</figcaption>
</figure>

<br />

A more engineering-oriented definition of ML was given by Tom Mitchell:

> A computer program is said to learn from experience **E** with respect to
some task **T** and some performance measure **P**, if its performance on **T**, as
measured by **P**, improves with experience **E**.
>
>-- Tom Mitchell
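
To make the three letters concrete, here is a minimal sketch under made-up assumptions: the task **T** is predicting a hypothetical function `f`, the experience **E** is the number of labeled examples the program has seen, and the performance measure **P** is the mean absolute error on fixed test points. The "learner" simply copies the label of the closest known example; it is only meant to illustrate the definition, not to be a realistic algorithm.

```python
def f(x):
    """The unknown relationship the program should learn (hypothetical)."""
    return x * x

def collect_experience(n_examples):
    """Experience E: n labeled examples (x, f(x)) spread evenly over [0, 1]."""
    xs = [i / (n_examples - 1) for i in range(n_examples)]
    return [(x, f(x)) for x in xs]

def predict(examples, x):
    """Task T: predict f(x) by copying the label of the closest known example."""
    nearest_x, nearest_y = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest_y

def mean_abs_error(examples):
    """Performance P: mean absolute error on 101 fixed test points."""
    test_xs = [i / 100 for i in range(101)]
    return sum(abs(predict(examples, x) - f(x)) for x in test_xs) / len(test_xs)

sizes = (2, 5, 20)
errors = [mean_abs_error(collect_experience(n)) for n in sizes]
for n, err in zip(sizes, errors):
    print(f"E = {n:2d} examples -> P (mean abs. error) = {err:.4f}")
```

The error shrinks as the number of examples grows, which is exactly what the definition calls learning.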


## Deep Learning

*Deep learning* (DL) is a subfield of machine learning that uses artificial neural networks inspired by the structure and functions of the human brain. These neural networks consist of layers of interconnected nodes that process and transform data. The "deep" in deep learning refers to the many layers of these neural networks. Deep learning has shown remarkable success in complex tasks such as image and speech recognition, natural language processing and even mastering complex games.

To clarify the differences between these terms, one can think of *AI* as the overarching goal, *machine learning* as one of the primary methods to achieve *AI*, and *deep learning* as a specialized and very effective approach within ML.

<figure>
<img src="https://github.com/bbirke/ml-python/blob/main/images/ai-ml-dl.png?raw=true" alt="ai-ml-dl" width="400"/>
<figcaption>Fig.2 - The relationship between the terms AI, Machine Learning, and Deep Learning.</figcaption>
</figure>

# The ML landscape

The field of ML comprises a large number of different types of machine learning systems. To give a clearer picture, it is useful to classify them into broad categories, based on the following criteria:

* How are they supervised during training?
* Can they learn incrementally on the fly, or do they have to be trained on the whole dataset at once?
* Do they compare new data points to known ones, or do they detect patterns in the training data to build a predictive model?

These criteria are not mutually exclusive and can be combined. For example, a state-of-the-art email spam filter may learn on the fly with a deep neural network model trained on human-provided examples of spam and ham, which makes it an online, model-based, supervised learning system.

## Training Supervision

Machine learning systems can be categorized based on the amount and nature of supervision they receive during training. The "classical" categories include supervised learning, unsupervised learning, semi-supervised learning, and reinforcement learning. Self-supervised learning is covered here as well.

### Supervised Learning

In supervised learning, the training set you feed to the algorithm includes the
desired solutions, called *labels*.

A typical supervised learning task is *classification*. The spam filter is a good example of this: it is trained with many example emails along with their class (spam or ham), and it must learn how to classify new emails.

Another typical task is to predict a target numeric value, such as the price of a car, given a set of features (mileage, age, brand, etc.). This sort of task is called *regression*.

<figure>
<img src="https://github.com/bbirke/ml-python/blob/main/images/supervised_ml.png?raw=true" alt="ai-ml-dl" width="600"/>
<figcaption>Fig.3 - E-Mail classification as an example of supervised learning.</figcaption>
</figure>
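
As a minimal sketch of a supervised regression task, the following fits a straight line to labeled examples with ordinary least squares. The car data (mileage in thousands of kilometres, price in an arbitrary currency) is invented for illustration, and a single feature is used only to keep the example short.

```python
# Labeled training set: each feature value comes with its desired solution.
mileage_k = [10.0, 50.0, 100.0, 150.0]          # feature (made-up values)
price = [19200.0, 14900.0, 10300.0, 4800.0]     # label (made-up values)

def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept

slope, intercept = fit_line(mileage_k, price)

# Predict the numeric target for a car the model has never seen.
predicted_price = slope * 80.0 + intercept
print(f"price = {slope:.1f} * mileage_k + {intercept:.1f}")
print(f"predicted price at 80,000 km: {predicted_price:.0f}")
```

A classification task would follow the same pattern, with a class such as spam or ham as the label instead of a number.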

### Unsupervised learning
In unsupervised learning, the training data is unlabeled. The system tries to learn without a teacher.

For example, say you have a lot of data about your blog's visitors. You may want to run a clustering algorithm to try to detect groups of similar visitors. At no point do you inform the algorithm which group a visitor belongs to; it discovers these connections autonomously without your assistance.
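
A rough sketch of such a clustering step, using a bare-bones k-means loop. The visitor features (visits per week, minutes per visit), the data values and the choice of two clusters are assumptions made for illustration. Note that no group labels are passed in anywhere.

```python
def closest(point, centroids):
    """Index of the centroid with the smallest squared distance to point."""
    return min(range(len(centroids)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])))

def k_means(points, initial_centroids, n_iterations=10):
    centroids = list(initial_centroids)
    for _ in range(n_iterations):
        # Assignment step: each point joins the cluster of its closest centroid.
        labels = [closest(p, centroids) for p in points]
        # Update step: each centroid moves to the mean of its members.
        for i in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids

# Made-up visitors: (visits per week, minutes per visit)
visitors = [(1, 2), (2, 1), (1, 1), (9, 10), (10, 9), (10, 10)]
labels, centroids = k_means(visitors, [visitors[0], visitors[-1]])
print(labels)
print(centroids)
```

Starting the centroids at the first and last visitor keeps the sketch deterministic; practical implementations usually pick the starting points randomly or with a dedicated initialization scheme.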

Another common unsupervised task is *association rule learning*, in
which the goal is to dig into large amounts of data and discover interesting
relations between attributes. For example, suppose you own a supermarket.
Running an association rule learning algorithm on your sales logs may reveal that people who
purchase barbecue sauce and potato chips also tend to buy steak. Thus, you
may want to place these items close to one another.
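
Two quantities commonly used to judge such a rule are its *support* (how often all items occur together) and its *confidence* (how often the conclusion holds when the premise does). The sketch below computes both for the rule "barbecue sauce and potato chips imply steak" on a handful of invented shopping baskets; a real algorithm would additionally search over many candidate rules.

```python
# Made-up sales log: one set of items per purchase.
transactions = [
    {"barbecue sauce", "potato chips", "steak"},
    {"barbecue sauce", "potato chips", "steak", "beer"},
    {"barbecue sauce", "potato chips"},
    {"milk", "bread"},
    {"steak", "bread"},
    {"barbecue sauce", "potato chips", "steak"},
]

def rule_stats(transactions, premise, conclusion):
    """Return (support, confidence) of the rule premise -> conclusion."""
    with_premise = [t for t in transactions if premise <= t]
    with_both = [t for t in with_premise if conclusion <= t]
    support = len(with_both) / len(transactions)
    confidence = len(with_both) / len(with_premise) if with_premise else 0.0
    return support, confidence

support, confidence = rule_stats(
    transactions, {"barbecue sauce", "potato chips"}, {"steak"}
)
print(f"support = {support:.2f}, confidence = {confidence:.2f}")
```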

<figure>
<img src="https://github.com/bbirke/ml-python/blob/main/images/unsupervised_ml.png?raw=true" alt="ai-ml-dl" width="600"/>
<figcaption>Fig.4 - Clustering user groups as an example of unsupervised learning.</figcaption>
</figure>

### Semi-supervised learning

Since labeling data is usually time-consuming and costly, you will often have plenty of unlabeled instances, and few labeled instances. Some algorithms can deal with data that's partially labeled. This is called *semi-supervised learning*.

Some photo-hosting services, such as Google Photos, are good examples of
this. Once you upload all your family photos to the service, it automatically recognizes that the same person *A* shows up in photos 1, 5, and 11, while another person *B* shows up in photos 2, 5, and 7. This is the unsupervised part of the algorithm (clustering). Now all the system needs is for you to tell it who these people are. Just add one label per person and it is able to name everyone in every photo, which is useful for searching photos.
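
The labeling step of this example can be sketched as follows. The face clusters are assumed to come from an unsupervised clustering step that is not shown, and the names are hypothetical. How any particular photo service implements this is not known here; the code only illustrates the idea of spreading a few labels across whole clusters.

```python
from collections import Counter, defaultdict

# (photo number, face cluster) pairs, assumed output of a clustering step.
faces = [(1, "A"), (5, "A"), (11, "A"), (2, "B"), (5, "B"), (7, "B")]

# The only labels the user provides: one named face per person (hypothetical names).
user_labels = {(1, "A"): "Alice", (2, "B"): "Bob"}

def propagate(faces, user_labels):
    """Give every face the most common user label found in its cluster."""
    votes = defaultdict(Counter)
    for (photo, cluster), name in user_labels.items():
        votes[cluster][name] += 1
    cluster_name = {c: counter.most_common(1)[0][0] for c, counter in votes.items()}

    people_in_photo = defaultdict(list)
    for photo, cluster in faces:
        people_in_photo[photo].append(cluster_name.get(cluster, "unknown"))
    return dict(people_in_photo)

people_in_photo = propagate(faces, user_labels)
for photo in sorted(people_in_photo):
    print(photo, people_in_photo[photo])
```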

<figure>
<img src="https://github.com/bbirke/ml-python/blob/main/images/semi-supervised_ml.png?raw=true" alt="ai-ml-dl" width="600"/>
<figcaption>Fig.5 - Semi-supervised learning with two classes (triangles and squares): the unlabeled examples (circles) help classify a new instance (the cross) into the triangle class rather than the square class, even though it is closer to the labeled squares.</figcaption>
</figure>

### Self-supervised learning
Another approach to machine learning involves generating a fully labeled dataset from a fully unlabeled one. Once the whole dataset is labeled, any supervised learning algorithm can be used. This approach is called *self-supervised learning*.

For example, if you have a large dataset of unlabeled images, you can randomly mask a small part of each image and then train a model to recover the original image. During training, the masked images are used as the inputs to the model, and the original images are used as the labels.
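
A minimal sketch of how such training pairs could be generated. The "images" are small random grids of pixel values standing in for real pictures, and the patch size and the mask value 0 are arbitrary choices for illustration; the model that would learn from these pairs is not shown.

```python
import random

def mask_image(image, patch_size, rng):
    """Return a copy of image with one random square patch set to 0."""
    height, width = len(image), len(image[0])
    top = rng.randrange(height - patch_size + 1)
    left = rng.randrange(width - patch_size + 1)
    masked = [row[:] for row in image]
    for r in range(top, top + patch_size):
        for c in range(left, left + patch_size):
            masked[r][c] = 0
    return masked

rng = random.Random(0)

# Stand-in for a dataset of unlabeled 6x6 grayscale images (pixel values 1..255).
unlabeled_images = [
    [[rng.randint(1, 255) for _ in range(6)] for _ in range(6)]
    for _ in range(3)
]

# Each pair is (input, label): the masked image and the original it came from.
training_pairs = [(mask_image(img, 2, rng), img) for img in unlabeled_images]
print(len(training_pairs), "labeled pairs created without any human labeling")
```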

The resulting model may be quite useful in itself, for example to repair damaged images or to erase unwanted objects from pictures. But more often than not, a model trained using self-supervised learning is not the final goal. You'll usually want to tweak and fine-tune the model for a slightly different task.

Suppose you actually want to have a pet classification model: given a picture of any pet, it will tell you what species it belongs to. If you have a large dataset of unlabeled photos of pets, you can start by training an image-repairing model using self-supervised learning. Once it's performing well, it should be able to distinguish different pet species: when it repairs an image of a cat whose face is masked, it must know not to add a dog's face. Assuming your model's architecture allows it, it is then possible to fine-tune the model so that it predicts pet species instead of repairing images.

<figure>
<img src="https://github.com/bbirke/ml-python/blob/main/images/self-supervised_ml.png?raw=true" alt="ai-ml-dl" width="800"/>
<figcaption>Fig.6 - Self-supervised learning example: A masked image of a cat as input (left) and the target (right) we want our model to predict.</figcaption>
</figure>

### Reinforcement learning
In reinforcement learning, the learning system, called an ***agent*** in this context, observes the environment, selects and performs an ***action***, and receives in return the changed ***state*** of its environment and a ***reward*** (or a penalty in the form of a negative reward). It must then learn by itself the best strategy, called a *policy*, to get the most reward over time. A policy defines which action the agent should choose in a given situation.

<figure>
<img src="https://github.com/bbirke/ml-python/blob/main/images/reinforcement_learning.png?raw=true" alt="ai-ml-dl" width="600"/>
<figcaption>Fig.7 - For example, many robots implement reinforcement learning algorithms to learn how to walk.</figcaption>
</figure>
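
The agent, action, state, reward and policy loop can be sketched with tabular Q-learning, one well-known reinforcement learning algorithm. The environment below is an invented five-cell corridor with a reward in the last cell, and the learning rate, discount factor and episode count are arbitrary choices for illustration.

```python
import random

N_STATES, GOAL = 5, 4          # corridor cells 0..4, the reward waits in cell 4
ACTIONS = (-1, +1)             # move left or move right

def step(state, action):
    """Environment: returns the new state and the reward for reaching it."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def train(episodes=200, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]    # q[state][action index]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            a = rng.randrange(2)                  # explore with random actions
            next_state, reward = step(state, ACTIONS[a])
            # Move the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q_table = train()

# The learned policy: in each non-goal cell, pick the action with the higher value.
policy = ["left" if q[0] > q[1] else "right" for q in q_table[:GOAL]]
print(policy)
```

The agent is never told that walking right is correct; it infers this from the rewards alone.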

## Batch vs. Online Learning

Another criterion used to classify machine learning systems is whether or not
the system can learn incrementally from a stream of incoming data.

### Batch learning

Batch learning involves training an ML system on the entire dataset at once. The algorithm processes the entire dataset, often multiple times (one full pass over the data is known as an ***epoch***), to update the model parameters. Training continues until the model converges to an optimal solution or until a stopping criterion is met. After training is complete, the ML system is deployed to make predictions on new data. Batch learning is commonly used when the entire dataset is available upfront and can fit into memory. It is often associated with offline training, where there is no need for real-time updates or adjustments to the model.
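
A minimal sketch of batch training, assuming a linear model trained with gradient descent on a tiny invented dataset. Every parameter update uses the whole dataset, and each pass of the loop is one epoch; the learning rate and epoch count are arbitrary choices.

```python
# The complete training set is available upfront (made-up data: y = 2x + 1).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

def train_batch(xs, ys, epochs=500, learning_rate=0.05):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # One epoch: compute the error on every example, then update once.
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

w, b = train_batch(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")

# After training, the parameters stay fixed and are only used for prediction.
prediction = w * 10.0 + b
```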

In many situations, a model's performance decays slowly over time, simply because the world continues to evolve while the model remains unchanged. This phenomenon is often called ***model rot*** or ***data drift***. The solution is to regularly retrain the model on up-to-date data; how often depends on the task at hand. A system classifying pictures of cats and dogs will probably need no regular updates, but a model that makes predictions on the financial market is likely to decay quite fast.

### Online Learning

Online learning, also known as incremental learning or streaming learning, involves updating the model continuously as new data becomes available. Instead of processing the entire dataset at once, the model is updated iteratively, typically after receiving each new data point or a small group of data points called a ***mini-batch***.

Online learning is well-suited for scenarios where data is continuously streaming in and the model needs to adapt to changing patterns or trends in real-time. This approach is commonly used in applications such as recommendation systems, fraud detection, and monitoring systems where immediate responses to new data are required.
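
As a rough sketch, the model below is updated after every single data point and never stores the data it has seen. The stream is a hypothetical generator whose underlying pattern changes halfway through, to show how the model follows the change; the class and method names are made up for this example.

```python
def data_stream(slope, n_examples):
    """Hypothetical stream: yields one (x, y) pair at a time with y = slope * x."""
    for i in range(n_examples):
        x = float(i % 3 + 1)
        yield x, slope * x

class OnlineLinearModel:
    """Model y = w * x, trained with one gradient step per incoming example."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0
        self.learning_rate = learning_rate

    def predict(self, x):
        return self.w * x

    def learn_one(self, x, y):
        error = self.predict(x) - y
        self.w -= self.learning_rate * 2 * error * x

model = OnlineLinearModel()

for x, y in data_stream(slope=2.0, n_examples=200):
    model.learn_one(x, y)
w_before_drift = model.w

# The pattern in the incoming data changes; the model keeps learning.
for x, y in data_stream(slope=3.0, n_examples=200):
    model.learn_one(x, y)
w_after_drift = model.w

print(f"before drift: w = {w_before_drift:.3f}, after drift: w = {w_after_drift:.3f}")
```

The learning rate controls how quickly the model adapts: a high value follows new data fast but forgets old patterns quickly, a low value does the opposite.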



## Instance-Based Versus Model-Based Learning

ML systems can also be classified based on how they generalize. While performance on training data is important, the goal is to excel on new, unseen instances. Approaches to generalization fall into two main groups: instance-based learning and model-based learning.

### Instance-based learning

The simplest form of learning is to learn by heart. For instance, when creating an email spam filter, one might simply flag emails *identical* to those previously marked as spam by users, though this approach isn't optimal.

A more effective method involves flagging emails *similar* to known spam. This requires a *measure of similarity* between emails, for example by counting shared words. The system would flag an
email as spam if it has many words in common with a known spam email.

This approach, known as instance-based learning, relies on memorized examples and compares new instances to the learned examples using a similarity measure. For example, in a classification scenario, a new instance would be categorized based on its similarity to previously classified instances.

<figure>
<img src="https://github.com/bbirke/ml-python/blob/main/images/instance-based.png?raw=true" alt="ai-ml-dl" width="600"/>
<figcaption>Fig.8 - Instance-based learning with two classes (triangles and squares): the new instance is classified by using a distance measure to find the nearest neighbours.</figcaption>
</figure>
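
The spam example can be sketched in a few lines, assuming the number of shared words as the similarity measure and a single nearest neighbour for the decision. The emails are invented, and the training examples are kept as they are rather than being turned into a model.

```python
def similarity(email_a, email_b):
    """Number of distinct words the two emails have in common."""
    return len(set(email_a.lower().split()) & set(email_b.lower().split()))

# Memorized examples (made-up emails with their known class).
known_emails = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday at noon", "ham"),
]

def classify(new_email, known_emails):
    """Return the class of the most similar memorized email."""
    best_text, best_label = max(
        known_emails, key=lambda item: similarity(new_email, item[0])
    )
    return best_label

first = classify("claim your free prize now", known_emails)
second = classify("agenda for the monday meeting", known_emails)
print(first, second)
```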

### Model-based learning

Model-based learning, on the other hand, involves building a generalized model from the training data, which can then be used to make predictions on new, unseen instances. Instead of storing all training instances, model-based learning extracts patterns and relationships from the data to create a representation of the underlying structure.

In model-based learning, the system learns parameters or coefficients that define the relationships between the input features and the target variable. This model can then be used to predict the target variable for new unseen instances.

<figure>
<img src="https://github.com/bbirke/ml-python/blob/main/images/model-based.png?raw=true" alt="ai-ml-dl" width="600"/>
<figcaption>Fig.9 - Model-based learning with two classes (triangles and squares): the new instance is classified by a trained model. The dotted line corresponds to the decision boundary of the model.</figcaption>
</figure>
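
For contrast with the instance-based approach, here is a minimal sketch of a model-based classifier. It assumes a nearest-centroid model, chosen only because it is short: training reduces the data to one learned centroid per class, and prediction uses these parameters alone. The points are made up, and the decision boundary is implicitly the line halfway between the two centroids.

```python
def fit(points, labels):
    """Learn one centroid per class; the centroids are the model's parameters."""
    sums = {}
    for (x, y), label in zip(points, labels):
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    return {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}

def predict(model, point):
    """Classify by the closest centroid; the training data is no longer needed."""
    return min(
        model,
        key=lambda label: (point[0] - model[label][0]) ** 2
        + (point[1] - model[label][1]) ** 2,
    )

# Made-up training data with two classes.
points = [(1, 1), (1, 2), (2, 1), (5, 5), (5, 6), (6, 5)]
labels = ["triangle", "triangle", "triangle", "square", "square", "square"]

model = fit(points, labels)
print(model)
print(predict(model, (2, 2)), predict(model, (4, 5)))
```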