# MACHINE LEARNING (ML) LANDSCAPE

## MACHINE LEARNING DEFINITION

[Machine Learning is the] field of study that gives computers the ability to learn
without being explicitly programmed.
—Arthur Samuel, 1959

### When to use ML?

Machine learning is worth considering when we face any of the following scenarios:

• Problems for which existing solutions require a lot of hand-tuning or long lists of
rules: one Machine Learning algorithm can often simplify code and perform better.

• Complex problems for which there is no good solution at all using a traditional
approach: the best Machine Learning techniques can find a solution.

• Fluctuating environments: a Machine Learning system can adapt to new data.

• Getting insights about complex problems and large amounts of data.

***

## TYPES OF ML

We can categorize machine learning systems by how much supervision they get during training, whether they can learn incrementally from a stream of incoming data, and whether they work by comparing new data points with known ones or by building a predictive model.

### Supervised/Unsupervised Learning

#### Supervised learning
Supervised learning is a machine learning paradigm where the algorithm learns from labeled data, which means it is provided with input-output pairs during training. The goal is to learn a mapping from inputs to outputs, allowing the algorithm to make predictions or decisions on unseen data.
For example, consider a supervised learning task of predicting house prices based on features such as size, number of bedrooms, and location. In this case, the algorithm would be trained on a dataset where each data point includes features of a house (input) along with its corresponding sale price (output). By learning from this labeled data, the algorithm can then predict the price of a new house given its features.
Some of the most important supervised learning algorithms are:

• k-Nearest Neighbors

• Linear Regression

• Logistic Regression

• Support Vector Machines (SVMs)

• Decision Trees and Random Forests

• Neural networks
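
As a rough illustration of the house-price example above, the sketch below fits a straight line to a handful of labeled (size, price) pairs and then predicts the price of an unseen house. The data values and the helper names `fit_line` and `predict` are made up for this example, and a single feature is assumed to keep it short.

```python
# Labeled training data: (house size in square meters, sale price).
# These values are invented for illustration only.
training_data = [(50, 150_000), (70, 210_000), (90, 270_000), (120, 360_000)]


def fit_line(pairs):
    """Least-squares fit of: price = slope * size + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    variance = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, size):
    """Apply the learned mapping to a new input."""
    slope, intercept = model
    return slope * size + intercept


model = fit_line(training_data)        # learn from input-output pairs
predicted_price = predict(model, 100)  # predict for a house not in the data
print(round(predicted_price))
```

The labels (prices) are what make this supervised: the fit is driven by the error between predicted and known outputs.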

#### Unsupervised learning
Unsupervised learning is a machine learning paradigm where the algorithm learns patterns and structures from unlabeled data. It aims to discover hidden relationships or groupings within the data without explicit guidance.
An example of unsupervised learning is clustering customer data based on purchasing behavior. The algorithm groups customers into clusters based on similarities in their purchase history, without any predefined labels or categories. This can help businesses identify customer segments and tailor marketing strategies accordingly.

Some of the most important unsupervised learning algorithms are:

• K-Means Clustering

• Hierarchical Clustering

• Principal Component Analysis (PCA)

• Autoencoders

• Generative Adversarial Networks (GANs)

• Apriori Algorithm
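
The customer-segmentation example can be sketched with a very small k-means implementation. The customer data below (orders per month, average basket value) is invented, and using the first `k` points as initial centroids is a simplifying assumption; real implementations usually pick initial centroids more carefully.

```python
import math

# Unlabeled data: (orders per month, average basket value). Invented values.
customers = [(1, 20), (2, 25), (1, 22), (10, 200), (11, 210), (12, 190)]


def kmeans(points, k, iterations=10):
    # Assumption: start from the first k points (simple and deterministic).
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(c) / len(cluster) for c in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    labels = [min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points]
    return centroids, labels


centroids, labels = kmeans(customers, k=2)
print(labels)
```

No labels are given to the algorithm; the two groups emerge only from the distances between customers.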

#### Semi-supervised learning
Semi-supervised learning is a paradigm that combines aspects of both supervised and unsupervised learning. The algorithm leverages a small amount of labeled data along with a larger pool of unlabeled data to improve its performance in tasks such as classification or regression.
A practical example of semi-supervised learning is document classification. Suppose you have a large collection of text documents, but only a small subset of them are labeled with their corresponding categories (e.g., "sports," "politics," "technology"). By employing semi-supervised learning techniques, such as co-training or self-training, the algorithm can use both the labeled and the unlabeled documents to improve its classification accuracy. It might first learn from the labeled documents, then label the unlabeled documents it is most confident about and retrain on them, refining its model with each iteration.
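
The self-training loop described above can be sketched as follows. Everything here is a toy assumption: the documents are invented, the "classifier" just counts word overlap per category, and the confidence measure is the score margin between the best and second-best category.

```python
from collections import Counter

labeled_docs = [
    ("the team won the match", "sports"),
    ("the senate passed the bill", "politics"),
]
unlabeled_docs = [
    "the team scored a late goal",
    "the senate debated the election",
    "a late goal decided the final",
    "the election campaign begins",
]
STOPWORDS = {"the", "a"}


def tokens(text):
    return [w for w in text.lower().split() if w not in STOPWORDS]


def train(examples):
    """Count how often each word appears in each category."""
    counts = {}
    for text, label in examples:
        counts.setdefault(label, Counter()).update(tokens(text))
    return counts


def predict(counts, text):
    """Return (label, margin); the margin acts as a crude confidence score."""
    scores = {label: sum(c[w] for w in tokens(text)) for label, c in counts.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    runner_up = scores[ranked[1]] if len(ranked) > 1 else 0
    return ranked[0], scores[ranked[0]] - runner_up


def self_train(labeled, unlabeled, rounds=3, min_margin=1):
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(rounds):
        counts = train(labeled)
        remaining = []
        for text in pool:
            label, margin = predict(counts, text)
            if margin >= min_margin:
                labeled.append((text, label))  # accept the pseudo-label
            else:
                remaining.append(text)         # not confident yet, retry later
        if len(remaining) == len(pool):
            break                              # nothing new was learned
        pool = remaining
    return train(labeled), labeled


model, augmented = self_train(labeled_docs, unlabeled_docs)
print(dict(augmented)["a late goal decided the final"])
```

In this toy run, the third document shares no words with the two labeled ones, so it can only be classified after the first round of pseudo-labels has enlarged the training set.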

### Reinforcement learning

Reinforcement learning (RL) is a type of machine learning where an agent learns to make decisions by interacting with an environment. Unlike supervised learning, where the algorithm is trained on labeled data, and unsupervised learning, which deals with unlabeled data, reinforcement learning operates on a different principle: trial and error.
In reinforcement learning, the agent takes actions in an environment and receives feedback in the form of rewards or penalties, indicating how well it performed the task. The agent's goal is to learn a policy (a strategy for selecting actions based on states) that maximizes cumulative reward over time.
A classic example of reinforcement learning is training an AI to play a game, such as chess or Go. The agent (the AI player) interacts with the game environment by making moves and receives feedback in the form of wins, losses, or draws. By learning from these outcomes through trial and error, the agent gradually improves its strategy, learning to make better moves that lead to higher chances of winning.
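
Chess or Go is far too large for a short example, so the sketch below uses a made-up five-cell corridor instead: the agent starts in cell 0 and is rewarded only when it reaches the last cell. It uses tabular Q-learning, one common RL algorithm, and for simplicity the agent explores with uniformly random actions. The constants and function names are assumptions for this illustration.

```python
import random

N_STATES = 5            # corridor cells 0..4; the goal is the last cell
GOAL = N_STATES - 1
ACTIONS = (-1, +1)      # move left or move right
ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor


def step(state, action):
    """Environment: move inside the corridor; reward 1 only at the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            action = rng.choice(ACTIONS)  # trial and error: random exploration
            next_state, reward = step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: nudge the estimate toward reward + discounted future value.
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q


q_table = train()
# The learned policy picks, in each state, the action with the highest estimated value.
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in range(GOAL)}
print(policy)
```

The agent is never told that moving right is correct; it infers this from the rewards alone.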

### Batch and online learning

In batch learning, the model is trained on the entire dataset at once. It requires access to the complete dataset before training begins, and the model is updated based on this complete dataset. Suppose you have a dataset of housing prices with various features like size, location, and number of rooms. In batch learning, you would use the entire dataset to train your model. For instance, you might use linear regression to predict housing prices based on these features. The model learns from the entire dataset in one go, adjusting its parameters to minimize the error across all examples.
In online learning, the model is trained incrementally as new data becomes available. It updates its parameters continuously, learning from each new data point or small batch of data as it arrives, without needing the entire dataset at once. Consider a spam email filter. With online learning, the filter learns continuously as it encounters new emails. As each new email is classified as spam or not spam, the model updates its parameters to improve its classification accuracy. The model adapts to changing trends in spam emails over time, without needing to retrain on the entire dataset each time.
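
The difference can be sketched with a one-parameter model `y = w * x`. The batch version needs the whole dataset to compute `w`, while the online version adjusts `w` a little after every example. The data, the class name `OnlineModel`, and the learning rate are assumptions chosen for this illustration.

```python
def batch_fit(dataset):
    """Batch learning: needs the complete dataset and computes w in one go."""
    return sum(x * y for x, y in dataset) / sum(x * x for x, _ in dataset)


class OnlineModel:
    """Online learning: one small gradient step per incoming example."""

    def __init__(self, learning_rate=0.05):
        self.w = 0.0
        self.learning_rate = learning_rate

    def learn_one(self, x, y):
        error = y - self.w * x
        self.w += self.learning_rate * error * x


# Invented data that follows y = 2x.
dataset = [(x, 2.0 * x) for x in (1, 2, 3, 4)] * 3

batch_w = batch_fit(dataset)

online = OnlineModel()
for x, y in dataset:  # pretend the examples arrive one at a time
    online.learn_one(x, y)

print(batch_w, online.w)
```

The online model never holds more than one example in memory, which is what allows it to keep adapting as new data arrives.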

### Instance-based vs model-based learning

In instance-based learning, the model does not explicitly learn a generalized representation of the data. Instead, it memorizes the training examples and makes predictions based on similarity measures between new instances and the instances in the training data. Consider the k-nearest neighbors (k-NN) algorithm: when a new data point is encountered, the model looks at the k nearest data points (neighbors) in the training set and predicts the majority class or the average value of those neighbors. There is no explicit model training; instead, the algorithm stores the entire training dataset and uses it to make predictions at runtime.
In model-based learning, the model learns a generalized representation of the data during the training phase. This representation is typically captured by parameters or coefficients that define the model's structure. The trained model is then used to make predictions on new data. Linear regression is a classic example of model-based learning. During training, the model learns the coefficients (weights) that define a linear relationship between input features and the target variable. Once trained, the model can predict the target variable for new instances based on their feature values, using the learned coefficients.
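
A minimal k-NN classifier makes the instance-based idea concrete. The class name `KNNClassifier`, the points, and the labels below are invented for this sketch, and Euclidean distance is assumed as the similarity measure.

```python
import math
from collections import Counter


class KNNClassifier:
    def __init__(self, k=3):
        self.k = k
        self.examples = []

    def fit(self, points, labels):
        # "Training" only stores the examples; nothing is generalized here.
        self.examples = list(zip(points, labels))

    def predict(self, point):
        # All the work happens at prediction time: find the k closest examples.
        neighbors = sorted(self.examples, key=lambda ex: math.dist(point, ex[0]))[: self.k]
        votes = Counter(label for _, label in neighbors)
        return votes.most_common(1)[0][0]


points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels = ["small", "small", "small", "large", "large", "large"]

knn = KNNClassifier(k=3)
knn.fit(points, labels)
print(knn.predict((2, 2)), knn.predict((7, 9)))
```

A model-based learner such as linear regression would instead keep only its fitted coefficients after training and could discard the training examples.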
***

## Exercises

These questions are proposed in the book 'Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow' (2019).

#### How would you define Machine Learning?

The capacity of computers to learn from data without being explicitly programmed.

#### Can you name four types of problems where it shines?

#### What is a labeled training set?

#### What are the two most common supervised tasks?

#### Can you name four common unsupervised tasks?

#### What type of Machine Learning algorithm would you use to allow a robot to walk in various unknown terrains?

#### What type of algorithm would you use to segment your customers into multiple groups?

#### Would you frame the problem of spam detection as a supervised learning problem or an unsupervised learning problem? 

#### What is an online learning system?

#### What is out-of-core learning?

#### What type of learning algorithm relies on a similarity measure to make predictions?

#### What is the difference between a model parameter and a learning algorithm’s hyperparameter?

#### What do model-based learning algorithms search for? What is the most common strategy they use to succeed? How do they make predictions?

#### Can you name four of the main challenges in Machine Learning?

#### If your model performs great on the training data but generalizes poorly to new instances, what is happening? Can you name three possible solutions?

#### What is a test set and why would you want to use it? What is the purpose of a validation set?

#### What can go wrong if you tune hyperparameters using the test set?

#### What is repeated cross-validation and why would you prefer it to using a single validation set?
