# Introduction To Machine Learning

## Artificial Intelligence

**Artificial Intelligence (AI)** is a field of computer science aimed at designing systems that can perform tasks we usually associate with human thinking. These tasks include learning from experience, solving complex problems, understanding language, and interpreting visual information. AI aims to copy the way humans think, reason, and make decisions, using powerful algorithms and vast amounts of data to accomplish these goals.

### Example:
A common example is **Google Assistant**, which can understand human language, process it, and perform tasks such as setting alarms and reminders, playing music, and answering questions.

## Machine Learning 

**Machine Learning (ML)** is a subfield of AI that helps computers learn from data in a way that's similar to how people learn from experience. Instead of being given exact instructions, ML algorithms look for patterns in the data and use those patterns to make decisions or predictions. As they process more data over time, they get better at recognizing trends and making accurate predictions, much as we get better at something the more we practice.

### Example:
A simple use case of ML is **email spam filtering,** where the system learns to distinguish between spam and important emails based on patterns in the data.

# Types Of Machine Learning

Machine learning can be broadly classified into two types:
1. **Supervised Machine Learning**
2. **Unsupervised Machine Learning**

# Supervised Machine Learning

In **supervised machine learning**, we train the model on a dataset that contains both input values and their corresponding output values. The aim is for the trained model to predict accurate outputs when it is presented with new, unseen data. Training works by minimizing the difference between the actual outputs in the data and the outputs the model predicts.
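
To make the idea of "difference between actual and predicted outputs" concrete, here is a minimal sketch that measures it with the mean squared error, one common choice among several. The function name `mean_squared_error`, the sample values, and the two candidate models are all hypothetical and used only for illustration.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted outputs."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical outputs from a dataset and predictions from two candidate models.
actual = [3.0, 5.0, 7.0]
model_a = [2.5, 5.0, 7.5]  # predictions close to the actual outputs
model_b = [1.0, 6.0, 9.0]  # predictions further away

mse_a = mean_squared_error(actual, model_a)
mse_b = mean_squared_error(actual, model_b)

# The model with the smaller error fits this data better.
print(f"Model A error: {mse_a:.3f}")
print(f"Model B error: {mse_b:.3f}")
```

A training procedure would adjust the model until an error measure like this one is as small as possible.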

Supervised machine learning can be further classified into two types based on their applications:
1. **Regression**
2. **Classification**

## Regression

**Regression** aims at predicting continuous numeric values based on input data. The model learns the relationship between the input features and the continuous output variable. It can use a single input feature or several.

### Example:
- **Simple Linear Regression:** Predicting house prices based on the location.
- **Multiple Regression:** Predicting house prices based on location and neighbourhood.

Let’s dive into the concept of regression by looking at a simple dataset. We have some input values (X) and their corresponding output values (Y), as shown in the table below. Using this data, we’ll try to predict the output for inputs where we don’t have values yet.

|   Input (X)  | Output (Y)     |
| ------------- | -------------- |
|      1        | 1.2            |
|      2        | 1.9            |
|      3        | 3.1            |
|      4        | 3.8            |
|      5        | 5.2            |

In the graph below, you’ll see these data points plotted out. By applying linear regression, we can find a best-fit line that helps us understand the relationship between X and Y.

![Linear Regression](Linear_Regression_Example1.png)

In the graph below, we have plotted a best-fit line that represents the predicted relationship between the input values (X) and the output values (Y). This line is derived from the existing data points and is chosen to keep the overall distance between the line and those points as small as possible (in ordinary least squares, it minimizes the sum of the squared vertical distances), effectively capturing the trend of the data.

Using the best-fit line, we can estimate output values for inputs that we do not have corresponding data for. This capability is a fundamental aspect of regression analysis, allowing us to make informed predictions based on the observed data.

![Linear Regression](Linear_Regression_Example2.png)

This is how we make use of regression algorithms to find outputs based on input data. By analyzing a set of data points, we can create a model that captures the underlying relationship between the variables. The best-fit line derived from linear regression not only illustrates this relationship but also allows us to predict outputs for new inputs.
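
As a rough sketch of how such a best-fit line could be computed, the snippet below fits a line to the five data points from the table using the closed-form least-squares formulas. Ordinary least squares is assumed as the fitting method, and the names `fit_line` and `predict` are illustrative choices rather than part of any library.

```python
# Data points taken from the table above.
xs = [1, 2, 3, 4, 5]
ys = [1.2, 1.9, 3.1, 3.8, 5.2]


def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How X and Y vary together, and how X varies on its own.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(xs, ys)


def predict(x):
    """Estimate the output for an input using the fitted line."""
    return slope * x + intercept


print(f"Best-fit line: Y = {slope:.2f} * X + {intercept:.2f}")  # about 0.99 and 0.07
print(f"Predicted Y for X = 6: {predict(6):.2f}")               # about 6.01
```

For this dataset the fitted slope is close to 1, so each unit increase in X raises the predicted Y by roughly one unit.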

## Classification

**Classification** is used when the output is discrete. If there are two possible outcomes, it is called **binary classification**; if there are more than two, it is known as **multiclass classification**.

### Example:
Based on the number of hours a student studied, played, and slept, we can predict whether the student will pass or fail. Since there are only two possible outcomes, this is a **binary classification** problem.

To make this concept easier to grasp, we can visualize it with a simple graph that highlights the two classes, “Pass” and “Fail”, based on the number of hours studied. This visual representation helps show how classification works in practice.

![Classification](Classification_Example1.png)
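
One very simple way to build a binary classifier for this kind of data is to learn a single cut-off on the number of hours studied. The sketch below does that; the training data is made up for illustration, and the names `fit_threshold` and `predict` are hypothetical. Real classifiers such as logistic regression or decision trees are more flexible, but the idea of learning a decision boundary from labelled examples is the same.

```python
# Hypothetical training data: hours studied and whether the student passed.
hours = [1, 2, 3, 4, 6, 7, 8, 9]
passed = [False, False, False, False, True, True, True, True]


def fit_threshold(hours, passed):
    """Pick the cut-off on hours studied that misclassifies the fewest students."""
    values = sorted(set(hours))
    # Candidate cut-offs are the midpoints between neighbouring values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        # A student is predicted to pass when hours >= threshold.
        return sum((h >= threshold) != p for h, p in zip(hours, passed))

    return min(candidates, key=errors)


def predict(hours_studied, threshold):
    """Classify a student as "Pass" or "Fail" using the learned cut-off."""
    return "Pass" if hours_studied >= threshold else "Fail"


threshold = fit_threshold(hours, passed)
print(f"Learned threshold: {threshold} hours")
print(f"Student who studied 2.5 hours: {predict(2.5, threshold)}")
print(f"Student who studied 5.5 hours: {predict(5.5, threshold)}")
```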

# Unsupervised Machine Learning

In **unsupervised machine learning**, there is no input-output pairing or correct answer provided for any input. Instead, the model is given raw data and has to figure out the patterns, relationships, and meaningful insights present in it. It is useful when we are given complex raw data and we don't know what to look for.

Unsupervised machine learning can be broadly classified into two types:
1. **Clustering**
2. **Dimensionality reduction**

## Clustering

**Clustering** is a technique used in unsupervised machine learning to group similar data points into distinct clusters. The goal is to organize the data in such a way that items within the same cluster exhibit high similarity to one another while being dissimilar to those in other clusters.

### Example:
**Customer segmentation** is a perfect example of clustering in unsupervised learning. Imagine we have a dataset containing the salaries of different people. By applying clustering algorithms, we can group individuals into categories such as high-income, middle-income, and low-income. These clusters help businesses target their marketing efforts more effectively. For instance, high-income individuals might be shown luxury or premium products, while more budget-friendly items can be marketed to lower-income groups. This tailored approach allows for more relevant and personalized advertising, maximizing the impact of marketing strategies.

![Clustering](Clustering_Example1.png)
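
The income grouping described above can be sketched with a small one-dimensional version of k-means, a widely used clustering algorithm. The salary figures are invented for illustration, the function name `kmeans_1d` is hypothetical, and the way the starting centroids are chosen (evenly spaced through the sorted data) is a simplifying assumption so that the result is repeatable.

```python
def kmeans_1d(values, k, max_iters=100):
    """Group numbers into k clusters (k >= 2) with a basic k-means loop."""
    ordered = sorted(values)
    # Assumption: start with centroids evenly spaced through the sorted data.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = []
    for _ in range(max_iters):
        # Step 1: assign every value to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Step 2: move each centroid to the mean of its cluster.
        new_centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # assignments have stopped changing
        centroids = new_centroids
    return centroids, clusters


# Hypothetical yearly salaries, in thousands.
salaries = [22, 25, 27, 48, 52, 55, 95, 102, 110]

centroids, clusters = kmeans_1d(salaries, k=3)
for name, group in zip(["Low income", "Middle income", "High income"], clusters):
    print(f"{name}: {group}")
```

Note that the algorithm is never told which salaries count as low, middle, or high; the three groups emerge from the data alone.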

## Dimensionality Reduction

**Dimensionality reduction** is a technique used to reduce the number of input features in a dataset while preserving its essential information. The goal is to simplify the dataset by eliminating redundant or less important features, making it easier to analyze and process.

### Example:
Imagine you have a dataset of thousands of images, where each image is made up of a very large number of pixels. Instead of analyzing every single pixel, which can be time-consuming, **dimensionality reduction** helps by keeping only the most informative features of the image. For instance, **Principal Component Analysis (PCA)** finds the directions along which the data varies the most and describes each image using a small number of these components, reducing the amount of data you need to work with while still keeping most of the important information. This makes tasks like recognizing objects in the images faster and easier, without losing much detail.
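
Real image data has far too many features to show here, so the sketch below illustrates the same idea on a much smaller scale: it reduces hypothetical two-feature data to a single feature with PCA. The sample points and the function name `pca_2d_to_1d` are made up for illustration, and the closed-form angle formula used here only works for two features; in practice PCA is usually computed with an eigendecomposition or SVD from a numerical library.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-feature points onto their first principal component."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Centre the data so that each feature has mean zero.
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the 2x2 sample covariance matrix.
    var_x = sum(dx * dx for dx, _ in centered) / (n - 1)
    var_y = sum(dy * dy for _, dy in centered) / (n - 1)
    cov_xy = sum(dx * dy for dx, dy in centered) / (n - 1)

    # Direction of greatest variance (closed form for the 2-feature case).
    angle = 0.5 * math.atan2(2 * cov_xy, var_x - var_y)
    direction = (math.cos(angle), math.sin(angle))

    # Each point is now described by one number instead of two.
    projected = [dx * direction[0] + dy * direction[1] for dx, dy in centered]

    # Share of the total variance that the single component keeps.
    var_kept = sum(p * p for p in projected) / (n - 1)
    explained = var_kept / (var_x + var_y)
    return direction, projected, explained


# Hypothetical data where the two features are strongly related.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]

direction, projected, explained = pca_2d_to_1d(points)
print(f"Principal direction: ({direction[0]:.3f}, {direction[1]:.3f})")
print(f"Variance kept by one component: {explained:.1%}")
```

Because the two features in this sample move together, one component keeps almost all of the variance, which is the situation where dimensionality reduction pays off.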
