# What is Machine Learning

Traditionally, computer programming has revolved around solving problems by specifying an explicit list of instructions to be executed. Machine learning is the field of computer science that instead models the "learning" process mathematically, so that loosely specified instructions can be turned into precise procedures, enabling significant automation.

At its most basic level, a machine learning model (whose architecture and parameters are problem-specific) is trained on a dataset, learns the patterns in it, and then tries to reproduce those patterns. This lets it make predictions and generate new results that can be applied in the real world.

## Categories of Machine Learning

At its core, machine learning can be divided into two primary categories: supervised learning and unsupervised learning.

Supervised learning involves creating a model that captures the relationship between observed features of data and the corresponding labels. Once established, this model can be utilized to assign labels to new, unseen data. Supervised learning is further divided into classification and regression tasks: classification deals with discrete category labels, whereas regression focuses on continuous labels. We'll explore examples of both in the next section.

On the other hand, unsupervised learning models the features of a dataset without relying on any labels, often described as 'allowing the data to reveal its own structure.' This category includes tasks like clustering and dimensionality reduction. Clustering algorithms identify distinct groups within the data, while dimensionality reduction techniques aim to find more concise representations of the data. Examples of both types of unsupervised learning will also be discussed in the following section.

Additionally, there are semi-supervised learning methods, which bridge the gap between supervised and unsupervised learning. These methods are particularly useful when only part of the data is labeled.

## Forms of Machine Learning Application

Let's look at a general picture of how machine learning is applied in the real world.

### Classification: Predicting discrete labels

Let's take a look at a simple classification task. Consider the image below, where the task is to assign each data point to one of two classes.
![image.png](attachment:image.png)

Imagine we're dealing with data points that have two characteristics, like coordinates on a map. Each point is either blue or red, representing two different categories. Our goal? To create a system that can predict whether a new point should be blue or red based on its location.


Now, there are plenty of ways to tackle this classification problem, but we're going to keep it simple. We'll assume we can separate these two groups by drawing a straight line across our map. Points on one side will be blue, and on the other, red.

In machine learning lingo, our "model" is essentially saying, "Hey, a straight line can divide these groups." The "model parameters" are the specifics of that line - where it sits and which way it's tilted.


Here's where it gets interesting: we don't just guess where to put this line. We let the data guide us. This process of figuring out the best place to draw the line is what we call "training the model." It's the "learning" part of machine learning, where our system gets smarter by looking at lots of examples.


It's a bit like teaching a computer to be a really good guesser, using nothing but a ruler and a bunch of colored dots!
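To make "training" concrete, here is a minimal sketch that fits a straight line to two groups of points. It assumes synthetic, well-separated data and uses the perceptron update rule, which is just one of many ways to find such a line; the text above does not commit to a particular algorithm, and the names `train_perceptron` and `predict` are made up for this illustration.

```python
import numpy as np

# Synthetic data (an assumption for illustration): two well-separated groups.
rng = np.random.default_rng(0)
blue = rng.normal(loc=(-3.0, -3.0), scale=0.7, size=(50, 2))
red = rng.normal(loc=(3.0, 3.0), scale=0.7, size=(50, 2))
X = np.vstack([blue, red])
y = np.concatenate([-np.ones(50), np.ones(50)])  # -1 = blue, +1 = red


def train_perceptron(X, y, epochs=100):
    """Learn the line w . x + b = 0 by nudging it after every mistake."""
    w = np.zeros(X.shape[1])  # tilt of the line
    b = 0.0                   # offset of the line
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:  # point is on the wrong side
                w += yi * xi
                b += yi
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return w, b


def predict(X, w, b):
    """Return +1 (red) or -1 (blue) depending on the side of the line."""
    return np.where(X @ w + b > 0, 1.0, -1.0)


w, b = train_perceptron(X, y)
train_accuracy = np.mean(predict(X, w, b) == y)
print("model parameters:", w, b)
print("training accuracy:", train_accuracy)
```

The "model parameters" from the text are `w` and `b` here: together they pin down where the line sits and how it is tilted.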

This is what the final model would look like.

![image.png](attachment:image.png)

With the model trained, it's ready for real-world application. We can now feed it fresh, unlabeled data and let it work its magic. The model applies the decision boundary it learned during training to this new dataset, assigning each point to a category. This process is typically referred to as prediction. Check out the illustration below for a visual representation:

![image.png](attachment:image.png)
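Prediction itself is simple once the line is known. The sketch below assumes a hypothetical, already-trained line with parameters `w` and `b` (values picked by hand for illustration, not taken from the figures) and labels each new point by the side of the line it falls on.

```python
import numpy as np

# Hypothetical parameters of an already-trained line: w . x + b = 0.
w = np.array([1.0, 1.0])
b = 0.0


def predict_color(points, w, b):
    """Label each point by which side of the line it lies on."""
    scores = points @ w + b
    return np.where(scores > 0, "red", "blue")


# New, unlabeled points (made up for this example).
new_points = np.array([[2.0, 1.5], [-1.0, -3.0], [0.5, -2.0]])
labels = predict_color(new_points, w, b)
print(labels.tolist())  # ['red', 'blue', 'blue']
```

No retraining happens at this stage: the boundary is fixed, and only the new points change.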

Classification in machine learning involves categorizing data into discrete classes. While this might seem straightforward—you could eyeball this simple dataset and draw a dividing line—the real power of machine learning shines when dealing with massive, multidimensional datasets. It's in these complex scenarios where the approach truly proves its worth, offering a scalable solution that human intuition alone can't match.

### Regression: Predicting continuous labels

In contrast to predicting discrete labels, let's now look at predicting continuous labels.

Consider the data shown below.
![image.png](attachment:image.png)

Here we have two-dimensional data: each point has two features, and its color represents the continuous label (the output value) associated with that point.

There are many models that could handle this problem, but a simple linear regression is enough here. With two features, this model fits a plane to the data, choosing the plane that best matches the observed labels rather than passing through every point exactly.

![image.png](attachment:image.png)

Notice how the `feature_1`$\times$`feature_2` plane mirrors our earlier two-dimensional plot. This time, though, the label is represented both by color and by position along the third axis. From this perspective it's pretty clear that fitting a plane through the 3D data would let us predict a label for any pair of input features. Now, let's circle back to our 2D view. When we fit such a plane, we end up with something that looks like this:

![image.png](attachment:image.png)
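The plane fit described in this section can be sketched with ordinary least squares. The data below is synthetic and the "true" coefficients are invented for the example; `np.linalg.lstsq` is used as one straightforward way to find the best-fitting plane, not necessarily the method behind the figures.

```python
import numpy as np

# Synthetic data (an assumption): label = 1.5*feature_1 - 2.0*feature_2 + 0.5 + noise.
rng = np.random.default_rng(1)
X = rng.uniform(-3, 3, size=(200, 2))
true_coef = np.array([1.5, -2.0])
true_intercept = 0.5
y = X @ true_coef + true_intercept + rng.normal(scale=0.3, size=200)

# Append a column of ones so the plane is allowed an intercept.
A = np.column_stack([X, np.ones(len(X))])

# Least squares: the plane minimizing the squared error to the labels.
params, *_ = np.linalg.lstsq(A, y, rcond=None)
coef, intercept = params[:2], params[2]


def predict(features):
    """Predict continuous labels for new (feature_1, feature_2) pairs."""
    return features @ coef + intercept


print("fitted coefficients:", coef)
print("fitted intercept:", intercept)
print("prediction at (1, 1):", predict(np.array([[1.0, 1.0]]))[0])
```

Because the labels are noisy, the fitted plane recovers the underlying coefficients only approximately, which is the "best fit, not perfect fit" behavior you should expect from regression.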

### Clustering: Inferring labels on unlabeled data

Supervised learning algorithms, like the classification and regression examples we've explored, aim to build models that predict labels for new data. On the flip side, unsupervised learning tackles data description without relying on known labels.

A popular unsupervised learning technique is clustering, where data points are automatically grouped into discrete categories. Picture this: you've got some two-dimensional data plotted out, and it looks something like this:


![image.png](attachment:image.png)

At a glance, you can see these points naturally fall into distinct groups. This is where clustering shines. It leverages the data's inherent structure to figure out which points are related. Let's put the speedy and intuitive k-means algorithm to work. Check out the clusters it identifies:

![image.png](attachment:image.png)

K-means clustering is like a data detective, pinpointing k cluster centers as its model. The algorithm's goal? Find the sweet spot where each data point is as close as possible to its assigned center.
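Here is a small sketch of that idea, alternating between assigning points to their nearest center and moving each center to the mean of its points (often called Lloyd's algorithm). The three blobs are synthetic, and the farthest-point initialization is a choice made here only to keep the example deterministic; library implementations typically use other strategies such as k-means++ or random restarts.

```python
import numpy as np

# Synthetic data (an assumption): three well-separated blobs of 40 points each.
rng = np.random.default_rng(42)
blob_centers = np.array([[0.0, 0.0], [10.0, 0.0], [5.0, 9.0]])
X = np.vstack([rng.normal(loc=c, scale=0.6, size=(40, 2)) for c in blob_centers])


def farthest_point_init(X, k):
    """Pick k starting centers: the first point, then repeatedly the point
    farthest from all centers chosen so far."""
    centers = [X[0]]
    for _ in range(k - 1):
        chosen = np.array(centers)
        d = np.linalg.norm(X[:, None, :] - chosen[None, :, :], axis=2).min(axis=1)
        centers.append(X[np.argmax(d)])
    return np.array(centers)


def kmeans(X, k, n_iter=100):
    centers = farthest_point_init(X, k)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its points
        # (an empty cluster keeps its old center).
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):  # converged
            break
        centers = new_centers
    return centers, labels


centers, labels = kmeans(X, k=3)
print("cluster centers:\n", centers)
```

Note that no labels were supplied at any point: the grouping comes entirely from the positions of the points.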

Sure, in two dimensions, this might seem like child's play. But here's where it gets interesting: as our datasets grow larger and more complex, these clustering techniques become powerful tools. They can unearth valuable insights that might otherwise stay hidden in the data haystack.


### Dimensionality reduction: Inferring structure of unlabeled data

Another fascinating unsupervised learning technique is dimensionality reduction. It's a bit more abstract than our previous examples, but it's incredibly powerful. The goal? To distill high-dimensional data into a lower-dimensional form that still captures the essence of the original dataset.

Think of it as finding the "CliffsNotes" version of your data: it preserves the key plot points while trimming away the less crucial details. Different dimensionality reduction methods have their own unique ways of deciding what's important to keep. We'll dive deeper into these techniques when we explore Manifold Learning later on.


Let's consider the following data points:
![image.png](attachment:image.png)

When you look at this data, you can't help but notice its interesting structure. It's essentially a one-dimensional line that's taken a spiraling journey through two-dimensional space. In a way, you could argue that this data is fundamentally one-dimensional, despite its higher-dimensional disguise.

To tackle this kind of scenario, you'd want a dimensionality reduction model that's clever enough to pick up on this nonlinear embedded structure. The goal? To tease out that lower-dimensional representation hidden within the data's twists and turns.

This approach is all about peeling back the layers to reveal the data's true nature, even when it's playing dress-up in a higher-dimensional costume.
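As a rough illustration of how a nonlinear method can unroll such a spiral, the sketch below implements a bare-bones version of the Isomap idea: connect each point to its nearest neighbors, measure distances along that graph instead of straight through space, and then find a one-dimensional layout that preserves those distances. The spiral data, the function name `isomap_1d`, and the choice of three neighbors are all assumptions made for this example; the text above does not name a specific algorithm.

```python
import numpy as np

# Synthetic spiral (an assumption): a 1-D parameter t wound through 2-D space.
n = 100
t = np.linspace(0.5, 3 * np.pi, n)
X = np.column_stack([t * np.cos(t), t * np.sin(t)])


def isomap_1d(X, n_neighbors=3):
    """Minimal Isomap-style embedding into one dimension."""
    n = len(X)
    # Pairwise straight-line distances.
    E = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Keep only edges to each point's nearest neighbors
    # (column 0 of the argsort is the point itself).
    nearest = np.argsort(E, axis=1)[:, 1:n_neighbors + 1]
    G = np.full((n, n), np.inf)
    rows = np.repeat(np.arange(n), n_neighbors)
    G[rows, nearest.ravel()] = E[rows, nearest.ravel()]
    G = np.minimum(G, G.T)  # make the graph symmetric
    np.fill_diagonal(G, 0.0)
    # Floyd-Warshall: shortest path lengths along the neighbor graph.
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    # Classical MDS on the graph distances, keeping the top component.
    J = np.eye(n) - 1.0 / n
    B = -0.5 * J @ (G ** 2) @ J
    eigvals, eigvecs = np.linalg.eigh(B)  # eigenvalues in ascending order
    return eigvecs[:, -1] * np.sqrt(eigvals[-1])


embedding = isomap_1d(X)

# Compare with the true distance travelled along the spiral.
arc_length = np.concatenate([[0.0], np.cumsum(np.linalg.norm(np.diff(X, axis=0), axis=1))])
print("correlation with arc length:", abs(np.corrcoef(embedding, arc_length)[0, 1]))
```

The single recovered coordinate tracks how far along the spiral each point lies, which is exactly the one-dimensional structure hiding in the two-dimensional plot. This sketch assumes the neighbor graph is connected and never links adjacent arms of the spiral; on noisier or sparser data the number of neighbors would need tuning.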