# Dimensionality Reduction

### What is Dimensionality Reduction?
- Dimensionality reduction is a method for representing a given dataset using a lower number of features (i.e. dimensions) while still capturing the original data's meaningful properties.
- The reason we do this is to remove irrelevant or redundant features, or simply noisy data, to create a model with a lower number of variables.

### Word Bank
- ***Features***: Often called variables, features are the individual measurable properties or characteristics of the data. For example, in a dataset about cars, features could be attributes such as engine size, horsepower, weight, or fuel efficiency: anything measurable.

- ***Noisy data***: Irrelevant, random, or erroneous information that obscures the underlying patterns or relationships in the dataset.

### Why do we use Dimensionality Reduction? 
Dimensionality reduction is used for several reasons in data analysis and machine learning, but primarily to simplify data.

### Key Reasons
1. ***Reducing Overfitting***: In high-dimensional datasets, models tend to overfit because they have enough freedom to fit the noise in the training data, which obscures the underlying patterns of the dataset.

2. ***Improving Model Performance***: Reducing the number of features can lead to faster and more efficient machine learning algorithms.

3. ***Easier Data Visualization***: It is hard to visualize data with more than three features (dimensions); reducing the dimensions to two or three makes it easier to explore patterns visually.

4. ***Noise Reduction***: Dimensionality reduction discards much of the noisy data so that the model can focus on the important information.

5. ***The Curse of Dimensionality***: As the number of features grows, the amount of data needed to keep the data set reliable grows exponentially. This requires more storage and computation.
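
One rough way to picture this growth is to split every feature into a fixed number of bins and count the resulting grid cells, since each cell needs at least one sample to be represented. The sketch below does that count; the choice of 10 bins per feature and the helper name `cells_needed` are illustrative assumptions, not part of any library.

```python
# Count how many grid cells cover the feature space when every
# feature is split into the same number of bins.
BINS_PER_FEATURE = 10  # illustrative choice


def cells_needed(n_features: int, bins: int = BINS_PER_FEATURE) -> int:
    """Return the number of cells in a grid with `bins` bins per feature."""
    return bins ** n_features


for n_features in (1, 2, 3, 5, 10):
    print(f"{n_features:>2} features -> {cells_needed(n_features):,} cells")
```

Each extra feature multiplies the number of cells by the bin count, which is why the data needed to fill the space grows so quickly.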



### Types of Dimensionality Reduction:

### Principal Component Analysis (PCA)
- The most common dimensionality reduction method.
- It combines and transforms the data set's features to produce new features.
- These are called principal components.
- Together, the retained principal components capture most or all of the variance present in the original data set.
- PCA then projects data onto a new space defined by these new features.
- PCA focuses on data variance.
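
The steps above can be sketched with NumPy alone. This is a minimal illustration, assuming the common formulation of PCA as an eigendecomposition of the covariance matrix; the data is synthetic and generated only for this example, and the variable names are arbitrary.

```python
import numpy as np

# Synthetic data for illustration only: 200 samples, 3 features,
# where the second feature is strongly correlated with the first.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

# 1. Center each feature at zero.
X_centered = X - X.mean(axis=0)

# 2. Compute the covariance matrix of the features (3 x 3).
cov = np.cov(X_centered, rowvar=False)

# 3. The eigenvectors of the covariance matrix are the principal
#    directions; the eigenvalues are the variance along each one.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# eigh returns eigenvalues in ascending order, so sort descending.
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# 4. Project the data onto the top two principal components.
n_components = 2
X_projected = X_centered @ eigenvectors[:, :n_components]

explained_ratio = eigenvalues / eigenvalues.sum()
print("Share of variance per component:", explained_ratio.round(3))
print("Projected shape:", X_projected.shape)
```

Because two of the three synthetic features are nearly redundant, two components should retain almost all of the variance here.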

***Example***:
- We have a dataset about snakes with five variables:
    - body length (X1),
    - body diameter at widest point (X2),
    - fang length (X3),
    - weight (X4),
    - age (X5).

Some of these five features are likely to be correlated, such as body length, diameter, and weight.

By combining correlated features into new ones, we can create a data set with fewer variables.

In [None]:
# Install scikit-learn, which provides the example datasets
%pip install -U scikit-learn

In [4]:
# Import necessary libraries
from sklearn import datasets  # to load the iris dataset
import pandas as pd  # to hold the data in a DataFrame
from sklearn.preprocessing import StandardScaler  # to standardize the features
from sklearn.decomposition import PCA  # to apply PCA
import seaborn as sns  # to plot the heat maps
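
A possible next cell, sketched here with the imports above, applies PCA to the iris dataset. The choice of two components and the column names `PC1` and `PC2` are assumptions made for illustration.

```python
from sklearn import datasets
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Load the iris measurements (150 samples, 4 numeric features).
iris = datasets.load_iris()
X = pd.DataFrame(iris.data, columns=iris.feature_names)

# Standardize so every feature has mean 0 and unit variance;
# otherwise features on larger scales dominate the components.
X_scaled = StandardScaler().fit_transform(X)

# Keep two principal components (an illustrative choice).
pca = PCA(n_components=2)
components = pca.fit_transform(X_scaled)

reduced = pd.DataFrame(components, columns=["PC1", "PC2"])
print(reduced.head())
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The `explained_variance_ratio_` attribute reports the share of the total variance that each retained component captures, which helps when deciding how many components to keep.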

### Linear Discriminant Analysis (LDA)

- LDA is similar to PCA in that the new features are derived from the original features.
- However, LDA differs in that it focuses not only on data variance but also on the differences between classes.
- One goal of LDA is to maximize interclass differences while minimizing intraclass differences.
- Otherwise, LDA works much like PCA: it projects the data onto a smaller set of new features. Unlike PCA, it needs a class label for every sample.
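
As a small sketch of how this looks in practice, the block below runs scikit-learn's `LinearDiscriminantAnalysis` on the iris dataset. The dataset and the choice of two components are assumptions made for illustration; note that the class labels `y` are passed when fitting, which PCA does not require.

```python
from sklearn import datasets
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X, y = iris.data, iris.target  # y holds the class label of each sample

X_scaled = StandardScaler().fit_transform(X)

# With 3 classes, LDA can produce at most 3 - 1 = 2 components.
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X_scaled, y)  # labels are required here

print("Reduced shape:", X_lda.shape)
print("Explained variance ratio:", lda.explained_variance_ratio_)
```

The number of LDA components is limited by the number of classes, so a two-class problem would reduce to a single feature.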

## Sources
- https://www.ibm.com/topics/dimensionality-reduction#:~:text=Dimensionality%20reduction%20is%20a%20method,a%20lower%20number%20of%20variables
- https://en.wikipedia.org/wiki/Dimensionality_reduction 

