## 19MAR
### Assignment

### Q1

In [None]:
Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its
application.

In [None]:
Ans:- Min-Max scaling is a data normalization technique used in data preprocessing. It linearly rescales the values of a
feature to the range 0 to 1, which keeps features with large numeric ranges from dominating distance-based or
gradient-based machine learning algorithms. The formula for Min-Max scaling is as follows:

$x_{scaled} = \dfrac{x - x_{min}}{x_{max} - x_{min}}$

where x is the original value, x_min is the minimum value of the feature, and x_max is the maximum value of the feature.

For example, suppose we have a dataset of house prices, and we want to normalize the values of the "price" column using 
Min-Max scaling. The original prices range from $100,000 to $1,000,000, with a mean of $550,000. The normalized prices
range from 0 to 1, with a mean of 0.5. Applying Min-Max scaling to this dataset transforms the original prices as 
follows:

Original prices: [100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000]

Scaled prices: [0.0, 0.111, 0.222, 0.333, 0.444, 0.556, 0.667, 0.778, 0.889, 1.0]

By applying Min-Max scaling to the prices, we put them on the same 0-to-1 scale as any other Min-Max-scaled feature, so
that no single feature dominates the model simply because of its units.
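
The calculation above can be reproduced with a few lines of plain Python. This is a minimal sketch; the helper name `min_max_scale` is an illustrative assumption, not part of any library.

```python
def min_max_scale(values):
    """Rescale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The formula divides by (max - min), so constant data cannot be scaled
        raise ValueError("Min-Max scaling is undefined when all values are equal")
    return [(v - lo) / (hi - lo) for v in values]


prices = [100000, 200000, 300000, 400000, 500000,
          600000, 700000, 800000, 900000, 1000000]
scaled_prices = min_max_scale(prices)

print([round(s, 3) for s in scaled_prices])
# [0.0, 0.111, 0.222, 0.333, 0.444, 0.556, 0.667, 0.778, 0.889, 1.0]
```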

### Q2

In [None]:
Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling?
Provide an example to illustrate its application.

In [None]:
Ans:- Unit Vector Scaling is a feature scaling technique that scales the data based on the magnitude of the feature vectors.
The technique scales each sample's feature vector so that its Euclidean norm (magnitude) is 1.

The formula for unit vector scaling is:

$x_{unit} = \dfrac{x}{\sqrt{\sum_{i=1}^{n} x_i^2}}$

where $x$ is the original feature vector, $n$ is the number of features in the vector, and $x_{unit}$ is the unit vector
scaled feature vector.

The main difference between Min-Max scaling and Unit Vector Scaling is that Min-Max scaling rescales each feature
(column) to a fixed range using that feature's minimum and maximum, while Unit Vector scaling rescales each sample (row)
by the magnitude of its feature vector, preserving the vector's direction but not the range of the individual features.

An example of Unit Vector Scaling in Python:

In [1]:
import numpy as np
from sklearn.preprocessing import normalize

# create a feature matrix
X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# apply unit vector scaling
X_unit = normalize(X, norm='l2')

print("Original Feature Matrix:\n", X)
print("Unit Vector Scaled Feature Matrix:\n", X_unit)


Original Feature Matrix:
 [[1 2 3]
 [4 5 6]
 [7 8 9]]
Unit Vector Scaled Feature Matrix:
 [[0.26726124 0.53452248 0.80178373]
 [0.45584231 0.56980288 0.68376346]
 [0.50257071 0.57436653 0.64616234]]


In [None]:
As shown in the output, each row is divided by the magnitude of its feature vector, such that the magnitude of each
feature vector (row) in the resulting matrix is 1.
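
To check this claim and to contrast it with Min-Max scaling, here is a small NumPy sketch on the same matrix. It computes the row norms by hand instead of calling `normalize`, and applies Min-Max scaling per column for comparison.

```python
import numpy as np

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)

# Unit vector scaling: divide each row (sample) by its Euclidean norm
row_norms = np.linalg.norm(X, axis=1, keepdims=True)
X_unit = X / row_norms

# Min-Max scaling for comparison: rescale each column (feature) to [0, 1]
col_min = X.min(axis=0)
col_max = X.max(axis=0)
X_minmax = (X - col_min) / (col_max - col_min)

print("Row norms after unit vector scaling:", np.linalg.norm(X_unit, axis=1))
print("Column minima after Min-Max scaling:", X_minmax.min(axis=0))
print("Column maxima after Min-Max scaling:", X_minmax.max(axis=0))
```

Unit vector scaling fixes the length of every row, while Min-Max scaling fixes the range of every column.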

### Q3

In [None]:
Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an
example to illustrate its application.

In [None]:
Ans:- PCA (Principal Component Analysis) is a popular technique used for dimensionality reduction. It transforms a
high-dimensional dataset into a lower-dimensional dataset by projecting it onto a small number of new axes (the principal
components) that account for the majority of the variance in the original data.

The steps for performing PCA are as follows:

=> Standardize the data: Subtract the mean from each feature and divide by the standard deviation to ensure that all 
features are on the same scale.
=> Compute the covariance matrix: Calculate the covariance matrix for the standardized data.
=> Compute the eigenvectors and eigenvalues of the covariance matrix: The eigenvectors represent the directions of maximum 
variance in the data, and the eigenvalues represent the magnitude of the variance in each direction.
=> Choose the principal components: Select the eigenvectors with the largest eigenvalues, as they account for the most 
variance in the data.
=> Transform the data: Project the original data onto the new lower-dimensional space formed by the selected principal 
components.

Here is an example of how to apply PCA using Python:

In [None]:
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
import pandas as pd
import seaborn as sns  # needed for the scatter plot at the end

# Load the iris dataset
iris = load_iris()

# Standardize each feature (column) separately: zero mean, unit standard deviation
iris_std = (iris.data - iris.data.mean(axis=0)) / iris.data.std(axis=0)

# Create a PCA object with 2 components
pca = PCA(n_components=2)

# Fit and transform the standardized data using PCA
iris_pca = pca.fit_transform(iris_std)

# Convert the transformed data to a Pandas DataFrame and plot it
iris_df = pd.DataFrame(data=iris_pca, columns=['PC1', 'PC2'])
iris_df['target'] = iris.target
sns.scatterplot(data=iris_df, x='PC1', y='PC2', hue='target')


In [None]:
In this example, we load the iris dataset and standardize the data. We then create a PCA object with 2 components and fit 
and transform the standardized data using PCA. Finally, we convert the transformed data to a Pandas DataFrame and plot it
using Seaborn to visualize the results. The resulting plot shows the iris data projected onto the two principal components,
with different colors representing the three different iris species.
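
The five steps listed earlier can also be sketched directly in NumPy, without scikit-learn. The data below is a small made-up example (six samples, two correlated features), and keeping `k = 1` component is an arbitrary choice for illustration.

```python
import numpy as np

# Toy data: 6 samples, 2 correlated features (illustrative values only)
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

# 1. Standardize each feature (column) separately
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized features
cov = np.cov(X_std, rowvar=False)

# 3. Eigen-decomposition (eigh is meant for symmetric matrices and
#    returns eigenvalues in ascending order)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort by descending eigenvalue and keep the top k components
order = np.argsort(eigenvalues)[::-1]
k = 1
components = eigenvectors[:, order[:k]]

# 5. Project the standardized data onto the selected components
X_reduced = X_std @ components

print("Explained variance ratio:", eigenvalues[order] / eigenvalues.sum())
print("Reduced shape:", X_reduced.shape)
```

The variance of the projected data along the first component equals the largest eigenvalue, which is why eigenvalues are used to rank the components.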

### Q4

In [None]:
Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature
Extraction? Provide an example to illustrate this concept.

In [None]:
Ans:- PCA can be used as a feature extraction technique to reduce the dimensionality of a dataset by transforming the 
original features into a smaller set of uncorrelated features called principal components. These principal components
represent the directions of maximum variance in the data.

PCA works by identifying the eigenvectors and eigenvalues of the covariance matrix of the original data. The eigenvectors 
represent the directions of maximum variance in the data, and the eigenvalues represent the amount of variance in the data 
that is accounted for by each eigenvector.

To perform feature extraction using PCA, we first standardize the data by subtracting the mean and dividing by the standard 
deviation. We then compute the covariance matrix and its eigenvectors and eigenvalues. The eigenvectors with the highest
eigenvalues represent the principal components, and we can use them as new features in our dataset.

For example, let's say we have a dataset with 10 features and we want to reduce it to 3 features using PCA. We can perform 
PCA on the dataset and select the 3 principal components with the highest eigenvalues. We can then use these 3 principal
components as new features in our dataset, effectively reducing the dimensionality of the data.

Here's an example code snippet in Python using scikit-learn to perform PCA for feature extraction on the iris dataset:

In [4]:
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
import pandas as pd

# Load the iris dataset
iris = load_iris()

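# Scale the data with the overall mean and standard deviation of the whole array.
# Note: this is one global rescaling, not per-feature standardization; pass axis=0
# to mean() and std() (or use StandardScaler) to standardize each feature separately.
iris_std = (iris.data - iris.data.mean()) / iris.data.std()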

# Create a PCA object with 3 components
pca = PCA(n_components=3)

# Fit and transform the standardized data using PCA
iris_pca = pca.fit_transform(iris_std)

# Convert the transformed data to a Pandas DataFrame and print the first 5 rows
iris_df = pd.DataFrame(data=iris_pca, columns=['PC1', 'PC2', 'PC3'])
print(iris_df.head())


        PC1       PC2       PC3
0 -1.359848  0.161815 -0.014142
1 -1.375054 -0.089673 -0.106627
2 -1.463637 -0.073435  0.009069
3 -1.390862 -0.161259  0.015989
4 -1.382438  0.165542  0.045636


In [None]:
In this example, we first rescale the iris dataset using the mean and standard deviation of the data (computed here over
the whole array at once rather than per feature). We then create a PCA object with 3 components and fit and transform the
rescaled data using PCA. Finally, we convert the transformed data to a Pandas DataFrame and print the first 5 rows to see
the new features.
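
The 10-features-to-3 scenario described above, and the claim that the extracted components are uncorrelated, can be checked with a short NumPy sketch. The data is synthetic (generated from three hidden factors plus noise), and the sketch uses the SVD of the standardized data, which yields the same directions as the eigenvectors of the covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: 200 samples, 10 features driven by 3 hidden factors plus noise
factors = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = factors @ mixing + 0.1 * rng.normal(size=(200, 10))

# Standardize each feature, then take the SVD of the standardized data
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
U, S, Vt = np.linalg.svd(X_std, full_matrices=False)

# Rows of Vt are the principal directions; keep the first 3 as new features
X_extracted = X_std @ Vt[:3].T

# The covariance between different extracted features should be numerically zero
cov_extracted = np.cov(X_extracted, rowvar=False)
off_diagonal = cov_extracted - np.diag(np.diag(cov_extracted))

print("Extracted shape:", X_extracted.shape)
print("Largest off-diagonal covariance:", np.abs(off_diagonal).max())
```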

### Q5

In [None]:
Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset
contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to
preprocess the data.

In [None]:
Ans:- In order to use Min-Max scaling to preprocess the data for a recommendation system in a food delivery service, we 
would first need to identify the features that need to be scaled. In this case, the features are price, rating, and delivery
time.

Min-Max scaling rescales the features to a range between 0 and 1. To apply Min-Max scaling to the features, we would follow
these steps:

=> Determine the minimum and maximum values for each feature.
=> Subtract the minimum value from each value of the feature.
=> Divide each value by the difference between the maximum and minimum values.

For example, let's say we have a dataset with the following values for the price, rating, and delivery time features:

In [5]:
price = [5, 10, 15, 20]
rating = [2.5, 3.2, 4.0, 4.5]
delivery_time = [20, 30, 45, 60]


In [None]:
To apply Min-Max scaling to these features, we would follow these steps:

=> Determine the minimum and maximum values for each feature:
min_price = 5, max_price = 20
min_rating = 2.5, max_rating = 4.5
min_delivery_time = 20, max_delivery_time = 60
=> Subtract the minimum value from each value of the feature and divide by the range (max - min):
price_scaled = [(5-5)/(20-5), (10-5)/(20-5), (15-5)/(20-5), (20-5)/(20-5)] = [0.0, 0.333, 0.667, 1.0]
rating_scaled = [(2.5-2.5)/(4.5-2.5), (3.2-2.5)/(4.5-2.5), (4.0-2.5)/(4.5-2.5), (4.5-2.5)/(4.5-2.5)] = [0.0, 0.35, 0.75, 1.0]
delivery_time_scaled = [(20-20)/(60-20), (30-20)/(60-20), (45-20)/(60-20), (60-20)/(60-20)] = [0.0, 0.25, 0.625, 1.0]
=> The scaled features can now be used for analysis.

By applying Min-Max scaling, we have rescaled the features to a range between 0 and 1, which helps to prevent features 
with larger numerical ranges from dominating the analysis. In a recommendation system that compares items by distance or
similarity, this keeps delivery time (tens of minutes) from outweighing rating (a few points).
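
The same per-feature calculation can be done in one pass with NumPy. This is a minimal sketch using the price, rating, and delivery time values from the example above; arranging them as columns of one array is an assumption made for illustration.

```python
import numpy as np

# Columns: price, rating, delivery_time (one row per item)
data = np.array([[5.0, 2.5, 20.0],
                 [10.0, 3.2, 30.0],
                 [15.0, 4.0, 45.0],
                 [20.0, 4.5, 60.0]])

# Column-wise minimum and maximum, so each feature is scaled independently
col_min = data.min(axis=0)
col_max = data.max(axis=0)
scaled = (data - col_min) / (col_max - col_min)

print(np.round(scaled, 3))
```

In practice, the minimum and maximum should be computed on the training data only and then reused to scale new data.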

### Q6

In [None]:
Q6. You are working on a project to build a model to predict stock prices. The dataset contains many
features, such as company financial data and market trends. Explain how you would use PCA to reduce the
dimensionality of the dataset.

In [None]:
Ans:- PCA (Principal Component Analysis) is a popular technique used for dimensionality reduction in machine learning. It 
aims to reduce the number of features in a dataset while retaining as much of the original variance as possible. In the 
context of a stock price prediction project, PCA can be used to compress many correlated financial and market features into
a few components that capture most of the variation among those features. Because PCA is unsupervised, it does not look at
the stock price itself when choosing the components.

To use PCA for dimensionality reduction, you would follow these steps:

=> Standardize the data: It is important to standardize the data so that each feature has a mean of zero and a standard 
deviation of one. This is necessary because PCA is a variance-based method, and the variance of the features needs to be 
comparable.

=> Compute the covariance matrix: The covariance matrix describes the relationship between the features in the dataset. It 
can be computed by taking the dot product of the transpose of the standardized data and the standardized data itself, and
dividing the result by n - 1, where n is the number of samples.

=> Compute the eigenvalues and eigenvectors of the covariance matrix: The eigenvectors of the covariance matrix represent 
the principal components of the dataset, while the eigenvalues represent the amount of variance explained by each principal
component.

=> Choose the number of principal components: You can choose the number of principal components to retain based on the 
amount of variance explained by each component. Typically, you would choose the smallest number of components that explain a 
significant amount of the variance in the data.

=> Project the data onto the new feature space: Finally, you can use the eigenvectors corresponding to the selected 
principal components to project the data onto the new feature space.

For example, you might start with a dataset that contains features such as company financial data (e.g., revenue,
earnings, and assets) and market trends (e.g., interest rates, inflation, and GDP growth). After standardizing the data,
you would compute the covariance matrix and its eigenvalues and eigenvectors. If the top three principal components
together explained, say, 80% of the variance in the data, you might choose to retain those three. You would then project
the data onto the new feature space defined by these principal components, and use these new features as inputs to your
stock price prediction model.
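
As a rough sketch of the component-selection step, the function below follows the steps listed above and returns the smallest number of components whose cumulative explained variance reaches a threshold. The function name `components_for_variance`, the 80% threshold, and the synthetic data standing in for financial features are all assumptions for illustration.

```python
import numpy as np


def components_for_variance(X, threshold=0.80):
    """Return (k, cumulative): the smallest k whose components reach the threshold."""
    # Standardize each feature to zero mean and unit standard deviation
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)
    n_samples = X_std.shape[0]
    # Covariance matrix: X^T X / (n - 1) on the standardized data
    cov = X_std.T @ X_std / (n_samples - 1)
    # Eigenvalues of a symmetric matrix, reordered from largest to smallest
    eigenvalues = np.linalg.eigvalsh(cov)[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # First position where the cumulative ratio reaches the threshold
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative)), cumulative


# Synthetic stand-in for a feature table: 250 rows, 8 correlated features
rng = np.random.default_rng(42)
drivers = rng.normal(size=(250, 2))
loadings = rng.normal(size=(2, 8))
X = drivers @ loadings + 0.2 * rng.normal(size=(250, 8))

k, cumulative = components_for_variance(X, threshold=0.80)
print("Cumulative explained variance:", np.round(cumulative, 3))
print("Components to retain:", k)
```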

### Q7

In [None]:
Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the
values to a range of -1 to 1.

In [None]:
Ans:- To perform Min-Max scaling, we first normalize the values to the range 0 to 1 with the following formula:

X_norm = (X - X_min) / (X_max - X_min)

where X_min and X_max are the minimum and maximum values in the dataset, respectively. To reach a target range [a, b], we
then rescale the result:

X_scaled = a + X_norm * (b - a)

For the range -1 to 1 this becomes X_scaled = 2 * X_norm - 1.

In this case, X_min = 1 and X_max = 20, so the denominator is the same for every value: 20 - 1 = 19.

First, we calculate the numerator for each value in the dataset:

For 1: (1 - 1) = 0
For 5: (5 - 1) = 4
For 10: (10 - 1) = 9
For 15: (15 - 1) = 14
For 20: (20 - 1) = 19

Then, we calculate the normalized value (range 0 to 1) for each value in the dataset:

For 1: 0/19 = 0
For 5: 4/19 = 0.21
For 10: 9/19 = 0.47
For 15: 14/19 = 0.74
For 20: 19/19 = 1

Finally, we map each normalized value to the range -1 to 1:

For 1: 2 * (0/19) - 1 = -1
For 5: 2 * (4/19) - 1 = -0.58
For 10: 2 * (9/19) - 1 = -0.05
For 15: 2 * (14/19) - 1 = 0.47
For 20: 2 * (19/19) - 1 = 1

Therefore, the Min-Max scaled dataset is [-1, -0.58, -0.05, 0.47, 1].
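
Scaling to the range -1 to 1 can be checked with a small plain-Python sketch. The helper `min_max_scale` and its `new_min` / `new_max` parameters are illustrative names, not a library API; scikit-learn's `MinMaxScaler` offers the same idea through its `feature_range` parameter.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so that min -> new_min and max -> new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("Min-Max scaling is undefined when all values are equal")
    span = new_max - new_min
    return [new_min + (v - lo) * span / (hi - lo) for v in values]


data = [1, 5, 10, 15, 20]
scaled = min_max_scale(data, new_min=-1.0, new_max=1.0)

print([round(v, 2) for v in scaled])
# [-1.0, -0.58, -0.05, 0.47, 1.0]
```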

### Q8

In [None]:
Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform
Feature Extraction using PCA. How many principal components would you choose to retain, and why?

In [None]:
Ans:- The number of principal components to retain depends on the amount of variance explained by each component. To 
determine the number of components to retain, we can look at the explained variance ratio (EVR), which tells us the 
proportion of the total variance in the data that is explained by each principal component.

To perform PCA on the given dataset, we first need to encode the categorical gender feature numerically (for example, as
0/1) and then standardize the features to have zero mean and unit variance:

In [6]:
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

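# Standardize the data
# NOTE: X must already hold the numeric feature matrix. It is not defined in this
# cell; the three-value output below appears to come from the 3x3 example matrix
# defined in Q2, not from the five-feature dataset in the question.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)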

# Fit PCA and transform the data
pca = PCA()
X_pca = pca.fit_transform(X_scaled)

# Look at the explained variance ratio
print(pca.explained_variance_ratio_)


[1. 0. 0.]


In [None]:
The output will show the proportion of variance explained by each principal component. We can then decide how many 
components to retain based on a certain threshold. For example, we might choose to retain enough components to explain
90% of the variance:

In [7]:
import numpy as np

# Look at the cumulative explained variance ratio
cumulative_variance_ratio = np.cumsum(pca.explained_variance_ratio_)
print(cumulative_variance_ratio)

# Find the number of components needed to explain 90% of the variance
n_components = np.argmax(cumulative_variance_ratio >= 0.9) + 1
print(n_components)


[1. 1. 1.]
1


In [None]:
For the five-feature dataset in the question, the output might show that the first two components explain 70% of the
variance, the third component explains 20%, and the remaining two components explain only 10% together. The first two
components alone would fall short of the 90% threshold, so we would retain the first three principal components, which
together capture 90% of the variance in the data. We would then use these three components as new features in our model.