Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its
application.

ans -  
Min-Max scaling, also known as normalization, is a data preprocessing technique used to scale and transform the features of a dataset so that they fall within a specific range. The goal is to bring all the features to a common scale without distorting the original distribution of the data.

The formula for Min-Max scaling is given by:

$$X_{\text{new}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$$

where $X$ is the original value of a feature, $X_{\min}$ is the minimum value of that feature in the dataset, and $X_{\max}$ is the maximum value of that feature in the dataset.

Here's how Min-Max scaling is applied in data preprocessing:

1. Identify the minimum ($X_{\min}$) and maximum ($X_{\max}$) values for each feature in the dataset.
2. Apply the Min-Max scaling formula to transform each feature.

This technique is particularly useful when working with algorithms that are sensitive to the scale of input features, such as gradient-based optimization algorithms used in machine learning models.

Example:

Suppose you have a dataset with a feature "Age" ranging from 20 to 60. You want to apply Min-Max scaling to bring the values of this feature within the range [0, 1]. The minimum value ($X_{\min}$) is 20, and the maximum value ($X_{\max}$) is 60.

Let's say you have a data point with $X = 30$. Applying the Min-Max scaling formula:

$$X_{\text{new}} = \frac{30 - 20}{60 - 20} = \frac{10}{40} = 0.25$$

So, the scaled value for $X = 30$ after Min-Max scaling is 0.25. This process is repeated for each data point in the "Age" feature, ensuring that all values are transformed to the [0, 1] range.
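The calculation above can be sketched in plain Python. This is a minimal illustration rather than a library implementation: the function name `min_max_scale` and the sample ages are assumptions made for the example, chosen so that the minimum is 20 and the maximum is 60.

```python
def min_max_scale(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The formula divides by (max - min), so a constant feature cannot be scaled
        raise ValueError("Min-Max scaling is undefined when all values are equal")
    span = new_max - new_min
    return [new_min + (v - lo) / (hi - lo) * span for v in values]


# Hypothetical "Age" values covering the 20-60 range from the example
ages = [20, 30, 45, 60]
scaled_ages = min_max_scale(ages)
print(scaled_ages)  # [0.0, 0.25, 0.625, 1.0]
```

The age of 30 maps to 0.25, matching the hand calculation.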

Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling?
Provide an example to illustrate its application.

ans - The Unit Vector technique, also called unit vector scaling or vector normalization, is another feature scaling method. Unlike Min-Max scaling, which maps each feature to a fixed range using that feature's minimum and maximum, it rescales each feature vector by its own length.

In Unit Vector scaling, each feature vector is scaled so that it becomes a unit vector (a vector with a magnitude of 1). The formula for Unit Vector scaling is given by:

$$X_{\text{new}} = \frac{X}{\|X\|}$$

where $X$ is the original feature vector, and $\|X\|$ denotes the Euclidean norm or magnitude of the vector.

Here's how Unit Vector scaling differs from Min-Max scaling:

Range: Min-Max scaling scales the values of each feature to a specific range (e.g., [0, 1]), while Unit Vector scaling scales the entire feature vector to have a magnitude of 1.

Direction: Min-Max scaling preserves the relative spacing of values within each feature, but because each feature is rescaled independently, the direction of a sample's feature vector generally changes. Unit Vector scaling divides every component by the same norm, so the direction of the vector stays the same and only its length changes to 1.

Example:

Suppose you have a dataset with two features, $X_1$ and $X_2$, and you want to apply Unit Vector scaling. The original feature vector is $[3, 4]$.

Calculate the Euclidean norm ($\|X\|$):

$$\|X\| = \sqrt{3^2 + 4^2} = \sqrt{9 + 16} = \sqrt{25} = 5$$

Apply Unit Vector scaling:

$$X_{\text{new}} = \frac{[3, 4]}{5} = \left[\frac{3}{5}, \frac{4}{5}\right]$$

So, the scaled unit vector is $\left[\frac{3}{5}, \frac{4}{5}\right]$. This ensures that the magnitude of the vector is 1, and the direction remains the same.
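The same steps can be sketched in plain Python. The helper name `to_unit_vector` is an assumption made for illustration; it divides each component by the Euclidean norm and rejects the zero vector, which has no direction to preserve.

```python
import math


def to_unit_vector(vector):
    """Scale a vector to length 1 by dividing each component by its Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("The zero vector cannot be scaled to unit length")
    return [x / norm for x in vector]


unit = to_unit_vector([3, 4])
print(unit)  # [0.6, 0.8]

# The length of the result should be 1 (up to floating-point rounding)
print(math.hypot(*unit))
```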

Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an
example to illustrate its application.

ans - Principal Component Analysis (PCA) is a dimensionality reduction technique commonly used in the field of machine learning and statistics. Its primary goal is to transform high-dimensional data into a lower-dimensional representation, capturing the most important information while minimizing the loss of variance.

The key idea behind PCA is to find the principal components, which are linear combinations of the original features, ordered by the amount of variance they explain. The first principal component explains the most variance, the second explains the second most, and so on. By selecting a subset of these components, one can represent the data in a lower-dimensional space.

Here are the main steps involved in PCA:

Standardization: Standardize the features to have zero mean and unit variance. This step ensures that all features are on a similar scale.

Covariance Matrix: Calculate the covariance matrix of the standardized features. The covariance matrix describes the relationships between different features.

Eigendecomposition: Perform eigendecomposition on the covariance matrix to obtain eigenvalues and corresponding eigenvectors. The eigenvectors represent the directions of maximum variance, and the eigenvalues indicate the magnitude of the variance in those directions.

Select Principal Components: Sort the eigenvectors by their corresponding eigenvalues in decreasing order. The top $k$ eigenvectors (principal components) are chosen to form a new feature space, where $k$ is the desired dimensionality of the reduced data.

Projection: Project the original data onto the selected principal components to obtain the lower-dimensional representation.
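The five steps above can be sketched with NumPy as follows. This is an illustrative sketch, not a reference implementation: the function name `pca_reduce` and the small two-feature dataset are assumptions made for the example, and the projection is applied to the standardized data.

```python
import numpy as np


def pca_reduce(X, k):
    """Reduce X (n_samples x n_features) to k dimensions via eigendecomposition."""
    # 1. Standardization: zero mean and unit (sample) variance per feature
    X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # 2. Covariance matrix of the standardized features
    cov = np.cov(X_std, rowvar=False)
    # 3. Eigendecomposition (eigh handles symmetric matrices, eigenvalues ascending)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # 4. Sort by eigenvalue in decreasing order and keep the top k eigenvectors
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    components = eigenvectors[:, order[:k]]
    # 5. Projection onto the selected principal components
    return X_std @ components, eigenvalues


# Hypothetical two-feature dataset, reduced to one dimension
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0]])
X_reduced, eigenvalues = pca_reduce(X, k=1)

print(X_reduced.shape)                  # (5, 1)
print(eigenvalues / eigenvalues.sum())  # share of variance per component
```

The variance of the projected column equals the largest eigenvalue, which is why the eigenvalues can be read as the variance explained by each component.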

Example:

Let's consider a simple example with a dataset containing two features, $X_1$ and $X_2$. We want to reduce the dimensionality to one dimension (use only the first principal component).

Standardization:
Standardize the features to have zero mean and unit variance.

Covariance Matrix:
Calculate the covariance matrix:

$$\text{Cov} = \begin{bmatrix} \text{Var}(X_1) & \text{Cov}(X_1, X_2) \\ \text{Cov}(X_2, X_1) & \text{Var}(X_2) \end{bmatrix}$$

Eigendecomposition:
Perform eigendecomposition to obtain eigenvalues ($\lambda_1, \lambda_2$) and eigenvectors ($v_1, v_2$).

Select Principal Components:
Select the eigenvector corresponding to the highest eigenvalue. In this case, let's say it is $v_1$.

Projection:
Project the standardized data onto the first principal component:

$$\text{Reduced Dimension} = X \cdot v_1$$

where $X$ here denotes the standardized data matrix.

Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature
Extraction? Provide an example to illustrate this concept.

ans - Principal Component Analysis (PCA) and Feature Extraction are closely related concepts in the field of machine learning and data analysis. PCA is a technique used for dimensionality reduction, and it can be employed as a method for feature extraction.

Relationship between PCA and Feature Extraction:

Feature extraction involves transforming the original set of features into a new set of features, typically with a lower dimensionality, while preserving the most important information in the data. PCA is a specific method for feature extraction that achieves dimensionality reduction by projecting the original features onto a new set of orthogonal axes called principal components. These principal components are ordered by their importance, and the idea is to retain the most significant ones while discarding less relevant information.

How PCA can be used for Feature Extraction:

Covariance Matrix Calculation: Begin by calculating the covariance matrix of the original feature matrix.

Eigendecomposition: Perform eigendecomposition on the covariance matrix to obtain eigenvalues and eigenvectors.

Selection of Principal Components: Sort the eigenvalues in descending order. The corresponding eigenvectors represent the principal components. Choose the top k eigenvectors to retain the most important information, where k is the desired dimensionality of the reduced feature space.

Projection: Project the original data onto the selected principal components to obtain the reduced feature matrix.

Example - 

Original Feature Matrix:

| Height | Weight |
|--------|--------|
|  170   |   65   |
|  160   |   55   |
|  175   |   70   |
|  162   |   58   |


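Covariance Matrix (sample covariance of the mean-centered data, values rounded):

    | 48.92  47.33 |
    | 47.33  46.00 |


Eigendecomposition:

The eigenvalues are approximately 94.81 and 0.10. The corresponding eigenvectors are approximately [0.718, 0.696] and [-0.696, 0.718].


Selection of Principal Components:

Choose the top eigenvector, [0.718, 0.696], as the principal component. It accounts for about 99.9% of the total variance.

Projection:

Project the mean-centered data (Height - 166.75, Weight - 62) onto the principal component:

| Height | Weight | PCA1   |
|--------|--------|--------|
|  170   |   65   |   4.42 |
|  160   |   55   |  -9.72 |
|  175   |   70   |  11.49 |
|  162   |   58   |  -6.19 |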
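As a cross-check, the following sketch recomputes the covariance matrix, eigenvalues, and first-component projection directly from the Height/Weight table using NumPy. It centers the data first and uses the sample covariance (dividing by n - 1); both choices are assumptions about how the example is meant to be computed. The sign of an eigenvector is arbitrary, so the sketch flips it to make the first entry positive.

```python
import numpy as np

# Height/Weight values from the original feature matrix
X = np.array([[170, 65],
              [160, 55],
              [175, 70],
              [162, 58]], dtype=float)

# Center each feature on its mean before projecting
X_centered = X - X.mean(axis=0)

# Sample covariance matrix (np.cov divides by n - 1 by default)
cov = np.cov(X_centered, rowvar=False)

# Eigendecomposition of the symmetric covariance matrix
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# First principal component: eigenvector with the largest eigenvalue
pc1 = eigenvectors[:, np.argmax(eigenvalues)]
if pc1[0] < 0:
    pc1 = -pc1  # sign convention only; the direction is the same axis

# One-dimensional representation of each row
pca1 = X_centered @ pc1

print(np.round(cov, 2))
print(np.round(np.sort(eigenvalues)[::-1], 2))
print(np.round(pca1, 2))
```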


Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset
contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to
preprocess the data.

ans - Min-Max scaling is a common technique used in data preprocessing to normalize the features of a dataset. It transforms the data in a way that scales the values between a specified range, usually [0, 1]. In this dataset, price, rating, and delivery time are measured in different units and ranges, so scaling keeps a large-valued feature such as delivery time from dominating distance or similarity computations in the recommendation model.

In [4]:
from sklearn.preprocessing import MinMaxScaler
import pandas as pd

# Create a sample DataFrame
data = {'price': [10, 20, 15, 25],
        'rating': [4.5, 3.0, 4.8, 3.5],
        'delivery_time': [30, 45, 20, 60]}

df = pd.DataFrame(data)
# Extract the features to be scaled
features_to_scale = ['price', 'rating', 'delivery_time']
data_to_scale = df[features_to_scale]

# Initialize the MinMaxScaler
scaler = MinMaxScaler()

# Fit and transform the data using Min-Max scaling
scaled_data = scaler.fit_transform(data_to_scale)

# Create a new DataFrame with the scaled data
scaled_df = pd.DataFrame(scaled_data, columns=features_to_scale)

# Display the scaled DataFrame
print(scaled_df)


      price    rating  delivery_time
0  0.000000  0.833333          0.250
1  0.666667  0.000000          0.625
2  0.333333  1.000000          0.000
3  1.000000  0.277778          1.000


Q6. You are working on a project to build a model to predict stock prices. The dataset contains many
features, such as company financial data and market trends. Explain how you would use PCA to reduce the
dimensionality of the dataset.

ans - Principal Component Analysis (PCA) is a dimensionality reduction technique commonly used in machine learning and data analysis. When dealing with a dataset containing many features, such as financial data and market trends in the context of predicting stock prices, PCA can be employed to reduce the dimensionality of the dataset while retaining the most significant information.

Here's how you can use PCA to reduce the dimensionality of the dataset for predicting stock prices:

1. Understanding the Dataset:

Begin by understanding the dataset, including the nature of the features, their correlation, and the overall structure of the data.

2. Data Preprocessing:

Handle any missing values or outliers in the dataset, as PCA is sensitive to them.

3. Standardization:

Standardize the features to ensure they have a mean of 0 and a standard deviation of 1. This step is crucial because PCA is based on the covariance matrix, and standardization ensures that all features contribute equally.

In [19]:
from sklearn.preprocessing import StandardScaler
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# Assuming 'X' is your feature matrix
X = data

target_data = np.array([1, 2, 3])

# Assuming 'y' is your target variable
y = target_data

scaler = StandardScaler()
X_standardized = scaler.fit_transform(X)


4. Applying PCA:

Use PCA to transform the standardized features into principal components. You can choose the number of components based on the explained variance you want to retain.

In [20]:
from sklearn.decomposition import PCA

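# Keep all 3 components here so each one's explained variance can be inspected;
# choose fewer than the number of features to actually reduce dimensionality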
pca = PCA(n_components=3)
X_pca = pca.fit_transform(X_standardized)


5. Explained Variance:

Evaluate the explained variance to understand how much information each principal component retains. This information guides you in choosing the appropriate number of components.

In [21]:
explained_variance_ratio = pca.explained_variance_ratio_
print("Explained Variance Ratio:", explained_variance_ratio)


Explained Variance Ratio: [1. 0. 0.]


6. Choosing the Number of Components:

Based on the explained variance, choose the number of principal components that retain a sufficient amount of information for your prediction task.

7. Model Building:

Use the reduced-dimension dataset (X_pca) for building your stock price prediction model.

In [22]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression


X_train, X_test, y_train, y_test = train_test_split(X_pca, y, test_size=0.2, random_state=42)

model = LinearRegression()
model.fit(X_train, y_train)


Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the
values to a range of -1 to 1.
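For a target range $[a, b]$, the Min-Max formula generalizes as shown below. Here $a = -1$, $b = 1$, $X_{\min} = 1$, and $X_{\max} = 20$; the worked value for $X = 5$ is rounded to four decimals.

```latex
X_{\text{new}} = a + \frac{(X - X_{\min})\,(b - a)}{X_{\max} - X_{\min}}
               = -1 + \frac{2\,(X - 1)}{19}

X = 5: \quad X_{\text{new}} = -1 + \frac{2 \cdot 4}{19} \approx -0.5789
```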

In [25]:
import numpy as np

# Original dataset
data = np.array([1, 5, 10, 15, 20])

# Define the new minimum and maximum values
new_min = -1
new_max = 1

# Min-Max scaling formula
scaled_data = (data - np.min(data)) / (np.max(data) - np.min(data)) * (new_max - new_min) + new_min

# Display the scaled values
print("Original Values:", data)
print("Scaled Values (-1 to 1):", scaled_data)


Original Values: [ 1  5 10 15 20]
Scaled Values (-1 to 1): [-1.         -0.57894737 -0.05263158  0.47368421  1.        ]


Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform
Feature Extraction using PCA. How many principal components would you choose to retain, and why?

In [26]:
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assuming 'data' is your feature matrix with shape (number_of_samples, number_of_features)
# Columns: height, weight, age, gender (encoded as 0/1), blood pressure
data = np.array([[170, 65, 25, 1, 120],
                 [160, 55, 30, 0, 130],
                 [175, 70, 35, 1, 110],
                 [162, 58, 28, 0, 125]])

# Standardize the data
scaler = StandardScaler()
data_standardized = scaler.fit_transform(data)

# Apply PCA
pca = PCA()
pca.fit(data_standardized)

# Explained variance ratio for each principal component
explained_variance_ratio = pca.explained_variance_ratio_

# Cumulative explained variance
cumulative_explained_variance = np.cumsum(explained_variance_ratio)

# Find the number of principal components that explain a desired percentage of variance (e.g., 95%)
desired_explained_variance = 0.95
num_components = np.argmax(cumulative_explained_variance >= desired_explained_variance) + 1

print(f"Number of principal components to retain for {desired_explained_variance * 100}% variance: {num_components}")


Number of principal components to retain for 95.0% variance: 2
