Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its
application.

Ans. Min-Max scaling, also known as normalization, is a data preprocessing technique that rescales the values of a feature so that they fall within a specified range, typically 0 to 1. It is useful for machine learning algorithms whose results depend on the scale of the features, such as distance-based methods (for example, k-nearest neighbors) and models trained with gradient descent.

The formula for Min-Max scaling is:

X_scaled = (X - X_min)/(X_max - X_min)

where,
X - the original value
X_min - minimum value of the feature in the dataset
X_max - maximum value of the feature in the dataset

Example:

Suppose we have a dataset with a single feature, and the values are:
X=[10,20,30,40,50]

To apply Min-Max scaling, we first find the minimum and maximum values of X:
X_min = 10
X_max = 50

Then we apply the formula to each value:

X_scaled[0] = (10-10)/(50-10) = 0

X_scaled[1] = (20-10)/(50-10) = 0.25

X_scaled[2] = (30-10)/(50-10) = 0.5

X_scaled[3] = (40-10)/(50-10) = 0.75

X_scaled[4] = (50-10)/(50-10) = 1

Thus, the scaled values are:

X_scaled=[0,0.25,0.5,0.75,1]
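
The arithmetic above can be reproduced in a few lines of NumPy. This is a minimal sketch: the helper name `min_max_scale` is chosen here for illustration and does not come from any library, and the handling of a constant feature is an assumption about what a sensible fallback would be.

```python
import numpy as np

def min_max_scale(x):
    """Rescale a 1-D array to the range [0, 1] (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        # A constant feature has no spread; return zeros instead of dividing by zero.
        return np.zeros_like(x)
    return (x - x_min) / (x_max - x_min)

X = [10, 20, 30, 40, 50]
X_scaled = min_max_scale(X)
print(X_scaled)  # the values 0, 0.25, 0.5, 0.75 and 1
```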



Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling?
Provide an example to illustrate its application.

Ans. The Unit Vector technique, also known as normalization or vector normalization, is a feature scaling method where each feature vector is scaled so that its magnitude becomes 1. This technique ensures that all the data points lie on the surface of a hypersphere, which can be useful in various machine learning algorithms, particularly those involving distance metrics.

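Min-Max Scaling: Rescales each feature (column) to a fixed range, typically [0, 1], based on the minimum and maximum values of that feature.
Unit Vector Scaling: Rescales each feature vector (row) by its own magnitude, so that the vector length becomes 1. The direction of the vector is preserved, but the individual values are not tied to a fixed minimum and maximum.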

Formula for Unit Vector scaling:

X_scaled = X/||X||

where,
X is the original feature vector

||X|| is the magnitude (Euclidean, or L2, norm) of X

Example:
Consider a dataset with a single feature vector:

X=[3,4]

||X|| = sqrt(3^2 + 4^2) = sqrt(25) = 5

X_scaled = [3,4]/5 = [0.6,0.8]
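
The same calculation can be checked with NumPy. This is a small sketch: `to_unit_vector` is a name made up for this example, and returning a zero vector unchanged is an assumed convention, since a zero vector has no direction to preserve.

```python
import numpy as np

def to_unit_vector(v):
    """Scale a vector to unit L2 norm (illustrative helper)."""
    v = np.asarray(v, dtype=float)
    norm = np.linalg.norm(v)  # Euclidean (L2) length
    if norm == 0:
        return v  # assumed convention: leave the zero vector unchanged
    return v / norm

X = [3, 4]
X_unit = to_unit_vector(X)
print(X_unit)                  # components 0.6 and 0.8
print(np.linalg.norm(X_unit))  # length of the scaled vector is 1
```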


Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an
example to illustrate its application.

Ans. PCA is a dimensionality reduction technique that transforms a dataset into a new set of linearly uncorrelated features called principal components. The components are ordered by the amount of variance they capture, so keeping only the first few reduces dimensionality while preserving most of the information in the data.

Steps in PCA

Standardize the Data: Scale the data to have a mean of 0 and a variance of 1.

Compute the Covariance Matrix: Understand the variance and relationships between features.

Calculate Eigenvalues and Eigenvectors: Identify the directions (eigenvectors) and their importance (eigenvalues).

Sort Eigenvalues and Eigenvectors: Order by descending eigenvalues.

Form the Feature Vector: Select the top k eigenvectors.

Transform the Data: Multiply the standardized data by the matrix of selected eigenvectors to get the reduced dataset.
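
As an example, the steps above can be sketched directly in NumPy. The dataset values below are made up for illustration, and the variable names (`X_std`, `W`, `X_reduced`) are choices made for this sketch rather than anything prescribed by PCA itself.

```python
import numpy as np

# Small illustrative dataset: 5 samples, 2 correlated features (made-up values).
X = np.array([[2.0, 1.9],
              [0.5, 0.7],
              [2.2, 2.9],
              [1.9, 2.2],
              [3.1, 3.0]])

# 1. Standardize each feature to mean 0 and variance 1.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# 2. Covariance matrix of the standardized features (columns are variables).
cov = np.cov(X_std, rowvar=False)

# 3. Eigen-decomposition; eigh is intended for symmetric matrices.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort by descending eigenvalue (eigh returns them in ascending order).
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# 5. Keep the top k eigenvectors as the projection matrix.
k = 1
W = eigenvectors[:, :k]

# 6. Project the standardized data onto the selected components.
X_reduced = X_std @ W

explained_ratio = eigenvalues / eigenvalues.sum()
print("Explained variance ratio:", explained_ratio)
print("Reduced data shape:", X_reduced.shape)
```

Here the two-feature dataset is reduced to a single column. The sign of the projected values depends on the sign the eigenvector solver happens to return, so it may differ between libraries.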

Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature
Extraction? Provide an example to illustrate this concept.

Ans. Relationship between PCA and Feature Extraction:

Feature extraction involves transforming raw data into a set of features that better represent the underlying structure of the data, which can enhance the performance of machine learning models. PCA aids in feature extraction by identifying the directions (principal components) that capture the maximum variance in the data. These principal components can be used as new features, reducing dimensionality while retaining the most important information.

How PCA is Used for Feature Extraction

Standardize the Data: Ensure all features have the same scale.

Compute the Covariance Matrix: Understand how features vary with respect to each other.

Calculate Eigenvalues and Eigenvectors: Identify principal components.

Sort and Select Top k Components: Choose the principal components that capture the most variance.

Transform the Data: Project the original data onto the selected principal components to get the new features.

Example: Consider a dataset with three data points and two features:



In [1]:
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Given dataset
X = np.array([[2.5, 2.4],
              [0.5, 0.7],
              [2.2, 2.9]])

# Standardize the data
scaler = StandardScaler()
X_standardized = scaler.fit_transform(X)

# Perform PCA
pca = PCA(n_components=1)
X_pca = pca.fit_transform(X_standardized)

print("Original Data:\n", X)
print("\nStandardized Data:\n", X_standardized)
print("\nPrincipal Component:\n", pca.components_)
print("\nExplained Variance Ratio:\n", pca.explained_variance_ratio_)
print("\nTransformed Data:\n", X_pca)


Original Data:
 [[2.5 2.4]
 [0.5 0.7]
 [2.2 2.9]]

Standardized Data:
 [[ 0.87056284  0.4247954 ]
 [-1.40047065 -1.38058503]
 [ 0.52990781  0.95578964]]

Principal Component:
 [[-0.70710678 -0.70710678]]

Explained Variance Ratio:
 [0.96829339]

Transformed Data:
 [[-0.91595659]
 [ 1.96650334]
 [-1.05054674]]


Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset
contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to
preprocess the data.

Ans. Preprocessing is a crucial step when building a recommendation system for a food delivery service. Price, rating, and delivery time are measured in different units and ranges, so Min-Max scaling can be used to bring them onto a common scale. This keeps a feature with large values, such as price, from dominating similarity or distance calculations in the recommendation algorithm.

Steps to Use Min-Max Scaling

Understand the Data:

Inspect the dataset to understand the range of each feature (price, rating, delivery time).

Calculate Minimum and Maximum Values:

For each feature, calculate the minimum and maximum values.

Apply Min-Max Scaling:

Use the Min-Max scaling formula to transform each feature to a specified range, typically [0, 1].
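
These steps can be sketched with scikit-learn's `MinMaxScaler`. The restaurant records below are hypothetical values invented for this example, and the column order (price, rating, delivery time) is an assumption about how the dataset is laid out.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical records; columns: price, rating (1-5), delivery time (minutes).
restaurants = np.array([
    [250.0, 4.2, 30.0],
    [120.0, 3.8, 45.0],
    [480.0, 4.7, 25.0],
    [300.0, 4.0, 60.0],
])

# Each column is scaled independently to [0, 1] using its own min and max.
scaler = MinMaxScaler(feature_range=(0, 1))
restaurants_scaled = scaler.fit_transform(restaurants)

print("Per-feature minimum:", scaler.data_min_)
print("Per-feature maximum:", scaler.data_max_)
print("Scaled data:\n", restaurants_scaled)

# New records reuse the fitted scaler instead of being refit.
new_restaurant = np.array([[200.0, 4.5, 35.0]])
print("Scaled new record:", scaler.transform(new_restaurant))
```

Reusing the fitted scaler for new records keeps them on the same scale as the data the recommendation model was built on. A new value outside the original minimum and maximum would fall outside [0, 1].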

Q6. You are working on a project to build a model to predict stock prices. The dataset contains many
features, such as company financial data and market trends. Explain how you would use PCA to reduce the
dimensionality of the dataset.

Ans. To build a model to predict stock prices using a dataset with many features (such as company financial data and market trends), Principal Component Analysis (PCA) can be employed to reduce dimensionality. This process simplifies the dataset, reduces computational cost, and can improve model performance by removing noise and multicollinearity.

Steps to Use PCA for Dimensionality Reduction

Standardize the Data:

Since PCA is affected by the scales of the features, standardize the dataset to have a mean of 0 and a variance of 1.

Compute the Covariance Matrix:

Calculate the covariance matrix of the standardized features to see how each pair of features varies together.

Calculate Eigenvalues and Eigenvectors:

Compute the eigenvalues and eigenvectors of the covariance matrix to identify the principal components.

Sort Eigenvalues and Eigenvectors:

Sort the eigenvalues in descending order and sort the eigenvectors accordingly. The eigenvectors corresponding to the highest eigenvalues are the principal components.

Choose the Number of Principal Components:

Decide how many principal components to keep. This can be based on the explained variance (e.g., keeping enough components to explain 95% of the variance).

Transform the Data:

Multiply the standardized dataset by the matrix of selected eigenvectors (principal components) to obtain the reduced dataset.
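
In practice, scikit-learn can carry out these steps and pick the number of components from a variance target. The sketch below uses synthetic data as a stand-in for a stock feature table (it is not real market data), and the sizes of 200 rows, 10 features, and 3 hidden factors are arbitrary choices for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a stock feature table: 200 rows, 10 features driven
# by 3 hidden factors plus a little noise (not real market data).
rng = np.random.default_rng(0)
factors = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = factors @ mixing + 0.1 * rng.normal(size=(200, 10))

# Standardize so that no feature dominates because of its units.
X_std = StandardScaler().fit_transform(X)

# A float n_components asks PCA to keep the fewest components whose
# cumulative explained variance reaches that fraction.
pca = PCA(n_components=0.95, svd_solver="full")
X_reduced = pca.fit_transform(X_std)

print("Components kept:", pca.n_components_)
print("Cumulative explained variance:", np.cumsum(pca.explained_variance_ratio_))
```

For a real prediction task, the scaler and PCA would normally be fitted on the training period only and then applied to later data, so that information from the test period does not leak into the transformation.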

Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the
values to a range of -1 to 1.
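Ans. To map values to an arbitrary range [a, b], the Min-Max formula generalizes to X_scaled = (X - X_min)/(X_max - X_min) * (b - a) + a. Here a = -1 and b = 1.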

In [8]:
import numpy as np

# Original dataset
data = np.array([1, 5, 10, 15, 20])

# Min and Max values of the original dataset
X_min = data.min()
X_max = data.max()

# Desired range
a, b = -1, 1

# Apply Min-Max scaling
scaled_data = (data - X_min) / (X_max - X_min) * (b - a) + a

print("Original Data:", data)
print("Scaled Data:", scaled_data)


Original Data: [ 1  5 10 15 20]
Scaled Data: [-1.         -0.57894737 -0.05263158  0.47368421  1.        ]


Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform
Feature Extraction using PCA. How many principal components would you choose to retain, and why?

Ans. 

In [9]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Sample dataset
data = pd.DataFrame({
    'height': [170, 160, 180, 175, 165],
    'weight': [70, 60, 80, 75, 65],
    'age': [30, 25, 35, 28, 40],
    'gender': [0, 1, 0, 1, 0],  # 0 for male, 1 for female
    'blood pressure': [120, 110, 130, 125, 115]
})

# Standardize the data
scaler = StandardScaler()
data_standardized = scaler.fit_transform(data)

# Perform PCA
pca = PCA()
data_pca = pca.fit_transform(data_standardized)

# Explained variance
explained_variance = pca.explained_variance_ratio_

# Cumulative explained variance
cumulative_explained_variance = np.cumsum(explained_variance)

# Display results
print("Explained Variance by each principal component:\n", explained_variance)
print("\nCumulative Explained Variance:\n", cumulative_explained_variance)

# Determine number of components to retain
num_components = np.argmax(cumulative_explained_variance >= 0.95) + 1
print(f"\nNumber of principal components to retain to explain 95% of the variance: {num_components}")


Explained Variance by each principal component:
 [6.51732034e-01 3.05663577e-01 4.26043893e-02 2.12883002e-33
 8.42951699e-38]

Cumulative Explained Variance:
 [0.65173203 0.95739561 1.         1.         1.        ]

Number of principal components to retain to explain 95% of the variance: 2


From the example output, you would retain 2 principal components, since the first two together explain about 95.7% of the variance. The last two components explain essentially none because, in this sample data, weight and blood pressure are exact linear functions of height. The exact number may vary based on your specific dataset. The goal is to retain enough components to capture most of the variance while reducing the dimensionality of the data. This helps in simplifying the model, reducing computational cost, and potentially improving the model's performance by removing noise and redundant features.