Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its
application.

Answer: 
Min-Max scaling is a data preprocessing technique used to scale numerical features within a specific range. It transforms the original data so that all values fall within a user-defined range, usually between 0 and 1. The formula to perform Min-Max scaling on a feature is as follows:

$$X_{new} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

Where:
- $X$ is the original value of a data point.
- $X_{new}$ is the scaled value of the data point.
- $X_{min}$ is the minimum value of the feature in the dataset.
- $X_{max}$ is the maximum value of the feature in the dataset.

The purpose of Min-Max scaling is to bring all features to the same scale, preventing features with larger values from dominating the learning process in machine learning algorithms that rely on distance or magnitude, such as k-nearest neighbors or gradient descent-based algorithms.

Let's illustrate the application of Min-Max scaling with an example:

Suppose we have a dataset containing the following numerical feature representing the age of people:

Original ages: [20, 25, 30, 35, 40] 

We want to apply Min-Max scaling to this feature, so that all values fall within the range of 0 to 1.

Here the minimum age is 20 and the maximum age is 40, so each age is mapped to (age - 20) / (40 - 20). For example, 25 becomes (25 - 20) / 20 = 0.25.

After applying Min-Max scaling, the transformed ages will be:

Scaled ages: [0, 0.25, 0.5, 0.75, 1]

Now, all ages fall within the range of 0 to 1, making the data ready for further analysis or machine learning algorithms that require scaled features.
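
The calculation above can be sketched in plain Python. The function name `min_max_scale` is a hypothetical helper introduced only for illustration, and the sketch assumes the feature contains at least two distinct values so that the denominator is non-zero.

```python
def min_max_scale(values):
    """Scale a list of numbers to the range [0, 1] (hypothetical helper)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # Every value is the same, so the formula would divide by zero
        raise ValueError("all values are identical; Min-Max scaling is undefined")
    return [(x - x_min) / span for x in values]


ages = [20, 25, 30, 35, 40]
scaled_ages = min_max_scale(ages)
print(scaled_ages)  # [0.0, 0.25, 0.5, 0.75, 1.0]
```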

Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling?
Provide an example to illustrate its application.

Answer: 
The Unit Vector technique, also known as normalization, is another data preprocessing technique used to scale numerical features. Unlike Min-Max scaling, which rescales each feature (column) to a specific range (e.g., between 0 and 1), normalization rescales each data point (row) so that its Euclidean norm (magnitude) is 1. It does this by dividing every component of a data point by the Euclidean norm of that data point's own feature vector.

The formula to perform Unit Vector scaling (normalization) on a feature is as follows:

$$X_{new} = \frac{X}{\|X\|}$$

Where:
- $X$ is the original data point (a vector of feature values).
- $\|X\|$ represents the Euclidean norm (magnitude) of that vector, calculated as $\sqrt{\sum_i X_i^2}$.

Normalization is useful when the direction of a data point matters more than its magnitude: after scaling, every vector has length 1, so data points are compared by their direction in the feature space. It is particularly beneficial when dealing with distance-based algorithms, such as k-nearest neighbors or support vector machines.

Let's illustrate the application of the Unit Vector technique (normalization) with an example:

Suppose we have a dataset containing the following 2-dimensional data points:

Original data points: [[3,4],[5,12],[8,15],[2,9]]

Each point is divided by its own norm. For example, [3,4] has norm sqrt(3^2 + 4^2) = 5, so it becomes [0.6,0.8].

Normalized data points: [[0.6,0.8],[0.3846,0.9231],[0.4706,0.8824],[0.2169,0.9762]]
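
A minimal NumPy sketch of this row-wise normalization, using the data points from the example. The variable names are illustrative, and the sketch assumes no data point is the zero vector (which would give a norm of 0).

```python
import numpy as np

points = np.array([[3, 4], [5, 12], [8, 15], [2, 9]], dtype=float)

# Euclidean (L2) norm of each row; keepdims=True keeps a column shape
# so the division below broadcasts across each row
norms = np.linalg.norm(points, axis=1, keepdims=True)

# Every row now has length 1 while keeping its original direction
unit_points = points / norms
print(np.round(unit_points, 4))
```

scikit-learn's `Normalizer` applies the same row-wise scaling if a ready-made transformer is preferred.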

***************
Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an
example to illustrate its application.

Answer: 
PCA, which stands for Principal Component Analysis, is a popular statistical technique used for dimensionality reduction in data analysis and machine learning. The primary goal of PCA is to transform a dataset containing a large number of correlated variables (features) into a new set of uncorrelated variables called principal components. These principal components are ranked by their ability to explain the variance in the data, with the first component explaining the most variance, the second component explaining the second most, and so on.

The steps involved in performing PCA are as follows:

Standardize the data: If the features have different scales, it is essential to standardize them (mean = 0, standard deviation = 1) to give each feature equal importance in the PCA process.

Calculate the covariance matrix: The covariance matrix is computed to understand the relationships between different features and the direction and strength of their correlations.

Compute eigenvectors and eigenvalues: The eigenvectors and eigenvalues are derived from the covariance matrix. Eigenvectors represent the directions (principal components), while eigenvalues represent the variance explained in those directions.

Select principal components: The eigenvectors are sorted based on their corresponding eigenvalues in descending order. The principal components with the highest eigenvalues explain the most variance in the data.

Project the data onto the new feature space: The original data is projected onto the new feature space formed by the selected principal components. This reduces the number of dimensions while preserving the maximum variance possible.

PCA is widely used in various applications, such as data visualization, feature extraction, and noise reduction, as it can simplify complex datasets while retaining meaningful patterns and relationships.
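
As an illustration, the five steps above can be sketched with NumPy. The dataset below is hypothetical (six samples, three features), and keeping `k = 2` components is an arbitrary choice made only for this example.

```python
import numpy as np

# Hypothetical data: 6 samples, 3 features
X = np.array([
    [2.5, 2.4, 1.0],
    [0.5, 0.7, 0.9],
    [2.2, 2.9, 1.1],
    [1.9, 2.2, 1.0],
    [3.1, 3.0, 0.8],
    [2.3, 2.7, 1.2],
])

# 1. Standardize each feature to mean 0 and standard deviation 1
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# 2. Covariance matrix of the features (columns)
cov = np.cov(X_std, rowvar=False)

# 3. Eigen-decomposition; eigh is meant for symmetric matrices and
#    returns real eigenvalues in ascending order
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort components by eigenvalue, largest first
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# 5. Project the data onto the first k principal components
k = 2
X_reduced = X_std @ eigenvectors[:, :k]

# Share of the total variance explained by each component
explained_ratio = eigenvalues / eigenvalues.sum()
print(X_reduced.shape, np.round(explained_ratio, 3))
```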



*******************
Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset
contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to
preprocess the data.

Answer:

Each feature goes through the following steps for Min-Max scaling.

1. Identify the relevant features: In this case, the features we want to scale are "price," "rating," and "delivery time." These features might have different units or scales, so it's essential to bring them to a common scale for fair comparison.

2. Calculate the minimum and maximum values for each feature: Find the minimum and maximum values of "price," "rating," and "delivery time" in the dataset. These values will be used in the Min-Max scaling formula.

3. Apply Min-Max scaling: For each data point, use the Min-Max scaling formula to transform the original values to their scaled counterparts.

4. Update the dataset: Replace the original values of "price," "rating," and "delivery time" with their scaled values obtained from the Min-Max scaling process.

By using Min-Max scaling, we ensure that all features are on the same scale, so each contributes comparably during the recommendation process. This prevents features with larger numerical ranges (e.g., price, which typically spans a much wider range than rating) from dominating the recommendation system and helps provide balanced and fair recommendations to users.
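
The four steps can be sketched with NumPy as follows. The sample values and column order are hypothetical, chosen only to show the mechanics, and the sketch assumes every feature has at least two distinct values.

```python
import numpy as np

# Hypothetical sample; columns are price, rating, delivery time (minutes)
features = np.array([
    [250.0, 4.5, 30.0],
    [120.0, 3.8, 45.0],
    [480.0, 4.9, 60.0],
    [300.0, 4.1, 25.0],
])

# Step 2: per-feature (column-wise) minimum and maximum
col_min = features.min(axis=0)
col_max = features.max(axis=0)

# Steps 3 and 4: apply the Min-Max formula to every column at once
scaled = (features - col_min) / (col_max - col_min)
print(np.round(scaled, 3))
```

In practice, the minimum and maximum are usually computed on the training data and reused for new records, so that all data is scaled consistently.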

**********************************
Q6. You are working on a project to build a model to predict stock prices. The dataset contains many
features, such as company financial data and market trends. Explain how you would use PCA to reduce the
dimensionality of the dataset.

Answer: 
To reduce the dimensionality of the dataset for building a model to predict stock prices, PCA (Principal Component Analysis) can be used effectively. PCA is a dimensionality reduction technique that helps identify the most significant patterns and correlations in the data and then transforms the original features into a new set of uncorrelated variables called principal components.

Here's how you can use PCA to reduce the dimensionality of the dataset:

Data Preprocessing: Ensure that the dataset is properly cleaned and preprocessed. Handle missing values, normalize or scale the features if necessary, and remove any irrelevant or redundant variables.

Standardization: Standardize the numerical features so that they have a mean of 0 and a standard deviation of 1. This step is essential for PCA, as it is sensitive to the scale of the features.

Compute the Covariance Matrix: Calculate the covariance matrix for the standardized data. The covariance matrix measures the relationships between different features and provides information about their correlations.

Compute Eigenvectors and Eigenvalues: Find the eigenvectors and eigenvalues of the covariance matrix. Eigenvectors represent the principal components, and eigenvalues indicate the amount of variance explained by each principal component.

Sort Eigenvectors: Sort the eigenvectors based on their corresponding eigenvalues in descending order. The eigenvector with the highest eigenvalue represents the principal component that explains the most variance in the data, followed by the second highest, and so on.

Choose the Number of Principal Components: Decide on the number of principal components to retain based on the explained variance. Retaining a certain percentage of the total variance (e.g., 95% or 99%) is a common practice.

Project Data onto the New Feature Space: Transform the original data by projecting it onto the selected principal components. This step reduces the number of dimensions while preserving the most significant patterns in the data.

Train the Model: Use the reduced-dimensional data as input to train the predictive model. The reduced feature space should capture the essential information from the original data, helping the model make accurate predictions.
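
A minimal sketch of choosing the number of components from the explained variance. The data is randomly generated as a stand-in for the stock dataset (200 rows, 10 features), and the 95% threshold is one common choice rather than a fixed rule.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 10 features driven by 3 hidden factors plus noise
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
X = latent @ mixing + 0.1 * rng.normal(size=(200, 10))

# Standardize, then decompose the covariance matrix of the features
X_std = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X_std, rowvar=False))

# Sort components by explained variance, largest first
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Smallest k whose cumulative explained variance reaches the threshold
cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
threshold = 0.95
k = int(np.searchsorted(cumulative, threshold) + 1)

# Reduced feature matrix to pass to the predictive model
X_reduced = X_std @ eigenvectors[:, :k]
print(k, X_reduced.shape)
```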

****************************
Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the
values to a range of -1 to 1.


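In [1]:
ds = [1, 5, 10, 15, 20]

# Target range for the scaled values
new_min, new_max = -1, 1

Xmax = max(ds)
Xmin = min(ds)

# Scale to [0, 1], then stretch and shift to [new_min, new_max]
scaled_ds = []
for i in ds:
    scaled_ds.append(round(new_min + (i - Xmin) * (new_max - new_min) / (Xmax - Xmin), 4))

scaled_ds

[-1.0, -0.5789, -0.0526, 0.4737, 1.0]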

*********************

Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform
Feature Extraction using PCA. How many principal components would you choose to retain, and why?

In [9]:
import pandas as pd
import numpy as np

# Five samples with five features each
data = [[160, 50, 20, 0, 120], [165, 55, 22, 1, 130], [170, 80, 25, 0, 140],
        [150, 60, 25, 0, 130], [160, 90, 30, 0, 140]]

# Create the pandas DataFrame
df = pd.DataFrame(data, columns=['Height', 'Weight', 'Age', 'Gender', 'Bp'])

# Standardize each feature (mean 0, standard deviation 1) so that no feature dominates
X = (df - df.mean()) / df.std()

# PCA decomposes the covariance matrix of the features, not the raw data matrix;
# eigh is meant for symmetric matrices and returns real eigenvalues
cov = np.cov(X.to_numpy(), rowvar=False)
w, v = np.linalg.eigh(cov)

In [10]:
## Sorting eigenvalues in descending order and computing the cumulative explained variance
order = np.argsort(w)[::-1]
w, v = w[order], v[:, order]
explained = np.cumsum(w) / w.sum()
explained

The number of principal components to retain is the smallest number of leading components whose cumulative explained variance in `explained` reaches the chosen threshold, commonly 95%. Because this dataset has only 5 samples, the centered data has rank at most 4, so at most 4 components can carry any variance and at least one of the 5 can be dropped without losing information.