Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its 
application.

ans)

Min-Max scaling, also called normalization, is a data preprocessing technique that transforms each feature of a dataset to a specific range, typically 0 to 1. It puts the input features on a comparable scale before they are fed into algorithms that are sensitive to the magnitude of the data, such as gradient-based optimizers or distance-based algorithms.

How it is used in data preprocessing:

1. Load Data: Load the dataset into the program.

2. Handle Missing Values: Impute or remove any missing data if necessary.

3. Split Data: Divide your dataset into training and testing subsets. It's essential to split the data before scaling to avoid data leakage.

4. Initialize Min-Max Scaler: Import and initialize the MinMaxScaler from sklearn.preprocessing.

5. Fit and Transform Training Data: Use the fit_transform() method to scale your training data. This calculates the minimum and maximum values from the training set and scales it.

6. Transform Test Data: Apply the same scaling (using transform()) to the test set using the previously computed min and max from the training data.

7. Train Your Model: Train your machine learning model using the scaled training data.

8. Evaluate the Model: Evaluate the performance of your model on the scaled test data.
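
The steps above can be sketched end to end as follows. This is a minimal sketch rather than a definitive pipeline: the synthetic data, the `LogisticRegression` model, and all variable names are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Hypothetical data: 100 samples, 3 features on very different scales
rng = np.random.default_rng(0)
X = rng.normal(loc=[50, 0.5, 1000], scale=[10, 0.1, 300], size=(100, 3))
y = (X[:, 0] > 50).astype(int)  # illustrative binary target

# Step 3: split before scaling to avoid data leakage
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Steps 4-5: fit the scaler on the training data only
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)

# Step 6: reuse the training min and max for the test data
X_test_scaled = scaler.transform(X_test)

# Steps 7-8: train and evaluate on the scaled data
model = LogisticRegression()
model.fit(X_train_scaled, y_train)
accuracy = model.score(X_test_scaled, y_test)
print("Test accuracy:", accuracy)
```

Because the minimum and maximum come from the training set, scaled test values can fall slightly outside [0, 1]. That is expected and is not a sign of an error.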

In [1]:
#Simple example in Python code illustrating Min-Max scaling
from sklearn.preprocessing import MinMaxScaler
import numpy as np

# Sample data: Ages of individuals
ages = np.array([[22], [25], [28], [30], [35]])

# Initialize the MinMaxScaler
scaler = MinMaxScaler()

# Fit and transform the data
ages_scaled = scaler.fit_transform(ages)

# Print the original and scaled data
print("Original Data:\n", ages)
print("\nScaled Data:\n", ages_scaled)


Original Data:
 [[22]
 [25]
 [28]
 [30]
 [35]]

Scaled Data:
 [[0.        ]
 [0.23076923]
 [0.46153846]
 [0.61538462]
 [1.        ]]
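
The scaled values above can be checked by hand with the Min-Max formula x_scaled = (x - x_min) / (x_max - x_min). For example, age 25 maps to (25 - 22) / (35 - 22) = 3/13 ≈ 0.2308. The short sketch below applies the same formula with plain NumPy, using a 1-D array for brevity.

```python
import numpy as np

ages = np.array([22, 25, 28, 30, 35], dtype=float)

# x_scaled = (x - x_min) / (x_max - x_min)
ages_scaled = (ages - ages.min()) / (ages.max() - ages.min())
print(ages_scaled)
```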


Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? 
Provide an example to illustrate its application.

ans)

The Unit Vector technique, also known as Normalization, is a feature scaling method that rescales each sample's feature vector into a unit vector. After normalization, the vector has a length (Euclidean norm) of 1.

Difference from Min-Max Scaling:

    1. Min-Max Scaling: Transforms features to a specific range (e.g., [0, 1]) by adjusting the data based on the minimum and maximum values of each feature.

    2. Unit Vector Scaling (Normalization): Transforms the feature vector so that its magnitude is 1, focusing on the direction rather than the range of values. This method is independent of the minimum and maximum values and instead normalizes based on the overall magnitude of the vector.

In [3]:
#Simple example in Python code illustrating unit vector scaling
from sklearn.preprocessing import Normalizer
import numpy as np

# Sample data: height and weight
data = np.array([[170, 65],
                 [160, 70],
                 [180, 75]])

# Initialize the Normalizer (unit vector scaling)
normalizer = Normalizer()

# Fit and transform the data
data_normalized = normalizer.fit_transform(data)

# Print the original and normalized data
print("Original Data:\n", data)
print("\nNormalized Data (Unit Vectors):\n", data_normalized)


Original Data:
 [[170  65]
 [160  70]
 [180  75]]

Normalized Data (Unit Vectors):
 [[0.93405183 0.35713747]
 [0.91615733 0.40081883]
 [0.92307692 0.38461538]]
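
The result above can be reproduced by dividing each row by its own Euclidean norm, which is what `Normalizer` does with its default `norm='l2'` setting. The sketch below does this with plain NumPy. Note that the scaling works per row (sample), whereas Min-Max scaling works per column (feature).

```python
import numpy as np

data = np.array([[170, 65],
                 [160, 70],
                 [180, 75]], dtype=float)

# Euclidean (L2) norm of each row, kept as a column for broadcasting
row_norms = np.linalg.norm(data, axis=1, keepdims=True)

# Divide each row by its own norm to get unit vectors
data_unit = data / row_norms

print("Row norms before:", row_norms.ravel())
print("Row norms after: ", np.linalg.norm(data_unit, axis=1))
```

For the third row the norm is sqrt(180^2 + 75^2) = 195, so the unit vector is [180/195, 75/195] ≈ [0.9231, 0.3846].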


Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an 
example to illustrate its application.

ans)

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms a dataset with potentially correlated features into a set of linearly uncorrelated components, called principal components. These components are ordered so that the first few retain most of the variation present in the original dataset.

How is it used in dimensionality reduction:

    1. Center the Data: Subtract the mean from each feature to center the data around the origin.

    2. Compute Covariance Matrix: Calculate the covariance matrix to understand how the features vary with respect to each other.

    3. Compute Eigenvalues and Eigenvectors: Find the eigenvectors and eigenvalues of the covariance matrix. The eigenvectors represent the directions (principal components), and the eigenvalues represent the magnitude (variance) in these directions.

    4. Sort Principal Components: Rank the principal components by their eigenvalues, i.e., by the amount of variance they capture.

    5. Transform the Data: Project the data onto the top k principal components to reduce the dimensionality while preserving the majority of the variance.
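
The five steps above can be sketched directly in NumPy. This is a minimal illustration that assumes a small made-up dataset and k = 2. Libraries such as scikit-learn typically use an SVD-based computation instead, so component signs may differ.

```python
import numpy as np

# Hypothetical data: 6 samples, 3 correlated features
X = np.array([[2.5, 2.4, 1.2],
              [0.5, 0.7, 0.3],
              [2.2, 2.9, 1.1],
              [1.9, 2.2, 0.9],
              [3.1, 3.0, 1.6],
              [2.3, 2.7, 1.0]])
k = 2  # number of principal components to keep

# 1. Center the data
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix (columns are features)
cov = np.cov(X_centered, rowvar=False)

# 3. Eigen-decomposition; eigh suits symmetric matrices
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort components by descending eigenvalue
order = np.argsort(eigenvalues)[::-1]
eigenvalues = eigenvalues[order]
eigenvectors = eigenvectors[:, order]

# 5. Project onto the top k components
X_reduced = X_centered @ eigenvectors[:, :k]

explained_variance_ratio = eigenvalues / eigenvalues.sum()
print("Reduced shape:", X_reduced.shape)
print("Explained variance ratio:", explained_variance_ratio)
```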

In [4]:
#Simple illustration of PCA
import numpy as np
from sklearn.decomposition import PCA

# Step 1: Sample dataset (e.g., height, weight, income)
data = np.array([[170, 65, 45000],
                 [160, 70, 48000],
                 [180, 75, 52000],
                 [175, 68, 50000]])

# Step 2: Initialize PCA (reduce to 2 components)
pca = PCA(n_components=2)

# Step 3: Fit and transform the data
data_pca = pca.fit_transform(data)

# Step 4: Print the transformed data
print("Original Data:\n", data)
print("\nPCA Transformed Data (2 components):\n", data_pca)

# Step 5: Print the explained variance ratio (how much variance each principal component captures)
print("\nExplained Variance Ratio:", pca.explained_variance_ratio_)


Original Data:
 [[  170    65 45000]
 [  160    70 48000]
 [  180    75 52000]
 [  175    68 50000]]

PCA Transformed Data (2 components):
 [[ 3.74999926e+03 -5.19737633e+00]
 [ 7.50017169e+02  1.00472452e+01]
 [-3.25001455e+03 -2.94608428e+00]
 [-1.25000187e+03 -1.90378454e+00]]

Explained Variance Ratio: [9.99994307e-01 5.24344819e-06]
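
In the output above, the first component explains almost all of the variance. This happens because income is measured in much larger units than height and weight, so it dominates the covariance matrix. The sketch below, which assumes the same sample data, compares PCA on the raw values with PCA on standardized values to show the effect of scale.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

data = np.array([[170, 65, 45000],
                 [160, 70, 48000],
                 [180, 75, 52000],
                 [175, 68, 50000]], dtype=float)

# PCA on raw data: income has by far the largest variance
ratio_raw = PCA(n_components=2).fit(data).explained_variance_ratio_

# PCA after standardizing each feature to zero mean and unit variance
data_std = StandardScaler().fit_transform(data)
ratio_std = PCA(n_components=2).fit(data_std).explained_variance_ratio_

print("Raw data:         ", ratio_raw)
print("Standardized data:", ratio_std)
```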


Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature 
Extraction? Provide an example to illustrate this concept.

ans)

Relationship Between PCA and Feature Extraction:
Principal Component Analysis (PCA) is not only a dimensionality reduction technique but also a powerful feature extraction method. In feature extraction, the goal is to derive new features (called components in PCA) from the original features, which better represent the underlying patterns in the data. These new features capture the most important information while reducing redundancy and noise.

    Feature Extraction: Refers to transforming the data into a new set of features that may or may not directly correspond to the original features. PCA achieves this by creating a new set of orthogonal features (principal components) that explain most of the variance in the data.

    Dimensionality Reduction: Follows from PCA when we keep only the most significant principal components and discard the rest.

How PCA is Used for Feature Extraction:

    1. Capture Maximum Variance: PCA identifies the directions (principal components) in which the data varies the most. These directions are linear combinations of the original features.

    2. New Feature Space: Instead of using the original features, PCA transforms the data into a new feature space where each principal component is a new feature. The first few components can capture most of the information in the data, allowing us to discard the less informative components.

    3. Uncorrelated Features: PCA generates features (principal components) that are uncorrelated (orthogonal). This can be beneficial when the original features are highly correlated, as the new features provide a more concise and independent representation of the data.

    4. Component Selection: After applying PCA, you can choose the top k principal components as new features. These components often summarize the data better than the original features, especially when there are correlations between features.
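
The points above can be checked on a small example. The sketch below assumes synthetic data in which the second feature is almost a multiple of the first. It prints the weights that define each principal component as a linear combination of the original features, and it confirms that the extracted features are uncorrelated.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical data: the second feature is nearly a multiple of the first
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 2.0 * x1 + rng.normal(scale=0.1, size=200)
x3 = rng.normal(size=200)
X = np.column_stack([x1, x2, x3])

X_std = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
Z = pca.fit_transform(X_std)

# Each row of components_ holds the weights of one linear combination
print("Component weights:\n", pca.components_)

# Correlation between the extracted features is numerically zero
corr = np.corrcoef(Z, rowvar=False)
print("Correlation of new features:\n", corr)
```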
    
Steps to Use PCA for Feature Extraction:

    1. Standardize the Data: PCA is sensitive to the scale of data, so standardization is often required.

    2. Apply PCA: Perform PCA to extract principal components from the dataset.

    3. Select Principal Components: Choose the top k principal components as new features based on how much variance they capture.

    4. Use New Features: Replace the original features with the selected principal components for further analysis or model training.

In [6]:
#Simple illustration example for the concept

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Step 1: Sample dataset (e.g., height, weight, income, age)
data = np.array([[170, 65, 45000, 30],
                 [160, 70, 48000, 32],
                 [180, 75, 52000, 29],
                 [175, 68, 50000, 35],
                 [165, 72, 47000, 28]])

# Step 2: Standardize the data
scaler = StandardScaler()
data_standardized = scaler.fit_transform(data)

# Step 3: Apply PCA (extract 2 principal components)
pca = PCA(n_components=2)
data_pca = pca.fit_transform(data_standardized)

# Step 4: Print the transformed data
print("Original Data (Standardized):\n", data_standardized)
print("\nPCA Transformed Data (2 Principal Components):\n", data_pca)

# Step 5: Print the explained variance ratio
print("\nExplained Variance Ratio:", pca.explained_variance_ratio_)


Original Data (Standardized):
 [[ 0.         -1.46805055 -1.40693001 -0.32232919]
 [-1.41421356  0.         -0.16552118  0.48349378]
 [ 1.41421356  1.46805055  1.4896906  -0.72524067]
 [ 0.70710678 -0.58722022  0.66208471  1.69222822]
 [-0.70710678  0.58722022 -0.57932412 -1.12815215]]

PCA Transformed Data (2 Principal Components):
 [[-1.71409327  0.0214244 ]
 [-0.86086578 -0.06449434]
 [ 2.56646191 -0.44103472]
 [ 0.31882658  2.00798075]
 [-0.31032944 -1.52387609]]

Explained Variance Ratio: [0.52319436 0.32766576]


Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset 
contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to 
preprocess the data.

ans)

Steps to Use Min-Max Scaling:
For a food delivery dataset with features such as price, rating, and delivery time, Min-Max scaling can be applied as follows:

    1. Understand the Features:

        Price: The cost of the food item.
        Rating: Customer rating of the food item.
        Delivery Time: Time taken to deliver the food item.
    
    2. Identify Feature Ranges:

        Price: Typically ranges from low to high values (e.g., $5 to $100).
        Rating: Generally ranges from 1 to 5.
        Delivery Time: Can vary widely depending on distance and traffic (e.g., 10 to 60 minutes).
        
    3. Apply Min-Max Scaling:
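        Min-Max scaling transforms each feature to a range between 0 and 1 using the formula:
            X_scaled = (X - X_min) / (X_max - X_min)
        Here X_min and X_max are the minimum and maximum of that feature. This ensures that all features are on a similar scale, so that a feature with large values such as price does not outweigh a feature with small values such as rating in a recommendation algorithm.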
    
    4. Preprocess the Data:
    
        Scale each feature independently using the Min-Max scaling formula, so that price, rating, and delivery time each use their own minimum and maximum.
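
A minimal sketch of this preprocessing step is shown below. The records, the column names (`price`, `rating`, `delivery_time`), and the variable names are hypothetical and only illustrate the idea.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical records for illustration only
orders = pd.DataFrame({
    "price": [5.0, 12.5, 30.0, 100.0],
    "rating": [3.5, 4.8, 4.0, 2.0],
    "delivery_time": [10, 25, 45, 60],
})

features = ["price", "rating", "delivery_time"]

# Each column is scaled independently with its own min and max
scaler = MinMaxScaler()
orders_scaled = pd.DataFrame(
    scaler.fit_transform(orders[features]),
    columns=features,
    index=orders.index,
)
print(orders_scaled)
```

In a real project the scaler would be fitted on the training split only and then reused, through `transform()`, for new restaurants or orders.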

Q6. You are working on a project to build a model to predict stock prices. The dataset contains many 
features, such as company financial data and market trends. Explain how you would use PCA to reduce the 
dimensionality of the dataset.

ans)

Steps to Use PCA for Dimensionality Reduction

1. Understand the Dataset:

    Features: Your dataset might include features such as company financial data (e.g., earnings, revenue, debt levels) and market trends (e.g., interest rates, market indices, trading volumes).
    Objective: Reduce the number of features while retaining the most significant information that influences stock prices.

2. Preprocess the Data:

    Handle Missing Values: Ensure there are no missing values in your dataset or impute them if necessary.
    Standardize Features: PCA is sensitive to the scale of the data. Standardizing the features to have zero mean and unit variance is crucial before applying PCA.

3. Apply PCA:

    Fit PCA: Perform PCA on the standardized dataset to extract principal components.
    Select Components: Determine the number of principal components that capture the majority of the variance in the data.

4. Transform Data:

    Transform: Project the original data onto the selected principal components to obtain a reduced feature set.

5. Model Building:

    Use Reduced Features: Use the reduced feature set (principal components) for building and training your stock price prediction model.
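
The steps above might look like the following sketch. The data is synthetic (20 features driven by 3 hidden factors) and stands in for real financial and market features. The 95% variance threshold and the chronological split are assumptions, not requirements.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for financial and market features:
# 20 observed features driven by 3 hidden factors plus noise
rng = np.random.default_rng(42)
factors = rng.normal(size=(300, 3))
loadings = rng.normal(size=(3, 20))
X = factors @ loadings + 0.1 * rng.normal(size=(300, 20))

# shuffle=False keeps time order, so the test set is the most recent data
X_train, X_test = train_test_split(X, test_size=0.2, shuffle=False)

# Standardize, then keep enough components for 95% of the variance
reducer = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95, svd_solver="full"),
)
X_train_reduced = reducer.fit_transform(X_train)
X_test_reduced = reducer.transform(X_test)

pca = reducer.named_steps["pca"]
print("Components kept:", pca.n_components_)
print("Variance captured:", pca.explained_variance_ratio_.sum())
```

Fitting the scaler and PCA on the training portion only keeps information from later periods out of the transformation, which matters for time-ordered data such as stock prices.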

In [8]:
"""Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the 
values to a range of -1 to 1."""

# ans)

import numpy as np

# Original data
data = np.array([1, 5, 10, 15, 20])

# Calculate min and max
X_min = np.min(data)
X_max = np.max(data)

# Apply Min-Max scaling to range [-1, 1]
data_scaled = 2 * (data - X_min) / (X_max - X_min) - 1

print("Original Data:", data)
print("Scaled Data:", data_scaled)


Original Data: [ 1  5 10 15 20]
Scaled Data: [-1.         -0.57894737 -0.05263158  0.47368421  1.        ]
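
The calculation above is a special case of the general formula X_scaled = a + (X - X_min) * (b - a) / (X_max - X_min) with a = -1 and b = 1. As a cross-check, the same result can be obtained from scikit-learn's `MinMaxScaler` through its `feature_range` parameter. This is a sketch for comparison, not a replacement for the manual solution.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# scikit-learn expects a 2-D array: one column, five rows
data = np.array([1, 5, 10, 15, 20], dtype=float).reshape(-1, 1)

scaler = MinMaxScaler(feature_range=(-1, 1))
data_scaled = scaler.fit_transform(data)

print(data_scaled.ravel())
```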


In [10]:
"""Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform 
Feature Extraction using PCA. How many principal components would you choose to retain, and why?"""

#Ans)
#Note: The sample values below are made up to illustrate the results

import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline

# Sample data
data = {
    'Height': [160, 175, 170, 180, 165],
    'Weight': [55, 80, 70, 85, 60],
    'Age': [25, 40, 35, 50, 30],
    'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
    'Blood Pressure': [120, 140, 130, 150, 125]
}

# Create DataFrame
df = pd.DataFrame(data)

# Preprocessing: standardize the numeric features and
# one-hot encode the categorical feature 'Gender'
preprocessor = ColumnTransformer(
    transformers=[
        ('num', StandardScaler(), ['Height', 'Weight', 'Age', 'Blood Pressure']),
        ('cat', OneHotEncoder(), ['Gender'])
    ])

# Fit and transform the data
df_processed = preprocessor.fit_transform(df)

# Apply PCA
pca = PCA()
df_pca = pca.fit_transform(df_processed)

# Determine the number of components to retain
explained_variance = pca.explained_variance_ratio_
cumulative_variance = np.cumsum(explained_variance)
print("Explained Variance Ratio of each component:", explained_variance)
print("Cumulative Variance Ratio:", cumulative_variance)

# Choose number of components to keep (e.g., 95% variance)
num_components = np.argmax(cumulative_variance >= 0.95) + 1
print(f"Number of components to keep for 95% variance: {num_components}")

# Transform data using selected components
pca = PCA(n_components=num_components)
df_pca_reduced = pca.fit_transform(df_processed)

print("\nPCA Transformed Data (Reduced Features):\n", df_pca_reduced)


Explained Variance Ratio of each component: [9.61182835e-01 3.38382324e-02 3.47182215e-03 1.50711077e-03
 4.03340266e-36]
Cumulative Variance Ratio: [0.96118283 0.99502107 0.99849289 1.         1.        ]
Number of components to keep for 95% variance: 1

PCA Transformed Data (Reduced Features):
 [[-2.74234777]
 [ 1.45656735]
 [-0.02328803]
 [ 3.00305113]
 [-1.69398269]]


Choosing the Number of Principal Components:

    Result: In this example the first principal component alone explains about 96.1% of the variance, so a single component is enough to meet the 95% threshold.

    Variance Threshold: Typically, we retain enough principal components to capture a significant portion of the total variance (e.g., 95% or 99%). This keeps the most critical information in the dataset while reducing dimensionality.

    Trade-off: Retaining too few components might lose important information, while retaining too many might not achieve significant dimensionality reduction.
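
To make the trade-off concrete, the sketch below reuses the explained variance ratios printed in the output above and counts how many components each threshold would require. The thresholds 90%, 95%, and 99% are illustrative choices.

```python
import numpy as np

# Explained variance ratios copied from the output above
explained_variance = np.array([9.61182835e-01, 3.38382324e-02,
                               3.47182215e-03, 1.50711077e-03,
                               4.03340266e-36])
cumulative_variance = np.cumsum(explained_variance)

# Smallest number of components whose cumulative variance meets each threshold
components_needed = {}
for threshold in (0.90, 0.95, 0.99):
    k = int(np.argmax(cumulative_variance >= threshold) + 1)
    components_needed[threshold] = k
    print(f"{threshold:.0%} variance -> {k} component(s)")
```

With these values, one component covers both the 90% and 95% thresholds, while 99% requires two.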