In [1]:
# Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its
# application.

Min-Max scaling, also known as Min-Max normalization, is a data preprocessing technique used to scale and transform features in a dataset to a specific range, typically between 0 and 1. The purpose of Min-Max scaling is to ensure that all features have the same scale, preventing some features from dominating others in machine learning algorithms that rely on feature magnitudes, such as gradient descent.

\[ X_{scaled} = \frac{X - X_{min}}{X_{max} - X_{min}} \]

where:
\(X_{min}\) is the minimum value of feature X in the dataset.
\(X_{max}\) is the maximum value of feature X in the dataset.

Here's an example to illustrate the application of Min-Max scaling:

Suppose you have a dataset of exam scores, where the scores range from 60 to 100. The goal is to scale these scores to a range between 0 and 1 using Min-Max scaling.

Let's scale a score of 80 using Min-Max scaling:

\[ X_{scaled} = \frac{80 - 60}{100 - 60} = \frac{20}{40} = 0.5 \]
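The worked example can be checked with a few lines of Python. This is a minimal sketch; the helper name `min_max_scale` is an assumption made for illustration and is not part of any library.

```python
def min_max_scale(x, x_min, x_max):
    """Scale x into [0, 1] given the feature's minimum and maximum."""
    if x_max == x_min:
        raise ValueError("x_max and x_min must differ")
    return (x - x_min) / (x_max - x_min)

# Exam scores range from 60 to 100; scale a score of 80.
scaled_score = min_max_scale(80, 60, 100)
print(scaled_score)  # 0.5
```

The lowest score (60) maps to 0 and the highest (100) maps to 1, so 80 lands exactly halfway.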

Benefits and Considerations of Min-Max Scaling:

Min-Max scaling transforms all features to the same scale, which is important for many machine learning algorithms, especially those based on distances or gradients.
It preserves the relative relationships between values in each feature, which can be useful when the actual values are important.
However, Min-Max scaling is sensitive to outliers: a single extreme value sets \(X_{min}\) or \(X_{max}\) and can compress most of the remaining values into a narrow range. In such cases, standardization (Z-score scaling) or a robust scaler based on the median and interquartile range may be preferred.
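To make the outlier sensitivity concrete, here is a small sketch with made-up exam scores. The data, including the artificial outlier of 1000, are hypothetical and only serve to show the compression effect.

```python
import numpy as np

# Hypothetical exam scores; the last value is an artificial outlier.
scores = np.array([60.0, 70.0, 80.0, 90.0, 100.0, 1000.0])

# Plain Min-Max scaling to [0, 1].
scaled = (scores - scores.min()) / (scores.max() - scores.min())
print(scaled.round(3))
```

With the outlier present, the five ordinary scores all end up below 0.05, so most of the [0, 1] range is left unused.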

In [2]:
# Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling?
# Provide an example to illustrate its application.

The **Unit Vector** technique in feature scaling, also known as vector normalization or L2 normalization, scales each feature vector (observation) so that it has a Euclidean length (L2 norm) of 1. After this transformation, every observation lies on the surface of a unit hypersphere.

The formula for Unit Vector scaling of a feature vector \(X\) is as follows:

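\[ X_{unit} = \frac{X}{\|X\|} \]

Where:
X = the original feature vector
||X|| = the Euclidean (L2) norm of X, that is, the square root of the sum of its squared components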


Unit Vector scaling ensures that each feature vector has a constant magnitude (length) of 1 while preserving the direction of the original vector. This scaling technique is particularly useful in scenarios where the direction of the feature vectors is more important than their magnitudes.
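As a short illustration, the sketch below normalizes a single observation with NumPy. The vector `[3, 4]` is a hypothetical example chosen because its norm is a round number.

```python
import numpy as np

# Hypothetical observation with two features.
x = np.array([3.0, 4.0])

l2_norm = np.linalg.norm(x)   # sqrt(3^2 + 4^2) = 5.0
x_unit = x / l2_norm          # [0.6, 0.8]

print(l2_norm, x_unit, np.linalg.norm(x_unit))
```

The scaled vector points in the same direction as the original, but its length is now 1.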


**Difference from Min-Max Scaling:**

The key difference between Unit Vector scaling and Min-Max scaling is the scaling approach:

- **Unit Vector Scaling:** Unit Vector scaling normalizes each feature vector to have a constant length of 1 while preserving direction.
  
- **Min-Max Scaling:** Min-Max scaling scales features to a specific range (typically between 0 and 1) while preserving the relative relationships between values in each feature.

Unit Vector scaling operates on each observation (row): it changes the magnitude of the values but keeps their proportions to each other, so the direction of the vector is unchanged. Min-Max scaling operates on each feature (column) and maps its values to a predefined range, which can be any specified interval. The choice between these scaling techniques depends on the specific requirements of the machine learning problem and the nature of the data.
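The row-wise versus column-wise distinction can be sketched with scikit-learn's `Normalizer` and `MinMaxScaler`. The small array below is hypothetical; its first two rows point in the same direction but have different lengths.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, Normalizer

# Hypothetical data: 3 observations (rows) x 2 features (columns).
X = np.array([[3.0, 4.0],
              [6.0, 8.0],
              [1.0, 0.0]])

X_unit = Normalizer(norm="l2").fit_transform(X)   # each row is rescaled to length 1
X_minmax = MinMaxScaler().fit_transform(X)        # each column is rescaled to [0, 1]

print(np.linalg.norm(X_unit, axis=1))              # row lengths after unit scaling
print(X_minmax.min(axis=0), X_minmax.max(axis=0))  # column ranges after Min-Max
```

After unit vector scaling the first two rows become identical, because only direction survives. After Min-Max scaling they stay distinct, because each column is rescaled independently.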

In [4]:
#Q3 What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an
# example to illustrate its application.

**Principal Component Analysis (PCA)** is a dimensionality reduction technique commonly used in data analysis and machine learning. It is primarily used to reduce the number of features (dimensions) in a dataset while preserving as much of the original variance as possible. PCA achieves this by transforming the data into a new coordinate system defined by its principal components, which are linear combinations of the original features. The principal components are ordered in terms of their ability to explain the variance in the data, with the first principal component capturing the most variance.

Here's how PCA works:

1. **Centering the Data:** PCA starts by centering the data, which means subtracting the mean of each feature from the data points. This ensures that the transformed data has a mean of zero.

2. **Calculating Covariance Matrix:** PCA calculates the covariance matrix of the centered data. The covariance matrix describes the relationships between pairs of features.

3. **Eigenvalue Decomposition:** PCA performs an eigenvalue decomposition (or singular value decomposition) of the covariance matrix to obtain the eigenvectors and eigenvalues. The eigenvectors are the principal components, and the eigenvalues represent the amount of variance explained by each principal component.

4. **Selecting Principal Components:** The principal components are sorted in descending order of their corresponding eigenvalues. Typically, you select the top \(k\) principal components that explain most of the variance, where \(k\) is the desired reduced dimensionality.

5. **Projecting Data:** Finally, PCA projects the data onto the selected principal components to obtain a reduced-dimensional representation of the data.
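The five steps can be sketched directly in NumPy. This is an illustrative implementation under simple assumptions, not how any particular library does it internally; the six-sample dataset is hypothetical.

```python
import numpy as np

# Hypothetical data: 6 samples, 2 correlated features.
X = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9],
              [1.9, 2.2], [3.1, 3.0], [2.3, 2.7]])

# 1. Center each feature.
X_centered = X - X.mean(axis=0)

# 2. Covariance matrix of the features (columns).
cov = np.cov(X_centered, rowvar=False)

# 3. Eigen-decomposition; eigh is meant for symmetric matrices
#    and returns eigenvalues in ascending order.
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# 4. Sort in descending order of eigenvalue and keep the top k components.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
k = 1
components = eigenvectors[:, :k]

# 5. Project the centered data onto the selected components.
X_reduced = X_centered @ components

explained_ratio = eigenvalues / eigenvalues.sum()
print(X_reduced.shape, explained_ratio)
```

The variance of the projected data along the first component equals the largest eigenvalue, which is why eigenvalues are read as "variance explained".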

Here's an example to illustrate the application of PCA:

Suppose you have a dataset with two features, "Height" and "Weight," and you want to reduce the dimensionality of the data using PCA. The goal is to represent the data in one dimension (from 2D to 1D) while preserving as much variance as possible.

- **Original Data (2D):**
  - Sample 1: [170 cm, 68 kg]
  - Sample 2: [160 cm, 55 kg]
  - Sample 3: [175 cm, 75 kg]
  - ...

After centering the data, calculating the covariance matrix, and performing eigenvalue decomposition, you find that the first principal component (PC1) explains 95% of the variance, and the second principal component (PC2) explains only 5% of the variance.

You decide to reduce the data to one dimension by selecting only PC1. Now, you project the original data onto PC1:

- **Reduced Data (1D):**
  - Sample 1: [5.31] (projected value)
  - Sample 2: [-1.12] (projected value)
  - Sample 3: [8.97] (projected value)
  - ...

The reduced data has only one feature, which is a linear combination of the original "Height" and "Weight" features. While dimensionality is reduced, much of the original variance is still captured in this reduced representation.

PCA is widely used in various applications, including image compression, data visualization, and noise reduction, as it allows you to reduce the dimensionality of complex datasets while preserving essential information.

In [5]:
# Q4 What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature
# Extraction? Provide an example to illustrate this concept.

**PCA (Principal Component Analysis)** and **feature extraction** are closely related concepts in the context of dimensionality reduction: feature extraction is the broader category, and PCA is one specific technique within it.

**PCA** is primarily a dimensionality reduction technique that transforms the original features into a new set of orthogonal features (principal components) while preserving as much of the original variance as possible. The principal components are linear combinations of the original features and are ranked by their ability to explain the variance in the data.

**Feature extraction**, on the other hand, is a broader concept that involves creating new features from the original ones to capture specific information or patterns in the data. It's not limited to dimensionality reduction but can also be used for enhancing certain aspects of the data representation, feature engineering, or creating informative representations for machine learning tasks.

Now, let's discuss how PCA can be used for feature extraction:

**Feature Extraction using PCA:**

PCA can be employed as a feature extraction technique when you are interested in creating a more compact and informative feature set while maintaining as much useful information as possible. Here's how it works:

1. **Data Preprocessing:** Start with your original dataset containing \(n\) samples and \(m\) original features.

2. **Center the Data:** Subtract the mean of each feature to center the data.

3. **Perform PCA:** Apply PCA to the centered data to obtain the principal components (PCs).

4. **Select Components:** Choose a subset of the top \(k\) principal components (where \(k < m\)) based on their corresponding eigenvalues or the amount of variance they explain. These selected PCs will serve as the new features.

5. **Create the Transformed Dataset:** Project the centered data onto the selected PCs to create a new dataset with \(k\) features, which are linear combinations of the original features.

6. **Use the Transformed Features:** The transformed dataset with \(k\) features can be used for further analysis, machine learning, or other tasks.

**Example:**

Let's consider an example with a dataset of images, where each image is represented by a high-dimensional vector of pixel values. You want to extract informative features from these images for a classification task while reducing the dimensionality.

1. **Original Data:** Images represented by pixel values (e.g., 1000-dimensional vectors).

2. **Center the Data:** Subtract the mean pixel vector from each image to center the data.

3. **PCA:** Apply PCA to the centered data to obtain principal components.

4. **Select Components:** Choose the top \(k\) principal components (e.g., 50) based on their explained variance.

5. **Create Transformed Dataset:** Project each centered image onto the selected principal components to create a new dataset with \(k\) features.

6. **Use Transformed Features:** Train a machine learning model on the reduced-dimensional dataset with \(k\) features for image classification.

In this example, PCA was used for feature extraction to create a more compact and informative representation of the images while reducing dimensionality. The extracted features can then be used as inputs to a machine learning model for classification or other tasks.
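A compact sketch of this workflow is shown below. It makes several assumptions that are not in the example above: it uses scikit-learn's bundled digits dataset (64 pixel features per image rather than 1000), keeps `k = 20` components, and uses logistic regression as the classifier. `PCA` centers the data internally, so no separate centering step appears.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Stand-in image data: 8x8 digit images flattened to 64 pixel features.
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

k = 20  # number of principal components to keep (an arbitrary choice here)
model = make_pipeline(PCA(n_components=k), LogisticRegression(max_iter=2000))
model.fit(X_train, y_train)

# The extracted features are the projections onto the k components.
X_test_reduced = model.named_steps["pca"].transform(X_test)
print(X_test_reduced.shape)
print(model.score(X_test, y_test))
```

Putting PCA inside the pipeline means the components are learned from the training split only and then reused unchanged on the test split.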

Overall, PCA can be a valuable tool for feature extraction when you want to reduce dimensionality while preserving important information in your data.

In [6]:
# Q5 You are working on a project to build a recommendation system for a food delivery service. The dataset
# contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to
# preprocess the data.

In [7]:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Sample dataset (replace this with your actual dataset)
data = {
    'Price': [10, 20, 30, 15, 25],
    'Rating': [4.2, 4.5, 3.9, 4.8, 4.0],
    'DeliveryTime': [30, 45, 60, 40, 55]
}

# Create a DataFrame
df = pd.DataFrame(data)

# Initialize the Min-Max scaler
scaler = MinMaxScaler()

# Specify the columns you want to scale (in this case, all columns)
columns_to_scale = df.columns

# Apply Min-Max scaling to the selected columns
df[columns_to_scale] = scaler.fit_transform(df[columns_to_scale])

# Display the scaled DataFrame
print(df)


   Price    Rating  DeliveryTime
0   0.00  0.333333      0.000000
1   0.50  0.666667      0.500000
2   1.00  0.000000      1.000000
3   0.25  1.000000      0.333333
4   0.75  0.111111      0.833333


In [8]:
# Q6. You are working on a project to build a model to predict stock prices. The dataset contains many
# features, such as company financial data and market trends. Explain how you would use PCA to reduce the
# dimensionality of the dataset.

In [None]:
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load your stock price dataset (replace with your dataset)
# Assume your dataset has features like company financial data and market trends.
# For demonstration, we'll generate a sample dataset.
data = {
    'Feature1': [100, 200, 300, 400],
    'Feature2': [50, 75, 125, 150],
    'Feature3': [5, 10, 15, 20],
    'StockPrice': [10.5, 20.3, 30.1, 40.2]
}

# Create a DataFrame
df = pd.DataFrame(data)

# Separate the target variable (StockPrice)
X = df.drop(columns=['StockPrice'])
y = df['StockPrice']

# Standardize the features (recommended: PCA is variance-based, so unscaled
# features with large ranges would otherwise dominate the components)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Apply PCA for dimensionality reduction
pca = PCA(n_components=2)  # Choose the number of components to retain
X_pca = pca.fit_transform(X_scaled)
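
Rather than fixing the number of components in advance, one option is to let PCA keep as many components as needed to reach a target share of the variance. The sketch below assumes a 95% target and uses randomly generated data as a hypothetical stand-in for real financial features. For a real forecasting task, the scaler and PCA would normally be fitted on the training period only.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical stand-in for financial features: 200 rows, 10 columns
# driven by 3 latent factors plus a little noise.
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(200, 10))

X_scaled = StandardScaler().fit_transform(X)

# A float in (0, 1) asks PCA to keep enough components to explain
# at least that fraction of the total variance.
pca = PCA(n_components=0.95)
X_pca = pca.fit_transform(X_scaled)

print(pca.n_components_)
print(pca.explained_variance_ratio_.cumsum())
```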

In [9]:
# Q7 For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the
# values to a range of -1 to 1.

In [18]:
import pandas as pd
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X=np.array([1, 5, 10, 15, 20]).reshape(-1, 1)

min_max=MinMaxScaler(feature_range=(-1,1))

min_max.fit_transform(X)

array([[-1.        ],
       [-0.57894737],
       [-0.05263158],
       [ 0.47368421],
       [ 1.        ]])

In [19]:
X

array([[ 1],
       [ 5],
       [10],
       [15],
       [20]])
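
The same result can be reproduced by hand with the generalized Min-Max formula for a target range [a, b], namely a + (X - X_min)(b - a) / (X_max - X_min). The sketch below is a plain NumPy check of the scaler output; the variable names `a` and `b` are chosen here for illustration.

```python
import numpy as np

X = np.array([1, 5, 10, 15, 20], dtype=float)
a, b = -1.0, 1.0  # target range

# Generalized Min-Max scaling to [a, b].
X_scaled = a + (X - X.min()) * (b - a) / (X.max() - X.min())
print(X_scaled)
```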

In [20]:
#Q8 For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform
# Feature Extraction using PCA. How many principal components would you choose to retain, and why?

In [21]:
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

In [22]:
data = np.array([
    [175, 70, 30, 1, 120],
    [160, 65, 28, 0, 130],
    [180, 80, 35, 1, 140],
    [165, 55, 25, 0, 110],
    [170, 75, 32, 1, 125]
])

In [26]:
df_new=pd.DataFrame(data,columns=['height','weight','age','gender','bp'])

In [30]:
scaler=StandardScaler()
data_scaled=scaler.fit_transform(df_new[['height','weight','age','gender','bp']])

In [31]:
data_scaled

array([[ 0.70710678,  0.11624764,  0.        ,  0.81649658, -0.5       ],
       [-1.41421356, -0.46499055, -0.58722022, -1.22474487,  0.5       ],
       [ 1.41421356,  1.27872403,  1.46805055,  0.81649658,  1.5       ],
       [-0.70710678, -1.62746694, -1.46805055, -1.22474487, -1.5       ],
       [ 0.        ,  0.69748583,  0.58722022,  0.81649658,  0.        ]])

In [32]:
pca = PCA(n_components=5)

In [33]:
data_pca = pca.fit_transform(data_scaled)

In [34]:
explained_variance_ratio = pca.explained_variance_ratio_

In [35]:
explained_variance_ratio

array([8.10494518e-01, 1.52749455e-01, 3.45256632e-02, 2.23036442e-03,
       9.64468529e-35])

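We would retain the first two principal components. Together they explain about 96.3% of the variance (81.0% + 15.3%); the third adds roughly 3.5%, the fourth about 0.2%, and the fifth is effectively zero because five samples can yield at most four components with non-zero variance. Note that each principal component is a linear combination of all five standardized features, so retaining components is not the same as selecting original columns such as 'height' or 'weight'.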
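
One common way to pick the number of components is to look at the cumulative explained variance and keep the smallest number that reaches a chosen threshold. The sketch below reuses the ratios printed above; the 95% threshold is an assumption, and a different threshold would give a different answer.

```python
import numpy as np

# Explained variance ratios from the PCA fit above.
explained_variance_ratio = np.array([
    8.10494518e-01, 1.52749455e-01, 3.45256632e-02,
    2.23036442e-03, 9.64468529e-35,
])

cumulative = np.cumsum(explained_variance_ratio)

threshold = 0.95  # assumed target share of variance
# Index of the first component at which the cumulative share reaches the threshold.
n_keep = int(np.argmax(cumulative >= threshold)) + 1

print(cumulative.round(4))
print(n_keep)  # 2
```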