## Q1. What is Min-Max scaling, and how is it used in data preprocessing? Provide an example to illustrate its application.

**Min-Max scaling** is a data preprocessing technique that scales and translates each feature individually to a given range. The target range is most often [0, 1], with each feature's minimum and maximum estimated from the training set¹. The transformation is given by the following formula:

\[
X_{\text{std}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}}
\]

\[
X_{\text{scaled}} = X_{\text{std}} \times (\text{max} - \text{min}) + \text{min}
\]

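where \(X_{\text{min}}\) and \(X_{\text{max}}\) are the minimum and maximum values of the feature, respectively, and \(\text{min}\) and \(\text{max}\) are the lower and upper bounds of the target range (0 and 1 by default)¹.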

This technique is often used as an alternative to zero mean, unit variance scaling¹. It does not reduce the effect of outliers but linearly scales them down into a fixed range, where the largest occurring data point corresponds to the maximum value, and the smallest one corresponds to the minimum value¹.
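The point about outliers can be seen in a small sketch. The values below are made up for illustration (one feature with a single extreme value); they are not part of the original example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical single feature with one extreme value (100).
values = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])

scaled = MinMaxScaler().fit_transform(values)

# The outlier maps to the top of the range; the other points are squeezed near 0.
print(scaled.ravel())
```

The largest value lands at 1 and the smallest at 0, while the four ordinary points end up packed into a narrow band close to 0, which is why Min-Max scaling is sensitive to outliers.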


In [4]:

# Here's an example to illustrate its application:

from sklearn.preprocessing import MinMaxScaler

data = [[-1, 2], [-0.5, 6], [0, 10], [1, 18]]
scaler = MinMaxScaler()
print(scaler.fit(data))
print(scaler.transform(data))


MinMaxScaler()
[[0.   0.  ]
 [0.25 0.25]
 [0.5  0.5 ]
 [1.   1.  ]]
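As a cross-check, the two-step formula can be applied by hand with NumPy. This is a minimal sketch; the helper name `min_max_scale` is hypothetical and not part of scikit-learn.

```python
import numpy as np

data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]], dtype=float)

def min_max_scale(X, feature_range=(0.0, 1.0)):
    """Apply the two-step Min-Max formula column by column."""
    lo, hi = feature_range
    X_min = X.min(axis=0)
    X_max = X.max(axis=0)
    # Assumes X_max > X_min in every column (no constant features).
    X_std = (X - X_min) / (X_max - X_min)
    return X_std * (hi - lo) + lo

manual = min_max_scale(data)
print(manual)
```

For this data the hand-computed values should agree with the `MinMaxScaler` output shown above.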


## Q2. What is the Unit Vector technique in feature scaling, and how does it differ from Min-Max scaling? Provide an example to illustrate its application.

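The **Unit Vector technique** in feature scaling rescales each observation (each row, treated as a vector of feature values) to have a **unit norm**. Each vector is divided by its norm, which can be either the **Manhattan distance** (L1 norm) or the **Euclidean distance** (L2 norm) of the vector².

When scaling to the unit norm, each observation vector is divided by its norm, resulting in a vector with a length of one². This differs from Min-Max scaling, which works on each feature (column) separately and uses only that feature's minimum and maximum. Min-Max scaling keeps the relative spacing of values within a feature, whereas unit-vector scaling keeps only the direction of each observation and discards its magnitude. This technique is useful when dealing with features that have hard boundaries, such as image data, where colors can range from 0 to 255⁴.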



In [5]:
from sklearn.preprocessing import Normalizer

data = [[-1, 2], [-0.5, 6], [0, 10], [1, 18]]
scaler = Normalizer(norm='l2')
print(scaler.fit_transform(data))

[[-0.4472136   0.89442719]
 [-0.08304548  0.99654576]
 [ 0.          1.        ]
 [ 0.05547002  0.99846035]]
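The same scaling can be reproduced by hand, which makes it clear that the norm is taken per row. This is a small sketch using `np.linalg.norm`; the variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import Normalizer

data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]], dtype=float)

# L2 norm of each row (sample), kept as a column so it broadcasts over the row.
row_norms = np.linalg.norm(data, ord=2, axis=1, keepdims=True)
manual = data / row_norms

# Every row now has length 1, and the result matches Normalizer.
print(np.linalg.norm(manual, axis=1))
print(np.allclose(manual, Normalizer(norm="l2").fit_transform(data)))
```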


## Q3. What is PCA (Principal Component Analysis), and how is it used in dimensionality reduction? Provide an example to illustrate its application.

**Principal Component Analysis (PCA)** is a statistical technique used for **dimensionality reduction**. It identifies a set of orthogonal axes, called **principal components**, that capture the maximum variance in the data¹. The principal components are linear combinations of the original variables in the dataset and are ordered by the amount of variance they explain, from most to least¹.

PCA is widely used in exploratory data analysis and machine learning for predictive models¹. It helps reduce the dimensionality of a dataset while preserving the most important patterns or relationships between the variables without any prior knowledge of the target variables¹. By reducing the number of input features, PCA can address issues such as overfitting, increased computation time, and reduced accuracy caused by the curse of dimensionality¹.


In [6]:

from sklearn.decomposition import PCA

data = [[-1, 2], [-0.5, 6], [0, 10], [1, 18]]
pca = PCA(n_components=1)
print(pca.fit_transform(data))


[[-7.05447553]
 [-3.02334666]
 [ 1.00778222]
 [ 9.07003997]]
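To see how much information the single component keeps, the fitted model's `components_` and `explained_variance_ratio_` attributes can be inspected. This is a short sketch on the same four points; note that these points happen to lie on a straight line (the second feature equals 8 times the first plus 10), so one component is expected to capture essentially all of the variance.

```python
from sklearn.decomposition import PCA

data = [[-1, 2], [-0.5, 6], [0, 10], [1, 18]]
pca = PCA(n_components=1).fit(data)

# Direction of the first principal component in the original feature space.
print(pca.components_)
# Fraction of the total variance captured by that component.
print(pca.explained_variance_ratio_)
```

On real data the ratio is usually well below 1 for a single component, which is what makes the choice of the number of components a trade-off.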


## Q4. What is the relationship between PCA and Feature Extraction, and how can PCA be used for Feature Extraction? Provide an example to illustrate this concept.

**Principal Component Analysis (PCA)** is a statistical technique used for **dimensionality reduction**¹. It identifies a set of orthogonal axes, called **principal components**, that capture the maximum variance in the data¹. PCA is widely used in exploratory data analysis and machine learning for predictive models¹.

Feature extraction is a process of **dimensionality reduction** where an initial set of raw data is reduced to more manageable groups for processing³. PCA is one of the most popular dimensionality reduction techniques used for feature extraction¹. It aims to reduce the number of input features while retaining as much of the original information as possible¹.

The relationship between PCA and feature extraction lies in the fact that PCA can be used as a technique for feature extraction¹. By transforming the original features into a new set of variables, smaller than the original set, PCA retains most of the sample's information and is useful for regression and classification tasks¹.


In [7]:

from sklearn.decomposition import PCA

data = [[-1, 2], [-0.5, 6], [0, 10], [1, 18]]
pca = PCA(n_components=1)
print(pca.fit_transform(data))

[[-7.05447553]
 [-3.02334666]
 [ 1.00778222]
 [ 9.07003997]]
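The extracted feature is a linear combination of the centered original features, and it can be mapped back with `inverse_transform`. The sketch below checks both facts on the same data; the variable names are illustrative.

```python
import numpy as np
from sklearn.decomposition import PCA

data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]], dtype=float)
pca = PCA(n_components=1)
extracted = pca.fit_transform(data)  # shape (4, 1): one extracted feature

# The extracted feature is the centered data projected onto components_.
by_hand = (data - pca.mean_) @ pca.components_.T

# Map the single feature back to the original two-feature space.
reconstructed = pca.inverse_transform(extracted)
print(np.allclose(extracted, by_hand))
print(np.abs(data - reconstructed).max())
```

Because these four points are collinear, the reconstruction error is close to zero here; with real data, the error reflects the variance carried by the discarded components.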


## Q5. You are working on a project to build a recommendation system for a food delivery service. The dataset contains features such as price, rating, and delivery time. Explain how you would use Min-Max scaling to preprocess the data.

To preprocess the dataset for building a recommendation system for a food delivery service, you can use **Min-Max scaling** to transform the features such as price, rating, and delivery time to a specific range¹².

Min-Max scaling is a technique that scales and translates each feature individually to a given range, often between zero and one¹. It can be performed using the `MinMaxScaler` class from the scikit-learn library¹. The transformation is given by the following formula:

\[
X_{\text{std}} = \frac{X - X_{\text{min}}}{X_{\text{max}} - X_{\text{min}}}
\]

\[
X_{\text{scaled}} = X_{\text{std}} \times (\text{max} - \text{min}) + \text{min}
\]

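where \(X_{\text{min}}\) and \(X_{\text{max}}\) are the minimum and maximum values of the feature, respectively, and \(\text{min}\) and \(\text{max}\) are the lower and upper bounds of the target range (0 and 1 by default)¹.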

By applying Min-Max scaling to the dataset, you can ensure that each feature is transformed to a range between zero and one¹. This normalization process can help prevent features with larger values from dominating the recommendation system's calculations¹.


In [8]:

# Here's an example of how you can use Min-Max scaling with the scikit-learn library:
from sklearn.preprocessing import MinMaxScaler

# Assuming you have a dataset with features: price, rating, and delivery time
data = [[10.0, 4.5, 30], [15.0, 3.8, 45], [8.0, 4.2, 20], [12.0, 4.7, 35]]

# Create an instance of MinMaxScaler
scaler = MinMaxScaler()

# Fit the scaler on the data
scaler.fit(data)

# Transform the data using Min-Max scaling
scaled_data = scaler.transform(data)

print(scaled_data)

[[0.28571429 0.77777778 0.4       ]
 [1.         0.         1.        ]
 [0.         0.44444444 0.        ]
 [0.57142857 1.         0.6       ]]
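One way to see why scaling keeps large-valued features from dominating is to look at how much each feature contributes to the squared Euclidean distance between two restaurants, before and after scaling. This is an illustrative sketch; the helper `squared_diff_share` is hypothetical, and whether a recommender uses Euclidean distance at all is an assumption.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Same rows as above: price, rating, delivery time.
data = np.array([[10.0, 4.5, 30], [15.0, 3.8, 45], [8.0, 4.2, 20], [12.0, 4.7, 35]])
scaled = MinMaxScaler().fit_transform(data)

def squared_diff_share(a, b):
    """Share of the squared Euclidean distance contributed by each feature."""
    sq = (a - b) ** 2
    return sq / sq.sum()

raw_share = squared_diff_share(data[0], data[1])
scaled_share = squared_diff_share(scaled[0], scaled[1])
print(raw_share.round(3))     # delivery time accounts for most of the distance
print(scaled_share.round(3))  # contributions are of comparable size
```

On the raw values, delivery time accounts for roughly 90% of the distance between the first two rows simply because it is measured in larger numbers; after scaling, no single feature contributes more than half.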


## Q6. You are working on a project to build a model to predict stock prices. The dataset contains many features, such as company financial data and market trends. Explain how you would use PCA to reduce the dimensionality of the dataset.

**Principal Component Analysis (PCA)** is a statistical technique used for **dimensionality reduction**¹. It identifies a set of orthogonal axes, called **principal components**, that capture the maximum variance in the data¹. PCA is widely used in exploratory data analysis and machine learning for predictive models¹.

When working on a project to predict stock prices, you can use PCA to reduce the dimensionality of the dataset containing features such as company financial data and market trends¹.

The process of using PCA for dimensionality reduction involves the following steps:

1. **Data Preprocessing**: Ensure that the dataset is properly preprocessed by handling missing values, normalizing or standardizing features, and addressing any other data quality issues.

2. **Feature Selection**: Identify the relevant features from the dataset that are most likely to contribute to predicting stock prices. This step helps reduce computational complexity and focuses on the most informative features.

3. **Applying PCA**: Apply PCA to the selected features to reduce their dimensionality while retaining most of the original information. PCA transforms the original features into a new set of uncorrelated variables called principal components¹. These principal components are ordered in decreasing order of importance, with the first component capturing the most variance in the data.

4. **Determining the Number of Components**: Determine the number of principal components to retain based on a trade-off between computational efficiency and information loss. You can consider using metrics such as explained variance ratio or scree plot analysis to make an informed decision¹.

5. **Model Training**: Train your predictive model using the reduced-dimensional dataset obtained after applying PCA. The reduced dataset should contain a subset of principal components that capture most of the variance in the original dataset.
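Steps 1, 3 and 4 above can be sketched as a small scikit-learn pipeline. The feature matrix below is synthetic (random numbers standing in for financial and market features), and the 95% variance target is an assumed threshold, not a recommendation for real stock data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a feature matrix (200 rows, 10 features); not real market data.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
X = latent @ rng.normal(size=(3, 10)) + 0.1 * rng.normal(size=(200, 10))

# Standardize, then keep just enough components to explain 95% of the variance.
reducer = make_pipeline(StandardScaler(), PCA(n_components=0.95))
X_reduced = reducer.fit_transform(X)

pca = reducer.named_steps["pca"]
print(X_reduced.shape)
print(pca.explained_variance_ratio_.cumsum())
```

Passing a float between 0 and 1 as `n_components` asks `PCA` to pick the number of components from the explained variance, so the number of retained components does not have to be fixed in advance.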

In [10]:
# Here's an example illustrating how you can use PCA for dimensionality reduction in Python:
import numpy as np
from sklearn.decomposition import PCA

# Placeholder data standing in for a real dataset: 100 samples, n = 6 features
rng = np.random.default_rng(0)
data = rng.normal(size=(100, 6))

# Create an instance of PCA with the desired number of components (k = 2 here, chosen arbitrarily)
k = 2
pca = PCA(n_components=k)

# Fit PCA on your dataset
pca.fit(data)

# Transform your dataset using PCA
reduced_data = pca.transform(data)

print(reduced_data.shape)

(100, 2)

## Q7. For a dataset containing the following values: [1, 5, 10, 15, 20], perform Min-Max scaling to transform the values to a range of -1 to 1.

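In [25]:
from sklearn.preprocessing import MinMaxScaler

# One feature with five samples: each value goes in its own row
data = [[1], [5], [10], [15], [20]]

# feature_range sets the target range (the default is (0, 1))
min_max = MinMaxScaler(feature_range=(-1, 1))

min_max.fit_transform(data)

array([[-1.        ],
       [-0.57894737],
       [-0.05263158],
       [ 0.47368421],
       [ 1.        ]])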
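As a cross-check, the Min-Max formula from Q1 can be applied by hand with the target bounds set to -1 and 1. This is a minimal NumPy sketch; the variable names are illustrative.

```python
import numpy as np

values = np.array([1, 5, 10, 15, 20], dtype=float)
lo, hi = -1.0, 1.0  # target range from the question

# Step 1: rescale to [0, 1]; step 2: stretch and shift to [lo, hi].
x_std = (values - values.min()) / (values.max() - values.min())
x_scaled = x_std * (hi - lo) + lo
print(x_scaled.round(4))
```

The smallest value (1) maps to -1, the largest (20) maps to 1, and the values in between come out as approximately -0.5789, -0.0526 and 0.4737.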

## Q8. For a dataset containing the following features: [height, weight, age, gender, blood pressure], perform Feature Extraction using PCA. How many principal components would you choose to retain, and why?

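When performing **Feature Extraction using PCA** on a dataset with features such as height, weight, age, gender, and blood pressure, the number of principal components to retain depends on the desired trade-off between **dimensionality reduction** and **information preservation**¹. With five features there are at most five components, and because the features are measured in different units (and gender is categorical), they should be encoded numerically and standardized before PCA is applied. The exact number to keep has to be read from the data rather than fixed in advance.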

To determine the number of principal components to retain, you can consider the following factors:

1. **Explained Variance Ratio**: Calculate the explained variance ratio for each principal component. The explained variance ratio represents the proportion of the dataset's variance explained by each principal component. Retaining principal components with high explained variance ratios ensures that most of the original information is preserved¹.

2. **Cumulative Explained Variance**: Plot the cumulative explained variance ratio against the number of principal components. This plot helps visualize how much of the dataset's variance is preserved as the number of principal components increases. You can choose a threshold for the cumulative explained variance (e.g., 90% or 95%) and select the smallest number of principal components whose cumulative explained variance reaches this threshold¹.

3. **Domain Knowledge**: Consider any domain-specific knowledge or requirements that may influence the choice of principal components to retain. For example, certain features may be more relevant or informative for your specific application, and you may want to prioritize their preservation¹.
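The first two factors above can be sketched as follows. The records are synthetic and made up purely for illustration (the relationships between the features are assumptions), gender is assumed to be encoded as 0/1, and the 95% threshold is one common choice among several.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic, made-up records: height (cm), weight (kg), age (years),
# gender (encoded 0/1), blood pressure (mmHg). Not real patient data.
rng = np.random.default_rng(42)
n = 300
gender = rng.integers(0, 2, size=n)
height = 162 + 12 * gender + rng.normal(0, 6, size=n)
weight = 0.9 * (height - 100) + rng.normal(0, 8, size=n)
age = rng.uniform(20, 70, size=n)
blood_pressure = 100 + 0.4 * age + 0.2 * weight + rng.normal(0, 8, size=n)
X = np.column_stack([height, weight, age, gender, blood_pressure])

# Standardize first so that no feature dominates because of its units.
X_std = StandardScaler().fit_transform(X)

pca = PCA().fit(X_std)  # keep all 5 components to inspect the variance
cumulative = np.cumsum(pca.explained_variance_ratio_)

threshold = 0.95  # assumed target; 0.90 is another common choice
k = int(np.searchsorted(cumulative, threshold) + 1)
print(cumulative.round(3))
print("components to retain:", k)
```

Here `k` is the smallest number of components whose cumulative explained variance reaches the threshold. On a real dataset the answer depends on how strongly the five features are correlated, so the printed value for this synthetic data should not be read as a general rule.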