#1

Min-Max scaling, also known as min-max normalization, is a data preprocessing technique that rescales each numerical feature in a dataset to a specific range, typically 0 to 1. This puts all features on a comparable scale, which matters for algorithms that are sensitive to feature magnitude, such as k-nearest neighbors and models trained by gradient descent.

In [2]:
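# Example of applying Min-Max scaling with scikit-learn's MinMaxScaler
from sklearn.preprocessing import MinMaxScaler

# Four samples (rows) with two features (columns)
data = [[-1, 2], [-0.5, 6], [0, 10], [1, 18]]

# By default, MinMaxScaler rescales each column to the range [0, 1]
scaler = MinMaxScaler()

# Learn the minimum and maximum of each column
scaler.fit(data)

# Apply (x - min) / (max - min) column by column
scaled_data = scaler.transform(data)

print(scaled_data)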


[[0.   0.  ]
 [0.25 0.25]
 [0.5  0.5 ]
 [1.   1.  ]]


#2

Unit Vector Technique:

In the Unit Vector technique, each observation's whole feature vector is rescaled to unit length. This is achieved by dividing each observation vector by its norm, either the L1 norm (Manhattan length) or the L2 norm (Euclidean length). This technique is particularly useful when dealing with features with hard boundaries.

Difference between Unit Vector Technique and Min-Max Scaling:

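Both techniques produce bounded values, but they do so in different ways. Min-Max scaling works on one feature (column) at a time and maps its values into [0, 1] using that feature's minimum and maximum. The Unit Vector technique works on one observation (row) at a time and divides it by its norm, so each component lies in [-1, 1], or in [0, 1] when the data is non-negative.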
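The column-wise versus row-wise distinction can be sketched in plain Python. This is a minimal illustration, not a library implementation: the helper names `min_max_scale_column` and `unit_vector` and the small `data` list are made up for this example, and constant columns or all-zero rows are not handled.

```python
import math

def min_max_scale_column(values):
    """Rescale one feature (a column) to [0, 1] using its own min and max."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def unit_vector(row, norm="l2"):
    """Rescale one observation (a row) so that its L1 or L2 norm equals 1."""
    if norm == "l1":
        length = sum(abs(v) for v in row)
    else:
        length = math.sqrt(sum(v * v for v in row))
    return [v / length for v in row]

# Hypothetical data: each row is an observation, each column is a feature.
data = [[3.0, 4.0], [6.0, 8.0], [1.0, 0.0]]

# Min-Max scaling: process the data column by column, then rebuild the rows.
scaled_columns = [min_max_scale_column(col) for col in zip(*data)]
min_max_rows = [list(row) for row in zip(*scaled_columns)]

# Unit Vector scaling: process the data row by row.
unit_rows = [unit_vector(row) for row in data]

print("Min-Max:    ", min_max_rows)
print("Unit vector:", unit_rows)
```

In this sketch the first two rows point in the same direction, so they map to the same unit vector even though Min-Max scaling keeps them apart.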

#3

Principal Component Analysis (PCA) is a statistical technique used for dimensionality reduction. It is often used to reduce the dimensionality of large datasets by transforming a large set of variables into a smaller one that still contains most of the information in the large set.

PCA works by identifying a set of orthogonal axes, called principal components, that capture the maximum variance in the data. The first principal component captures the most variation in the data, the second captures the maximum remaining variance in a direction orthogonal to the first, and so on.


For example, consider a dataset with many features. Training a model on data with a very large number of features (columns) relative to the number of samples can lead to overfitting, where the model starts fitting outliers and noise. This is where PCA comes in handy: it maps a higher-dimensional feature space to a lower-dimensional one while retaining as much of the original dataset's variance as possible.
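To make the idea concrete, here is a rough sketch of reducing three features to one. It assumes a tiny made-up dataset and finds only the first principal component, using power iteration on the covariance matrix as a stand-in for the full eigendecomposition a library would perform. All function names are hypothetical.

```python
import math

def center(data):
    """Subtract each column's mean so that every feature has mean 0."""
    n = len(data)
    means = [sum(col) / n for col in zip(*data)]
    return [[v - m for v, m in zip(row, means)] for row in data]

def covariance_matrix(centered):
    """Sample covariance matrix of data that has already been centered."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]

def first_principal_component(cov, iterations=200):
    """Approximate the top eigenvector of cov by power iteration."""
    v = [1.0] * len(cov)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov]
        length = math.sqrt(sum(x * x for x in w))
        v = [x / length for x in w]
    return v

# Hypothetical data: three features that rise and fall together.
data = [[2.0, 4.1, 1.0], [3.0, 6.2, 1.4], [4.0, 7.9, 2.1],
        [5.0, 10.1, 2.4], [6.0, 12.0, 3.1]]

centered = center(data)
cov = covariance_matrix(centered)
pc1 = first_principal_component(cov)

# Project every observation onto the first component: 3 features become 1.
projected = [sum(x * w for x, w in zip(row, pc1)) for row in centered]

# Share of the total variance retained by the single new feature.
total_var = sum(cov[i][i] for i in range(len(cov)))
pc1_var = sum(p * p for p in projected) / (len(projected) - 1)
explained = pc1_var / total_var

print("First principal component:", pc1)
print("Explained variance ratio:", explained)
```

Because the three made-up features are almost perfectly correlated, the single projected feature keeps nearly all of the variance. With less redundant data, more components would be needed.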

#4


### Relationship between PCA and Feature Extraction
Feature Extraction is a process of dimensionality reduction by which an initial set of raw data is reduced to more manageable groups for processing. PCA is a feature extraction technique that aims to reduce the number of input features while retaining as much of the original information as possible. It does so by choosing the lower-dimensional space in which the projected data has the maximum possible variance.

### How PCA is used for Feature Extraction
PCA uses an orthogonal transformation to convert a set of correlated variables to a set of uncorrelated variables. The main goal of PCA is to reduce the dimensionality of a dataset while preserving the most important patterns or relationships between the variables without any prior knowledge of the target variables. The new set of variables, smaller than the original set, retains most of the sample’s information, and is useful for regression and classification of data.
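The "correlated in, uncorrelated out" property can be checked on a two-feature toy case. The sketch below assumes made-up values for `x` and `y` and uses the closed-form rotation angle for a symmetric 2x2 covariance matrix instead of a general eigen-solver. The names `z1` and `z2` for the extracted features are illustrative only.

```python
import math

def covariance(xs, ys):
    """Sample covariance of two equally long lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / (n - 1)

# Hypothetical, strongly correlated features.
x = [2.0, 3.0, 4.0, 5.0, 6.0]
y = [4.1, 6.2, 7.9, 10.1, 12.0]

# Entries of the covariance matrix [[a, b], [b, c]].
a, b, c = covariance(x, x), covariance(x, y), covariance(y, y)

# For a symmetric 2x2 matrix, the direction of maximum variance makes
# this angle with the first axis.
theta = 0.5 * math.atan2(2 * b, a - c)
pc1 = (math.cos(theta), math.sin(theta))   # first principal axis
pc2 = (-math.sin(theta), math.cos(theta))  # orthogonal second axis

# Orthogonal transformation: center the data, then rotate onto the new axes.
mx, my = sum(x) / len(x), sum(y) / len(y)
z1 = [(xi - mx) * pc1[0] + (yi - my) * pc1[1] for xi, yi in zip(x, y)]
z2 = [(xi - mx) * pc2[0] + (yi - my) * pc2[1] for xi, yi in zip(x, y)]

print("cov(x, y)   =", covariance(x, y))
print("cov(z1, z2) =", covariance(z1, z2))
print("var(z1), var(z2) =", covariance(z1, z1), covariance(z2, z2))
```

The original features have a large covariance, while the covariance between `z1` and `z2` is zero up to floating-point error. Keeping only `z1` is the feature extraction step: one new variable stands in for the two original ones.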

#5

Min-Max scaling is a technique often used to normalize the range of independent variables or features of data. In this case, the features are price, rating, and delivery time. The goal of Min-Max scaling is to scale the features to a specific range, typically 0 to 1.

Here's how you can apply Min-Max scaling to each feature:

1. **Identify the minimum and maximum values of the feature**: For each feature, you need to find the minimum and maximum values in your dataset. For example, if you're looking at the 'price' feature, you would find the lowest and highest prices in your dataset.

2. **Apply the Min-Max scaling formula**: The formula for Min-Max scaling is:

    $$X_{new} = \frac{X - X_{min}}{X_{max} - X_{min}}$$

    where `X` is an original value, `X_new` is the new value after scaling, `X_min` is the minimum value in the feature column, and `X_max` is the maximum value.

3. **Replace original values with scaled values**: After calculating the new scaled value for each data point in your feature, replace the original values with these new ones. This will result in a distribution of values between 0 and 1.

For example, let's say we have a dataset with 'price' ranging from $5 to $50. If we want to scale a price of $20 using Min-Max scaling, we would substitute these values into our formula:

$$X_{new} = \frac{20 - 5}{50 - 5} = 0.333$$

So, the scaled price of $20 is approximately 0.333 when using Min-Max scaling.

By applying this process to all features in your dataset (price, rating, delivery time), you put every feature on the same 0 to 1 scale. This can help the recommendation system model, because features with larger numeric ranges no longer dominate those with smaller ranges simply because of their units.
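The three steps above can be sketched in plain Python as follows. The records in `dataset` are invented purely for illustration (only the $5 to $50 price range comes from the example above). The helper name `min_max_scale` and its handling of a constant column are assumptions of this sketch.

```python
def min_max_scale(values):
    """Apply X_new = (X - X_min) / (X_max - X_min) to one feature column."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        # Assumption of this sketch: a constant feature is mapped to 0
        # to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(x - x_min) / (x_max - x_min) for x in values]

# Hypothetical records; each list is one feature column.
dataset = {
    "price": [5.0, 20.0, 35.0, 50.0],
    "rating": [3.5, 4.0, 4.8, 2.9],
    "delivery_time": [15.0, 30.0, 45.0, 60.0],
}

# Steps 1 to 3: find each column's min and max, apply the formula,
# and store the scaled values in place of the originals.
scaled = {name: min_max_scale(column) for name, column in dataset.items()}

for name, column in scaled.items():
    print(name, [round(v, 3) for v in column])
```

The second price entry reproduces the worked example: $20 in a $5 to $50 range scales to roughly 0.333.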

In [9]:
#7
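from sklearn.preprocessing import MinMaxScaler

min_max = MinMaxScaler()
# MinMaxScaler scales each column independently, so the five values
# must form one column (5 samples, 1 feature), not a single row
min_max_scaling = min_max.fit_transform([[1], [5], [10], [15], [20]])
print(min_max_scaling)

[[0.        ]
 [0.21052632]
 [0.47368421]
 [0.73684211]
 [1.        ]]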
