In [None]:
Min-max scaling, often simply called "normalization", rescales each feature to a specified range, typically [0, 1].
The formula for min-max scaling is:

X_scaled = (X - X_min) / (X_max - X_min)

Where:

X is the original feature value
X_min is the minimum value of the feature
X_max is the maximum value of the feature
X_scaled is the scaled feature value, between 0 and 1
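
As a quick check of the formula, here is a minimal NumPy sketch that applies min-max scaling by hand. The helper name `min_max_scale` and the sample prices are illustrative assumptions, and mapping a constant feature to 0 is one possible convention rather than the only one.

```python
import numpy as np

def min_max_scale(x):
    """Rescale a 1-D array to [0, 1] using (x - min) / (max - min)."""
    x = np.asarray(x, dtype=float)
    x_min, x_max = x.min(), x.max()
    if x_max == x_min:
        # Assumption: a constant feature carries no spread, so map it to 0
        return np.zeros_like(x)
    return (x - x_min) / (x_max - x_min)

prices = np.array([10.0, 20.0, 15.0, 25.0])  # illustrative values
scaled_prices = min_max_scale(prices)
print(scaled_prices)
```

The smallest value maps to 0, the largest to 1, and everything else falls proportionally in between.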

In [None]:
The Unit Vector technique, also known as Vector Normalization or L2 Normalization, is another feature scaling method used in data preprocessing. Unlike min-max scaling, which rescales each feature to a range such as [0, 1], the Unit Vector technique rescales each feature vector so that its length (L2 norm) is 1.

The formula for the Unit Vector technique is as follows:

x_normalized = x / ||x||
Where:

x is the original feature vector
||x|| is the L2 norm (Euclidean norm) of the feature vector
x_normalized is the normalized feature vector
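
The same computation can be sketched in a few lines of NumPy. The helper name `l2_normalize` and the sample vector are illustrative assumptions; returning a zero vector unchanged is one way to avoid dividing by zero.

```python
import numpy as np

def l2_normalize(x):
    """Divide a vector by its Euclidean (L2) norm so the result has length 1."""
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x)  # sqrt(sum of squared components)
    if norm == 0:
        # Assumption: leave an all-zero vector as-is rather than divide by zero
        return x
    return x / norm

v = np.array([3.0, 4.0])       # illustrative vector with norm 5
unit_v = l2_normalize(v)
print(unit_v)                  # [0.6 0.8]
print(np.linalg.norm(unit_v))  # length of the normalized vector
```

The direction of the vector is preserved; only its length changes.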

In [None]:
Principal Component Analysis (PCA) is a widely used technique for dimensionality reduction in data preprocessing. It is a statistical procedure that transforms a set of possibly correlated variables into a set of linearly uncorrelated variables called principal components. The principal components are ordered such that the first principal component accounts for the maximum possible variance in the data, the second principal component accounts for the next highest variance, and so on.

The main goal of PCA is to reduce the dimensionality of the data while retaining as much information (variance) as possible. This is achieved by projecting the data onto a lower-dimensional subspace defined by the principal components. By selecting only the top principal components that account for a significant portion of the total variance, we can effectively reduce the number of dimensions while preserving the essential information in the data.
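
A minimal sketch of this idea with scikit-learn's `PCA` follows. The synthetic dataset (three features, two of them strongly correlated) and the choice of `n_components=2` are assumptions made purely for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumption): 100 samples, 3 features,
# where the third feature is almost a multiple of the first
rng = np.random.default_rng(0)
base = rng.normal(size=(100, 1))
X = np.hstack([
    base,
    rng.normal(size=(100, 1)),
    2 * base + 0.1 * rng.normal(size=(100, 1)),
])

# Keep only the top two principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # (100, 2)
print(pca.explained_variance_ratio_)  # share of variance per component, in descending order
```

Because two of the three features carry nearly the same information, two components should retain most of the total variance here; on real data, `explained_variance_ratio_` is the place to check how much is kept.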

In [None]:
PCA (Principal Component Analysis) is itself a feature extraction technique, and it is commonly used for that purpose in data preprocessing.

Feature extraction is the process of deriving new features from the original features in a dataset. These new features, often called "extracted features," aim to capture the most relevant information from the original data while reducing dimensionality and potentially improving the performance of machine learning models.

PCA is a powerful tool for feature extraction because it identifies the directions (principal components) along which the data exhibits the most variance. These principal components can be considered as new features that effectively summarize the information from the original features.
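
To make the "new features" concrete, the sketch below shows that each extracted feature is a weighted combination of the centered original features, with the weights stored in `components_`. The synthetic data and variable names are assumptions for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic data (assumption): two features driven by one latent factor, plus one independent feature
rng = np.random.default_rng(42)
latent = rng.normal(size=(200, 1))
X = np.hstack([
    latent + 0.05 * rng.normal(size=(200, 1)),
    -latent + 0.05 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

pca = PCA(n_components=2).fit(X)
extracted = pca.transform(X)  # the extracted features

# Each extracted feature is the centered data projected onto one principal direction
manual = (X - pca.mean_) @ pca.components_.T
print(np.allclose(extracted, manual))

# The extracted features are linearly uncorrelated with each other
corr = np.corrcoef(extracted, rowvar=False)
print(np.round(corr, 6))
```

Each row of `pca.components_` lists the weight given to every original feature, which helps when interpreting what an extracted feature represents.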

In [12]:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Create a sample food delivery dataset (replace with your actual dataset)
food_data = pd.DataFrame({
    'price': [10.0, 20.0, 15.0, 25.0],
    'rating': [4.5, 3.8, 4.2, 4.9],
    'delivery_time': [30, 45, 20, 60]
})

# Initialize the MinMaxScaler
scaler = MinMaxScaler()

# Fit and transform the features
scaled_features = scaler.fit_transform(food_data[['price', 'rating', 'delivery_time']])

# Create a new DataFrame with scaled features
scaled_df = pd.DataFrame(scaled_features, columns=['scaled_price', 'scaled_rating', 'scaled_delivery_time'])

# Concatenate the scaled features with the original dataset
final_df = pd.concat([food_data, scaled_df], axis=1)

# Display the final DataFrame
final_df


,price,rating,delivery_time,scaled_price,scaled_rating,scaled_delivery_time
0,10.0,4.5,30,0.0,0.636364,0.25
1,20.0,3.8,45,0.666667,0.0,0.625
2,15.0,4.2,20,0.333333,0.363636,0.0
3,25.0,4.9,60,1.0,1.0,1.0


In [18]:
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Create a DataFrame with the given values
data = pd.DataFrame({'values': [1, 5, 10, 15, 20]})

# Initialize the MinMaxScaler
scaler = MinMaxScaler(feature_range=(-1, 1))

# Fit and transform the data
scaled_values = scaler.fit_transform(data)

# Create a new DataFrame with the scaled values
scaled_df = pd.DataFrame(scaled_values, columns=['scaled_values'])

# Display the scaled DataFrame
scaled_df


,scaled_values
0,-1.0
1,-0.578947
2,-0.052632
3,0.473684
4,1.0
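
The scaled values for the range (-1, 1) can be reproduced by hand with the generalized formula new_min + (X - X_min) * (new_max - new_min) / (X_max - X_min). The sketch below is a cross-check under that assumption, not a description of scikit-learn's internals.

```python
import numpy as np

values = np.array([1, 5, 10, 15, 20], dtype=float)
new_min, new_max = -1.0, 1.0  # target range, matching feature_range=(-1, 1)

# Scale to [0, 1] first, then stretch and shift to [new_min, new_max]
unit_scaled = (values - values.min()) / (values.max() - values.min())
manual_scaled = new_min + unit_scaled * (new_max - new_min)

print(np.round(manual_scaled, 6))
```

For example, the value 5 maps to -1 + (5 - 1) * 2 / 19 ≈ -0.578947, which agrees with the second row of the output.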
