In [1]:
#1.

# Min-Max scaling is a data normalization technique used in data preprocessing.
# It rescales the data to a specified range, typically between 0 and 1 or -1 and 1.
# It is achieved by subtracting the minimum value and dividing by the range of the data.

# Formula can be given as:
# (value - min_value)/(max_value - min_value)

# where,
# value = value which is to be normalized
# min_value = minimum value of the feature
# max_value = maximum value of the feature

# For example, consider a dataset with features such as age (ranging from 20 to 60) and income (ranging from 20,000 to 100,000).
# By applying Min-Max scaling, both features can be transformed to a common range (e.g., 0 to 1).
# This scaling ensures that features with larger numeric ranges do not dominate the learning process.
# Consequently, the scaled data can improve the performance of machine learning algorithms that are sensitive to the scale of the input data.
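
# A minimal sketch of the formula above using NumPy.
# The sample values for age and income are hypothetical, chosen only to match the ranges mentioned in the example,
# and min_max_scale is an illustrative helper, not a library function.

```python
import numpy as np

# Hypothetical sample values spanning the ranges in the example above.
age = np.array([20, 30, 40, 60], dtype=float)
income = np.array([20_000, 40_000, 60_000, 100_000], dtype=float)


def min_max_scale(values):
    """Rescale a 1-D array to [0, 1] using (value - min) / (max - min)."""
    value_range = values.max() - values.min()
    if value_range == 0:
        # A constant feature has no spread; return zeros to avoid dividing by zero.
        return np.zeros_like(values)
    return (values - values.min()) / value_range


age_scaled = min_max_scale(age)
income_scaled = min_max_scale(income)

print(age_scaled)
print(income_scaled)
```

# Both features end up on the same 0 to 1 scale even though income was originally thousands of times larger than age.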

In [2]:
#2.

# The Unit Vector technique, or vector normalization, is a feature scaling method that rescales each sample to have a unit norm (length of 1).
# It divides each data point (feature vector) by its own Euclidean norm.
# This keeps each vector's direction unchanged while giving every vector the same length.
# Unlike Min-Max scaling, which maps each feature to a fixed range, Unit Vector scaling preserves the relative proportions of the features within a sample and discards the sample's overall magnitude.

# For example, in a dataset with features like age, income, and education level,
# Unit Vector scaling would normalize each sample's feature vector to unit length, so samples are compared by direction rather than by magnitude.
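
# A small sketch of unit-vector (L2) scaling with NumPy.
# The rows are hypothetical values for [age, income in thousands, years of education]; the second row is deliberately twice the first.

```python
import numpy as np

# Hypothetical samples: [age, income in thousands, years of education].
X = np.array([[25.0, 40.0, 12.0],
              [50.0, 80.0, 24.0],   # exactly 2x the first row
              [30.0, 20.0, 16.0]])

# Euclidean (L2) norm of each row, kept as a column so the division broadcasts row-wise.
norms = np.linalg.norm(X, axis=1, keepdims=True)
X_unit = X / norms

# Every scaled row now has length 1.
print(np.linalg.norm(X_unit, axis=1))
# Rows that differ only in magnitude map to the same unit vector.
print(np.allclose(X_unit[0], X_unit[1]))
```

# The same row-wise scaling is available in scikit-learn as sklearn.preprocessing.normalize(X, norm="l2").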

In [3]:
#3.

# Principal Component Analysis (PCA) is a dimensionality reduction technique used to transform high-dimensional data into a lower-dimensional space while retaining the most important information.
# It identifies the principal components, which are new orthogonal axes that capture the maximum variance in the data.
# These components are ordered by their significance, and the lower-dimensional representation can be achieved by selecting the top components.

# For example, given a dataset with multiple correlated features such as age, income, and education level,
# PCA can be used to extract the components that explain the majority of the variance in the data, reducing the dimensionality while preserving the most informative aspects.
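
# A minimal sketch of PCA with scikit-learn on synthetic data.
# The three columns are generated from one shared signal plus small noise, so they are strongly correlated by construction;
# they are stand-ins for illustration, not real age, income, or education data.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)

# One hidden signal drives all three synthetic features (assumption for illustration).
base = rng.normal(size=200)
X = np.column_stack([
    base + 0.1 * rng.normal(size=200),
    2 * base + 0.1 * rng.normal(size=200),
    -base + 0.1 * rng.normal(size=200),
])

# Keep the top two principal components.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X_reduced.shape)                # (samples, components kept)
print(pca.explained_variance_ratio_)  # share of variance captured by each component
```

# Because the features share one underlying signal, the first component captures almost all of the variance.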

In [4]:
#4.

# PCA plays a significant role in feature extraction.
# It transforms high-dimensional data into a lower-dimensional representation by constructing new features (the principal components), each a linear combination of the original features.
# By keeping only the top principal components, which capture the maximum variance in the data, PCA condenses most of the information in the original dataset into a few features.
# This reduced feature set can then be used for various tasks such as classification or clustering.

# For example, in face recognition, PCA can extract key facial features from images, representing each image with a lower-dimensional set of principal components
# that can be used for subsequent analysis or classification tasks.

In [5]:
#5.

# To preprocess the data for building a recommendation system, Min-Max scaling can be used on the features of price, rating, and delivery time.
# By applying Min-Max scaling, each feature's values will be transformed to a common range, typically between 0 and 1.
# This ensures that all features are on the same scale, preventing one feature from dominating the others.
# It allows for fair comparisons and enables the recommendation system to consider the relative importance of each feature when making recommendations.

# For example, a high price would not outweigh other important factors like rating and delivery time, resulting in a more balanced and effective recommendation system.
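
# A short sketch of this preprocessing step with scikit-learn's MinMaxScaler.
# The item rows are hypothetical values for [price, rating, delivery time in minutes].

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical items: [price, rating, delivery time in minutes].
items = np.array([[120.0, 4.5, 30.0],
                  [450.0, 3.8, 45.0],
                  [80.0, 4.9, 20.0],
                  [300.0, 4.1, 60.0]])

# The default feature_range is (0, 1); each column is scaled independently.
scaler = MinMaxScaler()
items_scaled = scaler.fit_transform(items)

print(items_scaled)
```

# In practice the scaler would be fit on training data only and then reused via scaler.transform on new items, so that new data is scaled with the same minimum and maximum.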

In [6]:
#6.

# To reduce the dimensionality of the dataset for predicting stock prices, PCA can be applied.
# The features are typically standardized first, since PCA is sensitive to the scale of each feature.
# Then, PCA is used to extract the most significant components that capture the maximum variance in the data.
# By selecting a lower number of principal components, the dimensionality is reduced while retaining the most informative aspects of the data.
# This helps in reducing noise and redundancy, improving computational efficiency, and mitigating the curse of dimensionality,
# ultimately aiding in building a more efficient and accurate stock price prediction model.
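
# A hedged sketch of this workflow as a scikit-learn pipeline.
# The data is synthetic: 20 made-up indicators driven by 3 hidden factors plus noise, standing in for real market features.
# Passing a float to n_components asks PCA to keep enough components to explain that fraction of the variance.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in: 300 days x 20 indicators built from 3 hidden factors (assumption).
factors = rng.normal(size=(300, 3))
loadings = rng.normal(size=(3, 20))
X = factors @ loadings + 0.05 * rng.normal(size=(300, 20))

# Standardize, then keep enough components to explain at least 95% of the variance.
reducer = make_pipeline(
    StandardScaler(),
    PCA(n_components=0.95, svd_solver="full"),
)
X_reduced = reducer.fit_transform(X)

pca = reducer.named_steps["pca"]
print(X_reduced.shape)
print(pca.explained_variance_ratio_.sum())
```

# For time-ordered data such as prices, the pipeline should be fit on the training period only and then applied to later dates, to avoid leaking future information into the components.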

In [7]:
#7.

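from sklearn.preprocessing import MinMaxScaler

# One feature, five samples; scikit-learn expects a 2-D array (samples x features).
data = [[1], [5], [10], [15], [20]]

# Rescale to the range [-1, 1] instead of the default [0, 1].
min_max = MinMaxScaler(feature_range=(-1, 1))
min_max.fit_transform(data)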

array([[-1.        ],
       [-0.57894737],
       [-0.05263158],
       [ 0.47368421],
       [ 1.        ]])
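
# As a cross-check, the same result can be reproduced by hand: scale to [0, 1] with the Min-Max formula, then stretch and shift to the target range.
# The names low, high, unit, and manual are introduced here only for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

data = np.array([[1], [5], [10], [15], [20]], dtype=float)
low, high = -1, 1

# Step 1: (value - min) / (max - min) maps the feature to [0, 1].
unit = (data - data.min(axis=0)) / (data.max(axis=0) - data.min(axis=0))
# Step 2: stretch by the width of the target range and shift to its lower bound.
manual = unit * (high - low) + low

expected = MinMaxScaler(feature_range=(low, high)).fit_transform(data)
print(np.allclose(manual, expected))
```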

In [8]:
#8.

# Standardize the features [height, weight, age, gender, blood pressure] first, since they are measured in different units; gender is categorical and must be numerically encoded (or left out) before PCA.
# Perform PCA on the standardized data and calculate the explained variance ratio for each principal component.
# Choose the smallest number of principal components whose cumulative explained variance reaches a target such as 90% or 95%.
# This ensures a balance between dimensionality reduction and information retention.
# The exact number cannot be fixed without the data: it depends on how correlated the features are and on the trade-off between dimensionality reduction and predictive accuracy that the project can accept.
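
# A sketch of how the number of components could be chosen from the cumulative explained variance.
# The data below is synthetic and the relationships between the columns are invented for illustration, so the printed number says nothing about a real dataset.
# Gender is assumed to be already encoded as 0/1, and the 95% threshold is one of the example targets mentioned above.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 500

# Synthetic stand-ins for [height, weight, age, gender, blood pressure] (assumptions).
height = rng.normal(170, 10, n)
weight = 0.9 * (height - 100) + rng.normal(0, 8, n)
age = rng.uniform(20, 70, n)
gender = rng.integers(0, 2, n).astype(float)  # assumed 0/1 encoding
blood_pressure = 90 + 0.5 * age + 0.2 * weight + rng.normal(0, 8, n)
X = np.column_stack([height, weight, age, gender, blood_pressure])

# Standardize so that no feature dominates because of its units.
X_std = StandardScaler().fit_transform(X)
pca = PCA().fit(X_std)  # keep all components to inspect their variance

cumulative = np.cumsum(pca.explained_variance_ratio_)
threshold = 0.95
# Index of the first component at which the cumulative ratio reaches the threshold, as a count.
n_keep = int(np.argmax(cumulative >= threshold)) + 1

print(cumulative)
print(n_keep)
```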