### 2. Unit Vector Normalization
Unit vector normalization (or vector normalization) scales each vector (or each row in a dataset) so that its length (magnitude) equals 1. It typically uses the Euclidean norm (L2 norm) and is defined for any non-zero vector. The formula is: $\hat{v} = \dfrac{v}{\lVert v \rVert_2}$
<br> Characteristics of Unit Vector Normalization:
<br> Magnitude: The resulting vector has a magnitude of exactly 1.
<br> Direction Preserved: The direction of the vector is preserved.
<br> Useful for Direction-Based Tasks: Unit vectors are often used where the direction of the data matters more than its magnitude, for example in clustering or when computing angles (cosine similarity) between vectors.
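
The characteristics listed above can be checked on a toy example. The sketch below is only illustrative: the vector `[3, 4]` and the variable names are assumptions chosen for easy arithmetic and do not come from the iris data.

```python
import numpy as np

v = np.array([3.0, 4.0])        # example vector with Euclidean length 5
unit = v / np.linalg.norm(v)    # divide by the L2 norm

# Magnitude: the normalized vector should have length 1.
length = np.linalg.norm(unit)

# Direction preserved: the cosine of the angle between v and unit should be 1.
cosine = np.dot(v, unit) / (np.linalg.norm(v) * np.linalg.norm(unit))

# Rescaling the input changes its magnitude but not its direction,
# so it maps to the same unit vector.
unit_scaled = (10 * v) / np.linalg.norm(10 * v)

print(unit)      # [0.6 0.8]
print(length)
print(cosine)
print(np.allclose(unit, unit_scaled))
```

Because `v` and `10 * v` produce the same unit vector, any comparison made after normalization depends only on direction, which is the property direction-based tasks rely on.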

In [1]:
import seaborn as sns
df = sns.load_dataset('iris')
df

,sepal_length,sepal_width,petal_length,petal_width,species
0,5.1,3.5,1.4,0.2,setosa
1,4.9,3.0,1.4,0.2,setosa
2,4.7,3.2,1.3,0.2,setosa
3,4.6,3.1,1.5,0.2,setosa
4,5.0,3.6,1.4,0.2,setosa
...,...,...,...,...,...
145,6.7,3.0,5.2,2.3,virginica
146,6.3,2.5,5.0,1.9,virginica
147,6.5,3.0,5.2,2.0,virginica
148,6.2,3.4,5.4,2.3,virginica


### Step-by-Step Breakdown:
#### 1. from sklearn.preprocessing import normalize:
This line imports the normalize function from sklearn.preprocessing. With its default settings (norm='l2', axis=1), the function scales each row of the dataset so that its Euclidean norm (length) is 1, which is why the operation is also called unit vector normalization.

#### 2. normalize(df[['sepal_length', 'sepal_width', 'petal_length', 'petal_width']]):
This applies the normalization to the columns: 'sepal_length', 'sepal_width', 'petal_length', and 'petal_width'.
The result is a new array where each row is scaled so that the sum of the squares of the values in each row equals 1.
Mathematically, for each row $x$, it computes the norm: $\lVert x \rVert_2 = \sqrt{x_1^2 + x_2^2 + \cdots + x_n^2}$
Each value in the row is divided by this norm so that the resulting row's length is 1.
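
To make the formula concrete, here is a minimal NumPy sketch that applies it by hand to the first two iris rows from the table above. The names `X`, `row_norms`, and `X_unit` are illustrative, and the code only mirrors the row-wise L2 computation described here; it is not taken from scikit-learn's implementation.

```python
import numpy as np

# First two iris rows from the table above
# (sepal_length, sepal_width, petal_length, petal_width).
X = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
])

# Euclidean norm of each row: sqrt(x1^2 + x2^2 + ... + xn^2).
# keepdims=True keeps the shape (2, 1) so the division broadcasts per row.
row_norms = np.sqrt((X ** 2).sum(axis=1, keepdims=True))

# Divide every value by the norm of its own row.
X_unit = X / row_norms

print(row_norms.ravel())
print(X_unit.round(6))
```

The first row has norm sqrt(40.26), roughly 6.345, so 5.1 becomes about 0.803773, which should agree with the first row of the normalized output produced by normalize.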

#### 3. pd.DataFrame():
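normalize returns a plain NumPy array, so the column names are lost. The normalized data is therefore wrapped in a pandas DataFrame, and the original column names are restored through the columns argument. The species column is not part of the result because only the four numeric columns are passed in.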

#### Result:
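The result is a normalized DataFrame, unit_df, in which each row is a unit vector. Values are scaled relative to the other values in the same row, not relative to the rest of their column. Every entry lies between -1 and 1 (between 0 and 1 here, because all iris measurements are positive), but the columns are not rescaled to a common fixed range, so a column's minimum and maximum are not pinned to 0 and 1.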
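
The distinction between row-wise and column-wise scaling can be checked directly. The following sketch assumes scikit-learn is installed and uses three rows copied from the iris table (rows 0, 1, and 145) instead of the full dataset; the variable names are illustrative.

```python
import numpy as np
from sklearn.preprocessing import normalize

# Rows 0, 1 and 145 from the iris table
# (sepal_length, sepal_width, petal_length, petal_width).
X = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [6.7, 3.0, 5.2, 2.3],
])

unit = normalize(X)  # defaults: norm="l2", axis=1 (row-wise)

# Each row should have Euclidean length 1.
row_norms = np.linalg.norm(unit, axis=1)

# Column minima and maxima are whatever the row scaling produces;
# they are not forced to 0 and 1.
col_min = unit.min(axis=0)
col_max = unit.max(axis=0)

print(row_norms)
print(col_min)
print(col_max)
```

All row norms should come out as 1 (up to floating-point error), while the per-column minima and maxima fall at arbitrary points strictly inside 0 and 1.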

In [11]:
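from sklearn.preprocessing import normalize
import pandas as pd

# Numeric feature columns to normalize (species is categorical, so it is left out)
feature_cols = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']

# normalize() defaults to norm='l2', axis=1: each row is scaled to unit Euclidean length
unit_df = pd.DataFrame(normalize(df[feature_cols]), columns=feature_cols)
unit_df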

,sepal_length,sepal_width,petal_length,petal_width
0,0.803773,0.551609,0.220644,0.031521
1,0.828133,0.507020,0.236609,0.033801
2,0.805333,0.548312,0.222752,0.034269
3,0.800030,0.539151,0.260879,0.034784
4,0.790965,0.569495,0.221470,0.031639
...,...,...,...,...
145,0.721557,0.323085,0.560015,0.247699
146,0.729654,0.289545,0.579090,0.220054
147,0.716539,0.330710,0.573231,0.220474
148,0.674671,0.369981,0.587616,0.250281
