<a id="top"></a>
<span><h1 style="font-family:verdana;"><center><img src="https://scikit-learn.org/stable/_static/scikit-learn-logo-small.png" alt="Scikit-Learn" width=200>Scikit-learn</center></h1></span>
<p><center style="color:#159364; font-family:cursive;">Scikit-learn is a popular machine learning library for Python. Among other things, it provides tools for data preprocessing, including feature extraction, scaling, and normalization.<br><br>1. Simple and efficient tools for predictive data analysis<br>2. Accessible to everybody, and reusable in various contexts<br>3. Built on NumPy, SciPy, and matplotlib<br>4. Open source, commercially usable (BSD license)</center></p>

***



<h2><center>Installation of Scikit-learn</center></h2>

<h3>Requirements:</h3>
<p>1. Make sure Python and pip are installed on your system.</p>
<p>2. Open a terminal (Command Prompt on Windows) and run <code>pip install scikit-learn</code>.</p>
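
<p>One way to confirm that the installation worked is to import the package and print its version. This is a minimal check; the version printed depends on your environment.</p>

```python
import sklearn

# If this import succeeds, the package is installed and importable.
# The version string printed will vary from machine to machine.
installed_version = sklearn.__version__
print(installed_version)
```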


<h2><center>Importing and Using Scikit-learn</center></h2>

In [7]:
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler, OneHotEncoder
from sklearn.impute import SimpleImputer
from sklearn.decomposition import PCA

# Example dataset with a missing value (np.nan) and a categorical column.
# Mixing numbers and strings makes NumPy store every entry as a string.
X = np.array([[1, 2, 3, 'A'], [4, np.nan, 6, 'B'], [7, 8, 9, 'C']])


In [8]:
X

array([['1', '2', '3', 'A'],
       ['4', 'nan', '6', 'B'],
       ['7', '8', '9', 'C']], dtype='<U11')

In [9]:
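# Feature scaling (standardization) of the three numeric columns.
# X holds strings, so convert to float explicitly.
# NaN is ignored when fitting and is kept as NaN in the output.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X[:, :3].astype(float))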
print("Scaled data:\n", X_scaled)



Scaled data:
 [[-1.22474487 -1.         -1.22474487]
 [ 0.                 nan  0.        ]
 [ 1.22474487  1.          1.22474487]]
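
<p>To make the numbers above less opaque, here is a small sketch that reproduces standardization by hand: subtract each column's mean and divide by its standard deviation. The array <code>X_num</code> is an illustrative stand-in for the complete numeric columns of the example (the column containing the missing value is left out).</p>

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative numeric data: the first and third columns of the example dataset
X_num = np.array([[1.0, 3.0], [4.0, 6.0], [7.0, 9.0]])

# Standardize by hand: subtract the column mean, divide by the
# population standard deviation (NumPy's default, ddof=0)
manual = (X_num - X_num.mean(axis=0)) / X_num.std(axis=0)

# Same operation with scikit-learn
scaled = StandardScaler().fit_transform(X_num)

# Both approaches should agree up to floating-point error
print(np.allclose(manual, scaled))
```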


In [10]:
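# Handling missing data: replace NaN with the column mean.
# Only the first two columns are imputed here; convert the strings to float first.
imputer = SimpleImputer(strategy='mean')
X_imputed = imputer.fit_transform(X[:, :2].astype(float))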
print("Imputed data:\n", X_imputed)



Imputed data:
 [[1. 2.]
 [4. 5.]
 [7. 8.]]
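
<p>The fill values that the imputer learned can be inspected through its <code>statistics_</code> attribute. The sketch below also shows one possible way to impute before scaling, using <code>make_pipeline</code>, so that no <code>nan</code> reaches the scaled output. The array <code>X_num</code> is an illustrative float version of the numeric columns, not a variable from the cells above.</p>

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Illustrative float version of the three numeric columns
X_num = np.array([[1, 2, 3], [4, np.nan, 6], [7, 8, 9]], dtype=float)

imputer = SimpleImputer(strategy="mean")
X_filled = imputer.fit_transform(X_num)

# statistics_ holds the per-column fill values learned during fit
print(imputer.statistics_)

# Impute first, then scale, so the scaler never sees a missing value
pipeline = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler())
X_clean = pipeline.fit_transform(X_num)
print(np.isnan(X_clean).any())
```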


In [11]:
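# Encoding categorical data: one binary column per category.
# OneHotEncoder expects a 2D input, hence the reshape.
encoder = OneHotEncoder()
X_encoded = encoder.fit_transform(X[:, 3].reshape(-1, 1))
# The result is a sparse matrix by default; convert it to a dense array for printing.
print("Encoded data:\n", X_encoded.toarray())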



Encoded data:
 [[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
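
<p>To see which output column corresponds to which category, the fitted encoder exposes <code>categories_</code>, and <code>inverse_transform</code> maps one-hot rows back to labels. The sketch below also sets <code>handle_unknown="ignore"</code>, which is an assumption added here for illustration: it makes unseen categories encode as all zeros instead of raising an error.</p>

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Illustrative categorical column matching the example dataset
labels = np.array([["A"], ["B"], ["C"]])

encoder = OneHotEncoder(handle_unknown="ignore")
encoded = encoder.fit_transform(labels).toarray()

# Column i of the encoded output corresponds to categories_[0][i]
learned = encoder.categories_[0]
print(learned)

# inverse_transform maps one-hot rows back to the original labels
recovered = encoder.inverse_transform(encoded).ravel()
print(recovered)

# A category not seen during fit encodes as a row of zeros
unseen = encoder.transform(np.array([["D"]])).toarray()
print(unseen)
```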


In [12]:
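# Data normalization: rescale each numeric column to the range [0, 1].
# As with StandardScaler, NaN is ignored when fitting and kept in the output.
minmax_scaler = MinMaxScaler(feature_range=(0, 1))
X_normalized = minmax_scaler.fit_transform(X[:, :3].astype(float))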
print("Normalized data:\n", X_normalized)



Normalized data:
 [[0.  0.  0. ]
 [0.5 nan 0.5]
 [1.  1.  1. ]]
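
<p>Min-max normalization to the range [0, 1] amounts to <code>(x - min) / (max - min)</code> per column. The following sketch checks this against <code>MinMaxScaler</code>; <code>X_num</code> is again an illustrative array holding only the complete numeric columns.</p>

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Illustrative numeric data: the first and third columns of the example dataset
X_num = np.array([[1.0, 3.0], [4.0, 6.0], [7.0, 9.0]])

# Normalize by hand: shift by the column minimum, divide by the column range
col_min = X_num.min(axis=0)
col_max = X_num.max(axis=0)
manual = (X_num - col_min) / (col_max - col_min)

# Same operation with scikit-learn
normalized = MinMaxScaler(feature_range=(0, 1)).fit_transform(X_num)

# Both approaches should agree up to floating-point error
print(np.allclose(manual, normalized))
```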


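<p>In this example, the <code>X</code> array is a small dataset containing one missing value and one categorical column. The cells above demonstrate feature scaling (standardization), mean imputation of missing data, one-hot encoding of categorical data, and min-max normalization with Scikit-learn. <code>PCA</code> is imported in the first cell but is not applied in these cells. Note also that the scaling and normalization cells run on data that still contains the missing value, which is why <code>nan</code> appears in their outputs.</p>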
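
<p>As a minimal sketch of the dimensionality reduction step, which the cells above do not run, <code>PCA</code> could be applied to the numeric columns after imputing and scaling them. The choice of <code>n_components=2</code> and the array <code>X_num</code> are assumptions made for illustration.</p>

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler

# Illustrative float version of the three numeric columns
X_num = np.array([[1, 2, 3], [4, np.nan, 6], [7, 8, 9]], dtype=float)

# PCA does not accept NaN, so fill the missing value first, then standardize
X_filled = SimpleImputer(strategy="mean").fit_transform(X_num)
X_std = StandardScaler().fit_transform(X_filled)

# Project the three features onto two principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_std)

print(X_reduced.shape)
# Fraction of the total variance captured by each component
print(pca.explained_variance_ratio_)
```

<p>In this toy dataset the three numeric columns are perfectly correlated after imputation, so the first component captures essentially all of the variance. On real data the split across components will look different.</p>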