# Machine Learning Lab Programs (1 to 10)
This notebook contains 10 introductory machine learning lab programs in Python, each with a short explanation and code built on `scikit-learn`, `pandas`, and `numpy`. Programs 5 to 10 reuse the DataFrame and the train/test split created in earlier cells, so run the cells in order.

## Program 1: Display Versions of NumPy, Pandas, and Python
This program prints the installed versions of Python, NumPy, and pandas, the core tools used throughout the notebook.

In [1]:
import numpy as np
import pandas as pd
import sys

print("NumPy Version:", np.__version__)
print("Pandas Version:", pd.__version__)
print("Python Version:", sys.version)

NumPy Version: 1.24.0
Pandas Version: 1.5.3
Python Version: 3.11.8 (main, Mar 12 2024, 11:41:52) [GCC 12.2.0]


## Program 2: Display Versions of Scikit-learn, Scipy, and Matplotlib
Recording the versions of the ML libraries helps with compatibility checks and with reproducing results later.

In [2]:
import sklearn
import scipy
import matplotlib

print("Scikit-learn Version:", sklearn.__version__)
print("SciPy Version:", scipy.__version__)
print("Matplotlib Version:", matplotlib.__version__)

Scikit-learn Version: 1.1.3
SciPy Version: 1.9.3
Matplotlib Version: 3.6.3


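## Program 3: Display Versions of Seaborn and Other Common Libraries
This cell reports the Seaborn version. Most other libraries can be checked the same way through their `__version__` attribute.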

In [3]:
import seaborn as sns
print("Seaborn Version:", sns.__version__)

Seaborn Version: 0.11.2
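
The three version checks above can also be collected in one place. The following is a minimal sketch, assuming Python 3.8 or newer so that `importlib.metadata` is available. The helper name `report_versions` is hypothetical and not part of the original notebook. It looks packages up by their distribution name (for example `scikit-learn`, not `sklearn`) and reports a missing package instead of raising an error.

```python
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages):
    """Return a dict mapping each distribution name to its version string.

    Packages that are not installed map to the string "not installed".
    (Hypothetical helper, not part of the original notebook.)
    """
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = "not installed"
    return found


versions = report_versions(
    ["numpy", "pandas", "scikit-learn", "scipy", "matplotlib", "seaborn"]
)
for name, ver in versions.items():
    print(f"{name}: {ver}")
```

Because the lookup reads installed package metadata, it does not need to import each library first.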


## Program 4: Load and Display Iris Dataset
This program loads the Iris dataset (150 samples, 4 features, 3 classes) into a DataFrame and prints its first five rows.

In [4]:
from sklearn.datasets import load_iris
import pandas as pd

iris = load_iris()
df = pd.DataFrame(data=iris.data, columns=iris.feature_names)
df['target'] = iris.target
print(df.head())

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  \
0                5.1               3.5                1.4               0.2   
1                4.9               3.0                1.4               0.2   
2                4.7               3.2                1.3               0.2   
3                4.6               3.1                1.5               0.2   
4                5.0               3.6                1.4               0.2   

   target  
0       0  
1       0  
2       0  
3       0  
4       0  
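
The numeric `target` column is easier to read next to the species names. The following is a small sketch, assuming scikit-learn 0.23 or newer, where `load_iris` accepts `as_frame=True`. The names `iris_bunch` and `iris_df` are illustrative and are kept separate from the notebook's `iris` and `df`.

```python
from sklearn.datasets import load_iris

# as_frame=True returns the data as pandas objects; .frame holds the four
# feature columns plus a "target" column.
iris_bunch = load_iris(as_frame=True)
iris_df = iris_bunch.frame.copy()

# Map the integer labels 0, 1, 2 to their species names.
label_to_name = {i: str(name) for i, name in enumerate(iris_bunch.target_names)}
iris_df["species"] = iris_df["target"].map(label_to_name)

print(iris_df.shape)
print(iris_df["species"].value_counts())
```

Each species appears 50 times, which is why a stratified or shuffled split is normally used with this dataset.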


## Program 5: Data Preprocessing (Handling Missing Values and Feature Scaling)
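This program simulates missing values in the first feature, fills them with the column mean, and then standardizes every feature to zero mean and unit variance. Both transformers are fit on the full dataset here, before the train/test split made in Program 6.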

In [5]:
from sklearn.impute import SimpleImputer
from sklearn.preprocessing import StandardScaler
import numpy as np

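# Simulate missing values in rows 1 to 9 of the first column (sepal length)
df.iloc[1:10, 0] = np.nan

# Imputation: replace each NaN with the mean of its column
imputer = SimpleImputer(strategy='mean')
df.iloc[:, :-1] = imputer.fit_transform(df.iloc[:, :-1])

# Scaling: standardize each feature to zero mean and unit variance
scaler = StandardScaler()
df.iloc[:, :-1] = scaler.fit_transform(df.iloc[:, :-1])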

print(df.head())

   sepal length (cm)  sepal width (cm)  petal length (cm)  petal width (cm)  \
0          -1.033147          1.019004          -1.340227         -1.315444   
1           0.000000         -0.131979          -1.340227         -1.315444   
2           0.000000          0.328414          -1.397064         -1.315444   
3           0.000000          0.098217          -1.283389         -1.315444   
4           0.000000          1.249201          -1.340227         -1.315444   

   target  
0       0  
1       0  
2       0  
3       0  
4       0  
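
The cell above fits the imputer and the scaler on all 150 rows. A common alternative is to split first and fit the preprocessing on the training rows only, so that test-set statistics do not influence it. The following is a minimal sketch of that ordering, assuming the same simulated gaps. The names `X_raw`, `preprocess`, `X_tr_scaled`, and `X_te_scaled` are illustrative and do not appear in the original notebook.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_raw, y_raw = load_iris(return_X_y=True)
X_raw = X_raw.copy()
X_raw[1:10, 0] = np.nan  # same simulated gaps as in the cell above

# Split before fitting any preprocessing step.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_raw, y_raw, test_size=0.2, random_state=42
)

# The pipeline learns column means and scales from the training rows only,
# then applies those same statistics to the test rows.
preprocess = make_pipeline(SimpleImputer(strategy="mean"), StandardScaler())
X_tr_scaled = preprocess.fit_transform(X_tr)
X_te_scaled = preprocess.transform(X_te)

print("Train column means:", X_tr_scaled.mean(axis=0).round(3))
print("Test column means: ", X_te_scaled.mean(axis=0).round(3))
```

The training columns come out centered at zero, while the test columns are only approximately centered because they are scaled with the training statistics.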


## Program 6: Apply Linear Regression
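Linear regression predicts continuous values. Here it is fit to the integer class labels (0, 1, 2) purely as a demonstration, so the reported error measures how closely the fitted model tracks the label numbers, not classification accuracy.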

In [6]:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

X = df.iloc[:, :-1]
y = df.iloc[:, -1]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression()
model.fit(X_train, y_train)

predictions = model.predict(X_test)
print("Mean Squared Error:", mean_squared_error(y_test, predictions))

Mean Squared Error: 0.03777175849656069
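
Linear regression is a more natural fit when the target is itself a measurement. As a hedged illustration, the sketch below predicts petal width from the other three measurements and reports the R² score alongside the mean squared error. The choice of petal width as the target and the names `X_reg`, `y_reg`, and `reg` are assumptions made for this example, not part of the original program.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

iris_frame = load_iris(as_frame=True).frame

# Use three measurements as features and the fourth as a continuous target.
feature_cols = ["sepal length (cm)", "sepal width (cm)", "petal length (cm)"]
X_reg = iris_frame[feature_cols]
y_reg = iris_frame["petal width (cm)"]

Xr_train, Xr_test, yr_train, yr_test = train_test_split(
    X_reg, y_reg, test_size=0.2, random_state=42
)

reg = LinearRegression().fit(Xr_train, yr_train)
reg_pred = reg.predict(Xr_test)

reg_mse = mean_squared_error(yr_test, reg_pred)
reg_r2 = r2_score(yr_test, reg_pred)
print("Coefficients:", dict(zip(feature_cols, reg.coef_.round(3))))
print("Mean Squared Error:", reg_mse)
print("R^2:", reg_r2)
```

R² is unitless, so it is easier to compare across targets than the mean squared error, which is expressed in squared centimetres here.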


## Program 7: Apply Decision Tree Classifier
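A decision tree is a supervised model that classifies a sample by applying a sequence of threshold tests to its features. This cell reuses the train/test split from Program 6.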

In [7]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

clf = DecisionTreeClassifier()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

Accuracy: 1.0
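
An accuracy of 1.0 on a 30-sample test set is a fairly coarse estimate. One way to get a steadier number is k-fold cross-validation. The following is a minimal sketch, assuming the unmodified Iris features and a fixed `random_state` for repeatability. The names `X_iris`, `y_iris`, `tree`, and `tree_scores` are illustrative.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X_iris, y_iris = load_iris(return_X_y=True)

# For classifiers, cv=5 uses stratified folds, so every fold keeps the
# 50/50/50 class balance of the full dataset.
tree = DecisionTreeClassifier(random_state=42)
tree_scores = cross_val_score(tree, X_iris, y_iris, cv=5)

print("Fold accuracies:", tree_scores.round(3))
print("Mean accuracy:  ", tree_scores.mean().round(3))
```

Each sample is used for testing exactly once, so the mean reflects all 150 rows instead of a single 30-row split.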


## Program 8: K-Nearest Neighbors (KNN) Classifier
KNN classifies a sample by majority vote among its k nearest training points. Here k = 3 (`n_neighbors=3`).

In [8]:
from sklearn.neighbors import KNeighborsClassifier

knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)
knn_pred = knn.predict(X_test)
print("KNN Accuracy:", accuracy_score(y_test, knn_pred))

KNN Accuracy: 1.0
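
The value of k is a tuning choice, not a fixed property of the algorithm. The sketch below compares a few candidate values with 5-fold cross-validation. The candidate list `k_values` and the names `cv_means` and `best_k` are assumptions made for illustration. Scaling sits inside the pipeline so that it is refit on each training fold.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X_knn, y_knn = load_iris(return_X_y=True)

k_values = [1, 3, 5, 7, 9]  # illustrative candidates, odd to reduce vote ties
cv_means = {}
for k in k_values:
    # KNN relies on distances, so features are standardized first.
    model = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    cv_means[k] = cross_val_score(model, X_knn, y_knn, cv=5).mean()

best_k = max(cv_means, key=cv_means.get)
for k, score in cv_means.items():
    print(f"k={k}: mean accuracy {score:.3f}")
print("Best k among candidates:", best_k)
```

On a dataset this small, the differences between neighbouring values of k are often within the fold-to-fold noise, so the selected k should be read as a rough guide.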


## Program 9: Support Vector Machine (SVM) Classifier
SVMs find a maximum-margin decision boundary and, through kernels, can also separate data that is not linearly separable. `SVC()` uses the RBF kernel by default.

In [9]:
from sklearn.svm import SVC

svm = SVC()
svm.fit(X_train, y_train)
svm_pred = svm.predict(X_test)
print("SVM Accuracy:", accuracy_score(y_test, svm_pred))

SVM Accuracy: 1.0
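
To make the role of the kernel concrete, the following sketch compares a linear kernel with the default RBF kernel using 5-fold cross-validation. It is an illustration under assumed settings (default `C` and `gamma`, standardized features), and the names `X_svm`, `y_svm`, and `kernel_scores` do not come from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X_svm, y_svm = load_iris(return_X_y=True)

kernel_scores = {}
for kernel in ["linear", "rbf"]:
    # Standardizing inside the pipeline keeps the scaling fold-specific.
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    kernel_scores[kernel] = cross_val_score(model, X_svm, y_svm, cv=5).mean()

for kernel, score in kernel_scores.items():
    print(f"{kernel} kernel: mean accuracy {score:.3f}")
```

When the two kernels score similarly, the classes are close to linearly separable in the given features, and the simpler linear model is usually easier to interpret.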


## Program 10: K-Means Clustering
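K-Means is an unsupervised algorithm that partitions the data into k clusters by assigning each point to its nearest centroid. It is fit here on the scaled features `X` without using the labels, with k = 3 to match the number of Iris species.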

In [10]:
from sklearn.cluster import KMeans

kmeans = KMeans(n_clusters=3, random_state=42)
kmeans.fit(X)
print("Cluster Centers:\n", kmeans.cluster_centers_)
print("Cluster Labels:\n", kmeans.labels_[:10])

Cluster Centers:
 [[-0.1550907  -0.89339955  0.34522179  0.28439302]
 [-0.90601297  0.85326268 -1.30498732 -1.25489349]
 [ 1.11177844  0.07903422  0.98537152  0.99908828]]
Cluster Labels:
 [1 1 1 1 1 1 1 1 1 1]
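
Cluster centers and labels alone do not show how good the clustering is. Two common checks are sketched below: the silhouette score, which needs no labels, and the adjusted Rand index, which compares the clusters with the known species. This is a minimal sketch, assuming standardized features and an explicit `n_init=10`. The names `X_km_scaled`, `km`, `ari`, and `sil` are illustrative.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score, silhouette_score
from sklearn.preprocessing import StandardScaler

X_km, y_km = load_iris(return_X_y=True)
X_km_scaled = StandardScaler().fit_transform(X_km)

# n_init=10 restarts K-Means from ten different seeds and keeps the best run.
km = KMeans(n_clusters=3, n_init=10, random_state=42).fit(X_km_scaled)

# Silhouette: internal measure in [-1, 1]; higher means tighter, better
# separated clusters.
sil = silhouette_score(X_km_scaled, km.labels_)

# Adjusted Rand index: agreement with the true species; 1.0 is a perfect
# match and values near 0 indicate chance-level agreement.
ari = adjusted_rand_score(y_km, km.labels_)

print("Silhouette score:   ", round(sil, 3))
print("Adjusted Rand index:", round(ari, 3))
```

The adjusted Rand index ignores how the clusters are numbered, so cluster 1 does not need to correspond to class 1 for the score to be high.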
