This repository contains a collection of Python scripts covering data preprocessing, machine learning, regression, and algorithm visualization.
The scripts were created for coursework and experiments in AI and ML.
Some scripts depend on external datasets (`.csv` / `.xlsx`) that are not included in this repository due to availability or licensing restrictions.
If you want to run those scripts, you will need to prepare or substitute similar datasets.
- Cleans and merges raw datasets (gas flows, process parameters, and specs).
- Handles outliers using the Interquartile Range (IQR).
- Imputes missing values with column means.
- Normalizes numerical features for machine learning.
- Saves a cleaned dataset as CSV.
⚠️ Requires `gas_flows.xlsx`, `process_parameters.xlsx`, and `IOP_specs.xlsx`, which are not provided.
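The cleaning steps listed above could look roughly like the sketch below. It is not the repository's code: the function name `clean_numeric`, the toy columns `flow` and `temp`, the choice to clip outliers to the 1.5×IQR fences (rather than drop rows), and min-max scaling are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def clean_numeric(df: pd.DataFrame) -> pd.DataFrame:
    """Clip IQR outliers, fill gaps with column means, then min-max scale."""
    out = df.copy()
    for col in out.select_dtypes(include="number").columns:
        q1, q3 = out[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        # Assumption: outliers are clipped to the 1.5*IQR fences, not dropped.
        out[col] = out[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
        # Mean imputation, computed after clipping so outliers do not skew it.
        out[col] = out[col].fillna(out[col].mean())
        # Assumption: min-max normalization to the range [0, 1].
        span = out[col].max() - out[col].min()
        out[col] = (out[col] - out[col].min()) / span if span else 0.0
    return out

# Hypothetical toy data standing in for the merged process dataset.
raw = pd.DataFrame({
    "flow": [1.0, 1.2, np.nan, 1.1, 50.0],
    "temp": [300.0, 310.0, 305.0, np.nan, 295.0],
})
cleaned = clean_numeric(raw)
print(cleaned)
```

Writing the result out with `cleaned.to_csv("cleaned_merged_data.csv", index=False)` would match the final step in the list.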
- Loads the cleaned dataset.
- Fits Random Forest Regressors to predict `C` and `MET`.
- Evaluates model performance using Root Mean Squared Error (RMSE).
⚠️ Requires `cleaned_merged_data.csv`, which is generated by `data_cleaning.py`.
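As a rough illustration of fitting one Random Forest per target and scoring it with RMSE, here is a minimal sketch on synthetic data. The feature matrix, the formulas used to fake the `C` and `MET` targets, the 80/20 split, and the hyperparameters are assumptions; the actual script reads `cleaned_merged_data.csv` instead.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 3))

# Hypothetical targets standing in for the C and MET columns.
targets = {
    "C": 2.0 * X[:, 0] + X[:, 1] + rng.normal(0, 0.05, 200),
    "MET": X[:, 2] ** 2 + rng.normal(0, 0.05, 200),
}

rmse_by_target = {}
for name, y in targets.items():
    # Assumption: a held-out 20% split is used for evaluation.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(X_train, y_train)
    mse = mean_squared_error(y_test, model.predict(X_test))
    rmse_by_target[name] = float(np.sqrt(mse))
    print(f"{name}: RMSE = {rmse_by_target[name]:.3f}")
```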
- Implements Polynomial Regression (degree = 2) manually using matrix algebra (Normal Equation).
- Computes weights, predictions, and RMSE.
⚠️ Requires an external `data.csv` file.
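For reference, the Normal Equation gives the least-squares weights as w = (XᵀX)⁻¹Xᵀy, where X holds the columns 1, x, and x². The sketch below shows one way to compute this with NumPy; the function name `fit_poly2` and the noise-free toy data are assumptions, and it solves the linear system directly instead of forming the inverse.

```python
import numpy as np

def fit_poly2(x, y):
    """Fit y ~ w0 + w1*x + w2*x^2 via the normal equation."""
    # Design matrix with columns [1, x, x^2].
    X = np.column_stack([np.ones_like(x), x, x ** 2])
    # Solve (X^T X) w = X^T y rather than inverting X^T X explicitly.
    w = np.linalg.solve(X.T @ X, X.T @ y)
    preds = X @ w
    rmse = np.sqrt(np.mean((y - preds) ** 2))
    return w, preds, rmse

# Hypothetical noise-free data generated from known weights (1.0, 2.0, 0.5).
x = np.linspace(-3, 3, 50)
y = 1.0 + 2.0 * x + 0.5 * x ** 2
w, preds, rmse = fit_poly2(x, y)
print("weights:", w, "RMSE:", rmse)
```

On noise-free data like this, the recovered weights should match the generating ones up to floating-point error.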
- Fits a Polynomial Regression model with scikit-learn.
- Splits dataset into train/test.
- Calculates RMSE, predicts values, and plots fitted curves.
⚠️ Depends on `data.csv`.
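A scikit-learn version of the same idea might be structured as below. This is only a sketch: the degree (2), the 75/25 split, and the synthetic data are assumptions, and plotting is left out to keep the example short.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Hypothetical noisy quadratic data standing in for data.csv.
rng = np.random.default_rng(42)
x = rng.uniform(-3, 3, size=(120, 1))
y = 1.0 + 2.0 * x[:, 0] + 0.5 * x[:, 0] ** 2 + rng.normal(0, 0.2, 120)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

# PolynomialFeatures expands x into [1, x, x^2]; LinearRegression fits the weights.
model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(x_train, y_train)

test_rmse = float(np.sqrt(np.mean((y_test - model.predict(x_test)) ** 2)))
print(f"Test RMSE: {test_rmse:.3f}")
```

Fitting on all of `x` and `y` instead of the training split gives the whole-dataset variant.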
- Fits polynomial regression on the entire dataset.
- Evaluates RMSE across all points.
- Plots regression curve.
⚠️ Requires `data.csv`.
- Custom implementation of K-Nearest Neighbors (KNN) from scratch.
- Includes performance metrics (confusion matrix, accuracy, precision, recall, F1).
- Uses a hardcoded sample dataset, so it can run without external data.
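To show the general shape of a from-scratch KNN with the listed metrics, here is a small pure-Python sketch. The function names, the Euclidean distance, the majority vote, and the toy points are assumptions and may differ from what the script actually does.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """train: list of (features, label) pairs; returns the majority label."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def binary_metrics(y_true, y_pred, positive=1):
    """Confusion-matrix counts plus accuracy, precision, recall, and F1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "confusion": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical hardcoded sample: two well-separated clusters.
train = [((1, 1), 0), ((1, 2), 0), ((2, 1), 0), ((6, 6), 1), ((7, 6), 1), ((6, 7), 1)]
test = [((1.5, 1.5), 0), ((6.5, 6.5), 1), ((2, 2), 0), ((5, 5), 1)]

y_true = [label for _, label in test]
y_pred = [knn_predict(train, features, k=3) for features, _ in test]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```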
- Compares multiple classifiers on the Iris dataset:
  - KNN
  - Decision Trees
  - Random Forests
  - Naive Bayes
  - Linear SVM
  - Kernel SVM
- Reports classification accuracy for each.
✅ Uses scikit-learn’s built-in Iris dataset, no extra files needed.
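A comparison like this can be sketched in a few lines with scikit-learn. The split ratio, random seeds, hyperparameters, and the choice of `GaussianNB` and an RBF kernel for the "Kernel SVM" entry are assumptions, not necessarily what the script uses.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Assumed default-ish settings for each classifier family.
models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Naive Bayes": GaussianNB(),
    "Linear SVM": SVC(kernel="linear"),
    "Kernel SVM": SVC(kernel="rbf"),
}

# score() returns mean classification accuracy on the held-out split.
accuracies = {
    name: model.fit(X_train, y_train).score(X_test, y_test)
    for name, model in models.items()
}
for name, acc in accuracies.items():
    print(f"{name:>13}: {acc:.3f}")
```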
- Performs hyperparameter tuning for KNN, Decision Trees, Random Forests, and SVMs.
- Finds optimal values for `k`, tree depth, estimator count, kernel, and gamma.
⚠️ Requires `data.xlsx` containing moon, circle, and blob datasets.
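One common way to tune these hyperparameters is a cross-validated grid search. The sketch below uses `GridSearchCV` on a generated moons dataset as a stand-in for `data.xlsx`; whether the script uses grid search at all, and the specific grids shown, are assumptions.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Generated stand-in for the "moon" sheet of data.xlsx.
X, y = make_moons(n_samples=300, noise=0.25, random_state=0)

# Hypothetical search grids; the real script may cover other values.
searches = {
    "KNN": GridSearchCV(
        KNeighborsClassifier(), {"n_neighbors": [1, 3, 5, 7, 9, 11]}, cv=5
    ),
    "SVM": GridSearchCV(
        SVC(), {"kernel": ["linear", "rbf"], "gamma": [0.1, 1, 10]}, cv=5
    ),
}

best = {}
for name, search in searches.items():
    search.fit(X, y)
    # best_params_ holds the winning combination, best_score_ its mean CV accuracy.
    best[name] = (search.best_params_, search.best_score_)
    print(name, best[name])
```

The same pattern extends to tree depth (`max_depth` on `DecisionTreeClassifier`) and estimator count (`n_estimators` on `RandomForestClassifier`).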
- Tkinter GUI to visualize the Knight’s Tour problem on chessboards (5x5 → 8x8).
- Allows setting starting position and animation speed.
- Animates knight’s path with board highlighting.
✅ Runs without datasets.
⚠️ Requires a local `knight.png` image to display the knight piece.
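The path that the GUI animates has to come from some solver. The README does not say which algorithm the script uses, so the sketch below is an assumption: backtracking search ordered by Warnsdorff's heuristic (try the squares with the fewest onward moves first), without any Tkinter code. The function name `knights_tour` is hypothetical.

```python
# The eight knight moves as (row, column) offsets.
MOVES = [(1, 2), (2, 1), (2, -1), (1, -2), (-1, -2), (-2, -1), (-2, 1), (-1, 2)]

def knights_tour(n, start=(0, 0)):
    """Return a list of (row, col) squares visiting every cell once, or None."""
    visited = [[False] * n for _ in range(n)]
    visited[start[0]][start[1]] = True
    path = [start]

    def free_neighbors(r, c):
        return [
            (r + dr, c + dc)
            for dr, dc in MOVES
            if 0 <= r + dr < n and 0 <= c + dc < n and not visited[r + dr][c + dc]
        ]

    def extend():
        if len(path) == n * n:
            return True
        # Warnsdorff ordering: prefer squares with the fewest onward moves.
        candidates = sorted(
            free_neighbors(*path[-1]), key=lambda sq: len(free_neighbors(*sq))
        )
        for r, c in candidates:
            visited[r][c] = True
            path.append((r, c))
            if extend():
                return True
            # Dead end: undo the move and try the next candidate.
            path.pop()
            visited[r][c] = False
        return False

    return path if extend() else None

tour = knights_tour(5, start=(0, 0))
print(tour)
```

A GUI could then step through `tour` on a timer, highlighting each square as the knight lands on it.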
Clone the repository and run any script:
git clone https://github.com/yourusername/python_AI_Scripts.git
cd python_AI_Scripts
python knn_classifier.py