amirinjast/Machine-Learning-Projects

Machine Learning Mini Projects (P2, P3, P4)

A collection of machine learning mini-projects completed for the K. N. Toosi University of Technology Machine Learning course. Each project script is self-contained and generates plots and reports to dedicated folders.

Internet access is required for datasets fetched from OpenML and public repositories.

Repository Structure

Machine-Learning-Projects/
├── P2.py                        # Mini Project 2: Naive Bayes, KNN+PCA, Decision Tree
├── P3.py                        # Mini Project 3: SVM (Air Quality) + PCA/LDA (Fashion-MNIST)
├── P4.py                        # Mini Project 4: McCulloch-Pitts, Weather NN, Q-learning (optional)
├── requirements.txt             # Base Python dependencies
├── project2_plots/              # Generated plots for P2
├── project3_data/               # Engineered dataset artifacts for P3
├── project3_plots/              # Generated plots for P3
├── project4_plots/              # Generated plots for P4
├── report_plots/                # Duplicated/curated plots for reports
└── Figure_1.png, Figure_2.png   # Additional figures

Environment Setup

It is recommended to use Python 3.9–3.10 on Windows. TensorFlow support may be easiest with Python 3.10.

  1. Create and activate a virtual environment (PowerShell on Windows):
       py -3.10 -m venv .venv
       .\.venv\Scripts\Activate.ps1
  2. Install base requirements:
       pip install -r requirements.txt
  3. Install extra packages used by P3 and P4:
       pip install imbalanced-learn liac-arff tensorflow

Notes:

  • If fetch_openml raises a parser warning, installing liac-arff helps (already listed above).
  • If you face issues installing TensorFlow on Windows/CPU, try a specific version, e.g. pip install tensorflow==2.13.* with Python 3.10.
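
Before running the scripts, it can help to confirm that the extra packages are importable. The following is a minimal sketch, not part of the repository: the helper name missing_packages and the EXTRAS mapping are hypothetical, and the only assumption is that the packages use their usual import names (imblearn, arff, tensorflow).

```python
from importlib.util import find_spec

# Import names differ from the pip names for two of the extras:
# imbalanced-learn is imported as "imblearn", liac-arff as "arff".
EXTRAS = {
    "imbalanced-learn": "imblearn",
    "liac-arff": "arff",
    "tensorflow": "tensorflow",
}


def missing_packages(required=EXTRAS):
    """Return the pip names of packages whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: pip install " + " ".join(missing))
    else:
        print("All extra packages are importable.")
```

find_spec only locates the module without importing it, so the check stays fast even for TensorFlow.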

How to Run

All scripts save their plots into their respective projectX_plots/ folders and print progress to the console. Ensure you have an active internet connection for dataset downloads.

P2 — Naive Bayes (Spam), KNN on MNIST with PCA, Decision Tree

File: P2.py

This script runs three tasks sequentially:

  • Spam detection via Multinomial Naive Bayes (from scratch and scikit-learn).
  • KNN digit classification on MNIST-784 with k-tuning and PCA-based dimensionality reduction.
  • Decision Tree classification on the Carseats dataset with hyperparameter tuning and visualization.
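
For readers unfamiliar with the from-scratch variant of the first task, the sketch below shows one common way to write multinomial Naive Bayes with Laplace smoothing. It is an illustration under stated assumptions, not the code in P2.py: the function names train_nb and predict_nb, the whitespace tokenization, and the four toy messages are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Fit multinomial Naive Bayes with Laplace smoothing on tokenized docs."""
    vocab = {tok for doc in docs for tok in doc}
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    model = {}
    for label, n_docs in class_docs.items():
        # Smoothed denominator: class token count plus alpha per vocabulary word
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_lik": {w: math.log((word_counts[label][w] + alpha) / denom)
                        for w in vocab},
        }
    return model


def predict_nb(model, doc):
    """Return the class with the highest log posterior; unseen words are skipped."""
    def score(params):
        return params["log_prior"] + sum(
            params["log_lik"][w] for w in doc if w in params["log_lik"])
    return max(model, key=lambda label: score(model[label]))


# Toy data (made up for illustration, not taken from the SMS dataset)
docs = [m.split() for m in ["win cash now", "free prize win",
                            "see you at lunch", "lunch at noon"]]
labels = ["spam", "spam", "ham", "ham"]
model = train_nb(docs, labels)
print(predict_nb(model, "win free cash".split()))  # spam
```

Working in log space avoids underflow when many small word probabilities are multiplied, which matters once messages get longer than a few tokens.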

Run:

python P2.py

Outputs (saved to project2_plots/):

  • chart1_knn_vs_k.png
  • chart2_knn_vs_pca.png
  • chart3_dt_confusion_matrix.png
  • chart4_optimized_tree.png

Datasets Used:

  • SMS Spam: https://raw.githubusercontent.com/justmarkham/pycon-2016-tutorial/master/data/sms.tsv
  • MNIST-784: OpenML via sklearn.datasets.fetch_openml
  • Carseats: https://raw.githubusercontent.com/JWarmenhoven/ISLR-python/master/Notebooks/Data/Carseats.csv

P3 — SVM (Beijing Air Quality) + PCA/LDA (Fashion-MNIST)

File: P3.py

This script has two parts:

  1. SVM classification of engineered air quality categories for Beijing with SMOTE, scaling, and GridSearchCV.
    • Saves engineered data to project3_data/beijing_aq_engineered.csv.
    • Saves evaluation plots to project3_plots/.
  2. Dimensionality Reduction on Fashion-MNIST:
    • Explained variance analysis with PCA.
    • Denoising reconstruction using PCA.
    • 2D visualizations comparing PCA vs. LDA.
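
The SMOTE step in part 1 balances classes by creating synthetic minority samples. The sketch below illustrates the core interpolation idea only; it is not the imbalanced-learn implementation that the script depends on, and the function name smote_like, its parameters, and the example points are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours.
    Assumes at least two distinct minority samples."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_like(minority, n_new=5)
print(len(new_points))
```

When oversampling is combined with cross-validation, it should be applied to the training folds only; otherwise synthetic points derived from validation samples leak into training and inflate the scores.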

Run:

python P3.py

Outputs (saved to project3_plots/):

  • plot_pca_explained_variance.png
  • plot_pca_reconstruction.png
  • plot_pca_vs_lda.png
  • plot_svm_final_confusion_matrix.png

Datasets Used:

  • Beijing PM2.5 Data: UCI ML Repository https://archive.ics.uci.edu/ml/machine-learning-databases/00381/PRSA_data_2010.1.1-2014.12.31.csv
  • Fashion-MNIST: OpenML via sklearn.datasets.fetch_openml

Tips:

  • The OpenML downloads may take a while and consume memory; consider limiting samples if needed.
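
One way to limit samples while keeping every class represented is a stratified subsample of row indices. This is a sketch under assumptions: the helper stratified_subsample is hypothetical and not part of P3.py, and it reduces memory use only after the dataset has been downloaded.

```python
import random
from collections import defaultdict


def stratified_subsample(labels, per_class, seed=0):
    """Return sorted row indices with at most `per_class` samples per label."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    keep = []
    for indices in by_label.values():
        keep.extend(rng.sample(indices, min(per_class, len(indices))))
    return sorted(keep)


labels = ["a"] * 5 + ["b"] * 2
idx = stratified_subsample(labels, per_class=3)
print(idx)
```

The returned indices can then be used to select matching rows from the feature array and the label array.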

P4 — McCulloch-Pitts, Weather Forecasting NN, Q-learning (Optional)

File: P4.py

This script includes:

  • McCulloch-Pitts neuron network to classify points inside a triangle. Generates a plot.
  • Weather prediction using a feedforward neural network (Keras/TensorFlow) on a sliding-window dataset.
    • Requires weather_prediction_dataset.csv in the project root. If missing, the script will print a helpful message and skip training.
  • Optional Q-learning agent for the Wumpus World (kept commented by default).
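
The triangle classifier can be built from threshold units: one neuron per edge tests which side of the edge a point lies on, and an output neuron fires only when all three agree. The sketch below assumes a triangle with vertices (0, 0), (1, 0), and (0, 1); the triangle actually used in P4.py is not stated here, so the weights, thresholds, and function names are illustrative only.

```python
def mp_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted input sum reaches the threshold."""
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0


# Hypothetical triangle with vertices (0, 0), (1, 0), (0, 1).
# Each entry is (weights, threshold) for one edge's half-plane test.
HALF_PLANES = [
    ((1, 0), 0),     # x >= 0
    ((0, 1), 0),     # y >= 0
    ((-1, -1), -1),  # x + y <= 1, written as -x - y >= -1
]


def inside_triangle(x, y):
    """Return 1 if (x, y) lies inside or on the triangle, else 0."""
    hidden = [mp_neuron((x, y), w, t) for w, t in HALF_PLANES]
    # Output neuron acts as a 3-input AND: it fires only if all three fire.
    return mp_neuron(hidden, (1, 1, 1), 3)


print(inside_triangle(0.2, 0.2), inside_triangle(0.8, 0.8))  # 1 0
```

The first layer takes real-valued coordinates, which is a common relaxation of the original binary-input McCulloch-Pitts unit; the output neuron uses binary inputs as in the classic model.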

Run:

python P4.py

Outputs (saved to project4_plots/):

  • plot1_mcculloch_pitts.png
  • plot2_simple_nn_loss.png
  • plot3_deep_nn_loss.png
  • plot4_q_learning_reward.png (only if you uncomment the Q-learning section)

To enable Q-learning in P4.py, uncomment the line near the bottom:

# question_three_wumpus_world()
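
For orientation, the tabular Q-learning update that such an agent relies on can be sketched on a much smaller problem. The environment below is a hypothetical five-cell corridor, not the Wumpus World from P4.py, and the hyperparameters are arbitrary illustrative values.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = (-1, +1)  # move left, move right


def step(state, action):
    """Hypothetical corridor: reward 1 at the goal, small cost otherwise."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == GOAL else -0.01
    return nxt, reward, nxt == GOAL


def train(episodes=300, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore with probability epsilon; break ties at random
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning target: reward plus discounted best next value
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q_table = train()
print(["right" if row[1] > row[0] else "left" for row in q_table[:GOAL]])
```

Summing the rewards collected in each episode yields the kind of reward curve that plot4_q_learning_reward.png presumably shows.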

Troubleshooting

  • OpenML/Network:
    • Ensure you are online; some datasets are fetched at runtime.
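    • If downloads are slow or fail, retry later. fetch_openml caches downloaded datasets locally (see its data_home and cache parameters), so a completed download is reused on later runs.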
  • TensorFlow on Windows:
    • Prefer Python 3.10 and TensorFlow 2.13+ for CPU installs.
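    • If GPU is desired, install CUDA/cuDNN versions compatible with your TensorFlow version. TensorFlow 2.10 was the last release with GPU support on native Windows; later versions need WSL2 for GPU use.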
  • Virtual environment activation policy (PowerShell):
    • If activation is blocked, run the following in PowerShell (the CurrentUser scope does not require Administrator rights):
      Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Contributing / License

This is a coursework repository. If you plan to extend it or accept contributions, consider adding a LICENSE file and contribution guidelines.
