Machine Learning & Parallel Computing projects, including NVIDIA DGX A100 supercomputer analysis and an ML model for extreme weather classification (Decision Tree & Random Forest).

Remmy04/Machine-Learning-Parallel-Computing

⚡ Machine Learning & Parallel Computing (MLPC)

This repository contains both major assessments completed for the Machine Learning & Parallel Computing (ITS66604) module.

Across these projects, I explored the foundations of supercomputing, GPU architecture, parallel processing, and machine learning workflows, applying them to real-world analytical tasks.

The repository includes:

  • 🖥 High-Performance Computing (HPC) Supercomputer Analysis
  • 🤖 Machine Learning Classification Project (Weather Extreme Detection)

🚀 What I Learned in This Module

🧠 Machine Learning Foundations

  • Exploratory Data Analysis (EDA)
  • Data preprocessing & cleaning
  • Handling anomalies and missing values
  • Label engineering (custom rule-based classification)
  • Feature engineering (interaction features, distractor features)
  • Gaussian noise injection for robustness
  • Class imbalance handling (oversampling)
  • Encoding & scaling techniques (LabelEncoder, StandardScaler)
  • Machine learning workflows using:
    • Decision Tree
    • Random Forest
  • Model evaluation using:
    • Confusion matrix
    • Precision/Recall/F1-score
    • ROC Curve & AUC scores
      (Random Forest AUC = 0.99, excellent performance)
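
The evaluation steps listed above can be sketched roughly as follows. This is a minimal illustration on a synthetic dataset from scikit-learn's `make_classification`, not the project's weather data or its actual notebook; the hyperparameters (`max_depth=5`, `n_estimators=200`), the 80/20 class weights, and the `results` dictionary are assumptions chosen for the example. Tree-based models are largely insensitive to feature scaling, so `StandardScaler` appears here only because it is part of the listed workflow.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in data (assumption: NOT the project's dataset).
X, y = make_classification(
    n_samples=2000,
    n_features=10,
    n_informative=5,
    weights=[0.8, 0.2],
    random_state=42,
)

# Stratify so both splits keep the same Normal/Extreme ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Fit the scaler on the training split only to avoid leaking test statistics.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Hypothetical hyperparameters, chosen only for illustration.
models = {
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    y_pred = model.predict(X_test_s)
    # Probability of the positive ("Extreme") class, needed for ROC AUC.
    y_score = model.predict_proba(X_test_s)[:, 1]
    results[name] = {
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "auc": roc_auc_score(y_test, y_score),
    }
    print(name)
    print(results[name]["confusion_matrix"])
    print(classification_report(y_test, y_pred, target_names=["Normal", "Extreme"]))
    print(f"ROC AUC: {results[name]['auc']:.3f}")
```

The numbers this prints describe the synthetic data only and are not comparable to the figures reported for the project.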

🔥 Parallel Computing & HPC Concepts

From the Supercomputer project:

  • Supercomputer architecture (NVIDIA DGX A100 + SuperPOD)
  • Parallel GPU processing (CUDA cores, Tensor Cores)
  • NVLink + NVSwitch high-speed interconnect
  • Distributed computing & InfiniBand networking
  • GPU virtualization (MIG)
  • HPC benchmark interpretation (HPL, HPCG, MLPerf)
  • Storage hierarchy & parallel file systems
  • Energy-efficient computing (Green500 insights)
  • Use cases:
    • AI training at scale
    • Autonomous driving simulation
    • Weather & climate modeling
    • Scientific computing & medical research
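
As a small worked example of benchmark interpretation, the sketch below computes two ratios commonly derived from HPL results: HPL efficiency (achieved Rmax divided by theoretical Rpeak) and the GFLOPS-per-watt figure that the Green500 list ranks by. The function names are hypothetical, and the input values are made-up placeholders, not measurements of any DGX A100 or SuperPOD system.

```python
def hpl_efficiency(rmax_tflops: float, rpeak_tflops: float) -> float:
    """Fraction of theoretical peak achieved on HPL (Rmax / Rpeak)."""
    if rpeak_tflops <= 0:
        raise ValueError("rpeak_tflops must be positive")
    return rmax_tflops / rpeak_tflops


def gflops_per_watt(rmax_tflops: float, power_kw: float) -> float:
    """Energy efficiency in GFLOPS per watt, the Green500-style metric."""
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    gflops = rmax_tflops * 1000.0  # 1 TFLOPS = 1000 GFLOPS
    watts = power_kw * 1000.0      # 1 kW = 1000 W
    return gflops / watts


# Placeholder inputs (assumptions for illustration only).
example_rmax_tflops = 800.0
example_rpeak_tflops = 1000.0
example_power_kw = 40.0

print(f"HPL efficiency: {hpl_efficiency(example_rmax_tflops, example_rpeak_tflops):.0%}")
print(f"Energy efficiency: {gflops_per_watt(example_rmax_tflops, example_power_kw):.1f} GFLOPS/W")
```

A system can rank high on raw Rmax yet lower on GFLOPS/W, which is why the two figures are usually read together.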

📘 Projects in This Repository

🖥 1. Supercomputer Architecture Analysis (NVIDIA DGX A100 + SuperPOD)

A complete study of HPC design, including:

  • GPU/CPU topology
  • NVLink / NVSwitch connectivity
  • InfiniBand networking
  • Multi-node scaling
  • Power & cooling strategies
  • AI-accelerated workloads

📁 Location: /Supercomputer/


🌦 2. Extreme Weather Classification (Machine Learning)

A full ML pipeline to classify Normal vs Extreme weather using global meteorological data.

Includes:

  • 40+ weather features
  • Full preprocessing pipeline
  • Noise injection & advanced feature engineering
  • Oversampling to fix class imbalance
  • Decision Tree vs Random Forest comparison
  • Up to 95% accuracy, with a Random Forest AUC of 0.99

📁 Location: /Extreme Weather Classification/
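
The data-preparation steps in this project (rule-based labelling, noise injection, feature engineering, oversampling) could look roughly like the sketch below. Everything specific here is an assumption made for illustration: the column names (`temperature_c`, `wind_kph`, `precip_mm`), the thresholds in the labelling rule, the 5% noise scale, and the randomly generated data. None of it is taken from the project's dataset or notebook.

```python
import numpy as np
import pandas as pd
from sklearn.utils import resample

rng = np.random.default_rng(0)
n = 1000

# Randomly generated stand-in data with hypothetical column names.
df = pd.DataFrame({
    "temperature_c": rng.normal(22, 8, n),
    "wind_kph": rng.gamma(2.0, 8.0, n),
    "precip_mm": rng.exponential(2.0, n),
})

# Label engineering: a custom rule marks a row as Extreme (1) if any
# threshold is exceeded. The thresholds are illustrative assumptions.
df["extreme"] = (
    (df["temperature_c"] > 38) | (df["wind_kph"] > 50) | (df["precip_mm"] > 10)
).astype(int)

# Gaussian noise injection: perturb each numeric input by noise scaled to
# 5% of that column's standard deviation, so the model cannot simply
# recover the exact labelling thresholds.
numeric_cols = ["temperature_c", "wind_kph", "precip_mm"]
noise = rng.normal(0.0, 0.05, size=(n, len(numeric_cols)))
df[numeric_cols] = df[numeric_cols] + noise * df[numeric_cols].std().to_numpy()

# Feature engineering: one interaction feature and one distractor feature
# (pure noise, unrelated to the label).
df["wind_x_precip"] = df["wind_kph"] * df["precip_mm"]
df["distractor"] = rng.normal(0.0, 1.0, n)

# Random oversampling: resample the minority class with replacement until
# it matches the majority class size.
majority = df[df["extreme"] == 0]
minority = df[df["extreme"] == 1]
minority_up = resample(
    minority, replace=True, n_samples=len(majority), random_state=0
)
balanced = (
    pd.concat([majority, minority_up])
    .sample(frac=1.0, random_state=0)  # shuffle rows
    .reset_index(drop=True)
)

print(df["extreme"].value_counts())
print(balanced["extreme"].value_counts())
```

In a full pipeline, oversampling would normally be applied to the training split only, so that duplicated minority rows do not leak into the test set.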


🧩 Skills Demonstrated

🔧 Technical Skills

  • Machine Learning (scikit-learn)
  • Data cleaning & feature engineering
  • EDA visualization (Seaborn, Matplotlib)
  • GPU architecture & parallel computing
  • HPC system design & benchmarking
  • Jupyter/Colab development
  • CSV data processing at scale

📊 Analytical Skills

  • Performance comparison & interpretation
  • Model explainability (feature importance)
  • Handling real-world ML challenges (noise, imbalance, anomalies)

📝 Professional Skills

  • Technical reporting
  • Academic poster design
  • Structured documentation
  • Scientific analysis & evaluation

✨ Thank you for exploring my MLPC repository!
