MiniML Advanced

A Machine Learning and Deep Learning Library Built Completely From Scratch Using NumPy.


Overview

MiniML Advanced is a modular machine learning library implemented from scratch in NumPy. Its core algorithms do not rely on high-level ML frameworks such as scikit-learn.

The project was built to deeply understand the mathematical and engineering foundations behind modern machine learning and deep learning systems.

It includes implementations of:

  • Classical Machine Learning Algorithms
  • Clustering Algorithms
  • Dimensionality Reduction
  • Neural Networks
  • Optimizers
  • Backpropagation Engine
  • Visualization Utilities
  • Benchmarking Infrastructure
  • CLI Tooling

The project focuses not only on algorithm implementation, but also on:

  • Software Architecture
  • Modular Design
  • Numerical Stability
  • Optimization
  • Vectorized Computation
  • Testing and Benchmarking
  • ML Engineering Practices

Installation

Install from PyPI

pip install miniml-adv

Install from Source

git clone https://github.com/Akshay9715/miniML

cd miniML

pip install -r requirements.txt

pip install -e .

Features

Classical Machine Learning

  • Linear Regression
  • Logistic Regression
  • K-Nearest Neighbors (KNN)

Clustering Algorithms

  • KMeans
  • DBSCAN
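
As a rough illustration of the clustering side, KMeans is commonly implemented with Lloyd's algorithm: assign each point to its nearest centroid, then move each centroid to the mean of its members. The sketch below is a minimal NumPy version written for this README, not MiniML's actual code. The function name `kmeans` and its parameters are assumptions made for the example.

```python
import numpy as np


def kmeans(X, n_clusters, n_iters=100, seed=0):
    """Lloyd's algorithm (illustrative sketch, not MiniML's implementation)."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialise centroids from distinct random samples.
    centroids = X[rng.choice(len(X), size=n_clusters, replace=False)]
    for _ in range(n_iters):
        # Pairwise squared distances, shape (n_samples, n_clusters).
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for k in range(n_clusters):
            members = X[labels == k]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[k] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break  # assignments have stopped changing the centroids
        centroids = new_centroids
    return centroids, labels


# Two well-separated synthetic blobs (made-up data for the example).
rng = np.random.default_rng(1)
blob_a = rng.normal(loc=0.0, scale=0.3, size=(50, 2))
blob_b = rng.normal(loc=5.0, scale=0.3, size=(50, 2))
X = np.vstack([blob_a, blob_b])
centroids, labels = kmeans(X, n_clusters=2)
```

The distance computation uses broadcasting instead of Python loops over samples, which is the kind of vectorization the project emphasizes.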

Dimensionality Reduction

  • PCA (Principal Component Analysis)
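
PCA is typically built from the covariance matrix and its eigen decomposition, both of which appear in the list of mathematical concepts this project implements. The following is a minimal sketch under that assumption. The function name `pca_project` and its return values are hypothetical and may not match MiniML's API.

```python
import numpy as np


def pca_project(X, n_components):
    """Project X onto its top principal components (illustrative sketch)."""
    X_centered = X - X.mean(axis=0)
    cov = np.cov(X_centered, rowvar=False)  # (n_features, n_features)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    components = eigvecs[:, order]  # columns are the principal axes
    explained_ratio = eigvals[order] / eigvals.sum()
    return X_centered @ components, components, explained_ratio


# Synthetic points lying close to the line y = 2x (made-up data).
rng = np.random.default_rng(0)
t = rng.normal(size=300)
X = np.column_stack([t, 2.0 * t]) + 0.01 * rng.normal(size=(300, 2))
projected, components, explained_ratio = pca_project(X, n_components=1)
```

For this data the first principal axis should point along (1, 2) up to sign, and it should account for nearly all of the variance.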

Neural Network Engine

  • Dense Layers
  • Forward Propagation
  • Backpropagation
  • Binary Cross Entropy
  • Mean Squared Error
  • Gradient Descent
  • Mini-Batch Training

Optimizers

  • SGD
  • Momentum
  • Adam
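
To make the optimizer list more concrete, here is a minimal sketch of the standard Adam update rule for a single parameter array. The class name `AdamSketch` and its `step` method are assumptions for illustration. MiniML's own `Adam` class may be structured differently.

```python
import numpy as np


class AdamSketch:
    """Standard Adam update for one parameter array (illustrative only)."""

    def __init__(self, learning_rate=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
        self.learning_rate = learning_rate
        self.beta1, self.beta2, self.eps = beta1, beta2, eps
        self.m = None  # first-moment (mean) estimate
        self.v = None  # second-moment (uncentered variance) estimate
        self.t = 0     # step counter used for bias correction

    def step(self, param, grad):
        if self.m is None:
            self.m = np.zeros_like(param)
            self.v = np.zeros_like(param)
        self.t += 1
        self.m = self.beta1 * self.m + (1 - self.beta1) * grad
        self.v = self.beta2 * self.v + (1 - self.beta2) * grad ** 2
        # Bias correction compensates for the zero initialisation of m and v.
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        return param - self.learning_rate * m_hat / (np.sqrt(v_hat) + self.eps)


# Minimise f(w) = ||w - target||^2, whose gradient is 2 * (w - target).
target = np.array([1.0, -2.0])
w = np.zeros(2)
opt = AdamSketch(learning_rate=0.1)
for _ in range(500):
    w = opt.step(w, 2.0 * (w - target))
```

The per-parameter state (`m`, `v`, `t`) is what distinguishes Adam from plain SGD, which needs no state, and from Momentum, which keeps only a velocity term.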

Engineering Features

  • Visualization Tools
  • Benchmarking System
  • Command-Line Interface (CLI)
  • Automated Testing
  • Modular Architecture
  • Vectorized Computation
  • Numerical Stability Improvements
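
Two common numerical stability measures in this kind of library are a sigmoid that avoids overflow and a cross-entropy loss that avoids log(0). The sketch below shows one way to write both. The helper names `stable_sigmoid` and `binary_cross_entropy` and the clipping constant are assumptions for the example, not necessarily what MiniML uses.

```python
import numpy as np


def stable_sigmoid(z):
    """Sigmoid for arrays that never exponentiates a large positive number."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    exp_z = np.exp(z[~pos])  # z < 0 here, so exp cannot overflow
    out[~pos] = exp_z / (1.0 + exp_z)
    return out


def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean BCE with probabilities clipped away from 0 and 1."""
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)
    return float(-np.mean(losses))


# Extreme logits that would overflow a naive 1 / (1 + exp(-z)) for z = -1000.
logits = np.array([-1000.0, 0.0, 1000.0])
probs = stable_sigmoid(logits)
loss = binary_cross_entropy(np.array([1.0, 1.0, 0.0]), probs)
```

Even with saturated probabilities of exactly 0 and 1, the clipped loss stays finite, so gradients computed from it do not turn into NaN or infinity.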

Quick Start

Linear Regression Example

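import numpy as np

from miniml.linear_regression import LinearRegression

# Illustrative data following y = 2x + 1; adjust shapes if the library expects otherwise.
X = np.array([[1.0], [2.0], [3.0], [4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])

model = LinearRegression(
    learning_rate=0.01,
    epochs=1000
)

model.fit(X, y)

predictions = model.predict(X)

score = model.score(X, y)

print(score)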
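
The `learning_rate` and `epochs` parameters suggest that `fit` runs gradient descent. As a hedged sketch of what that could look like, the code below performs batch gradient descent on mean squared error and computes an R² score. Both the function names and the choice of R² for `score` are assumptions for illustration. The library's internals may differ.

```python
import numpy as np


def fit_linear_gd(X, y, learning_rate=0.01, epochs=1000):
    """Batch gradient descent on mean squared error (illustrative only)."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(epochs):
        residual = X @ weights + bias - y  # shape (n_samples,)
        # Gradients of mean(residual**2) with respect to weights and bias.
        grad_w = (2.0 / n_samples) * (X.T @ residual)
        grad_b = 2.0 * residual.mean()
        weights -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weights, bias


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic data drawn around y = 3x + 2 (made up for the example).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 1))
y_demo = 3.0 * X_demo[:, 0] + 2.0 + 0.1 * rng.normal(size=200)
weights, bias = fit_linear_gd(X_demo, y_demo, learning_rate=0.05, epochs=2000)
r2 = r2_score(y_demo, X_demo @ weights + bias)
```

The recovered slope and intercept should land close to 3 and 2, with an R² near 1 because the noise is small.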

Logistic Regression Example

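import numpy as np

from miniml.logistic_regression import LogisticRegression

# Illustrative binary-labelled data; adjust shapes if the library expects otherwise.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0, 0, 1, 1])

model = LogisticRegression()

model.fit(X, y)

predictions = model.predict(X)

accuracy = model.score(X, y)

print(accuracy)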

Neural Network Example

import numpy as np

from miniml.neural_network import NeuralNetwork

from miniml.layers import Dense
from miniml.activations import ReLU, Sigmoid

from miniml.optimizers import Adam

X = np.array([
    [0, 0],
    [0, 1],
    [1, 0],
    [1, 1]
])

y = np.array([
    [0],
    [1],
    [1],
    [0]
])

model = NeuralNetwork()

model.add(Dense(2, 8))
model.add(ReLU())

model.add(Dense(8, 1))
model.add(Sigmoid())

model.compile(
    loss="binary_cross_entropy",
    optimizer=Adam(learning_rate=0.01)
)

model.fit(
    X,
    y,
    epochs=2000,
    batch_size=4
)

predictions = model.predict(X)

print(predictions)

CLI Usage

MiniML Advanced includes a command-line interface for quickly running models.

Run Linear Regression

python cli.py --model linear_regression

Run Logistic Regression

python cli.py --model logistic_regression

Run KMeans

python cli.py --model kmeans

Run Neural Network

python cli.py --model neural_network
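
A `--model` flag like the one above is usually wired up with `argparse` and a name-to-function registry. The sketch below shows that pattern. The `--model` flag and the four model names come from the commands above, while `MODEL_RUNNERS` and its placeholder functions are hypothetical and are not taken from MiniML's `cli.py`.

```python
import argparse

# Hypothetical registry: each entry stands in for a function that would
# load data, train the model, and report results.
MODEL_RUNNERS = {
    "linear_regression": lambda: "ran linear_regression demo",
    "logistic_regression": lambda: "ran logistic_regression demo",
    "kmeans": lambda: "ran kmeans demo",
    "neural_network": lambda: "ran neural_network demo",
}


def build_parser():
    parser = argparse.ArgumentParser(description="Run a demo model.")
    parser.add_argument(
        "--model",
        required=True,
        choices=sorted(MODEL_RUNNERS),  # unknown names are rejected by argparse
        help="name of the model to run",
    )
    return parser


def main(argv=None):
    """Parse arguments (sys.argv when argv is None) and run the chosen model."""
    args = build_parser().parse_args(argv)
    return MODEL_RUNNERS[args.model]()


# Passing the argument list explicitly makes the entry point easy to test.
result = main(["--model", "kmeans"])
print(result)
```

Using `choices` means a typo in the model name produces a usage error instead of a `KeyError` deep inside the script.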

Project Architecture

MiniML Advanced is organized as a set of independent, composable components, a design inspired by modern ML libraries.

Core Components

  • Layers
  • Activations
  • Loss Functions
  • Optimizers
  • Training Engine
  • Visualization System
  • Benchmarking Infrastructure
  • CLI System

Neural Network Internals

The neural network engine includes:

  • Forward Propagation
  • Backpropagation
  • Gradient Flow
  • Parameter Updates
  • Mini-Batch Training
  • Loss Computation
  • Optimizer State Management
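
The forward pass, backward pass, and gradient flow listed above can be illustrated with a single dense layer. The sketch below caches the input during the forward pass, applies the chain rule in the backward pass, and checks one gradient against a finite-difference estimate. `DenseSketch` is a hypothetical stand-in and is not MiniML's `Dense` class.

```python
import numpy as np


class DenseSketch:
    """Fully connected layer computing x @ W + b (illustrative only)."""

    def __init__(self, n_in, n_out, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.normal(scale=0.1, size=(n_in, n_out))
        self.b = np.zeros(n_out)

    def forward(self, x):
        self.x = x  # cache the input for the backward pass
        return x @ self.W + self.b

    def backward(self, grad_out):
        # Chain rule: gradients of the loss w.r.t. W, b, and the layer input.
        self.grad_W = self.x.T @ grad_out
        self.grad_b = grad_out.sum(axis=0)
        return grad_out @ self.W.T


def mse(pred, target):
    return np.mean((pred - target) ** 2)


rng = np.random.default_rng(1)
x = rng.normal(size=(5, 3))
target = rng.normal(size=(5, 2))
layer = DenseSketch(3, 2)

pred = layer.forward(x)
grad_pred = 2.0 * (pred - target) / pred.size  # d(MSE)/d(pred)
grad_x = layer.backward(grad_pred)

# Central finite-difference check of one weight gradient.
h = 1e-6
layer.W[0, 0] += h
loss_plus = mse(layer.forward(x), target)
layer.W[0, 0] -= 2 * h
loss_minus = mse(layer.forward(x), target)
layer.W[0, 0] += h  # restore the original weight
numeric_grad = (loss_plus - loss_minus) / (2 * h)
```

A gradient check like this is a cheap way to catch backpropagation bugs before training, since the analytic and numeric values should agree to several decimal places.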

Mathematical Concepts Implemented

The project implements core mathematical concepts behind machine learning systems:

  • Gradient Descent
  • Binary Cross Entropy
  • Mean Squared Error
  • Covariance Matrices
  • Eigen Decomposition
  • Distance Metrics
  • PCA Projection
  • Clustering Optimization
  • Numerical Stability
  • Vectorized Linear Algebra
  • Chain Rule and Backpropagation

Visualization Tools

MiniML Advanced includes visualization utilities for debugging and understanding models.

Available Visualizations

  • Loss Curves
  • Accuracy Curves
  • Decision Boundaries
  • Cluster Visualizations
  • PCA Projections

Benchmarking

The project includes benchmarking utilities that compare MiniML implementations against their scikit-learn counterparts.

Benchmarked Metrics

  • Training Time
  • Prediction Time
  • Accuracy
  • Memory Usage
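
Time and memory metrics like these can be collected with the standard library alone. The helper below is a minimal sketch that uses `time.perf_counter` for wall-clock time and `tracemalloc` for peak traced memory. The `benchmark` function and the toy workload are assumptions for illustration and are not the project's benchmarking code.

```python
import time
import tracemalloc


def benchmark(fn, *args, repeats=5):
    """Return (best wall-clock seconds, peak traced bytes) for fn(*args)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    # Measure memory in a separate run so tracing overhead does not skew timing.
    tracemalloc.start()
    fn(*args)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return best, peak


def build_squares(n):
    """Toy workload standing in for a model's fit or predict call."""
    return [i * i for i in range(n)]


seconds, peak_bytes = benchmark(build_squares, 100_000)
print(f"best time: {seconds:.6f} s, peak memory: {peak_bytes} bytes")
```

To compare two implementations, the same helper can be called once per implementation on identical data, taking the best of several repeats to reduce noise from other processes.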

Folder Structure

MiniML/
│
├── miniml/
│   ├── activations.py
│   ├── layers.py
│   ├── losses.py
│   ├── neural_network.py
│   ├── optimizers.py
│   ├── linear_regression.py
│   ├── logistic_regression.py
│   ├── knn.py
│   ├── kmeans.py
│   ├── dbscan.py
│   ├── pca.py
│   └── ...
│
├── benchmarks/
├── visualizations/
├── examples/
├── tests/
├── datasets/
│
├── .github/workflows/
├── cli.py
├── README.md
├── requirements.txt
├── setup.py
├── pyproject.toml
├── LICENSE
└── CHANGELOG.md

Future Improvements

Planned future additions include:

  • CNN Layers
  • Transformer Architecture
  • Automatic Differentiation Engine
  • GPU Support
  • CUDA Acceleration
  • Distributed Training
  • Model Serialization
  • Hyperparameter Tuning
  • Tensor Engine
  • Attention Mechanisms

Why This Project Matters

MiniML Advanced was built to understand the internals of machine learning systems instead of only using high-level frameworks.

This project demonstrates:

  • ML Fundamentals
  • Deep Learning Internals
  • Optimization Techniques
  • Software Engineering
  • Numerical Computing
  • Modular Architecture
  • Benchmarking and Testing
  • Engineering-Oriented ML Development

License

This project is licensed under the MIT License.
