
# MicroGrad: Optimized Minimalistic Deep Learning Library

This notebook is an optimized version of micrograd, inspired by **Andrej Karpathy's micrograd project**.
I have extended the core functionality, improved performance, and added documentation for clarity.



## Features Implemented:
- **Core Engine**: Handles reverse-mode automatic differentiation using the `Value` class.
- **Activation Functions**: Includes ReLU, Sigmoid, and Tanh.
- **Loss Functions**: Supports Mean Squared Error (MSE).
- **Optimizers**: Stochastic Gradient Descent (SGD).
- **Neural Networks**: Build and train simple neural networks.
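
As a rough illustration of the Mean Squared Error loss named above, $\mathrm{MSE} = \frac{1}{n}\sum_i (p_i - t_i)^2$, here is a minimal sketch. The function name `mse` is hypothetical and is not taken from the notebook. It assumes predictions are either plain floats or `Value` objects and that targets are plain numbers. It is written with only `+`, `*`, and `**`, the operators the `Value` class in this notebook defines, so it should also run on `Value` predictions.

```python
def mse(predictions, targets):
    """Mean squared error over two equal-length sequences (hypothetical helper)."""
    if len(predictions) != len(targets) or len(predictions) == 0:
        raise ValueError("predictions and targets must be non-empty and equal in length")
    total = None
    for p, t in zip(predictions, targets):
        diff = p + t * -1.0   # p - t, written without the subtraction operator
        sq = diff ** 2
        # Avoid sum(): it starts from the int 0, which needs a reflected add
        total = sq if total is None else total + sq
    return total * (1.0 / len(predictions))

# Plain-float check: the squared errors are 0, 0 and 4, so the mean is 4/3
loss = mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(loss)
```

With `Value` predictions the result would itself be a `Value`, so calling `backward()` on it would populate the gradients of everything the predictions were computed from.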


In [1]:

import numpy as np
import math
import random


In [2]:

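class Value:
    """A scalar that records the operations applied to it so gradients can be backpropagated."""

    def __init__(self, data, _children=(), _op='', _label=''):
        self.data = data
        self.grad = 0.0                # d(output)/d(self), filled in by backward()
        self._backward = lambda: None  # leaf nodes have nothing to propagate
        self._prev = set(_children)    # nodes this value was computed from
        self._op = _op                 # operation that produced this node
        self._label = _label           # optional name for debugging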

    def __repr__(self):
        return f"Value(data={self.data:.4f}, grad={self.grad:.4f})"
    
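    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other), '+')

        def _backward():
            # d(a + b)/da = d(a + b)/db = 1
            self.grad += 1.0 * out.grad
            other.grad += 1.0 * out.grad
        out._backward = _backward
        return out

    def __radd__(self, other):
        # Handles `2 + value`, which would otherwise raise TypeError
        return self + other

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other), '*')

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def __rmul__(self, other):
        # Handles `2 * value`, which would otherwise raise TypeError
        return self * other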

    def __pow__(self, other):
        assert isinstance(other, (int, float)), "only supporting int/float powers for now"
        out = Value(self.data ** other, (self,), f'^{other}')
        
        def _backward():
            self.grad += other * (self.data ** (other - 1)) * out.grad
        out._backward = _backward
        return out

    def tanh(self):
        # math.tanh avoids the OverflowError that math.exp(2 * x) raises for large x
        t = math.tanh(self.data)
        out = Value(t, (self,), 'tanh')

        def _backward():
            # d tanh(x)/dx = 1 - tanh(x)^2
            self.grad += (1 - t ** 2) * out.grad
        out._backward = _backward
        return out

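    def relu(self):
        # Use 0.0 so a clamped output stays a float
        out = Value(max(0.0, self.data), (self,), 'ReLU')

        def _backward():
            # The gradient passes through only where the output is positive
            self.grad += (1.0 if out.data > 0 else 0.0) * out.grad
        out._backward = _backward
        return out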

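    def backward(self):
        """Backpropagate from this node, accumulating gradients into every ancestor's `grad`."""
        topo = []
        visited = set()

        def build_topo(v):
            # Depth-first post-order: a node is appended only after all of its inputs
            if v not in visited:
                visited.add(v)
                for child in v._prev:
                    build_topo(child)
                topo.append(v)

        build_topo(self)
        self.grad = 1.0  # d(self)/d(self)
        # Walk from the output back to the inputs so each out.grad is complete before use
        for node in reversed(topo):
            node._backward()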



## Additional Functions

### Activation Functions
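- `tanh()`: Hyperbolic tangent activation. The output lies in (-1, 1) and the local derivative is 1 - tanh(x)^2.
- `relu()`: Rectified Linear Unit activation, max(0, x). The gradient passes through only where the output is positive.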
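
The feature list also names Sigmoid, although the `Value` class above defines only `tanh()` and `relu()`. As a hedged sketch of what a sigmoid would need, the hypothetical helper `sigmoid_with_grad` below works on plain floats. It returns the output together with the local derivative s(1 - s), which is the factor a `_backward` closure would multiply by `out.grad`. It is an illustration, not code from the notebook.

```python
import math

def sigmoid_with_grad(x):
    """Return (sigmoid(x), d sigmoid/dx) for a plain float x (hypothetical helper)."""
    # Branch on the sign so math.exp never receives a large positive argument
    if x >= 0:
        s = 1.0 / (1.0 + math.exp(-x))
    else:
        z = math.exp(x)
        s = z / (1.0 + z)
    return s, s * (1.0 - s)

for x in (-2.0, 0.0, 2.0):
    s, ds = sigmoid_with_grad(x)
    print(f"x={x:+.1f}  sigmoid={s:.4f}  local grad={ds:.4f}")
```

A `Value.sigmoid` method could follow the same shape as `tanh()`: compute `s`, wrap it in a new `Value`, and add `s * (1 - s) * out.grad` to `self.grad` inside `_backward`.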

### Example Usage
```python
a = Value(2.0)
b = Value(-3.0)
c = a * b
d = c + a ** 2
e = d.tanh()
e.backward()
print(e)               # Value(data=-0.9640, grad=1.0000)
print(a.grad, b.grad)  # approximately 0.0707 and 0.1413
```
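
One way to sanity-check the gradients from the example is to compare the chain-rule result with central finite differences on plain floats. The sketch below does not use the `Value` class. The helper names `f` and `numeric_grad` and the step size `h` are assumptions made for illustration.

```python
import math

def f(a, b):
    # Same expression as the example: e = tanh(a * b + a ** 2)
    return math.tanh(a * b + a ** 2)

def numeric_grad(fn, args, i, h=1e-6):
    # Central difference with respect to the i-th argument
    hi = list(args)
    lo = list(args)
    hi[i] += h
    lo[i] -= h
    return (fn(*hi) - fn(*lo)) / (2 * h)

a, b = 2.0, -3.0
d = a * b + a ** 2                  # inner value, equals -2.0
local = 1 - math.tanh(d) ** 2       # derivative of tanh at d
analytic_a = local * (b + 2 * a)    # a appears in both a * b and a ** 2
analytic_b = local * a

num_a = numeric_grad(f, (a, b), 0)
num_b = numeric_grad(f, (a, b), 1)
print(f"de/da analytic={analytic_a:.6f} numeric={num_a:.6f}")
print(f"de/db analytic={analytic_b:.6f} numeric={num_b:.6f}")
```

Because `a` feeds into both `a * b` and `a ** 2`, its gradient is the sum of two contributions. This is why each `_backward` accumulates with `+=` rather than assigning.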


In [3]:

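class SGD:
    """Plain gradient descent over a collection of `Value` parameters."""

    def __init__(self, parameters, lr=0.01):
        self.parameters = list(parameters)  # materialize so a generator can be iterated on every step
        self.lr = lr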

    def step(self):
        for p in self.parameters:
            p.data -= self.lr * p.grad

    def zero_grad(self):
        for p in self.parameters:
            p.grad = 0.0



### Optimizer: Stochastic Gradient Descent (SGD)

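This class implements plain gradient descent: `step()` applies `p.data -= lr * p.grad` to every parameter, and `zero_grad()` resets each gradient to zero. Because `Value` accumulates gradients with `+=`, call `zero_grad()` before each `backward()` pass; otherwise gradients from earlier passes are added to the new ones.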

**Usage Example**:
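```python
params = [Value(0.5), Value(-0.5)]
optimizer = SGD(params, lr=0.01)

# Toy scalar built from the parameters, for illustration only
loss = params[0] * params[1] + params[0] ** 2
optimizer.zero_grad()  # clear gradients left over from earlier passes
loss.backward()        # populate p.grad for every parameter
optimizer.step()       # p.data -= lr * p.grad
```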
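
To make the update rule concrete without the `Value` machinery, the sketch below applies the same `w -= lr * grad` step to a single plain float. The function name `sgd_minimize`, the quadratic objective, and the hyperparameters are all assumptions chosen for illustration.

```python
def sgd_minimize(grad_fn, w0, lr=0.1, steps=100):
    """Repeatedly apply w -= lr * grad_fn(w), starting from w0 (hypothetical helper)."""
    w = w0
    for _ in range(steps):
        w -= lr * grad_fn(w)
    return w

# Minimize (w - 3)^2, whose gradient is 2 * (w - 3); the minimum is at w = 3
w_star = sgd_minimize(lambda w: 2 * (w - 3.0), w0=0.0)
print(w_star)
```

For this objective each step shrinks the distance to the minimum by a factor of `1 - 2 * lr`, so a learning rate above 1.0 would make the iterates diverge rather than converge.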
