# cudagrad

Tensor-valued autograd engine for Python

# Warnings 🐲🐉

This is an experimental learning project and will be unstable until version 1.0.0, as per item 4 of the [SemVer specification](https://semver.org/):

> Anything MAY change at any time. The public API SHOULD NOT be considered stable.

In [1]:
import cudagrad

cudagrad.__version__

'0.0.46'

Long-term, only nvcc will be supported.

## 0.0.48+ – nvcc only

These are **broken** experiments as work is done to make pybind11 use nvcc.

## 0.0.47 – gcc or clang

Support for gcc was added to make the transition to nvcc easier, as nvcc uses gcc as the host compiler on Ubuntu.

### Broken on newer versions of pip

A change to pip turned a runtime warning into a runtime error during installation. The cause is an (unused) external dependency declared outside of `pyproject.toml`.

### Broken if nvcc command is not found

While you only truly need gcc (or clang), installation runs an unneeded check (Python `which`) to see if nvcc is present. The check can be bypassed by making a dummy nvcc command:

```sh
# Put /usr/local/bin on PATH, then create an executable dummy nvcc script there
echo 'export PATH=$PATH:/usr/local/bin' >> ~/.bashrc \
  && source ~/.bashrc \
  && echo -e '#!/bin/bash\necho "Dummy nvcc command"' > /usr/local/bin/nvcc \
  && chmod +x /usr/local/bin/nvcc
```
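For context, a presence check like the one described above is commonly written with Python's `shutil.which`. The sketch below illustrates that general technique only. It is an assumption, not cudagrad's actual install script, and `find_compiler` with its candidate list is a hypothetical name.

```python
import shutil


def find_compiler(candidates=("nvcc", "gcc", "clang")):
    """Return (name, path) for the first candidate found on PATH, else None.

    Hypothetical helper for illustration; not part of cudagrad.
    """
    for name in candidates:
        path = shutil.which(name)  # None when the command is not on PATH
        if path is not None:
            return name, path
    return None


found = find_compiler()
if found is None:
    print("no supported compiler found on PATH")
else:
    print(f"using {found[0]} at {found[1]}")
```

A check that falls back through several compilers like this would not need the dummy `nvcc` workaround, since gcc or clang alone would satisfy it.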

## 0.0.46- – clang only

Can only be installed with clang, not gcc. Tested on:

In [2]:
from os import system

system("clang --version")
system("python --version")
system("pip --version")

Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin22.3.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Python 3.11.1
pip 23.1.2 from /Users/ryan/.pyenv/versions/3.11.1/lib/python3.11/site-packages/pip (python 3.11)


0
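The version notes above can be summarized as a small lookup. This is only a sketch of the ranges stated in this README; `parse_version` and `supported_compilers` are hypothetical helpers, not part of cudagrad, and they assume plain `major.minor.patch` version strings.

```python
def parse_version(version):
    """Turn a 'major.minor.patch' string into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def supported_compilers(version):
    """Map a cudagrad version string to the compilers listed in this README."""
    v = parse_version(version)
    if v <= (0, 0, 46):
        return ("clang",)
    if v == (0, 0, 47):
        return ("gcc", "clang")
    return ("nvcc",)  # 0.0.48+, described above as broken experiments


for version in ("0.0.46", "0.0.47", "0.0.48"):
    print(version, supported_compilers(version))
```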


# Installation

[Available on PyPI](https://pypi.org/project/cudagrad/), use `pip install cudagrad` to install.

As a warning, NVIDIA's [`nvcc`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html) compiler must be installed on the system for `pip install cudagrad` to work, as cudagrad is a [C++ extension to Python](https://docs.python.org/3/extending/building.html) (using [pybind11](https://github.com/pybind/pybind11)).

# Tensor

cudagrad tensors are like PyTorch tensors, except:

- Tensors only use `float32`
- Tensors `requires_grad` by default
- The `Tensor` constructor takes two lists instead of a nested list: `cg.Tensor([size], [data])`

## Tensor `__init__`

The data list is loaded in [row-major order](https://en.wikipedia.org/wiki/Row-_and_column-major_order) (left to right, top to bottom).

In [3]:
from cudagrad import Tensor

T = Tensor([2, 1], range(2))
T

<cudagrad.Tensor([2, 1, ], [0, 1, ]) object at 0x10382ea20>

Great! We made a tensor that is a column matrix with the values 0 and 1. This would be the same as the following in PyTorch, for example:

In [4]:
import torch

torch.tensor([[0], [1]], dtype=torch.float32, requires_grad=True)

tensor([[0.],
        [1.]], requires_grad=True)
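To make the row-major layout concrete, here is a minimal pure-Python sketch of how a flat data list maps onto a size list. Plain lists stand in for tensors, and `flat_index` and `unflatten` are hypothetical helpers for illustration, not cudagrad functions.

```python
import math


def flat_index(size, index):
    """Row-major offset of a multi-dimensional index into the flat data list."""
    offset = 0
    for dim, i in zip(size, index):
        if not 0 <= i < dim:
            raise IndexError(f"index {i} out of range for dimension of size {dim}")
        offset = offset * dim + i  # the last dimension varies fastest
    return offset


def unflatten(size, data):
    """Rebuild the nested-list view of flat row-major data."""
    data = list(data)
    if len(data) != math.prod(size):
        raise ValueError("data length does not match size")
    # Group by the innermost dimension first, then work outwards.
    for dim in reversed(size[1:]):
        data = [data[i:i + dim] for i in range(0, len(data), dim)]
    return data


print(unflatten([2, 1], range(2)))    # the column matrix built above
print(flat_index([2, 3], [1, 2]))     # row 1, column 2 of a 2x3 matrix
```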

If we `print` this tensor, two matrices are printed: first the `data`, then the `grad`:

In [5]:
print(T)

[[0],
 [1]]
[[0],
 [0]]


Far fewer operations are supported than in PyTorch, but I plan to grow this over time. At the moment, some basics are supported:

In [6]:
loss = (T + T).sum()
loss

<cudagrad.Tensor([1, ], [2, ]) object at 0x133945168>

You might be wondering why I show the address of the Tensor object, unlike PyTorch. It's helpful for debugging, and I use it myself for cudagrad's development.

In [7]:
loss.graph()

0x133945168 s
  0x1339450a8 +
    0x10382ea20  
    0x10382ea20  


I'm a big fan of introspection.
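The indented listing above can be reproduced in spirit with a short recursive walk over a computation graph. This is a sketch only: `Node` and `graph_lines` are hypothetical stand-ins, not cudagrad's internals, and the addresses come from Python's `id` rather than from C++ objects.

```python
class Node:
    """A graph node holding an operation label and its input nodes."""

    def __init__(self, op="", children=()):
        self.op = op
        self.children = tuple(children)


def graph_lines(node, depth=0):
    """Return one line per node: two spaces of indent per level, address, op."""
    lines = [f"{'  ' * depth}{hex(id(node))} {node.op}"]
    for child in node.children:
        lines.extend(graph_lines(child, depth + 1))
    return lines


leaf = Node()                                   # plays the role of T
total = Node("s", [Node("+", [leaf, leaf])])    # plays the role of (T + T).sum()
print("\n".join(graph_lines(total)))
```

Because the same leaf object is used twice, its address appears on two lines, which is the same effect as the repeated `0x10382ea20` in the output above.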

## Tensor Methods

Below is some gross stuff to turn the `help` into a string.

In [8]:
import contextlib
import io
import re

with io.StringIO() as buf, contextlib.redirect_stdout(buf):
    help(Tensor)
    HELP = re.split("-{5,}", buf.getvalue())
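As a possibly tidier alternative, the standard library's `pydoc.render_doc` returns the same help text as a string without redirecting stdout. The sketch below uses `fractions.Fraction` as a stand-in class so that it runs without cudagrad installed; applying it to `Tensor` instead is an untested assumption.

```python
import pydoc
import re
from fractions import Fraction

# render_doc builds the text that help() would print; the plaintext renderer
# avoids the overstrike characters that help() uses for bold text.
help_text = pydoc.render_doc(Fraction, renderer=pydoc.plaintext)

# Same section split as in the cell above: pydoc separates sections with dashes.
sections = re.split("-{5,}", help_text)

# Collect method signatures that take `self`, stripping the " |  " gutter.
signatures = [
    line.lstrip(" |").strip()
    for line in help_text.splitlines()
    if "(self" in line
]
print(len(sections), signatures[:3])
```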

In [9]:
[x[2:].strip() for x in HELP[0].splitlines() if "(self:" in x]

['__add__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 '__getitem__(self: cudagrad.tensor.Tensor, arg0: List[int]) -> cudagrad.tensor.Tensor',
 '__init__(self: cudagrad.tensor.Tensor, arg0: List[int], arg1: List[float]) -> None',
 '__matmul__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 '__mul__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 '__repr__(self: cudagrad.tensor.Tensor) -> str',
 '__setitem__(self: cudagrad.tensor.Tensor, arg0: List[int], arg1: float) -> None',
 '__str__(self: cudagrad.tensor.Tensor) -> str',
 '__sub__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 '__truediv__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 'backward(self: cudagrad.tensor.Tensor) -> None',
 'foo(self: int) -> int',
 'get_shared(self: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tenso

Right now this includes the bare bones needed to make a multi-layer perceptron:

In [10]:
a = Tensor([2, 2], [2.0, 3.0, 4.0, 5.0])
b = Tensor([2, 2], [6.0, 7.0, 8.0, 9.0])
c = Tensor([2, 2], [10.0, 10.0, 10.0, 10.0])
d = Tensor([2, 2], [11.0, 11.0, 11.0, 11.0])
e = Tensor.relu(((a @ b) + c) * d)
f = e.sum()
f.backward()

print(f.data[[0]])  # awful I know, working on it!
print(f.size)
print(a.grad)
print(b.grad)

[2794]
[0]
[1]
[143.0, 187.0, 143.0, 187.0]
[66.0, 66.0, 88.0, 88.0]


This is what that would look like using PyTorch:

In [11]:
at = torch.tensor(((2.0, 3.0), (4.0, 5.0)), requires_grad=True)
bt = torch.tensor(((6.0, 7.0), (8.0, 9.0)), requires_grad=True)
ct = torch.tensor(((10.0, 10.0), (10.0, 10.0)), requires_grad=True)
dt = torch.tensor(((11.0, 11.0), (11.0, 11.0)), requires_grad=True)
et = torch.relu(((at @ bt) + ct) * dt)
ft = et.sum()
ft.backward()

print(ft.data)
print(ft.size())
print(at.grad)
print(bt.grad)

tensor(2794.)
torch.Size([])
tensor([[143., 187.],
        [143., 187.]])
tensor([[66., 66.],
        [88., 88.]])
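The numbers above can also be checked by hand. For `f = sum(relu((a @ b + c) * d))`, the gradient flowing into `a @ b` is `d` wherever the ReLU input is positive, so `a.grad` is that gradient times `b` transposed and `b.grad` is `a` transposed times that gradient. The sketch below verifies this with plain nested lists; `matmul` and `transpose` are small helpers written for this example, not cudagrad or PyTorch functions.

```python
def matmul(x, y):
    """Matrix product of two nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
        for i in range(len(x))
    ]


def transpose(x):
    """Swap rows and columns of a nested list."""
    return [list(row) for row in zip(*x)]


a = [[2.0, 3.0], [4.0, 5.0]]
b = [[6.0, 7.0], [8.0, 9.0]]
c = [[10.0, 10.0], [10.0, 10.0]]
d = [[11.0, 11.0], [11.0, 11.0]]

# Forward pass: relu((a @ b + c) * d), then sum everything.
ab = matmul(a, b)
scaled = [
    [(ab[i][j] + c[i][j]) * d[i][j] for j in range(2)]
    for i in range(2)
]
f = sum(max(v, 0.0) for row in scaled for v in row)

# Backward pass: sum passes 1 to every element, relu passes it where the
# input is positive, and the multiply by d scales it by d.
upstream = [
    [d[i][j] if scaled[i][j] > 0 else 0.0 for j in range(2)]
    for i in range(2)
]
a_grad = matmul(upstream, transpose(b))
b_grad = matmul(transpose(a), upstream)

print(f)
print(a_grad)
print(b_grad)
```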


## Tensor static methods

In [12]:
[x[2:].strip() for x in HELP[1].splitlines() if "(arg0:" in x]

['explode(arg0: List[int], arg1: float) -> cudagrad.tensor.Tensor',
 'ones(arg0: List[int]) -> cudagrad.tensor.Tensor',
 'rand(arg0: List[int]) -> cudagrad.tensor.Tensor',
 'zeros(arg0: List[int]) -> cudagrad.tensor.Tensor']

These turn out to be very helpful. `explode` is the only way to broadcast at the moment:

In [13]:
Tensor.explode([2], 4.2)

<cudagrad.Tensor([2, ], [4.2, 4.2, ]) object at 0x1223d1de8>
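To illustrate why filling a tensor with one value can substitute for broadcasting, here is a pure-Python sketch in which `(size, data)` pairs stand in for tensors. The `explode` and `add` functions below are hypothetical re-creations for illustration and make no claim about how cudagrad implements them.

```python
import math


def explode(size, value):
    """Return a (size, data) pair with every element equal to value."""
    return list(size), [float(value)] * math.prod(size)


def add(lhs, rhs):
    """Elementwise add of two (size, data) pairs; sizes must match exactly."""
    (lsize, ldata), (rsize, rdata) = lhs, rhs
    if lsize != rsize:
        raise ValueError(f"size mismatch: {lsize} vs {rsize}")
    return lsize, [x + y for x, y in zip(ldata, rdata)]


t = ([2, 2], [2.0, 3.0, 4.0, 5.0])

# Without broadcasting, adding the scalar 10 means first exploding it to t's size.
shifted = add(t, explode(t[0], 10.0))
print(shifted)
```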

## Tensor readonly properties

In [14]:
[x[2:].strip() for x in HELP[2].splitlines()[2:] if x[2:].strip() != ""]

['data', 'grad', 'size']


# Neural Networks

The neural networks this project provides will be written purely in Python, using only the `cudagrad.Tensor`. Ideally, this helps improve the `Tensor` class over time.

Please see the [GitHub repository](https://github.com/yrmo/cudagrad) for examples of its current capabilities (flexing).