The neural networks included in [cudagrad](https://github.com/yrmo/cudagrad) are written purely in Python, using only `cudagrad.Tensor` for learning. Ideally, this helps improve the `Tensor` class over time. See the repository for examples of the neural networks that can currently be built with `Tensor` ([flexing](https://youtu.be/VMj-3S1tku0?t=271s)).

# Warning 🐲🐉

This is an experimental learning project and will be unstable until version 1.0.0, as per item 4 of [SemVer](https://semver.org/):

> Major version zero (0.y.z) is for initial development. Anything MAY change at any time. The public API SHOULD NOT be considered stable.

In [1]:
import cudagrad

cudagrad.__version__

'0.0.52'

# Installation

Available on [PyPI](https://pypi.org/project/cudagrad/) using `pip install cudagrad`.

As cudagrad is a [C++ extension to Python](https://docs.python.org/3/extending/building.html) (using [pybind11](https://github.com/pybind/pybind11)) that builds from source at installation time, you need to have a C++ compiler. Currently both `clang` and `gcc` are supported, but in the future, installation will also require NVIDIA's CUDA C++ compiler [`nvcc`](https://docs.nvidia.com/cuda/cuda-compiler-driver-nvcc/index.html), as well as `cmake`.

In [2]:
from os import system
from shutil import which

system("c++ --version")
system("python --version")
system("pip --version")

if which("cmake") and which("nvcc"):
    system("cmake --version")
    system("nvcc --version")

Apple clang version 14.0.3 (clang-1403.0.22.14.1)
Target: arm64-apple-darwin23.2.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin
Python 3.11.1
pip 23.1.2 from /Users/ryan/.pyenv/versions/3.11.1/lib/python3.11/site-packages/pip (python 3.11)


# Example

In [3]:
from cudagrad import Tensor

a = Tensor([2, 2], [2.0, 3.0, 4.0, 5.0])
b = Tensor([2, 2], [6.0, 7.0, 8.0, 9.0])
c = Tensor([2, 2], [10.0, 10.0, 10.0, 10.0])
d = Tensor([2, 2], [11.0, 11.0, 11.0, 11.0])
e = Tensor.relu(((a @ b) + c) * d)
f = e.sum()
f.backward()

print(f.data())
print(f.size)
print(a.grad())
print(b.grad())

[2794.0]
[1]
[143.0, 187.0, 143.0, 187.0]
[66.0, 66.0, 88.0, 88.0]


This is what that would look like using PyTorch:

In [4]:
from torch import tensor, relu

at = tensor(((2.0, 3.0), (4.0, 5.0)), requires_grad=True)
bt = tensor(((6.0, 7.0), (8.0, 9.0)), requires_grad=True)
ct = tensor(((10.0, 10.0), (10.0, 10.0)), requires_grad=True)
dt = tensor(((11.0, 11.0), (11.0, 11.0)), requires_grad=True)
et = relu(((at @ bt) + ct) * dt)
ft = et.sum()
ft.backward()

print(ft.data)
print(ft.size())
print(at.grad)
print(bt.grad)

tensor(2794.)
torch.Size([])
tensor([[143., 187.],
        [143., 187.]])
tensor([[66., 66.],
        [88., 88.]])
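
Both results can be checked by hand. Every entry of `(a @ b + c) * d` is positive here, so the ReLU passes gradients through unchanged and the gradient reaching `a @ b` is simply `d`. That gives `grad_a = d @ bᵀ` and `grad_b = aᵀ @ d`. A minimal pure-Python sketch of that calculation, where `matmul` and `transpose` are hypothetical helpers written for illustration and belong to neither cudagrad nor PyTorch:

```python
def matmul(x, y):
    """Multiply two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))]
            for i in range(len(x))]


def transpose(x):
    """Swap rows and columns of a nested-list matrix."""
    return [list(row) for row in zip(*x)]


a = [[2.0, 3.0], [4.0, 5.0]]
b = [[6.0, 7.0], [8.0, 9.0]]
c = [[10.0, 10.0], [10.0, 10.0]]
d = [[11.0, 11.0], [11.0, 11.0]]

# Forward pass: f = sum(relu((a @ b + c) * d))
ab = matmul(a, b)
pre = [[(ab[i][j] + c[i][j]) * d[i][j] for j in range(2)] for i in range(2)]
e = [[max(0.0, v) for v in row] for row in pre]
f = sum(sum(row) for row in e)

# Gradient reaching a @ b: d where the ReLU input is positive, else 0
g = [[d[i][j] if pre[i][j] > 0 else 0.0 for j in range(2)] for i in range(2)]
grad_a = matmul(g, transpose(b))
grad_b = matmul(transpose(a), g)

print(f)       # 2794.0
print(grad_a)  # [[143.0, 187.0], [143.0, 187.0]]
print(grad_b)  # [[66.0, 66.0], [88.0, 88.0]]
```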


# Tensor

Tensors in cudagrad are like PyTorch tensors, except:

- Tensors only use `float32`
- Tensors have `requires_grad` enabled by default
- The `Tensor` constructor takes two lists instead of a nested list: `cudagrad.Tensor([size], [data])`
  
Known limitations:

- Implicit broadcasting of tensors of rank > 2 during backpropagation has not yet been implemented, and will raise a runtime error

## Tensor `__init__`

The data list is loaded in [row-major order](https://en.wikipedia.org/wiki/Row-_and_column-major_order) (left to right, top to bottom):

In [5]:
from cudagrad import Tensor

t = Tensor([2, 1], range(2))
t

<Tensor([2, 1, ], [0, 1, ]) object at 0x169d95b40 DefaultBackward>

Great! We made a tensor that is a column matrix with the values 0 and 1. In PyTorch, for example, this would be the same as:

In [6]:
from torch import tensor, float32

tensor([[0], [1]], dtype=float32, requires_grad=True)

tensor([[0.],
        [1.]], requires_grad=True)
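
To make the row-major layout concrete, the sketch below arranges a flat data list into nested rows for a two-dimensional size. `unflatten` is a hypothetical helper for illustration only, not a cudagrad function:

```python
def unflatten(size, data):
    """Arrange a flat, row-major list into nested rows for a 2-D size."""
    rows, cols = size
    data = [float(x) for x in data]
    if len(data) != rows * cols:
        raise ValueError("data length does not match size")
    # Element (i, j) lives at flat position i * cols + j
    return [data[i * cols:(i + 1) * cols] for i in range(rows)]


print(unflatten([2, 1], range(2)))              # [[0.0], [1.0]]
print(unflatten([2, 2], [2.0, 3.0, 4.0, 5.0]))  # [[2.0, 3.0], [4.0, 5.0]]
```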

## Tensor Intro

If we `print` this tensor, two matrices are printed: first the `data`, then the `grad`:

In [7]:
print(t)

[[0],
 [1]]
[[0],
 [0]]


Far fewer operations are supported than in PyTorch, though I plan to grow the list over time. For now, the basics work:

In [8]:
loss = (t + t).sum()
loss

<Tensor([1, ], [2, ]) object at 0x118159538 SumBackward>

You might be wondering why I show the address of the Tensor object, unlike PyTorch. It's helpful for debugging, and I use it myself during cudagrad's development.

In [9]:
loss.graph()

0x118159538 SumBackward
  0x118194b78 AddBackward
    0x169d95b40  
    0x169d95b40  


I'm a big fan of introspection.
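
The repeated address in the last two lines of the graph output is what makes it useful: both operands of the addition are the same object, `t`. A rough sketch of how such a printout could be produced, using a hypothetical `Node` class and Python's `id` rather than cudagrad's actual implementation:

```python
class Node:
    """Hypothetical stand-in for a tensor in a computation graph."""

    def __init__(self, op="", children=()):
        self.op = op                    # backward function name, "" for leaves
        self.children = list(children)  # the inputs this node was computed from

    def graph(self, depth=0):
        """Return the graph as indented lines, one per node, parents first."""
        lines = ["  " * depth + f"{hex(id(self))} {self.op}"]
        for child in self.children:
            lines.extend(child.graph(depth + 1))
        return lines


t = Node()
loss = Node("SumBackward", [Node("AddBackward", [t, t])])
print("\n".join(loss.graph()))
```

The addresses differ from run to run, but the two leaf lines always match each other because they refer to the same `t`.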

Below is some gross stuff to turn the `help` into a string:

In [10]:
import contextlib
import io
import re

with io.StringIO() as buf, contextlib.redirect_stdout(buf):
    help(Tensor)
    HELP = re.split("-{5,}", buf.getvalue())

## Tensor Methods

Right now this includes the bare bones needed to make a multilayer perceptron:

In [11]:
[x[2:].strip() for x in HELP[0].splitlines() if "(self:" in x]

['__add__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 '__init__(self: cudagrad.tensor.Tensor, arg0: List[int], arg1: List[float]) -> None',
 '__matmul__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 '__mul__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 '__repr__(self: cudagrad.tensor.Tensor) -> str',
 '__str__(self: cudagrad.tensor.Tensor) -> str',
 '__sub__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 '__truediv__(self: cudagrad.tensor.Tensor, arg0: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 'backward(self: cudagrad.tensor.Tensor) -> None',
 'get_shared(self: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 'graph(self: cudagrad.tensor.Tensor) -> None',
 'item(self: cudagrad.tensor.Tensor) -> float',
 'relu(self: cudagrad.tensor.Tensor) -> cudagrad.tensor.Tensor',
 'sigmoid(self: cudagrad.tensor.Te

In [12]:
t + t

<Tensor([2, 1, ], [0, 2, ]) object at 0x169d9fcf8 AddBackward>

In [13]:
# FIXME repr truncates
print(t @ Tensor([1, 2], range(2)))

[[0, 0],
 [0, 1]]
[[0, 0],
 [0, 0]]


In [14]:
t * t

<Tensor([2, 1, ], [0, 1, ]) object at 0x118159608 MulBackward>

In [15]:
t - t

<Tensor([2, 1, ], [0, 0, ]) object at 0x11925a768 MinusBackward>

In [16]:
# FIXME nan
t / t

<Tensor([2, 1, ], [nan, 1, ]) object at 0x1192ea538 DivBackward>

In [17]:
t.backward

<bound method PyCapsule.backward of <Tensor([2, 1, ], [0, 1, ]) object at 0x169d95b40 DefaultBackward>>

In [18]:
# FIXME remove binding
t.get_shared()

<Tensor([2, 1, ], [0, 1, ]) object at 0x169d95b40 DefaultBackward>

In [19]:
# FIXME non scalar tensor item shouldn't be allowed
#       why does this return data?
t.item()

0.0

In [20]:
t.data[[0, 0]]

<Tensor([1, ], [0, ]) object at 0x1192ea608 SelectBackward>

In [21]:
t.data[[0, 0]].item(), t.data[[1, 0]].item()

(0.0, 1.0)

In [22]:
# FIXME bounds check?
t.data[[1, 1]].item()

0.0
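
One plausible reading of the result above, assuming indices are flattened in row-major order without validation (an assumption; nothing here confirms how cudagrad indexes internally): for size `[2, 1]`, index `[1, 1]` maps to flat position 2, one past the end of the two-element data. A hypothetical `flat_index` helper with the missing bounds check might look like this:

```python
def flat_index(size, index):
    """Map a multi-dimensional index to a row-major flat offset, with bounds checks."""
    if len(index) != len(size):
        raise IndexError(f"expected {len(size)} indices, got {len(index)}")
    offset = 0
    for dim, i in zip(size, index):
        if not 0 <= i < dim:
            raise IndexError(f"index {i} out of range for dimension of size {dim}")
        offset = offset * dim + i
    return offset


print(flat_index([2, 1], [0, 0]))  # 0
print(flat_index([2, 1], [1, 0]))  # 1

try:
    flat_index([2, 1], [1, 1])
except IndexError as err:
    print(err)  # index 1 out of range for dimension of size 1
```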

In [23]:
Tensor.relu(Tensor([2], [-0.5, 0.5]))

<Tensor([2, ], [0, 0.5, ]) object at 0x118195728 ReluBackward>

In [24]:
Tensor.sigmoid(Tensor([2], [-0.5, 0.5]))

<Tensor([2, ], [0.377541, 0.622459, ]) object at 0x1192e9168 SigmoidBackward>
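
Both activations are applied elementwise. As a sanity check on the values above, here is a pure-Python sketch using only the standard `math` module (the `relu` and `sigmoid` functions below are illustrative, not cudagrad's):

```python
import math


def relu(x):
    """max(0, x): negative inputs are clamped to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Logistic function 1 / (1 + e^-x)."""
    return 1.0 / (1.0 + math.exp(-x))


xs = [-0.5, 0.5]
print([relu(x) for x in xs])               # [0.0, 0.5]
print([round(sigmoid(x), 6) for x in xs])  # [0.377541, 0.622459]
```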

In [25]:
t.sum()

<Tensor([1, ], [1, ]) object at 0x1192e9238 SumBackward>

In [26]:
t.grad[[1, 0]] = 4.2

print(t)

[[0],
 [1]]
[[0],
 [4.2]]


In [27]:
t.zero_grad()
print(t)

[[0],
 [1]]
[[0],
 [0]]


## Tensor static methods

In [28]:
[x[2:].strip() for x in HELP[1].splitlines() if "(arg0:" in x]

['explode(arg0: List[int], arg1: float) -> cudagrad.tensor.Tensor',
 'ones(arg0: List[int]) -> cudagrad.tensor.Tensor',
 'rand(arg0: List[int]) -> cudagrad.tensor.Tensor',
 'zeros(arg0: List[int]) -> cudagrad.tensor.Tensor']

These turn out to be very helpful. In particular, `explode` is the only way to broadcast at the moment:

In [29]:
Tensor.zeros([2])

<Tensor([2, ], [0, 0, ]) object at 0x12e24fa38 DefaultBackward>

In [30]:
Tensor.ones([2])

<Tensor([2, ], [1, 1, ]) object at 0x1192fea98 DefaultBackward>

In [31]:
Tensor.rand([2])

<Tensor([2, ], [0.798007, 0.938286, ]) object at 0x12a06f3f8 DefaultBackward>

Notice that `explode` has a slightly different signature, taking a fill value as its second argument:

In [32]:
Tensor.explode([2], 4.2)

<Tensor([2, ], [4.2, 4.2, ]) object at 0x169da9238 DefaultBackward>
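
Because `explode` fills a tensor of a given size with one value, it can stand in for scalar broadcasting: to scale a tensor by a constant, build a same-sized tensor of that constant and multiply elementwise. A pure-Python sketch of the idea on flat lists, where `explode` and `mul` are hypothetical stand-ins that mimic the behaviour shown above rather than calls into cudagrad:

```python
from math import prod


def explode(size, value):
    """Return a flat list holding prod(size) copies of value."""
    return [float(value)] * prod(size)


def mul(x, y):
    """Elementwise product of two equal-length flat lists."""
    if len(x) != len(y):
        raise ValueError("sizes must match; no implicit broadcasting")
    return [p * q for p, q in zip(x, y)]


data = [2.0, 3.0, 4.0, 5.0]  # a [2, 2] tensor in row-major order
scaled = mul(data, explode([2, 2], 0.5))
print(scaled)  # [1.0, 1.5, 2.0, 2.5]
```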

## Tensor readonly properties

In [33]:
[x[2:].strip() for x in HELP[2].splitlines()[2:] if x[2:].strip() != ""]

['data', 'grad', 'size']

In [34]:
t.data()

[0.0, 1.0]

In [35]:
t.grad()

[0.0, 0.0]

In [36]:
t.size

[2, 1]

In [37]:
type(t.size), t.size[0], t.size[1]

(list, 2, 1)