# Addition

In [1]:
import torch
import ttnn

## Open the device

Use `ttnn.open` to get a handle to the device.

In [2]:
device_id = 0
device = ttnn.open(device_id)

Metal | INFO | Initializing device 0
Device | INFO | Opening user mode device driver
2024-01-29 16:40:57.755 | INFO | SiliconDriver - Detected 1 PCI device : {0}
---- ttSiliconDevice::init_hugepage: bind_area_to_memory_nodeset() failed (physical_device_id: 0 ch: 0). Hugepage allocation is not on NumaNode matching TT Device. Side-Effect is decreased Device->Host perf (Issue #893).
Metal | INFO | AI CLK for device 0 is: 1202 MHz


## Configuration

In [3]:
h = 31
w = 22

## Initialize tensors a and b with random values using torch

To create a tensor that can be used by a `ttnn` operation:
1. Create a tensor using torch
2. Use `ttnn.from_torch` to convert the tensor from `torch.Tensor` to `ttnn.Tensor`
3. Copy the tensor onto the device using `to_device` (or pass `device=` to `ttnn.from_torch`, as the cell below does)
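The cell that follows this list creates the torch tensors with `dtype=torch.bfloat16`, which keeps only 8 significant bits per value. As a rough illustration of what that precision means, here is a pure-Python sketch that rounds a number to the nearest bfloat16 value. The helper name `to_bfloat16` is hypothetical, round-to-nearest-even is an assumption, and this is not how torch or `ttnn` implement the conversion.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Round x to the nearest bfloat16 value (hypothetical helper).

    Assumes round-to-nearest-even and ignores NaN/inf handling.
    """
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps the top 16 bits; add a bias so truncation rounds to nearest even.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = ((bits + bias) & 0xFFFFFFFF) & 0xFFFF0000
    return struct.unpack("<f", struct.pack("<I", rounded))[0]


print(to_bfloat16(0.1))        # 0.10009765625
print(to_bfloat16(0.8515625))  # 0.8515625 (already representable)
```

Values such as `0.851562` in the printed result later in this notebook are bfloat16 numbers shown with a limited number of digits.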

In [4]:
torch.manual_seed(0)

torch_a = torch.rand((h, w), dtype=torch.bfloat16)
torch_b = torch.rand((h, w), dtype=torch.bfloat16)

a = ttnn.from_torch(torch_a, layout=ttnn.TILE_LAYOUT, device=device)
b = ttnn.from_torch(torch_b, layout=ttnn.TILE_LAYOUT, device=device)

Op | INFO | Finished Operation ttnn.from_torch in 125109 nanoseconds
Op | INFO | Finished Operation ttnn.reshape in 41469 nanoseconds
Op | INFO | Finished Operation ttnn.to_layout in 126840 nanoseconds
Op | INFO | Finished Operation ttnn.reshape in 29630 nanoseconds
Op | INFO | Finished Operation ttnn.to_device in 93529 nanoseconds

## Add tensor a and b

`ttnn` supports operator overloading, so the `+` operator can be used instead of `ttnn.add`.
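As a rough illustration of what operator overloading means here, the sketch below defines a toy tensor class whose `__add__` method forwards to a free `add` function, so `x + y` and `add(x, y)` give the same result. `TinyTensor` and this `add` are hypothetical names made up for the example; how `ttnn` wires its own `+` operator internally is not shown in this tutorial.

```python
class TinyTensor:
    """Hypothetical stand-in for a tensor type; not part of ttnn."""

    def __init__(self, rows):
        # Copy the nested lists so the tensor owns its data.
        self.rows = [list(row) for row in rows]

    def __add__(self, other):
        # Python calls this method when evaluating `self + other`.
        return add(self, other)


def add(x, y):
    """Element-wise addition of two equally shaped TinyTensor objects."""
    same_shape = len(x.rows) == len(y.rows) and all(
        len(p) == len(q) for p, q in zip(x.rows, y.rows)
    )
    if not same_shape:
        raise ValueError("shapes must match")
    return TinyTensor(
        [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(x.rows, y.rows)]
    )


x = TinyTensor([[1.0, 2.0], [3.0, 4.0]])
y = TinyTensor([[0.5, 0.5], [0.5, 0.5]])

# The operator form and the function form produce the same values.
assert (x + y).rows == add(x, y).rows
```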

In [5]:
output = a + b

Op | INFO | Finished Operation ttnn.reshape in 29700 nanoseconds
Op | INFO | Finished Operation ttnn.reshape in 23570 nanoseconds
Op | INFO | Finished Program   tt::tt_metal::EltwiseBinary in 501037419 nanoseconds
Op | INFO | Finished Operation tt::tt_metal::EltwiseBinary in 501149148 nanoseconds
Op | INFO | Finished Operation ttnn.reshape in 59050 nanoseconds


## Convert the tensor to ROW_MAJOR_LAYOUT as the output of ttnn.add is in TILE_LAYOUT
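The log of the conversion cell mentions `UntilizeWithUnpadding`. A likely reason, assuming the 32x32 tile size used on Tenstorrent hardware, is that a 31x22 tensor in `TILE_LAYOUT` occupies a padded 32x32 region, and the padding has to be stripped when going back to row-major order. The sketch below only does the shape arithmetic; `padded_tile_shape` is a hypothetical helper, not a `ttnn` function.

```python
import math

TILE = 32  # assumed tile edge length (32x32 tiles)


def padded_tile_shape(h, w, tile=TILE):
    """Round both dimensions up to the next multiple of the tile size."""
    return (math.ceil(h / tile) * tile, math.ceil(w / tile) * tile)


h, w = 31, 22  # the shape used in this tutorial
padded = padded_tile_shape(h, w)
padding_elements = padded[0] * padded[1] - h * w

print(padded)            # (32, 32)
print(padding_elements)  # 342
```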

In [6]:
output = ttnn.to_layout(output, layout=ttnn.ROW_MAJOR_LAYOUT)

Op | INFO | Finished Operation ttnn.reshape in 31060 nanoseconds
Op | INFO | Finished Program   tt::tt_metal::UntilizeWithUnpadding in 452708718 nanoseconds
Op | INFO | Finished Operation tt::tt_metal::UntilizeWithUnpadding in 452808757 nanoseconds
Op | INFO | Finished Operation ttnn.reshape in 62359 nanoseconds


## Inspect the result of the add

In [7]:
print(f"shape: {output.shape}")
print(f"dtype: {output.dtype}")
print(f"layout: {output.layout}")
print(f"first row: {output[:1]}")

shape: ttnn.Shape([31, 22])
dtype: DataType.BFLOAT16
layout: Layout.ROW_MAJOR
Op | INFO | Finished Operation ttnn.from_device in 80960 nanoseconds
Op | INFO | Finished Operation ttnn.to_torch in 449357 nanoseconds
first row: Tensor([ [0.851562, 0.183594, 0.863281, 1.45312, 0.410156, 1.32812, 1.05469, 1.35156, 0.917969, 1.30469, 0.421875, 1.32031, 0.464844, 0.441406, 1.07812, 0.96875, 0.792969, 1.32031, 0.605469, 1.42969, 1.13281, 0.375]], dtype=bfloat16 )

Op | INFO | Finished Operation torch.Tensor.__getitem__ in 678126 nanoseconds
Op | INFO | Finished Operation ttnn.

## Close the device

Close the handle to the device. This is a very important step, as the device can hang if it is not closed properly.

In [8]:
ttnn.close(device)

Metal | INFO | Closing device 0
