# Tensor and Add Operation

ttnn.Tensor is the central type of ttnn.

It is similar to torch.Tensor in that it represents a multi-dimensional matrix containing elements of a single data type.

There are a few key differences:

- ttnn.Tensor can be stored in the SRAM or DRAM of Tenstorrent devices
- ttnn.Tensor has no concept of strides; instead, it is stored in either row-major or tile layout
- ttnn.Tensor supports data types that torch does not, such as `bfp8`
- ttnn.Tensor's shape stores the padding added to the tensor due to TILE_LAYOUT

## Creating a tensor

The recommended way to create a tensor is to use the `ttnn` API.

In [1]:
import ttnn

2025-07-01 01:32:08.837 | DEBUG    | ttnn:<module>:83 - Initial ttnn.CONFIG:
Config{cache_path=/home/ubuntu/.cache/ttnn,model_cache_path=/home/ubuntu/.cache/ttnn/models,tmp_dir=/tmp/ttnn,enable_model_cache=false,enable_fast_runtime_mode=true,throw_exception_on_fallback=false,enable_logging=false,enable_graph_report=false,enable_detailed_buffer_report=false,enable_detailed_tensor_report=false,enable_comparison_mode=false,comparison_mode_should_raise_exception=false,comparison_mode_pcc=0.9999,root_report_path=generated/ttnn/reports,report_name=std::nullopt,std::nullopt}


## Open the device

Use `ttnn.open_device` to get a handle to the device and create tensors on it.

In [2]:
device_id = 0
device = ttnn.open_device(device_id=device_id)

2025-07-01 01:32:09.417 | info     |   SiliconDriver | Opened PCI device 0; KMD version: 2.0.0; API: 2; IOMMU: disabled (pci_device.cpp:198)
2025-07-01 01:32:09.418 | info     |   SiliconDriver | Opened PCI device 0; KMD version: 2.0.0; API: 2; IOMMU: disabled (pci_device.cpp:198)
2025-07-01 01:32:09.429 | info     |          Device | Opening user mode device driver (tt_cluster.cpp:174)
2025-07-01 01:32:09.429 | info     |   SiliconDriver | Opened PCI device 0; KMD version: 2.0.0; API: 2; IOMMU: disabled (pci_device.cpp:198)
2025-07-01 01:32:09.429 | info     |   SiliconDriver | Opened PCI device 0; KMD version: 2.0.0; API: 2; IOMMU: disabled (pci_device.cpp:198)
2025-07-01 01:32:09.433 | info     |   SiliconDriver | Opened PCI device 0; KMD version: 2.0.0; API: 2; IOMMU: disabled (pci_device.cpp:198)
2025-07-01 01:32:09.434 | info     |   SiliconDriver | Opened PCI device 0; KMD version: 2.0.0; API: 2; IOMMU: disabled (pci_device.cpp:198)
2025-07-01 01:32:09.439 | info     |   Silicon

Now let's create a ttnn tensor directly on the device.

In [3]:
ttnn_tensor = ttnn.rand((3, 4), device=device)

print(f"shape: {ttnn_tensor.shape}")
print(f"layout: {ttnn_tensor.layout}")
print(f"dtype: {ttnn_tensor.dtype}")

shape: Shape([3, 4])
layout: Layout.TILE
dtype: DataType.BFLOAT16


As expected, we get a tensor of shape [3, 4] in tile layout with a data type of bfloat16.
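To give a feel for the precision behind the `BFLOAT16` dtype reported above, here is a minimal sketch in plain Python. It relies on the fact that a bfloat16 value corresponds to the upper 16 bits of a float32 (1 sign bit, 8 exponent bits, 7 mantissa bits). The helper name `to_bfloat16` is hypothetical, and round-to-nearest-even is an assumption made for illustration; the rounding performed on the device may differ.

```python
import struct


def to_bfloat16(value: float) -> float:
    """Round a finite float to the nearest bfloat16 value (ties to even).

    Illustrative helper only; it is not part of ttnn.
    """
    # Reinterpret the value as the 32 raw bits of an IEEE 754 float32.
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    # bfloat16 keeps the top 16 bits. Adding this bias before masking
    # rounds to nearest, with ties going to the even mantissa.
    bias = 0x7FFF + ((bits >> 16) & 1)
    rounded = (bits + bias) & 0xFFFF0000
    # Convert the truncated bit pattern back to a Python float.
    return struct.unpack("<f", struct.pack("<I", rounded))[0]


print(to_bfloat16(1.0))  # 1.0
print(to_bfloat16(0.1))  # 0.10009765625
```

With only 7 mantissa bits, bfloat16 carries roughly two to three significant decimal digits, which is why printed tensor values look coarse compared with float32.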

## Data Type

The data type of the ttnn tensor can be set explicitly, for example when creating the tensor or when converting from torch.

In [4]:
ttnn_tensor = ttnn.rand((3, 4), dtype=ttnn.float32, device=device)
torch_tensor = ttnn.to_torch(ttnn_tensor)
print(f"torch_tensor.dtype: {torch_tensor.dtype}")
print(f"ttnn_tensor.dtype: {ttnn_tensor.dtype}")

torch_tensor.dtype: torch.float32
ttnn_tensor.dtype: DataType.FLOAT32


## Layout

Tenstorrent hardware is most efficiently utilized when running tensors using [tile layout](https://tenstorrent.github.io/ttnn/latest/ttnn/tensor.html#layout).
The current tile size is hard-coded to [32, 32]. It was determined to be the optimal size for a tile given the compute, memory and data transfer constraints.
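The difference between the two layouts can be sketched as an index calculation. The following is a simplified model, not the device's actual memory map: it assumes tiles are stored one after another in row-major order and that elements inside each 32x32 tile are also stored row by row. The ordering inside a tile on real hardware may differ, and the function names are hypothetical.

```python
TILE_HEIGHT = 32
TILE_WIDTH = 32


def row_major_offset(row: int, col: int, width: int) -> int:
    """Linear offset of element (row, col) in a row-major 2D tensor."""
    return row * width + col


def tiled_offset(row: int, col: int, width: int) -> int:
    """Linear offset of element (row, col) in a simplified tile layout.

    Assumes `width` is already padded to a multiple of the tile width.
    """
    if width % TILE_WIDTH != 0:
        raise ValueError("width must be a multiple of the tile width")
    tiles_per_row = width // TILE_WIDTH
    # Which tile the element falls into, counting tiles row by row.
    tile_index = (row // TILE_HEIGHT) * tiles_per_row + (col // TILE_WIDTH)
    # Position of the element inside its tile.
    within_tile = (row % TILE_HEIGHT) * TILE_WIDTH + (col % TILE_WIDTH)
    return tile_index * TILE_HEIGHT * TILE_WIDTH + within_tile
```

For a 64-wide tensor, element (1, 0) sits at offset 64 in row-major order but at offset 32 in this tiled model, because all 1024 elements of a tile are kept together.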


ttnn provides an easy and intuitive way to convert from row-major layout to tile layout and back.

In [5]:
ttnn_tensor = ttnn.rand((3, 4), layout=ttnn.ROW_MAJOR_LAYOUT, device=device)
print(f"Tensor in row-major layout:\nShape {ttnn_tensor.shape}\nLayout: {ttnn_tensor.layout}\n{ttnn_tensor}")
ttnn_tensor = ttnn.to_layout(ttnn_tensor, ttnn.TILE_LAYOUT)
print(f"Tensor in tile layout:\nShape {ttnn_tensor.shape}\nLayout: {ttnn_tensor.layout}\n{ttnn_tensor}")
ttnn_tensor = ttnn.to_layout(ttnn_tensor, ttnn.ROW_MAJOR_LAYOUT)
print(f"Tensor back in row-major layout:\nShape {ttnn_tensor.shape}\nLayout: {ttnn_tensor.layout}\n{ttnn_tensor}")

Tensor in row-major layout:
Shape Shape([3, 4])
Layout: Layout.ROW_MAJOR
ttnn.Tensor([[ 0.03174,  0.01587,  0.12695,  0.06348],
             [ 0.72656,  0.86328,  0.91406,  0.45703],
             [ 0.72266,  0.36133,  0.90234,  0.45117]], shape=Shape([3, 4]), dtype=DataType::BFLOAT16, layout=Layout::ROW_MAJOR)
Tensor in tile layout:
Shape Shape([3, 4])
Layout: Layout.TILE
ttnn.Tensor([[ 0.03174,  0.01587,  0.12695,  0.06348],
             [ 0.72656,  0.86328,  0.91406,  0.45703],
             [ 0.72266,  0.36133,  0.90234,  0.45117]], shape=Shape([3, 4]), dtype=DataType::BFLOAT16, layout=Layout::TILE)
Tensor back in row-major layout:
Shape Shape([3, 4])
Layout: Layout.ROW_MAJOR
ttnn.Tensor([[ 0.03174,  0.01587,  0.12695,  0.06348],
             [ 0.72656,  0.86328,  0.91406,  0.45703],
             [ 0.72266,  0.36133,  0.90234,  0.45117]], shape=Shape([3, 4]), dtype=DataType::BFLOAT16, layout=Layout::ROW_MAJOR)


Note that padding is automatically inserted when the tensor is put into tile layout and is automatically removed when the tensor is converted back to row-major layout. The printed shape remains `Shape([3, 4])` in both layouts.
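The amount of padding can be estimated with a short calculation. This sketch assumes that tile layout rounds the last two dimensions up to the next multiple of the 32x32 tile size; the helper `padded_tile_shape` is hypothetical and is not a ttnn function.

```python
TILE_SIZE = 32


def padded_tile_shape(shape):
    """Round the last two dimensions up to a multiple of TILE_SIZE.

    Illustrative only; assumes padding applies to the last two dimensions.
    """
    dims = list(shape)
    for axis in range(max(0, len(dims) - 2), len(dims)):
        # Ceiling division, then scale back up to a whole number of tiles.
        dims[axis] = -(-dims[axis] // TILE_SIZE) * TILE_SIZE
    return tuple(dims)


print(padded_tile_shape((3, 4)))  # (32, 32)
```

Under this assumption, the [3, 4] tensor above occupies one full 32x32 tile on the device even though only 12 elements carry data.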

The conversion to tile layout can also be done when calling `ttnn.from_torch`.

In [6]:
torch_tensor = ttnn.to_torch(ttnn.rand((3, 4), device=device, layout=ttnn.ROW_MAJOR_LAYOUT))
ttnn_tensor = ttnn.from_torch(torch_tensor, layout=ttnn.TILE_LAYOUT)
print(f"Tensor in row-major layout:\nShape {torch_tensor.shape}; Layout: {torch_tensor.layout}")
print(f"Tensor in tile layout:\nShape {ttnn_tensor.shape}; Layout: {ttnn_tensor.layout}")

Tensor in row-major layout:
Shape torch.Size([3, 4]); Layout: torch.strided
Tensor in tile layout:
Shape Shape([3, 4]); Layout: Layout.TILE


Note that `ttnn.to_torch` will always convert to row-major layout

## Initialize tensors a and b with random values

Create two tensors that can be used by a `ttnn` operation: give them `ttnn.TILE_LAYOUT` and place them on the `device`.

In [7]:
input_tensor_a = ttnn.rand((32, 32), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)
input_tensor_b = ttnn.rand((32, 32), dtype=ttnn.bfloat16, layout=ttnn.TILE_LAYOUT, device=device)

print(f"input_tensor_a:\n{input_tensor_a}")
print(f"input_tensor_b:\n{input_tensor_b}")

input_tensor_a:
ttnn.Tensor([[ 0.58594,  0.29297,  ...,  0.60156,  0.80078],
             [ 0.67969,  0.83984,  ...,  0.56641,  0.78125],
             ...,
             [ 0.67969,  0.83984,  ...,  0.67969,  0.83984],
             [ 0.09229,  0.54297,  ...,  0.93359,  0.96484]], shape=Shape([32, 32]), dtype=DataType::BFLOAT16, layout=Layout::TILE)
input_tensor_b:
ttnn.Tensor([[ 0.42578,  0.71094,  ...,  0.28516,  0.64062],
             [ 0.71875,  0.35938,  ...,  0.00223,  0.00111],
             ...,
             [ 0.57031,  0.28516,  ...,  0.71875,  0.35938],
             [ 0.03564,  0.01782,  ...,  0.28711,  0.14355]], shape=Shape([32, 32]), dtype=DataType::BFLOAT16, layout=Layout::TILE)


## Add tensor a and b

`ttnn` supports operator overloading, so the `+` operator can be used instead of `ttnn.add`.

In [8]:
output_tensor = input_tensor_a + input_tensor_b
print(f"output_tensor:\n{output_tensor}")

output_tensor:
ttnn.Tensor([[ 1.01562,  1.00781,  ...,  0.88672,  1.44531],
             [ 1.39844,  1.20312,  ...,  0.57031,  0.78125],
             ...,
             [ 1.25000,  1.12500,  ...,  1.39844,  1.20312],
             [ 0.12793,  0.56250,  ...,  1.21875,  1.10938]], shape=Shape([32, 32]), dtype=DataType::BFLOAT16, layout=Layout::TILE)


## Inspect the output tensor of the add in ttnn

As can be seen, the add produces a tensor with the same shape, layout and dtype as its inputs.

In [9]:
print(f"shape: {output_tensor.shape}")
print(f"dtype: {output_tensor.dtype}")
print(f"layout: {output_tensor.layout}")

shape: Shape([32, 32])
dtype: DataType.BFLOAT16
layout: Layout.TILE


In general, we expect layout and dtype to stay the same when running most operations unless explicit arguments to modify them are passed in. However, there are obvious exceptions, such as an embedding operation that takes in `ttnn.uint32` and produces `ttnn.bfloat16`.

## Convert to torch and inspect the attributes of the torch tensor

When converting the tensor to torch, `ttnn.to_torch` will move the tensor off the device, convert it to row-major layout, and figure out the best data type to use on the torch side.

In [10]:
output_tensor = ttnn.to_torch(output_tensor)
print(f"shape: {output_tensor.shape}")
print(f"dtype: {output_tensor.dtype}")

shape: torch.Size([32, 32])
dtype: torch.bfloat16


## Close the device

Close the handle to the device. This is a very important step, as the device can currently hang if it is not closed properly.

In [11]:
ttnn.close_device(device)

2025-07-01 01:32:12.856 | info     |           Metal | Closing mesh device 1 (mesh_device.cpp:488)
2025-07-01 01:32:12.858 | info     |           Metal | Closing mesh device 0 (mesh_device.cpp:488)
2025-07-01 01:32:12.858 | info     |           Metal | Closing device 0 (device.cpp:469)
2025-07-01 01:32:12.858 | info     |           Metal | Disabling and clearing program cache on device 0 (device.cpp:781)
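
Because a device that is not closed can hang, it may help to guarantee the close call even when an operation raises. The following is a minimal sketch of one way to do that with a context manager. The name `opened_device` is hypothetical, and the open and close functions are passed in as arguments so that the pattern does not depend on any particular library version.

```python
from contextlib import contextmanager


@contextmanager
def opened_device(open_fn, close_fn, device_id=0):
    """Open a device and always close it, even if the body raises.

    `open_fn` is called as open_fn(device_id=...) and `close_fn` as
    close_fn(device), matching how the calls are used in this tutorial.
    """
    device = open_fn(device_id=device_id)
    try:
        yield device
    finally:
        # Runs on normal exit and on exceptions alike.
        close_fn(device)
```

In this tutorial it would be used as `with opened_device(ttnn.open_device, ttnn.close_device) as device:` with the tensor code placed inside the `with` block.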
