# Open3D Guide: 10. Tensor

Source: [https://www.open3d.org/docs/latest/tutorial/Basic/tensor.html](https://www.open3d.org/docs/latest/tutorial/Basic/tensor.html).

> Tensor is a “view” of a data Blob with shape, stride, and a data pointer. It is a multidimensional and homogeneous matrix containing elements of single data type. It is used in Open3D to perform numerical operations. It supports GPU operations as well.

Summary of contents:

- Tensor creation
- Properties of a tensor
- Copy & device transfer
- Data types
- Type casting
- Numpy I/O with direct memory map
- PyTorch I/O with DLPack memory map
- Binary element-wise operations
- Unary element-wise operations
- Reduction
- Slicing, indexing, getitem, and setitem
- Advanced indexing
- Logical operations
- Comparison operations
- Nonzero operations

In [1]:
import sys
import os
import copy

# Add the directory containing 'examples' to the Python path
notebook_directory = os.getcwd()
parent_directory = os.path.dirname(notebook_directory)  # Parent directory
sys.path.append(parent_directory)

In [2]:
import open3d as o3d
import open3d.core as o3c
from examples import open3d_example as o3dex
import numpy as np
import matplotlib.pyplot as plt

Jupyter environment detected. Enabling Open3D WebVisualizer.
[Open3D INFO] WebRTC GUI backend enabled.
[Open3D INFO] WebRTCWindowSystem: HTTP handshake server disabled.


## Tensor creation

In [5]:
# Tensor from list.
a = o3c.Tensor([0, 1, 2])
print("Created from list:\n{}".format(a))

# Tensor from Numpy.
a = o3c.Tensor(np.array([0, 1, 2]))
print("\nCreated from numpy array:\n{}".format(a))

# Dtype is inferred from the list.
a_float = o3c.Tensor([0.0, 1.0, 2.0])
print("\nDefault dtype and device:\n{}".format(a_float))

# Specify dtype.
a = o3c.Tensor(np.array([0, 1, 2]), dtype=o3c.Dtype.Float64)
print("\nSpecified data type:\n{}".format(a))

# Specify device: CPU, CUDA
#a = o3c.Tensor(np.array([0, 1, 2]), device=o3c.Device("CUDA:0"))
a = o3c.Tensor(np.array([0, 1, 2]), device=o3c.Device("CPU:0"))
print("\nSpecified device:\n{}".format(a))

Created from list:
[0 1 2]
Tensor[shape={3}, stride={1}, Int32, CPU:0, 0x2817319c300]

Created from numpy array:
[0 1 2]
Tensor[shape={3}, stride={1}, Int32, CPU:0, 0x2817319c900]

Default dtype and device:
[0.0 1.0 2.0]
Tensor[shape={3}, stride={1}, Float64, CPU:0, 0x2817319c480]

Specified data type:
[0.0 1.0 2.0]
Tensor[shape={3}, stride={1}, Float64, CPU:0, 0x2817319c400]

Specified device:
[0 1 2]
Tensor[shape={3}, stride={1}, Int32, CPU:0, 0x2817319c840]


In [6]:
# Shallow copy constructor
# Shallow copy: the data_ptr will be copied but the memory it points to will not be copied
vals = np.array([1, 2, 3])
src = o3c.Tensor(vals)
dst = src
src[0] += 10

# Changes in one will get reflected in other.
print("Source tensor:\n{}".format(src))
print("\nTarget tensor:\n{}".format(dst))

Source tensor:
[11 2 3]
Tensor[shape={3}, stride={1}, Int32, CPU:0, 0x2817319c220]

Target tensor:
[11 2 3]
Tensor[shape={3}, stride={1}, Int32, CPU:0, 0x2817319c220]
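
In Python, `dst = src` binds a second name to the same tensor object, which is why both printouts above show the same data pointer. The sketch below illustrates the same aliasing with a plain Python list (a stand-in, not an Open3D tensor) and contrasts it with an explicit copy. For tensors, `clone()` plays the role of the explicit copy.

```python
import copy

src = [1, 2, 3]                # stand-in for a tensor's data
alias = src                    # same object, no data copied
independent = copy.copy(src)   # separate storage

src[0] += 10

print(alias is src)   # True: both names refer to one object
print(alias)          # [11, 2, 3]
print(independent)    # [1, 2, 3]
```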


## Properties of a tensor

In [8]:
vals = np.array((range(24))).reshape(2, 3, 4)
a = o3c.Tensor(vals,
               dtype=o3c.Dtype.Float64,
               device=o3c.Device("CPU:0"))
print(f"a.shape: {a.shape}")
print(f"a.strides: {a.strides}")
print(f"a.dtype: {a.dtype}")
print(f"a.device: {a.device}")
print(f"a.ndim: {a.ndim}")

a.shape: SizeVector[2, 3, 4]
a.strides: SizeVector[12, 4, 1]
a.dtype: Float64
a.device: CPU:0
a.ndim: 3
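
The strides printed above are counted in elements, not bytes: moving one step along dimension 0 skips 3 × 4 = 12 elements. As a minimal sketch, the hypothetical helper `contiguous_strides` below (plain Python, not part of Open3D) reproduces the strides of a contiguous row-major tensor from its shape.

```python
def contiguous_strides(shape):
    """Return row-major strides, counted in elements, for a shape."""
    strides = []
    step = 1
    # The last dimension is contiguous; each earlier dimension skips
    # the product of all later dimension sizes.
    for size in reversed(shape):
        strides.append(step)
        step *= size
    return tuple(reversed(strides))

print(contiguous_strides((2, 3, 4)))  # (12, 4, 1)
```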


## Copy & device transfer

```python
# Host -> Device.
a_cpu = o3c.Tensor([0, 1, 2])
a_gpu = a_cpu.cuda(0)
print(a_gpu)

# Device -> Host.
a_gpu = o3c.Tensor([0, 1, 2], device=o3c.Device("CUDA:0"))
a_cpu = a_gpu.cpu()
print(a_cpu)

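# Device -> another Device.
# Note: the target below is CUDA:0 again; on a multi-GPU machine, use cuda(1) to reach a second device.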
a_gpu_0 = o3c.Tensor([0, 1, 2], device=o3c.Device("CUDA:0"))
a_gpu_1 = a_gpu_0.cuda(0)
print(a_gpu_1)
```

## Data types

![Tensor Data Types](../assets/tensor_data_types.png)

## Type casting

In [10]:
# E.g. float -> int
a = o3c.Tensor([0.1, 1.5, 2.7])
b = a.to(o3c.Dtype.Int32)
print(a)
print(b)

[0.1 1.5 2.7]
Tensor[shape={3}, stride={1}, Float64, CPU:0, 0x2817319c2c0]
[0 1 2]
Tensor[shape={3}, stride={1}, Int32, CPU:0, 0x2817319c6c0]


In [11]:
# E.g. int -> float
a = o3c.Tensor([1, 2, 3])
b = a.to(o3c.Dtype.Float32)
print(a)
print(b)

[1 2 3]
Tensor[shape={3}, stride={1}, Int32, CPU:0, 0x2817319c1c0]
[1.0 2.0 3.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x2817319c760]
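
The float-to-int output above (`[0.1 1.5 2.7]` becoming `[0 1 2]`) is consistent with truncation rather than rounding. Python's built-in `int()` truncates toward zero in the same way, so the plain-Python sketch below shows what to expect. That Open3D's cast behaves like a truncating C++ cast for all inputs is an assumption here; the negative value is included only to illustrate truncation toward zero.

```python
values = [0.1, 1.5, 2.7, -1.5]

truncated = [int(v) for v in values]   # drops the fractional part
rounded = [round(v) for v in values]   # rounds to nearest, ties to even

print(truncated)  # [0, 1, 2, -1]
print(rounded)    # [0, 2, 3, -2]
```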


## Numpy I/O with direct memory map

Tensors created by passing a NumPy array to the constructor (`o3c.Tensor(np.array(...))`) do not share memory with the NumPy array. To share memory, use `o3c.Tensor.from_numpy(...)` and `o3c.Tensor.numpy()`. Changes made through either object are then reflected in the other.

In [12]:
# Using constructor
np_a = np.ones((5,), dtype=np.int32)
o3_a = o3c.Tensor(np_a)
print(f"np_a: {np_a}")
print(f"o3_a: {o3_a}")
print("")

# Changes to numpy array will not reflect as memory is not shared
np_a[0] += 100
o3_a[1] += 200
print(f"np_a: {np_a}")
print(f"o3_a: {o3_a}")

np_a: [1 1 1 1 1]
o3_a: [1 1 1 1 1]
Tensor[shape={5}, stride={1}, Int32, CPU:0, 0x2817319c7a0]

np_a: [101   1   1   1   1]
o3_a: [1 201 1 1 1]
Tensor[shape={5}, stride={1}, Int32, CPU:0, 0x2817319c7a0]


In [13]:
# From numpy.
np_a = np.ones((5,), dtype=np.int32)
o3_a = o3c.Tensor.from_numpy(np_a)

# Changes to the NumPy array are reflected in the Open3D tensor and vice versa.
np_a[0] += 100
o3_a[1] += 200
print(f"np_a: {np_a}")
print(f"o3_a: {o3_a}")

np_a: [101 201   1   1   1]
o3_a: [101 201 1 1 1]
Tensor[shape={5}, stride={1}, Int32, CPU:0, 0x2817319c200]


In [15]:
# To numpy.
o3_a = o3c.Tensor([1, 1, 1, 1, 1], dtype=o3c.Dtype.Int32)
np_a = o3_a.numpy()

# Changes to the NumPy array are reflected in the Open3D tensor and vice versa.
np_a[0] += 100
o3_a[1] += 200
print(f"np_a: {np_a}")
print(f"o3_a: {o3_a}")

# For CUDA Tensor, call cpu() before calling numpy().
o3_a = o3c.Tensor([1, 1, 1, 1, 1], device=o3c.Device("CPU:0"))
print(f"\no3_a.cpu().numpy(): {o3_a.cpu().numpy()}")

np_a: [101 201   1   1   1]
o3_a: [101 201 1 1 1]
Tensor[shape={5}, stride={1}, Int32, CPU:0, 0x2817319c6e0]

o3_a.cpu().numpy(): [1 1 1 1 1]


## PyTorch I/O with DLPack memory map

We can convert tensors from/to [DLManagedTensor](https://dmlc.github.io/dlpack/latest/index.html); from the official DLPack website:

> In order for an ndarray system to interact with a variety of frameworks, a stable in-memory data structure is needed.
> DLPack is one such data structure that allows exchange between major frameworks. It is developed with inputs from many deep learning system core developers. Highlights include:
>
> - Minimum and stable: simple header
> - Designed for cross hardware: CPU, CUDA, OpenCL, Vulkan, Metal, VPI, ROCm, WebGPU, Hexagon
> - Already a standard with wide community adoption and support:
>   - NumPy
>   - CuPy
>   - PyTorch
>   - Tensorflow
>   - MXNet
>   - TVM
>   - mpi4py
> - Clean C ABI compatible.
>   - Means you can create and access it from any language.
>   - It is also essential for building JIT and AOT compilers to support these data types.

In [17]:
import torch
import torch.utils.dlpack

# From PyTorch
#th_a = torch.ones((5,)).cuda(0)
th_a = torch.ones((5,)).cpu()
o3_a = o3c.Tensor.from_dlpack(torch.utils.dlpack.to_dlpack(th_a))
print(f"th_a: {th_a}")
print(f"o3_a: {o3_a}")
print("")

# Changes to the PyTorch tensor are reflected in the Open3D tensor and vice versa.
th_a[0] = 100
o3_a[1] = 200
print(f"th_a: {th_a}")
print(f"o3_a: {o3_a}")

th_a: tensor([1., 1., 1., 1., 1.])
o3_a: [1.0 1.0 1.0 1.0 1.0]
Tensor[shape={5}, stride={1}, Float32, CPU:0, 0x42044801180]

th_a: tensor([100., 200.,   1.,   1.,   1.])
o3_a: [100.0 200.0 1.0 1.0 1.0]
Tensor[shape={5}, stride={1}, Float32, CPU:0, 0x42044801180]


In [19]:
# To PyTorch
#o3_a = o3c.Tensor([1, 1, 1, 1, 1], device=o3c.Device("CUDA:0"))
o3_a = o3c.Tensor([1, 1, 1, 1, 1], device=o3c.Device("CPU:0"))
th_a = torch.utils.dlpack.from_dlpack(o3_a.to_dlpack())
o3_a = o3c.Tensor.from_dlpack(torch.utils.dlpack.to_dlpack(th_a))
print(f"th_a: {th_a}")
print(f"o3_a: {o3_a}")
print("")

# Changes to the PyTorch tensor are reflected in the Open3D tensor and vice versa.
th_a[0] = 100
o3_a[1] = 200
print(f"th_a: {th_a}")
print(f"o3_a: {o3_a}")

th_a: tensor([1, 1, 1, 1, 1], dtype=torch.int32)
o3_a: [1 1 1 1 1]
Tensor[shape={5}, stride={1}, Int32, CPU:0, 0x28111583e70]

th_a: tensor([100, 200,   1,   1,   1], dtype=torch.int32)
o3_a: [100 200 1 1 1]
Tensor[shape={5}, stride={1}, Int32, CPU:0, 0x28111583e70]


## Binary element-wise operations

In [20]:
a = o3c.Tensor([1, 1, 1], dtype=o3c.Dtype.Float32)
b = o3c.Tensor([2, 2, 2], dtype=o3c.Dtype.Float32)
print("a + b = {}".format(a + b))
print("a - b = {}".format(a - b))
print("a * b = {}".format(a * b))
print("a / b = {}".format(a / b))

a + b = [3.0 3.0 3.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583e90]
a - b = [-1.0 -1.0 -1.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583bd0]
a * b = [2.0 2.0 2.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583c30]
a / b = [0.5 0.5 0.5]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583750]


In [21]:
# Automatic broadcasting: as in Numpy
a = o3c.Tensor.ones((2, 3), dtype = o3c.Dtype.Float32)
b = o3c.Tensor.ones((3,), dtype = o3c.Dtype.Float32)
print("a + b = \n{}\n".format(a + b))

# Automatic type casting.
a = a[0]
print("a + 1 = {}".format(a + 1)) # Float + Int -> Float.
print("a + True = {}".format(a + True)) # Float + Bool -> Float.

# Inplace.
a -= True
print("a = {}".format(a))

a + b = 
[[2.0 2.0 2.0],
 [2.0 2.0 2.0]]
Tensor[shape={2, 3}, stride={3, 1}, Float32, CPU:0, 0x28111583c50]

a + 1 = [2.0 2.0 2.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583b50]
a + True = [2.0 2.0 2.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583ed0]
a = [0.0 0.0 0.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583850]
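
Broadcasting aligns the two shapes from the trailing dimension and stretches any dimension of size 1 (or any missing leading dimension) to match the other operand. The sketch below is a plain-Python rendering of the NumPy rule that the cell above refers to; `broadcast_shape` is a hypothetical helper, not an Open3D function.

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Compute the broadcast result shape, or raise ValueError."""
    result = []
    # Walk both shapes from the last dimension; missing dims count as 1.
    for x, y in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not broadcastable")
        result.append(y if x == 1 else x)
    return tuple(reversed(result))


print(broadcast_shape((2, 3), (3,)))  # (2, 3)
```

For example, `broadcast_shape((2, 3), (4,))` raises `ValueError`, because the trailing dimensions 3 and 4 differ and neither is 1.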


## Unary element-wise operations

In [22]:
a = o3c.Tensor([4, 9, 16], dtype=o3c.Dtype.Float32)
print("a = {}\n".format(a))
print("a.sqrt = {}\n".format(a.sqrt()))
print("a.sin = {}\n".format(a.sin()))
print("a.cos = {}\n".format(a.cos()))

# Inplace operation
a.sqrt_()
print(a)

a = [4.0 9.0 16.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583af0]

a.sqrt = [2.0 3.0 4.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583ed0]

a.sin = [-0.756802 0.412119 -0.287903]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583850]

a.cos = [-0.653644 -0.91113 -0.95766]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583ed0]

[2.0 3.0 4.0]
Tensor[shape={3}, stride={1}, Float32, CPU:0, 0x28111583af0]


## Reduction

In [23]:
vals = np.array(range(24)).reshape((2, 3, 4))
a = o3c.Tensor(vals)
print("a.sum = {}\n".format(a.sum()))
print("a.min = {}\n".format(a.min()))
print("a.ArgMax = {}\n".format(a.argmax()))

a.sum = 276
Tensor[shape={}, stride={}, Int32, CPU:0, 0x281119d8e00]

a.min = 0
Tensor[shape={}, stride={}, Int32, CPU:0, 0x281119d9040]

a.ArgMax = 23
Tensor[shape={}, stride={}, Int64, CPU:0, 0x281119d8f30]
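
With no `dim` argument, `argmax` reports a position in the flattened tensor, which is why the result above is 23 rather than a coordinate. A minimal plain-Python sketch for converting such a flat index back into a coordinate, assuming a contiguous row-major layout (the helper `unravel_index` is hypothetical, named after the NumPy function):

```python
def unravel_index(flat_index, shape):
    """Convert a flat row-major index into per-dimension coordinates."""
    coords = []
    for size in reversed(shape):
        flat_index, pos = divmod(flat_index, size)
        coords.append(pos)
    return tuple(reversed(coords))


print(unravel_index(23, (2, 3, 4)))  # (1, 2, 3)
```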



In [24]:
# With specified dimension.
vals = np.array(range(24)).reshape((2, 3, 4))
a = o3c.Tensor(vals)

print("Along dim=0\n{}".format(a.sum(dim=(0))))
print("Along dim=(0, 2)\n{}\n".format(a.sum(dim=(0, 2))))

# Retention of reduced dimension.
print("Shape without retention : {}".format(a.sum(dim=(0, 2)).shape))
print("Shape with retention : {}".format(a.sum(dim=(0, 2), keepdim=True).shape))

Along dim=0
[[12 14 16 18],
 [20 22 24 26],
 [28 30 32 34]]
Tensor[shape={3, 4}, stride={4, 1}, Int32, CPU:0, 0x2811184d4f0]
Along dim=(0, 2)
[60 92 124]
Tensor[shape={3}, stride={1}, Int32, CPU:0, 0x28111583930]

Shape without retention : SizeVector[3]
Shape with retention : SizeVector[1, 3, 1]
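
The two shapes printed above follow a simple rule: reduced dimensions are dropped by default and kept with size 1 when `keepdim=True`. The hypothetical helper `reduced_shape` below sketches that rule in plain Python.

```python
def reduced_shape(shape, dims, keepdim=False):
    """Shape of a reduction over `dims`, with or without kept dimensions."""
    dims = {d % len(shape) for d in dims}  # normalize negative dims
    if keepdim:
        return tuple(1 if i in dims else s for i, s in enumerate(shape))
    return tuple(s for i, s in enumerate(shape) if i not in dims)


print(reduced_shape((2, 3, 4), (0, 2)))                # (3,)
print(reduced_shape((2, 3, 4), (0, 2), keepdim=True))  # (1, 3, 1)
```

Keeping the reduced dimensions is useful when the result must broadcast back against the original tensor, for example when subtracting a per-row sum.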


## Slicing, indexing, getitem, and setitem

In [25]:
vals = np.array(range(24)).reshape((2, 3, 4))
a = o3c.Tensor(vals)
print("a = \n{}\n".format(a))

# Indexing __getitem__.
print("a[1, 2] = {}\n".format(a[1, 2]))

# Slicing __getitem__.
print("a[1:] = \n{}\n".format(a[1:]))

# slice object.
print("a[:, 0:3:2, :] = \n{}\n".format(a[:, 0:3:2, :]))

# Combined __getitem__
print("a[:-1, 0:3:2, 2] = \n{}\n".format(a[:-1, 0:3:2, 2]))

a = 
[[[0 1 2 3],
  [4 5 6 7],
  [8 9 10 11]],
 [[12 13 14 15],
  [16 17 18 19],
  [20 21 22 23]]]
Tensor[shape={2, 3, 4}, stride={12, 4, 1}, Int32, CPU:0, 0x2811113ab00]

a[1, 2] = [20 21 22 23]
Tensor[shape={4}, stride={1}, Int32, CPU:0, 0x2811113ab50]

a[1:] = 
[[[12 13 14 15],
  [16 17 18 19],
  [20 21 22 23]]]
Tensor[shape={1, 3, 4}, stride={12, 4, 1}, Int32, CPU:0, 0x2811113ab30]

a[:, 0:3:2, :] = 
[[[0 1 2 3],
  [8 9 10 11]],
 [[12 13 14 15],
  [20 21 22 23]]]
Tensor[shape={2, 2, 4}, stride={12, 8, 1}, Int32, CPU:0, 0x2811113ab00]

a[:-1, 0:3:2, 2] = 
[[2 10]]
Tensor[shape={1, 2}, stride={12, 8}, Int32, CPU:0, 0x2811113ab08]



In [26]:
vals = np.array(range(24)).reshape((2, 3, 4))
a = o3c.Tensor(vals)

# Changes get reflected.
b = a[:-1, 0:3:2, 2]
b[0] += 100
print("b = {}\n".format(b))
print("a = \n{}".format(a))

b = [[102 110]]
Tensor[shape={1, 2}, stride={12, 8}, Int32, CPU:0, 0x2811113a9b8]

a = 
[[[0 1 102 3],
  [4 5 6 7],
  [8 9 110 11]],
 [[12 13 14 15],
  [16 17 18 19],
  [20 21 22 23]]]
Tensor[shape={2, 3, 4}, stride={12, 4, 1}, Int32, CPU:0, 0x2811113a9b0]
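
A basic slice does not move data. It only changes the shape, strides, and starting offset used to read the parent's buffer, which is why writing to `b` changes `a`. The sketch below (plain Python; `slice_view` is a hypothetical helper that handles only integer and slice indices, one per dimension) computes those three values for `a[:-1, 0:3:2, 2]`. An offset of 2 Int32 elements is 8 bytes, which matches the gap between the view's address (`0x2811113a9b8`) and the parent's (`0x2811113a9b0`) in the printout above.

```python
def slice_view(shape, strides, index):
    """Shape, strides, and element offset of a basic-indexing view."""
    new_shape, new_strides, offset = [], [], 0
    for size, stride, idx in zip(shape, strides, index):
        if isinstance(idx, int):
            # An integer index drops the dimension and shifts the start.
            offset += (idx % size) * stride
        else:
            start, stop, step = idx.indices(size)
            new_shape.append(len(range(start, stop, step)))
            new_strides.append(stride * step)
            offset += start * stride
    return tuple(new_shape), tuple(new_strides), offset


view = slice_view((2, 3, 4), (12, 4, 1),
                  (slice(None, -1), slice(0, 3, 2), 2))
print(view)  # ((1, 2), (12, 8), 2)
```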


In [28]:
vals = np.array(range(24)).reshape((2, 3, 4))
a = o3c.Tensor(vals)

# Example __setitem__
a[:, :, 2] += 100
print(a)

[[[0 1 102 3],
  [4 5 106 7],
  [8 9 110 11]],
 [[12 13 114 15],
  [16 17 118 19],
  [20 21 122 23]]]
Tensor[shape={2, 3, 4}, stride={12, 4, 1}, Int32, CPU:0, 0x2811113ac50]


## Advanced indexing

> Advanced indexing is triggered while passing an index array or a boolean array or their combination with integer/slice object. Note that advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).

See also:

- [Combining advanced and basic indexing](https://www.open3d.org/docs/latest/tutorial/Basic/tensor.html#Combining-advanced-and-basic-indexing)
- [Boolean array indexing](https://www.open3d.org/docs/latest/tutorial/Basic/tensor.html#Boolean-array-indexing)


In [29]:
vals = np.array(range(24)).reshape((2, 3, 4))
a = o3c.Tensor(vals)

# Along each dimension, a specific element is selected.
print("a[[0, 1], [1, 2], [1, 0]] = {}\n".format(a[[0, 1], [1, 2], [1, 0]]))

# Changes not reflected as it is a copy.
b = a[[0, 0], [0, 1], [1, 1]]
b[0] += 100
print("b = {}\n".format(b))
print("a[[0, 0], [0, 1], [1, 1]] = {}".format(a[[0, 0], [0, 1], [1, 1]]))

a[[0, 1], [1, 2], [1, 0]] = [5 20]
Tensor[shape={2}, stride={1}, Int32, CPU:0, 0x281119d8e10]

b = [101 5]
Tensor[shape={2}, stride={1}, Int32, CPU:0, 0x281119d9170]

a[[0, 0], [0, 1], [1, 1]] = [1 5]
Tensor[shape={2}, stride={1}, Int32, CPU:0, 0x281119d8f10]
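
With one index list per dimension, the lists are read in parallel: `a[[0, 1], [1, 2], [1, 0]]` selects `a[0, 1, 1]` and `a[1, 2, 0]`. The sketch below reproduces this selection on a nested Python list standing in for the tensor; `pick` is a hypothetical helper. Because it builds a new list, it also mirrors the copy semantics shown above.

```python
# Nested-list stand-in for the (2, 3, 4) tensor holding 0..23.
a = [[[i * 12 + j * 4 + k for k in range(4)] for j in range(3)]
     for i in range(2)]


def pick(data, idx0, idx1, idx2):
    """Select data[i][j][k] for each zipped (i, j, k) triple."""
    return [data[i][j][k] for i, j, k in zip(idx0, idx1, idx2)]


print(pick(a, [0, 1], [1, 2], [1, 0]))  # [5, 20]

b = pick(a, [0, 0], [0, 1], [1, 1])
b[0] += 100
print(b)                                # [101, 5]
print(pick(a, [0, 0], [0, 1], [1, 1]))  # [1, 5]  (source unchanged)
```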


## Logical operations

In [32]:
a = o3c.Tensor(np.array([True, False, True, False]))
b = o3c.Tensor(np.array([True, True, False, False]))

print("a AND b = {}".format(a.logical_and(b)))
print("a OR b = {}".format(a.logical_or(b)))
print("a XOR b = {}".format(a.logical_xor(b)))
print("NOT a = {}\n".format(a.logical_not()))

# Only works for boolean tensors.
print("a.any = {}".format(a.any()))
print("a.all = {}\n".format(a.all()))

# If a tensor is not boolean, 0 is treated as False and any non-zero value as True.
# The result is a Bool tensor; the in-place variants (e.g. logical_and_) instead
# fill the tensor with 0 or 1 cast to the tensor's dtype.
c = o3c.Tensor(np.array([2.0, 0.0, 3.5, 0.0]))
d = o3c.Tensor(np.array([0.0, 3.0, 1.5, 0.0]))
print("c AND d = {}".format(c.logical_and(d)))

a AND b = [True False False False]
Tensor[shape={4}, stride={1}, Bool, CPU:0, 0x281119d8e30]
a OR b = [True True True False]
Tensor[shape={4}, stride={1}, Bool, CPU:0, 0x281119d90c0]
a XOR b = [False True True False]
Tensor[shape={4}, stride={1}, Bool, CPU:0, 0x281119d8db0]
NOT a = [False True False True]
Tensor[shape={4}, stride={1}, Bool, CPU:0, 0x281119d8f90]

a.any = True
Tensor[shape={}, stride={}, Bool, CPU:0, 0x281119d8f70]
a.all = False
Tensor[shape={}, stride={}, Bool, CPU:0, 0x281119d8dd0]

c AND d = [False False True False]
Tensor[shape={4}, stride={1}, Bool, CPU:0, 0x281119d8f10]
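
The truthiness rule for non-boolean tensors is the same one Python applies to numbers: zero is false and anything else is true. A plain-Python sketch of the `c AND d` result above, using lists as stand-ins for the tensors:

```python
c = [2.0, 0.0, 3.5, 0.0]
d = [0.0, 3.0, 1.5, 0.0]

# Element-wise AND after converting each value to its truth value.
c_and_d = [bool(x) and bool(y) for x, y in zip(c, d)]
print(c_and_d)  # [False, False, True, False]
```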


In [33]:
a = o3c.Tensor(np.array([1, 2, 3, 4]), dtype=o3c.Dtype.Float64)
b = o3c.Tensor(np.array([1, 1.99999, 3, 4]))

# Throws an exception if the device/dtype is not the same.
# Returns False if the shape is not the same.
print("allclose : {}".format(a.allclose(b)))

# Throws an exception if the device/dtype/shape is not the same.
print("isclose : {}".format(a.isclose(b)))

# Returns False if the device/dtype/shape is not the same.
print("issame : {}".format(a.issame(b)))

allclose : True
isclose : [True True True True]
Tensor[shape={4}, stride={1}, Bool, CPU:0, 0x281119d8e20]
issame : False
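
The output above treats 2 and 1.99999 as close. A minimal sketch of why, assuming the NumPy-style test `|a - b| <= atol + rtol * |b|` with defaults `rtol=1e-5` and `atol=1e-8` (these defaults and the exact formula are assumptions here and should be checked against the Open3D API reference):

```python
def is_close(a, b, rtol=1e-5, atol=1e-8):
    """NumPy-style closeness test; tolerances are assumed defaults."""
    return abs(a - b) <= atol + rtol * abs(b)


a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 1.99999, 3.0, 4.0]

elementwise = [is_close(x, y) for x, y in zip(a, b)]
print(elementwise)             # [True, True, True, True]
print(all(elementwise))        # True
print(is_close(2.0, 1.9999))   # False: the gap of 1e-4 exceeds the tolerance
```

Under these assumptions, the difference of about 1e-5 is below the allowed 1e-8 + 1e-5 × 1.99999 ≈ 2e-5, so the element passes.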


## Comparison operations

In [34]:
a = o3c.Tensor([0, 1, -1])
b = o3c.Tensor([0, 0, 0])

print("a > b = {}".format(a > b))
print("a >= b = {}".format(a >= b))
print("a < b = {}".format(a < b))
print("a <= b = {}".format(a <= b))
print("a == b = {}".format(a == b))
print("a != b = {}".format(a != b))

# Throws an exception if the device/dtype is not the same.
# If the shapes differ, the tensors must be broadcast compatible.
print("a > b = {}".format(a > b[0]))

a > b = [False True False]
Tensor[shape={3}, stride={1}, Bool, CPU:0, 0x281119d9140]
a >= b = [True True False]
Tensor[shape={3}, stride={1}, Bool, CPU:0, 0x281119d90c0]
a < b = [False False True]
Tensor[shape={3}, stride={1}, Bool, CPU:0, 0x281119d90f0]
a <= b = [True False True]
Tensor[shape={3}, stride={1}, Bool, CPU:0, 0x281119d8e50]
a == b = [True False False]
Tensor[shape={3}, stride={1}, Bool, CPU:0, 0x281119d9050]
a != b = [False True True]
Tensor[shape={3}, stride={1}, Bool, CPU:0, 0x281119d8eb0]
a > b = [False True False]
Tensor[shape={3}, stride={1}, Bool, CPU:0, 0x281119d90a0]


## Nonzero operations

In [35]:
a = o3c.Tensor([[3, 0, 0], [0, 4, 0], [5, 6, 0]])

print("a = \n{}\n".format(a))
print("a.nonzero() = \n{}\n".format(a.nonzero()))
print("a.nonzero(as_tuple = 1) = \n{}".format(a.nonzero(as_tuple = 1)))

a = 
[[3 0 0],
 [0 4 0],
 [5 6 0]]
Tensor[shape={3, 3}, stride={3, 1}, Int32, CPU:0, 0x28111b38d80]

a.nonzero() = 
[[0 1 2 2]
Tensor[shape={4}, stride={1}, Int64, CPU:0, 0x28111b38c60], [0 1 0 1]
Tensor[shape={4}, stride={1}, Int64, CPU:0, 0x28111b38a50]]

a.nonzero(as_tuple = 1) = 
[[0 1 2 2],
 [0 1 0 1]]
Tensor[shape={2, 4}, stride={4, 1}, Int64, CPU:0, 0x28111c82160]
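
Both output forms above carry the same information: the row indices and the column indices of the non-zero entries, listed in row-major order. The plain-Python sketch below derives those two index lists for the same matrix, using a nested list as a stand-in for the tensor.

```python
a = [[3, 0, 0],
     [0, 4, 0],
     [5, 6, 0]]

# Collect (row, column) pairs of non-zero entries in row-major order.
coords = [(r, c)
          for r, row in enumerate(a)
          for c, value in enumerate(row)
          if value != 0]

rows = [r for r, _ in coords]
cols = [c for _, c in coords]
print(rows)  # [0, 1, 2, 2]
print(cols)  # [0, 1, 0, 1]
```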
