# Demo Tensors Dataset

This notebook demonstrates how to generate and use the Demo Tensors dataset, which covers tensor ranks from 0D (a scalar) to 5D.

The dataset is generated by `scripts/generate_tensors.py`, which writes a compressed NumPy archive (`.npz`) to `data/synthetic/demo_tensors/demo_tensors.npz` (relative to the repository root).

## 1. Generate the Dataset

The following cell runs `scripts/generate_tensors.py`.
The script generates eight tensors: `tensor_a`, `tensor_b`, `scalar_data`, `vector_data`, `image_grayscale_data`, `image_rgb_data`, `video_frames_data`, and `simulation_data`.
Because this notebook lives in the `notebooks/` directory, the paths to the script and to the output directory both start with `../`.
The command creates (or overwrites) `../data/synthetic/demo_tensors/demo_tensors.npz`.

In [1]:
!python ../scripts/generate_tensors.py ../data/synthetic/demo_tensors

Generated demo_tensors.npz in /app/data/synthetic/demo_tensors
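
The source of `scripts/generate_tensors.py` is not shown in this notebook, so the following is only a minimal sketch of how such an archive could be written with `np.savez_compressed`. The helper name `write_demo_tensors`, the `seed` parameter, and the use of uniform random values are assumptions made for illustration. The tensor names, the shapes, and the scalar value `42.0` are taken from the inspection output in section 2.

```python
import tempfile
from pathlib import Path

import numpy as np


def write_demo_tensors(output_dir, seed=0):
    """Hypothetical helper: write demo_tensors.npz into output_dir, return its path."""
    rng = np.random.default_rng(seed)  # seeding is an assumption, for reproducibility
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Names and shapes mirror the notebook output; the values are placeholders.
    tensors = {
        "tensor_a": rng.random((100, 3)),
        "tensor_b": rng.random((50, 10, 3)),
        "scalar_data": np.array(42.0),  # 0D array
        "vector_data": rng.random(150),
        "image_grayscale_data": rng.random((32, 32)),
        "image_rgb_data": rng.random((16, 16, 3)),
        "video_frames_data": rng.random((10, 8, 8, 1)),
        "simulation_data": rng.random((5, 6, 6, 3, 2)),
    }

    out_path = out_dir / "demo_tensors.npz"
    # Each keyword argument becomes one named array inside the archive.
    np.savez_compressed(out_path, **tensors)
    return out_path


# Write to a temporary directory so the real dataset is left untouched.
with tempfile.TemporaryDirectory() as tmp:
    path = write_demo_tensors(tmp)
    with np.load(path) as archive:
        shapes = {name: archive[name].shape for name in archive.files}

print(shapes["simulation_data"])  # (5, 6, 6, 3, 2)
```

Passing the arrays as keyword arguments is what gives each tensor its name in the archive, which is why the loading code can later look tensors up by key.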


## 2. Load and Inspect the Data

Now we'll load the generated `demo_tensors.npz` file using NumPy.
We then list all tensors stored in the file and inspect each one, printing its shape and data type (dtype) and, for the scalar, its value.

In [2]:
import numpy as np

# Define the path to the data file (relative to this notebook)
data_path = '../data/synthetic/demo_tensors/demo_tensors.npz'

# Load the archive; np.load returns a lazy NpzFile that reads each array on access
data = np.load(data_path)

print(f"Files in NPZ: {sorted(data.files)}\n")

# Define the tensors we expect to find (and their descriptions for printing)
tensor_info = {
    "tensor_a": "Original general 2D tensor",
    "tensor_b": "Original general 3D tensor",
    "scalar_data": "0D tensor (scalar)",
    "vector_data": "1D tensor (vector)",
    "image_grayscale_data": "2D tensor (grayscale image)",
    "image_rgb_data": "3D tensor (RGB image)",
    "video_frames_data": "4D tensor (sequence of grayscale frames)",
    "simulation_data": "5D tensor (higher-dimensional data)"
}

for name, description in tensor_info.items():
    if name in data:
        tensor = data[name]
        print(f"Tensor: '{name}' ({description})")
        print(f"  Shape: {tensor.shape}")
        print(f"  dtype: {tensor.dtype}")
        if name == 'scalar_data':
            print(f"  Value: {tensor}")
        print("---")
    else:
        print(f"Tensor: '{name}' was not found in the file.")
        print("---")

Files in NPZ: ['image_grayscale_data', 'image_rgb_data', 'scalar_data', 'simulation_data', 'tensor_a', 'tensor_b', 'vector_data', 'video_frames_data']

Tensor: 'tensor_a' (Original general 2D tensor)
  Shape: (100, 3)
  dtype: float64
---
Tensor: 'tensor_b' (Original general 3D tensor)
  Shape: (50, 10, 3)
  dtype: float64
---
Tensor: 'scalar_data' (0D tensor (scalar))
  Shape: ()
  dtype: float64
  Value: 42.0
---
Tensor: 'vector_data' (1D tensor (vector))
  Shape: (150,)
  dtype: float64
---
Tensor: 'image_grayscale_data' (2D tensor (grayscale image))
  Shape: (32, 32)
  dtype: float64
---
Tensor: 'image_rgb_data' (3D tensor (RGB image))
  Shape: (16, 16, 3)
  dtype: float64
---
Tensor: 'video_frames_data' (4D tensor (sequence of grayscale frames))
  Shape: (10, 8, 8, 1)
  dtype: float64
---
Tensor: 'simulation_data' (5D tensor (higher-dimensional data))
  Shape: (5, 6, 6, 3, 2)
  dtype: float64
---
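
The listing above can also be checked programmatically. As a rough sketch, the helper below compares each tensor's rank (`ndim`) against the rank given in its description, and shows how to pull a plain Python number out of the 0D `scalar_data` array with `.item()`. The names `EXPECTED_NDIM` and `find_rank_mismatches` are hypothetical, and the in-memory `demo` dictionary is a stand-in built from the shapes printed above so that the example runs without the file. With the real dataset, the loaded `data` object could be passed in instead.

```python
import numpy as np

# Expected rank per tensor, taken from the descriptions in the inspection output.
EXPECTED_NDIM = {
    "tensor_a": 2,
    "tensor_b": 3,
    "scalar_data": 0,
    "vector_data": 1,
    "image_grayscale_data": 2,
    "image_rgb_data": 3,
    "video_frames_data": 4,
    "simulation_data": 5,
}


def find_rank_mismatches(arrays, expected=EXPECTED_NDIM):
    """Return {name: (expected_ndim, actual_ndim)}; actual is None if missing."""
    mismatches = {}
    for name, ndim in expected.items():
        if name not in arrays:
            mismatches[name] = (ndim, None)
        elif arrays[name].ndim != ndim:
            mismatches[name] = (ndim, arrays[name].ndim)
    return mismatches


# Stand-in arrays with the shapes shown above (values are placeholders).
demo = {
    "tensor_a": np.zeros((100, 3)),
    "tensor_b": np.zeros((50, 10, 3)),
    "scalar_data": np.array(42.0),
    "vector_data": np.zeros(150),
    "image_grayscale_data": np.zeros((32, 32)),
    "image_rgb_data": np.zeros((16, 16, 3)),
    "video_frames_data": np.zeros((10, 8, 8, 1)),
    "simulation_data": np.zeros((5, 6, 6, 3, 2)),
}

print(find_rank_mismatches(demo))  # {}

# A 0D array has shape (); .item() converts it to a plain Python float.
scalar_value = demo["scalar_data"].item()
print(type(scalar_value).__name__, scalar_value)  # float 42.0

# A deliberately wrong rank is reported with both expected and actual values.
broken = dict(demo, vector_data=np.zeros((150, 1)))
print(find_rank_mismatches(broken))  # {'vector_data': (1, 2)}
```

Checking `ndim` rather than the full shape keeps the check valid if the generator's sizes change while the ranks stay the same.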
