Be sure to make a copy of the notebook in your drive so you don't lose your work!

# Concept Questions

1. Why do we use tokens to represent language?
2. True or False: The only valid tokenizations are word and character-level tokenizations.
3. What is the benefit of using token IDs instead of raw tokens?
4. Determine whether the following are valid assignments of token IDs. If invalid, give a reason why.

    a. `The quick brown fox jumps high` → `110, 125, 42, 71, 8, 96`

    b. `The quick brown fox jumps high` → `143, 54, -73, 62, 83, 24`

    c. `The quick brown fox jumps high` → `110, 125, 83, 71, 83, 96`

    d. `The big plane flies over a small plane` → `54, 75, 86, 12, 77, 90, 123, 170`

    e. `The big plane flies over a small plane` → `54, 75, 86, 12, 77, 90, 123, 86`

5. Suppose the text is `The quick brown fox jumps high`. For each input context window, give a corresponding target context window.

    a. `The quick brown fox`

    b. `The quick brown`

    c. `brown fox jumps`

    d. `quick brown fox`

6. Why do we use embeddings?

1. We want to be able to convert words or text in general into numbers since LLMs can easily process numbers but cannot directly process raw text.
2. False. Word-level and character-level tokenizations are two types of tokenization, but there are others, such as subword tokenization (parts of words). Different LLMs can use different tokenization methods and even different vocabularies (e.g., ChatGPT might want to use a larger vocabulary than the GPT we're making in this class).
3. We want to index the words to convert them into numbers. Once we have the indices, it is easy for a model to map them to embeddings.
4.

  a. Valid. Each token is different and is mapped to a different token ID.

  b. Invalid. Token IDs should be nonnegative.

  c. Invalid. `brown` and `jumps` are both mapped to token ID 83, which violates the rule that different tokens are mapped to different token IDs.

  d. Invalid. The word `plane` appears twice, and has two different token IDs, which violates the rule that the same token should be mapped to the same token ID.

  e. Valid. Different tokens are mapped to different token IDs, and equivalent tokens are mapped to the same token ID.

5.

  a. `quick brown fox jumps`

  b. `quick brown fox`

  c. `fox jumps high`

  d. `brown fox jumps`
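
The two rules used in question 4 (different tokens get different IDs, and a repeated token reuses the same ID) can be checked mechanically. The sketch below assumes whitespace-separated words as tokens; the helper name `is_valid_assignment` is made up for this illustration and is not part of any library.

```python
def is_valid_assignment(tokens, ids):
    """Return True if ids is a consistent token-ID assignment for tokens."""
    if len(tokens) != len(ids):
        return False
    token_to_id = {}
    id_to_token = {}
    for tok, i in zip(tokens, ids):
        if i < 0:
            return False  # token IDs should be nonnegative
        if token_to_id.setdefault(tok, i) != i:
            return False  # same token mapped to two different IDs
        if id_to_token.setdefault(i, tok) != tok:
            return False  # two different tokens share one ID
    return True

sentence = "The big plane flies over a small plane".split()
print(is_valid_assignment(sentence, [54, 75, 86, 12, 77, 90, 123, 170]))  # False
print(is_valid_assignment(sentence, [54, 75, 86, 12, 77, 90, 123, 86]))   # True
```

Keeping a dictionary in each direction is what lets a single pass catch both kinds of violation.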

# Tensor Introduction

Importing PyTorch

In [1]:
import torch
import torchvision

Vectors are represented in PyTorch using **tensors**:

In [2]:
x = torch.tensor([3, -7, 5])
print(x)
y = torch.tensor([2.4, 1.6, -4])
print(y)

tensor([ 3, -7,  5])
tensor([ 2.4000,  1.6000, -4.0000])


Tensors of the same size can be added together (same as vector addition). Note that we have to run the previous cell for the following cell to work.

In [4]:
z = x + y
w = x - y
print(z)
print(w)

tensor([ 5.4000, -5.4000,  1.0000])
tensor([ 0.6000, -8.6000,  9.0000])


Vector tensors of different lengths cannot be added or subtracted (unless one of them has length 1):

In [8]:
x = torch.tensor([1, 2, 3])
y = torch.tensor([3, 5, 7, 2])
# print(x + y) # error
# print(x - y) # error

You can get the size of a tensor:

In [9]:
x = torch.tensor([1, 5, 3, 6])
s = x.shape
print(s)

torch.Size([4])


**Exercise 1:** Create three tensors of size 7. Then, create another tensor equal to the sum of those three tensors and print it.

In [10]:
# Exercise 1: Your code here
a = torch.tensor([1, 2, 3, 4, 5, 6, 7])
b = torch.tensor([-5, 6.7, 3.5, 2.4, 5.8, 1, 5])
c = torch.tensor([2.5, 4.6, 7.3, 9.2, 3.4, -4.6, 15])
d = a + b + c
print(d)

tensor([-1.5000, 13.3000, 13.8000, 15.6000, 14.2000,  2.4000, 27.0000])


Tensors can also be multiplied by a constant (same as vector scalar multiplication):

In [11]:
x = torch.tensor([4.5, 2, 1.6, 7])
y = 2 * x
z = -3.7 * x
print(y)
print(z)

tensor([ 9.0000,  4.0000,  3.2000, 14.0000])
tensor([-16.6500,  -7.4000,  -5.9200, -25.9000])


Tensors of the same size can be multiplied together (component-wise multiplication):

In [13]:
x = torch.tensor([1, 2, 5])
y = torch.tensor([2, 3, 4])
print(x * y)

tensor([ 2,  6, 20])


You can also take the dot product of two tensors:

In [14]:
z = x @ y
print(z)

tensor(28)
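
The dot product is the sum of the component-wise products, which connects the last two cells. A minimal check using the same values as above:

```python
import torch

x = torch.tensor([1, 2, 5])
y = torch.tensor([2, 3, 4])

dot = x @ y              # dot product
summed = (x * y).sum()   # component-wise product, then add up the components
print(dot, summed)
print(torch.equal(dot, summed))
```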


You can get each component of a tensor:

In [16]:
x = torch.tensor([3, 4, 1])
print(x[0])
print(x[1])
print(x[2])
print(x[-1])
print(x[0:2])

tensor(3)
tensor(4)
tensor(1)
tensor(1)
tensor([3, 4])


**Exercise 2:** Create a function that takes a list of tensors and returns the number of tensors that represent a vector of 3 components.

In [18]:
def get_3_component_vectors(tensor_list):
  # Your code here
  count = 0
  for t in tensor_list:
    if t.shape == torch.Size([3]):
      count += 1
  return count

tl = [torch.tensor([6]), torch.tensor([1, 2, 3]), torch.tensor([2, 3, 5]), torch.tensor([1, 3, 6, 8])]
print(get_3_component_vectors(tl))

2


You can change the type of a tensor:

In [None]:
print(x)
y = x.float()
print(y)
z = y.int()
print(z)

tensor([3, 4, 1])
tensor([3., 4., 1.])
tensor([3, 4, 1], dtype=torch.int32)


You can get the magnitude of a tensor that represents a vector (needs a float tensor):

In [20]:
print(torch.linalg.norm(x.float()))
# print(torch.linalg.norm(x)) # error: x is an integer tensor

tensor(5.0990)


**Exercise 3:** Create a function that takes in a list of tensors representing vectors and returns the one with largest magnitude.

In [24]:
def largest_magnitude_tensor(tensor_list):
  # Your code here
  result = None
  for t in tensor_list:
    if result is None or torch.linalg.norm(t.float()) > torch.linalg.norm(result.float()):
      result = t
  return result

tl = [torch.tensor([5]), torch.tensor([4, 6]), torch.tensor([7]), torch.tensor([6, 6]), torch.tensor([2, 2, 2, 2, 2, 2, 2])]
print(largest_magnitude_tensor(tl))

tensor([6, 6])


Tensors can also be multidimensional themselves, which is useful for representing context windows or batches:

In [25]:
x = torch.tensor([[3, 4, 2], [7, 5, 6]])
print(x)
print(x.shape)
y = torch.tensor([[[45, 78, 65], [24, 123, 207]], [[23, 0, 50], [255, 127, 6]]]) # 2x2x3 image (2x2 colored image)
print(y)
print(y.shape)

tensor([[3, 4, 2],
        [7, 5, 6]])
torch.Size([2, 3])
tensor([[[ 45,  78,  65],
         [ 24, 123, 207]],

        [[ 23,   0,  50],
         [255, 127,   6]]])
torch.Size([2, 2, 3])


You can get a specific dimension of the size:

In [26]:
x = torch.tensor([[3, 4, 2], [7, 5, 6]]) # 2x3 greyscale image
print(x)
print(x.shape)
print(x.shape[0])
print(x.shape[1])
print(type(x.shape[0]))

tensor([[3, 4, 2],
        [7, 5, 6]])
torch.Size([2, 3])
2
3
<class 'int'>


**Check your knowledge:**

1. What is the size of the tensor `torch.tensor([5, 7, 1, 6, 8, 2])`?
2. What is the size of the tensor `torch.tensor([[2, 3, 5, 7], [4, 3, -1, 6]])`?
3. What is the size of the tensor `torch.tensor([[[5, 2], [1, 2]], [[3, 4], [5, 6]]])`?
4. Suppose x is a 5-element 1D tensor. How do you get the number of elements in x as an int instead of a `torch.Size` object?
5. Suppose x is a 5x4 tensor. How do you get the number of rows in x? What about columns?

1. 6
2. 2x4
3. 2x2x2
4. `x.shape[0]` (`x.shape` is `torch.Size([5])`, and when we index into it we get an int)
5. `x.shape[0]` for rows, `x.shape[1]` for columns

You can flatten a multidimensional tensor into a vector tensor:

In [27]:
x_flat = x.flatten()
print(x_flat)
print(x_flat.shape)

tensor([3, 4, 2, 7, 5, 6])
torch.Size([6])


Flattening also works for tensors with 3 or more dimensions:

In [28]:
y_flat = y.flatten()
print(y_flat)
print(y_flat.shape)

tensor([ 45,  78,  65,  24, 123, 207,  23,   0,  50, 255, 127,   6])
torch.Size([12])


**Exercise 4:** Create a function that takes a list of tensors and returns a copy of the list except with all elements flattened.

In [30]:
def all_flattened(tensor_list):
  # Your code here
  result = []
  for t in tensor_list:
    result.append(t.flatten())
  return result

tl = [torch.tensor([2, 3, 4]), torch.tensor([[2, 3, 4], [5, 6, 7], [9, 8, 7]]), torch.tensor([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])]
print(all_flattened(tl))

[tensor([2, 3, 4]), tensor([2, 3, 4, 5, 6, 7, 9, 8, 7]), tensor([1, 2, 3, 4, 5, 6, 7, 8])]


You can take the transpose of a tensor with two dimensions (swaps the two dimensions):

In [31]:
x = torch.tensor([[3, 4, 2], [7, 5, 6]]) # 2x3 greyscale image
y = x.T # 3x2 greyscale image
print(y)
print(y.shape)

tensor([[3, 7],
        [4, 5],
        [2, 6]])
torch.Size([3, 2])


You can swap the dimensions of a 3D tensor:

In [None]:
y = torch.tensor([[[45, 78, 65], [24, 123, 207]], [[23, 0, 50], [255, 127, 6]]])
print(torch.transpose(y, 1, 2)) # different size
print(torch.transpose(y, 0, 1)) # note this is the same size but a different tensor

tensor([[[ 45,  24],
         [ 78, 123],
         [ 65, 207]],

        [[ 23, 255],
         [  0, 127],
         [ 50,   6]]])
tensor([[[ 45,  78,  65],
         [ 23,   0,  50]],

        [[ 24, 123, 207],
         [255, 127,   6]]])


**Check your knowledge:**

Suppose x is a 2x3x4 tensor. What is the size of `torch.transpose(x, 1, 2)`? What about `torch.transpose(x, 0, 1)`?

2x4x3, 3x2x4
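
These sizes can be confirmed directly. The sketch below fills a 2x3x4 tensor with the values 0 to 23 (an arbitrary choice, made so that individual elements can be tracked) and also shows where a single element ends up after each swap.

```python
import torch

x = torch.arange(24).reshape(2, 3, 4)   # 2x3x4 tensor holding 0..23
a = torch.transpose(x, 1, 2)            # swap dimensions 1 and 2
b = torch.transpose(x, 0, 1)            # swap dimensions 0 and 1
print(a.shape)  # torch.Size([2, 4, 3])
print(b.shape)  # torch.Size([3, 2, 4])

# The element at x[i, j, k] is found at a[i, k, j] and at b[j, i, k]
print(x[1, 2, 3].item(), a[1, 3, 2].item(), b[2, 1, 3].item())  # 23 23 23
```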

You can easily create a tensor filled with zeros:

In [None]:
x = torch.zeros(5)
y = torch.zeros(3, 5)
z = torch.zeros(2, 3, 4)
print(x)
print(y)
print(z)

tensor([0., 0., 0., 0., 0.])
tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
tensor([[[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]],

        [[0., 0., 0., 0.],
         [0., 0., 0., 0.],
         [0., 0., 0., 0.]]])


You can also easily create a tensor filled with ones:

In [32]:
x = torch.ones(5)
y = torch.ones(3, 5)
z = torch.ones(2, 3, 4)
print(x)
print(y)
print(z)

tensor([1., 1., 1., 1., 1.])
tensor([[1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1.]])
tensor([[[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]],

        [[1., 1., 1., 1.],
         [1., 1., 1., 1.],
         [1., 1., 1., 1.]]])


**Check your knowledge:**

What is the magnitude of `torch.zeros(5)`? What about `torch.ones(5)`?

0, $\sqrt{5}$

You can reshape tensors into the size you want:

In [36]:
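x = torch.ones(12)
print(x)
# reshape returns a tensor with the new size; the number of elements must stay the same
# (the in-place x.resize_(...) also changes the size, but it modifies x itself)
y = x.reshape(2, 6)
print(y)
z = y.reshape(4, 3)
print(z)
w = z.reshape(3, 2, 2)
print(w)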

tensor([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
tensor([[1., 1., 1., 1., 1., 1.],
        [1., 1., 1., 1., 1., 1.]])
tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]])


**Check your knowledge:**

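Can you reshape an 8x3 tensor into a 3x4x2 tensor? What about 3x5x2?

Yes (both sizes hold 24 elements)

No, since 3x5x2 holds 30 elements (technically the in-place `resize_` allows it, but the extra elements are uninitialized, so you don't want to do that)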
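
The answers come down to counting elements: reshaping keeps every element, so the product of the new dimensions has to equal the number of elements in the tensor. A small sketch of that rule, using `reshape` (which raises an error on a mismatch); the helper `can_reshape` is hypothetical and written only for this illustration.

```python
import math
import torch

def can_reshape(t, new_shape):
    """True if t has exactly as many elements as new_shape requires."""
    return t.numel() == math.prod(new_shape)

x = torch.ones(8, 3)                 # 24 elements
print(can_reshape(x, (3, 4, 2)))     # True:  3 * 4 * 2 = 24
print(can_reshape(x, (3, 5, 2)))     # False: 3 * 5 * 2 = 30

y = x.reshape(3, 4, 2)               # works
try:
    x.reshape(3, 5, 2)               # element counts differ
except RuntimeError as err:
    print("reshape failed:", err)
```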

You can concatenate two vectors (join them end to end), whatever their lengths:

In [None]:
a = torch.zeros(5)
b = torch.zeros(11)
c = torch.cat([a, b])
print(c)
print(c.shape)

tensor([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
torch.Size([16])


Every dimension must match except for the specified one, and the number of dimensions must match:

In [41]:
a = torch.zeros((5, 7))
b = torch.zeros((3, 7))
c = torch.cat([a, b], dim=0)
print(a)
print(b)
print(c)
print(c.shape)
#d = torch.cat([a, b], dim=1) # error since dimension 0 doesn't match (5 != 3)
#e = torch.cat([torch.zeros((2, 4)), torch.zeros((2, 4, 1))])
# error since a 2D tensor cannot be concatenated with a 3D tensor

tensor([[0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.]])
tensor([[0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0.]])
torch.Size([8, 7])


**Exercise 5:** Create a function that takes a list of tensors and a specified dimension and returns whether they can be concatenated along that dimension.

In [None]:
def can_concatenate(tensor_list, dim):
  # Your code here
  pass
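
One possible approach to Exercise 5, offered as a sketch rather than the intended solution: it applies the rule stated above (same number of dimensions, and every dimension except `dim` must match). It assumes `dim` is a nonnegative index and only compares shapes, so it ignores any special cases `torch.cat` itself may allow.

```python
import torch

def can_concatenate(tensor_list, dim):
    """Shape-only check: could the tensors be concatenated along dim?"""
    if len(tensor_list) == 0:
        return False
    first = tensor_list[0]
    if dim < 0 or dim >= first.dim():
        return False  # dim must be a valid (nonnegative) dimension index
    for t in tensor_list[1:]:
        if t.dim() != first.dim():
            return False  # number of dimensions must match
        for d in range(first.dim()):
            if d != dim and t.shape[d] != first.shape[d]:
                return False  # every other dimension must match
    return True

a = torch.zeros((5, 7))
b = torch.zeros((3, 7))
print(can_concatenate([a, b], 0))  # True
print(can_concatenate([a, b], 1))  # False
print(can_concatenate([torch.zeros((2, 4)), torch.zeros((2, 4, 1))], 0))  # False
```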

**Exercise 6:** Create a function that takes a list of tensors and returns the dot product of the two vectors with the largest magnitudes.

In [None]:
def largest_magnitude_dot_product(tensor_list):
  # Your code here
  pass
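
One possible approach to Exercise 6, again only a sketch: sort the vectors by magnitude and take the dot product of the top two. It assumes the list holds at least two 1D tensors and that the two largest have the same length; the sample list is made up for illustration.

```python
import torch

def largest_magnitude_dot_product(tensor_list):
    """Dot product of the two vectors with the largest magnitudes."""
    ranked = sorted(
        tensor_list,
        key=lambda t: torch.linalg.norm(t.float()).item(),
        reverse=True,
    )
    a, b = ranked[0].float(), ranked[1].float()
    return a @ b

tl = [torch.tensor([1, 2]), torch.tensor([3, 4]), torch.tensor([6, 0]), torch.tensor([0, 1])]
# Magnitudes are about 2.24, 5, 6 and 1, so the top two are [6, 0] and [3, 4]
print(largest_magnitude_dot_product(tl))  # tensor(18.)
```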

# Handling Data

In [42]:
# Install dependencies
# ! in colab means a terminal command
!pip install tiktoken



In [49]:
# Import libraries
import tiktoken
from torch.utils.data import Dataset, DataLoader
import torch

tokenizer = tiktoken.get_encoding("gpt2")

# Dataset class
class MyData(Dataset):
    # Init function, called when the dataset is created
    # dataset = MyData(text, tokenizer, context_length=4, stride=1)
    def __init__(self, text, tokenizer, context_length, stride=1):
        self.input_ids = []
        self.target_ids = []
        token_ids = tokenizer.encode(text)
        for i in range(0, len(token_ids) - context_length, stride):
            self.input_ids.append(torch.tensor(token_ids[i : i + context_length]))
            self.target_ids.append(torch.tensor(token_ids[i + 1 : i + context_length + 1]))

    # Length function
    # len(dataset)
    def __len__(self):
        return len(self.input_ids)

    # Get item function
    # dataset[idx]
    def __getitem__(self, idx):
        return self.input_ids[idx], self.target_ids[idx]

In [51]:
text = "Hello my name is Daniel Li"
dataset = MyData(text, tokenizer, context_length=4, stride=1)
print(dataset[0]) # First input, target pair
print(dataset[1]) # Second input, target pair
print(dataset[0][0]) # First input window
print(dataset[0][1]) # First target window
print(dataset[0][1][2]) # Third token of first target window

(tensor([15496,   616,  1438,   318]), tensor([ 616, 1438,  318, 7806]))
(tensor([ 616, 1438,  318, 7806]), tensor([1438,  318, 7806, 7455]))
tensor([15496,   616,  1438,   318])
tensor([ 616, 1438,  318, 7806])
tensor(318)
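
The number of (input, target) pairs follows from the loop bounds in `__init__`: `i` runs over `range(0, len(token_ids) - context_length, stride)`. The sketch below reproduces that loop on a plain list of the token IDs that appear in the output above, so it runs without the tokenizer; `make_windows` is a hypothetical helper name.

```python
def make_windows(token_ids, context_length, stride=1):
    """Build (input, target) windows the same way MyData.__init__ does."""
    pairs = []
    for i in range(0, len(token_ids) - context_length, stride):
        inputs = token_ids[i : i + context_length]
        targets = token_ids[i + 1 : i + context_length + 1]  # shifted by one token
        pairs.append((inputs, targets))
    return pairs

token_ids = [15496, 616, 1438, 318, 7806, 7455]  # IDs taken from the output above
for inputs, targets in make_windows(token_ids, context_length=4, stride=1):
    print(inputs, "->", targets)

# A larger stride skips starting positions, so fewer windows are produced
print(len(make_windows(token_ids, context_length=4, stride=2)))  # 1
```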


In [53]:
def my_batch(text, batch_size, context_length, stride, shuffle=True, drop_last=True, num_workers=0):
    tokenizer = tiktoken.get_encoding("gpt2")

    # Create the dataset object
    dataset = MyData(text, tokenizer, context_length, stride)

    # Use the DataLoader library to create a dataloader that batches the data
    dataloader = DataLoader(dataset,
                            batch_size=batch_size,
                            shuffle=shuffle,
                            drop_last=drop_last,
                            num_workers=num_workers)

    return dataloader

text = "Hello my name is Daniel Li and I am currently teaching how to build an LLM"
dataloader = my_batch(text, batch_size=2, context_length=4, stride=1)
for batch in dataloader:
    print(batch)
    input = batch[0]    # 2x4 tensor where 2 is batch size and 4 is context length
    target = batch[1]
    # print(input)
    # print(target)

[tensor([[ 716, 3058, 7743,  703],
        [7743,  703,  284, 1382]]), tensor([[3058, 7743,  703,  284],
        [ 703,  284, 1382,  281]])]
[tensor([[15496,   616,  1438,   318],
        [  290,   314,   716,  3058]]), tensor([[ 616, 1438,  318, 7806],
        [ 314,  716, 3058, 7743]])]
[tensor([[ 314,  716, 3058, 7743],
        [1438,  318, 7806, 7455]]), tensor([[ 716, 3058, 7743,  703],
        [ 318, 7806, 7455,  290]])]
[tensor([[ 7806,  7455,   290,   314],
        [  284,  1382,   281, 27140]]), tensor([[ 7455,   290,   314,   716],
        [ 1382,   281, 27140,    44]])]
[tensor([[3058, 7743,  703,  284],
        [ 616, 1438,  318, 7806]]), tensor([[7743,  703,  284, 1382],
        [1438,  318, 7806, 7455]])]
[tensor([[ 703,  284, 1382,  281],
        [ 318, 7806, 7455,  290]]), tensor([[  284,  1382,   281, 27140],
        [ 7806,  7455,   290,   314]])]


In [None]:
for input, target in dataloader:
    # print(input)
    # print(target)
    pass

In [56]:
# torch.manual_seed(1337)

vocab_size = 50257 # Number of different tokens in our vocabulary
output_size = 256 # Length of each vector we are mapping the tokens to
# For example, "My" gets mapped to a 256-dimension vector
embedding_layer = torch.nn.Embedding(vocab_size, output_size)
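# Conceptually, token ID 1593 is a one-hot vector:
# 1593 -> [0, 0, 0, 0, ..., 0, 0, 1, 0, 0, ..., 0, 0, 0, 0]
# 1 is at index 1593, everything else is 0
# Apply Embedding:
# The token goes from a 50257-dim vector to a 256-dim vector
# This is the same as multiplying the one-hot vector by the layer's 50257x256
# weight matrix (in practice the layer just looks up row 1593)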

first_input_batch, first_target_batch = next(iter(dataloader))
print(first_input_batch.shape)
first_input_token_embeddings = embedding_layer(first_input_batch)
# Embedding layers (and other things in PyTorch) automatically take the batch
# dimension into account
# Example: with batch size 8, the input would be 8x4 and the output 8x4x256
first_target_token_embeddings = embedding_layer(first_target_batch)
print(first_input_token_embeddings)
print(first_input_token_embeddings.shape) # [2, 4, 256]
# 2 is batch size
# 4 is context size
# 256 is embedding size
# Embedding changed 2x4 into 2x4x256

torch.Size([2, 4])
tensor([[[ 1.2685, -0.4092, -1.0066,  ...,  0.3432, -0.2911, -0.2079],
         [ 0.0692,  0.5860,  0.6057,  ..., -0.8112, -0.3368,  0.3634],
         [-0.3293, -2.1463,  0.4657,  ...,  0.5346,  0.4552,  0.4563],
         [-0.7555, -1.1228,  0.4135,  ..., -0.6309,  2.3381, -0.0665]],

        [[-0.3141, -0.5543, -0.2461,  ..., -1.0251, -1.3119,  0.1267],
         [ 0.9996,  0.6607,  1.0238,  ..., -1.3405,  1.4056, -0.1574],
         [ 0.2652,  0.4399,  0.6939,  ...,  0.2246,  0.3664,  0.4520],
         [ 0.1291, -0.9978,  0.2440,  ..., -0.0531,  0.8474, -0.2260]]],
       grad_fn=<EmbeddingBackward0>)
torch.Size([2, 4, 256])
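
The comments in the cell above describe the embedding as a one-hot vector multiplied by a matrix. That equivalence can be checked on a small scale. The sizes here (10 tokens, 4 dimensions) are stand-ins chosen for readability, not the values used in the notebook.

```python
import torch

torch.manual_seed(0)
vocab_size, embed_dim = 10, 4   # small stand-ins for 50257 and 256
embedding_layer = torch.nn.Embedding(vocab_size, embed_dim)

token_ids = torch.tensor([3, 7, 3])
looked_up = embedding_layer(token_ids)   # shape [3, 4]

# One row per token, with a single 1 at the token's ID
one_hot = torch.nn.functional.one_hot(token_ids, num_classes=vocab_size).float()  # [3, 10]
multiplied = one_hot @ embedding_layer.weight   # [3, 10] @ [10, 4] -> [3, 4]

print(torch.allclose(looked_up, multiplied))      # True
print(torch.equal(looked_up[0], looked_up[2]))    # True: same token ID, same vector
```

The layer stores its weights as a `vocab_size x embed_dim` matrix, so the lookup simply returns row `token_id` without building the one-hot vector.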


In [59]:
context_length = 4
pos_embedding_layer = torch.nn.Embedding(context_length, output_size)
first_pos_embeddings = pos_embedding_layer(torch.arange(context_length))
print(torch.arange(4))
print(first_pos_embeddings)
print(first_pos_embeddings.shape)
# Position embedding is the same for every batch so there is no batch dimension

tensor([0, 1, 2, 3])
tensor([[-1.1189, -0.1744, -1.2138,  ..., -0.9903,  0.8553, -0.9537],
        [ 0.0557,  0.5771, -0.9318,  ..., -0.3697,  0.2683, -1.1061],
        [-0.1161,  0.6960, -1.9962,  ...,  0.8197,  1.2191, -2.9642],
        [ 0.6579,  1.0407, -0.3774,  ...,  0.8236, -0.7525, -0.9307]],
       grad_fn=<EmbeddingBackward0>)
torch.Size([4, 256])


In [60]:
print(first_pos_embeddings.shape)

torch.Size([4, 256])


In [61]:
print(torch.arange(4))

tensor([0, 1, 2, 3])


In [64]:
first_input_embedding = first_input_token_embeddings + first_pos_embeddings
# We're adding 2x4x256 to 4x256
# PyTorch treats the 4x256 as a 2x4x256 composed of two copies of the 4x256
# (this is called broadcasting)
first_target_embedding = first_target_token_embeddings + first_pos_embeddings
print(first_input_embedding)
print(first_input_embedding.shape)
print(first_target_embedding)
print(first_target_embedding.shape)

tensor([[[ 0.1497, -0.5836, -2.2204,  ..., -0.6470,  0.5641, -1.1616],
         [ 0.1249,  1.1631, -0.3261,  ..., -1.1809, -0.0685, -0.7427],
         [-0.4454, -1.4503, -1.5305,  ...,  1.3544,  1.6743, -2.5079],
         [-0.0976, -0.0821,  0.0360,  ...,  0.1928,  1.5856, -0.9973]],

        [[-1.4330, -0.7287, -1.4599,  ..., -2.0154, -0.4566, -0.8270],
         [ 1.0553,  1.2378,  0.0921,  ..., -1.7102,  1.6739, -1.2635],
         [ 0.1491,  1.1360, -1.3023,  ...,  1.0443,  1.5854, -2.5121],
         [ 0.7870,  0.0430, -0.1334,  ...,  0.7706,  0.0949, -1.1568]]],
       grad_fn=<AddBackward0>)
torch.Size([2, 4, 256])
tensor([[[-1.0497,  0.4116, -0.6081,  ..., -1.8015,  0.5185, -0.5903],
         [-0.2737, -1.5692, -0.4661,  ...,  0.1649,  0.7235, -0.6499],
         [-0.8716, -0.4268, -1.5827,  ...,  0.1889,  3.5572, -3.0307],
         [ 2.3901,  1.1450,  0.2383,  ...,  0.9267, -2.1957, -1.0663]],

        [[-0.1192,  0.4862, -0.1899,  ..., -2.3308,  2.2609, -1.1111],
         [ 0.320
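
The addition of a 2x4x256 tensor and a 4x256 tensor in the cell above relies on broadcasting. A small sketch with made-up values (batch size 2, context length 4, embedding size 3) shows that the result is the same as explicitly copying the position embeddings for each sequence in the batch.

```python
import torch

# Stand-in token embeddings: batch size 2, context length 4, embedding size 3
token_emb = torch.arange(24, dtype=torch.float32).reshape(2, 4, 3)
# Stand-in position embeddings: one row per position, no batch dimension
pos_emb = torch.tensor([[0.0, 0.0, 0.0],
                        [10.0, 10.0, 10.0],
                        [20.0, 20.0, 20.0],
                        [30.0, 30.0, 30.0]])

summed = token_emb + pos_emb                                   # 2x4x3 + 4x3
explicit = token_emb + pos_emb.unsqueeze(0).expand(2, 4, 3)    # copy pos_emb per sequence

print(summed.shape)                   # torch.Size([2, 4, 3])
print(torch.equal(summed, explicit))  # True
# Every sequence in the batch received the same position offsets
print(torch.equal(summed[0] - token_emb[0], summed[1] - token_emb[1]))  # True
```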

# Exercises

**Exercise 7:** Create your own text (must be at least 10 words long). Make a dataloader of the text with batch size 3, context size 5, and stride 1.

In [65]:
my_text = "This is a completely useless sentence that is over ten words long"
# Your code here
loader = my_batch(my_text, batch_size=3, context_length=5, stride=1)

**Exercise 8:** Make input and target embeddings of the text with embedding size 512. Be sure to utilize both token and position embeddings.

In [66]:
# Your code here
tok_emb = torch.nn.Embedding(50257, 512)
pos_emb = torch.nn.Embedding(5, 512) # one row per position in the context window
input_embeddings = []
target_embeddings = []
for input, target in loader:
  input_embedding = tok_emb(input) + pos_emb(torch.arange(5))
  target_embedding = tok_emb(target) + pos_emb(torch.arange(5))
  input_embeddings.append(input_embedding)
  target_embeddings.append(target_embedding)


print(input_embeddings)
print(target_embeddings)

[tensor([[[ 2.6686, -1.1294,  0.4105,  ..., -1.2605,  0.3527,  1.1787],
         [-0.1549,  0.4581, -0.0148,  ..., -0.0151,  2.3719,  1.2134],
         [ 1.3821, -0.0352,  0.4479,  ...,  0.6514,  0.1744,  3.3764],
         [-1.0268,  0.2352,  0.4442,  ...,  0.0957,  0.9448,  1.6880],
         [-1.2771,  0.4618,  1.0017,  ..., -0.8073, -0.2826, -3.4715]],

        [[ 1.3718,  0.6366,  1.5172,  ...,  1.1867,  1.2931,  0.1920],
         [ 0.1375,  0.8624, -0.2414,  ...,  0.8072,  0.9970,  3.0352],
         [-0.4100, -0.3262, -0.6987,  ...,  1.8427,  0.1628,  2.6847],
         [ 0.7345,  0.3516,  0.0804,  ..., -0.6139,  0.3109,  0.5245],
         [-0.2748, -0.7117, -1.0574,  ...,  0.6490, -0.5392, -2.0689]],

        [[ 0.5294,  0.5207,  0.9040,  ...,  0.3925,  0.0146,  1.1417],
         [-0.6442,  1.0822,  1.1887,  ...,  0.5585,  0.6905,  1.9299],
         [ 0.6552, -1.1436, -0.8208,  ...,  1.6406, -0.6324,  0.8812],
         [ 1.0263, -0.6653,  0.4868,  ..., -0.1060, -0.2947,  1.1716],
 