# PyTorch Basics

- ~~Creating and manipulating tensors~~
- Dataset, Dataloader, TensorDataset
- Building a lightweight example [save this for chp 13?]

Chp 13: Going Deeper - mechanics of PyTorch 

- Computational Graphs and Autograd
- Modules: torch.nn: nn.Sequential, nn.Module
- Custom Layers using nn.Module

### Tensors

* PyTorch is built on _tensors_, which are multidimensional arrays, similar to NumPy arrays, that hold data and model parameters and can track gradients and run on accelerators.
* You can create _tensors_ from
    * lists
    * _numpy_ arrays
    * built-in initialization methods

In [16]:
import torch
import numpy as np

# from list
a = [1, 2, 3]
t_a = torch.tensor(a)
print(f"list: {t_a}")

# from numpy (from_numpy shares memory with the source array)
b = np.array([4, 5, 6], dtype=np.int32)
t_b = torch.from_numpy(b)
print(f"numpy: {t_b}")

# from ones (factory functions already return tensors)
t_c = torch.ones(2, 3)
print(f"ones: {t_c}")

# from uniform random values in [0, 1)
torch.manual_seed(1985)
t_d = torch.rand(2, 3)
print(f"rand: {t_d}")

list: tensor([1, 2, 3])
numpy: tensor([4, 5, 6], dtype=torch.int32)
ones: tensor([[1., 1., 1.],
        [1., 1., 1.]])
rand: tensor([[0.3651, 0.4031, 0.1296],
        [0.0609, 0.5880, 0.4440]])
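
One detail worth checking when building tensors from NumPy arrays is whether the data is copied or shared. The sketch below uses a made-up array `arr` to show that `torch.tensor` copies its input, while `torch.from_numpy` returns a tensor backed by the same memory.

```python
import numpy as np
import torch

arr = np.array([4, 5, 6], dtype=np.int32)  # illustrative data

copied = torch.tensor(arr)      # copies the data
shared = torch.from_numpy(arr)  # shares memory with `arr`

arr[0] = 99  # modify the NumPy array in place

print(f"copied: {copied}")  # not affected by the edit to `arr`
print(f"shared: {shared}")  # reflects the edit to `arr`
```

Sharing avoids a copy for large arrays, but it also means in-place edits on either side are visible on the other.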




You can access elements of a tensor using standard NumPy-style indexing and slicing.

In [32]:
torch.manual_seed(1985)
x = torch.rand(2, 3)

print(f"""
      x: {x}
      shape: {x.shape}
      first row: {x[0, :]}
      first column: {x[:, 0]}
      last element: {x[-1, -1]}
    """)


      x: tensor([[0.3651, 0.4031, 0.1296],
        [0.0609, 0.5880, 0.4440]])
      shape: torch.Size([2, 3])
      first row: tensor([0.3651, 0.4031, 0.1296])
      first column: tensor([0.3651, 0.0609])
      last element: 0.4439687728881836
    


You can change a tensor's data type using _Tensor.to()_:

In [7]:
t_a_new = t_a.to(torch.int64)
print(t_a_new.dtype)

torch.int64


You can reshape and transpose tensors using _torch.reshape()_ and _torch.transpose()_. 

In [10]:
# transpose
t = torch.rand(3, 5)
t_tr = t.transpose(0, 1)  # which two dimensions to transpose
print(t.shape, "-->", t_tr.shape)

# reshape
t = torch.zeros(30)
t_rs = t.reshape(5, 6)  # new shape; must hold the same 30 elements
print(t.shape, "-->", t_rs.shape)

torch.Size([3, 5]) --> torch.Size([5, 3])
torch.Size([30]) --> torch.Size([5, 6])
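
Two behaviors of reshaping and transposing are easy to miss, so here is a small sketch on an illustrative 6-element tensor. It assumes nothing beyond core PyTorch: passing `-1` lets `reshape` infer one dimension, and `transpose` returns a view that shares storage with its input rather than a copy.

```python
import torch

t = torch.arange(6)  # tensor([0, 1, 2, 3, 4, 5])

# -1 asks PyTorch to infer that dimension from the element count
m = t.reshape(2, -1)
print(m.shape)  # torch.Size([2, 3])

# transpose returns a view: it shares storage with `m` and is not contiguous
m_tr = m.transpose(0, 1)
m_tr[0, 0] = 100
print(m[0, 0].item())        # 100, the write is visible through `m`
print(m_tr.is_contiguous())  # False
```

If a contiguous copy is needed after transposing, `m_tr.contiguous()` produces one.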


_torch.squeeze()_ removes all dimensions of size 1 from a tensor, reducing its rank.

In [15]:
t = torch.ones(1, 1, 2)
print(f"unsqueezed: {t}")
t_sqz = t.squeeze()  # removes all size-1 dimensions
print(f"squeezed: {t_sqz}")


unsqueezed: tensor([[[1., 1.]]])
squeezed: tensor([1., 1.])
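
Squeezing can also target a single dimension, and `unsqueeze` does the reverse by inserting a size-1 dimension. A brief sketch with illustrative shapes:

```python
import torch

t = torch.ones(1, 3, 1)

print(t.squeeze().shape)   # torch.Size([3])
print(t.squeeze(0).shape)  # torch.Size([3, 1])
print(t.squeeze(1).shape)  # torch.Size([1, 3, 1]); dim 1 has size 3, so nothing changes

v = torch.ones(3)
print(v.unsqueeze(0).shape)  # torch.Size([1, 3])
print(v.unsqueeze(1).shape)  # torch.Size([3, 1])
```

Passing an explicit dimension is the safer habit when a batch dimension might happen to have size 1.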


Tensors support common linear algebra and statistical operations, including
    * element-wise and vector/matrix multiplication
    * statistical summaries (mean, standard deviation, etc.) across axes

In [26]:
# generate some data
torch.manual_seed(1985)
x = 2.0 * torch.rand(5, 2) - 1.0
y = torch.normal(mean=0.5, std=0.1, size=(5, 2))

# element-wise multiplication
mul = x * y
print(f"mul: {mul.shape}")

# matrix multiplication
inner = torch.mm(x.transpose(0, 1), y)  # (2, 5) @ (5, 2) -> (2, 2)
outer = torch.mm(x, y.transpose(0, 1))  # (5, 2) @ (2, 5) -> (5, 5)
print(f"inner: {inner.shape}, outer: {outer.shape}")

# row means and standard deviation
row_means = x.mean(dim=1)
row_stds = x.std(dim=1)
print(f"row_means: {row_means}\nrow_stds: {row_stds}")

mul: torch.Size([5, 2])
inner: torch.Size([2, 2]), outer: torch.Size([5, 5])
row_means: tensor([-0.2318, -0.8095,  0.0320, -0.2714, -0.3531])
row_stds: tensor([0.0538, 0.0972, 0.2037, 0.8031, 0.5477])
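
For 2-D tensors, matrix multiplication can be written in several equivalent ways. The sketch below, on made-up random data, compares `torch.mm`, the `@` operator, and `torch.matmul`.

```python
import torch

torch.manual_seed(0)
x = torch.rand(5, 2)  # illustrative data
y = torch.rand(5, 2)

a = torch.mm(x.transpose(0, 1), y)  # explicit 2-D matrix multiply
b = x.T @ y                         # operator form
c = torch.matmul(x.T, y)            # general form; also handles batched inputs

print(a.shape)
print(torch.allclose(a, b), torch.allclose(a, c))
```

`torch.mm` accepts only 2-D inputs, so `@` or `torch.matmul` is usually the more flexible choice once batch dimensions are involved.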


We can also split and combine tensors using:
* Chunking - splitting the tensor into a given number of pieces, equal in size when the dimension divides evenly (otherwise the last piece is smaller)
* Splitting - splitting the tensor into pieces of specified sizes (a list of sizes must sum _exactly_ to the length of the dimension)
* Concatenating - combining tensors along an existing dimension
* Stacking - combining tensors of identical shape along a new dimension
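
The size rules for chunking and splitting are easiest to see on a tensor whose length does not divide evenly. A minimal sketch on an illustrative length-5 tensor:

```python
import torch

t = torch.arange(5)

# chunk: 5 elements into 2 chunks gives sizes 3 and 2
chunks = torch.chunk(t, 2)
print([c.tolist() for c in chunks])  # [[0, 1, 2], [3, 4]]

# split with an int: pieces of size 2, the last one smaller
by_size = torch.split(t, 2)
print([p.tolist() for p in by_size])  # [[0, 1], [2, 3], [4]]

# split with a list: the sizes must sum to 5
by_list = torch.split(t, [4, 1])
print([p.tolist() for p in by_list])  # [[0, 1, 2, 3], [4]]

# a list that sums to 4 instead of 5 is rejected
try:
    torch.split(t, [2, 2])
    failed = False
except RuntimeError:
    failed = True
print(f"mismatched sizes rejected: {failed}")
```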


In [41]:
# generate the data
torch.manual_seed(1985)
t = torch.rand(4, 2)
print(f"t: {t}")

# chunking
print("--- CHUNKING --- ")
t_splits = torch.chunk(t, 2, dim=0)
for item in t_splits:
    print(item.numpy())

# splitting
print("\n--- SPLITTING --- ")
t_splits = torch.split(t, split_size_or_sections=[2, 2], dim=0)
for item in t_splits:
    print(item.numpy())


t: tensor([[0.3651, 0.4031],
        [0.1296, 0.0609],
        [0.5880, 0.4440],
        [0.6482, 0.0804]])
--- CHUNKING --- 
[[0.36506873 0.40311843]
 [0.12958086 0.06087303]]
[[0.5880055  0.44396877]
 [0.64821994 0.08037758]]

--- SPLITTING --- 
[[0.36506873 0.40311843]
 [0.12958086 0.06087303]]
[[0.5880055  0.44396877]
 [0.64821994 0.08037758]]

In [48]:
# concatenation
x = torch.ones(3)
y = torch.ones(3)
z = torch.cat([x, y], dim=0)
print(f"Concatenation: {z}")

# stacking
z = torch.stack([x, y], dim=0)
print(f"Stacking: {z}")


Concatenation: tensor([1., 1., 1., 1., 1., 1.])
Stacking: tensor([[1., 1., 1.],
        [1., 1., 1.]])


### Preparing Training Dataset

There are a few popular ways to build datasets. We'll focus on three classes from _torch.utils.data_:

* _Dataset_ stores samples and their labels.
* _TensorDataset_ is a _Dataset_ built directly from torch tensors (typical for tabular ML).
* _DataLoader_ wraps a _Dataset_ in an iterable that loads data in batches and can shuffle the samples.
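
The three classes above can be sketched together as follows. The class name `PairDataset` and the synthetic `x` and `y` tensors are hypothetical, chosen only for illustration; a custom _Dataset_ needs `__len__` and `__getitem__`, while _TensorDataset_ gives the same behavior for in-memory tensors without a custom class.

```python
import torch
from torch.utils.data import Dataset, TensorDataset, DataLoader


class PairDataset(Dataset):
    """Hypothetical minimal Dataset wrapping feature and label tensors."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __len__(self):
        return len(self.x)

    def __getitem__(self, idx):
        return self.x[idx], self.y[idx]


torch.manual_seed(1985)
x = torch.rand(10, 3)  # 10 samples, 3 features (synthetic)
y = torch.arange(10)   # one label per sample (synthetic)

custom_ds = PairDataset(x, y)
tensor_ds = TensorDataset(x, y)  # same indexing behavior, no custom class

# batch_size=4 over 10 samples yields batches of 4, 4, and 2
train_loader = DataLoader(tensor_ds, batch_size=4, shuffle=True)
for x_batch, y_batch in train_loader:
    print(x_batch.shape, y_batch.shape)
```

With `shuffle=True` the sample order changes each epoch; passing `drop_last=True` would discard the final short batch.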


### Computational Graphs and Autograd

### Neural Network Modules

### PyTorch Training Loops


In [49]:
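# Assumes `model`, `loss_fn`, `optimizer`, and `train_loader` are already defined.
n_epochs = 1
for epoch in range(n_epochs):
    epoch_loss = 0.0
    for x_batch, y_batch in train_loader:
        # 1. generate predictions
        pred = model(x_batch)

        # 2. calculate the loss
        loss = loss_fn(pred, y_batch)

        # 3. backpropagate the loss
        loss.backward()

        # 4. update the weights
        optimizer.step()

        # 5. reset the gradients to zero
        optimizer.zero_grad()

        # 6. log the updated performance
        epoch_loss += loss.item() * x_batch.size(0)

    # 7. accumulate the performance metrics for the epoch
    epoch_loss /= len(train_loader.dataset)
    print(f"epoch {epoch}: loss = {epoch_loss:.4f}")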
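
To see the seven steps run end to end, here is a minimal self-contained sketch. Everything in it is an assumption made for illustration: the synthetic regression data, the single `nn.Linear` layer used as the model, mean squared error as the loss, and plain SGD with a learning rate of 0.1.

```python
import torch
from torch import nn
from torch.utils.data import TensorDataset, DataLoader

torch.manual_seed(1985)

# synthetic regression data: y = 3x + 1 plus a little noise (made up for illustration)
x = torch.rand(64, 1)
y = 3.0 * x + 1.0 + 0.01 * torch.randn(64, 1)
train_loader = DataLoader(TensorDataset(x, y), batch_size=16, shuffle=True)

model = nn.Linear(1, 1)                                  # assumed toy model
loss_fn = nn.MSELoss()                                   # assumed loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # assumed optimizer

history = []
for epoch in range(50):
    epoch_loss = 0.0
    for x_batch, y_batch in train_loader:
        pred = model(x_batch)                        # 1. predictions
        loss = loss_fn(pred, y_batch)                # 2. loss
        loss.backward()                              # 3. backpropagation
        optimizer.step()                             # 4. weight update
        optimizer.zero_grad()                        # 5. reset gradients
        epoch_loss += loss.item() * x_batch.size(0)  # 6. running total
    history.append(epoch_loss / len(train_loader.dataset))  # 7. epoch metric

print(f"first epoch loss: {history[0]:.4f}, last epoch loss: {history[-1]:.4f}")
```

Weighting each batch loss by the batch size before dividing by the dataset size gives a per-sample average that stays correct even when the last batch is smaller.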