<a href="https://colab.research.google.com/github/sdr0598/ml-drug-discovery/blob/main/day2_batch_operations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Week 1: Day 2 - Batch Operations
## Goal
- Extend the existing Day 1 notebook (which operates on a single molecule or a single value) to work on a batch of 50 molecules represented as a 2D tensor.

## Key Concept
A tensor is a matrix only when it has two dimensions. A 1D tensor has a single axis, so it has no notion of rows or columns.
So shape (50,) is a 1D tensor with 50 values, while shape (50, 1) is a 2D tensor with 50 rows and 1 column.
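
To make the distinction concrete, here is a minimal sketch that builds both shapes and converts between them. The variable names (`flat`, `column`, and so on) are illustrative and not part of the notebook.

```python
import torch

flat = torch.zeros(50)        # shape (50,): one axis holding 50 values
column = torch.zeros(50, 1)   # shape (50, 1): two axes, 50 rows and 1 column

print(flat.shape, flat.ndim)
print(column.shape, column.ndim)

# unsqueeze(1) inserts a size-1 axis at position 1: (50,) -> (50, 1)
as_column = flat.unsqueeze(1)
# squeeze(1) removes the size-1 axis at position 1: (50, 1) -> (50,)
back_to_flat = column.squeeze(1)

print(as_column.shape)
print(back_to_flat.shape)
```

Both tensors hold the same 50 numbers. Only the number of axes differs, and that is what later decides how operations such as matrix multiplication and broadcasting treat them.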

In [1]:
import torch
print(torch.__version__)
print("GPU available", torch.cuda.is_available())

2.10.0+cpu
GPU available False


In [4]:
# Molecule Matrix
# 50 molecules, 5 descriptors each (e.g., Lipinski-style descriptors); random values stand in for real data
torch.manual_seed(42) # set seed for reproducibility
molecules = torch.randn(50,5) # 50 rows and 5 columns
print(molecules.shape)

torch.Size([50, 5])


In [8]:
# Indexing practice
# Grab 1st molecule
print(molecules[0])
# Grab 2nd descriptor of 1st molecule (a single [row, column] index avoids an intermediate tensor)
print(molecules[0, 1])
# Grab 2nd descriptor of all molecules
print(molecules[:, 1])
print(molecules[:, 1].shape)
# Grab last descriptor of the first 5 molecules
print(molecules[:5,-1])

tensor([ 1.9269,  1.4873,  0.9007, -2.1055,  0.6784])
tensor(1.4873)
tensor([ 1.4873, -0.0431, -1.4036,  1.6423,  1.0783,  0.6105,  0.8599,  0.3189,
         0.9956, -1.2347,  0.5258, -1.4032,  1.8446,  2.2181,  1.2780,  0.5750,
        -0.3387,  1.1412,  0.6039, -2.5095,  0.5832,  0.4048, -0.3036,  0.7262,
         0.2408,  0.7200, -1.1299,  1.1415,  1.0331,  1.6192,  0.1043, -2.4801,
         0.2286,  0.1748,  0.0523,  0.0358, -1.5910, -1.9267,  0.7099, -1.1753,
         2.1120,  1.3065, -0.5923, -0.3980, -0.5003,  0.1539,  1.9892,  0.6427,
         0.5636, -0.7129])
torch.Size([50])
tensor([ 0.6784,  1.6487, -0.7688,  0.4396,  1.2791])
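
The indexing above ties back to the key concept: selecting a column with an integer index drops that axis, while selecting it with a slice keeps it. The sketch below shows the difference on a small hypothetical tensor (`batch`, 4 molecules by 5 descriptors) rather than on the notebook's `molecules`.

```python
import torch

# Hypothetical stand-in for a descriptor matrix: 4 molecules x 5 descriptors
batch = torch.arange(20.0).reshape(4, 5)

by_index = batch[:, 1]     # integer index drops the column axis -> shape (4,)
by_slice = batch[:, 1:2]   # length-1 slice keeps the column axis -> shape (4, 1)

print(by_index.shape)
print(by_slice.shape)

# Same values either way; only the number of axes differs
print(torch.equal(by_index, by_slice.squeeze(1)))
```

Use the slice form when downstream code expects a 2D tensor, for example a single-feature input to a layer that wants `(batch, features)`.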


In [16]:
# Simple batch operation
torch.manual_seed(42)
weights = torch.randn(5)
print(weights)
predictions = molecules @ weights
# matrix-vector multiply: (50, 5) @ (5,) -> (50,), a 1D tensor
# Key distinction: (50,) vs (50, 1)
# The former is a 1D tensor while the latter is a 2D tensor
print(predictions.shape)

tensor([ 0.3367,  0.1288,  0.2345,  0.2303, -1.1229])
torch.Size([50])
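
As a rough illustration of why the (50,) vs (50, 1) distinction matters, the sketch below multiplies the same data by a 1D and a 2D weight tensor, then shows what happens when the two results are mixed. It uses its own tensors (`x`, `w`) and an arbitrary seed, so the values will not match the cell above.

```python
import torch

torch.manual_seed(0)       # arbitrary seed for this sketch, not the notebook's
x = torch.randn(50, 5)     # stand-in for the molecule matrix
w = torch.randn(5)         # stand-in for the weight vector

flat_pred = x @ w                # (50, 5) @ (5,)   -> (50,)
col_pred = x @ w.unsqueeze(1)    # (50, 5) @ (5, 1) -> (50, 1)
print(flat_pred.shape, col_pred.shape)

# Both hold the same predictions, up to floating-point rounding
same_values = torch.allclose(flat_pred, col_pred.squeeze(1), atol=1e-5)
print(same_values)

# Pitfall: combining the two shapes broadcasts to (50, 50) instead of failing
diff = flat_pred - col_pred
print(diff.shape)
```

The last line is the usual source of silent bugs: subtracting a (50, 1) target from a (50,) prediction produces a 50 by 50 tensor of all pairwise differences, so a loss computed from it is wrong without raising an error. Checking `.shape` before combining predictions with targets is a cheap safeguard.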
