In [5]:
import torch

PyTorch tensors - 
1. Tensor - similar to an array or matrix; tensors are the building blocks of neural networks

In [6]:
my_list = [[1,2,3],[4,5,6]]

In [7]:
tensor = torch.tensor(my_list)

In [8]:
tensor

tensor([[1, 2, 3],
        [4, 5, 6]])

In [9]:
tensor.shape

torch.Size([2, 3])

In [10]:
tensor.dtype

torch.int64

compatible tensors - two tensors are compatible for element-wise operations (such as addition or multiplication) when their shapes align
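
As a small illustration of shape compatibility, the sketch below adds and multiplies two tensors of the same shape, then tries a pair whose shapes do not line up. The tensors `a`, `b` and `c` are made-up values for this example, not part of the original notebook.

```python
import torch

# Two tensors with the same shape (2, 3) - illustrative values
a = torch.tensor([[1, 2, 3], [4, 5, 6]])
b = torch.tensor([[10, 20, 30], [40, 50, 60]])

# Element-wise operations work because the shapes align
total = a + b
product = a * b
print(total)
print(product)

# A (2, 3) tensor and a (2, 2) tensor do not align, so PyTorch raises a RuntimeError
c = torch.tensor([[1, 2], [3, 4]])
try:
    a + c
except RuntimeError as err:
    print("incompatible shapes:", err)
```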

### Building a Neural Network using PyTorch

input layer - hidden layer - output layer

### 1. First Neural Network
- this network does not have a hidden layer
- the output layer is a linear layer
- every output neuron connects to every input neuron - **fully connected network**
- this is equivalent to a linear model - helps us understand the basics without adding complexity

In [11]:
import torch.nn as nn 

when designing a neural network, the input and output layer dimensions are pre-defined. 
- input neurons = features
- output neurons = classes (we want to predict)

In [12]:
# create input_tensor with three features - our input layer
input_tensor = torch.tensor([[0.3471,0.4547,-0.2356]])

In [13]:
# define our linear layer
linear_layer = nn.Linear(
    in_features =3,
    out_features=2
)


In [14]:
# pass input through linear layer
output = linear_layer(input_tensor)
print(output)

tensor([[-0.5322,  0.1088]], grad_fn=<AddmmBackward0>)


when input_tensor is passed to linear_layer, the layer applies a linear function that combines the input with its weights and biases: output = input @ weight.T + bias. each linear layer has its own set of weights and biases - these are the key quantities that define a neuron
- **weight** : reflects the importance of different features
- **bias** : provides the neuron with a baseline output
    - the bias is independent of the weights
- at first, the linear layer assigns random weights and biases; these are tuned later during training

In [15]:
print(linear_layer.weight)

Parameter containing:
tensor([[-0.3334, -0.2678, -0.4806],
        [ 0.0719, -0.2007, -0.2185]], requires_grad=True)


In [16]:
print(linear_layer.bias)

Parameter containing:
tensor([-0.4080,  0.1236], requires_grad=True)
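
To make the linear operation concrete, the sketch below recomputes the layer output by hand from the weight and bias values printed above. The values are copied from the printed (rounded) output of this particular run; because the layer is initialised randomly, a fresh run will show different numbers. The names `x`, `W`, `b` and `manual_output` are only used for this illustration.

```python
import torch

# Input and parameters copied from the cells above (rounded to 4 decimals)
x = torch.tensor([[0.3471, 0.4547, -0.2356]])
W = torch.tensor([[-0.3334, -0.2678, -0.4806],
                  [ 0.0719, -0.2007, -0.2185]])
b = torch.tensor([-0.4080, 0.1236])

# A linear layer computes: output = input @ weight.T + bias
manual_output = x @ W.T + b
print(manual_output)

# Matches the layer output printed earlier, up to rounding of the copied values
print(torch.allclose(manual_output, torch.tensor([[-0.5322, 0.1088]]), atol=1e-3))
```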


**example** - let's say we have a weather dataset with three features - temperature, humidity and wind - and we want to predict whether it's going to rain or be cloudy
1. the humidity feature will likely have a larger weight than the other features, as it is a strong predictor of whether it's going to rain or not
2. the data is for a tropical region with a high probability of rain, so a **bias** is added to account for this baseline information.
- with this information our model makes a prediction

### 2. Hidden Layers and Parameters

- here we will add more layers to help the network learn complex patterns
- stack three linear layers using nn.Sequential
- nn.Sequential is a PyTorch container for stacking layers in sequence
- it takes the input - passes it through each linear layer in order - returns the output
- the layers between the input and the final output layer are hidden layers

In [None]:
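# create a network with three linear layers
n_features = 3  # number of input features (example value, assumed here)
n_classes = 2   # number of output classes (example value, assumed here)
model = nn.Sequential(
    nn.Linear(n_features, 8),  # input layer -> 8 hidden neurons
    nn.Linear(8, 4),           # 8 -> 4 hidden neurons
    nn.Linear(4, n_classes)    # 4 -> one output per class
)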

- we can keep adding as many layers as we want, as long as the input dimension of each layer matches the output dimension of the previous one

In [None]:
# adding more layers - three linear layers with larger dimensions
model = nn.Sequential(
    nn.Linear(10, 18),  # takes 10 input features and outputs 18
    nn.Linear(18, 20),  # takes 18 inputs and outputs 20
    nn.Linear(20, 5)    # takes 20 inputs and outputs 5
)
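
The dimension-matching rule can be checked by passing data through the stack. The sketch below feeds a random batch (a made-up input of 4 samples with 10 features) through the same three-layer model, then builds a deliberately mismatched model, called `broken` here for illustration, to show the error PyTorch raises.

```python
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(10, 18),
    nn.Linear(18, 20),
    nn.Linear(20, 5),
)

# Hypothetical batch: 4 samples, 10 features each
batch = torch.rand(4, 10)
output = model(batch)
print(output.shape)  # torch.Size([4, 5])

# Mismatch: the first layer outputs 18 values, but the second expects 20
broken = nn.Sequential(nn.Linear(10, 18), nn.Linear(20, 5))
try:
    broken(batch)
    mismatch_detected = False
except RuntimeError as err:
    mismatch_detected = True
    print("shape mismatch:", err)
```

Note that nn.Sequential does not validate the dimensions when the model is built; the mismatch only surfaces once data flows through it.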

1. **layers are made of neurons**
- a layer is fully connected when each neuron links to all neurons in the previous layer
- a neuron in a linear layer :
    - performs a linear operation using all neurons from the previous layer
    - has n+1 parameters - n from inputs and 1 from the bias
2. **Parameters and model capacity**
- more hidden layers = more parameters = higher model capacity (can handle more complex datasets but may take longer to train)
- an effective way to assess a model's capacity is to calculate its total number of parameters
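
The "n+1 parameters per neuron" rule can be read directly off a layer's shapes. The sketch below reuses the 3-input, 2-output layer size from the first network; the variable names are only for this illustration.

```python
import torch.nn as nn

layer = nn.Linear(in_features=3, out_features=2)

# One row of weights per output neuron, one column per input
print(layer.weight.shape)  # torch.Size([2, 3])
# One bias value per output neuron
print(layer.bias.shape)    # torch.Size([2])

# Each neuron has n weights + 1 bias
params_per_neuron = layer.in_features + 1
layer_params = layer.out_features * params_per_neuron
print(params_per_neuron, layer_params)  # 4 8
```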

In [None]:
# a two-layer network
model = nn.Sequential(nn.Linear(8,4),
                      nn.Linear(4,2))

**manual parameter calculation:**
- first layer has 4 neurons, each neuron has 8+1 (8 weights and 1 bias) parameters. 9 times 4 = 36 parameters
- second layer has 2 neurons, each neuron has 4+1 parameters. 5 times 2 = 10 parameters
- in total this model has - 36 + 10 = 46 learnable parameters

we can reproduce this manual calculation in Python using the .numel() method
- .numel() : returns the number of elements in the tensor

In [None]:
total = 0
for parameter in model.parameters():
    total += parameter.numel()  # add the element count of each weight/bias tensor
print(total)
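
As a cross-check, the sketch below prints a per-tensor breakdown for the same two-layer model and compares the .numel() total with the manual (inputs + 1) x neurons calculation. `named_parameters()` is used here only to label each tensor; the `expected` variable is introduced for this comparison.

```python
import torch.nn as nn

model = nn.Sequential(nn.Linear(8, 4), nn.Linear(4, 2))

# Per-tensor breakdown: name, shape, and element count
for name, parameter in model.named_parameters():
    print(name, tuple(parameter.shape), parameter.numel())

# Total via .numel(), written as a single expression
total = sum(p.numel() for p in model.parameters())

# Manual calculation: (inputs + 1 bias) * neurons, for each layer
expected = (8 + 1) * 4 + (4 + 1) * 2

print(total, expected)  # 46 46
```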

understanding parameter count helps us understand model complexity and efficiency
- too many parameters can lead to long training times or overfitting
- too few parameters might limit learning capacity