[![Fixel Algorithms](https://fixelalgorithms.co/images/CCExt.png)](https://fixelalgorithms.gitlab.io)

# Deep Learning Methods

## Introduction to Deep Learning - PyTorch Basics

> Notebook by:
> - Royi Avital RoyiAvital@fixelalgorithms.com

## Revision History

| Version | Date       | User        |Content / Changes                                                   |
|---------|------------|-------------|--------------------------------------------------------------------|
| 1.0.004 | 19/09/2025 | Royi Avital | Updated link of Google Colab                                       |
| 1.0.003 | 17/09/2025 | Royi Avital | Added example to iterate over parameters                           |
| 1.0.002 | 08/05/2025 | Royi Avital | Added a sketch of High Dimensional Tensors                         |
| 1.0.001 | 26/04/2025 | Royi Avital | Updated code to match PyTorch 2.5 and 2.6                          |
| 1.0.000 | 25/04/2024 | Royi Avital | First version                                                      |

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FixelAlgorithmsTeam/FixelCourses/blob/master/DeepLearningMethods/2025_08/0001DeepLearningPyTorchBasics.ipynb)

In [None]:
# Import Packages

# General Tools
import numpy as np
import scipy as sp
import pandas as pd

# Machine Learning

# Deep Learning
import torch
import torch.nn            as nn
import torch.nn.functional as F
import torchinfo

# Miscellaneous
from platform import python_version
import random
import time

# Typing
from typing import Callable, Dict, List, Optional, Self, Set, Tuple, Union

# Visualization
import matplotlib.pyplot as plt

# Jupyter
from IPython import get_ipython

## Notations

* <font color='red'>(**?**)</font> Question to answer interactively.
* <font color='blue'>(**!**)</font> Simple task to add code for the notebook.
* <font color='green'>(**@**)</font> Optional / Extra self practice.
* <font color='brown'>(**#**)</font> Note / Useful resource / Food for thought.

Code Notations:

```python
someVar    = 2; #<! Notation for a variable
vVector    = np.random.rand(4) #<! Notation for 1D array
mMatrix    = np.random.rand(4, 3) #<! Notation for 2D array
tTensor    = np.random.rand(4, 3, 2, 3) #<! Notation for nD array (Tensor)
tuTuple    = (1, 2, 3) #<! Notation for a tuple
lList      = [1, 2, 3] #<! Notation for a list
dDict      = {1: 3, 2: 2, 3: 1} #<! Notation for a dictionary
oObj       = MyClass() #<! Notation for an object
dfData     = pd.DataFrame() #<! Notation for a data frame
dsData     = pd.Series() #<! Notation for a series
hObj       = plt.Axes() #<! Notation for an object / handler / function handler
```

### Code Exercise

 - Single line fill

```python
valToFill = ???
```

 - Multi Line to Fill (At least one)

```python
# You need to start writing
?????
```

 - Section to Fill

```python
#===========================Fill This===========================#
# 1. Explanation about what to do.
# !! Remarks to follow / take under consideration.
mX = ???

?????
#===============================================================#
```

In [None]:
# Configuration
# %matplotlib inline

seedNum = 512
np.random.seed(seedNum)
random.seed(seedNum)

# Matplotlib default color palette
lMatPltLibclr = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
# sns.set_theme() #<! Apply SeaBorn theme

runInGoogleColab = 'google.colab' in str(get_ipython())

In [None]:
# Constants

FIG_SIZE_DEF    = (8, 8)
ELM_SIZE_DEF    = 50
CLASS_COLOR     = ('b', 'r')
EDGE_COLOR      = 'k'
MARKER_SIZE_DEF = 10
LINE_WIDTH_DEF  = 2

In [None]:
# Courses Packages


In [None]:
# General Auxiliary Functions


## PyTorch

Nowadays, _PyTorch_ is considered to be the _Go To_ Deep Learning framework.

![Papers with Code: PyTorch vs. TensorFlow](https://i.imgur.com/BybdtbK.png)
![Code Repositories: PyTorch vs. TensorFlow](https://i.imgur.com/z9N8Ywc.png)

Source [AssemblyAI - PyTorch vs TensorFlow in 2023](https://www.assemblyai.com/blog/pytorch-vs-tensorflow-in-2023/).

A modern DL framework is composed of the following components:

 - **Data Structure**  
   The container of the multidimensional arrays.
 - **Layers**  
   Set of Mathematical operations on data.  
   Built by atoms: Dense, Convolution, Attention, Activations, etc...
 - **Loss Functions**  
   Different objectives for various applications.  
   Often used in the "Heads" of the net.
 - **Automatic Differentiation Engine**  
   Being able to calculate the Gradients of the computational graph of the net.
 - **Optimizers & Schedulers**  
   Applying update rules on the weights and step size.
 - **Data Loaders**  
   Loading data from storage, unpacking, caching, augmentation phase.
 - **Dashboard** (Optional)  
   A tool to analyze multiple experiments with nets during run time and after.
 - **Model Zoo** (Optional)  
   A set of pre defined architectures and pre trained weights.

PyTorch's _claim to fame_ is being a natural extension of _Python_ with its _dynamic_ (Eager) mode of operation.
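
A minimal sketch of what the eager mode means in practice: plain Python control flow may depend on tensor values, and the graph is recorded as the code runs. The function `DynamicFun` and its values are made up for illustration.

```python
import torch

def DynamicFun( vX: torch.Tensor ) -> torch.Tensor:
    # Plain Python control flow: the branch is chosen at run time
    if vX.sum() > 0:
        return torch.sum(vX ** 2)
    return torch.sum(torch.abs(vX))

vX   = torch.tensor([1.0, -0.5, 2.0], requires_grad = True)
valY = DynamicFun(vX) #<! The sum is positive, hence the quadratic branch
valY.backward()

print(valY)    #<! 1 + 0.25 + 4
print(vX.grad) #<! 2 * vX
```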

* <font color='brown'>(**#**)</font> Any modern DL framework must support various accelerators: GPU's, TPU's, NPU's, etc...  
  The most common accelerator is based on NVIDIA GPU (See [PyTorch `cuda` Backend](https://docs.pytorch.org/docs/stable/cuda.html)).  
  PyTorch also supports Intel GPU's (See [PyTorch `xpu` Backend](https://docs.pytorch.org/docs/stable/xpu.html)) and Apple Silicon GPU (See [PyTorch `mps` Backend](https://docs.pytorch.org/docs/stable/mps.html)).
* <font color='brown'>(**#**)</font> PyTorch has been backed by _Facebook_ (Now _Meta_) since its start.
* <font color='brown'>(**#**)</font> PyTorch originated from _Torch_, which was a DL framework for the [Lua Programming Language](https://en.wikipedia.org/wiki/Lua).
* <font color='brown'>(**#**)</font> [PyTorch official tutorials](https://pytorch.org/tutorials).
* <font color='brown'>(**#**)</font> [PyTorch User Forum](https://discuss.pytorch.org).

In [None]:
# Import PyTorch

import torch

# Torch Version
print(torch.__version__)

# Check for CUDA based GPU
print(f'CUDA is available to PyTorch: {torch.cuda.is_available()}')

## Data Structure

PyTorch's native data structure is the Tensor.

![PyTorch Tensor](https://i.imgur.com/xnjH0rU.jpeg)

From [What Do You Mean by Tensor](https://www.i2tutorials.com/what-do-you-mean-by-tensor-and-explain-about-tensor-datatype-and-ranks).

* <font color='brown'>(**#**)</font> [PyTorch Tensor Tutorial](https://pytorch.org/tutorials/beginner/basics/tensorqs_tutorial.html).
* <font color='brown'>(**#**)</font> The PyTorch's `tensor` is similar to NumPy's `ndarray`.
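
The similarity to NumPy goes as far as sharing memory. The sketch below (with made up values) shows that `torch.from_numpy()` wraps the existing buffer while `torch.tensor()` copies it.

```python
import numpy as np
import torch

vA = np.array([1.0, 2.0, 3.0])

vShared = torch.from_numpy(vA) #<! Shares memory with `vA`
vCopy   = torch.tensor(vA)     #<! Copies the data

vA[0] = 10.0 #<! In place change of the NumPy array

print(vShared)       #<! Sees the change
print(vCopy)         #<! Keeps the original values
print(vShared.dtype) #<! NumPy's `float64` is kept
```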

In [None]:
# PyTorch Vector

vX = torch.tensor([0.5, -7.5, 3.25])
vX #<! With the default `dtype`

In [None]:
# PyTorch Data Initializers

mX = torch.ones(2, 3)
vX = torch.linspace(1, 3, 15)
print(mX)
print(vX)

In [None]:
# PyTorch Default Type

print(mX.type())

* <font color='brown'>(**#**)</font> The default of _Torch_ is `Float32` as opposed to _NumPy_ which is `Float64`.
* <font color='brown'>(**#**)</font> Nowadays it is common to use `Float16` (Actually [`BFloat16`](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format)) and even `Float8`.
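
A short sketch of the defaults and of casting between types, assuming the default `dtype` of PyTorch was not changed. The variable names are for illustration only.

```python
import numpy as np
import torch

vT = torch.tensor([1.0, 2.0, 3.0]) #<! PyTorch default: `float32`
vN = np.array([1.0, 2.0, 3.0])     #<! NumPy default: `float64`

vT16 = vT.to(torch.bfloat16) #<! Cast to a lower precision type
vT64 = vT.double()           #<! Cast to `float64`

print(vT.dtype, vN.dtype, vT16.dtype, vT64.dtype)
print(vT.element_size(), vT16.element_size(), vT64.element_size()) #<! Bytes per element
```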

In [None]:
# Imported Types
vX1 = torch.tensor([1, 2, 5, 6.])
vX2 = torch.tensor([1, 2, 5, 6])
print(vX1.type())
print(vX2.type())

* <font color='brown'>(**#**)</font> You may read on PyTorch's types: [`torch.Tensor`](https://pytorch.org/docs/stable/tensors.html).

In [None]:
# Attributes of the Tensor
mX = torch.rand((2, 3))
print(f'mX Shape: {mX.shape}')
print(f'mX Size: {mX.size()}') #<! See https://github.com/pytorch/pytorch/issues/5544
print(f'mX Size at 1st Dimension: {mX.size(0)}')
print(f'mX NumPy Size: {mX.numpy().size}') #<! Convert to Numpy
print(f'mX Number of Elements: {mX.numel()}')

In [None]:
# From NumPy
mX = torch.from_numpy(np.random.randn(10, 2, 3)) #<! Shares memory with the NumPy array (Keeps `float64`)
mX = torch.tensor(np.random.randn(10, 2, 3)) #<! Copies the data (Keeps `float64`)

The tensor data structure tools for _high dimensional data_:

![](https://i.imgur.com/TcrVkkn.png)
<!-- ![](https://i.postimg.cc/sgmkscFc/image.png) -->

### Device

The _Tensor_ can be generated / transferred into an accelerator.  
In our case, a _CUDA Device_ (NVIDIA GPU).

* <font color='brown'>(**#**)</font> See the [`torch.cuda` Module](https://pytorch.org/docs/stable/cuda.html).
* <font color='brown'>(**#**)</font> See [Check if PyTorch Uses the GPU](https://stackoverflow.com/questions/48152674), [List Available GPU's in PyTorch](https://stackoverflow.com/questions/64776822).
* <font color='brown'>(**#**)</font> PyTorch has support for [`MPS Backend`](https://pytorch.org/docs/stable/notes/mps.html) which is the Apple Silicon GPU.  
  It is heavily invested in, yet still not on par with `CUDA`. Its main advantages are lower overhead and larger memory.  
  See [`torch.mps` Module](https://pytorch.org/docs/stable/mps.html).
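
A minimal sketch of choosing a device with a fallback. `SelectDevice()` is a hypothetical helper (not part of PyTorch) and the preference order CUDA, MPS, CPU is an assumption.

```python
import torch

def SelectDevice() -> torch.device:
    # Hypothetical helper: prefers CUDA, then MPS, then falls back to the CPU
    if torch.cuda.is_available():
        return torch.device('cuda')
    if torch.backends.mps.is_available():
        return torch.device('mps')
    return torch.device('cpu')

torchDevice = SelectDevice()
mX = torch.randn(4, 3, device = torchDevice) #<! Generated directly on the device
mY = mX.to(torchDevice) #<! Already on the device, no copy is made

print(f'The chosen device: {torchDevice}')
print(f'Same object: {mY is mX}')
```

Generating the data directly on the device avoids the extra host to device transfer.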

In [None]:
# Set the Device
TORCH_DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu') #<! You may use `cuda:0` for the first device
print(f'The chosen device: {TORCH_DEVICE}')

# MPS:
# TORCH_DEVICE = 'cuda' if torch.cuda.is_available() else ('mps' if torch.backends.mps.is_available() else 'cpu')

In [None]:
# Move Data to Device

mX = torch.randn(10, 2)
print(f'The data device: {mX.device}')
mX = mX.to(TORCH_DEVICE) #<! Creates a copy!
print(f'The data device: {mX.device}')
mX = mX.cpu()
print(f'The data device: {mX.device}')

In [None]:
# Convert to NumPy from GPU
# One must make sure the data is on the CPU before converting into NumPy

mX = torch.randn(10, 2, device = TORCH_DEVICE) #<! Generated on GPU
print(f'The data device: {mX.device}')
# mX = mX.numpy() #<! This will raise an error if `mX` is on GPU
mX = mX.cpu().numpy() #<! Chaining
print(f'The data type: {type(mX)}')

In [None]:
# Run Time - CPU

numRows = 10_000
numCols = numRows

mX1 = torch.randn(numRows, numCols)
mX2 = torch.randn(numRows, numCols)

startTime = time.time()
mX3       = mX1 @ mX2
mX3[0, 0] = 1.0
endTime   = time.time()

print(f'CPU time: {endTime - startTime}')

In [None]:
# Run Time - GPU
mX1 = torch.randn(numRows, numCols, device = TORCH_DEVICE)
mX2 = torch.randn(numRows, numCols, device = TORCH_DEVICE)
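mX3 = mX1 @ mX2 #<! Warm up
torch.cuda.synchronize() #<! Make sure the warm up is done before the timing starts

startTime = time.time()
for _ in range(10):
    mX3 = mX1 @ mX2
torch.cuda.synchronize() #<! CUDA runs asynchronously, wait for the kernels to actually measure
endTime = time.time()
# mX3 = mX3.cpu()

print(f'GPU time (10 iterations): {endTime - startTime}')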

In [None]:
# Run Time - GPU
# More Accurate Method: [How to Measure Run Time in PyTorch](https://discuss.pytorch.org/t/26964)
startEvent = torch.cuda.Event(enable_timing = True)
endEvent   = torch.cuda.Event(enable_timing = True)

mX1 = torch.randn(numRows, numCols, device = TORCH_DEVICE)
mX2 = torch.randn(numRows, numCols, device = TORCH_DEVICE)
mX3 = mX1 @ mX2

startEvent.record()
for _ in range(10):
    mX3 = mX1 @ mX2
endEvent.record()

# Waits for everything to finish running
torch.cuda.synchronize()

print(f'Run Time: {(startEvent.elapsed_time(endEvent) / 1000.0): 0.3f} [Second]') #<! Like `tic()` and `toc()` in MATLAB

### NumPy & SciPy Functionality

PyTorch can be used as a general Linear Algebra + Scientific Computing library accelerated with GPU or other accelerators.

* <font color='brown'>(**#**)</font> See [`torch.fft`](https://pytorch.org/docs/stable/fft.html), [`torch.linalg`](https://pytorch.org/docs/stable/linalg.html), [`torch.signal`](https://pytorch.org/docs/stable/signal.html), [`torch.special`](https://pytorch.org/docs/stable/special.html), [`torch.optim`](https://pytorch.org/docs/stable/optim.html), [`torch.random`](https://pytorch.org/docs/stable/random.html), [`torch.sparse`](https://pytorch.org/docs/stable/sparse.html).
* <font color='brown'>(**#**)</font> See [CuPy](https://github.com/cupy/cupy) and [JAX](https://github.com/google/jax).
* <font color='brown'>(**#**)</font> [JAX](https://github.com/google/jax) is the backbone of other DL frameworks (Google's spiritual successor of _TensorFlow_).
* <font color='brown'>(**#**)</font> Mind the overhead and accuracy (By default `Float32`) when using CUDA devices.
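
To get a feeling for the accuracy remark, the sketch below solves the same (made up, well conditioned) linear system in `Float64` and in `Float32` using `torch.linalg.solve()` and compares the errors. It runs on the CPU; the matrix and its size are assumptions for illustration.

```python
import torch

torch.manual_seed(0)

numRows = 50
# Diagonally shifted random matrix to keep the system well conditioned
mA64 = torch.randn(numRows, numRows, dtype = torch.float64) + numRows * torch.eye(numRows, dtype = torch.float64)
vX64 = torch.randn(numRows, dtype = torch.float64) #<! Ground truth
vB64 = mA64 @ vX64

vXHat64 = torch.linalg.solve(mA64, vB64)                 #<! Solve in `float64`
vXHat32 = torch.linalg.solve(mA64.float(), vB64.float()) #<! Solve in `float32`

err64 = torch.linalg.norm(vXHat64 - vX64).item()
err32 = torch.linalg.norm(vXHat32.double() - vX64).item()

print(f'Float64 error: {err64:0.3e}')
print(f'Float32 error: {err32:0.3e}')
```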

In [None]:
# Matrix Multiplication

numRows = 10
numCols = 7 

mX = torch.rand(numRows, numCols, device = TORCH_DEVICE)
mY = torch.rand(numCols, numCols, device = TORCH_DEVICE)

# Accelerated
mX @ mY

## Layers & Loss Functions

PyTorch has a vast number of layers in the [`torch.nn`](https://pytorch.org/docs/stable/nn.html) module.  
The layers have both a _Class_ form and _Function_ (See [`torch.nn.functional`](https://pytorch.org/docs/stable/nn.functional.html)) form.

* <font color='brown'>(**#**)</font> The layer initialization happens using [`torch.nn.init`](https://pytorch.org/docs/master/nn.init.html).
* <font color='brown'>(**#**)</font> [Loss Function as Classes](https://pytorch.org/docs/stable/nn.html#loss-functions), [Loss Functions as Functions](https://pytorch.org/docs/stable/nn.functional.html#loss-functions).
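
A small sketch of the two forms side by side: the _Class_ form holds the parameters, while the _Function_ form receives them explicitly. The data, targets and dimensions are made up.

```python
import torch
import torch.nn            as nn
import torch.nn.functional as F

torch.manual_seed(0)

mX = torch.rand(4, 10) #<! N x d
mT = torch.rand(4, 3)  #<! Made up targets

oLinLayer = nn.Linear(10, 3)
oLossFun  = nn.MSELoss()

# Class form: the objects hold the parameters / configuration
valLossClass = oLossFun(oLinLayer(mX), mT)
# Function form: the parameters are passed explicitly
valLossFun = F.mse_loss(F.linear(mX, oLinLayer.weight, oLinLayer.bias), mT)
# The linear layer is `x @ W^T + b` (Data in rows)
mYManual = mX @ oLinLayer.weight.T + oLinLayer.bias

print(valLossClass.item(), valLossFun.item())
```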

In [None]:
# Linear (Dense / Fully Connected) Layer
# https://pytorch.org/docs/stable/generated/torch.nn.Linear.html
# The layer parameters are a Matrix and a Vector.

dimIn       = 10
dimOut      = 3
batchSize   = 4

mX = torch.rand(batchSize, dimIn) #<! N x d

# Initialization happens in `__init__`
oLinLayer = nn.Linear(dimIn, dimOut) #<! Look at `device = `
print(f'The Linear Layer weights: {oLinLayer.weight}') #<! Parameter Object (Registered for Computational Graph)
print(f'The Linear Layer bias: {oLinLayer.bias}') #<! Parameter Object (Registered for Computational Graph)
print(f'The Linear Layer weights array: {oLinLayer.weight.data}') #<! Values (Array in Memory)

# Apply (Data in Rows)
print(f'Output by `forward()`: {oLinLayer.forward(mX)}') #<! Forward
print(f'Output by `__call__()`: {oLinLayer(mX)}') #<! Call

LinearFun = F.linear
# No automatic computational graph (`grad_fn=`)
print(f'Output by the functional form: {LinearFun(mX, oLinLayer.weight.data, oLinLayer.bias.data)}') #<! Useful for lower overhead for operations with no parameters

In [None]:
# Iterate Over Parameters
# May use `oLinLayer.parameters()` or `oLinLayer.named_parameters()`
for name, param in oLinLayer.named_parameters():
    print(f'Layer Parameter: {name} with shape {param.shape}')

In [None]:
# Access Parameters by Name
oLinLayer.get_parameter('weight')

## Automatic Differentiation

[_Auto Diff_](https://en.wikipedia.org/wiki/Automatic_differentiation) is the most challenging part of a DL framework.  
There are 3 main approaches to the design:

 * Symbolic  
   Build the operations in a symbolic way and try solving the gradient.  
   Usually it is slow and not scalable.
   This is how [Wolfram Mathematica](https://en.wikipedia.org/wiki/Wolfram_Mathematica) works.
 * Numerically  
   Either by [_Finite Difference_](https://en.wikipedia.org/wiki/Finite_difference_method) (Slow) or [_Dual Numbers_](https://en.wikipedia.org/wiki/Dual_number).
 * Computational Graph & Overloading / Code Transform  
   Those are the most advanced approaches as they proved to be fast and scalable.

PyTorch implements _Automatic Differentiation_ in [`torch.autograd`](https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html).
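
A sketch comparing the approaches on a made up $\mathbb{R}^{n} \to \mathbb{R}$ function: the gradient by `torch.autograd`, by central finite differences and by the analytic formula. The function `SqrDotFun`, the values and the step size are assumptions for illustration.

```python
import torch

def SqrDotFun( vX: torch.Tensor, vW: torch.Tensor ) -> torch.Tensor:
    # f(x) = (w' * x)^2
    return torch.pow(torch.dot(vX, vW), 2)

vX = torch.tensor([1.4, 3.2], dtype = torch.float64, requires_grad = True)
vW = torch.tensor([2.3, 5.1], dtype = torch.float64)

# Automatic differentiation
valY = SqrDotFun(vX, vW)
valY.backward()

# Central finite differences, one coordinate at a time (Slow)
stepSize = 1e-6
vG = torch.zeros(2, dtype = torch.float64)
with torch.no_grad():
    for ii in range(2):
        vE     = torch.zeros(2, dtype = torch.float64)
        vE[ii] = stepSize
        vG[ii] = (SqrDotFun(vX + vE, vW) - SqrDotFun(vX - vE, vW)) / (2 * stepSize)

# Analytic gradient: 2 * (w' * x) * w
vGRef = 2 * torch.dot(vW, vX.detach()) * vW

print(vX.grad)
print(vG)
print(vGRef)
```

The finite differences approach requires 2 function evaluations per coordinate, which is why it does not scale to models with many parameters.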

In [None]:
# Define a Function
# The function could be `def` style function or a Lambda function.
# It should be Rn -> R function.

def MadeUpFun( vX: torch.Tensor, vW: torch.Tensor ) -> torch.Tensor:

    return torch.pow(torch.dot(vX, vW), 2)

In [None]:
# Data

numElm = 5

# Using `requires_grad` to make PyTorch build the graph
vX = torch.tensor([1.4, 3.2], requires_grad = True) #<! The `requires_grad = False` is the default
vW = torch.tensor([2.3, 5.1], requires_grad = True)

valY = MadeUpFun(vX, vW) #<! Build the Computational Graph
valY.backward() #<! Back Propagation

# ∇xf = 2 * (w' * x) * w
# ∇wf = 2 * (w' * x) * x
print(vX.grad)
print(vW.grad)

The _Computational Graph_ is rebuilt on every forward operation, while the gradients are accumulated in `.grad` on every call to `backward()`.  
Hence, the gradients should be reset if the operation is to be redone.  

In [None]:
# Without a Reset

valY = MadeUpFun(vX, vW) #<! Build the Computational Graph
valY.backward() #<! Back Propagation

# ∇xf = 2 * (w' * x) * w
# ∇wf = 2 * (w' * x) * x
print(vX.grad)
print(vW.grad)

* <font color='brown'>(**#**)</font> Gradients are accumulated per iteration.  
  The motivation is to allow accumulating over multiple iterations before applying an optimization step and / or over multiple GPU's.
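
A minimal sketch of the accumulation, using a made up loss: two `backward()` calls with no reset sum the gradients, while a reset in between gives the gradient of a single call.

```python
import torch

vW = torch.tensor([1.0, 2.0], requires_grad = True)

# Two `backward()` calls with no reset: the gradients are summed
for _ in range(2):
    valLoss = torch.sum(vW ** 2)
    valLoss.backward()
vGradAcc = vW.grad.clone() #<! 2 * (2 * vW)

vW.grad = None #<! Reset (An alternative to `zero_()`)

valLoss = torch.sum(vW ** 2)
valLoss.backward()
vGradSingle = vW.grad.clone() #<! 2 * vW

print(vGradAcc)
print(vGradSingle)
```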

In [None]:
# Reset Gradients

vX.grad.data.zero_() #<! Inplace 
vW.grad.data.zero_() #<! Inplace

valY = MadeUpFun(vX, vW) #<! Build the Computational Graph
valY.backward() #<! Back Propagation

# ∇xf = 2 * (w' * x) * w
# ∇wf = 2 * (w' * x) * x
print(vX.grad)
print(vW.grad)

* <font color='brown'>(**#**)</font> Leaf tensors which require gradients, the starting points of the graph, can not be changed in place.
* <font color='brown'>(**#**)</font> PyTorch is optimized for the case the output of the _computational graph_ is a scalar.
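
A sketch of what happens when the output is not a scalar (made up values): `backward()` with no arguments raises an error, so one either reduces the output to a scalar or supplies the vector of the Vector Jacobian Product.

```python
import torch

vX = torch.tensor([1.0, 2.0, 3.0], requires_grad = True)
vY = vX ** 2 #<! Non scalar output

# `backward()` with no arguments raises an error for non scalar outputs
try:
    vY.backward()
    raisedError = False
except RuntimeError:
    raisedError = True

# Option I: Reduce to a scalar
torch.sum(vX ** 2).backward()
vGradSum = vX.grad.clone()

# Option II: Supply the vector of the Vector Jacobian Product
vX.grad = None
(vX ** 2).backward(gradient = torch.ones(3))
vGradVjp = vX.grad.clone()

print(raisedError)
print(vGradSum, vGradVjp) #<! Both are 2 * vX
```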

### Detaching from Graph

An object can be detached (As a view) from the graph by using [`torch.Tensor.detach()`](https://pytorch.org/docs/stable/generated/torch.Tensor.detach.html).

* <font color='brown'>(**#**)</font> See [`torch.clone()`](https://pytorch.org/docs/stable/generated/torch.clone.html).
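
A short sketch of the difference between the two, with made up values: `detach()` returns a view which is not tracked by the graph, while `clone()` returns a copy which is still tracked.

```python
import torch

vX = torch.tensor([1.0, 2.0], requires_grad = True)

vXDetach = vX.detach() #<! A view: shares memory, not tracked by the graph
vXClone  = vX.clone()  #<! A copy: new memory, still tracked by the graph

print(vXDetach.requires_grad) #<! False
print(vXClone.requires_grad)  #<! True
print(vXDetach.data_ptr() == vX.data_ptr()) #<! Same memory
print(vXClone.data_ptr() == vX.data_ptr())  #<! Different memory
```

For a copy which is also out of the graph one may chain the two: `vX.detach().clone()`.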

In [None]:
# Detach an Object
vXDetach = vX.detach()
vS = torch.square(vXDetach) #<! No graph / gradient
valY = MadeUpFun(vS, vW)
valY.backward()
print(vS.grad) #<! `None`, as no gradient is tracked for `vS`

## Composition of Operations

Using the _atoms_ one can build a composed operation where PyTorch will create its computational graph automatically.

### Sequential Model

This section implements a composition of mathematical operations using [PyTorch Sequential Model](https://pytorch.org/docs/stable/generated/torch.nn.Sequential.html).

$$\hat{\boldsymbol{y}} = f \left( \boldsymbol{x} \right) = \boldsymbol{W}_{3} \sigma \left( \boldsymbol{W}_{2} \sigma \left( \boldsymbol{W}_{1} \boldsymbol{x} + \boldsymbol{b}_{1} \right) + \boldsymbol{b}_{2} \right) $$
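
As a sanity check of the formula, the sketch below evaluates it manually from the weights of a small sequential model and compares to the model output. It assumes $\sigma$ is the ReLU; the model `oNet` and its (small) dimensions are made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

oNet = nn.Sequential(
    nn.Linear(6, 4),              nn.ReLU(), #<! z1 = σ(W1 * x + b1)
    nn.Linear(4, 3),              nn.ReLU(), #<! z2 = σ(W2 * z1 + b2)
    nn.Linear(3, 2, bias = False)            #<! y  = W3 * z2
)

vX = torch.rand(6)

# Manual evaluation of W3 * σ(W2 * σ(W1 * x + b1) + b2)
mW1, vB1 = oNet[0].weight, oNet[0].bias
mW2, vB2 = oNet[2].weight, oNet[2].bias
mW3      = oNet[4].weight

vZ1      = torch.relu(mW1 @ vX + vB1)
vZ2      = torch.relu(mW2 @ vZ1 + vB2)
vYManual = mW3 @ vZ2

print(torch.allclose(oNet(vX), vYManual, atol = 1e-6))
```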

In [None]:
# Sequential Model

oModel = nn.Sequential(
    nn.Identity(),                              #<! For the summary (Shows the input)
    nn.Linear(100, 50),              nn.ReLU(), #<! z1 = σ(W1 * x + b1)
    nn.Linear(50,  25),              nn.ReLU(), #<! z2 = σ(W2 * z1 + b2)
    nn.Linear(25,  10, bias = False)            #<! y  = W3 * z2 (No Bias)
)

In [None]:
oModel

In [None]:
# Model Device
print(f'The model device: {next(oModel.parameters()).device}')

In [None]:
# Model Summary
# In order to see the model summary one must supply an input.

numSamples  = 16
dataDim     = 100

torchinfo.summary(oModel, input_size = (numSamples, dataDim), device = 'cpu') #<! By default tries on CUDA

In [None]:
# Model Device
print(f'The model device: {next(oModel.parameters()).device}') #<! Checks the first, assumes all on the same GPU

In [None]:
# Run Model
# No need for computational graph

# https://pytorch.org/docs/2.3/notes/autograd.html#evaluation-mode-nn-module-eval
oModel.eval() #<! Evaluation / Inference mode (for layers which require it)

mX = torch.rand(numSamples, dataDim, requires_grad = False)

# https://pytorch.org/docs/2.3/notes/autograd.html#inference-mode
with torch.inference_mode():
    mY = oModel(mX)

# `no_grad()` vs. `inference_mode()`: https://stackoverflow.com/questions/74191070
# with torch.no_grad():
#     mY = oModel(mX)

print(mY)
print(mY.requires_grad)

In [None]:
# Run Model
# Selective calculation of the gradient.

oModel = nn.Sequential(
    nn.Identity(),                            #<! For the summary (Shows the input)
    nn.Linear(100, 50),            nn.ReLU(), #<! z1 = σ(W1 * x + b1)
    nn.Linear(50,  25),            nn.ReLU(), #<! z2 = σ(W2 * z1 + b2)
    nn.Linear(25,  10, bias = False)          #<! y  = W3 * z2
    )

# https://pytorch.org/docs/2.3/notes/autograd.html#evaluation-mode-nn-module-eval
oModel.eval() #<! Evaluation / Inference mode (for layers which require it)
for p in oModel.parameters():
    # Disable the gradient calculation per parameter
    # Could be achieved with `no_grad` context (https://pytorch.org/docs/stable/generated/torch.no_grad.html)
    p.requires_grad_(False)
    # p.requires_grad

# Equivalent
# oModel.requires_grad_(False)

mX = torch.rand(numSamples, dataDim, requires_grad = False)
mY = oModel(mX)

print(mY)
print(mY.requires_grad)

### Custom Composition

This section implements a module (`torch.nn`) based on the function:

$$\hat{\boldsymbol{y}}=f\left(\boldsymbol{x}\right)=\boldsymbol{W}_{3}\left(\sigma_{1}\left(\boldsymbol{W}_{1}\boldsymbol{x}\right)+\sigma_{2}\left(\boldsymbol{W}_{2}\boldsymbol{x}\right)\right)$$

<center> <img src="https://media.githubusercontent.com/media/FixelAlgorithmsTeam/FixelCourses/refs/heads/master/DeepLearningMethods/2023_02/05_PyTorch/ParallelNetwork.png" style="width: 500px;"/></center>

By its architecture, it can not be implemented as a sequential model.

* <font color='brown'>(**#**)</font> Actually, if there is a module which implements the parallel section, it can.

#### Option I - Custom Layer

Define a new layer operation:

$$\text{NewLayer}\left(\boldsymbol{x}\right)=\sigma_{1}\left(\boldsymbol{W}_{1}\boldsymbol{x}\right)+\sigma_{2}\left(\boldsymbol{W}_{2}\boldsymbol{x}\right)$$

Which will be used in a sequential model.

In [None]:
# NewLayer

class NewLayer(nn.Module):
    def __init__( self, dIn: int, dOut: int ) -> None:
        
        super().__init__() #<! Do this to get all initialization of Layer
        self.oLinear1 = nn.Linear(dIn, dOut, bias = False)
        self.oLinear2 = nn.Linear(dIn, dOut, bias = False)

    def forward( self, mX: torch.Tensor ) -> torch.Tensor:
        
        mZ1 = torch.relu(self.oLinear1(mX)) #<! σ1(W1 * x)
        mZ2 = torch.relu(self.oLinear2(mX)) #<! σ2(W2 * x)

        return mZ1 + mZ2

* <font color='red'>(**?**)</font> Why is the `backward()` method missing?

In [None]:
# Sequential Model

oModel = nn.Sequential(
    NewLayer (100, 50),              #<! z = σ1(W1 * x) + σ2(W2 * x)
    nn.Linear(50,  10, bias = False) #<! y = W3 * z
)

torchinfo.summary(oModel, (16, 100))

In [None]:
# Model Device
print(f'The model device: {next(oModel.parameters()).device}') #<! GPU!

#### Option II - Complete Architecture

This will build a net which implements the whole architecture.

In [None]:
# Net Model

class ParallelModel(nn.Module):
    def __init__( self, dIn: int, dHidden: int, dOut: int ) -> None:
        
        super().__init__()
        self.oLinear1 = nn.Linear(dIn, dHidden, bias = False)  #<! W1
        self.oLinear2 = nn.Linear(dIn, dHidden, bias = False)  #<! W2
        self.oLinear3 = nn.Linear(dHidden, dOut, bias = False) #<! W3

    def forward( self, mX: torch.Tensor ) -> torch.Tensor:
        
        mZ1 = torch.sigmoid(self.oLinear1(mX)) #<! σ1(W1 * x)
        mZ2 = torch.tanh(self.oLinear2(mX))    #<! σ2(W2 * x)
        mY  = self.oLinear3(mZ1 + mZ2)         #<! W3 * (σ1(W1 * x) + σ2(W2 * x))
        
        return mY

In [None]:
# Net Model

oModel = ParallelModel(100, 50, 10)

torchinfo.summary(oModel, (16, 100))

* <font color='green'>(**@**)</font> Build a _Logistic Regression_ classifier using PyTorch (Binary Classification).