### **PyTorch**

**01. Introduction**

*`Timeline`*

* 1.x
    * Python compatible
    * Dynamic computation graph
    * TorchScript for model serialization and optimization
    * Caffe2 integration
    * Distributed training
    * ONNX (Open Neural Network Exchange) compatibility for interoperability with other frameworks
    * Introduced quantization for model compression and efficiency
    * Expanded ecosystem with torchvision (CV), torchtext (NLP), and torchaudio (audio).
* 2.x
    * Enhanced support for deployment and production-readiness.
    * Optimized for modern hardware (TPUs, custom AI-chips).


*`Core Features`*

1. Tensor Computations
2. GPU acceleration
3. Dynamic Computation Graph
4. Automatic Differentiation
5. Distributed training
6. Interoperability with other libraries


*`Core Modules in PyTorch`*
<br>

|Module| Description|
|---|---|
|torch|Core library providing multi-dimensional arrays (tensors) and mathematical operations on them.|
|torch.autograd|Automatic differentiation engine that records operations on tensors to compute gradients for optimization.|
|torch.nn|Neural network library, including layers, activations, loss functions and utilities to build deep learning models.|
|torch.optim|Optimization algorithms (optimizers) such as SGD, Adam and RMSprop, used for training neural networks.|
|torch.utils.data|Data-handling utilities, including the Dataset and DataLoader classes for managing and loading data efficiently.|
|torch.jit|Just-In-Time (JIT) compilation and TorchScript for optimizing models and enabling deployment without Python dependencies.|
|torch.distributed|Tools for distributed training across multiple GPUs and machines, facilitating parallel computation.|
|torch.cuda|Interface to NVIDIA CUDA that enables GPU acceleration for tensor computations and model training.|
|torch.backends|Settings that control backend libraries such as cuDNN and MKL for performance tuning.|
|torch.multiprocessing|Utilities for parallelism using multiprocessing, similar to Python's multiprocessing module but with support for CUDA tensors.|
|torch.quantization|Tools for model quantization to reduce model size and improve inference speed, especially on edge devices.|
|torch.onnx|Exports PyTorch models to the ONNX format for interoperability with other frameworks and deployment.|
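
To make the table more concrete, here is a minimal sketch of how `torch`, `torch.nn`, `torch.optim`, `torch.autograd` and `torch.utils.data` typically fit together in one training loop. The toy data (`y = 3x + 1`), the learning rate and the number of epochs are assumptions chosen only for illustration.

```python
import torch as tr
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

tr.manual_seed(0)

# Hypothetical toy data for y = 3x + 1 (no noise)
x = tr.linspace(-1, 1, 32).unsqueeze(1)   # shape (32, 1)
y = 3 * x + 1

loader = DataLoader(TensorDataset(x, y), batch_size=8)   # torch.utils.data
model = nn.Linear(1, 1)                                  # torch.nn
optimizer = tr.optim.SGD(model.parameters(), lr=0.1)     # torch.optim
loss_fn = nn.MSELoss()

epoch_losses = []
for epoch in range(200):
    total = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()            # clear gradients from the previous step
        loss = loss_fn(model(xb), yb)
        loss.backward()                  # torch.autograd fills .grad on each parameter
        optimizer.step()                 # update weight and bias
        total += loss.item()
    epoch_losses.append(total / len(loader))

print(epoch_losses[0], epoch_losses[-1])
print(model.weight.item(), model.bias.item())
```

The learned weight and bias should end up close to 3 and 1; the exact numbers depend on the initialization.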


*`PyTorch Domain Libraries`*
* torchvision
* torchtext
* torchaudio
* torcharrow
* torchserve
* pytorch_lightning

*`Ecosystem libraries`*
* Hugging Face Transformers
* Fastai
* PyTorch Geometric
* TorchMetrics
* TorchElastic
* Optuna
* Catalyst
* Ignite
* AllenNLP
* Skorch
* PyTorch Forecasting
* TensorBoard for PyTorch


**02. Basics of tensors**

* A tensor is a specialized multi-dimensional array designed for efficient mathematical computation.

In [1]:
import torch as tr
import tensorflow as tf



In [3]:
if tr.cuda.is_available():
    print(f'Using GPU:{tr.cuda.get_device_name(0)}')
else:
    print("GPU not available, using CPU")

# first local device reported by TensorFlow
from tensorflow.python.client import device_lib
print(f'\nCurrent devices: {device_lib.list_local_devices()[0].name}')

GPU not available, using CPU

Current devices: /device:CPU:0


**03. Creating Tensors**

In [4]:
# 1. using empty: allocates memory for shape (2,3) without initializing it, so the values are whatever was already in that memory
a = tr.empty(size = (2,3))
print(a)

b = tf.experimental.numpy.empty(shape=(2,3))
print(b)

tensor([[3.0276e-16, 3.3906e-41, 0.0000e+00],
        [0.0000e+00, 4.0944e+14, 4.4565e-41]])
tf.Tensor(
[[0. 0. 0.]
 [0. 0. 0.]], shape=(2, 3), dtype=float64)


In [5]:
# checking type
type(a),type(b)

(torch.Tensor, tensorflow.python.framework.ops.EagerTensor)

In [6]:
# 2. using zeros: creates a tensor with all zeros
a = tr.zeros(size = (2,5)) 
print(a)

b = tf.zeros(shape = (2,5))
print(b)

tensor([[0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0.]])
tf.Tensor(
[[0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0.]], shape=(2, 5), dtype=float32)


In [None]:
# 3. using ones: creates a tensor with all ones
a = tr.ones(size = (2,4)) 
print(a)

b = tf.ones(shape = (2,4))
print(b)

tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tf.Tensor(
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]], shape=(2, 4), dtype=float32)


In [8]:
# 4. using rand: creates a tensor with values drawn uniformly from [0, 1)
a = tr.rand(size = (2,5)) 
print(a)

b = tf.random.uniform(shape = (2,5))
print(b)

tensor([[0.1312, 0.7962, 0.8914, 0.1855, 0.8976],
        [0.7117, 0.5369, 0.4720, 0.3344, 0.9462]])
tf.Tensor(
[[0.19350553 0.8708118  0.41380608 0.70585346 0.22799802]
 [0.715858   0.590588   0.68044174 0.6352383  0.43992496]], shape=(2, 5), dtype=float32)


In [9]:
# use of seed
random_seed = 14
tr.manual_seed(random_seed)
tf.random.set_seed(random_seed)

a = tr.rand(size = (2,5)) 
print(a)

b = tf.random.uniform(shape = (2,5))
print(b)

tensor([[0.5695, 0.0047, 0.9303, 0.7257, 0.8295],
        [0.7683, 0.0600, 0.1453, 0.2924, 0.5292]])
tf.Tensor(
[[0.49482596 0.3634578  0.47555816 0.3824556  0.5170771 ]
 [0.24241877 0.16012645 0.5617894  0.9455174  0.20611668]], shape=(2, 5), dtype=float32)
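
A short sketch of what seeding buys you on the PyTorch side: resetting the seed restarts the random sequence, so the same call returns the same values. The second half uses a local `torch.Generator`, which is one way to get reproducible draws without touching the global random state; the variable names are illustrative only.

```python
import torch as tr

tr.manual_seed(14)
first = tr.rand(2, 5)
tr.manual_seed(14)                  # resetting the seed restarts the sequence
second = tr.rand(2, 5)
print(tr.equal(first, second))      # True

# Two local generators with the same seed produce the same draws
gen_a = tr.Generator().manual_seed(14)
gen_b = tr.Generator().manual_seed(14)
draw_a = tr.rand(2, 5, generator=gen_a)
draw_b = tr.rand(2, 5, generator=gen_b)
print(tr.equal(draw_a, draw_b))     # True
```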


In [10]:
# 5. using tensor: builds a tensor from existing data (e.g. nested lists)
a = tr.tensor([[2,3],[7,8]]) 
print(a)

b = tf.constant([[2,3],[7,8]])
print(b)

tensor([[2, 3],
        [7, 8]])
tf.Tensor(
[[2 3]
 [7 8]], shape=(2, 2), dtype=int32)


In [11]:
# Other ways to create tensors

# 6. using arange
a = tr.arange(5,10,2)
print(a)

b = tf.range(5,10,2)
print(b)

tensor([5, 7, 9])
tf.Tensor([5 7 9], shape=(3,), dtype=int32)


In [12]:
# 7. using linspace

a = tr.linspace(5,10,5)
print(a)

b = tf.linspace(5,10,5)
print(b)

tensor([ 5.0000,  6.2500,  7.5000,  8.7500, 10.0000])
tf.Tensor([ 5.    6.25  7.5   8.75 10.  ], shape=(5,), dtype=float64)


In [13]:
# 8. using eye
a = tr.eye(3)
print(a)

b = tf.eye(3)
print(b)

tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
tf.Tensor(
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]], shape=(3, 3), dtype=float32)


In [14]:
# 9. using full
a = tr.full(size = (3,3), fill_value=5)
print(a)

b = tf.keras.ops.full(shape = (3,3), fill_value = 5)
print(b)

tensor([[5, 5, 5],
        [5, 5, 5],
        [5, 5, 5]])
tf.Tensor(
[[5. 5. 5.]
 [5. 5. 5.]
 [5. 5. 5.]], shape=(3, 3), dtype=float32)


In [12]:
# accessing tensor content
a = tr.randn(1)
a.item(),type(a.item()) # gives scalar output 


(-1.5792160034179688, float)
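
As a small complement, `.item()` only works on tensors with exactly one element. The sketch below shows that and uses `.tolist()` for larger tensors; the exception type raised by `.item()` has differed between releases, so both candidates are caught here as an assumption.

```python
import torch as tr

t = tr.tensor([[1.5, 2.5], [3.5, 4.5]])

single = t[0, 1].item()   # Python float taken from a one-element tensor
nested = t.tolist()       # nested Python lists, works for any shape

try:
    t.item()              # four elements: cannot be converted to one scalar
    item_failed = False
except (RuntimeError, ValueError):
    item_failed = True

print(single, nested, item_failed)
```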

**04. Tensor shapes**

In [34]:
a = tr.tensor([[1,2,3],[4,5,6]])
b = tf.constant([[1,2,3],[4,5,6]])
a.shape,b.shape

(torch.Size([2, 3]), TensorShape([2, 3]))

In [45]:
# creating tensors with the same shape (and, by default, dtype) as a given tensor
c = tr.empty_like(a)
d = tr.zeros_like(c)
e = tr.ones_like(d)
# f = tr.rand_like(a) # raises an error: `a` has an integer dtype, but rand_like needs a floating-point dtype
f = tr.rand_like(a,dtype=tr.float32)

print(c)
print(d)
print(e)
print(f)

tensor([-881442624,      24193, -899170112], dtype=torch.int32)
tensor([0, 0, 0], dtype=torch.int32)
tensor([1, 1, 1], dtype=torch.int32)
tensor([0.1466, 0.8305, 0.3579])


In [27]:
d = tf.zeros_like(c)
e = tf.ones_like(d)

print(c)
print(d)
print(e)

tensor([[    103908143792128, 7310593858020254331, 3616445622929465956],
        [6067529767722167605, 3617017455194026033, 6499029850491270446]])
tf.Tensor(
[[0 0 0]
 [0 0 0]], shape=(2, 3), dtype=int64)
tf.Tensor(
[[1 1 1]
 [1 1 1]], shape=(2, 3), dtype=int64)


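**Note:**
* A torch tensor can be passed to TensorFlow functions, but a TensorFlow tensor cannot be passed directly to `tr.tensor`; convert it with `.numpy()` first.
* When there is no direct `tf` function for an operation, look for it under `tf.keras.ops` (for example `tf.keras.ops.full`).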

**05. Tensor Data types**

In [36]:
# checking data-type
print(a.dtype)
print(b.dtype)

torch.int64
<dtype: 'int32'>


In [47]:
# assigning data-type
a = tr.tensor([1.0,2.0,3.0],dtype=tr.int32)
# b = tf.constant([1.0,2.0,3.0],dtype = tf.int32) # raises an error: float values cannot be given an integer dtype directly
b = tf.cast(tf.constant([1.0,2.0,3.0]),dtype=tf.int32)

print(a,b)

tensor([1, 2, 3], dtype=torch.int32) tf.Tensor([1 2 3], shape=(3,), dtype=int32)


In [40]:
a = tr.tensor([1,2,3],dtype=tr.float32)
b = tf.constant([1,2,3],dtype = tf.float32) # this is possible
print(a,b)

tensor([1., 2., 3.]) tf.Tensor([1. 2. 3.], shape=(3,), dtype=float32)


In [42]:
# typecasting
a = a.to(dtype=tr.int32)
b = tf.cast(b,tf.int32)

print(a.dtype,b.dtype)

torch.int32 <dtype: 'int32'>


| PyTorch dtype                          | TensorFlow dtype     | Description                          |
|--------------------------------------|---------------------|------------------------------------|
| `tr.float32`                       | `tf.float32`        | 32-bit floating point               |
| `tr.float64`                       | `tf.float64`        | 64-bit floating point (double)      |
| `tr.float16`                       | `tf.float16`        | 16-bit floating point (half)        |
| `tr.bfloat16`                      | `tf.bfloat16`       | 16-bit brain floating point         |
| `tr.uint8`                        | `tf.uint8`          | Unsigned 8-bit integer              |
| `tr.int8`                         | `tf.int8`           | Signed 8-bit integer                |
| `tr.int16` (aka `tr.short`)    | `tf.int16`          | Signed 16-bit integer               |
| `tr.int32` (aka `tr.int`)      | `tf.int32`          | Signed 32-bit integer               |
| `tr.int64` (aka `tr.long`)     | `tf.int64`          | Signed 64-bit integer               |
| `tr.bool`                         | `tf.bool`           | Boolean type                      |
| `tr.complex64`                   | `tf.complex64`      | Complex numbers (64-bit)            |
| `tr.complex128`                  | `tf.complex128`     | Complex numbers (128-bit)           |
| `tr.qint8`                      | `tf.qint8`          | Quantized signed 8-bit integer      |
| `tr.quint8`                     | `tf.quint8`         | Quantized unsigned 8-bit integer    |
| `tr.qint32`                     | `tf.qint32`         | Quantized signed 32-bit integer     |
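
Two dtype behaviours are easy to miss when casting in PyTorch: a float-to-integer cast truncates toward zero, and mixing dtypes in one operation promotes the result to the wider type. A minimal sketch (variable names are illustrative):

```python
import torch as tr

x = tr.tensor([1.9, -1.9, 2.5])
as_int = x.to(tr.int32)               # float -> int truncates toward zero
print(as_int)                         # tensor([ 1, -1,  2], dtype=torch.int32)

# Shorthand methods for common casts
print(x.long().dtype, as_int.float().dtype, x.double().dtype)

# Mixed dtypes are promoted to the wider type
i = tr.tensor([1, 2, 3])              # int64 by default
f = tr.tensor([0.5, 0.5, 0.5])        # float32 by default
mixed = i + f
print(mixed.dtype)                    # torch.float32
print(tr.result_type(i, f))           # dtype the operation will use
```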


**06. Mathematical operations**

In [55]:
# 1. Scalar operations
tr.manual_seed(random_seed)
tf.random.set_seed(random_seed)
a = tr.rand(2,2)
b = tf.random.uniform((2,2))
a,b

(tensor([[0.5695, 0.0047],
         [0.9303, 0.7257]]),
 <tf.Tensor: shape=(2, 2), dtype=float32, numpy=
 array([[0.49482596, 0.3634578 ],
        [0.47555816, 0.3824556 ]], dtype=float32)>)

In [60]:
# addition
print(a+2)
print(b+2)
print()

# subtraction
print(a-2)
print(b-2)
print()

# multiplication
print(a*2)
print(b*2)
print()

# floor division
print(a//2)
print(b//2)
print()

# division
print(a/2)
print(b/2)
print()

# modulus
print(a%2)
print(b%2)
print()

# power
print(a**2)
print(b**2)


tensor([[2.5695, 2.0047],
        [2.9303, 2.7257]])
tf.Tensor(
[[2.4948258 2.3634577]
 [2.4755583 2.3824556]], shape=(2, 2), dtype=float32)

tensor([[-1.4305, -1.9953],
        [-1.0697, -1.2743]])
tf.Tensor(
[[-1.505174  -1.6365422]
 [-1.5244418 -1.6175444]], shape=(2, 2), dtype=float32)

tensor([[1.1390, 0.0094],
        [1.8605, 1.4515]])
tf.Tensor(
[[0.9896519 0.7269156]
 [0.9511163 0.7649112]], shape=(2, 2), dtype=float32)

tensor([[0., 0.],
        [0., 0.]])
tf.Tensor(
[[0. 0.]
 [0. 0.]], shape=(2, 2), dtype=float32)

tensor([[0.2847, 0.0024],
        [0.4651, 0.3629]])
tf.Tensor(
[[0.24741298 0.1817289 ]
 [0.23777908 0.1912278 ]], shape=(2, 2), dtype=float32)

tensor([[0.5695, 0.0047],
        [0.9303, 0.7257]])
tf.Tensor(
[[0.49482596 0.3634578 ]
 [0.47555816 0.3824556 ]], shape=(2, 2), dtype=float32)

tensor([[3.2432e-01, 2.2280e-05],
        [8.6537e-01, 5.2669e-01]])
tf.Tensor(
[[0.24485274 0.13210157]
 [0.22615556 0.14627227]], shape=(2, 2), dtype=float32)
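
The scalar operations above are a special case of broadcasting: the smaller operand is stretched, without copying, to match the larger one. The sketch below shows the same rule for a row vector and a column vector in PyTorch; the values are chosen only so the result is easy to read.

```python
import torch as tr

m = tr.arange(6).reshape(2, 3)        # [[0, 1, 2], [3, 4, 5]]
row = tr.tensor([10, 20, 30])         # shape (3,)
col = tr.tensor([[100], [200]])       # shape (2, 1)

plus_row = m + row                    # row is added to every row of m
plus_col = m + col                    # col is added to every column of m
outer = row + col                     # (3,) and (2, 1) broadcast to (2, 3)

print(plus_row)
print(plus_col)
print(outer.shape)
```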


In [62]:
# 2. element-wise operations
a = tr.rand(2,3)
b = tr.rand(2,3)

p = tf.random.uniform((2,3))
q = tf.random.uniform((2,3))

In [74]:
# addition
print(a+b)
print(p+q)
print()

# subtraction
print(a-b)
print(p-q)
print()

# multiplication
print(a*b)
print(p*q)
print()

# division
print(a/b)
print(p/q)
print()

# power
print(a**b)
print(p**q)
print()


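# matrix product with the transpose (inner products between the rows of the two operands)
print(a@b.T)
print(p@tf.transpose(q)) # tf tensors do not support `.T` by default; use tf.transpose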
print()

tensor([[0.6524, 1.3456, 1.5968],
        [0.8245, 0.9629, 1.7114]])
tf.Tensor(
[[0.8055148 1.0250206 1.7749256]
 [1.5376699 0.6104001 1.0852401]], shape=(2, 3), dtype=float32)

tensor([[-0.5820,  0.6289, -0.3143],
        [ 0.0650, -0.0445,  0.1832]])
tf.Tensor(
[[ 0.56187725 -0.842124    0.02157152]
 [ 0.02714205 -0.31516588  0.22221684]], shape=(2, 3), dtype=float32)

tensor([[0.0217, 0.3538, 0.6128],
        [0.1689, 0.2313, 0.7238]])
tf.Tensor(
[[0.08328702 0.0853736  0.78747386]
 [0.590923   0.06831468 0.28209144]], shape=(2, 3), dtype=float32)

tensor([[0.0570, 2.7550, 0.6711],
        [1.1712, 0.9117, 1.2397]])
tf.Tensor(
[[5.6124024  0.09795525 1.024606  ]
 [1.0359372  0.31897694 1.514973  ]], shape=(2, 3), dtype=float32)

tensor([[0.1267, 0.9954, 0.6541],
        [0.7351, 0.6757, 0.9595]])
tf.Tensor(
[[0.9547358  0.10719694 0.9102146 ]
 [0.8308319  0.4125633  0.8324187 ]], shape=(2, 3), dtype=float32)

tensor([[0.9882, 1.0006],
        [1.3442, 1.1240]])
tf.Tensor(
[[0.956134

In [80]:
# 3. Other operations
a = tr.tensor([1,2,-3,-4])

# getting absolute
print(tr.abs(a),tf.abs(a))

# getting negation
print(tr.neg(a),tf.negative(a))

tensor([1, 2, 3, 4]) tf.Tensor([1 2 3 4], shape=(4,), dtype=int64)
tensor([-1, -2,  3,  4]) tf.Tensor([-1 -2  3  4], shape=(4,), dtype=int64)


In [89]:
a = tr.tensor([1.9,2.3,-3.7,-4.4])

# rounding
print(tr.round(a),tf.round(a))

# ceil
print(tr.ceil(a),tf.keras.ops.ceil(a))

# floor
print(tr.floor(a),tf.keras.ops.floor(a))

# clamp
print(tr.clamp(a,min=2,max=3),tf.clip_by_value(a,clip_value_min=2,clip_value_max=3))



tensor([ 2.,  2., -4., -4.]) tf.Tensor([ 2.  2. -4. -4.], shape=(4,), dtype=float32)
tensor([ 2.,  3., -3., -4.]) tf.Tensor([ 2.  3. -3. -4.], shape=(4,), dtype=float32)
tensor([ 1.,  2., -4., -5.]) tf.Tensor([ 1.  2. -4. -5.], shape=(4,), dtype=float32)
tensor([2.0000, 2.3000, 2.0000, 2.0000]) tf.Tensor([2.  2.3 2.  2. ], shape=(4,), dtype=float32)


In [98]:
# 4. Reduction Operations

a = tr.randint(size =(2,3),low = 0,high = 10,dtype=tr.float32)
b = tf.random.uniform(shape = (2,3), minval=0,maxval=10,dtype=tf.int32)

print(a,b)

tensor([[5., 7., 1.],
        [5., 3., 8.]]) tf.Tensor(
[[1 2 6]
 [0 9 1]], shape=(2, 3), dtype=int32)


In [99]:
# sum
print(tr.sum(a))
print(tf.reduce_sum(a))
print()

# sum along columns
print(tr.sum(a,dim = 0))
print(tf.reduce_sum(a,axis = 0))
print()

# sum along rows
print(tr.sum(a,dim = 1))
print(tf.reduce_sum(a,axis = 1))
print()

tensor(29.)
tf.Tensor(29.0, shape=(), dtype=float32)

tensor([10., 10.,  9.])
tf.Tensor([10. 10.  9.], shape=(3,), dtype=float32)

tensor([13., 16.])
tf.Tensor([13. 16.], shape=(2,), dtype=float32)



In [100]:
# mean
print(tr.mean(a)) # needs datatype to be float or complex
print(tf.reduce_mean(a))
print()

# mean along columns
print(tr.mean(a,dim = 0))
print(tf.reduce_mean(a,axis = 0))
print()

# mean along rows
print(tr.mean(a,dim = 1))
print(tf.reduce_mean(a,axis = 1))
print()

tensor(4.8333)
tf.Tensor(4.8333335, shape=(), dtype=float32)

tensor([5.0000, 5.0000, 4.5000])
tf.Tensor([5.  5.  4.5], shape=(3,), dtype=float32)

tensor([4.3333, 5.3333])
tf.Tensor([4.3333335 5.3333335], shape=(2,), dtype=float32)



In [105]:
# median
print(tr.median(a)) # with an even number of elements, torch returns the lower of the two middle values
print(tf.keras.ops.median(a))
print()

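# median along columns (torch returns values and indices and picks the lower middle value; Keras averages the two middle values)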
print(tr.median(a,dim = 0))
print(tf.keras.ops.median(a,axis = 0))
print()

# median along rows
print(tr.median(a,dim = 1))
print(tf.keras.ops.median(a,axis = 1))
print()

tensor(5.)
tf.Tensor(5.0, shape=(), dtype=float32)

torch.return_types.median(
values=tensor([5., 3., 1.]),
indices=tensor([0, 1, 0]))
tf.Tensor([5.  5.  4.5], shape=(3,), dtype=float32)

torch.return_types.median(
values=tensor([5., 5.]),
indices=tensor([0, 0]))
tf.Tensor([5. 5.], shape=(2,), dtype=float32)
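
The column-wise results differ because `tr.median` returns the lower of the two middle values when the count is even, while the Keras function averages them. One way to get the averaged median in PyTorch is `tr.quantile` with `q=0.5`, sketched here on the same values as above:

```python
import torch as tr

a = tr.tensor([[5., 7., 1.],
               [5., 3., 8.]])

values, indices = tr.median(a, dim=0)     # lower middle value per column
print(values)                             # tensor([5., 3., 1.])

averaged = tr.quantile(a, 0.5, dim=0)     # linear interpolation between the middle values
print(averaged)                           # tensor([5.0000, 5.0000, 4.5000])
```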



In [106]:
# max
print(tr.max(a)) # max over all elements
print(tf.reduce_max(a))
print()

# max along columns
print(tr.max(a,dim = 0))
print(tf.reduce_max(a,axis = 0))
print()

# max along rows
print(tr.max(a,dim = 1))
print(tf.reduce_max(a,axis = 1))
print()

tensor(8.)
tf.Tensor(8.0, shape=(), dtype=float32)

torch.return_types.max(
values=tensor([5., 7., 8.]),
indices=tensor([0, 0, 1]))
tf.Tensor([5. 7. 8.], shape=(3,), dtype=float32)

torch.return_types.max(
values=tensor([7., 8.]),
indices=tensor([1, 2]))
tf.Tensor([7. 8.], shape=(2,), dtype=float32)



In [107]:
# min
print(tr.min(a)) # min over all elements
print(tf.reduce_min(a))
print()

# min along columns
print(tr.min(a,dim = 0))
print(tf.reduce_min(a,axis = 0))
print()

# min along rows
print(tr.min(a,dim = 1))
print(tf.reduce_min(a,axis = 1))
print()

tensor(1.)
tf.Tensor(1.0, shape=(), dtype=float32)

torch.return_types.min(
values=tensor([5., 3., 1.]),
indices=tensor([0, 1, 0]))
tf.Tensor([5. 3. 1.], shape=(3,), dtype=float32)

torch.return_types.min(
values=tensor([1., 3.]),
indices=tensor([2, 1]))
tf.Tensor([1. 3.], shape=(2,), dtype=float32)
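
When `dim` is given, `tr.max`, `tr.min` and `tr.median` return a named tuple of `(values, indices)`. A brief sketch of the usual ways to consume it, on the same values as above; `tr.amax` is shown as an option when only the values are needed.

```python
import torch as tr

a = tr.tensor([[5., 7., 1.],
               [5., 3., 8.]])

row_max, row_argmax = tr.max(a, dim=1)    # tuple unpacking
row_min = tr.min(a, dim=1)                # or keep the named tuple
only_values = tr.amax(a, dim=1)           # values only, no indices

print(row_max, row_argmax)                # tensor([7., 8.]) tensor([1, 2])
print(row_min.values, row_min.indices)    # tensor([1., 3.]) tensor([2, 1])
print(only_values)                        # tensor([7., 8.])
```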



In [None]:
# prod
print(tr.prod(a))
print(tf.keras.ops.prod(a))
print()

# along columns
print(tr.prod(a,dim = 0))
print(tf.keras.ops.prod(a,axis=0))
print()

# along rows
print(tr.prod(a,dim = 1))
print(tf.keras.ops.prod(a, axis = 1))


tensor(4200.)
tf.Tensor(4200.0, shape=(), dtype=float32)

tensor([25., 21.,  8.])
tf.Tensor([25. 21.  8.], shape=(3,), dtype=float32)

tensor([ 35., 120.])
tf.Tensor([ 35. 120.], shape=(2,), dtype=float32)


In [125]:
# std
print(tr.std(a))
print(tf.keras.ops.std(a))
print()

# along columns
print(tr.std(a,dim = 0))
print(tf.keras.ops.std(a,axis=0))
print()

# along rows
print(tr.std(a,dim = 1))
print(tf.keras.ops.std(a, axis = 1))


tensor(2.5626)
tf.Tensor(2.3392782, shape=(), dtype=float32)

tensor([0.0000, 2.8284, 4.9497])
tf.Tensor([0.  2.  3.5], shape=(3,), dtype=float32)

tensor([3.0551, 2.5166])
tf.Tensor([2.4944384 2.0548046], shape=(2,), dtype=float32)


In [118]:
# var
print(tr.var(a))
print(tf.keras.ops.var(a))
print()

# along columns
print(tr.var(a,dim = 0))
print(tf.keras.ops.var(a,axis=0))
print()

# along rows
print(tr.var(a,dim = 1))
print(tf.keras.ops.var(a, axis = 1))


tensor(6.5667)
tf.Tensor(5.472223, shape=(), dtype=float32)

tensor([ 0.0000,  8.0000, 24.5000])
tf.Tensor([ 0.    4.   12.25], shape=(3,), dtype=float32)

tensor([9.3333, 6.3333])
tf.Tensor([6.222223 4.222222], shape=(2,), dtype=float32)


**Note:** `tr.std` and `tr.var` apply Bessel's correction by default (they divide by N-1), while `tf.keras.ops.std` and `tf.keras.ops.var` divide by N. This is why the PyTorch values above are larger.
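
The two sets of numbers can be reproduced in PyTorch alone. The sketch below uses `unbiased=False` to switch the correction off (recent PyTorch releases also document a `correction` argument for the same purpose) and checks both results against the sum of squared deviations computed by hand.

```python
import torch as tr

a = tr.tensor([[5., 7., 1.],
               [5., 3., 8.]])
n = a.numel()

sample_var = tr.var(a)                        # default: divides by n - 1
population_var = tr.var(a, unbiased=False)    # divides by n, like the Keras result

squared_dev = ((a - a.mean()) ** 2).sum()     # sum of squared deviations
print(sample_var, squared_dev / (n - 1))      # both about 6.5667
print(population_var, squared_dev / n)        # both about 5.4722
print(tr.std(a, unbiased=False))              # about 2.3393
```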

In [133]:
# argmax
print(tr.argmax(a))
print(tf.argmax(tr.reshape(a,[-1]))) # tf.argmax works along one axis, so flatten first to get a single index over all elements
print()

# along columns
print(tr.argmax(a,dim = 0))
print(tf.argmax(a,axis=0))
print()

# along rows
print(tr.argmax(a,dim = 1))
print(tf.argmax(a, axis = 1))


tensor(5)
tf.Tensor(5, shape=(), dtype=int64)

tensor([0, 0, 1])
tf.Tensor([0 0 1], shape=(3,), dtype=int64)

tensor([1, 2])
tf.Tensor([1 2], shape=(2,), dtype=int64)


In [134]:
# argmin
print(tr.argmin(a))
print(tf.argmin(tr.reshape(a,[-1]))) # tf.argmin works along one axis, so flatten first to get a single index over all elements
print()

# along columns
print(tr.argmin(a,dim = 0))
print(tf.argmin(a,axis=0))
print()

# along rows
print(tr.argmin(a,dim = 1))
print(tf.argmin(a, axis = 1))


tensor(2)
tf.Tensor(2, shape=(), dtype=int64)

tensor([0, 1, 0])
tf.Tensor([0 1 0], shape=(3,), dtype=int64)

tensor([2, 1])
tf.Tensor([2 1], shape=(2,), dtype=int64)


In [139]:
# 5. Matrix operations
a = tr.tensor([[1,2,3],[4,5,6]])
b = tr.tensor([[1,2],[3,4],[5,6]])

# matrix multiplication
print(tr.matmul(a,b))
print(tf.matmul(a,b))
print()

# using `@` operator
print(a@b)
print()

# transpose
print(tr.transpose(a,dim0=0,dim1=1))
print(tf.transpose(a))

# using `.T` (only for torch)
print(a.T)


tensor([[22, 28],
        [49, 64]])
tf.Tensor(
[[22 28]
 [49 64]], shape=(2, 2), dtype=int64)

tensor([[22, 28],
        [49, 64]])

tensor([[1, 4],
        [2, 5],
        [3, 6]])
tf.Tensor(
[[1 4]
 [2 5]
 [3 6]], shape=(3, 2), dtype=int64)
tensor([[1, 4],
        [2, 5],
        [3, 6]])


In [146]:
tr.manual_seed(random_seed)
a = tr.rand(3,3)

# determinant
print(tr.det(a))
print(tf.linalg.det(a))

# inverse
print(tr.inverse(a))
print(tf.linalg.inv(a))

tensor(0.1256)
tf.Tensor(0.12555058, shape=(), dtype=float32)
tensor([[ 1.0426,  1.0657, -6.1171],
        [-1.3229,  0.8816,  1.8923],
        [ 0.4434, -0.6569,  3.7353]])
tf.Tensor(
[[ 1.0426165   1.0656512  -6.1171055 ]
 [-1.3228978   0.8816091   1.8922603 ]
 [ 0.443406   -0.65685827  3.7352521 ]], shape=(3, 3), dtype=float32)
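
A quick way to sanity-check an inverse is to multiply it back and compare with the identity. The sketch below does that on a small hand-picked matrix (an assumption for illustration, not the random matrix above) and also shows `tr.linalg.solve`, which solves `A x = b` without forming the inverse explicitly.

```python
import torch as tr

A = tr.tensor([[2., 1.],
               [1., 3.]])
b = tr.tensor([[1.], [2.]])

A_inv = tr.linalg.inv(A)
identity_ok = tr.allclose(A @ A_inv, tr.eye(2), atol=1e-5)

x_via_inverse = A_inv @ b
x_via_solve = tr.linalg.solve(A, b)       # solves A x = b directly

print(identity_ok)                        # True
print(x_via_solve.flatten())              # approximately tensor([0.2000, 0.6000])
```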


In [159]:
# 6. Comparison operators
tr.manual_seed(random_seed)
a = tr.rand(2,3)
b = tr.rand(2,3)

print(a)
print(b)
print()

# greater
print(a>b)
print()
# greater equal
print(a>=b)
print()

# lesser
print(a<b)
print()
# lesser - equal
print(a<=b)
print()

# equal
print(a==b)

tensor([[0.5695, 0.0047, 0.9303],
        [0.7257, 0.8295, 0.7683]])
tensor([[0.0600, 0.1453, 0.2924],
        [0.5292, 0.1466, 0.8305]])

tensor([[ True, False,  True],
        [ True,  True, False]])

tensor([[ True, False,  True],
        [ True,  True, False]])

tensor([[False,  True, False],
        [False, False,  True]])

tensor([[False,  True, False],
        [False, False,  True]])

tensor([[False, False, False],
        [False, False, False]])
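
Comparison results are boolean tensors, so they are mostly useful as masks. A minimal sketch of common follow-ups in PyTorch, using small integer tensors so the results are easy to check by eye:

```python
import torch as tr

a = tr.tensor([1, 5, 3])
b = tr.tensor([4, 2, 3])

mask = a > b                          # tensor([False,  True, False])
selected = a[mask]                    # boolean indexing keeps elements where mask is True
larger = tr.where(mask, a, b)         # picks from a where mask is True, otherwise from b
any_equal = (a == b).any().item()     # True: at least one position matches
all_equal = tr.equal(a, b)            # False: one bool for "same shape and same values"

print(selected, larger, any_equal, all_equal)
```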


In [185]:
# 7. Special functions

a = tr.tensor([[1,2,3],[-4,5,-6]])

# log
print(tr.log(a))
print(tf.math.log(a.to(tr.float)))
print()

# exp
print(tr.exp(a))
print(tf.math.exp(a.to(tr.float)))
print()

# sqrt
print(tr.sqrt(a))
print(tf.sqrt(a.to(tr.float)))
print()

# sigmoid
print(tr.sigmoid(a))
print(tf.sigmoid(a.to(tr.float)))
print()

# softmax
print(tr.softmax(a.to(tr.float),dim = 0)) # needs data to be float
print(tf.math.softmax(a.to(tr.float),axis = 0))
print()

# relu
print(tr.relu(a.to(tr.float)))
print(tf.keras.ops.relu(a.to(tr.float)))
print()

tensor([[0.0000, 0.6931, 1.0986],
        [   nan, 1.6094,    nan]])
tf.Tensor(
[[0.        0.6931472 1.0986123]
 [      nan 1.609438        nan]], shape=(2, 3), dtype=float32)

tensor([[2.7183e+00, 7.3891e+00, 2.0086e+01],
        [1.8316e-02, 1.4841e+02, 2.4788e-03]])
tf.Tensor(
[[2.7182817e+00 7.3890562e+00 2.0085537e+01]
 [1.8315639e-02 1.4841316e+02 2.4787523e-03]], shape=(2, 3), dtype=float32)

tensor([[1.0000, 1.4142, 1.7321],
        [   nan, 2.2361,    nan]])
tf.Tensor(
[[1.        1.4142135 1.7320508]
 [      nan 2.236068        nan]], shape=(2, 3), dtype=float32)

tensor([[0.7311, 0.8808, 0.9526],
        [0.0180, 0.9933, 0.0025]])
tf.Tensor(
[[0.7310586  0.8807971  0.95257413]
 [0.01798621 0.9933072  0.00247262]], shape=(2, 3), dtype=float32)

tensor([[9.9331e-01, 4.7426e-02, 9.9988e-01],
        [6.6929e-03, 9.5257e-01, 1.2339e-04]])
tf.Tensor(
[[9.9330717e-01 4.7425874e-02 9.9987662e-01]
 [6.6928510e-03 9.5257413e-01 1.2339458e-04]], shape=(2, 3), dtype=float32)

tensor([

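**Note:** TensorFlow requires floating-point or complex inputs for these functions, which is why the integer tensor is cast with `a.to(tr.float)` before each TensorFlow call. The `nan` entries come from taking `log` and `sqrt` of negative values.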
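
For softmax, the `dim` argument decides which slices are normalized to sum to 1. A small sketch on the same values as above, cast to float up front:

```python
import torch as tr

a = tr.tensor([[1., 2., 3.],
               [-4., 5., -6.]])

per_column = tr.softmax(a, dim=0)     # each column sums to 1
per_row = tr.softmax(a, dim=1)        # each row sums to 1

print(per_column.sum(dim=0))          # approximately tensor([1., 1., 1.])
print(per_row.sum(dim=1))             # approximately tensor([1., 1.])
```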

In [209]:
# 8. Inplace Operations
tr.manual_seed(random_seed)
m = tr.randn(2,3)
n = tr.randn(2,3)

print(m)
print(n)

print(m.add_(n)==m) # an in-place op returns the modified tensor itself, so this comparison is always True
print()
print(m.sub_(n)==m)
print()
print(m.mul_(n)==m)
print()
print(m.div_(n)==m)
print()
print(m.relu_())
print()
print(m.abs_())

tensor([[-1.0141, -0.3720, -0.7516],
        [-0.8623, -0.3270,  0.5212]])
tensor([[ 1.2622, -1.4680, -0.1037],
        [ 0.5177, -1.0845, -2.0901]])
tensor([[True, True, True],
        [True, True, True]])

tensor([[True, True, True],
        [True, True, True]])

tensor([[True, True, True],
        [True, True, True]])

tensor([[True, True, True],
        [True, True, True]])

tensor([[0.0000, 0.0000, 0.0000],
        [0.0000, 0.0000, 0.5212]])

tensor([[0.0000, 0.0000, 0.0000],
        [0.0000, 0.0000, 0.5212]])


**Note:** 
* TensorFlow has no direct equivalent of PyTorch's in-place methods (`add_`, `mul_`, ...). It is built around a computational graph model, and `tf.Tensor` objects are immutable (never updated in place).
* For mutable state, use `tf.Variable` with `assign`, `assign_add` or `assign_sub`, or rebind the name to a new tensor.
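
On the PyTorch side, two properties of in-place operations are worth seeing directly: they reuse the tensor's memory, and autograd refuses them on a leaf tensor that requires gradients unless gradient tracking is switched off. A minimal sketch (names are illustrative):

```python
import torch as tr

m = tr.ones(3)
ptr_before = m.data_ptr()
m.add_(1)                                 # in-place: same memory, values are now 2
same_storage = m.data_ptr() == ptr_before

out = m + 1                               # out-of-place: a new tensor with its own memory
new_storage = out.data_ptr() != ptr_before

w = tr.ones(3, requires_grad=True)
try:
    w.add_(1)                             # rejected: in-place edit of a leaf that requires grad
    leaf_edit_failed = False
except RuntimeError:
    leaf_edit_failed = True

with tr.no_grad():                        # manual updates are done without gradient tracking
    w.add_(1)

print(same_storage, new_storage, leaf_edit_failed, w)
```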

In [211]:
tf.random.set_seed(random_seed)

# Mutable tensors using tf.Variable
m = tf.Variable(tf.random.normal((2, 3)))
n = tf.random.normal((2, 3))

print("m:\n", m.numpy())
print("n:\n", n.numpy())

# add_ equivalent
m.assign_add(n)
print("\nm after add_:\n", m.numpy())

# sub_ equivalent
m.assign_sub(n)
print("\nm after sub_:\n", m.numpy())

# mul_ equivalent
m.assign(m * n)
print("\nm after mul_:\n", m.numpy())

# div_ equivalent
m.assign(m / n)
print("\nm after div_:\n", m.numpy())

# relu_ equivalent
m.assign(tf.nn.relu(m))
print("\nm after relu_:\n", m.numpy())

# abs_ equivalent
m.assign(tf.abs(m))
print("\nm after abs_:\n", m.numpy())

m:
 [[ 0.89734983 -0.7757973   0.8208116 ]
 [-0.90155447  1.1472297   0.05468887]]
n:
 [[ 0.14229247  0.6229361   0.5665132 ]
 [-0.5726     -0.13736247 -2.1842823 ]]

m after add_:
 [[ 1.0396423  -0.15286118  1.3873248 ]
 [-1.4741545   1.0098672  -2.1295934 ]]

m after sub_:
 [[ 0.89734983 -0.7757973   0.8208116 ]
 [-0.90155447  1.1472297   0.05468893]]

m after mul_:
 [[ 0.12768613 -0.48327217  0.4650006 ]
 [ 0.5162301  -0.15758629 -0.11945606]]

m after div_:
 [[ 0.89734983 -0.7757973   0.8208116 ]
 [-0.90155447  1.1472297   0.05468893]]

m after relu_:
 [[0.89734983 0.         0.8208116 ]
 [0.         1.1472297  0.05468893]]

m after abs_:
 [[0.89734983 0.         0.8208116 ]
 [0.         1.1472297  0.05468893]]


**07. Copying a Tensor**

In [216]:
# plain assignment does not copy: `b` becomes another name for the same tensor object
a = tr.randn(2,3)
b = a
print(a==b)
a[0][0]=7
print(a==b)

print(id(a)==id(b))

tensor([[True, True, True],
        [True, True, True]])
tensor([[True, True, True],
        [True, True, True]])
True


In [222]:
b = a.clone()
print(id(a)==id(b))

False
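
Comparing `id()` values only tells you whether two names refer to the same Python object. A view is a different object that still shares memory with the original, so `data_ptr()` is a more direct check. The sketch below contrasts an alias, a view and a clone:

```python
import torch as tr

a = tr.zeros(2, 3)

alias = a                 # same Python object
view = a.view(-1)         # different object, same underlying memory
copy = a.clone()          # different object, independent memory

print(id(view) == id(a), view.data_ptr() == a.data_ptr())   # False True
print(copy.data_ptr() == a.data_ptr())                      # False

a[0, 0] = 7.0             # visible through the alias and the view, not the clone
print(alias[0, 0].item(), view[0].item(), copy[0, 0].item())
```

If the tensor is part of an autograd graph, `a.detach().clone()` gives a copy that is cut off from that graph.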


**Notes**
* TensorFlow has no `clone()` method like PyTorch's, because:
    * TensorFlow encourages immutability for tensors.
    * You must be explicit about when and how copies are made.

In [230]:
tf.random.set_seed(random_seed)
a = tf.Variable(tf.random.uniform((2,3)))
b = a
print(id(a)==id(b))

b = tf.Variable(a.read_value())
print(id(a)==id(b))

# another approach
import copy

b = copy.deepcopy(a)
print(id(a)==id(b))

True
False
False


**08. Tensor operations on GPU**

In [None]:
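# select the GPU when one is available, otherwise fall back to the CPU
device = tr.device('cuda' if tr.cuda.is_available() else 'cpu')

# tr.rand((2,3),device=device) # creates the tensor directly on `device`
a = tr.rand((2,3),device=tr.device('cpu')) # the default device is the CPU
b = a.to(device) # copies to the GPU when `device` is CUDA; returns `a` itself when it is already on `device`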
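
Operands of one operation have to live on the same device, and NumPy can only read CPU memory. The sketch below is written so that it runs with or without a GPU; which device it ends up on depends on the machine.

```python
import torch as tr

device = tr.device("cuda" if tr.cuda.is_available() else "cpu")

a = tr.rand(2, 3, device=device)      # created directly on the chosen device
b = tr.ones(2, 3).to(device)          # created on the CPU, then moved if needed
c = a + b                             # fine: both operands are on the same device

as_numpy = c.cpu().numpy()            # move back to the CPU before converting to NumPy
print(c.device, as_numpy.shape)
```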

**09. Reshaping tensors**

In [249]:
# reshape
a = tr.randn(4,4)
b = a.reshape(2,2,2,2)
print(a.shape,b.shape)
print()
a = tf.random.uniform((4,4))
b = tf.reshape(a,(2,2,2,2)) # tf tensors have no `.reshape` method by default; use tf.reshape
print(a.shape,b.shape)


torch.Size([4, 4]) torch.Size([2, 2, 2, 2])

(4, 4) (2, 2, 2, 2)


In [None]:
# flatten == reshape(T,[-1])
a = tr.randn(4,4)
b = a.flatten()
print(a.shape,b.shape)
print()
a = tf.random.uniform((4,4))
b = tf.reshape(a,[-1]) # tf tensors have no `.flatten` or `.reshape` method by default; use tf.reshape
print(a.shape,b.shape)

torch.Size([4, 4]) torch.Size([16])

(4, 4) (16,)


In [None]:
# permute == transpose
a = tr.randn(2,3,4)
b = a.permute(2,0,1)
print(a.shape,b.shape)
print()

a = tf.random.uniform((2,3,4))
b = tf.transpose(a,perm=(2,0,1))
print(a.shape,b.shape)

torch.Size([2, 3, 4]) torch.Size([4, 2, 3])

(2, 3, 4) (4, 2, 3)
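
In PyTorch, `permute` does not move any data; it returns a view with different strides, which is usually no longer contiguous. That matters for `view`, which never copies, whereas `reshape` copies when it has to. A small sketch:

```python
import torch as tr

t = tr.arange(6).reshape(2, 3)        # [[0, 1, 2], [3, 4, 5]]
p = t.permute(1, 0)                   # shape (3, 2); same memory, different strides

contiguous = p.is_contiguous()        # False
flat_reshape = p.reshape(-1)          # works: reshape copies when needed

try:
    p.view(-1)                        # view cannot express this layout without a copy
    view_failed = False
except RuntimeError:
    view_failed = True

flat_view = p.contiguous().view(-1)   # make a contiguous copy first, then view works
print(contiguous, view_failed, flat_reshape)
```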


In [None]:
# unsqueeze & squeeze

a = tr.randn(226,226,3)
b = a.unsqueeze(0)
c = b.squeeze()
print(a.shape,b.shape,c.shape)
print()

a = tf.random.uniform((226,226,3))
b = tf.expand_dims(a,axis = 0)
c = tf.squeeze(b)
print(a.shape,b.shape,c.shape)

torch.Size([226, 226, 3]) torch.Size([1, 226, 226, 3]) torch.Size([226, 226, 3])

(226, 226, 3) (1, 226, 226, 3) (226, 226, 3)
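
One caveat with `squeeze()`: without an argument it removes every dimension of size 1, not just the one added by `unsqueeze(0)`. The sketch below uses a hypothetical batch of one single-channel image to show the difference.

```python
import torch as tr

batch = tr.zeros(1, 226, 226, 1)          # batch of one single-channel image

all_squeezed = batch.squeeze()            # both size-1 dims are removed
batch_squeezed = batch.squeeze(0)         # only the batch dim is removed
unchanged = batch.squeeze(1)              # dim 1 has size 226, so nothing happens
restored = batch_squeezed.unsqueeze(0)    # add the batch dim back

print(all_squeezed.shape, batch_squeezed.shape, unchanged.shape, restored.shape)
```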


In [265]:
# PyTorch <--> NumPy <--> TensorFlow

import numpy as np

a = tr.tensor([1,2,3])
b = a.numpy()
c = tf.constant(a)
print(type(a),type(b),type(c))

<class 'torch.Tensor'> <class 'numpy.ndarray'> <class 'tensorflow.python.framework.ops.EagerTensor'>


In [266]:
a = np.array([1,2,3])
b = tr.tensor(a)
c = tf.constant(a)
print(type(a),type(b),type(c))

<class 'numpy.ndarray'> <class 'torch.Tensor'> <class 'tensorflow.python.framework.ops.EagerTensor'>


In [270]:
a = tf.constant([1,2,3])
b = a.numpy()
# c = tr.tensor(a) # doesn't work
c = tr.tensor(a.numpy())
d = tr.from_numpy(a.numpy())
print(type(a),type(b),type(c))

<class 'tensorflow.python.framework.ops.EagerTensor'> <class 'numpy.ndarray'> <class 'torch.Tensor'>
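
The two PyTorch constructors used above differ in one important way: `tr.from_numpy` shares memory with the NumPy array, while `tr.tensor` copies the data. A brief sketch of what that means in practice:

```python
import numpy as np
import torch as tr

arr = np.array([1, 2, 3])

shared = tr.from_numpy(arr)   # shares memory with arr
copied = tr.tensor(arr)       # independent copy of the data

arr[0] = 100                  # visible through `shared`, not through `copied`
print(shared, copied)

back = shared.numpy()         # also shares memory (CPU tensors only)
back[1] = 200                 # writes through to arr and shared
print(arr)
```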


***--- Continued in the next notebook ---***