# Introduction to Deep Learning

In [1]:
!pip freeze

absl-py==1.4.0
accelerate==1.2.1
aiohappyeyeballs==2.4.4
aiohttp==3.11.10
aiosignal==1.3.2
alabaster==1.0.0
albucore==0.0.19
albumentations==1.4.20
altair==5.5.0
annotated-types==0.7.0
anyio==3.7.1
argon2-cffi==23.1.0
argon2-cffi-bindings==21.2.0
array_record==0.5.1
arviz==0.20.0
astropy==6.1.7
astropy-iers-data==0.2024.12.16.0.35.48
astunparse==1.6.3
async-timeout==4.0.3
atpublic==4.1.0
attrs==24.3.0
audioread==3.0.1
autograd==1.7.0
babel==2.16.0
backcall==0.2.0
beautifulsoup4==4.12.3
bigframes==1.29.0
bigquery-magics==0.4.0
bleach==6.2.0
blinker==1.9.0
blis==0.7.11
blosc2==2.7.1
bokeh==3.6.2
Bottleneck==1.4.2
bqplot==0.12.43
branca==0.8.1
CacheControl==0.14.1
cachetools==5.5.0
catalogue==2.0.10
certifi==2024.12.14
cffi==1.17.1
chardet==5.2.0
charset-normalizer==3.4.0
chex==0.1.88
clarabel==0.9.0
click==8.1.7
cloudpathlib==0.20.0
cloudpickle==3.1.0
cmake==3.31.2
cmdstanpy==1.2.5
colorcet==3.1.0
colorlover==0.3.0
colour==0.1.5
community==1.0.0b1
confection==0.1.5
cons==0.4.6
contourpy=

In [2]:
# imports
import numpy as np

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense

import torch

from sklearn.datasets import load_iris

In [3]:
# scalar (rank = 0):
2

# vector (rank = 1):
[1, 2]

# two-dimensional matrix (rank = 2):
[
    [2, 0],
    [3, 1]
]

# tensor - multidimensional matrix (rank = 3):
[
    [
        [1, 2, 18],
        [3, 0, 18],
        [2, 5, 18],
        [5, 1, 18],
    ],
    [
        ...
    ],
    [
        ...
    ],
]

[[[1, 2, 18], [3, 0, 18], [2, 5, 18], [5, 1, 18]], [Ellipsis], [Ellipsis]]
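
To make the idea of rank concrete, here is a small sketch that derives the shape (and from it the rank) of a nested Python list. The helper `shape_of` is a hypothetical name introduced for illustration, and it assumes regular, non-empty nesting.

```python
def shape_of(nested):
    """Return the shape of a regular nested list as a tuple; a bare number has shape ()."""
    shape = []
    current = nested
    while isinstance(current, list):
        shape.append(len(current))
        current = current[0]  # assumes every level is non-empty and regular
    return tuple(shape)

scalar = 2
vector = [1, 2]
matrix = [[2, 0], [3, 1]]
cube = [[[1, 2, 18], [3, 0, 18], [2, 5, 18], [5, 1, 18]]]  # a single 4x3 block only

for name, value in [("scalar", scalar), ("vector", vector), ("matrix", matrix), ("cube", cube)]:
    shape = shape_of(value)
    print(name, "rank =", len(shape), "shape =", shape)
```

The rank is simply the number of entries in the shape tuple.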

## Computational graphs
Computational graphs (directed acyclic graphs) are an abstract representation of the calculations in deep learning. They model a complex mathematical operation as a **graph** composed of nodes and edges. A feed-forward neural network arranged this way has an input layer, hidden layers and an output layer. In a fully connected network, each node in one layer is connected to every node in the next layer, while nodes within the same layer are not connected to each other. This is similar to stacking: each layer builds on the output of the previous one. In residual networks, there may also be connections between non-consecutive layers.
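
As a rough illustration, the expression (2a + 3b)^2 evaluated in the following cells can be written out as an explicit graph, one node per intermediate value. The sketch below uses plain Python; the node names `u`, `v`, `s` and `out` are my own, and the backward pass is done by hand with the chain rule, which is a simplified picture of what the frameworks automate.

```python
# Forward pass: each intermediate value is one node of the graph.
a, b = 15.0, 20.0
u = 2 * a        # node u = 2a
v = 3 * b        # node v = 3b
s = u + v        # node s = u + v
out = s ** 2     # node out = s^2

# Backward pass: walk the graph in reverse, applying the chain rule at each node.
d_out_d_s = 2 * s            # d(s^2)/ds
d_out_d_u = d_out_d_s * 1    # s = u + v, so ds/du = 1
d_out_d_v = d_out_d_s * 1    # ds/dv = 1
d_out_d_a = d_out_d_u * 2    # u = 2a, so du/da = 2
d_out_d_b = d_out_d_v * 3    # v = 3b, so dv/db = 3

print(out, d_out_d_a, d_out_d_b)  # 8100.0 360.0 540.0
```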

In [4]:
def test_func(a, b):
  return (2 * a + 3 * b) ** 2

In [5]:
# python
a, b = 15, 20
test_func(a, b)

8100

In [6]:
# numpy
a = np.array([15, 3, 4, 18, -5])
b = np.array([20, 5, 18, 2, 20])
[(2 * x + 3 * y) ** 2 for x, y in zip(a, b)]

[8100, 441, 3844, 1764, 2500]

In [7]:
# tensorflow
a = tf.constant([15, 3, 4, 18, -5])
b = tf.constant([20, 5, 18, 2, 20])
test_func(a, b)

<tf.Tensor: shape=(5,), dtype=int32, numpy=array([8100,  441, 3844, 1764, 2500], dtype=int32)>

In [8]:
# torch
a = torch.tensor([15, 3, 4, 18, -5])
b = torch.tensor([20, 5, 18, 2, 20])
test_func(a, b)

tensor([8100,  441, 3844, 1764, 2500])

Decorating a Python function with `@tf.function` lets TensorFlow trace it into a reusable computational graph:

In [9]:
# reusable function
@tf.function
def raw_func(a, b):
  return (2 * a + 3 * b) ** 2

## Linear Models

In [10]:
iris_data = load_iris()

In [11]:
attributes, labels = iris_data["data"], iris_data["target"]

### Define and train Logistic Regression with TensorFlow
With TensorFlow, I can pass the attributes and labels directly as NumPy arrays.

TensorFlow offers both low-level and high-level APIs for building neural networks. TensorFlow 2.0 integrated **Keras** as its default high-level API.

In [12]:
# input = count of features (give a tuple):
features = (attributes.shape[1], )
features

(4,)

In [13]:
# output = count of classes:
class_count = len(set(labels))
class_count

3

In [14]:
tf.keras.backend.clear_session()

In [15]:
# 1. Define the architecture (layers):

model_tf = Sequential([
    Input(features), # input layer
    Dense(class_count, activation = "softmax"), # output layer
])

This model consists of 3 logistic regressions, one per class. Each of them has 4 weights and 1 bias, so there are 4*3 + 3 = 15 parameters and 3 outputs.
Output shape: (None, 3). `None` stands for the batch dimension: the model accepts any number of input records.
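
The same arithmetic can be sketched in plain Python. The helpers `dense_params` and `count_params` are hypothetical names for this illustration; they only assume that a Dense layer holds one weight per input-unit pair plus one bias per unit.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

def count_params(layer_sizes):
    """layer_sizes lists the input width followed by the width of each Dense layer."""
    return sum(dense_params(n_in, n_out) for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

print(count_params([4, 3]))          # 15
print(count_params([4, 20, 10, 3]))  # 343
```

The second call applies the same rule to the deeper 4 -> 20 -> 10 -> 3 network defined later in this notebook.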

In [16]:
model_tf.summary()

In [17]:
# 2. Give a task to the model (the correct loss function for classification):

# if target = (1, 0, 2, 1).. -> sparse_categorical_crossentropy
# if target = (1, 0, 0), (0, 1, 0) .. -> categorical_crossentropy

model_tf.compile(loss = "sparse_categorical_crossentropy", optimizer = "adam", metrics = ["accuracy"])
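
To see why the two loss names differ only in the label format, here is a minimal sketch for a single record, written in plain Python rather than with the Keras implementations. The probabilities are made up for illustration.

```python
import math

def sparse_categorical_crossentropy(label, probs):
    """Loss for an integer class label: minus the log of that class's probability."""
    return -math.log(probs[label])

def categorical_crossentropy(one_hot, probs):
    """Loss for a one-hot target: minus the sum of target * log(probability)."""
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs))

probs = [0.7, 0.2, 0.1]  # made-up model output for one record

loss_sparse = sparse_categorical_crossentropy(0, probs)
loss_one_hot = categorical_crossentropy([1, 0, 0], probs)
print(loss_sparse, loss_one_hot)  # both are -log(0.7)
```

Both calls give the same value, so the choice depends on how the targets are encoded, not on the task.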

An **epoch** is one complete pass through the entire dataset. The loss usually decreases from epoch to epoch, although it can fluctuate between individual epochs.
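
As a quick check of what one epoch means in terms of weight updates, the sketch below counts the batches for the 150 Iris records, using the batch size and epoch count from the training call in this notebook.

```python
import math

n_samples = 150   # rows in the Iris dataset
batch_size = 8
epochs = 100

steps_per_epoch = math.ceil(n_samples / batch_size)  # the last batch holds the remaining 6 rows
total_updates = steps_per_epoch * epochs

print(steps_per_epoch)  # 19
print(total_updates)    # 1900
```

This is where the `19/19` in the training log comes from.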

In [18]:
# fit() continues from the current weights, so calling it again trains further (like a partial fit)
model_tf.fit(attributes, labels, batch_size = 8, epochs = 100)

Epoch 1/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.4797 - loss: 1.2755
Epoch 2/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.4909 - loss: 1.0606
Epoch 3/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.5927 - loss: 0.8387
Epoch 4/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5220 - loss: 0.9096
Epoch 5/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.6409 - loss: 0.7172
Epoch 6/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - accuracy: 0.5919 - loss: 0.7318
Epoch 7/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6827 - loss: 0.6701
Epoch 8/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7456 - loss: 0.6589
Epoch 9/100
19/19 ━━━━

<keras.src.callbacks.history.History at 0x7f971b454c40>

In [19]:
# on each row, the model gives the probability that the current record is of a given class
model_tf.predict(attributes)

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step


array([[9.37267065e-01, 6.03793859e-02, 2.35351222e-03],
       [8.78849566e-01, 1.15378365e-01, 5.77201881e-03],
       [9.20206666e-01, 7.60423690e-02, 3.75091052e-03],
       [8.90278876e-01, 1.03022568e-01, 6.69857580e-03],
       [9.46924746e-01, 5.10061234e-02, 2.06910726e-03],
       [9.39180434e-01, 5.83626330e-02, 2.45695002e-03],
       [9.29122210e-01, 6.70289248e-02, 3.84878181e-03],
       [9.21643496e-01, 7.48551264e-02, 3.50130792e-03],
       [8.74342203e-01, 1.17367022e-01, 8.29074625e-03],
       [8.92742395e-01, 1.02012828e-01, 5.24477893e-03],
       [9.44355905e-01, 5.38327210e-02, 1.81139773e-03],
       [9.16908681e-01, 7.85054043e-02, 4.58587054e-03],
       [8.90672147e-01, 1.04009829e-01, 5.31804049e-03],
       [9.26987052e-01, 6.93270043e-02, 3.68578429e-03],
       [9.71127033e-01, 2.83550788e-02, 5.17836830e-04],
       [9.74641800e-01, 2.47277580e-02, 6.30422670e-04],
       [9.59683597e-01, 3.91834863e-02, 1.13295438e-03],
       [9.31597292e-01, 6.57298

In [20]:
# get the class with the highest probability (the index of the largest value) for each row:
tf.argmax(model_tf.predict(attributes), axis = 1)

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step


<tf.Tensor: shape=(150,), dtype=int64, numpy=
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       2, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])>
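
The two steps above (probabilities per class, then the index of the largest one) can be sketched for a single record in plain Python. The raw scores are made up, and `softmax` and `argmax` here are my own small helpers, not the TensorFlow functions.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]   # subtracting the max avoids overflow
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=lambda i: values[i])

scores = [2.0, 0.5, -1.0]  # made-up raw scores for one record
probs = softmax(scores)
print(probs, sum(probs))
print(argmax(probs))  # 0
```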

In [21]:
model_tf.evaluate(attributes, labels)

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9737 - loss: 0.2791


[0.36717700958251953, 0.9666666388511658]

### Define and train Logistic Regression with PyTorch
For PyTorch, I need to convert the attributes and labels into `torch.Tensor` objects.

In [22]:
# convert the NumPy arrays into tensors (float32 features, int64 class labels)
attrb_pt = torch.tensor(attributes, dtype = torch.float32)
target_pt = torch.tensor(labels, dtype = torch.long)

In [23]:
n_features = attributes.shape[1]
n_classes = len(set(labels))

Define the model as a class (OOP):

In [24]:
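class LogisticRegressionPT(torch.nn.Module):
  # define the structure of each layer:
  def __init__(self):
    super().__init__()
    self.layer = torch.nn.Linear(n_features, n_classes)

  # define the sequence - what comes after what:
  def forward(self, x):
    # dim = 1 normalizes across the classes of each record
    # (passing dim explicitly avoids the implicit-dimension warning).
    # Note: torch.nn.CrossEntropyLoss expects raw logits and applies
    # log-softmax itself, so applying softmax here limits how far the loss can drop.
    x = torch.nn.functional.softmax(self.layer(x), dim = 1)
    return x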

In [25]:
model_pt = LogisticRegressionPT()

In [26]:
learning_rate = 0.01
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model_pt.parameters(), lr = learning_rate)
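
`torch.nn.CrossEntropyLoss` expects raw scores (logits) and applies log-softmax internally. The sketch below reproduces that computation for one record in plain Python, with made-up scores, to show what happens when the model output has already been passed through softmax. It is an illustration of the effect, not the PyTorch implementation.

```python
import math

def softmax(scores):
    exps = [math.exp(s - max(scores)) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, label):
    """Per-record cross-entropy: -log(softmax(logits)[label])."""
    return -math.log(softmax(logits)[label])

confident_logits = [20.0, 0.0, 0.0]  # made-up scores, very confident in class 0

# Raw logits: the loss can get arbitrarily close to 0.
loss_logits = cross_entropy_from_logits(confident_logits, 0)

# Softmax applied first: the loss function now sees values in [0, 1],
# so even a fully confident prediction keeps a loss above about 0.55.
loss_probs = cross_entropy_from_logits(softmax(confident_logits), 0)

print(round(loss_logits, 4), round(loss_probs, 4))
```

With three classes, the floor in the second case is log(1 + 2/e), roughly 0.551, which may help explain a loss curve that flattens out early.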

In [27]:
def train(model, optimizer, criterion, X, y, num_epochs, train_losses):
  for epoch in range(num_epochs):
    optimizer.zero_grad() # reset the gradients
    output_train = model(X)  # forward pass on the data passed in
    loss_train = criterion(output_train, y) # calculate the loss function
    loss_train.backward() # backward - compute the gradients
    optimizer.step() # update the weights from the gradients, as defined by the optimizer
    train_losses[epoch] = loss_train.item()

    if (epoch + 1) % 50 == 0:
      print(f"Epoch {epoch+1}/{num_epochs}, Loss: {loss_train.item():.4f}")
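
The loop follows the usual pattern: reset gradients, forward pass, loss, backward pass, update. As a framework-free sketch of the same pattern, the example below fits a single weight with plain gradient descent and mean squared error. Both are simplifications I chose for illustration (the notebook itself uses Adam and cross-entropy), and the toy data is made up.

```python
# Toy data that follows y = 2x exactly; the single weight w should approach 2.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = 0.0
learning_rate = 0.01
losses = []

for epoch in range(200):
    grad = 0.0                                   # reset the gradient (zero_grad)
    preds = [w * x for x in xs]                  # forward pass
    loss = sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(xs)  # mean squared error
    grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)  # backward: dloss/dw
    w -= learning_rate * grad                    # parameter update (step)
    losses.append(loss)

print(round(w, 4), losses[0], losses[-1])
```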

In [28]:
num_epochs = 1000
train_losses = np.zeros(num_epochs)
train(model_pt, optimizer, criterion, attrb_pt, target_pt, num_epochs, train_losses)


Epoch 50/1000, Loss: 0.9920
Epoch 100/1000, Loss: 0.9252
Epoch 150/1000, Loss: 0.9071
Epoch 200/1000, Loss: 0.8995
Epoch 250/1000, Loss: 0.8955
Epoch 300/1000, Loss: 0.8930
Epoch 350/1000, Loss: 0.8914
Epoch 400/1000, Loss: 0.8902
Epoch 450/1000, Loss: 0.8894
Epoch 500/1000, Loss: 0.8887
Epoch 550/1000, Loss: 0.8882
Epoch 600/1000, Loss: 0.8878
Epoch 650/1000, Loss: 0.8875
Epoch 700/1000, Loss: 0.8872
Epoch 750/1000, Loss: 0.8870
Epoch 800/1000, Loss: 0.8868
Epoch 850/1000, Loss: 0.8866
Epoch 900/1000, Loss: 0.8864
Epoch 950/1000, Loss: 0.8863
Epoch 1000/1000, Loss: 0.8862


PyTorch is very flexible: I define my own model and my own training function.

In [29]:
# summary of the model:
print(model_pt)

LogisticRegressionPT(
  (layer): Linear(in_features=4, out_features=3, bias=True)
)


In [30]:
# model_pt.forward(attrb_pt) == model_pt(attrb_pt)

In [31]:
!pip install torcheval

Collecting torcheval
  Downloading torcheval-0.0.7-py3-none-any.whl.metadata (8.6 kB)
Downloading torcheval-0.0.7-py3-none-any.whl (179 kB)
Installing collected packages: torcheval
Successfully installed torcheval-0.0.7


In [32]:
from torcheval.metrics.functional import multiclass_accuracy

In [33]:
predictions = torch.argmax(model_pt.forward(attrb_pt), dim = 1)
multiclass_accuracy(predictions, target_pt)


tensor(0.6667)

### PyTorch Lightning
PyTorch Lightning is a high-level library built on top of PyTorch, designed to make creating and training models easier without writing a lot of boilerplate code. It provides structure and automation, for example of the training loop.


## Deep Feed-Forward Neural Network

### TensorFlow

In [34]:
model_deep_tf = Sequential([
    Input(features), # input layer
    Dense(20, activation = "relu"), # hidden layer
    Dense(10, activation = "relu"), # hidden layer
    Dense(class_count, activation = "softmax"), # output layer
])

In [35]:
model_deep_tf.summary()

In [36]:
model_deep_tf.compile(loss = "sparse_categorical_crossentropy", optimizer = "adam", metrics = ["accuracy"])

In [37]:
model_deep_tf.fit(attributes, labels, batch_size = 8, epochs = 100)

Epoch 1/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.4505 - loss: 1.8431
Epoch 2/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6221 - loss: 1.3257
Epoch 3/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6251 - loss: 1.1492
Epoch 4/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6380 - loss: 0.9975
Epoch 5/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7125 - loss: 0.8974
Epoch 6/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6728 - loss: 0.8793
Epoch 7/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.6817 - loss: 0.8314
Epoch 8/100
19/19 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - accuracy: 0.7225 - loss: 0.7782
Epoch 9/100
19/19 ━━━━━━━

<keras.src.callbacks.history.History at 0x7f9638b8b940>

In [38]:
tf.argmax(model_deep_tf.predict(attributes), axis = 1)

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step


<tf.Tensor: shape=(150,), dtype=int64, numpy=
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 2, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])>

In [40]:
model_deep_tf.evaluate(attributes, labels)

5/5 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - accuracy: 0.9820 - loss: 0.0463


[0.07103560864925385, 0.9733333587646484]

### PyTorch

In [None]:
# a separate class name keeps the single-layer LogisticRegressionPT above available
class DeepNetworkPT(torch.nn.Module):
  def __init__(self):
    super().__init__()
    self.layer1 = torch.nn.Linear(n_features, 20)  # hidden layer
    self.layer2 = torch.nn.Linear(20, 10)  # hidden layer
    self.layer3 = torch.nn.Linear(10, n_classes)  # output layer

  def forward(self, x):
    x = torch.nn.functional.relu(self.layer1(x))
    x = torch.nn.functional.relu(self.layer2(x))
    # no activation on the output layer: return raw logits,
    # which is what torch.nn.CrossEntropyLoss expects
    x = self.layer3(x)

    return x