# ADAM Optimizer

ADAM is a common optimizer in machine learning. Its components are defined in a separate file, ADAM.json, which we load here to showcase it.

In [1]:
import json
from pybdp import load_project
from IPython.display import Markdown
from copy import deepcopy
from pprint import pprint

# Load the project JSON from file
with open("ADAM.json", "r") as f:
    project_json = json.load(f)

# Load the project
project = load_project(project_json)

## High Level

At a high level, the ADAM algorithm initializes its state (the moment estimates m and v and the timestep t, alongside theta) and then loops, updating theta.

In [2]:
print("Zoomed out:")
display(Markdown(project.processors_map["ADAM"].create_mermaid_graphic()[0]))
print()
print("Zoomed in:")
display(Markdown(project.processors_map["ADAM"].create_mermaid_graphic_composite()[0]))

Zoomed out:


```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[ADAM - ADAM Block Block]
direction LR
X0[ADAM]
subgraph G0P[Ports]
direction TB
XX0P0[theta]
end
XX0P0[theta] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[theta]
end
X0 o--o XX0T0[theta]
end

```


Zoomed in:


```mermaid
---
config:
    layout: elk
---
graph LR
subgraph GC0[ADAM - ADAM Block Block]
direction LR
subgraph GS0[ADAM System]
subgraph G1[ADAM Initialization - ADAM Initialization Block Block]
direction LR
X1[ADAM Initialization]
subgraph G1P[Ports]
direction TB
XX1P0[theta]
end
XX1P0[theta] o--o X1
subgraph G1T[Terminals]
direction TB
XX1T0[theta]
XX1T1[m]
XX1T2[v]
XX1T3[t]
end
X1 o--o XX1T0[theta]
X1 o--o XX1T1[m]
X1 o--o XX1T2[v]
X1 o--o XX1T3[t]
end
subgraph G2[ADAM Update Loop - ADAM Update Loop Block Block]
direction LR
X2[ADAM Update Loop]
subgraph G2P[Ports]
direction TB
XX2P0[theta]
XX2P1[m]
XX2P2[v]
XX2P3[t]
end
XX2P0[theta] o--o X2
XX2P1[m] o--o X2
XX2P2[v] o--o X2
XX2P3[t] o--o X2
subgraph G2T[Terminals]
direction TB
XX2T0[theta]
end
X2 o--o XX2T0[theta]
end
XX1T0[theta] ---> XX2P0[theta]
XX1T1[m] ---> XX2P1[m]
XX1T2[v] ---> XX2P2[v]
XX1T3[t] ---> XX2P3[t]
end
subgraph GC0P[Ports]
direction TB
X1P0[theta]
end
X1P0[theta] o--o XX1P0[theta]
subgraph GC0T[Terminals]
direction TB
X1T0[theta]
end
XX2T0[theta] o--o X1T0[theta]
end

```
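
The wiring in the zoomed-in diagram can be read as plain function composition. The following is a minimal sketch, not the contents of ADAM.json: the function names, the `step` callable, and the `num_steps` argument are hypothetical, and the zero initialization of m, v, and t follows the standard ADAM formulation rather than anything confirmed by the project file.

```python
from typing import Callable, List, Tuple

# (theta, m, v, t): the four values passed from initialization to the loop
State = Tuple[List[float], List[float], List[float], int]


def adam_initialization(theta: List[float]) -> State:
    """One port (theta) in, four terminals (theta, m, v, t) out."""
    m = [0.0] * len(theta)  # first moment estimate, assumed to start at zero
    v = [0.0] * len(theta)  # second moment estimate, assumed to start at zero
    t = 0                   # timestep counter, assumed to start at zero
    return list(theta), m, v, t


def adam_update_loop(
    theta: List[float],
    m: List[float],
    v: List[float],
    t: int,
    step: Callable[..., State],
    num_steps: int,
) -> List[float]:
    """Four ports (theta, m, v, t) in, one terminal (theta) out."""
    for _ in range(num_steps):
        theta, m, v, t = step(theta, m, v, t)
    return theta


def adam(theta: List[float], step: Callable[..., State], num_steps: int) -> List[float]:
    """Composite block: initialization feeds the update loop."""
    return adam_update_loop(*adam_initialization(theta), step, num_steps)
```

Passing `step` in as an argument keeps this sketch independent of how a single update is computed; a fixed iteration count stands in for whatever stopping rule the real loop uses.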

In [5]:
display(Markdown(project.systems_map["ADAM Update Step System"].create_mermaid_graphic()[0]))

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph GS0[ADAM Update Step System]
subgraph G0[Get Function Gradients - Get Gradients Block]
direction LR
X0[Get Function Gradients]
subgraph G0P[Ports]
direction TB
XX0P0[theta]
end
XX0P0[theta] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[g]
end
X0 o--o XX0T0[g]
end
subgraph G1[Exponential Smoothing First Moment - Update Biased First Moment Block]
direction LR
X1[Exponential Smoothing First Moment]
subgraph G1P[Ports]
direction TB
XX1P0[m]
XX1P1[g]
end
XX1P0[m] o--o X1
XX1P1[g] o--o X1
subgraph G1T[Terminals]
direction TB
XX1T0[m]
end
X1 o--o XX1T0[m]
end
subgraph G2[Exponential Smoothing Second Moment - Update Biased Second Moment Block]
direction LR
X2[Exponential Smoothing Second Moment]
subgraph G2P[Ports]
direction TB
XX2P0[v]
XX2P1[g]
end
XX2P0[v] o--o X2
XX2P1[g] o--o X2
subgraph G2T[Terminals]
direction TB
XX2T0[v]
end
X2 o--o XX2T0[v]
end
subgraph G3[Increment Timestep - Update Timestep Block]
direction LR
X3[Increment Timestep]
subgraph G3P[Ports]
direction TB
XX3P0[t]
end
XX3P0[t] o--o X3
subgraph G3T[Terminals]
direction TB
XX3T0[t]
end
X3 o--o XX3T0[t]
end
subgraph G4[Exponential Decay First Moment Bias Correction - Compute Bias-Corrected First Moment Block]
direction LR
X4[Exponential Decay First Moment Bias Correction]
subgraph G4P[Ports]
direction TB
XX4P0[m]
XX4P1[t]
end
XX4P0[m] o--o X4
XX4P1[t] o--o X4
subgraph G4T[Terminals]
direction TB
XX4T0[m]
end
X4 o--o XX4T0[m]
end
subgraph G5[Exponential Decay Second Moment Bias Correction - Compute Bias-Corrected Second Moment Block]
direction LR
X5[Exponential Decay Second Moment Bias Correction]
subgraph G5P[Ports]
direction TB
XX5P0[v]
XX5P1[t]
end
XX5P0[v] o--o X5
XX5P1[t] o--o X5
subgraph G5T[Terminals]
direction TB
XX5T0[v]
end
X5 o--o XX5T0[v]
end
subgraph G6[ADAM Theta Update - Update Theta Block]
direction LR
X6[ADAM Theta Update]
subgraph G6P[Ports]
direction TB
XX6P0[m]
XX6P1[v]
XX6P2[theta]
end
XX6P0[m] o--o X6
XX6P1[v] o--o X6
XX6P2[theta] o--o X6
subgraph G6T[Terminals]
direction TB
XX6T0[theta]
end
X6 o--o XX6T0[theta]
end
XX4T0[m] ---> XX6P0[m]
XX5T0[v] ---> XX6P1[v]
XX3T0[t] ---> XX4P1[t]
XX3T0[t] ---> XX5P1[t]
XX1T0[m] ---> XX4P0[m]
XX2T0[v] ---> XX5P0[v]
XX0T0[g] ---> XX1P1[g]
XX0T0[g] ---> XX2P1[g]
end

```
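
For readers who prefer code to diagrams, the seven blocks above correspond to the lines of a single ADAM update. This is a hedged sketch using the standard ADAM equations: the hyperparameter names and defaults (`alpha`, `beta1`, `beta2`, `eps`) and the example objective in `get_gradients` are assumptions for illustration and are not read from ADAM.json.

```python
import math
from typing import Callable, List, Tuple


def get_gradients(theta: List[float]) -> List[float]:
    # Hypothetical objective f(theta) = sum(x^2), whose gradient is 2x.
    return [2.0 * x for x in theta]


def adam_update_step(
    theta: List[float],
    m: List[float],
    v: List[float],
    t: int,
    grad_fn: Callable[[List[float]], List[float]] = get_gradients,
    alpha: float = 0.001,   # assumed step size
    beta1: float = 0.9,     # assumed first-moment decay rate
    beta2: float = 0.999,   # assumed second-moment decay rate
    eps: float = 1e-8,      # assumed numerical-stability constant
) -> Tuple[List[float], List[float], List[float], int]:
    # Get Function Gradients: theta -> g
    g = grad_fn(theta)
    # Exponential Smoothing First Moment: (m, g) -> m
    m = [beta1 * mi + (1.0 - beta1) * gi for mi, gi in zip(m, g)]
    # Exponential Smoothing Second Moment: (v, g) -> v
    v = [beta2 * vi + (1.0 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Increment Timestep: t -> t
    t = t + 1
    # Exponential Decay First/Second Moment Bias Correction: (m, t) -> m, (v, t) -> v
    m_hat = [mi / (1.0 - beta1 ** t) for mi in m]
    v_hat = [vi / (1.0 - beta2 ** t) for vi in v]
    # ADAM Theta Update: (m, v, theta) -> theta
    theta = [
        x - alpha * mh / (math.sqrt(vh) + eps)
        for x, mh, vh in zip(theta, m_hat, v_hat)
    ]
    # The uncorrected m and v are returned so the next iteration can smooth them again.
    return theta, m, v, t
```

Calling `adam_update_step([1.0], [0.0], [0.0], 0)` performs one update from the usual zero-initialized state; repeated calls feed each returned tuple back in as the next input.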

In [6]:
project.systems_map["ADAM Update Step System"].make_processor_lazy()

-----Add the following to your JSON-----

Add to blocks:
{'Codomain': ['theta'],
 'Description': 'A lazy loaded composite processor block for ADAM Update Step '
                'System',
 'Domain': ['theta', 'm', 'v', 't', 'theta'],
 'ID': 'ADAM Update Step System-CP Block',
 'Name': 'ADAM Update Step System-CP Block'}

Add to processors:
{'Description': 'A lazy loaded composite processor block for ADAM Update Step '
                'System',
 'ID': 'ADAM Update Step System-CP',
 'Name': 'ADAM Update Step System-CP',
 'Parent': 'ADAM Update Step System-CP Block',
 'Ports': ['theta', 'm', 'v', 't', 'theta'],
 'Subsystem': {'Port Mappings': [{'Index': 0,
                                  'Processor': 'Get Function Gradients'},
                                 {'Index': 0,
                                  'Processor': 'Exponential Smoothing First '
                                               'Moment'},
                                 {'Index': 0,
                                  'Pr