# ADAM Optimizer

ADAM is a common optimizer in machine learning. We have created its components in a separate file, ADAM.json, and use them to showcase it here.

In [1]:
import sys
#sys.path.append("../../pybdp")
#from src import pybdp
import pybdp
from IPython.display import Markdown
from pprint import pprint

# Start with an empty project
project = pybdp.create_empty_project()

## High Level

At a high level, the ADAM algorithm initializes some optimization parameters and then loops, updating theta, the parameters we are optimizing.
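
To make the initialize-then-loop structure concrete, here is a minimal plain-Python sketch of the standard ADAM update on a single scalar parameter. It is independent of pybdp and only illustrates the algorithm being modeled. The gradient function `grad`, the hyperparameters (`lr`, `beta1`, `beta2`, `eps`), the fixed step count, and the toy objective are assumptions made for illustration; none of them appear in the project built in this notebook.

```python
import math

def adam_init(theta):
    """Initialization: pass theta through and create m, v, t (assumed to start at zero)."""
    return theta, 0.0, 0.0, 0

def adam_step(theta, m, v, t, grad, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """One ADAM update; hyperparameter values are illustrative assumptions."""
    t += 1
    g = grad(theta)
    m = beta1 * m + (1 - beta1) * g        # first moment estimate
    v = beta2 * v + (1 - beta2) * g * g    # second moment estimate
    m_hat = m / (1 - beta1 ** t)           # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v, t

def adam(theta, grad, steps=500):
    """theta in, theta out: initialize the state, then loop over updates."""
    theta, m, v, t = adam_init(theta)
    for _ in range(steps):  # a fixed step count stands in for a convergence test
        theta, m, v, t = adam_step(theta, m, v, t, grad)
    return theta

# Toy objective (an assumption): f(x) = (x - 3)^2, so grad(x) = 2 * (x - 3)
result = adam(0.0, lambda x: 2 * (x - 3))
print(result)
```

Note that `adam` takes theta and returns theta, which mirrors the single port and single terminal of the processor defined next.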

In [2]:
# Add the spaces
project.add_space(id = "theta",
                  name = "theta",
                  description = "The model parameters")

# Add the block
project.add_block(id="Parameter Optimization Block",
                  name="Parameter Optimization Block",
                  description="The block for parameter optimization",
                  domain=["theta"],
                  codomain=["theta"],)

# Add the processor
project.add_processor(id="ADAM",
                      name="ADAM",
                      description="The ADAM optimization algorithm",
                      parent_id="Parameter Optimization Block",)


processor = project.processors_map["ADAM"]
processor.display_mermaid_graphic()
print("Ports:")
print(processor.ports)
print("Terminals:")
print(processor.terminals)

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[ADAM - Parameter Optimization Block Block]
direction LR
X0[ADAM]
subgraph G0P[Ports]
direction TB
XX0P0[theta]
end
XX0P0[theta] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[theta]
end
X0 o--o XX0T0[theta]
end

```

Ports:
[< Space ID: theta Name: theta >]
Terminals:
[< Space ID: theta Name: theta >]


## Initialization and Updating

The next step is to break down what happens in our subsystem. There are two components that run sequentially:

1. Initializing the variables for ADAM
2. Running the update loop

We add them below as a subsystem, first creating the system itself.

In [3]:
# Add the spaces
project.add_space(id = "m",
                  name = "m",
                  description = "First moment vector")

project.add_space(id = "v",
                  name = "v",
                  description = "Second moment vector")

project.add_space(id = "t",
                  name = "t",
                  description = "The current timestep")


# Add the block
project.add_block(id="Parameter Initialization",
                  name="Parameter Initialization",
                  description="The block for parameter initialization",
                  domain=["theta"],
                  codomain=["theta", "m", "v", "t"],)

project.add_block(id="Optimization Update Loop",
                  name="Optimization Update Loop",
                  description="The block for the update loop of optimization",
                  domain=["theta", "m", "v", "t"],
                  codomain=["theta"],)



# Add the processor
project.add_processor(id="ADAM Initialization",
                      name="ADAM Initialization",
                      description="Initialiazes the ADAM state variables",
                      parent_id="Parameter Initialization")
project.add_processor(id="ADAM Update Loop",
                      name="ADAM Update Loop",
                      description="Loops through the ADAM updates",
                      parent_id="Optimization Update Loop")
pprint(project.processors_map["ADAM Initialization"].find_potential_wires(project.processors_map["ADAM Update Loop"]))

{'Ports': [{'Parent': 'theta',
            'Source': {'Index': 0, 'Processor': 'ADAM Update Loop'},
            'Target': {'Index': 0, 'Processor': 'ADAM Initialization'}}],
 'Terminals': [{'Parent': 'theta',
                'Source': {'Index': 0, 'Processor': 'ADAM Initialization'},
                'Target': {'Index': 0, 'Processor': 'ADAM Update Loop'}},
               {'Parent': 'm',
                'Source': {'Index': 1, 'Processor': 'ADAM Initialization'},
                'Target': {'Index': 1, 'Processor': 'ADAM Update Loop'}},
               {'Parent': 'v',
                'Source': {'Index': 2, 'Processor': 'ADAM Initialization'},
                'Target': {'Index': 2, 'Processor': 'ADAM Update Loop'}},
               {'Parent': 't',
                'Source': {'Index': 3, 'Processor': 'ADAM Initialization'},
                'Target': {'Index': 3, 'Processor': 'ADAM Update Loop'}}]}
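
The terminal wires listed above pair each terminal of one processor with a port of the other that lives in the same space. The sketch below shows one way such matching could work on plain lists of space names. It is a hypothetical helper written for illustration, not pybdp's implementation of `find_potential_wires`, and the function name `potential_wires` is an assumption.

```python
def potential_wires(source_name, source_terminals, target_name, target_ports):
    """Pair every source terminal with every target port that shares its space.

    Hypothetical helper for illustration only; not part of pybdp.
    """
    wires = []
    for s_idx, space in enumerate(source_terminals):
        for t_idx, port_space in enumerate(target_ports):
            if space == port_space:
                wires.append({
                    "Parent": space,
                    "Source": {"Processor": source_name, "Index": s_idx},
                    "Target": {"Processor": target_name, "Index": t_idx},
                })
    return wires

# Codomain of "Parameter Initialization" and domain of "Optimization Update Loop"
init_terminals = ["theta", "m", "v", "t"]
loop_ports = ["theta", "m", "v", "t"]

wires = potential_wires("ADAM Initialization", init_terminals,
                        "ADAM Update Loop", loop_ports)
for wire in wires:
    print(wire)
```

Because every space occurs once on each side here, each terminal has exactly one candidate port, which is why all four suggested wires can be added unchanged in the next cell.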


In [4]:
project.add_wires([{'Parent': 'theta',
                'Source': {'Index': 0, 'Processor': 'ADAM Initialization'},
                'Target': {'Index': 0, 'Processor': 'ADAM Update Loop'}},
               {'Parent': 'm',
                'Source': {'Index': 1, 'Processor': 'ADAM Initialization'},
                'Target': {'Index': 1, 'Processor': 'ADAM Update Loop'}},
               {'Parent': 'v',
                'Source': {'Index': 2, 'Processor': 'ADAM Initialization'},
                'Target': {'Index': 2, 'Processor': 'ADAM Update Loop'}},
               {'Parent': 't',
                'Source': {'Index': 3, 'Processor': 'ADAM Initialization'},
                'Target': {'Index': 3, 'Processor': 'ADAM Update Loop'}}],
                auto_increment=True)

project.add_system(id="ADAM System",
                   name="ADAM System",
                   processors=["ADAM Initialization", "ADAM Update Loop"],
                   wires=["W1", "W2", "W3", "W4"],
                   description="The system representing the ADAM algorithm",)


processor = project.systems_map["ADAM System"]
processor.display_mermaid_graphic()

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph GS0[ADAM System]
subgraph G0[ADAM Initialization - Parameter Initialization Block]
direction LR
X0[ADAM Initialization]
subgraph G0P[Ports]
direction TB
XX0P0[theta]
end
XX0P0[theta] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[theta]
XX0T1[m]
XX0T2[v]
XX0T3[t]
end
X0 o--o XX0T0[theta]
X0 o--o XX0T1[m]
X0 o--o XX0T2[v]
X0 o--o XX0T3[t]
end
subgraph G1[ADAM Update Loop - Optimization Update Loop Block]
direction LR
X1[ADAM Update Loop]
subgraph G1P[Ports]
direction TB
XX1P0[theta]
XX1P1[m]
XX1P2[v]
XX1P3[t]
end
XX1P0[theta] o--o X1
XX1P1[m] o--o X1
XX1P2[v] o--o X1
XX1P3[t] o--o X1
subgraph G1T[Terminals]
direction TB
XX1T0[theta]
end
X1 o--o XX1T0[theta]
end
XX0T0[theta] ---> XX1P0[theta]
XX0T1[m] ---> XX1P1[m]
XX0T2[v] ---> XX1P2[v]
XX0T3[t] ---> XX1P3[t]
end

```

## Attach Subsystem

In [5]:
possible = project.processors_map["ADAM"].find_potential_subsystems_mappings(project.systems_map["ADAM System"])
pprint(possible)

{'Port Mappings': [[{'Index': 0, 'Processor': 'ADAM Initialization'}]],
 'Terminal Mappings': [[{'Index': 0, 'Processor': 'ADAM Initialization'},
                        {'Index': 0, 'Processor': 'ADAM Update Loop'}]]}


In [6]:
port_mappings = [x[0] for x in possible["Port Mappings"]]
terminal_mappings = [x[1] for x in possible["Terminal Mappings"]]
print("Port Mappings: ", port_mappings)
print("Terminal Mappings: ", terminal_mappings)

project.attach_subsystem(project.processors_map["ADAM"],
                          project.systems_map["ADAM System"],
                          port_mappings,
                          terminal_mappings)
processor = project.processors_map["ADAM"]
processor.display_mermaid_graphic()
processor.display_mermaid_graphic(composite=True)

Port Mappings:  [{'Processor': 'ADAM Initialization', 'Index': 0}]
Terminal Mappings:  [{'Processor': 'ADAM Update Loop', 'Index': 0}]


```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[ADAM - Parameter Optimization Block Block]
direction LR
X0[ADAM]
subgraph G0P[Ports]
direction TB
XX0P0[theta]
end
XX0P0[theta] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[theta]
end
X0 o--o XX0T0[theta]
end

```

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph GC0[ADAM - Parameter Optimization Block Block]
direction LR
subgraph GS0[ADAM System]
subgraph G1[ADAM Initialization - Parameter Initialization Block]
direction LR
X1[ADAM Initialization]
subgraph G1P[Ports]
direction TB
XX1P0[theta]
end
XX1P0[theta] o--o X1
subgraph G1T[Terminals]
direction TB
XX1T0[theta]
XX1T1[m]
XX1T2[v]
XX1T3[t]
end
X1 o--o XX1T0[theta]
X1 o--o XX1T1[m]
X1 o--o XX1T2[v]
X1 o--o XX1T3[t]
end
subgraph G2[ADAM Update Loop - Optimization Update Loop Block]
direction LR
X2[ADAM Update Loop]
subgraph G2P[Ports]
direction TB
XX2P0[theta]
XX2P1[m]
XX2P2[v]
XX2P3[t]
end
XX2P0[theta] o--o X2
XX2P1[m] o--o X2
XX2P2[v] o--o X2
XX2P3[t] o--o X2
subgraph G2T[Terminals]
direction TB
XX2T0[theta]
end
X2 o--o XX2T0[theta]
end
XX1T0[theta] ---> XX2P0[theta]
XX1T1[m] ---> XX2P1[m]
XX1T2[v] ---> XX2P2[v]
XX1T3[t] ---> XX2P3[t]
end
subgraph GC0P[Ports]
direction TB
X1P0[theta]
end
X1P0[theta] --> XX1P0[theta]
subgraph GC0T[Terminals]
direction TB
X1T0[theta]
end
XX2T0[theta] --> X1T0[theta]
end

```

## Update Loop

Now we turn to the update loop, which has three components:

1. A switch that can take in spaces from either outside or inside the processor
2. A convergence criterion that determines when the loop ends
3. An update step that specifies how parameters are updated
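
The control flow formed by these three components can be sketched in plain Python as follows. This is only an illustration of the pattern, not pybdp code. The names `switch` and `run_loop`, the `max_iters` guard, and the toy step and convergence test are all assumptions; in particular, the toy step simply halves theta and stands in for the real ADAM update.

```python
def switch(external, internal):
    """Use the external state on the first pass and the fed-back state afterwards."""
    return external if internal is None else internal

def run_loop(initial_state, step, converged, max_iters=1000):
    """Run switch -> convergence check -> update step until converged."""
    internal = None
    for _ in range(max_iters):  # guard against a loop that never converges
        state = switch(initial_state, internal)
        if converged(state):
            return state
        internal = step(state)  # fed back into the switch on the next pass
    return switch(initial_state, internal)

# Toy stand-ins (assumptions): the step halves theta, convergence looks only at theta
def toy_step(state):
    return {**state, "theta": state["theta"] * 0.5, "t": state["t"] + 1}

def toy_converged(state):
    return abs(state["theta"]) < 1e-3

final = run_loop({"theta": 1.0, "m": 0.0, "v": 0.0, "t": 0}, toy_step, toy_converged)
print(final)
```

Keeping the step and the convergence test as separate callables matches the way they are modeled as separate blocks below, so either one can be swapped out without touching the loop.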

In [7]:
# Add the block
project.add_block(id="Convergence Criteria",
                  name="Convergence Criteria",
                  description="Evaluates whether the optimization process has converged based on moments, timestep, and parameters",
                  domain=["m", "v", "t", "theta"],
                  codomain=["m", "v", "t", "theta"],)

project.add_block(id="Optimization Step",
                  name="Optimization Step",
                  description="One step in the optimization process",
                  domain=["theta",  "m", "v", "t"],
                  codomain=["theta",  "m", "v", "t"],)



# Add the processor
project.add_processor(id="Theta Convergence Criteria",
                      name="Theta Convergence Criteria",
                      description="Convergence criteria based only on theta",
                      parent_id="Convergence Criteria")
project.add_processor(id="ADAM Update Step",
                      name="ADAM Update Step",
                      description="One update step in the ADAM algorithm",
                      parent_id="Optimization Step")