# MNIST BDP V1

This notebook breaks down the problem of applying a supervised learning model to the MNIST dataset. We begin with an empty project and fill it out through the pybdp interface to build an example machine learning block diagram for loading MNIST.

In [None]:
import sys
#sys.path.append("../../pybdp")
#from src import pybdp
import pybdp
from IPython.display import Markdown
from pprint import pprint

# Start with an empty project
project = pybdp.create_empty_project()

## System High Level

At the highest level, there is a processor that takes a model in and returns evaluation metrics. This is our "MNIST Experiment" processor.

We first create the spaces, the block and the processor. Then we display the processor with `display_mermaid_graphic` and inspect its ports and terminals.

In [2]:
# Add the spaces
project.add_space(id = "Model",
                  name = "Model",
                  description = "A machine learning model")
project.add_space(id = "Evaluation Metrics",
                  name = "Evaluation Metrics",
                  description = "Metrics for evaluating the model")

# Add the block
project.add_block(id="Experiment",
                  name="Experiment",
                  description="A machine learning experiment",
                  domain=["Model"],
                  codomain=["Evaluation Metrics"],)

# Add the processor
project.add_processor(id="MNIST Experiment",
                      name="MNIST Experiment",
                      description="Runs experiments on the MNIST dataset",
                      parent_id="Experiment",)


processor = project.processors_map["MNIST Experiment"]
processor.display_mermaid_graphic()
print("Ports:")
print(processor.ports)
print("Terminals:")
print(processor.terminals)

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[MNIST Experiment - Experiment Block]
direction LR
X0[MNIST Experiment]
subgraph G0P[Ports]
direction TB
XX0P0[Model]
end
XX0P0[Model] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[Evaluation Metrics]
end
X0 o--o XX0T0[Evaluation Metrics]
end

```

Ports:
[< Space ID: Model Name: Model >]
Terminals:
[< Space ID: Evaluation Metrics Name: Evaluation Metrics >]
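
The output above shows that the processor's ports match its parent block's domain and its terminals match the codomain. A minimal plain-Python sketch of that relationship follows. The `Block` and `Processor` dataclasses are hypothetical stand-ins written for illustration; they are not pybdp's classes and make no claim about how pybdp is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for a block: a signature of input and output spaces."""
    id: str
    domain: tuple
    codomain: tuple


@dataclass
class Processor:
    """Hypothetical stand-in for a processor that implements a parent block."""
    id: str
    parent: Block

    @property
    def ports(self):
        # Inputs of the processor: one port per space in the block's domain.
        return list(self.parent.domain)

    @property
    def terminals(self):
        # Outputs of the processor: one terminal per space in the block's codomain.
        return list(self.parent.codomain)


experiment = Block("Experiment", ("Model",), ("Evaluation Metrics",))
mnist_experiment = Processor("MNIST Experiment", experiment)

print("Ports:", mnist_experiment.ports)
print("Terminals:", mnist_experiment.terminals)
```

Under this reading, several processors can share one block and therefore one port/terminal signature, which is what makes them interchangeable inside a larger diagram.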


We can also check whether it is a primitive processor or has a subsystem attached. Right now it is primitive because it has no subsystem, but later we will attach one.

In [3]:
print(processor.is_primitive())

True


## Breaking down the Subsystem

The first thing to do is further break down our subsystem into two components, which we will describe first by their blocks.

### Blocks and Spaces

1. **Load Supervised Features Block**: A block for loading datasets which returns X Train, X Test, Y Train and Y Test
2. **Supervised Learning Block**: A block representing supervised learning which given the spaces of Model, X Train, X Test, Y Train and Y Test returns Evaluation Metrics

In [4]:
# Add spaces
project.add_space(id="X Train",
                  name="X Train",
                  description="X training data")

project.add_space(id="Y Train",
                  name="Y Train",
                  description="Y training data")

project.add_space(id="X Test",
                  name="X Test",
                  description="X testing data")

project.add_space(id="Y Test",
                  name="Y Test",
                  description="Y testing data")

# Add the blocks
project.add_block(id="Load Supervised Features",
                  name="Load Supervised Features",
                  description="Block for a composite processor of loading supervised features",
                  domain=[],
                  codomain=["X Train", "Y Train", "X Test", "Y Test"],)

project.add_block(id="Supervised Learning",
                  name="Supervised Learning",
                  description="Block for a composite processor of a supervised learning system with cross validation split",
                  domain=["Model", "X Train", "Y Train", "X Test", "Y Test"],
                  codomain=["Evaluation Metrics"],)

### Processors

Given the two blocks we previously created, we now make the specific processors for MNIST:

1. **Load MNIST**: A processor for loading and pre-processing the MNIST dataset
2. **Default Supervised Learning**: A processor for conducting supervised learning with the model's default settings

In [5]:
# Add processors
project.add_processor(id="Load MNIST",
                      name="Load MNIST",
                      description="A processor for loading and pre-processing the MNIST dataset",
                      parent_id="Load Supervised Features",)

project.add_processor(id="Default Supervised Learning",
                      name="Default Supervised Learning",
                      description="A processor for conducting supervised learning with the model's default settings",
                      parent_id="Supervised Learning",)

# Display processors
project.processors_map["Load MNIST"].display_mermaid_graphic()
project.processors_map["Default Supervised Learning"].display_mermaid_graphic()

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[Load MNIST - Load Supervised Features Block]
direction LR
X0[Load MNIST]
subgraph G0P[Ports]
direction TB
end
subgraph G0T[Terminals]
direction TB
XX0T0[X Train]
XX0T1[Y Train]
XX0T2[X Test]
XX0T3[Y Test]
end
X0 o--o XX0T0[X Train]
X0 o--o XX0T1[Y Train]
X0 o--o XX0T2[X Test]
X0 o--o XX0T3[Y Test]
end

```

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[Default Supervised Learning - Supervised Learning Block]
direction LR
X0[Default Supervised Learning]
subgraph G0P[Ports]
direction TB
XX0P0[Model]
XX0P1[X Train]
XX0P2[Y Train]
XX0P3[X Test]
XX0P4[Y Test]
end
XX0P0[Model] o--o X0
XX0P1[X Train] o--o X0
XX0P2[Y Train] o--o X0
XX0P3[X Test] o--o X0
XX0P4[Y Test] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[Evaluation Metrics]
end
X0 o--o XX0T0[Evaluation Metrics]
end

```

### Wiring up and Building a System

Our next step is to wire up the processors and build a system. We can call `find_potential_wires` on one processor, passing the other, to get every feasible wiring between them.

In [6]:
pprint(project.processors_map["Load MNIST"].find_potential_wires(project.processors_map["Default Supervised Learning"]))

{'Ports': [],
 'Terminals': [{'Parent': 'X Train',
                'Source': {'Index': 0, 'Processor': 'Load MNIST'},
                'Target': {'Index': 1,
                           'Processor': 'Default Supervised Learning'}},
               {'Parent': 'Y Train',
                'Source': {'Index': 1, 'Processor': 'Load MNIST'},
                'Target': {'Index': 2,
                           'Processor': 'Default Supervised Learning'}},
               {'Parent': 'X Test',
                'Source': {'Index': 2, 'Processor': 'Load MNIST'},
                'Target': {'Index': 3,
                           'Processor': 'Default Supervised Learning'}},
               {'Parent': 'Y Test',
                'Source': {'Index': 3, 'Processor': 'Load MNIST'},
                'Target': {'Index': 4,
                           'Processor': 'Default Supervised Learning'}}]}
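
The result above is consistent with a simple rule: a terminal of the source can be wired to a port of the target when both refer to the same space. The sketch below reproduces the `'Terminals'` list under that assumption. The function `potential_wires` is a hypothetical illustration in plain Python, not pybdp's implementation, and the matching rule is inferred from the output rather than taken from the library.

```python
def potential_wires(source_id, source_terminals, target_id, target_ports):
    """Pair each source terminal with every target port of the same space."""
    wires = []
    for source_index, space in enumerate(source_terminals):
        for target_index, port_space in enumerate(target_ports):
            if space == port_space:
                wires.append({
                    "Parent": space,
                    "Source": {"Index": source_index, "Processor": source_id},
                    "Target": {"Index": target_index, "Processor": target_id},
                })
    return wires


wires = potential_wires(
    "Load MNIST",
    ["X Train", "Y Train", "X Test", "Y Test"],
    "Default Supervised Learning",
    ["Model", "X Train", "Y Train", "X Test", "Y Test"],
)

for wire in wires:
    print(wire["Parent"], wire["Source"]["Index"], "->", wire["Target"]["Index"])
```

The `Model` port at index 0 has no matching terminal on `Load MNIST`, so it does not appear in any candidate wire.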


We want to wire up all of these terminal connections, but each wire needs an ID. Instead of assigning IDs manually, we pass `auto_increment=True` so the wires are numbered automatically as W1, W2, etc.

In [7]:
project.add_wires([{'Parent': 'X Train',
                'Source': {'Index': 0, 'Processor': 'Load MNIST'},
                'Target': {'Index': 1,
                           'Processor': 'Default Supervised Learning'}},
               {'Parent': 'Y Train',
                'Source': {'Index': 1, 'Processor': 'Load MNIST'},
                'Target': {'Index': 2,
                           'Processor': 'Default Supervised Learning'}},
               {'Parent': 'X Test',
                'Source': {'Index': 2, 'Processor': 'Load MNIST'},
                'Target': {'Index': 3,
                           'Processor': 'Default Supervised Learning'}},
               {'Parent': 'Y Test',
                'Source': {'Index': 3, 'Processor': 'Load MNIST'},
                'Target': {'Index': 4,
                           'Processor': 'Default Supervised Learning'}}],
                auto_increment=True)

pprint(project.wires)

[< Wire ID: W1 Space: X Train Source: (Load MNIST, 0) Target: (Default Supervised Learning, 1) >,
 < Wire ID: W2 Space: Y Train Source: (Load MNIST, 1) Target: (Default Supervised Learning, 2) >,
 < Wire ID: W3 Space: X Test Source: (Load MNIST, 2) Target: (Default Supervised Learning, 3) >,
 < Wire ID: W4 Space: Y Test Source: (Load MNIST, 3) Target: (Default Supervised Learning, 4) >]
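
The list passed to `add_wires` above is the same as the `'Terminals'` entry returned by `find_potential_wires`, so the two steps can be chained rather than copied by hand. The helper below is a sketch: `wire_all_terminals` is a hypothetical name, and it relies only on the calls and the return shape shown in this notebook. It accepts every candidate, which is only appropriate when each port has a single candidate, as is the case here.

```python
def wire_all_terminals(project, source_id, target_id):
    """Add every candidate terminal-to-port wire from source to target.

    Assumes find_potential_wires returns a dict with a 'Terminals' list,
    as in the output shown earlier in this notebook.
    """
    source = project.processors_map[source_id]
    target = project.processors_map[target_id]
    candidates = source.find_potential_wires(target)["Terminals"]
    # Let the project assign IDs of the form W1, W2, ...
    project.add_wires(candidates, auto_increment=True)
    return candidates
```

In this notebook it would be called as `wire_all_terminals(project, "Load MNIST", "Default Supervised Learning")` in place of the literal list, not in addition to it.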


Now that the wires are in we can create our first system.

In [8]:
project.add_system(id="MNIST Experiment System",
                   name="MNIST Experiment System",
                   processors=["Load MNIST", "Default Supervised Learning"],
                   wires=["W1", "W2", "W3", "W4"],
                   description="The system representing the entire machine learning experiment for the MNIST dataset",)
project.systems_map["MNIST Experiment System"].display_mermaid_graphic()

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph GS0[MNIST Experiment System]
subgraph G0[Load MNIST - Load Supervised Features Block]
direction LR
X0[Load MNIST]
subgraph G0P[Ports]
direction TB
end
subgraph G0T[Terminals]
direction TB
XX0T0[X Train]
XX0T1[Y Train]
XX0T2[X Test]
XX0T3[Y Test]
end
X0 o--o XX0T0[X Train]
X0 o--o XX0T1[Y Train]
X0 o--o XX0T2[X Test]
X0 o--o XX0T3[Y Test]
end
subgraph G1[Default Supervised Learning - Supervised Learning Block]
direction LR
X1[Default Supervised Learning]
subgraph G1P[Ports]
direction TB
XX1P0[Model]
XX1P1[X Train]
XX1P2[Y Train]
XX1P3[X Test]
XX1P4[Y Test]
end
XX1P0[Model] o--o X1
XX1P1[X Train] o--o X1
XX1P2[Y Train] o--o X1
XX1P3[X Test] o--o X1
XX1P4[Y Test] o--o X1
subgraph G1T[Terminals]
direction TB
XX1T0[Evaluation Metrics]
end
X1 o--o XX1T0[Evaluation Metrics]
end
XX0T0[X Train] ---> XX1P1[X Train]
XX0T1[Y Train] ---> XX1P2[Y Train]
XX0T2[X Test] ---> XX1P3[X Test]
XX0T3[Y Test] ---> XX1P4[Y Test]
end

```
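
In the diagram above, every port of `Default Supervised Learning` except `Model` is fed by a wire. A quick way to confirm this is to compare the list of ports against the wire targets. The sketch below does so with plain tuples copied from the wire output earlier in the notebook; `open_ports` is a hypothetical helper written for illustration and is not part of pybdp.

```python
# (wire id, space, (source processor, terminal index), (target processor, port index))
wires = [
    ("W1", "X Train", ("Load MNIST", 0), ("Default Supervised Learning", 1)),
    ("W2", "Y Train", ("Load MNIST", 1), ("Default Supervised Learning", 2)),
    ("W3", "X Test", ("Load MNIST", 2), ("Default Supervised Learning", 3)),
    ("W4", "Y Test", ("Load MNIST", 3), ("Default Supervised Learning", 4)),
]

# Ports of each processor in the system, in index order.
ports = {
    "Load MNIST": [],
    "Default Supervised Learning": ["Model", "X Train", "Y Train", "X Test", "Y Test"],
}


def open_ports(ports, wires):
    """Return (processor, index, space) for every port that no wire targets."""
    wired = {target for (_wire_id, _space, _source, target) in wires}
    return [
        (processor, index, space)
        for processor, spaces in ports.items()
        for index, space in enumerate(spaces)
        if (processor, index) not in wired
    ]


print(open_ports(ports, wires))
```

The only open port is `Model` at index 0 of `Default Supervised Learning`, which is the input that still has to come from outside the system.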

## Adding a Subsystem

Next we take this MNIST Experiment System and attach it as a subsystem of MNIST Experiment.

We use `find_potential_subsystems_mappings` to determine the possible mappings. Note that it returns nested lists: the inner list at each index holds all candidate mappings for the processor's port or terminal at that index.

In [9]:
pprint(project.processors_map["MNIST Experiment"].find_potential_subsystems_mappings(project.systems_map["MNIST Experiment System"]))

{'Port Mappings': [[{'Index': 0, 'Processor': 'Default Supervised Learning'}]],
 'Terminal Mappings': [[{'Index': 0,
                         'Processor': 'Default Supervised Learning'}]]}
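
Because each inner list above holds exactly one candidate, the flat mapping lists can be derived from the result instead of typed by hand. The following sketch works on a literal copy of that output; `choose_unambiguous` is a hypothetical helper, and it deliberately raises when a slot has zero or several candidates so that ambiguous cases are still decided manually.

```python
# Literal copy of the output shown above.
potential = {
    "Port Mappings": [[{"Index": 0, "Processor": "Default Supervised Learning"}]],
    "Terminal Mappings": [[{"Index": 0, "Processor": "Default Supervised Learning"}]],
}


def choose_unambiguous(options_per_slot):
    """Pick the single candidate for each port/terminal index, or fail loudly."""
    chosen = []
    for slot, options in enumerate(options_per_slot):
        if len(options) != 1:
            raise ValueError(
                f"slot {slot} has {len(options)} candidates; choose one manually"
            )
        chosen.append(options[0])
    return chosen


port_mappings = choose_unambiguous(potential["Port Mappings"])
terminal_mappings = choose_unambiguous(potential["Terminal Mappings"])
print(port_mappings)
print(terminal_mappings)
```

The resulting lists have one entry per port or terminal of the outer processor, in index order.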


Now we attach the subsystem by passing in the processor, the system, and the port and terminal mappings we have chosen.

In [10]:
# Attach the subsystem
port_mappings = [{'Index': 0, 'Processor': 'Default Supervised Learning'}]
terminal_mappings = [{'Index': 0, 'Processor': 'Default Supervised Learning'}]
project.attach_subsystem(project.processors_map["MNIST Experiment"],
                          project.systems_map["MNIST Experiment System"],
                          port_mappings,
                          terminal_mappings)

# Display the processor with and without the subsystem
project.processors_map["MNIST Experiment"].display_mermaid_graphic()
project.processors_map["MNIST Experiment"].display_mermaid_graphic(composite=True)

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[MNIST Experiment - Experiment Block]
direction LR
X0[MNIST Experiment]
subgraph G0P[Ports]
direction TB
XX0P0[Model]
end
XX0P0[Model] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[Evaluation Metrics]
end
X0 o--o XX0T0[Evaluation Metrics]
end

```

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph GC0[MNIST Experiment - Experiment Block]
direction LR
subgraph GS0[MNIST Experiment System]
subgraph G1[Load MNIST - Load Supervised Features Block]
direction LR
X1[Load MNIST]
subgraph G1P[Ports]
direction TB
end
subgraph G1T[Terminals]
direction TB
XX1T0[X Train]
XX1T1[Y Train]
XX1T2[X Test]
XX1T3[Y Test]
end
X1 o--o XX1T0[X Train]
X1 o--o XX1T1[Y Train]
X1 o--o XX1T2[X Test]
X1 o--o XX1T3[Y Test]
end
subgraph G2[Default Supervised Learning - Supervised Learning Block]
direction LR
X2[Default Supervised Learning]
subgraph G2P[Ports]
direction TB
XX2P0[Model]
XX2P1[X Train]
XX2P2[Y Train]
XX2P3[X Test]
XX2P4[Y Test]
end
XX2P0[Model] o--o X2
XX2P1[X Train] o--o X2
XX2P2[Y Train] o--o X2
XX2P3[X Test] o--o X2
XX2P4[Y Test] o--o X2
subgraph G2T[Terminals]
direction TB
XX2T0[Evaluation Metrics]
end
X2 o--o XX2T0[Evaluation Metrics]
end
XX1T0[X Train] ---> XX2P1[X Train]
XX1T1[Y Train] ---> XX2P2[Y Train]
XX1T2[X Test] ---> XX2P3[X Test]
XX1T3[Y Test] ---> XX2P4[Y Test]
end
subgraph GC0P[Ports]
direction TB
X1P0[Model]
end
X1P0[Model] --> XX2P0[Model]
subgraph GC0T[Terminals]
direction TB
X1T0[Evaluation Metrics]
end
XX2T0[Evaluation Metrics] --> X1T0[Evaluation Metrics]
end

```

By calling `get_system` on the processor and reading the system's `processors`, we can retrieve the two inner components.

In [11]:
processor = project.processors_map["MNIST Experiment"]
components = processor.get_system().processors
pprint(components)

[< Processor ID: Load MNIST Name: Load MNIST []->['X Train', 'Y Train', 'X Test', 'Y Test']>,
 < Processor ID: Default Supervised Learning Name: Default Supervised Learning ['Model', 'X Train', 'Y Train', 'X Test', 'Y Test']->['Evaluation Metrics']>]


Display the components and check if they are primitive. In this notebook we have not yet added any details to them so both are primitive.

In [12]:
for component in components:
    print("Processor: {}".format(component.name))
    print("Is primitive? - {}".format(component.is_primitive()))
    print()
    
    display(Markdown(component.create_mermaid_graphic()[0]))

Processor: Load MNIST
Is primitive? - True



```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[Load MNIST - Load Supervised Features Block]
direction LR
X0[Load MNIST]
subgraph G0P[Ports]
direction TB
end
subgraph G0T[Terminals]
direction TB
XX0T0[X Train]
XX0T1[Y Train]
XX0T2[X Test]
XX0T3[Y Test]
end
X0 o--o XX0T0[X Train]
X0 o--o XX0T1[Y Train]
X0 o--o XX0T2[X Test]
X0 o--o XX0T3[Y Test]
end

```

Processor: Default Supervised Learning
Is primitive? - True



```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[Default Supervised Learning - Supervised Learning Block]
direction LR
X0[Default Supervised Learning]
subgraph G0P[Ports]
direction TB
XX0P0[Model]
XX0P1[X Train]
XX0P2[Y Train]
XX0P3[X Test]
XX0P4[Y Test]
end
XX0P0[Model] o--o X0
XX0P1[X Train] o--o X0
XX0P2[Y Train] o--o X0
XX0P3[X Test] o--o X0
XX0P4[Y Test] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[Evaluation Metrics]
end
X0 o--o XX0T0[Evaluation Metrics]
end

```
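
The loop above inspects one level of nesting. Once components gain subsystems of their own, the same two calls (`is_primitive()` and `get_system().processors`) can be applied recursively to collect every primitive processor under a composite one. The generator below is a sketch under that assumption; `iter_primitives` is a hypothetical helper and assumes that `is_primitive()` returns `False` once a subsystem is attached.

```python
def iter_primitives(processor):
    """Yield every primitive processor reachable from the given processor."""
    if processor.is_primitive():
        yield processor
        return
    # Composite processor: descend into the processors of its subsystem.
    for child in processor.get_system().processors:
        yield from iter_primitives(child)
```

For the project built here, `[p.name for p in iter_primitives(project.processors_map["MNIST Experiment"])]` would be expected to list Load MNIST and Default Supervised Learning.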

## Saving

Now that we have created this project, we can save it as JSON to load another time!

In [13]:
project.save("../JSON/Supervised Learning V1.json")
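
Since the project is written as JSON, the saved file can be inspected with the standard library alone. The sketch below reports the top-level keys of a JSON file and how many entries each holds. `summarize_json` is a hypothetical helper, and nothing here assumes a particular layout for the file pybdp writes beyond it being valid JSON.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Map each top-level key to its entry count, or to its type name for scalars."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if not isinstance(data, dict):
        # The root is not an object; report only its type.
        return {"<root>": type(data).__name__}
    return {
        key: len(value) if isinstance(value, (list, dict)) else type(value).__name__
        for key, value in data.items()
    }
```

Calling `summarize_json("../JSON/Supervised Learning V1.json")` after the save gives a quick check that the file was written and is not empty.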