# MNIST BDP

This notebook breaks down the problem of applying a supervised learning model to the MNIST dataset. We begin with an empty project and fill it out through the pybdp interface to build an example machine learning block diagram for loading MNIST.

In [1]:
import sys
sys.path.append("../../pybdp")
from src import pybdp
from IPython.display import Markdown, display
from pprint import pprint

# Start with an empty project
project = pybdp.create_empty_project()

## System High Level

At the highest level, there is a processor that takes a model as input and returns evaluation metrics. This is our "MNIST Experiment" processor.

We first create the spaces, block, and processor, then display the processor with `display_mermaid_graphic` and inspect its ports and terminals.

In [2]:
# Add the spaces
project.add_space(id = "Model",
                  name = "Model",
                  description = "A machine learning model")
project.add_space(id = "Evaluation Metrics",
                  name = "Evaluation Metrics",
                  description = "Metrics for evaluating the model")

# Add the block
project.add_block(id="Experiment",
                  name="Experiment",
                  description="A machine learning experiment",
                  domain=["Model"],
                  codomain=["Evaluation Metrics"],)

# Add the processor
project.add_processor(id="MNIST Experiment",
                      name="MNIST Experiment",
                      description="Runs experiments on the MNIST dataset",
                      parent_id="Experiment",)


processor = project.processors_map["MNIST Experiment"]
processor.display_mermaid_graphic()
print("Ports:")
print(processor.ports)
print("Terminals:")
print(processor.terminals)

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[MNIST Experiment - Experiment Block]
direction LR
X0[MNIST Experiment]
subgraph G0P[Ports]
direction TB
XX0P0[Model]
end
XX0P0[Model] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[Evaluation Metrics]
end
X0 o--o XX0T0[Evaluation Metrics]
end

```

Ports:
[< Space ID: Model Name: Model >]
Terminals:
[< Space ID: Evaluation Metrics Name: Evaluation Metrics >]
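The relationship between a block and its processors can be pictured with a small stand-alone model. This is a sketch in plain Python, not pybdp's implementation: the `Block` and `Processor` dataclasses and their fields are hypothetical names. The only idea taken from the example above is that a processor's ports and terminals follow its parent block's domain and codomain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for a block: a signature over spaces."""
    id: str
    domain: tuple    # IDs of the spaces the block consumes
    codomain: tuple  # IDs of the spaces the block produces


@dataclass(frozen=True)
class Processor:
    """Hypothetical stand-in for a processor: a concrete instance of a block."""
    id: str
    parent: Block

    @property
    def ports(self):
        # Ports mirror the parent block's domain
        return list(self.parent.domain)

    @property
    def terminals(self):
        # Terminals mirror the parent block's codomain
        return list(self.parent.codomain)


experiment = Block("Experiment", ("Model",), ("Evaluation Metrics",))
mnist_experiment = Processor("MNIST Experiment", experiment)

print("Ports:", mnist_experiment.ports)
print("Terminals:", mnist_experiment.terminals)
```

Keeping the signature on the block means several processors can share one parent and stay interchangeable wherever that block is expected.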


We can also check whether it is a primitive processor or has a subsystem attached. Right now it is primitive because there is no subsystem, but eventually we will transform it to have one.

In [3]:
print(processor.is_primitive())

True


## Breaking down the Subsystem

The first step is to break the subsystem down into two components, which we describe first by their blocks.

### Blocks and Spaces

1. **Load Supervised Features Block**: A block for loading datasets, which returns X Train, Y Train, X Test and Y Test
2. **Supervised Learning Block**: A block representing supervised learning which, given the spaces Model, X Train, Y Train, X Test and Y Test, returns Evaluation Metrics
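One way to read these two blocks is as function signatures: the domain lists the argument types and the codomain lists the return types. The sketch below expresses that reading with Python type aliases. The alias names, the `compose` helper and the toy functions are hypothetical and are not part of pybdp.

```python
from typing import Any, Callable, Tuple

# Placeholder types for the spaces; a real project would use concrete types
Model = Any
Array = Any
Metrics = dict

# Load Supervised Features: () -> (X Train, Y Train, X Test, Y Test)
LoadSupervisedFeatures = Callable[[], Tuple[Array, Array, Array, Array]]

# Supervised Learning: (Model, X Train, Y Train, X Test, Y Test) -> Evaluation Metrics
SupervisedLearning = Callable[[Model, Array, Array, Array, Array], Metrics]

# Experiment: Model -> Evaluation Metrics
Experiment = Callable[[Model], Metrics]


def compose(load: LoadSupervisedFeatures, learn: SupervisedLearning) -> Experiment:
    """Build an experiment by feeding the loader's outputs into the learner."""
    def experiment(model: Model) -> Metrics:
        x_train, y_train, x_test, y_test = load()
        return learn(model, x_train, y_train, x_test, y_test)
    return experiment


# Toy implementations, used only to show that the signatures line up
def toy_load():
    return [0, 1], [0, 1], [2], [0]


def toy_learn(model, x_train, y_train, x_test, y_test):
    return {"n_train": len(x_train), "n_test": len(x_test)}


run = compose(toy_load, toy_learn)
print(run(None))
```

Composed this way, the pair has the same overall signature as the Experiment block: a model goes in and evaluation metrics come out.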

In [4]:
# Add spaces
project.add_space(id="X Train",
                  name="X Train",
                  description="X training data")

project.add_space(id="Y Train",
                  name="Y Train",
                  description="Y training data")

project.add_space(id="X Test",
                  name="X Test",
                  description="X testing data")

project.add_space(id="Y Test",
                  name="Y Test",
                  description="Y testing data")

# Add the blocks
project.add_block(id="Load Supervised Features",
                  name="Load Supervised Features",
                  description="Block for a composite processor of loading supervised features",
                  domain=[],
                  codomain=[
          "X Train",
          "Y Train",
          "X Test",
          "Y Test"
        ],)

project.add_block(id="Supervised Learning",
                  name="Supervised Learning",
                  description="Block for a composite processor of a supervised learning system with cross validation split",
                  domain=[
          "Model",
          "X Train",
          "Y Train",
          "X Test",
          "Y Test"
        ],
                  codomain=["Evaluation Metrics"],)

### Processors

Given the two blocks we previously created, we now make the specific processors for MNIST:

1. **Load MNIST**: A processor for loading and pre-processing the MNIST dataset
2. **Default Supervised Learning**: A processor for conducting supervised learning using the defaults of the model

In [5]:
# Add processors
project.add_processor(id="Load MNIST",
                      name="Load MNIST",
                      description="A processor for loading and pre-processing the MNIST dataset",
                      parent_id="Load Supervised Features",)

project.add_processor(id="Default Supervised Learning",
                      name="Default Supervised Learning",
                      description="A processor for conducting default supervised learning based with defaults of the model",
                      parent_id="Supervised Learning",)

# Display processors
project.processors_map["Load MNIST"].display_mermaid_graphic()
project.processors_map["Default Supervised Learning"].display_mermaid_graphic()

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[Load MNIST - Load Supervised Features Block]
direction LR
X0[Load MNIST]
subgraph G0P[Ports]
direction TB
end
subgraph G0T[Terminals]
direction TB
XX0T0[X Train]
XX0T1[Y Train]
XX0T2[X Test]
XX0T3[Y Test]
end
X0 o--o XX0T0[X Train]
X0 o--o XX0T1[Y Train]
X0 o--o XX0T2[X Test]
X0 o--o XX0T3[Y Test]
end

```

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[Default Supervised Learning - Supervised Learning Block]
direction LR
X0[Default Supervised Learning]
subgraph G0P[Ports]
direction TB
XX0P0[Model]
XX0P1[X Train]
XX0P2[Y Train]
XX0P3[X Test]
XX0P4[Y Test]
end
XX0P0[Model] o--o X0
XX0P1[X Train] o--o X0
XX0P2[Y Train] o--o X0
XX0P3[X Test] o--o X0
XX0P4[Y Test] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[Evaluation Metrics]
end
X0 o--o XX0T0[Evaluation Metrics]
end

```

### Wiring up and Building a System

Our next step is to wire up and build a system. One thing we can do is call `find_potential_wires` between the two processors, which returns all feasible potential wirings.

In [6]:
project.processors_map["Load MNIST"].find_potential_wires(project.processors_map["Default Supervised Learning"])

{'Ports': [], 'Terminals': []}
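The idea of a feasible wiring can be sketched independently of pybdp, under the assumption that a terminal can feed a port when both carry the same space. The `potential_wires` function below is a hypothetical illustration of that matching rule only. It is not the library's `find_potential_wires`, and its result format is an assumption rather than a reproduction of the output shown above.

```python
def potential_wires(source_terminals, target_ports):
    """Pair each terminal with every port that carries the same space.

    Returns a list of (terminal_index, port_index, space) triples.
    """
    return [
        (t_idx, p_idx, space)
        for t_idx, space in enumerate(source_terminals)
        for p_idx, port_space in enumerate(target_ports)
        if space == port_space
    ]


# Space IDs taken from the two blocks defined earlier
load_terminals = ["X Train", "Y Train", "X Test", "Y Test"]
learning_ports = ["Model", "X Train", "Y Train", "X Test", "Y Test"]

wires = potential_wires(load_terminals, learning_ports)
for wire in wires:
    print(wire)
```

Under this rule the `Model` port has no matching terminal, which is consistent with `Model` remaining an external port of the enclosing experiment.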

In [7]:
{
                    "Description": "Runs experiments on the MNIST dataset",
                    "ID": "MNIST Experiment",
                    "Name": "MNIST Experiment",
                    "Parent": "Experiment",
                    "Ports": [
                      "Model"
                    ],
                    "Subsystem": {
                      "Port Mappings": [
                        {
                          "Index": 0,
                          "Processor": "Default Supervised Learning"
                        }
                      ],
                      "System ID": "MNIST Experiment System",
                      "Terminal Mappings": [
                        {
                          "Index": 0,
                          "Processor": "Default Supervised Learning"
                        }
                      ]
                    },
                    "Terminals": [
                      "Evaluation Metrics"
                    ]
                  },

({'Description': 'Runs experiments on the MNIST dataset',
  'ID': 'MNIST Experiment',
  'Name': 'MNIST Experiment',
  'Parent': 'Experiment',
  'Ports': ['Model'],
  'Subsystem': {'Port Mappings': [{'Index': 0,
     'Processor': 'Default Supervised Learning'}],
   'System ID': 'MNIST Experiment System',
   'Terminal Mappings': [{'Index': 0,
     'Processor': 'Default Supervised Learning'}]},
  'Terminals': ['Evaluation Metrics']},)

In [8]:
{
        "Description": "Load MNIST",
        "ID": "Load MNIST",
        "Name": "Load MNIST",
        "Parent": "Load Supervised Features",
        "Ports": [],
        "Subsystem": {
          "Port Mappings": [],
          "System ID": "Load MNIST System",
          "Terminal Mappings": [
            {
              "Index": 0,
              "Processor": "Image Normalization Preprocessing - Training"
            },
            {
              "Index": 1,
              "Processor": "Image Normalization Preprocessing - Training"
            },
            {
              "Index": 0,
              "Processor": "Image Normalization Preprocessing - Testing"
            },
            {
              "Index": 1,
              "Processor": "Image Normalization Preprocessing - Testing"
            }
          ]
        },
        "Terminals": [
          "X Train Array",
          "Y Train Array",
          "X Test Array",
          "Y Test Array"
        ]
      },


{
        "Description": "Conducts supervised learning using the defaults of the model",
        "ID": "Default Supervised Learning",
        "Name": "Default Supervised Learning",
        "Parent": "Supervised Learning",
        "Ports": [
          "Model",
          "X Train Array",
          "Y Train Array",
          "X Test Array",
          "Y Test Array"
        ],
        "Subsystem": {
          "Port Mappings": [
            {
              "Index": 0,
              "Processor": "Fit Supervised Model - Default"
            },
            {
              "Index": 1,
              "Processor": "Fit Supervised Model - Default"
            },
            {
              "Index": 2,
              "Processor": "Fit Supervised Model - Default"
            },
            {
              "Index": 1,
              "Processor": "Evaluate Supervised Model - Default"
            },
            {
              "Index": 2,
              "Processor": "Evaluate Supervised Model - Default"
            }
          ],
          "System ID": "Default Supervised Learning System",
          "Terminal Mappings": [
            {
              "Index": 0,
              "Processor": "No Post-processing"
            }
          ]
        },
        "Terminals": ["Evaluation Metrics"]
      },

({'Description': 'Conducts supervised learning using the defaults of the model',
  'ID': 'Default Supervised Learning',
  'Name': 'Default Supervised Learning',
  'Parent': 'Supervised Learning',
  'Ports': ['Model',
   'X Train Array',
   'Y Train Array',
   'X Test Array',
   'Y Test Array'],
  'Subsystem': {'Port Mappings': [{'Index': 0,
     'Processor': 'Fit Supervised Model - Default'},
    {'Index': 1, 'Processor': 'Fit Supervised Model - Default'},
    {'Index': 2, 'Processor': 'Fit Supervised Model - Default'},
    {'Index': 1, 'Processor': 'Evaluate Supervised Model - Default'},
    {'Index': 2, 'Processor': 'Evaluate Supervised Model - Default'}],
   'System ID': 'Default Supervised Learning System',
   'Terminal Mappings': [{'Index': 0, 'Processor': 'No Post-processing'}]},
  'Terminals': ['Evaluation Metrics']},)

In [9]:
processor = project.processors_map["MNIST Experiment"]
display(Markdown(processor.create_mermaid_graphic()[0]))
print("Ports:")
print(processor.ports)
print("Terminals:")
print(processor.terminals)

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[MNIST Experiment - Experiment Block]
direction LR
X0[MNIST Experiment]
subgraph G0P[Ports]
direction TB
XX0P0[Model]
end
XX0P0[Model] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[Evaluation Metrics]
end
X0 o--o XX0T0[Evaluation Metrics]
end

```

Ports:
[< Space ID: Model Name: Model >]
Terminals:
[< Space ID: Evaluation Metrics Name: Evaluation Metrics >]


## Breaking it down into data and model fitting/evaluation

This processor has a subsystem, however, as evidenced by the fact that it is not primitive.

The `create_mermaid_graphic_composite` function lets us peek into the subsystem it represents.

In [10]:
display(Markdown(processor.create_mermaid_graphic_composite()[0]))

```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[MNIST Experiment - Experiment Block]
direction LR
X0[MNIST Experiment]
subgraph G0P[Ports]
direction TB
XX0P0[Model]
end
XX0P0[Model] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[Evaluation Metrics]
end
X0 o--o XX0T0[Evaluation Metrics]
end

```

By calling `get_system` and grabbing its processors, we can find the two inner components.

In [11]:
components = processor.get_system().processors
pprint(components)

AttributeError: 'NoneType' object has no attribute 'processors'
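The error arises because `get_system()` returned `None` here, so there is no `processors` attribute to read. A defensive way to fetch the components is sketched below, assuming that a processor without a subsystem returns `None` from `get_system()`, as the traceback suggests. The `inner_components` helper and the stub classes are hypothetical and exist only to make the example runnable without pybdp.

```python
def inner_components(processor):
    """Return the subsystem's processors, or an empty list if there is no subsystem."""
    system = processor.get_system()
    if system is None:
        return []
    return list(system.processors)


# Minimal stubs standing in for pybdp objects (assumed shapes, for illustration)
class StubSystem:
    def __init__(self, processors):
        self.processors = processors


class StubProcessor:
    def __init__(self, name, system=None):
        self.name = name
        self._system = system

    def get_system(self):
        return self._system


primitive = StubProcessor("Load MNIST")
composite = StubProcessor("MNIST Experiment", StubSystem([primitive]))

print(inner_components(primitive))
print([p.name for p in inner_components(composite)])
```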

Display the components and check if they are primitive.

In [None]:
for component in components:
    print("Processor: {}".format(component.name))
    print("Is primitive? - {}".format(component.is_primitive()))
    print()
    
    display(Markdown(component.create_mermaid_graphic()[0]))

Processor: Load MNIST
Is primitive? - False



```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[Load MNIST - Load Supervised Features Block]
direction LR
X0[Load MNIST]
subgraph G0P[Ports]
direction TB
end
subgraph G0T[Terminals]
direction TB
XX0T0[X Train Array]
XX0T1[Y Train Array]
XX0T2[X Test Array]
XX0T3[Y Test Array]
end
X0 o--o XX0T0[X Train Array]
X0 o--o XX0T1[Y Train Array]
X0 o--o XX0T2[X Test Array]
X0 o--o XX0T3[Y Test Array]
end

```

Processor: Default Supervised Learning
Is primitive? - False



```mermaid
---
config:
    layout: elk
---
graph LR
subgraph G0[Default Supervised Learning - Supervised Learning Block]
direction LR
X0[Default Supervised Learning]
subgraph G0P[Ports]
direction TB
XX0P0[Model]
XX0P1[X Train Array]
XX0P2[Y Train Array]
XX0P3[X Test Array]
XX0P4[Y Test Array]
end
XX0P0[Model] o--o X0
XX0P1[X Train Array] o--o X0
XX0P2[Y Train Array] o--o X0
XX0P3[X Test Array] o--o X0
XX0P4[Y Test Array] o--o X0
subgraph G0T[Terminals]
direction TB
XX0T0[Evaluation Metrics]
end
X0 o--o XX0T0[Evaluation Metrics]
end

```

## One Level Deeper

Since neither is primitive, we can also show their subsystems. Note that nesting can be arbitrarily deep, but for this block diagram this is the final level of recursion.
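Walking a nested structure like this is naturally recursive. The sketch below relies only on the calls that appear in this notebook (`is_primitive()`, `get_system()`, `.processors`, `.name`), but it runs against hypothetical stub classes rather than pybdp. The `walk` helper, the stubs and the abbreviated tree are assumptions for illustration.

```python
def walk(processor, depth=0):
    """Return (depth, name) pairs for a processor and everything nested inside it."""
    rows = [(depth, processor.name)]
    if not processor.is_primitive():
        for child in processor.get_system().processors:
            rows.extend(walk(child, depth + 1))
    return rows


# Stubs that mimic the small part of the interface used by walk()
class StubSystem:
    def __init__(self, processors):
        self.processors = processors


class StubProcessor:
    def __init__(self, name, children=None):
        self.name = name
        self._system = StubSystem(children) if children else None

    def is_primitive(self):
        return self._system is None

    def get_system(self):
        return self._system


# An abbreviated tree using a few processor names from this project
tree = StubProcessor("MNIST Experiment", [
    StubProcessor("Load MNIST", [
        StubProcessor("Load MNIST Dataset"),
        StubProcessor("Test-Train Split"),
    ]),
    StubProcessor("Default Supervised Learning", [
        StubProcessor("Fit Supervised Model - Default"),
        StubProcessor("Evaluate Supervised Model - Default"),
    ]),
])

rows = walk(tree)
for depth, name in rows:
    print("  " * depth + name)
```

The recursion stops at primitive processors, so the traversal ends whenever every branch bottoms out.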

In [None]:
for component in components:
    print("Processor: {}".format(component.name))
    print()
    
    display(Markdown(component.create_mermaid_graphic_composite()[0]))

Processor: Load MNIST



```mermaid
---
config:
    layout: elk
---
graph LR
subgraph GC0[Load MNIST - Load Supervised Features Block]
direction LR
subgraph GS0[Load MNIST System]
subgraph G1[Load MNIST Dataset - Load Supervised Dataset Block]
direction LR
X1[Load MNIST Dataset]
subgraph G1P[Ports]
direction TB
end
subgraph G1T[Terminals]
direction TB
XX1T0[X]
XX1T1[Y]
end
X1 o--o XX1T0[X]
X1 o--o XX1T1[Y]
end
subgraph G2[Test-Train Split - Cross Validation Split Block]
direction LR
X2[Test-Train Split]
subgraph G2P[Ports]
direction TB
XX2P0[X]
XX2P1[Y]
end
XX2P0[X] o--o X2
XX2P1[Y] o--o X2
subgraph G2T[Terminals]
direction TB
XX2T0[X Train Array]
XX2T1[Y Train Array]
XX2T2[X Test Array]
XX2T3[Y Test Array]
end
X2 o--o XX2T0[X Train Array]
X2 o--o XX2T1[Y Train Array]
X2 o--o XX2T2[X Test Array]
X2 o--o XX2T3[Y Test Array]
end
subgraph G3[Image Normalization Preprocessing - Training - Training Data Preprocessing Block]
direction LR
X3[Image Normalization Preprocessing - Training]
subgraph G3P[Ports]
direction TB
XX3P0[X Train Array]
XX3P1[Y Train Array]
end
XX3P0[X Train Array] o--o X3
XX3P1[Y Train Array] o--o X3
subgraph G3T[Terminals]
direction TB
XX3T0[X Train Array]
XX3T1[Y Train Array]
end
X3 o--o XX3T0[X Train Array]
X3 o--o XX3T1[Y Train Array]
end
subgraph G4[Image Normalization Preprocessing - Testing - Testing Data Preprocessing Block]
direction LR
X4[Image Normalization Preprocessing - Testing]
subgraph G4P[Ports]
direction TB
XX4P0[X Train Array]
XX4P1[Y Train Array]
XX4P2[X Test Array]
XX4P3[Y Test Array]
end
XX4P0[X Train Array] o--o X4
XX4P1[Y Train Array] o--o X4
XX4P2[X Test Array] o--o X4
XX4P3[Y Test Array] o--o X4
subgraph G4T[Terminals]
direction TB
XX4T0[X Test Array]
XX4T1[Y Test Array]
end
X4 o--o XX4T0[X Test Array]
X4 o--o XX4T1[Y Test Array]
end
XX1T0[X] ---> XX2P0[X]
XX1T1[Y] ---> XX2P1[Y]
XX2T0[X Train Array] ---> XX3P0[X Train Array]
XX2T1[Y Train Array] ---> XX3P1[Y Train Array]
XX2T0[X Train Array] ---> XX4P0[X Train Array]
XX2T1[Y Train Array] ---> XX4P1[Y Train Array]
XX2T2[X Test Array] ---> XX4P2[X Test Array]
XX2T3[Y Test Array] ---> XX4P3[Y Test Array]
end
subgraph GC0P[Ports]
direction TB
end
subgraph GC0T[Terminals]
direction TB
X1T0[X Train Array]
X1T1[Y Train Array]
X1T2[X Test Array]
X1T3[Y Test Array]
end
XX3T0[X Train Array] o--o X1T0[X Train Array]
XX3T1[Y Train Array] o--o X1T1[Y Train Array]
XX4T0[X Test Array] o--o X1T2[X Test Array]
XX4T1[Y Test Array] o--o X1T3[Y Test Array]
end

```
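The data flow in the Load MNIST diagram can be mirrored by plain function composition. The sketch below is an assumption-laden illustration, not the project's implementation. The toy dataset, the function names and the normalization rule (dividing by the largest training pixel value) are all hypothetical. What it does take from the diagram is the wiring: the testing preprocessor receives the raw training arrays as well as the test arrays.

```python
def load_dataset():
    """Toy stand-in for loading MNIST: four tiny 'images' with labels."""
    x = [[0, 128, 255], [64, 64, 64], [255, 0, 0], [10, 20, 30]]
    y = [0, 1, 0, 1]
    return x, y


def split_train_test(x, y, n_train):
    """Split by position; a real split would usually shuffle first."""
    return x[:n_train], y[:n_train], x[n_train:], y[n_train:]


def normalize_training(x_train, y_train):
    """Scale training pixels by the largest training pixel value (assumed rule)."""
    scale = max(max(row) for row in x_train)
    return [[v / scale for v in row] for row in x_train], y_train


def normalize_testing(x_train, y_train, x_test, y_test):
    """Scale test pixels with the statistic computed on the raw training data."""
    scale = max(max(row) for row in x_train)
    return [[v / scale for v in row] for row in x_test], y_test


def load_features():
    """Compose the four steps in the order the wires connect them."""
    x, y = load_dataset()
    x_train, y_train, x_test, y_test = split_train_test(x, y, n_train=3)
    x_train_n, y_train_n = normalize_training(x_train, y_train)
    x_test_n, y_test_n = normalize_testing(x_train, y_train, x_test, y_test)
    return x_train_n, y_train_n, x_test_n, y_test_n


x_train_n, y_train_n, x_test_n, y_test_n = load_features()
print(len(x_train_n), len(x_test_n))
```

Passing the training arrays into the testing step is one way to make sure the test data is scaled with training statistics only.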

Processor: Default Supervised Learning



```mermaid
---
config:
    layout: elk
---
graph LR
subgraph GC0[Default Supervised Learning - Supervised Learning Block]
direction LR
subgraph GS0[Default Supervised Learning System]
subgraph G1[Fit Supervised Model - Default - Fit Supervised Model Block]
direction LR
X1[Fit Supervised Model - Default]
subgraph G1P[Ports]
direction TB
XX1P0[Model]
XX1P1[X Train Array]
XX1P2[Y Train Array]
end
XX1P0[Model] o--o X1
XX1P1[X Train Array] o--o X1
XX1P2[Y Train Array] o--o X1
subgraph G1T[Terminals]
direction TB
XX1T0[Model]
end
X1 o--o XX1T0[Model]
end
subgraph G2[Evaluate Supervised Model - Default - Evaluate Supervised Model Block]
direction LR
X2[Evaluate Supervised Model - Default]
subgraph G2P[Ports]
direction TB
XX2P0[Model]
XX2P1[X Test Array]
XX2P2[Y Test Array]
end
XX2P0[Model] o--o X2
XX2P1[X Test Array] o--o X2
XX2P2[Y Test Array] o--o X2
subgraph G2T[Terminals]
direction TB
XX2T0[Evaluation Metrics]
end
X2 o--o XX2T0[Evaluation Metrics]
end
subgraph G3[No Post-processing - Post-processing Block]
direction LR
X3[No Post-processing]
subgraph G3P[Ports]
direction TB
XX3P0[Evaluation Metrics]
end
XX3P0[Evaluation Metrics] o--o X3
subgraph G3T[Terminals]
direction TB
XX3T0[Evaluation Metrics]
end
X3 o--o XX3T0[Evaluation Metrics]
end
XX1T0[Model] ---> XX2P0[Model]
XX2T0[Evaluation Metrics] ---> XX3P0[Evaluation Metrics]
end
subgraph GC0P[Ports]
direction TB
X1P0[Model]
X1P1[X Train Array]
X1P2[Y Train Array]
X1P3[X Test Array]
X1P4[Y Test Array]
end
X1P0[Model] o--o XX1P0[Model]
X1P1[X Train Array] o--o XX1P1[X Train Array]
X1P2[Y Train Array] o--o XX1P2[Y Train Array]
X1P3[X Test Array] o--o XX2P1[X Test Array]
X1P4[Y Test Array] o--o XX2P2[Y Test Array]
subgraph GC0T[Terminals]
direction TB
X1T0[Evaluation Metrics]
end
XX3T0[Evaluation Metrics] o--o X1T0[Evaluation Metrics]
end

```
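The Default Supervised Learning diagram reads as a three-step pipeline: fit, evaluate, then post-process. The sketch below mirrors that wiring with a toy model. `MajorityClassModel`, the accuracy metric and all function names are hypothetical choices made for illustration. They are not the project's model or metrics.

```python
from collections import Counter


class MajorityClassModel:
    """Toy model that always predicts the most common training label."""

    def __init__(self):
        self.label = None

    def fit(self, x, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, x):
        return [self.label for _ in x]


def fit_model(model, x_train, y_train):
    """(Model, X Train, Y Train) -> Model"""
    return model.fit(x_train, y_train)


def evaluate_model(model, x_test, y_test):
    """(Model, X Test, Y Test) -> Evaluation Metrics"""
    predictions = model.predict(x_test)
    correct = sum(p == t for p, t in zip(predictions, y_test))
    return {"accuracy": correct / len(y_test)}


def no_post_processing(metrics):
    """Evaluation Metrics -> Evaluation Metrics, unchanged"""
    return metrics


def default_supervised_learning(model, x_train, y_train, x_test, y_test):
    fitted = fit_model(model, x_train, y_train)
    metrics = evaluate_model(fitted, x_test, y_test)
    return no_post_processing(metrics)


metrics = default_supervised_learning(
    MajorityClassModel(),
    x_train=[[0], [1], [2]], y_train=[1, 1, 0],
    x_test=[[3], [4]], y_test=[1, 0],
)
print(metrics)
```

Keeping the identity post-processing step explicit leaves a slot where a different post-processor could be swapped in without changing the outer signature.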