<table width=100%>
  <tr>
    <td>
      <a href="https://colab.research.google.com/github/aurelienmorgan/retrain-pipelines/blob/master/extra/frameworks/Metaflow/remote_local_metaflow.ipynb" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab" /></a>
    </td>
    <td width=405>
      <a href="https://pypi.org/project/retrain-pipelines/" target="_blank"><img src="https://github.com/user-attachments/assets/19725866-13f9-48c1-b958-35c2e014351a" width="150" alt="retrain-pipelines" /></a>
      <a href="https://metaflow.org/" target="_blank"><img src="https://github.com/user-attachments/assets/8085a813-d993-47aa-8992-62123fa39967" width="250" alt="Metaflow" /></a>
    </td>
  </tr>
</table>

<b><center><font size=14em>Stateful Metaflow Service & UI</font></center></b><br />
<center><em><font size=12em>(2/2) Consume</font></em></center>

<em>This notebook is the continuation of the <a href="https://github.com/aurelienmorgan/retrain-pipelines/blob/master/extra/frameworks/Metaflow/metaflow_service.ipynb" target="_blank">(1/2) start</a> Google Colab notebook.</em>

<hr />

## Setup

In [None]:
from google.colab import drive
# grant all requested permissions, otherwise the mount fails
drive.mount('/content/drive')

In [None]:
# the variable below must hold the same Google Drive location
# as the one set in the "Metaflow Service" notebook
MF_ROOT = "/content/drive/MyDrive/Metaflow_hf"

In [None]:
tunnel_url = None
datastore_dir = f"{MF_ROOT}/local_datastore/"

<hr />

Declare some convenience functions.

In [None]:
import os
import sys
import requests

os.environ['USERNAME'] = 'user'

In [None]:
def valid_tunnel_url():
    """
    When prompted, enter the URL of the tunnel which you established
    in the "Metaflow Service" Colab notebook
    """

    global tunnel_url
    if tunnel_url is not None:
        try:
            response = requests.get(f"{tunnel_url}/service/ping")
            if response.status_code != 200 or response.text != "pong":
                tunnel_url = input("Enter an active tunnel URL:\n")
            else:
                print(tunnel_url)
        except requests.exceptions.RequestException:
            tunnel_url = input("The former endpoint is not reachable. " +
                              "Enter an active tunnel URL:\n")
    else:
        tunnel_url = input("Enter an active tunnel URL:\n")
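
The health check performed by `valid_tunnel_url` can also be written as a non-interactive helper, which is convenient when the URL comes from somewhere other than a prompt. The following is a minimal sketch using only the standard library; the names `ping_url`, `is_service_alive` and the `fetch` parameter are hypothetical and not part of Metaflow or `retrain-pipelines`. It assumes, as the cell above does, that the service answers `pong` on `/service/ping`.

```python
from urllib.request import urlopen


def ping_url(base_url):
    """Build the ping endpoint, tolerating a trailing slash on the tunnel URL."""
    return f"{base_url.rstrip('/')}/service/ping"


def is_service_alive(base_url, fetch=None, timeout=5):
    """Return True only if the ping endpoint answers HTTP 200 with body 'pong'.

    `fetch` (hypothetical hook) takes a URL and returns (status, body);
    it defaults to a plain urllib request and exists mainly for testing.
    """
    def _default_fetch(url):
        with urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8")

    fetch = fetch or _default_fetch
    try:
        status, body = fetch(ping_url(base_url))
    except (OSError, ValueError):
        # OSError covers unreachable hosts, HTTP errors and timeouts;
        # ValueError covers malformed URLs.
        return False
    return status == 200 and body == "pong"
```

With such a helper, the prompt could be repeated until `is_service_alive(tunnel_url)` returns `True`, so that a freshly typed URL is checked as well.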

In [None]:
def unload_package(package_name):
    # Remove the package and its submodules from sys.modules
    names_to_remove = [name for name in sys.modules if name.startswith(package_name)]
    for name in names_to_remove:
        del sys.modules[name]
    if package_name in globals():
        del globals()[package_name]
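
Note that `unload_package` matches on a plain name prefix, so it also removes any other top-level package whose name merely starts with the same string. That may well be intended here. If a stricter match is ever needed, the sketch below restricts the removal to the package and its dotted submodules. `unload_exact` and the `demo_pkg*` module names are hypothetical placeholders used only for illustration.

```python
import sys
import types


def unload_exact(package_name):
    """Drop `package_name` and its dotted submodules from sys.modules only."""
    prefix = package_name + "."
    to_remove = [name for name in sys.modules
                 if name == package_name or name.startswith(prefix)]
    for name in to_remove:
        del sys.modules[name]


# Register throwaway placeholder modules to observe the effect.
for name in ("demo_pkg", "demo_pkg.sub", "demo_pkg_extras"):
    sys.modules[name] = types.ModuleType(name)

unload_exact("demo_pkg")
remaining = sorted(n for n in sys.modules if n.startswith("demo_pkg"))
print(remaining)  # ['demo_pkg_extras']
```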

Now, declare the <code>Hello World</code> flow&nbsp;:

In [None]:
!pip install metaflow
!pip install metaflow-card-html

In [None]:
%%writefile hello_world_flow.py
from metaflow import FlowSpec, step, current, card
from metaflow.cards import Markdown

class HelloWorldFlow(FlowSpec):

    @step
    def start(self):
        print("Hello, World!")
        self.next(self.pipeline_card)

    @card(id="custom", type="html")
    @step
    def pipeline_card(self):
        print("blabla")
        self.html = "blabla"
        self.next(self.end)

    @step
    def end(self):
        print("Flow Finished")

if __name__ == '__main__':
    HelloWorldFlow()

<hr />

# Standard <code>metaflow</code> integration

## Metaflow API

In [None]:
valid_tunnel_url()

# Launch flow run
! export METAFLOW_SERVICE_URL={tunnel_url}/service/ && \
  export METAFLOW_DEFAULT_METADATA=service && \
  export USERNAME=user && \
  cd {datastore_dir} && \
  python /content/hello_world_flow.py run
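
The shell cell above can also be driven from Python, which keeps the environment explicit and avoids relying on notebook variable interpolation. This is a minimal sketch under the same assumptions as the cell (same environment variables, same working directory); `metaflow_service_env` and `run_flow` are hypothetical helper names, not part of Metaflow.

```python
import os
import subprocess
import sys


def metaflow_service_env(tunnel_url, username="user", base_env=None):
    """Return a copy of the environment pointing Metaflow at the tunnelled service."""
    env = dict(os.environ if base_env is None else base_env)
    env["METAFLOW_SERVICE_URL"] = f"{tunnel_url.rstrip('/')}/service/"
    env["METAFLOW_DEFAULT_METADATA"] = "service"
    env["USERNAME"] = username
    return env


def run_flow(flow_file, tunnel_url, workdir):
    """Run `python <flow_file> run` from `workdir`; returns the CompletedProcess."""
    return subprocess.run(
        [sys.executable, flow_file, "run"],
        env=metaflow_service_env(tunnel_url),
        cwd=workdir,
        check=False,  # inspect .returncode instead of raising on failure
    )
```

For instance, `run_flow("/content/hello_world_flow.py", tunnel_url, datastore_dir)` would mirror the cell above.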

## Metaflow SDK

In [None]:
valid_tunnel_url()

import os
os.environ['METAFLOW_SERVICE_URL'] = f"{tunnel_url}/service"
os.environ['METAFLOW_DEFAULT_METADATA'] = 'service'

import metaflow

You can use the Metaflow Python SDK as usual, and it will work with your Colab-hosted instance&nbsp;:

In [None]:
list(metaflow.Flow("HelloWorldFlow").runs())[0:10]
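
When a flow has accumulated many runs, it can be handy to stop iterating after the first few instead of building the full list and slicing it. The sketch below shows one way to do this; `summarize_runs` is a hypothetical helper, and it only assumes that each run object exposes the `id` and `successful` attributes of Metaflow's `Run` class. The stand-in objects are there so the example runs without a live service.

```python
from itertools import islice
from types import SimpleNamespace


def summarize_runs(runs, limit=10):
    """Return (run id, succeeded?) pairs for at most `limit` runs."""
    return [(run.id, run.successful) for run in islice(runs, limit)]


# Stand-in objects mimicking the two attributes used above.
fake_runs = (SimpleNamespace(id=str(i), successful=(i % 2 == 0))
             for i in range(100))
print(summarize_runs(fake_runs, limit=3))
# [('0', True), ('1', False), ('2', True)]
```

Against the service, the call would be `summarize_runs(metaflow.Flow("HelloWorldFlow").runs())`.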

# <code>retrain-pipelines</code> integration

Let's start by installing the library.

In [None]:
!pip install --no-cache-dir "retrain-pipelines>=0.1.1"

Alternatively, you can install the current development snapshot from the remote source&nbsp;:

In [None]:
# !pip install git+https://github.com/aurelienmorgan/retrain-pipelines.git@master#subdirectory=pkg_src
# !chmod +x /usr/local/lib/python3.10/dist-packages/retrain_pipelines/legacy_launcher.sh

## Metaflow API

Below is how you can launch a pipeline run through the <code>retrain-pipelines</code> cell magic&nbsp;:

In [None]:
%load_ext retrain_pipelines.local_launcher_magic

In [None]:
valid_tunnel_url()

os.environ['METAFLOW_SERVICE_URL'] = f"{tunnel_url}/service"
os.environ['METAFLOW_DATASTORE_SYSROOT_LOCAL'] = \
    f"{MF_ROOT}/local_datastore/"

%retrain_pipelines_local /content/hello_world_flow.py run

## Metaflow SDK

Below is how you can interact with the <code>metaflow</code> Python package through <code>retrain-pipelines</code>&nbsp;:

In [None]:
unload_package('metaflow')
valid_tunnel_url()

os.environ['METAFLOW_SERVICE_URL'] = f"{tunnel_url}/service"
os.environ['METAFLOW_DATASTORE_SYSROOT_LOCAL'] = \
    f"{MF_ROOT}/local_datastore/"

from retrain_pipelines.frameworks import local_metaflow as metaflow

In [None]:
list(metaflow.Flow("HelloWorldFlow").runs())[0:10]

## Inspectors

In [None]:
mf_flow_name = 'HelloWorldFlow'

In [None]:
unload_package('metaflow')
valid_tunnel_url()

os.environ['METAFLOW_SERVICE_URL'] = f"{tunnel_url}/service"
os.environ['METAFLOW_DATASTORE_SYSROOT_LOCAL'] = \
    f"{MF_ROOT}/local_datastore/"

from retrain_pipelines.frameworks import local_metaflow as metaflow
from retrain_pipelines.inspectors import browse_pipeline_card

In [None]:
browse_pipeline_card(f"{tunnel_url}/ui_backend_service", mf_flow_name, verbose=True)