# PyData Global 2022: Production-grade Machine Learning with Flyte

In this tutorial, you'll learn about some of the key challenges of building and deploying reliable machine learning systems. At a high level, these challenges are:

- Scalability
- Data Quality
- Reproducibility
- Recoverability
- Auditability

In [1]:
from pathlib import Path
from flytekit.remote import FlyteRemote
from flytekit.configuration import Config


remote = FlyteRemote(
    config=Config.auto(),
    default_project="flytesnacks",
    default_domain="development",
)



### Example 0: Introduction

In [4]:
from workflows import example_00_intro

execution = remote.execute_local_workflow(
    example_00_intro.training_workflow,
    inputs={
        "hyperparameters": {"C": 0.1, "max_iter": 5000},
        "test_size": 0.2,
        "random_state": 11,
    }
)
remote.generate_console_url(execution)

'https://sandbox.union.ai/console/projects/flytesnacks/domains/development/executions/f9d3f6c1e447043e684d'

In [9]:
execution = remote.sync(execution)

In [None]:
execution.outputs.keys()
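
A single `remote.sync` call only refreshes the execution's state once, so outputs may not be available yet if the run is still in progress. The following is a minimal polling sketch, not part of the original notebook. The helper name `wait_until_done` is hypothetical, and it assumes the synced execution object exposes a boolean `is_done` attribute. flytekit's `FlyteRemote` also offers a `wait` method that may be preferable where available.

```python
import time
from typing import Any, Callable


def wait_until_done(
    sync: Callable[[Any], Any],
    execution: Any,
    poll_interval: float = 5.0,
    timeout: float = 600.0,
) -> Any:
    """Call `sync` repeatedly until the execution reports completion."""
    deadline = time.monotonic() + timeout
    while True:
        # Refresh the execution state (e.g. via remote.sync).
        execution = sync(execution)
        # Assumption: the synced object exposes a boolean `is_done` attribute.
        if getattr(execution, "is_done", False):
            return execution
        if time.monotonic() >= deadline:
            raise TimeoutError("execution did not finish before the timeout")
        time.sleep(poll_interval)
```

With the `remote` object defined earlier, this could be used as `execution = wait_until_done(remote.sync, execution)` before reading `execution.outputs`.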

#### Scheduling Launch Plans

Activate the schedule:

In [5]:
lp_id = remote.fetch_launch_plan(name="scheduled_training_workflow").id
remote.client.update_launch_plan(lp_id, "ACTIVE")

Get the most recent execution of the scheduled launch plan:

In [34]:
recent_executions = [
    execution
    for execution in remote.recent_executions()
    if execution.spec.launch_plan.name == "scheduled_training_workflow"
]

scheduled_execution = None
if recent_executions:
    scheduled_execution = recent_executions[0]
print(scheduled_execution)

id {
  project: "flytesnacks"
  domain: "development"
  name: "f9db5d6b6033d3043000"
}
spec {
  launch_plan {
    resource_type: LAUNCH_PLAN
    project: "flytesnacks"
    domain: "development"
    name: "scheduled_training_workflow"
    version: "v3"
  }
  metadata {
    mode: SCHEDULED
    scheduled_at {
      seconds: 1669933080
    }
    system_metadata {
    }
  }
  labels {
  }
  annotations {
  }
  auth_role {
  }
}
closure {
  outputs {
    uri: "s3://my-s3-bucket/metadata/propeller/flytesnacks-development-f9db5d6b6033d3043000/end-node/data/0/outputs.pb"
  }
  phase: SUCCEEDED
  started_at {
    seconds: 1669933080
    nanos: 79155000
  }
  duration {
    seconds: 42
    nanos: 492539000
  }
}
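
The timestamps in the printed execution are protobuf-style `seconds`/`nanos` pairs, which are hard to read at a glance. As a small sketch using only the standard library, they can be converted to UTC datetimes. The helper names `to_utc` and `to_timedelta` are hypothetical, and the values are copied from the output above.

```python
from datetime import datetime, timedelta, timezone


def to_utc(seconds: int, nanos: int = 0) -> datetime:
    """Convert a (seconds, nanos) timestamp to a timezone-aware UTC datetime."""
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # datetime has microsecond resolution, so sub-microsecond nanos are dropped.
    return base + timedelta(microseconds=nanos // 1000)


def to_timedelta(seconds: int, nanos: int = 0) -> timedelta:
    """Convert a (seconds, nanos) duration to a timedelta."""
    return timedelta(seconds=seconds, microseconds=nanos // 1000)


scheduled_at = to_utc(1669933080)
started_at = to_utc(1669933080, 79155000)
duration = to_timedelta(42, 492539000)

print(scheduled_at.isoformat())  # 2022-12-01T22:18:00+00:00
print(started_at.isoformat())    # 2022-12-01T22:18:00.079155+00:00
print(duration.total_seconds())  # 42.492539
```

Adding `duration` to `started_at` gives the approximate completion time of the scheduled run.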



In [43]:
scheduled_execution = remote.sync(scheduled_execution)
model = scheduled_execution.outputs["o0"]
model

https://scikit-learn.org/stable/model_persistence.html#security-maintainability-limitations


Now deactivate the schedule:

In [44]:
remote.client.update_launch_plan(lp_id, "INACTIVE")

### Example 1: Dynamic Workflows

In [None]:
from workflows import example_01_dynamic

execution = remote.execute_local_workflow(
    example_01_dynamic.tuning_workflow,
    inputs={
        "hyperparam_grid": [
            {"C": 0.1, "max_iter": 5000},
            {"C": 0.01, "max_iter": 5000},
            {"C": 0.001, "max_iter": 5000},
        ],
    }
)
remote.generate_console_url(execution)
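
The `hyperparam_grid` above is written out by hand. For larger searches, a grid of the same shape can be generated programmatically. This is a sketch using only the standard library. The helper name `build_grid` is hypothetical and is not part of the `workflows` package.

```python
from itertools import product
from typing import Any, Dict, List, Sequence


def build_grid(param_values: Dict[str, Sequence[Any]]) -> List[Dict[str, Any]]:
    """Return one dict per combination of the given parameter values."""
    names = list(param_values)
    return [
        dict(zip(names, combo))
        for combo in product(*(param_values[name] for name in names))
    ]


# Reproduces the three-entry grid passed to the tuning workflow above.
hyperparam_grid = build_grid({"C": [0.1, 0.01, 0.001], "max_iter": [5000]})
```

The resulting list could then be passed as `inputs={"hyperparam_grid": hyperparam_grid}`.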

### Example 2: Map Tasks

### Example 3: Plugins

### Example 4: Type System

### Example 5: Pandera Types

### Example 6: Reproducibility

### Example 7: Caching

### Example 8: Recovering Failed Executions

### Example 9: Checkpointing

### Example 10: Visualization with Flyte Decks

In [3]:
from workflows import example_10_flyte_decks

execution = remote.execute_local_workflow(
    example_10_flyte_decks.penguins_data_workflow,
    inputs={},
)
remote.generate_console_url(execution)

'https://sandbox.union.ai/console/projects/flytesnacks/domains/development/executions/fb7956c06972c4cae85e'

### Example 11: Extending Flyte Decks