jacopotagliabue/dag-card-is-the-new-model-card


dag-card-is-the-new-model-card

Template-based generation of DAG cards from Metaflow classes, inspired by Google cards for machine learning models.

Overview

Model cards have been designed to be a "reference for all, regardless of expertise": a digital "one-pager" collecting quantitative and qualitative information about a given ML model, its use cases, strengths and biases. In this repo, we present a small experiment, DAG Cards: a small tweak on the original cards, adapted to a more general ML concept (a DAG/pipeline, not just the model per se) and to an organization-internal use case.

DAG Card

Our script combines structural information about a Metaflow DAG with data about recent runs, artifacts, tests etc. While tiny, it provides enough functionality to build realistic cards and collect feedback from stakeholders. Given that we only use popular libraries and APIs (Jinja for templating, Metaflow for the DAG, Weights & Biases for experiment tracking), we hope the setup is general enough to be almost immediately applicable to other workflows as well.

Please refer to the companion blog post for the back-story and some more context on cards, DAGs, behavioral tests etc.

Update Dec. 2021: our joint scholarly paper with Outerbounds, presenting the official @card feature for Metaflow, is now available as part of the DCAI workshop at NeurIPS 2021.

Structure

Flow

The card_builder.py script runs with a very simple logic:

Script structure

Given an HTML template, the script collects and "prettifies" data from different services to build a complete picture of the DAG; in our MVP, these are the Metaflow client and the W&B APIs (given the modular nature of templating, it is easy to extend the basic structure to involve different or additional services). In the end, the script "fills" the slots in the template to produce the final stand-alone HTML page with the card.
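The collect, prettify and fill logic described above can be sketched in a few lines. This is only an illustration, not the code in card_builder.py: it uses Python's built-in string.Template as a dependency-free stand-in for Jinja, and the two collect_* functions, their return values and the template itself are hypothetical placeholders for the Metaflow client and W&B API calls.

```python
from string import Template


# Hypothetical stand-ins for the Metaflow client and W&B API calls.
def collect_dag_data():
    return {"flow_name": "TrainingFlow", "steps": ["start", "train", "end"]}


def collect_run_data():
    return [{"run_id": "12", "loss": 0.12345}, {"run_id": "11", "loss": 0.2}]


def prettify(dag, runs):
    """Turn raw service payloads into display-ready strings."""
    return {
        "flow_name": dag["flow_name"],
        "steps": " -> ".join(dag["steps"]),
        "runs": "".join(
            f"<li>Run {r['run_id']}: loss {r['loss']:.3f}</li>" for r in runs
        ),
    }


# Assumed, simplified template: each $name is a slot to fill.
CARD_TEMPLATE = Template(
    "<html><body><h1>$flow_name</h1><p>$steps</p><ul>$runs</ul></body></html>"
)


def build_card():
    """Collect data, prettify it, and fill the template slots."""
    context = prettify(collect_dag_data(), collect_run_data())
    return CARD_TEMPLATE.substitute(context)


if __name__ == "__main__":
    print(build_card())
```

Adding another service in this sketch means adding one more collect function and one more slot in the template, which is the kind of modularity the templating approach is meant to give.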

Card Structure

The current DAG Card has these main sections:

  • Overview: high-level description of the DAG.
  • Owners: DAG developers.
  • DAG: a visual description of the DAG.
  • Model: collapsible sections reporting metrics and artifacts for the latest K runs.
  • Tests: results of behavioral tests.
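
As a rough illustration of how these sections could be held in memory before templating, here is a minimal sketch. All class, field and function names are assumptions made for this example and are not taken from the repo; in particular, runs are assumed to be ordered newest first.

```python
from typing import Dict, List, NamedTuple


class RunSummary(NamedTuple):
    run_id: str
    metrics: Dict[str, float]


class DagCard(NamedTuple):
    overview: str              # high-level description of the DAG
    owners: List[str]          # DAG developers
    dag_steps: List[str]       # step names, in execution order
    runs: List[RunSummary]     # assumed to be ordered newest first
    tests: Dict[str, bool]     # behavioral test name -> passed?


def to_context(card: DagCard, k: int = 3) -> dict:
    """Flatten the card into a template context, keeping only the latest k runs."""
    return {
        "overview": card.overview,
        "owners": ", ".join(card.owners),
        "dag": " -> ".join(card.dag_steps),
        "model": [run._asdict() for run in card.runs[:k]],
        "tests": {
            name: "pass" if ok else "fail" for name, ok in card.tests.items()
        },
    }
```

Keeping the K-runs cut-off in one place (the k argument) makes it easy to change how much history the Model section shows.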

How to Run It

The code has been developed and tested on Python 3.6; dependencies are specified in the requirements.txt file. Please create a local .env file based on the provided template, and fill it with your values.
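
For readers unfamiliar with .env files, the sketch below shows one way such a file could be read and checked with the standard library only. It is an assumption-laden illustration: the repo's own loading mechanism is not shown here, and the key names used in the usage note are hypothetical, so refer to the provided template for the real ones.

```python
from pathlib import Path


def load_env(path=".env"):
    """Parse KEY=VALUE lines into a dict, skipping blank lines and comments."""
    values = {}
    env_file = Path(path)
    if not env_file.exists():
        return values
    for raw in env_file.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def require(values, *keys):
    """Return the requested keys, failing early if any is missing or empty."""
    missing = [k for k in keys if not values.get(k)]
    if missing:
        raise RuntimeError(f"Missing values in .env: {', '.join(missing)}")
    return {k: values[k] for k in keys}
```

A call such as require(load_env(), "EXAMPLE_API_KEY") (hypothetical key name) would then fail with a clear message before any remote service is contacted.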

Prerequisites

Sample DAG

Assuming you are using named profiles for Metaflow, you can run the DAG with:

METAFLOW_PROFILE=my_profile python training_flow.py run

The DAG is mostly just a simplified version of the one in our previous tutorial; as such, it is built for pedagogical purposes (i.e. having a DAG to build a card for) with some shortcuts here and there (e.g. re-using the local model folder to run behavioral tests).

Card Builder

Assuming you are using named profiles for Metaflow, you can create a DAG card with:

METAFLOW_PROFILE=my_profile python card_builder.py

The result will be a static HTML page in the card folder.

Acknowledgements

  • Google cards were first presented at FAT*, and our general styling was influenced by their examples.
  • Metaflow functionality comes from its standard client plus some creative digging into its repo.
  • Charts are simple scripts embedded in the page, built with out-of-the-box functions from Chart.js.
  • Table style is from here.

Open Points / Backlog

In no particular order, some open points and improvements to make the card builder a little less hacky (together with the TODOs already in the code, of course):

  • the entire HTML/CSS/JS template is front-end code written by back-end people and, as such, it will make front-end people cry; while definitely enough to produce a decent-looking page, it needs some refactoring to become prettier and more readable;
  • the sections (e.g. which content to actually include in a card) and parameters (e.g. in which DAG step do we find properties X, Y, Z for our charts?) should be exposed through a config mechanism of some sort, even if just a simple yml file;
  • to give a quick overview of the model, we should also include a visualization (e.g. starting from the standard Keras-generated pic).
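
To make the config idea in the second point more concrete, here is one possible shape for it, sketched under assumptions: the keys (sections, latest_k_runs, chart_sources) are invented for illustration, and the dictionary stands for what a YAML parser would return from a simple yml file (no parser is included, since Python's standard library does not ship one).

```python
# Hypothetical defaults; none of these keys exist in the current code.
DEFAULT_CONFIG = {
    "sections": ["overview", "owners", "dag", "model", "tests"],
    "latest_k_runs": 3,
    # chart name -> where to find its data, e.g. {"step": ..., "artifact": ...}
    "chart_sources": {},
}

KNOWN_SECTIONS = set(DEFAULT_CONFIG["sections"])


def resolve_config(user_config):
    """Merge user settings over the defaults and validate the result."""
    config = {**DEFAULT_CONFIG, **user_config}
    unknown = [s for s in config["sections"] if s not in KNOWN_SECTIONS]
    if unknown:
        raise ValueError(f"Unknown card sections: {', '.join(unknown)}")
    k = config["latest_k_runs"]
    if not isinstance(k, int) or k < 1:
        raise ValueError("latest_k_runs must be a positive integer")
    return config


if __name__ == "__main__":
    # Equivalent to a small yml file listing two sections and one chart source.
    print(resolve_config({
        "sections": ["overview", "tests"],
        "chart_sources": {"loss": {"step": "train", "artifact": "history"}},
    }))
```

Validating the merged config up front means a typo in a section name surfaces as an error instead of a silently missing part of the card.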

License

This code is provided "as is" and is licensed under the terms of the MIT license.
