Monitoring with MLflow and WandB

In squirrel, the performance of an iterstream can be measured and logged. This is done by applying an extra method ``monitor()`` to the original chain of iterstream methods; it can be inserted at any step in the chain. For example, you can add ``.monitor(callback=wandb.log)`` right after ``async_map(times_two)`` in the example below. The combined performance of all preceding steps is then measured at that point, and the calculated metrics are passed to any user-specified callback such as ``wandb.log``.

The following is a complete example:

import mlflow
import numpy as np
import wandb

from squirrel.iterstream import IterableSource


def times_two(x: np.ndarray) -> np.ndarray:
    return x * 2


samples = [np.random.rand(10, 10) for _ in range(10 ** 4)]
batch_size = 5

with wandb.init():  # or mlflow.start_run()
    it = (
        IterableSource(samples)
        .async_map(times_two)
        .monitor(wandb.log)  # or mlflow.log_metrics
        .batched(batch_size)
    )
    it.collect()  # or it.take(<some int>).join()

This creates an iterstream with the same transformation logic as before, but the metrics calculated at the ``async_map`` step are sent to the callback function ``wandb.log``. (The calculated metrics are of type ``Dict[str, Union[int, float]]``, so any function taking such an argument can be plugged into the ``callback`` of ``monitor``.)
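Because the metrics payload is a plain dictionary, any callable that accepts one works as a callback. As a minimal sketch (the ``print_metrics`` helper below is hypothetical, not part of squirrel), a custom callback could simply print the metrics instead of logging them:

from typing import Dict, Union

def print_metrics(metrics: Dict[str, Union[int, float]]) -> None:
    # Hypothetical custom callback: print each metric instead of logging it.
    for name, value in metrics.items():
        print(f"{name}: {value}")

# Plug it in exactly like wandb.log or mlflow.log_metrics:
it = IterableSource(samples).async_map(times_two).monitor(print_metrics).batched(batch_size)
it.collect()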

By default, ``monitor`` calculates two metrics: input/output operations per second (IOPS) and throughput. This can be configured by passing the data class ``squirrel.iterstream.metrics.MetricsConf`` to the argument ``metrics_conf`` of ``monitor``. For details, see ``squirrel.iterstream.metrics``.
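As a sketch of such a configuration (assuming ``MetricsConf`` exposes boolean switches named ``iops`` and ``throughput``; check ``squirrel.iterstream.metrics`` for the actual field names), disabling the throughput metric might look like this:

from squirrel.iterstream.metrics import MetricsConf

# Assumed field names; consult squirrel.iterstream.metrics for the real signature.
conf = MetricsConf(iops=True, throughput=False)

it = (
    IterableSource(samples)
    .async_map(times_two)
    .monitor(wandb.log, metrics_conf=conf)  # only IOPS is calculated and logged
    .batched(batch_size)
)
it.collect()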

Monitoring at different locations in an iterstream within one run can be achieved by inserting ``monitor`` at several points with different ``prefix`` values:

with wandb.init(): # or mlflow.start_run()
    it = (
        IterableSource(samples)
        .monitor(wandb.log, prefix="(before async_map) ")
        .async_map(times_two)
        .monitor(wandb.log, prefix="(after async_map) ") # or mlflow.log_metrics
        .batched(batch_size)
    )
    it.collect() # or it.take(<some int>).join()

This will generate four metrics instead of two: each original metric is split into two, with the prefix indicating the point in the stream at which it was measured. (This does not interfere with ``metrics_conf``, which determines which metrics are computed by each ``monitor``.)
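To inspect the effect of the prefixes without an external logger, a small capturing callback can be used (a sketch: ``capture`` is illustrative, and the key shown in the final comment assumes the default metric names contain "IOPS"):

logged = []

def capture(metrics: dict) -> None:
    # Collect every metrics dict so the prefixed keys can be inspected later.
    logged.append(metrics)

it = (
    IterableSource(samples)
    .monitor(capture, prefix="(before async_map) ")
    .async_map(times_two)
    .monitor(capture, prefix="(after async_map) ")
    .batched(batch_size)
)
it.collect()

# Each dict in `logged` carries keys starting with one of the two prefixes,
# e.g. "(before async_map) IOPS", so the measurement point is recoverable.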