2.4.1 (Oct 18th 2021)

@savingoyal savingoyal released this 18 Oct 22:28
1f30607

Metaflow 2.4.1 Release Notes

The Metaflow 2.4.1 release is a patch release.

Bug Fixes

Expose non-pythonic dependencies inside the conda environment on AWS Batch (#735)

Prior to this release, non-Pythonic dependencies in a conda environment were not automatically visible to a Metaflow task executing on AWS Batch (see #734), although they were available to tasks executed locally. For example,

import os
from metaflow import FlowSpec, step, conda, conda_base, batch

class TestFlow(FlowSpec):

    @step
    def start(self):
        self.next(self.use_node)

    @batch
    @conda(libraries={"nodejs": ">=16.0.0"})
    @step
    def use_node(self):
        print(os.system("node --version"))
        self.next(self.end)

    @step
    def end(self):
        pass


if __name__ == "__main__":
    TestFlow()

would print an error. This release fixes the incorrect PATH configuration that caused it.
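
One way to check from inside a step whether a non-Python tool is visible is to resolve it on PATH and capture its version output. The sketch below uses only the Python standard library. The helper name tool_version is hypothetical and is not part of Metaflow. Note that os.system returns the command's exit status rather than its output, so the sketch uses subprocess to capture what the tool prints.

```python
import shutil
import subprocess


def tool_version(executable, flag="--version"):
    """Return the version text printed by `executable`, or None if it is not on PATH.

    Hypothetical helper for illustration only; not a Metaflow API.
    """
    # shutil.which searches the directories listed in PATH, like a shell would.
    path = shutil.which(executable)
    if path is None:
        return None
    # Capture output instead of relying on the exit status alone.
    result = subprocess.run(
        [path, flag], capture_output=True, text=True, check=False
    )
    # Some tools print their version to stderr, so fall back to it.
    return (result.stdout or result.stderr).strip()
```

Inside the use_node step above, calling tool_version("node") would return None when the conda environment's binaries are missing from PATH, and the version text otherwise.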

New Features

Introduce size properties for artifacts and logs in metaflow.client (#752)

This release exposes size properties for artifacts and logs (stderr and stdout) in metaflow.client. These properties are relied upon by the Metaflow UI (open-sourcing soon!).
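
As a minimal sketch of how these properties might be read, the helper below collects log and artifact sizes from a client Task object. It assumes the property names stdout_size and stderr_size on a Task and size on each data artifact, which are not spelled out in these notes. The function name summarize_sizes is hypothetical.

```python
def summarize_sizes(task):
    """Collect size information from a Metaflow client Task-like object.

    Assumes `task.stdout_size`, `task.stderr_size`, and `artifact.size`
    exist as described in the lead-in; the helper itself is illustrative.
    """
    return {
        "stdout_size": task.stdout_size,
        "stderr_size": task.stderr_size,
        # Iterating a Task yields its data artifacts; `id` is the artifact name.
        "artifact_sizes": {artifact.id: artifact.size for artifact in task},
    }


# Usage (requires Metaflow and an existing run; the pathspec is a placeholder):
# from metaflow import Task
# print(summarize_sizes(Task("TestFlow/42/start/452")))
```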

Expose attempt level task properties (#725)

In addition to the properties mentioned above, Metaflow users can now access attempt-specific Task metadata using the client:

Task('42/start/452', attempt=1)

Introduce @kubernetes decorator for launching Metaflow tasks on Kubernetes (#644)

This release marks the alpha launch of the @kubernetes decorator, which allows farming out Metaflow tasks to Kubernetes. The functionality works in exactly the same manner as @batch:

from metaflow import FlowSpec, step, resources

class BigSum(FlowSpec):

    @resources(memory=60000, cpu=1)
    @step
    def start(self):
        import numpy
        import time
        big_matrix = numpy.random.ranf((80000, 80000))
        t = time.time()
        self.sum = numpy.sum(big_matrix)
        self.took = time.time() - t
        self.next(self.end)

    @step
    def end(self):
        print("The sum is %f." % self.sum)
        print("Computing it took %dms." % (self.took * 1000))

if __name__ == '__main__':
    BigSum()

python big_sum.py run --with kubernetes

This command runs all steps of the workflow on your existing EKS cluster (which can be configured with metaflow configure eks) and provides all the goodness of Metaflow!

To get started, follow this guide! We would appreciate your early feedback at http://slack.outerbounds.co.