diff --git a/docs/api/S3.ipynb b/docs/api/S3.ipynb index 1580ddb6..3d430fc7 100644 --- a/docs/api/S3.ipynb +++ b/docs/api/S3.ipynb @@ -24,10 +24,10 @@ "id": "267ce6f4", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.364067Z", - "iopub.status.busy": "2024-07-25T06:16:55.363954Z", - "iopub.status.idle": "2024-07-25T06:16:55.632670Z", - "shell.execute_reply": "2024-07-25T06:16:55.632359Z" + "iopub.execute_input": "2025-08-30T21:15:32.992870Z", + "iopub.status.busy": "2025-08-30T21:15:32.992750Z", + "iopub.status.idle": "2025-08-30T21:15:33.209177Z", + "shell.execute_reply": "2025-08-30T21:15:33.208899Z" } }, "outputs": [], @@ -55,10 +55,10 @@ "id": "be77093e", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.635239Z", - "iopub.status.busy": "2024-07-25T06:16:55.635039Z", - "iopub.status.idle": "2024-07-25T06:16:55.655916Z", - "shell.execute_reply": "2024-07-25T06:16:55.655612Z" + "iopub.execute_input": "2025-08-30T21:15:33.211275Z", + "iopub.status.busy": "2025-08-30T21:15:33.211127Z", + "iopub.status.idle": "2025-08-30T21:15:33.230516Z", + "shell.execute_reply": "2025-08-30T21:15:33.230256Z" } }, "outputs": [ @@ -66,9 +66,9 @@ "data": { "text/html": [ "\n", - "

class S3 tmproot='.', bucket=None, prefix=None, run=None, s3root=None[source]

metaflow

The Metaflow S3 client.

This object manages the connection to S3 and a temporary directory that is used
to download objects. Note that in most cases when the data fits in memory, no local
disk IO is needed as operations are cached by the operating system, which makes
operations fast as long as there is enough memory available.

The easiest way is to use this object as a context manager:
```
with S3() as s3:
    data = [obj.blob for obj in s3.get_many(urls)]
print(data)
```
The context manager takes care of creating and deleting a temporary directory
automatically. Without a context manager, you must call `.close()` to delete
the directory explicitly:
```
s3 = S3()
data = [obj.blob for obj in s3.get_many(urls)]
s3.close()
```
You can customize the location of the temporary directory with `tmproot`. It
defaults to the current working directory.

To make it easier to deal with object locations, the client can be initialized
with an S3 path prefix. There are three ways to handle locations:

1. Use a `metaflow.Run` object or `self`, e.g. `S3(run=self)`, which
   initializes the prefix with the global `DATATOOLS_S3ROOT` path, combined
   with the current run ID. This mode makes it easy to version data based
   on the run ID consistently. You can use the `bucket` and `prefix` arguments
   to override parts of `DATATOOLS_S3ROOT`.

2. Specify an S3 prefix explicitly with `s3root`,
   e.g. `S3(s3root='s3://mybucket/some/path')`.

3. Specify nothing, i.e. `S3()`, in which case all operations require
   a full S3 url prefixed with `s3://`.

Parameters
----------
tmproot : str, default: '.'
    Where to store the temporary directory.
bucket : str, optional
    Override the bucket from `DATATOOLS_S3ROOT` when `run` is specified.
prefix : str, optional
    Override the path from `DATATOOLS_S3ROOT` when `run` is specified.
run : FlowSpec or Run, optional
    Derive path prefix from the current or a past run ID, e.g. S3(run=self).
s3root : str, optional
    If `run` is not specified, use this as the S3 prefix.

\n", + "

class S3 tmproot='.', bucket=None, prefix=None, run=None, s3root=None[source]

metaflow

The Metaflow S3 client.

This object manages the connection to S3 and a temporary directory that is used
to download objects. Note that in most cases when the data fits in memory, no local
disk IO is needed as operations are cached by the operating system, which makes
operations fast as long as there is enough memory available.

The easiest way is to use this object as a context manager:
```
with S3() as s3:
    data = [obj.blob for obj in s3.get_many(urls)]
print(data)
```
The context manager takes care of creating and deleting a temporary directory
automatically. Without a context manager, you must call `.close()` to delete
the directory explicitly:
```
s3 = S3()
data = [obj.blob for obj in s3.get_many(urls)]
s3.close()
```
You can customize the location of the temporary directory with `tmproot`. It
defaults to the current working directory.
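
The temporary-directory lifecycle described above can be sketched without touching S3 at all. The class below is a hypothetical stand-in (`TempDirClient` is not part of Metaflow) that assumes the client simply creates a scratch directory under `tmproot` on construction and removes it in `close()`:

```python
import os
import shutil
import tempfile


class TempDirClient:
    """Hypothetical stand-in that mimics only the temp-directory lifecycle."""

    def __init__(self, tmproot="."):
        # Create a scratch directory under tmproot (assumed behavior).
        self.tmpdir = tempfile.mkdtemp(dir=tmproot, prefix="s3sketch.")

    def close(self):
        # Delete the scratch directory and anything stored in it.
        shutil.rmtree(self.tmpdir, ignore_errors=True)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow exceptions


with TempDirClient(tmproot=tempfile.gettempdir()) as client:
    scratch = client.tmpdir
    existed_inside = os.path.isdir(scratch)

exists_after = os.path.isdir(scratch)
```

Inside the `with` block the directory exists; once the block exits, `close()` has removed it, which is why downloaded files should be consumed before leaving the context.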

To make it easier to deal with object locations, the client can be initialized
with an S3 path prefix. There are three ways to handle locations:

1. Use a `metaflow.Run` object or `self`, e.g. `S3(run=self)`, which
   initializes the prefix with the global `DATATOOLS_S3ROOT` path, combined
   with the current run ID. This mode makes it easy to version data based
   on the run ID consistently. You can use the `bucket` and `prefix` arguments
   to override parts of `DATATOOLS_S3ROOT`.

2. Specify an S3 prefix explicitly with `s3root`,
   e.g. `S3(s3root='s3://mybucket/some/path')`.

3. Specify nothing, i.e. `S3()`, in which case all operations require
   a full S3 url prefixed with `s3://`.

Parameters
----------
tmproot : str, default: '.'
    Where to store the temporary directory.
bucket : str, optional
    Override the bucket from `DATATOOLS_S3ROOT` when `run` is specified.
prefix : str, optional
    Override the path from `DATATOOLS_S3ROOT` when `run` is specified.
run : FlowSpec or Run, optional
    Derive path prefix from the current or a past run ID, e.g. S3(run=self).
s3root : str, optional
    If `run` is not specified, use this as the S3 prefix.
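
As a rough illustration of how keys relate to the prefix, the sketch below models only the `s3root` and no-prefix modes. `resolve_url` is a hypothetical helper written for this example, not a Metaflow function, and the run-based mode is left out because its exact path layout is not stated here:

```python
def resolve_url(key=None, s3root=None):
    """Hypothetical helper: map a key to a full S3 URL."""
    if s3root is None:
        # No prefix configured, so a full s3:// URL is required.
        if key is None or not key.startswith("s3://"):
            raise ValueError("a full s3:// URL is required without a prefix")
        return key
    if key is None:
        # No key given: refer to the configured root itself.
        return s3root
    if key.startswith("s3://"):
        # Mixing an absolute URL with a prefix is not modeled in this sketch.
        raise ValueError("absolute URL combined with a prefix is not modeled")
    # Treat the key as a suffix of the configured prefix.
    return s3root.rstrip("/") + "/" + key.lstrip("/")


root = "s3://mybucket/some/path"
joined = resolve_url("data.csv", s3root=root)
absolute = resolve_url("s3://otherbucket/key")
```

With a prefix, callers pass short suffixes such as `data.csv`; without one, every call carries the full `s3://` URL.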

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -83,7 +83,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, @@ -101,10 +101,10 @@ "id": "a0e04313", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.658196Z", - "iopub.status.busy": "2024-07-25T06:16:55.658076Z", - "iopub.status.idle": "2024-07-25T06:16:55.660743Z", - "shell.execute_reply": "2024-07-25T06:16:55.660439Z" + "iopub.execute_input": "2025-08-30T21:15:33.232479Z", + "iopub.status.busy": "2025-08-30T21:15:33.232381Z", + "iopub.status.idle": "2025-08-30T21:15:33.234544Z", + "shell.execute_reply": "2025-08-30T21:15:33.234301Z" } }, "outputs": [ @@ -112,9 +112,9 @@ "data": { "text/html": [ "\n", - "

method S3.close (self)[source]

Delete all temporary files downloaded in this context.

\n", + "

method S3.close (self)[source]

Delete all temporary files downloaded in this context.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -123,7 +123,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 3, @@ -149,10 +149,10 @@ "id": "30c3afdf", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.663324Z", - "iopub.status.busy": "2024-07-25T06:16:55.663117Z", - "iopub.status.idle": "2024-07-25T06:16:55.666684Z", - "shell.execute_reply": "2024-07-25T06:16:55.666412Z" + "iopub.execute_input": "2025-08-30T21:15:33.236950Z", + "iopub.status.busy": "2025-08-30T21:15:33.236863Z", + "iopub.status.idle": "2025-08-30T21:15:33.239722Z", + "shell.execute_reply": "2025-08-30T21:15:33.239523Z" } }, "outputs": [ @@ -160,9 +160,9 @@ "data": { "text/html": [ "\n", - "

method S3.get (self, key: Union[str, metaflow.plugins.datatools.s3.s3.S3GetObject, NoneType] = None, return_missing: bool = False, return_info: bool = True) -> metaflow.plugins.datatools.s3.s3.S3Object[source]

Get a single object from S3.

Parameters
----------
key : Union[str, S3GetObject], optional, default None
    Object to download. It can be an S3 url, a path suffix, or
    an S3GetObject that defines a range of data to download. If None, or
    not provided, gets the S3 root.
return_missing : bool, default False
    If set to True, do not raise an exception for a missing key but
    return it as an `S3Object` with `.exists == False`.
return_info : bool, default True
    If set to True, fetch the content-type and user metadata associated
    with the object at no extra cost, included for symmetry with `get_many`.

Returns
-------
S3Object
    An S3Object corresponding to the object requested.

\n", + "

method S3.get (self, key: Union[str, metaflow.plugins.datatools.s3.s3.S3GetObject, NoneType] = None, return_missing: bool = False, return_info: bool = True) -> metaflow.plugins.datatools.s3.s3.S3Object[source]

Get a single object from S3.

Parameters
----------
key : Union[str, S3GetObject], optional, default None
    Object to download. It can be an S3 url, a path suffix, or
    an S3GetObject that defines a range of data to download. If None, or
    not provided, gets the S3 root.
return_missing : bool, default False
    If set to True, do not raise an exception for a missing key but
    return it as an `S3Object` with `.exists == False`.
return_info : bool, default True
    If set to True, fetch the content-type and user metadata associated
    with the object at no extra cost, included for symmetry with `get_many`.

Returns
-------
S3Object
    An S3Object corresponding to the object requested.
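
The effect of `return_missing` can be illustrated with a dict-backed stand-in. Everything here (`FakeObject`, `fake_get`, and the use of `KeyError`) is an assumption made for the sketch; the real client raises its own exception type and returns a richer `S3Object`:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeObject:
    """Hypothetical stand-in for an S3Object; only two fields are modeled."""
    key: str
    blob: Optional[bytes]

    @property
    def exists(self) -> bool:
        return self.blob is not None


def fake_get(store, key, return_missing=False):
    """Look up a key in an in-memory dict instead of S3."""
    if key not in store:
        if return_missing:
            # Report the miss as an object with exists == False.
            return FakeObject(key, None)
        raise KeyError(key)
    return FakeObject(key, store[key])


store = {"present": b"payload"}
found = fake_get(store, "present")
missing = fake_get(store, "absent", return_missing=True)
```

Checking `.exists` on the result avoids wrapping each call in exception handling when missing keys are expected.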

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -178,7 +178,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 4, @@ -196,10 +196,10 @@ "id": "2bbb1e36", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.668637Z", - "iopub.status.busy": "2024-07-25T06:16:55.668562Z", - "iopub.status.idle": "2024-07-25T06:16:55.671501Z", - "shell.execute_reply": "2024-07-25T06:16:55.671249Z" + "iopub.execute_input": "2025-08-30T21:15:33.242005Z", + "iopub.status.busy": "2025-08-30T21:15:33.241917Z", + "iopub.status.idle": "2025-08-30T21:15:33.244492Z", + "shell.execute_reply": "2025-08-30T21:15:33.244292Z" } }, "outputs": [ @@ -207,9 +207,9 @@ "data": { "text/html": [ "\n", - "

method S3.get_many (self, keys: Iterable[Union[str, metaflow.plugins.datatools.s3.s3.S3GetObject]], return_missing: bool = False, return_info: bool = True) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

Get many objects from S3 in parallel.

Parameters
----------
keys : Iterable[Union[str, S3GetObject]]
    Objects to download. Each object can be an S3 url, a path suffix, or
    an S3GetObject that defines a range of data to download.
return_missing : bool, default False
    If set to True, do not raise an exception for a missing key but
    return it as an `S3Object` with `.exists == False`.
return_info : bool, default True
    If set to True, fetch the content-type and user metadata associated
    with the object at no extra cost, included for symmetry with `get`.

Returns
-------
List[S3Object]
    S3Objects corresponding to the objects requested.

\n", + "

method S3.get_many (self, keys: Iterable[Union[str, metaflow.plugins.datatools.s3.s3.S3GetObject]], return_missing: bool = False, return_info: bool = True) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

Get many objects from S3 in parallel.

Parameters
----------
keys : Iterable[Union[str, S3GetObject]]
    Objects to download. Each object can be an S3 url, a path suffix, or
    an S3GetObject that defines a range of data to download.
return_missing : bool, default False
    If set to True, do not raise an exception for a missing key but
    return it as an `S3Object` with `.exists == False`.
return_info : bool, default True
    If set to True, fetch the content-type and user metadata associated
    with the object at no extra cost, included for symmetry with `get`.

Returns
-------
List[S3Object]
    S3Objects corresponding to the objects requested.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -225,7 +225,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 5, @@ -243,10 +243,10 @@ "id": "64a74dcf", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.673609Z", - "iopub.status.busy": "2024-07-25T06:16:55.673501Z", - "iopub.status.idle": "2024-07-25T06:16:55.676324Z", - "shell.execute_reply": "2024-07-25T06:16:55.676093Z" + "iopub.execute_input": "2025-08-30T21:15:33.246450Z", + "iopub.status.busy": "2025-08-30T21:15:33.246369Z", + "iopub.status.idle": "2025-08-30T21:15:33.248945Z", + "shell.execute_reply": "2025-08-30T21:15:33.248737Z" } }, "outputs": [ @@ -254,9 +254,9 @@ "data": { "text/html": [ "\n", - "

method S3.get_recursive (self, keys: Iterable[str], return_info: bool = False) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

Get many objects from S3 recursively in parallel.

Parameters
----------
keys : Iterable[str]
    Prefixes to download recursively. Each prefix can be an S3 url or a path suffix
    which define the root prefix under which all objects are downloaded.
return_info : bool, default False
    If set to True, fetch the content-type and user metadata associated
    with the object.

Returns
-------
List[S3Object]
    S3Objects stored under the given prefixes.

\n", + "

method S3.get_recursive (self, keys: Iterable[str], return_info: bool = False) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

Get many objects from S3 recursively in parallel.

Parameters
----------
keys : Iterable[str]
    Prefixes to download recursively. Each prefix can be an S3 url or a path suffix
    which define the root prefix under which all objects are downloaded.
return_info : bool, default False
    If set to True, fetch the content-type and user metadata associated
    with the object.

Returns
-------
List[S3Object]
    S3Objects stored under the given prefixes.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -271,7 +271,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 6, @@ -289,10 +289,10 @@ "id": "62b5aa5c", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.678408Z", - "iopub.status.busy": "2024-07-25T06:16:55.678312Z", - "iopub.status.idle": "2024-07-25T06:16:55.680730Z", - "shell.execute_reply": "2024-07-25T06:16:55.680538Z" + "iopub.execute_input": "2025-08-30T21:15:33.250587Z", + "iopub.status.busy": "2025-08-30T21:15:33.250512Z", + "iopub.status.idle": "2025-08-30T21:15:33.252525Z", + "shell.execute_reply": "2025-08-30T21:15:33.252343Z" } }, "outputs": [ @@ -300,9 +300,9 @@ "data": { "text/html": [ "\n", - "

method S3.get_all (self, return_info: bool = False) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

Get all objects under the prefix set in the `S3` constructor.

This method requires that the `S3` object is initialized either with `run` or
`s3root`.

Parameters
----------
return_info : bool, default False
    If set to True, fetch the content-type and user metadata associated
    with the object.

Returns
-------
List[S3Object]
    S3Objects stored under the main prefix.

\n", + "

method S3.get_all (self, return_info: bool = False) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

Get all objects under the prefix set in the `S3` constructor.

This method requires that the `S3` object is initialized either with `run` or
`s3root`.

Parameters
----------
return_info : bool, default False
    If set to True, fetch the content-type and user metadata associated
    with the object.

Returns
-------
List[S3Object]
    S3Objects stored under the main prefix.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -316,7 +316,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 7, @@ -342,10 +342,10 @@ "id": "b00c7e64", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.682599Z", - "iopub.status.busy": "2024-07-25T06:16:55.682502Z", - "iopub.status.idle": "2024-07-25T06:16:55.685091Z", - "shell.execute_reply": "2024-07-25T06:16:55.684875Z" + "iopub.execute_input": "2025-08-30T21:15:33.254574Z", + "iopub.status.busy": "2025-08-30T21:15:33.254495Z", + "iopub.status.idle": "2025-08-30T21:15:33.256763Z", + "shell.execute_reply": "2025-08-30T21:15:33.256555Z" }, "scrolled": true }, @@ -354,9 +354,9 @@ "data": { "text/html": [ "\n", - "

method S3.list_paths (self, keys: Optional[Iterable[str]] = None) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

List the next level of paths in S3.

If multiple keys are specified, listings are done in parallel. The returned
S3Objects have `.exists == False` if the path refers to a prefix, not an
existing S3 object.

For instance, if the directory hierarchy is
```
a/0.txt
a/b/1.txt
a/c/2.txt
a/d/e/3.txt
f/4.txt
```
The `list_paths(['a', 'f'])` call returns
```
a/0.txt (exists == True)
a/b/ (exists == False)
a/c/ (exists == False)
a/d/ (exists == False)
f/4.txt (exists == True)
```

Parameters
----------
keys : Iterable[str], optional, default None
    List of paths.

Returns
-------
List[S3Object]
    S3Objects under the given paths, including prefixes (directories) that
    do not correspond to leaf objects.

\n", + "

method S3.list_paths (self, keys: Optional[Iterable[str]] = None) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

List the next level of paths in S3.

If multiple keys are specified, listings are done in parallel. The returned
S3Objects have `.exists == False` if the path refers to a prefix, not an
existing S3 object.

For instance, if the directory hierarchy is
```
a/0.txt
a/b/1.txt
a/c/2.txt
a/d/e/3.txt
f/4.txt
```
The `list_paths(['a', 'f'])` call returns
```
a/0.txt (exists == True)
a/b/ (exists == False)
a/c/ (exists == False)
a/d/ (exists == False)
f/4.txt (exists == True)
```

Parameters
----------
keys : Iterable[str], optional, default None
    List of paths.

Returns
-------
List[S3Object]
    S3Objects under the given paths, including prefixes (directories) that
    do not correspond to leaf objects.
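
The one-level listing in the example above can be reproduced over a flat list of keys. `next_level` is a hypothetical model written for this illustration and makes no S3 calls; it returns `(path, exists)` pairs rather than `S3Object`s:

```python
def next_level(object_keys, prefixes):
    """Hypothetical model of a one-level listing over a flat set of keys."""
    results = []
    for prefix in prefixes:
        base = prefix.rstrip("/") + "/"
        seen = set()
        for key in sorted(object_keys):
            if not key.startswith(base):
                continue
            head, sep, _ = key[len(base):].partition("/")
            # A remaining "/" means the entry is a prefix, not a leaf object.
            entry = (base + head + sep, sep == "")
            if entry not in seen:
                seen.add(entry)
                results.append(entry)
    return results


keys = ["a/0.txt", "a/b/1.txt", "a/c/2.txt", "a/d/e/3.txt", "f/4.txt"]
listing = next_level(keys, ["a", "f"])
```

Entries whose second element is `False` correspond to prefixes such as `a/b/`, which can be passed to another listing call to descend one more level.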

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -370,7 +370,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 8, @@ -388,10 +388,10 @@ "id": "9b39aa1f", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.687389Z", - "iopub.status.busy": "2024-07-25T06:16:55.687310Z", - "iopub.status.idle": "2024-07-25T06:16:55.689833Z", - "shell.execute_reply": "2024-07-25T06:16:55.689561Z" + "iopub.execute_input": "2025-08-30T21:15:33.259069Z", + "iopub.status.busy": "2025-08-30T21:15:33.258972Z", + "iopub.status.idle": "2025-08-30T21:15:33.261357Z", + "shell.execute_reply": "2025-08-30T21:15:33.261178Z" }, "scrolled": true }, @@ -400,9 +400,9 @@ "data": { "text/html": [ "\n", - "

method S3.list_recursive (self, keys: Optional[Iterable[str]] = None) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

List all objects recursively under the given prefixes.

If multiple keys are specified, listings are done in parallel. All objects
returned have `.exists == True` as this call always returns leaf objects.

For instance, if the directory hierarchy is
```
a/0.txt
a/b/1.txt
a/c/2.txt
a/d/e/3.txt
f/4.txt
```
The `list_recursive(['a', 'f'])` call returns
```
a/0.txt (exists == True)
a/b/1.txt (exists == True)
a/c/2.txt (exists == True)
a/d/e/3.txt (exists == True)
f/4.txt (exists == True)
```

Parameters
----------
keys : Iterable[str], optional, default None
    List of paths.

Returns
-------
List[S3Object]
    S3Objects under the given paths.

\n", + "

method S3.list_recursive (self, keys: Optional[Iterable[str]] = None) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

List all objects recursively under the given prefixes.

If multiple keys are specified, listings are done in parallel. All objects
returned have `.exists == True` as this call always returns leaf objects.

For instance, if the directory hierarchy is
```
a/0.txt
a/b/1.txt
a/c/2.txt
a/d/e/3.txt
f/4.txt
```
The `list_recursive(['a', 'f'])` call returns
```
a/0.txt (exists == True)
a/b/1.txt (exists == True)
a/c/2.txt (exists == True)
a/d/e/3.txt (exists == True)
f/4.txt (exists == True)
```

Parameters
----------
keys : Iterable[str], optional, default None
    List of paths.

Returns
-------
List[S3Object]
    S3Objects under the given paths.
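
For comparison with a one-level listing, the recursive behavior in the example above amounts to a plain prefix filter over leaf keys. `all_leaves` is a hypothetical model for illustration only and returns key strings instead of `S3Object`s:

```python
def all_leaves(object_keys, prefixes):
    """Hypothetical model of a recursive listing over a flat set of keys."""
    results = []
    for prefix in prefixes:
        base = prefix.rstrip("/") + "/"
        # Every key under the prefix is a leaf object, at any depth.
        results.extend(k for k in sorted(object_keys) if k.startswith(base))
    return results


keys = ["a/0.txt", "a/b/1.txt", "a/c/2.txt", "a/d/e/3.txt", "f/4.txt"]
leaves = all_leaves(keys, ["a", "f"])
```

Because only leaf objects are returned, no entry in the result refers to an intermediate prefix.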

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -416,7 +416,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 9, @@ -442,10 +442,10 @@ "id": "8e76d108", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.692135Z", - "iopub.status.busy": "2024-07-25T06:16:55.692039Z", - "iopub.status.idle": "2024-07-25T06:16:55.695207Z", - "shell.execute_reply": "2024-07-25T06:16:55.694894Z" + "iopub.execute_input": "2025-08-30T21:15:33.263490Z", + "iopub.status.busy": "2025-08-30T21:15:33.263413Z", + "iopub.status.idle": "2025-08-30T21:15:33.266179Z", + "shell.execute_reply": "2025-08-30T21:15:33.265983Z" }, "scrolled": true }, @@ -454,9 +454,9 @@ "data": { "text/html": [ "\n", - "

method S3.put (self, key: Union[str, metaflow.plugins.datatools.s3.s3.S3PutObject], obj: Union[io.RawIOBase, io.BufferedIOBase, str, bytes], overwrite: bool = True, content_type: Optional[str] = None, metadata: Optional[Dict[str, str]] = None) -> str[source]

Upload a single object to S3.

Parameters
----------
key : Union[str, S3PutObject]
    Object path. It can be an S3 url or a path suffix.
obj : PutValue
    An object to store in S3. Strings are converted to UTF-8 encoding.
overwrite : bool, default True
    Overwrite the object if it exists. If set to False, the operation
    succeeds without uploading anything if the key already exists.
content_type : str, optional, default None
    Optional MIME type for the object.
metadata : Dict[str, str], optional, default None
    A JSON-encodable dictionary of additional headers to be stored
    as metadata with the object.

Returns
-------
str
    URL of the object stored.

\n", + "

method S3.put (self, key: Union[str, metaflow.plugins.datatools.s3.s3.S3PutObject], obj: Union[io.RawIOBase, io.BufferedIOBase, str, bytes], overwrite: bool = True, content_type: Optional[str] = None, metadata: Optional[Dict[str, str]] = None) -> str[source]

Upload a single object to S3.

Parameters
----------
key : Union[str, S3PutObject]
    Object path. It can be an S3 url or a path suffix.
obj : PutValue
    An object to store in S3. Strings are converted to UTF-8 encoding.
overwrite : bool, default True
    Overwrite the object if it exists. If set to False, the operation
    succeeds without uploading anything if the key already exists.
content_type : str, optional, default None
    Optional MIME type for the object.
metadata : Dict[str, str], optional, default None
    A JSON-encodable dictionary of additional headers to be stored
    as metadata with the object.

Returns
-------
str
    URL of the object stored.
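
The `overwrite` flag and the string-to-UTF-8 conversion can be sketched with an in-memory dict. `fake_put` is a hypothetical stand-in for this example; unlike the real method it returns the key rather than a URL:

```python
def fake_put(store, key, obj, overwrite=True):
    """Hypothetical dict-backed model of put; returns the key it addressed."""
    if isinstance(obj, str):
        # Strings are stored as UTF-8 encoded bytes.
        obj = obj.encode("utf-8")
    if overwrite or key not in store:
        store[key] = obj
    # With overwrite=False and an existing key, nothing is written,
    # but the call still succeeds.
    return key


store = {}
fake_put(store, "greeting", "héllo")
fake_put(store, "greeting", "replaced", overwrite=False)
kept = store["greeting"]
```

Passing `overwrite=False` makes repeated uploads of the same key idempotent, which is useful when a step may be retried.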

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -474,7 +474,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 10, @@ -492,10 +492,10 @@ "id": "ae5c6247", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.697366Z", - "iopub.status.busy": "2024-07-25T06:16:55.697267Z", - "iopub.status.idle": "2024-07-25T06:16:55.700234Z", - "shell.execute_reply": "2024-07-25T06:16:55.699996Z" + "iopub.execute_input": "2025-08-30T21:15:33.269936Z", + "iopub.status.busy": "2025-08-30T21:15:33.269853Z", + "iopub.status.idle": "2025-08-30T21:15:33.272552Z", + "shell.execute_reply": "2025-08-30T21:15:33.272353Z" } }, "outputs": [ @@ -503,9 +503,9 @@ "data": { "text/html": [ "\n", - "

method S3.put_many (self, key_objs: List[Union[Tuple[str, Union[io.RawIOBase, io.BufferedIOBase, str, bytes]], metaflow.plugins.datatools.s3.s3.S3PutObject]], overwrite: bool = True) -> List[Tuple[str, str]][source]

Upload many objects to S3.

Each object to be uploaded can be specified in two ways:

1. As a `(key, obj)` tuple where `key` is a string specifying
   the path and `obj` is a string or a bytes object.

2. As an `S3PutObject` which contains additional metadata to be
   stored with the object.

Parameters
----------
key_objs : List[Union[Tuple[str, PutValue], S3PutObject]]
    List of key-object pairs to upload.
overwrite : bool, default True
    Overwrite the object if it exists. If set to False, the operation
    succeeds without uploading anything if the key already exists.

Returns
-------
List[Tuple[str, str]]
    List of `(key, url)` pairs corresponding to the objects uploaded.

\n", + "

method S3.put_many (self, key_objs: List[Union[Tuple[str, Union[io.RawIOBase, io.BufferedIOBase, str, bytes]], metaflow.plugins.datatools.s3.s3.S3PutObject]], overwrite: bool = True) -> List[Tuple[str, str]][source]

Upload many objects to S3.

Each object to be uploaded can be specified in two ways:

1. As a `(key, obj)` tuple where `key` is a string specifying
   the path and `obj` is a string or a bytes object.

2. As an `S3PutObject` which contains additional metadata to be
   stored with the object.

Parameters
----------
key_objs : List[Union[Tuple[str, PutValue], S3PutObject]]
    List of key-object pairs to upload.
overwrite : bool, default True
    Overwrite the object if it exists. If set to False, the operation
    succeeds without uploading anything if the key already exists.

Returns
-------
List[Tuple[str, str]]
    List of `(key, url)` pairs corresponding to the objects uploaded.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -520,7 +520,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 11, @@ -538,10 +538,10 @@ "id": "d9c3ea39", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.702286Z", - "iopub.status.busy": "2024-07-25T06:16:55.702189Z", - "iopub.status.idle": "2024-07-25T06:16:55.704955Z", - "shell.execute_reply": "2024-07-25T06:16:55.704709Z" + "iopub.execute_input": "2025-08-30T21:15:33.279023Z", + "iopub.status.busy": "2025-08-30T21:15:33.278911Z", + "iopub.status.idle": "2025-08-30T21:15:33.281700Z", + "shell.execute_reply": "2025-08-30T21:15:33.281450Z" } }, "outputs": [ @@ -549,9 +549,9 @@ "data": { "text/html": [ "\n", - "

method S3.put_files (self, key_paths: List[Union[Tuple[str, Union[io.RawIOBase, io.BufferedIOBase, str, bytes]], metaflow.plugins.datatools.s3.s3.S3PutObject]], overwrite: bool = True) -> List[Tuple[str, str]][source]

Upload many local files to S3.

Each file to be uploaded can be specified in two ways:

1. As a `(key, path)` tuple where `key` is a string specifying
   the S3 path and `path` is the path to a local file.

2. As an `S3PutObject` which contains additional metadata to be
   stored with the file.

Parameters
----------
key_paths : List[Union[Tuple[str, PutValue], S3PutObject]]
    List of files to upload.
overwrite : bool, default True
    Overwrite the object if it exists. If set to False, the operation
    succeeds without uploading anything if the key already exists.

Returns
-------
List[Tuple[str, str]]
    List of `(key, url)` pairs corresponding to the files uploaded.

\n", + "

method S3.put_files (self, key_paths: List[Union[Tuple[str, Union[io.RawIOBase, io.BufferedIOBase, str, bytes]], metaflow.plugins.datatools.s3.s3.S3PutObject]], overwrite: bool = True) -> List[Tuple[str, str]][source]

Upload many local files to S3.

Each file to be uploaded can be specified in two ways:

1. As a `(key, path)` tuple where `key` is a string specifying
   the S3 path and `path` is the path to a local file.

2. As an `S3PutObject` which contains additional metadata to be
   stored with the file.

Parameters
----------
key_paths : List[Union[Tuple[str, PutValue], S3PutObject]]
    List of files to upload.
overwrite : bool, default True
    Overwrite the object if it exists. If set to False, the operation
    succeeds without uploading anything if the key already exists.

Returns
-------
List[Tuple[str, str]]
    List of `(key, url)` pairs corresponding to the files uploaded.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -566,7 +566,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 12, @@ -592,10 +592,10 @@ "id": "c7055621", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.706871Z", - "iopub.status.busy": "2024-07-25T06:16:55.706784Z", - "iopub.status.idle": "2024-07-25T06:16:55.709603Z", - "shell.execute_reply": "2024-07-25T06:16:55.709377Z" + "iopub.execute_input": "2025-08-30T21:15:33.284058Z", + "iopub.status.busy": "2025-08-30T21:15:33.283974Z", + "iopub.status.idle": "2025-08-30T21:15:33.286310Z", + "shell.execute_reply": "2025-08-30T21:15:33.286117Z" } }, "outputs": [ @@ -603,9 +603,9 @@ "data": { "text/html": [ "\n", - "

method S3.info (self, key: Optional[str] = None, return_missing: bool = False) -> metaflow.plugins.datatools.s3.s3.S3Object[source]

Get metadata about a single object in S3.

This call makes a single `HEAD` request to S3 which can be
much faster than downloading all data with `get`.

Parameters
----------
key : str, optional, default None
    Object to query. It can be an S3 url or a path suffix.
return_missing : bool, default False
    If set to True, do not raise an exception for a missing key but
    return it as an `S3Object` with `.exists == False`.

Returns
-------
S3Object
    An S3Object corresponding to the object requested. The object
    will have `.downloaded == False`.

\n", + "

method S3.info (self, key: Optional[str] = None, return_missing: bool = False) -> metaflow.plugins.datatools.s3.s3.S3Object[source]

Get metadata about a single object in S3.

This call makes a single `HEAD` request to S3 which can be
much faster than downloading all data with `get`.

Parameters
----------
key : str, optional, default None
    Object to query. It can be an S3 url or a path suffix.
return_missing : bool, default False
    If set to True, do not raise an exception for a missing key but
    return it as an `S3Object` with `.exists == False`.

Returns
-------
S3Object
    An S3Object corresponding to the object requested. The object
    will have `.downloaded == False`.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -620,7 +620,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 13, @@ -638,10 +638,10 @@ "id": "ba5e0f9b", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.711568Z", - "iopub.status.busy": "2024-07-25T06:16:55.711473Z", - "iopub.status.idle": "2024-07-25T06:16:55.714093Z", - "shell.execute_reply": "2024-07-25T06:16:55.713851Z" + "iopub.execute_input": "2025-08-30T21:15:33.288359Z", + "iopub.status.busy": "2025-08-30T21:15:33.288272Z", + "iopub.status.idle": "2025-08-30T21:15:33.290748Z", + "shell.execute_reply": "2025-08-30T21:15:33.290529Z" } }, "outputs": [ @@ -649,9 +649,9 @@ "data": { "text/html": [ "\n", - "

method S3.info_many (self, keys: Iterable[str], return_missing: bool = False) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

Get metadata about many objects in S3 in parallel.

This call makes one `HEAD` request per object, which can be
much faster than downloading all data with `get_many`.

Parameters
----------
keys : Iterable[str]
    Objects to query. Each key can be an S3 url or a path suffix.
return_missing : bool, default False
    If set to True, do not raise an exception for a missing key but
    return it as an `S3Object` with `.exists == False`.

Returns
-------
List[S3Object]
    A list of S3Objects corresponding to the paths requested. The
    objects will have `.downloaded == False`.

\n", + "

method S3.info_many (self, keys: Iterable[str], return_missing: bool = False) -> List[metaflow.plugins.datatools.s3.s3.S3Object][source]

Get metadata about many objects in S3 in parallel.

This call makes one `HEAD` request per object, which can be
much faster than downloading all data with `get_many`.

Parameters
----------
keys : Iterable[str]
    Objects to query. Each key can be an S3 url or a path suffix.
return_missing : bool, default False
    If set to True, do not raise an exception for a missing key but
    return it as an `S3Object` with `.exists == False`.

Returns
-------
List[S3Object]
    A list of S3Objects corresponding to the paths requested. The
    objects will have `.downloaded == False`.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -666,7 +666,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 14, @@ -696,10 +696,10 @@ "id": "788746a9", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.716393Z", - "iopub.status.busy": "2024-07-25T06:16:55.716303Z", - "iopub.status.idle": "2024-07-25T06:16:55.724697Z", - "shell.execute_reply": "2024-07-25T06:16:55.724457Z" + "iopub.execute_input": "2025-08-30T21:15:33.293260Z", + "iopub.status.busy": "2025-08-30T21:15:33.293138Z", + "iopub.status.idle": "2025-08-30T21:15:33.301049Z", + "shell.execute_reply": "2025-08-30T21:15:33.300857Z" }, "scrolled": true }, @@ -708,9 +708,9 @@ "data": { "text/html": [ "\n", - "

class S3Object [source]

This object represents a path or an object in S3,
with an optional local copy.

`S3Object`s are not instantiated directly, but they are returned
by many methods of the `S3` client.

\n", + "

class S3Object [source]

This object represents a path or an object in S3,
with an optional local copy.

`S3Object`s are not instantiated directly, but they are returned
by many methods of the `S3` client.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -719,7 +719,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 15, @@ -737,10 +737,10 @@ "id": "d9e06053", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.726726Z", - "iopub.status.busy": "2024-07-25T06:16:55.726637Z", - "iopub.status.idle": "2024-07-25T06:16:55.728693Z", - "shell.execute_reply": "2024-07-25T06:16:55.728433Z" + "iopub.execute_input": "2025-08-30T21:15:33.303319Z", + "iopub.status.busy": "2025-08-30T21:15:33.303228Z", + "iopub.status.idle": "2025-08-30T21:15:33.305041Z", + "shell.execute_reply": "2025-08-30T21:15:33.304860Z" }, "scrolled": false }, @@ -760,7 +760,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 16, @@ -778,10 +778,10 @@ "id": "a6b91a6f", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.731022Z", - "iopub.status.busy": "2024-07-25T06:16:55.730936Z", - "iopub.status.idle": "2024-07-25T06:16:55.732901Z", - "shell.execute_reply": "2024-07-25T06:16:55.732647Z" + "iopub.execute_input": "2025-08-30T21:15:33.307057Z", + "iopub.status.busy": "2025-08-30T21:15:33.306985Z", + "iopub.status.idle": "2025-08-30T21:15:33.308743Z", + "shell.execute_reply": "2025-08-30T21:15:33.308555Z" }, "scrolled": false }, @@ -801,7 +801,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 17, @@ -819,10 +819,10 @@ "id": "7c4203ec", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.735121Z", - "iopub.status.busy": "2024-07-25T06:16:55.735033Z", - "iopub.status.idle": "2024-07-25T06:16:55.737100Z", - "shell.execute_reply": "2024-07-25T06:16:55.736876Z" + "iopub.execute_input": "2025-08-30T21:15:33.310666Z", + "iopub.status.busy": "2025-08-30T21:15:33.310591Z", + "iopub.status.idle": "2025-08-30T21:15:33.312322Z", + "shell.execute_reply": "2025-08-30T21:15:33.312137Z" }, "scrolled": false }, @@ -842,7 +842,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 18, @@ -860,10 +860,10 @@ "id": "e84a384d", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.739260Z", - "iopub.status.busy": "2024-07-25T06:16:55.739170Z", - "iopub.status.idle": "2024-07-25T06:16:55.741240Z", - "shell.execute_reply": "2024-07-25T06:16:55.740994Z" + "iopub.execute_input": "2025-08-30T21:15:33.314225Z", + "iopub.status.busy": "2025-08-30T21:15:33.314154Z", + "iopub.status.idle": "2025-08-30T21:15:33.315857Z", + "shell.execute_reply": "2025-08-30T21:15:33.315679Z" }, "scrolled": false }, @@ -883,7 +883,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 19, @@ -901,10 +901,10 @@ "id": "eeb28c6b", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.743302Z", - "iopub.status.busy": "2024-07-25T06:16:55.743199Z", - "iopub.status.idle": "2024-07-25T06:16:55.745316Z", - "shell.execute_reply": "2024-07-25T06:16:55.745046Z" + "iopub.execute_input": "2025-08-30T21:15:33.318038Z", + "iopub.status.busy": "2025-08-30T21:15:33.317966Z", + "iopub.status.idle": "2025-08-30T21:15:33.319678Z", + "shell.execute_reply": "2025-08-30T21:15:33.319520Z" }, "scrolled": false }, @@ -924,7 +924,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 20, @@ -942,10 +942,10 @@ "id": "71ae77e4", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.747389Z", - "iopub.status.busy": "2024-07-25T06:16:55.747309Z", - "iopub.status.idle": "2024-07-25T06:16:55.749317Z", - "shell.execute_reply": "2024-07-25T06:16:55.749047Z" + "iopub.execute_input": "2025-08-30T21:15:33.321765Z", + "iopub.status.busy": "2025-08-30T21:15:33.321689Z", + "iopub.status.idle": "2025-08-30T21:15:33.323405Z", + "shell.execute_reply": "2025-08-30T21:15:33.323223Z" }, "scrolled": true }, @@ -965,7 +965,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 21, @@ -983,10 +983,10 @@ "id": "2be25e64", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.751525Z", - "iopub.status.busy": "2024-07-25T06:16:55.751441Z", - "iopub.status.idle": "2024-07-25T06:16:55.753445Z", - "shell.execute_reply": "2024-07-25T06:16:55.753202Z" + "iopub.execute_input": "2025-08-30T21:15:33.325452Z", + "iopub.status.busy": "2025-08-30T21:15:33.325371Z", + "iopub.status.idle": "2025-08-30T21:15:33.327141Z", + "shell.execute_reply": "2025-08-30T21:15:33.326932Z" }, "scrolled": true }, @@ -1006,7 +1006,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 22, @@ -1024,10 +1024,10 @@ "id": "4f20cd6a", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.755533Z", - "iopub.status.busy": "2024-07-25T06:16:55.755449Z", - "iopub.status.idle": "2024-07-25T06:16:55.757428Z", - "shell.execute_reply": "2024-07-25T06:16:55.757176Z" + "iopub.execute_input": "2025-08-30T21:15:33.329132Z", + "iopub.status.busy": "2025-08-30T21:15:33.329061Z", + "iopub.status.idle": "2025-08-30T21:15:33.330763Z", + "shell.execute_reply": "2025-08-30T21:15:33.330578Z" }, "scrolled": false }, @@ -1047,7 +1047,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 23, @@ -1065,10 +1065,10 @@ "id": "86be736c", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.759514Z", - "iopub.status.busy": "2024-07-25T06:16:55.759426Z", - "iopub.status.idle": "2024-07-25T06:16:55.761395Z", - "shell.execute_reply": "2024-07-25T06:16:55.761162Z" + "iopub.execute_input": "2025-08-30T21:15:33.332654Z", + "iopub.status.busy": "2025-08-30T21:15:33.332586Z", + "iopub.status.idle": "2025-08-30T21:15:33.334208Z", + "shell.execute_reply": "2025-08-30T21:15:33.334030Z" }, "scrolled": true }, @@ -1088,7 +1088,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 24, @@ -1106,10 +1106,10 @@ "id": "ac57517e", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.763551Z", - "iopub.status.busy": "2024-07-25T06:16:55.763464Z", - "iopub.status.idle": "2024-07-25T06:16:55.765486Z", - "shell.execute_reply": "2024-07-25T06:16:55.765246Z" + "iopub.execute_input": "2025-08-30T21:15:33.336272Z", + "iopub.status.busy": "2025-08-30T21:15:33.336203Z", + "iopub.status.idle": "2025-08-30T21:15:33.337936Z", + "shell.execute_reply": "2025-08-30T21:15:33.337712Z" }, "scrolled": false }, @@ -1129,7 +1129,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 25, @@ -1147,10 +1147,10 @@ "id": "c73642bb", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.767601Z", - "iopub.status.busy": "2024-07-25T06:16:55.767518Z", - "iopub.status.idle": "2024-07-25T06:16:55.769503Z", - "shell.execute_reply": "2024-07-25T06:16:55.769254Z" + "iopub.execute_input": "2025-08-30T21:15:33.339900Z", + "iopub.status.busy": "2025-08-30T21:15:33.339829Z", + "iopub.status.idle": "2025-08-30T21:15:33.341734Z", + "shell.execute_reply": "2025-08-30T21:15:33.341539Z" }, "scrolled": true }, @@ -1170,7 +1170,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 26, @@ -1188,10 +1188,10 @@ "id": "82f36246", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.771721Z", - "iopub.status.busy": "2024-07-25T06:16:55.771626Z", - "iopub.status.idle": "2024-07-25T06:16:55.773673Z", - "shell.execute_reply": "2024-07-25T06:16:55.773403Z" + "iopub.execute_input": "2025-08-30T21:15:33.343793Z", + "iopub.status.busy": "2025-08-30T21:15:33.343719Z", + "iopub.status.idle": "2025-08-30T21:15:33.345451Z", + "shell.execute_reply": "2025-08-30T21:15:33.345267Z" }, "scrolled": true }, @@ -1211,7 +1211,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 27, @@ -1229,10 +1229,10 @@ "id": "1c06cf68", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.775747Z", - "iopub.status.busy": "2024-07-25T06:16:55.775653Z", - "iopub.status.idle": "2024-07-25T06:16:55.777621Z", - "shell.execute_reply": "2024-07-25T06:16:55.777370Z" + "iopub.execute_input": "2025-08-30T21:15:33.347585Z", + "iopub.status.busy": "2025-08-30T21:15:33.347504Z", + "iopub.status.idle": "2025-08-30T21:15:33.349294Z", + "shell.execute_reply": "2025-08-30T21:15:33.349118Z" }, "scrolled": false }, @@ -1249,7 +1249,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 28, @@ -1267,10 +1267,10 @@ "id": "22068e78", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.779811Z", - "iopub.status.busy": "2024-07-25T06:16:55.779711Z", - "iopub.status.idle": "2024-07-25T06:16:55.781811Z", - "shell.execute_reply": "2024-07-25T06:16:55.781562Z" + "iopub.execute_input": "2025-08-30T21:15:33.351318Z", + "iopub.status.busy": "2025-08-30T21:15:33.351248Z", + "iopub.status.idle": "2025-08-30T21:15:33.352937Z", + "shell.execute_reply": "2025-08-30T21:15:33.352761Z" }, "scrolled": true }, @@ -1290,7 +1290,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 29, @@ -1318,10 +1318,10 @@ "id": "92683b71", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.783914Z", - "iopub.status.busy": "2024-07-25T06:16:55.783825Z", - "iopub.status.idle": "2024-07-25T06:16:55.786436Z", - "shell.execute_reply": "2024-07-25T06:16:55.786196Z" + "iopub.execute_input": "2025-08-30T21:15:33.354982Z", + "iopub.status.busy": "2025-08-30T21:15:33.354911Z", + "iopub.status.idle": "2025-08-30T21:15:33.357068Z", + "shell.execute_reply": "2025-08-30T21:15:33.356890Z" } }, "outputs": [], @@ -1389,10 +1389,10 @@ "id": "6aaa4c89", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.788422Z", - "iopub.status.busy": "2024-07-25T06:16:55.788224Z", - "iopub.status.idle": "2024-07-25T06:16:55.790532Z", - "shell.execute_reply": "2024-07-25T06:16:55.790291Z" + "iopub.execute_input": "2025-08-30T21:15:33.358677Z", + "iopub.status.busy": "2025-08-30T21:15:33.358599Z", + "iopub.status.idle": "2025-08-30T21:15:33.360542Z", + "shell.execute_reply": "2025-08-30T21:15:33.360354Z" }, "scrolled": true }, @@ -1416,7 +1416,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 31, @@ -1434,10 +1434,10 @@ "id": "a0bbfe37", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.792669Z", - "iopub.status.busy": "2024-07-25T06:16:55.792581Z", - "iopub.status.idle": "2024-07-25T06:16:55.794839Z", - "shell.execute_reply": "2024-07-25T06:16:55.794610Z" + "iopub.execute_input": "2025-08-30T21:15:33.362577Z", + "iopub.status.busy": "2025-08-30T21:15:33.362502Z", + "iopub.status.idle": "2025-08-30T21:15:33.364441Z", + "shell.execute_reply": "2025-08-30T21:15:33.364246Z" }, "scrolled": false }, @@ -1463,7 +1463,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 32, diff --git a/docs/api/S3.md b/docs/api/S3.md index b8f5ab22..f6d74c33 100644 --- a/docs/api/S3.md +++ b/docs/api/S3.md @@ -15,7 +15,7 @@ For instructions how to use the class, see 
[Loading and Storing Data](/scaling/d ## The `S3` client - + @@ -31,7 +31,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d - + @@ -43,7 +43,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d ## Downloading data - + @@ -60,7 +60,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d - + @@ -77,7 +77,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d - + @@ -93,7 +93,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d - + @@ -110,7 +110,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d ## Listing objects - + @@ -125,7 +125,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d - + @@ -142,7 +142,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d ## Uploading data - + @@ -161,7 +161,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d - + @@ -177,7 +177,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d - + @@ -195,7 +195,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d ## Querying metadata - + @@ -211,7 +211,7 @@ For instructions how to use the class, see [Loading and Storing Data](/scaling/d - + @@ -233,7 +233,7 @@ Most operations above return `S3Object`s that encapsulate information about S3 p Note that the data itself is not kept in these objects but it is stored in a temporary directory which is accessible through the properties of this object. 
- + diff --git a/docs/api/argoevent.ipynb b/docs/api/argoevent.ipynb index 7cbff70a..e71c50e8 100644 --- a/docs/api/argoevent.ipynb +++ b/docs/api/argoevent.ipynb @@ -21,10 +21,10 @@ "id": "623be2bc", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.861281Z", - "iopub.status.busy": "2024-07-25T06:16:57.861167Z", - "iopub.status.idle": "2024-07-25T06:16:58.142243Z", - "shell.execute_reply": "2024-07-25T06:16:58.141916Z" + "iopub.execute_input": "2025-08-30T21:15:35.980208Z", + "iopub.status.busy": "2025-08-30T21:15:35.980109Z", + "iopub.status.idle": "2025-08-30T21:15:36.215677Z", + "shell.execute_reply": "2025-08-30T21:15:36.215432Z" } }, "outputs": [], @@ -43,10 +43,10 @@ "id": "1bf3f08e", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:58.144720Z", - "iopub.status.busy": "2024-07-25T06:16:58.144574Z", - "iopub.status.idle": "2024-07-25T06:16:58.150537Z", - "shell.execute_reply": "2024-07-25T06:16:58.150220Z" + "iopub.execute_input": "2025-08-30T21:15:36.217828Z", + "iopub.status.busy": "2025-08-30T21:15:36.217681Z", + "iopub.status.idle": "2025-08-30T21:15:36.223338Z", + "shell.execute_reply": "2025-08-30T21:15:36.223136Z" } }, "outputs": [ @@ -69,7 +69,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, @@ -87,10 +87,10 @@ "id": "380eb9ac", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:58.152753Z", - "iopub.status.busy": "2024-07-25T06:16:58.152639Z", - "iopub.status.idle": "2024-07-25T06:16:58.155420Z", - "shell.execute_reply": "2024-07-25T06:16:58.155095Z" + "iopub.execute_input": "2025-08-30T21:15:36.225225Z", + "iopub.status.busy": "2025-08-30T21:15:36.225147Z", + "iopub.status.idle": "2025-08-30T21:15:36.227365Z", + "shell.execute_reply": "2025-08-30T21:15:36.227159Z" } }, "outputs": [ @@ -112,7 +112,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 3, @@ -130,10 +130,10 @@ "id": "31916bae", "metadata": { "execution": { - "iopub.execute_input": 
"2024-07-25T06:16:58.157788Z", - "iopub.status.busy": "2024-07-25T06:16:58.157687Z", - "iopub.status.idle": "2024-07-25T06:16:58.160981Z", - "shell.execute_reply": "2024-07-25T06:16:58.160604Z" + "iopub.execute_input": "2025-08-30T21:15:36.229396Z", + "iopub.status.busy": "2025-08-30T21:15:36.229326Z", + "iopub.status.idle": "2025-08-30T21:15:36.231974Z", + "shell.execute_reply": "2025-08-30T21:15:36.231765Z" } }, "outputs": [ @@ -155,7 +155,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 4, @@ -173,10 +173,10 @@ "id": "305678d7", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:58.163921Z", - "iopub.status.busy": "2024-07-25T06:16:58.163779Z", - "iopub.status.idle": "2024-07-25T06:16:58.166454Z", - "shell.execute_reply": "2024-07-25T06:16:58.166148Z" + "iopub.execute_input": "2025-08-30T21:15:36.234080Z", + "iopub.status.busy": "2025-08-30T21:15:36.233990Z", + "iopub.status.idle": "2025-08-30T21:15:36.236173Z", + "shell.execute_reply": "2025-08-30T21:15:36.235982Z" } }, "outputs": [ @@ -198,7 +198,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 5, diff --git a/docs/api/cards.ipynb b/docs/api/cards.ipynb index 7b032be9..7947dd07 100644 --- a/docs/api/cards.ipynb +++ b/docs/api/cards.ipynb @@ -28,10 +28,10 @@ "id": "a5ef9454", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.341488Z", - "iopub.status.busy": "2024-07-25T06:16:56.341371Z", - "iopub.status.idle": "2024-07-25T06:16:56.616466Z", - "shell.execute_reply": "2024-07-25T06:16:56.616152Z" + "iopub.execute_input": "2025-08-30T21:15:37.006885Z", + "iopub.status.busy": "2025-08-30T21:15:37.006809Z", + "iopub.status.idle": "2025-08-30T21:15:37.233431Z", + "shell.execute_reply": "2025-08-30T21:15:37.233162Z" } }, "outputs": [], @@ -63,10 +63,10 @@ "id": "09970e68", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.618794Z", - "iopub.status.busy": "2024-07-25T06:16:56.618638Z", - "iopub.status.idle": 
"2024-07-25T06:16:56.623639Z", - "shell.execute_reply": "2024-07-25T06:16:56.623386Z" + "iopub.execute_input": "2025-08-30T21:15:37.235529Z", + "iopub.status.busy": "2025-08-30T21:15:37.235392Z", + "iopub.status.idle": "2025-08-30T21:15:37.240298Z", + "shell.execute_reply": "2025-08-30T21:15:37.240077Z" } }, "outputs": [ @@ -93,7 +93,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, @@ -111,10 +111,10 @@ "id": "c65fa811", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.625620Z", - "iopub.status.busy": "2024-07-25T06:16:56.625534Z", - "iopub.status.idle": "2024-07-25T06:16:56.629835Z", - "shell.execute_reply": "2024-07-25T06:16:56.629594Z" + "iopub.execute_input": "2025-08-30T21:15:37.242304Z", + "iopub.status.busy": "2025-08-30T21:15:37.242214Z", + "iopub.status.idle": "2025-08-30T21:15:37.246532Z", + "shell.execute_reply": "2025-08-30T21:15:37.246324Z" } }, "outputs": [ @@ -133,7 +133,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 3, @@ -151,10 +151,10 @@ "id": "de91c8c9", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.631875Z", - "iopub.status.busy": "2024-07-25T06:16:56.631777Z", - "iopub.status.idle": "2024-07-25T06:16:56.635751Z", - "shell.execute_reply": "2024-07-25T06:16:56.635509Z" + "iopub.execute_input": "2025-08-30T21:15:37.248757Z", + "iopub.status.busy": "2025-08-30T21:15:37.248667Z", + "iopub.status.idle": "2025-08-30T21:15:37.252228Z", + "shell.execute_reply": "2025-08-30T21:15:37.252018Z" } }, "outputs": [ @@ -173,7 +173,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 4, @@ -191,10 +191,10 @@ "id": "6d76de09", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.637758Z", - "iopub.status.busy": "2024-07-25T06:16:56.637685Z", - "iopub.status.idle": "2024-07-25T06:16:56.640359Z", - "shell.execute_reply": "2024-07-25T06:16:56.640090Z" + "iopub.execute_input": "2025-08-30T21:15:37.254259Z", + "iopub.status.busy": 
"2025-08-30T21:15:37.254187Z", + "iopub.status.idle": "2025-08-30T21:15:37.256300Z", + "shell.execute_reply": "2025-08-30T21:15:37.256095Z" } }, "outputs": [ @@ -215,7 +215,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 5, @@ -233,10 +233,10 @@ "id": "69f97d00", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.642585Z", - "iopub.status.busy": "2024-07-25T06:16:56.642501Z", - "iopub.status.idle": "2024-07-25T06:16:56.644862Z", - "shell.execute_reply": "2024-07-25T06:16:56.644648Z" + "iopub.execute_input": "2025-08-30T21:15:37.258353Z", + "iopub.status.busy": "2025-08-30T21:15:37.258275Z", + "iopub.status.idle": "2025-08-30T21:15:37.260402Z", + "shell.execute_reply": "2025-08-30T21:15:37.260194Z" } }, "outputs": [ @@ -255,7 +255,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 6, @@ -285,10 +285,10 @@ "id": "53a776e9", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.647190Z", - "iopub.status.busy": "2024-07-25T06:16:56.647106Z", - "iopub.status.idle": "2024-07-25T06:16:56.648657Z", - "shell.execute_reply": "2024-07-25T06:16:56.648403Z" + "iopub.execute_input": "2025-08-30T21:15:37.262154Z", + "iopub.status.busy": "2025-08-30T21:15:37.262076Z", + "iopub.status.idle": "2025-08-30T21:15:37.263400Z", + "shell.execute_reply": "2025-08-30T21:15:37.263191Z" } }, "outputs": [], @@ -312,10 +312,10 @@ "id": "752c4a43", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.650644Z", - "iopub.status.busy": "2024-07-25T06:16:56.650537Z", - "iopub.status.idle": "2024-07-25T06:16:56.657385Z", - "shell.execute_reply": "2024-07-25T06:16:56.657142Z" + "iopub.execute_input": "2025-08-30T21:15:37.265195Z", + "iopub.status.busy": "2025-08-30T21:15:37.265122Z", + "iopub.status.idle": "2025-08-30T21:15:37.270700Z", + "shell.execute_reply": "2025-08-30T21:15:37.270518Z" }, "scrolled": true }, @@ -324,9 +324,9 @@ "data": { "text/html": [ "\n", - "

class Markdown (text=None)[source]

metaflow.cards

A block of text formatted in Markdown.

Example:
```
current.card.append(
    Markdown(\"# This is a header appended from `@step` code\")
)
```

Parameters
----------
text : str
    Text formatted in Markdown.

\n", + "

class Markdown (text=None)[source]

metaflow.cards

A block of text formatted in Markdown.

Example:
```
current.card.append(
    Markdown(\"# This is a header appended from `@step` code\")
)
```

Parameters
----------
text : str
    Text formatted in Markdown.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -337,7 +337,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 8, @@ -355,10 +355,10 @@ "id": "d9210681", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.659459Z", - "iopub.status.busy": "2024-07-25T06:16:56.659374Z", - "iopub.status.idle": "2024-07-25T06:16:56.661631Z", - "shell.execute_reply": "2024-07-25T06:16:56.661423Z" + "iopub.execute_input": "2025-08-30T21:15:37.272377Z", + "iopub.status.busy": "2025-08-30T21:15:37.272294Z", + "iopub.status.idle": "2025-08-30T21:15:37.274435Z", + "shell.execute_reply": "2025-08-30T21:15:37.274241Z" } }, "outputs": [ @@ -366,9 +366,9 @@ "data": { "text/html": [ "\n", - "

method Markdown.update (self, text=None)[source]

#FIXME document

\n", + "

method Markdown.update (self, text=None)[source]

#FIXME document

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -377,7 +377,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 9, @@ -403,10 +403,10 @@ "id": "9c198774", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.663827Z", - "iopub.status.busy": "2024-07-25T06:16:56.663744Z", - "iopub.status.idle": "2024-07-25T06:16:56.670992Z", - "shell.execute_reply": "2024-07-25T06:16:56.670722Z" + "iopub.execute_input": "2025-08-30T21:15:37.276486Z", + "iopub.status.busy": "2025-08-30T21:15:37.276404Z", + "iopub.status.idle": "2025-08-30T21:15:37.283350Z", + "shell.execute_reply": "2025-08-30T21:15:37.283126Z" } }, "outputs": [ @@ -414,9 +414,9 @@ "data": { "text/html": [ "\n", - "

class Image (src=None, label=None, disable_updates: bool = True)[source]

metaflow.cards

An image.

`Image`s can be created directly from PNG/JPG/GIF `bytes`, `PIL.Image`s,
or Matplotlib figures. Note that the image data is embedded in the card,
so no external files are required to show the image.

Example: Create an `Image` from bytes:
```
current.card.append(
    Image(
        requests.get(\"https://www.gif-vif.com/hacker-cat.gif\").content,
        \"Image From Bytes\"
    )
)
```

Example: Create an `Image` from a Matplotlib figure
```
import pandas as pd
import numpy as np
current.card.append(
    Image.from_matplotlib(
        pd.DataFrame(
            np.random.randint(0, 100, size=(15, 4)),
            columns=list(\"ABCD\"),
        ).plot()
    )
)
```

Example: Create an `Image` from a [PIL](https://pillow.readthedocs.io/) Image
```
import numpy as np
from PIL import Image as PILImage
current.card.append(
    Image.from_pil_image(
        PILImage.fromarray(
            np.random.randint(0, 256, size=(768, 1024, 3), dtype=np.uint8)
        ),
        \"From PIL Image\"
    )
)
```

Parameters
----------
src : bytes
    The image data in `bytes`.
label : str
    Optional label for the image.

\n", + "

class Image (src=None, label=None, disable_updates: bool = True)[source]

metaflow.cards

An image.

`Image`s can be created directly from PNG/JPG/GIF `bytes`, `PIL.Image`s,
or Matplotlib figures. Note that the image data is embedded in the card,
so no external files are required to show the image.

Example: Create an `Image` from bytes:
```
current.card.append(
    Image(
        requests.get(\"https://www.gif-vif.com/hacker-cat.gif\").content,
        \"Image From Bytes\"
    )
)
```

Example: Create an `Image` from a Matplotlib figure
```
import pandas as pd
import numpy as np
current.card.append(
    Image.from_matplotlib(
        pd.DataFrame(
            np.random.randint(0, 100, size=(15, 4)),
            columns=list(\"ABCD\"),
        ).plot()
    )
)
```

Example: Create an `Image` from a [PIL](https://pillow.readthedocs.io/) Image
```
import numpy as np
from PIL import Image as PILImage
current.card.append(
    Image.from_pil_image(
        PILImage.fromarray(
            np.random.randint(0, 256, size=(768, 1024, 3), dtype=np.uint8)
        ),
        \"From PIL Image\"
    )
)
```

Parameters
----------
src : bytes
    The image data in `bytes`.
label : str
    Optional label for the image.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -431,7 +431,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 10, @@ -449,10 +449,10 @@ "id": "da3999a9", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.672822Z", - "iopub.status.busy": "2024-07-25T06:16:56.672729Z", - "iopub.status.idle": "2024-07-25T06:16:56.675433Z", - "shell.execute_reply": "2024-07-25T06:16:56.675083Z" + "iopub.execute_input": "2025-08-30T21:15:37.285182Z", + "iopub.status.busy": "2025-08-30T21:15:37.285088Z", + "iopub.status.idle": "2025-08-30T21:15:37.287431Z", + "shell.execute_reply": "2025-08-30T21:15:37.287242Z" }, "scrolled": true }, @@ -461,9 +461,9 @@ "data": { "text/html": [ "\n", - "

method Image.from_matplotlib (plot, label: Optional[str] = None, disable_updates: bool = False)[source]

Create an `Image` from a Matplotlib plot.

Parameters
----------
plot :  matplotlib.figure.Figure or matplotlib.axes.Axes or matplotlib.axes._subplots.AxesSubplot
    A Matplotlib figure or axes (plot) object.
label : str, optional
    Optional label for the image.

\n", + "

method Image.from_matplotlib (plot, label: Optional[str] = None, disable_updates: bool = False)[source]

Create an `Image` from a Matplotlib plot.

Parameters
----------
plot :  matplotlib.figure.Figure or matplotlib.axes.Axes or matplotlib.axes._subplots.AxesSubplot
    A Matplotlib figure or axes (plot) object.
label : str, optional
    Optional label for the image.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -475,7 +475,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 11, @@ -493,10 +493,10 @@ "id": "432ec2c6", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.677654Z", - "iopub.status.busy": "2024-07-25T06:16:56.677554Z", - "iopub.status.idle": "2024-07-25T06:16:56.680038Z", - "shell.execute_reply": "2024-07-25T06:16:56.679807Z" + "iopub.execute_input": "2025-08-30T21:15:37.289165Z", + "iopub.status.busy": "2025-08-30T21:15:37.289080Z", + "iopub.status.idle": "2025-08-30T21:15:37.291349Z", + "shell.execute_reply": "2025-08-30T21:15:37.291165Z" }, "scrolled": true }, @@ -505,9 +505,9 @@ "data": { "text/html": [ "\n", - "

method Image.from_pil_image (pilimage, label: Optional[str] = None, disable_updates: bool = False)[source]

Create an `Image` from a PIL image.

Parameters
----------
pilimage : PIL.Image
    a PIL image object.
label : str, optional
    Optional label for the image.

\n", + "

method Image.from_pil_image (pilimage, label: Optional[str] = None, disable_updates: bool = False)[source]

Create an `Image` from a PIL image.

Parameters
----------
pilimage : PIL.Image
    a PIL image object.
label : str, optional
    Optional label for the image.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -519,7 +519,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 12, @@ -545,10 +545,10 @@ "id": "da4cf95b", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.682447Z", - "iopub.status.busy": "2024-07-25T06:16:56.682350Z", - "iopub.status.idle": "2024-07-25T06:16:56.687254Z", - "shell.execute_reply": "2024-07-25T06:16:56.687007Z" + "iopub.execute_input": "2025-08-30T21:15:37.293106Z", + "iopub.status.busy": "2025-08-30T21:15:37.293029Z", + "iopub.status.idle": "2025-08-30T21:15:37.297400Z", + "shell.execute_reply": "2025-08-30T21:15:37.297198Z" } }, "outputs": [ @@ -556,9 +556,9 @@ "data": { "text/html": [ "\n", - "

class Artifact (artifact: Any, name: Optional[str] = None, compressed: bool = True)[source]

metaflow.cards

A pretty-printed version of any Python object.

Large objects are truncated using Python's built-in [`reprlib`](https://docs.python.org/3/library/reprlib.html).

Example:
```
from datetime import datetime
current.card.append(Artifact({'now': datetime.utcnow()}))
```

Parameters
----------
artifact : object
    Any Python object.
name : str, optional
    Optional label for the object.
compressed : bool, default: True
    Use a truncated representation.

\n", + "

class Artifact (artifact: Any, name: Optional[str] = None, compressed: bool = True)[source]

metaflow.cards

A pretty-printed version of any Python object.

Large objects are truncated using Python's built-in [`reprlib`](https://docs.python.org/3/library/reprlib.html).

Example:
```
from datetime import datetime
current.card.append(Artifact({'now': datetime.utcnow()}))
```

Parameters
----------
artifact : object
    Any Python object.
name : str, optional
    Optional label for the object.
compressed : bool, default: True
    Use a truncated representation.
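
As a rough illustration of the kind of truncation `reprlib` performs, the sketch below uses only the standard library. Whether `Artifact` uses `reprlib` with these default limits is an assumption; the snippet shows the module's own behavior, not Metaflow internals.

```python
import reprlib

big = list(range(1000))

# reprlib.repr caps how many items are shown and marks the rest with '...'.
short = reprlib.repr(big)
print(short)
print(len(short), 'characters instead of', len(repr(big)))
```

Passing `compressed=False` would presumably skip this kind of shortening and show the full representation.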

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -571,7 +571,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 13, @@ -597,10 +597,10 @@ "id": "7d1a016c", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.689138Z", - "iopub.status.busy": "2024-07-25T06:16:56.689046Z", - "iopub.status.idle": "2024-07-25T06:16:56.694114Z", - "shell.execute_reply": "2024-07-25T06:16:56.693895Z" + "iopub.execute_input": "2025-08-30T21:15:37.299140Z", + "iopub.status.busy": "2025-08-30T21:15:37.299053Z", + "iopub.status.idle": "2025-08-30T21:15:37.304037Z", + "shell.execute_reply": "2025-08-30T21:15:37.303829Z" } }, "outputs": [ @@ -608,9 +608,9 @@ "data": { "text/html": [ "\n", - "

class Table (data: Optional[List[List[Union[str, metaflow.plugins.cards.card_modules.card.MetaflowCardComponent]]]] = None, headers: Optional[List[str]] = None, disable_updates: bool = False)[source]

metaflow.cards

A table.

The contents of the table can be text or numerical data, a Pandas dataframe,
or other card components: `Artifact`, `Image`, `Markdown` objects.

Example: Text and artifacts
```
from metaflow.cards import Table, Artifact
current.card.append(
    Table([
        ['first row', Artifact({'a': 2})],
        ['second row', Artifact(3)]
    ])
)
```

Example: Table from a Pandas dataframe
```
from metaflow.cards import Table
import pandas as pd
import numpy as np
current.card.append(
    Table.from_dataframe(
        pd.DataFrame(
            np.random.randint(0, 100, size=(15, 4)),
            columns=list(\"ABCD\")
        )
    )
)
```

Parameters
----------
data : List[List[str or MetaflowCardComponent]], optional
    List (rows) of lists (columns). Each item can be a string or a `MetaflowCardComponent`.
headers : List[str], optional
    Optional header row for the table.

\n", + "

class Table (data: Optional[List[List[Union[str, metaflow.plugins.cards.card_modules.card.MetaflowCardComponent]]]] = None, headers: Optional[List[str]] = None, disable_updates: bool = False)[source]

metaflow.cards

A table.

The contents of the table can be text or numerical data, a Pandas dataframe,
or other card components: `Artifact`, `Image`, `Markdown` objects.

Example: Text and artifacts
```
from metaflow.cards import Table, Artifact
current.card.append(
    Table([
        ['first row', Artifact({'a': 2})],
        ['second row', Artifact(3)]
    ])
)
```

Example: Table from a Pandas dataframe
```
from metaflow.cards import Table
import pandas as pd
import numpy as np
current.card.append(
    Table.from_dataframe(
        pd.DataFrame(
            np.random.randint(0, 100, size=(15, 4)),
            columns=list(\"ABCD\")
        )
    )
)
```

Parameters
----------
data : List[List[str or MetaflowCardComponent]], optional
    List (rows) of lists (columns). Each item can be a string or a `MetaflowCardComponent`.
headers : List[str], optional
    Optional header row for the table.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -625,7 +625,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 14, @@ -643,10 +643,10 @@ "id": "1e65fe26", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.695893Z", - "iopub.status.busy": "2024-07-25T06:16:56.695810Z", - "iopub.status.idle": "2024-07-25T06:16:56.698120Z", - "shell.execute_reply": "2024-07-25T06:16:56.697905Z" + "iopub.execute_input": "2025-08-30T21:15:37.306739Z", + "iopub.status.busy": "2025-08-30T21:15:37.306650Z", + "iopub.status.idle": "2025-08-30T21:15:37.309033Z", + "shell.execute_reply": "2025-08-30T21:15:37.308791Z" } }, "outputs": [ @@ -654,9 +654,9 @@ "data": { "text/html": [ "\n", - "

method Table.from_dataframe (dataframe=None, truncate: bool = True)[source]

Create a `Table` based on a Pandas dataframe.

Parameters
----------
dataframe : Optional[pandas.DataFrame]
    Pandas dataframe.
truncate : bool, default: True
    Truncate large dataframe instead of showing all rows (default: True).

\n", + "

method Table.from_dataframe (dataframe=None, truncate: bool = True)[source]

Create a `Table` based on a Pandas dataframe.

Parameters
----------
dataframe : Optional[pandas.DataFrame]
    Pandas dataframe.
truncate : bool, default: True
    Truncate large dataframe instead of showing all rows (default: True).

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -668,7 +668,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 15, @@ -694,10 +694,10 @@ "id": "95806342", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.699962Z", - "iopub.status.busy": "2024-07-25T06:16:56.699869Z", - "iopub.status.idle": "2024-07-25T06:16:56.709072Z", - "shell.execute_reply": "2024-07-25T06:16:56.708826Z" + "iopub.execute_input": "2025-08-30T21:15:37.313224Z", + "iopub.status.busy": "2025-08-30T21:15:37.313140Z", + "iopub.status.idle": "2025-08-30T21:15:37.322335Z", + "shell.execute_reply": "2025-08-30T21:15:37.322108Z" } }, "outputs": [ @@ -705,9 +705,9 @@ "data": { "text/html": [ "\n", - "

class VegaChart (spec: dict, show_controls: bool = False)[source]

metaflow.cards\n", + "

class VegaChart (spec: dict, show_controls: bool = False)[source]

metaflow.cards\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -716,7 +716,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 16, @@ -734,10 +734,10 @@ "id": "b852d84f", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.710961Z", - "iopub.status.busy": "2024-07-25T06:16:56.710867Z", - "iopub.status.idle": "2024-07-25T06:16:56.713072Z", - "shell.execute_reply": "2024-07-25T06:16:56.712826Z" + "iopub.execute_input": "2025-08-30T21:15:37.324387Z", + "iopub.status.busy": "2025-08-30T21:15:37.324294Z", + "iopub.status.idle": "2025-08-30T21:15:37.326501Z", + "shell.execute_reply": "2025-08-30T21:15:37.326268Z" } }, "outputs": [ @@ -745,9 +745,9 @@ "data": { "text/html": [ "\n", - "

method VegaChart.from_altair_chart (altair_chart)[source]

\n", + "

method VegaChart.from_altair_chart (altair_chart)[source]

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -756,7 +756,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 17, @@ -774,10 +774,10 @@ "id": "1636aa90", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.715186Z", - "iopub.status.busy": "2024-07-25T06:16:56.715097Z", - "iopub.status.idle": "2024-07-25T06:16:56.717335Z", - "shell.execute_reply": "2024-07-25T06:16:56.717085Z" + "iopub.execute_input": "2025-08-30T21:15:37.329142Z", + "iopub.status.busy": "2025-08-30T21:15:37.329058Z", + "iopub.status.idle": "2025-08-30T21:15:37.331154Z", + "shell.execute_reply": "2025-08-30T21:15:37.330962Z" } }, "outputs": [ @@ -785,9 +785,9 @@ "data": { "text/html": [ "\n", - "

method VegaChart.update (self, spec=None)[source]

Update the chart.

Parameters
----------
spec : dict or altair.Chart
    The updated chart spec or an Altair `Chart` object.

\n", + "

method VegaChart.update (self, spec=None)[source]

Update the chart.

Parameters
----------
spec : dict or altair.Chart
    The updated chart spec or an Altair `Chart` object.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -798,7 +798,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 18, @@ -824,10 +824,10 @@ "id": "5484e8cc", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.719348Z", - "iopub.status.busy": "2024-07-25T06:16:56.719253Z", - "iopub.status.idle": "2024-07-25T06:16:56.725197Z", - "shell.execute_reply": "2024-07-25T06:16:56.724929Z" + "iopub.execute_input": "2025-08-30T21:15:37.333312Z", + "iopub.status.busy": "2025-08-30T21:15:37.333231Z", + "iopub.status.idle": "2025-08-30T21:15:37.338959Z", + "shell.execute_reply": "2025-08-30T21:15:37.338760Z" } }, "outputs": [ @@ -835,19 +835,19 @@ "data": { "text/html": [ "\n", - "

class ProgressBar (max: int = 100, label: str = None, value: int = 0, unit: str = None, metadata: str = None)[source]

metaflow.cards

A progress bar for tracking the progress of any task.

Example:
```
progress_bar = ProgressBar(
    max=100,
    label=\"Progress Bar\",
    value=0,
    unit=\"%\",
    metadata=\"0.1 items/s\"
)
current.card.append(
    progress_bar
)
for i in range(100):
    progress_bar.update(i, metadata=\"%s items/s\" % i)

```

Parameters
----------
max : int
    The maximum value of the progress bar.
label : str, optional
    Optional label for the progress bar.
value : int, optional
    Optional initial value of the progress bar.
unit : str, optional
    Optional unit for the progress bar.
metadata : str, optional
    Optional additional information to show on the progress bar.

\n", + "

class ProgressBar (max: int = 100, label: Optional[str] = None, value: int = 0, unit: Optional[str] = None, metadata: Optional[str] = None)[source]

metaflow.cards

A progress bar for tracking the progress of any task.

Example:
```
progress_bar = ProgressBar(
    max=100,
    label=\"Progress Bar\",
    value=0,
    unit=\"%\",
    metadata=\"0.1 items/s\"
)
current.card.append(
    progress_bar
)
for i in range(100):
    progress_bar.update(i, metadata=\"%s items/s\" % i)

```

Parameters
----------
max : int, default 100
    The maximum value of the progress bar.
label : str, optional, default None
    Optional label for the progress bar.
value : int, default 0
    Optional initial value of the progress bar.
unit : str, optional, default None
    Optional unit for the progress bar.
metadata : str, optional, default None
    Optional additional information to show on the progress bar.

\n", "
\n", - "\n", + "\n", "\n", - "\n", + "\n", "\n", "\n", "\n", - "\t\n", - "\t\n", - "\t\n", - "\t\n", - "\t\n", + "\t\n", + "\t\n", + "\t\n", + "\t\n", + "\t\n", "\n", "\n", "\t\n", @@ -855,7 +855,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 19, @@ -873,10 +873,10 @@ "id": "7d6e7dab", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.727258Z", - "iopub.status.busy": "2024-07-25T06:16:56.727172Z", - "iopub.status.idle": "2024-07-25T06:16:56.729479Z", - "shell.execute_reply": "2024-07-25T06:16:56.729197Z" + "iopub.execute_input": "2025-08-30T21:15:37.341085Z", + "iopub.status.busy": "2025-08-30T21:15:37.340996Z", + "iopub.status.idle": "2025-08-30T21:15:37.342991Z", + "shell.execute_reply": "2025-08-30T21:15:37.342816Z" } }, "outputs": [ @@ -884,18 +884,18 @@ "data": { "text/html": [ "\n", - "

method ProgressBar.update (self, new_value: int, metadata: str = None)[source]

#FIXME document

\n", + "

method ProgressBar.update (self, new_value: int, metadata: Optional[str] = None)[source]

#FIXME document

\n", "
\n", - "\n", + "\n", "\n", - "\n", + "\n", "\n", "\n", "\n", "" ], "text/plain": [ - "" + "" ] }, "execution_count": 20, @@ -925,10 +925,10 @@ "id": "3dc4881f", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.731605Z", - "iopub.status.busy": "2024-07-25T06:16:56.731524Z", - "iopub.status.idle": "2024-07-25T06:16:56.732959Z", - "shell.execute_reply": "2024-07-25T06:16:56.732740Z" + "iopub.execute_input": "2025-08-30T21:15:37.344931Z", + "iopub.status.busy": "2025-08-30T21:15:37.344855Z", + "iopub.status.idle": "2025-08-30T21:15:37.346194Z", + "shell.execute_reply": "2025-08-30T21:15:37.346015Z" } }, "outputs": [], @@ -943,10 +943,10 @@ "id": "6210f683", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.734788Z", - "iopub.status.busy": "2024-07-25T06:16:56.734707Z", - "iopub.status.idle": "2024-07-25T06:16:56.737988Z", - "shell.execute_reply": "2024-07-25T06:16:56.737749Z" + "iopub.execute_input": "2025-08-30T21:15:37.347735Z", + "iopub.status.busy": "2025-08-30T21:15:37.347675Z", + "iopub.status.idle": "2025-08-30T21:15:37.350321Z", + "shell.execute_reply": "2025-08-30T21:15:37.350136Z" } }, "outputs": [ @@ -954,9 +954,9 @@ "data": { "text/html": [ "\n", - "

class MetaflowCard options[source]

metaflow.cards

Metaflow cards derive from this base class.

Subclasses of this class are called *card types*. The desired card
type `T` is defined in the `@card` decorator as `@card(type=T)`.

After a task with `@card(type=T, options=S)` finishes executing, Metaflow instantiates
a subclass `C` of `MetaflowCard` that has its `type` attribute set to `T`, i.e. `C.type=T`.
The constructor is given the options dictionary `S` that contains arbitrary
JSON-encodable data that is passed to the instance, parametrizing the card. The subclass
may override the constructor to capture and process the options.

The subclass needs to implement a `render(task)` method that produces the card
contents in HTML, given the finished task that is represented by a `Task` object.

Attributes
----------
type : str
    Card type string. Note that this should be a globally unique name, similar to a
    Python package name, to avoid name clashes between different custom cards.

Parameters
----------
options : Dict
    JSON-encodable dictionary containing user-definable options for the class.

\n", + "

class MetaflowCard options[source]

metaflow.cards

Metaflow cards derive from this base class.

Subclasses of this class are called *card types*. The desired card
type `T` is defined in the `@card` decorator as `@card(type=T)`.

After a task with `@card(type=T, options=S)` finishes executing, Metaflow instantiates
a subclass `C` of `MetaflowCard` that has its `type` attribute set to `T`, i.e. `C.type=T`.
The constructor is given the options dictionary `S` that contains arbitrary
JSON-encodable data that is passed to the instance, parametrizing the card. The subclass
may override the constructor to capture and process the options.

The subclass needs to implement a `render(task)` method that produces the card
contents in HTML, given the finished task that is represented by a `Task` object.

Attributes
----------
type : str
    Card type string. Note that this should be a globally unique name, similar to a
    Python package name, to avoid name clashes between different custom cards.

Parameters
----------
options : Dict
    JSON-encodable dictionary containing user-definable options for the class.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -970,7 +970,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 22, @@ -988,10 +988,10 @@ "id": "c71b83c5", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.739690Z", - "iopub.status.busy": "2024-07-25T06:16:56.739624Z", - "iopub.status.idle": "2024-07-25T06:16:56.818932Z", - "shell.execute_reply": "2024-07-25T06:16:56.818574Z" + "iopub.execute_input": "2025-08-30T21:15:37.352402Z", + "iopub.status.busy": "2025-08-30T21:15:37.352327Z", + "iopub.status.idle": "2025-08-30T21:15:37.417799Z", + "shell.execute_reply": "2025-08-30T21:15:37.417558Z" } }, "outputs": [ @@ -1014,7 +1014,7 @@ { "data": { "text/plain": [ - "" + "" ] }, "execution_count": 23, @@ -1032,10 +1032,10 @@ "id": "e2c444ee", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.821647Z", - "iopub.status.busy": "2024-07-25T06:16:56.821534Z", - "iopub.status.idle": "2024-07-25T06:16:56.824138Z", - "shell.execute_reply": "2024-07-25T06:16:56.823867Z" + "iopub.execute_input": "2025-08-30T21:15:37.419604Z", + "iopub.status.busy": "2025-08-30T21:15:37.419528Z", + "iopub.status.idle": "2025-08-30T21:15:37.421520Z", + "shell.execute_reply": "2025-08-30T21:15:37.421335Z" } }, "outputs": [ @@ -1043,9 +1043,9 @@ "data": { "text/html": [ "\n", - "

method MetaflowCard.render_runtime (self, task, data)[source]

\n", + "

method MetaflowCard.render_runtime (self, task, data)[source]

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -1054,7 +1054,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 24, @@ -1072,10 +1072,10 @@ "id": "e360ce50", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.826778Z", - "iopub.status.busy": "2024-07-25T06:16:56.826668Z", - "iopub.status.idle": "2024-07-25T06:16:56.829343Z", - "shell.execute_reply": "2024-07-25T06:16:56.829068Z" + "iopub.execute_input": "2025-08-30T21:15:37.423756Z", + "iopub.status.busy": "2025-08-30T21:15:37.423671Z", + "iopub.status.idle": "2025-08-30T21:15:37.425660Z", + "shell.execute_reply": "2025-08-30T21:15:37.425474Z" } }, "outputs": [ @@ -1083,9 +1083,9 @@ "data": { "text/html": [ "\n", - "

method MetaflowCard.refresh (self, task, data)[source]

\n", + "

method MetaflowCard.refresh (self, task, data)[source]

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -1094,7 +1094,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 25, diff --git a/docs/api/cards.md b/docs/api/cards.md index ca60828e..75d1a2a0 100644 --- a/docs/api/cards.md +++ b/docs/api/cards.md @@ -85,7 +85,7 @@ The components are added to cards in `@step` methods (or functions called from s ### Markdown - + @@ -97,7 +97,7 @@ The components are added to cards in `@step` methods (or functions called from s - + @@ -109,7 +109,7 @@ The components are added to cards in `@step` methods (or functions called from s ### Image - + @@ -125,7 +125,7 @@ The components are added to cards in `@step` methods (or functions called from s - + @@ -138,7 +138,7 @@ The components are added to cards in `@step` methods (or functions called from s - + @@ -153,7 +153,7 @@ The components are added to cards in `@step` methods (or functions called from s ### Artifact - + @@ -169,7 +169,7 @@ The components are added to cards in `@step` methods (or functions called from s ### Table - + @@ -185,7 +185,7 @@ The components are added to cards in `@step` methods (or functions called from s - + @@ -200,7 +200,7 @@ The components are added to cards in `@step` methods (or functions called from s ### VegaChart - + @@ -210,7 +210,7 @@ The components are added to cards in `@step` methods (or functions called from s - + @@ -220,7 +220,7 @@ The components are added to cards in `@step` methods (or functions called from s - + @@ -234,17 +234,17 @@ The components are added to cards in `@step` methods (or functions called from s ### ProgressBar - + - + - - - - - + + + + + @@ -253,9 +253,9 @@ The components are added to cards in `@step` methods (or functions called from s - + - + @@ -269,7 +269,7 @@ You can define custom cards types (`T` in `@card(type=T)`) by creating a Python Find detailed instructions, a starter template, and an example of a [simple static custom card](https://github.com/outerbounds/metaflow-card-html) and [an example of a dynamic 
card]( https://github.com/outerbounds/metaflow-card-scatter3d). - + @@ -346,13 +346,13 @@ ShowDoc(MetaflowCard.render) - + ``` - + @@ -362,7 +362,7 @@ ShowDoc(MetaflowCard.render) - + diff --git a/docs/api/client.ipynb b/docs/api/client.ipynb index c0029283..dae062b1 100644 --- a/docs/api/client.ipynb +++ b/docs/api/client.ipynb @@ -70,10 +70,10 @@ "id": "d1e0717d-dcdd-43f8-9116-1fbe15200503", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.350381Z", - "iopub.status.busy": "2024-07-25T06:16:57.350292Z", - "iopub.status.idle": "2024-07-25T06:16:57.577789Z", - "shell.execute_reply": "2024-07-25T06:16:57.577486Z" + "iopub.execute_input": "2025-08-30T21:15:33.967369Z", + "iopub.status.busy": "2025-08-30T21:15:33.967284Z", + "iopub.status.idle": "2025-08-30T21:15:34.145357Z", + "shell.execute_reply": "2025-08-30T21:15:34.145075Z" } }, "outputs": [], @@ -108,10 +108,10 @@ "id": "be46136f-68d0-4b73-b53c-a6da21023ea7", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.580144Z", - "iopub.status.busy": "2024-07-25T06:16:57.580007Z", - "iopub.status.idle": "2024-07-25T06:16:57.635667Z", - "shell.execute_reply": "2024-07-25T06:16:57.635201Z" + "iopub.execute_input": "2025-08-30T21:15:34.147424Z", + "iopub.status.busy": "2025-08-30T21:15:34.147299Z", + "iopub.status.idle": "2025-08-30T21:15:34.208485Z", + "shell.execute_reply": "2025-08-30T21:15:34.208247Z" } }, "outputs": [ @@ -119,11 +119,11 @@ "data": { "text/html": [ "\n", - "

class Metaflow ()[source]

Entry point to all objects in the Metaflow universe.

This object can be used to list all the flows present either through the explicit property
or by iterating over this object.

Attributes
----------
flows : List[Flow]
    Returns the list of all `Flow` objects known to this metadata provider. Note that only
    flows present in the current namespace will be returned. A `Flow` is present in a namespace
    if it has at least one run in the namespace.

\n", + "

class Metaflow (_current_metadata: Optional[str] = None)[source]

Entry point to all objects in the Metaflow universe.

This object can be used to list all the flows present either through the explicit property
or by iterating over this object.

Attributes
----------
flows : List[Flow]
    Returns the list of all `Flow` objects known to this metadata provider. Note that only
    flows present in the current namespace will be returned. A `Flow` is present in a namespace
    if it has at least one run in the namespace.

\n", "
\n", - "\n", + "\n", "\n", - "\n", + "\n", "\n", "\n", "\n", @@ -132,7 +132,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, @@ -160,10 +160,10 @@ "id": "aa03d75d-c0cc-4d5b-a65a-bab43e5061fe", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.639238Z", - "iopub.status.busy": "2024-07-25T06:16:57.639123Z", - "iopub.status.idle": "2024-07-25T06:16:57.649624Z", - "shell.execute_reply": "2024-07-25T06:16:57.649377Z" + "iopub.execute_input": "2025-08-30T21:15:34.210366Z", + "iopub.status.busy": "2025-08-30T21:15:34.210272Z", + "iopub.status.idle": "2025-08-30T21:15:34.220254Z", + "shell.execute_reply": "2025-08-30T21:15:34.220048Z" } }, "outputs": [ @@ -171,9 +171,9 @@ "data": { "text/html": [ "\n", - "

class Flow pathspec[source]

A Flow represents all existing flows with a certain name, in other words,
classes derived from `FlowSpec`. A container of `Run` objects.

Attributes
----------
latest_run : Run
    Latest `Run` (in progress or completed, successfully or not) of this flow.
latest_successful_run : Run
    Latest successfully completed `Run` of this flow.

\n", + "

class Flow pathspec[source]

A Flow represents all existing flows with a certain name, in other words,
classes derived from `FlowSpec`. A container of `Run` objects.

Attributes
----------
latest_run : Run
    Latest `Run` (in progress or completed, successfully or not) of this flow.
latest_successful_run : Run
    Latest successfully completed `Run` of this flow.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -185,7 +185,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 3, @@ -203,10 +203,10 @@ "id": "e0da0de1", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.651561Z", - "iopub.status.busy": "2024-07-25T06:16:57.651461Z", - "iopub.status.idle": "2024-07-25T06:16:57.654260Z", - "shell.execute_reply": "2024-07-25T06:16:57.653992Z" + "iopub.execute_input": "2025-08-30T21:15:34.222338Z", + "iopub.status.busy": "2025-08-30T21:15:34.222263Z", + "iopub.status.idle": "2025-08-30T21:15:34.224519Z", + "shell.execute_reply": "2025-08-30T21:15:34.224297Z" } }, "outputs": [ @@ -214,9 +214,9 @@ "data": { "text/html": [ "\n", - "

method Flow.runs (self, *tags: str) -> Iterator[metaflow.client.core.Run][source]

Returns an iterator over all `Run`s of this flow.

An optional filter is available that allows you to filter on tags.
If multiple tags are specified, only runs that have all the
specified tags are returned.

Parameters
----------
tags : str
    Tags to match.

Yields
------
Run
    `Run` objects in this flow.

\n", + "

method Flow.runs (self, *tags: str) -> Iterator[metaflow.client.core.Run][source]

Returns an iterator over all `Run`s of this flow.

An optional filter is available that allows you to filter on tags.
If multiple tags are specified, only runs that have all the
specified tags are returned.
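
The all-tags rule can be illustrated without a Metaflow deployment. The sketch below is not Metaflow's implementation: `filter_by_tags` and the sample data are hypothetical, and it assumes only that each run exposes a set of tags.

```python
from typing import Dict, Iterator, Set


def filter_by_tags(runs: Dict[str, Set[str]], *tags: str) -> Iterator[str]:
    # Hypothetical helper: yield the IDs of runs that carry every requested tag.
    wanted = set(tags)
    for run_id, run_tags in runs.items():
        # A run matches only if all requested tags are present on it.
        if wanted <= run_tags:
            yield run_id


# Hypothetical run IDs mapped to their tags.
sample_runs = {
    '1': {'prod', 'nightly'},
    '2': {'prod'},
    '3': {'dev', 'nightly'},
}

# Only run '1' has both tags.
both = list(filter_by_tags(sample_runs, 'prod', 'nightly'))
# With no tags given, nothing is filtered out.
everything = list(filter_by_tags(sample_runs))
```

With the client itself, the tags would be passed positionally in the same way, as in `flow.runs('prod', 'nightly')`.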

Parameters
----------
tags : str
    Tags to match.

Yields
------
Run
    `Run` objects in this flow.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -230,7 +230,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 4, @@ -256,10 +256,10 @@ "id": "4302ed53", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.656361Z", - "iopub.status.busy": "2024-07-25T06:16:57.656263Z", - "iopub.status.idle": "2024-07-25T06:16:57.667437Z", - "shell.execute_reply": "2024-07-25T06:16:57.667078Z" + "iopub.execute_input": "2025-08-30T21:15:34.226615Z", + "iopub.status.busy": "2025-08-30T21:15:34.226535Z", + "iopub.status.idle": "2025-08-30T21:15:34.237013Z", + "shell.execute_reply": "2025-08-30T21:15:34.236784Z" } }, "outputs": [ @@ -267,9 +267,9 @@ "data": { "text/html": [ "\n", - "

class Run pathspec[source]

A `Run` represents an execution of a `Flow`. It is a container of `Step`s.

Attributes
----------
data : MetaflowData
    A shortcut to `run['end'].task.data`, i.e. the data produced by this run.
successful : bool
    True if the run completed successfully.
finished : bool
    True if the run completed.
finished_at : datetime
    Time this run finished.
code : MetaflowCode
    Code package for this run (if present). See `MetaflowCode`.
trigger : MetaflowTrigger
    Information about event(s) that triggered this run (if present). See `MetaflowTrigger`.
end_task : Task
    `Task` for the end step (if it is present already).

\n", + "

class Run pathspec[source]

A `Run` represents an execution of a `Flow`. It is a container of `Step`s.

Attributes
----------
data : MetaflowData
    A shortcut to `run['end'].task.data`, i.e. the data produced by this run.
successful : bool
    True if the run completed successfully.
finished : bool
    True if the run completed.
finished_at : datetime
    Time this run finished.
code : MetaflowCode
    Code package for this run (if present). See `MetaflowCode`.
trigger : MetaflowTrigger
    Information about event(s) that triggered this run (if present). See `MetaflowTrigger`.
end_task : Task
    `Task` for the end step (if it is present already).

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -286,7 +286,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 5, @@ -304,10 +304,10 @@ "id": "5d0ad8c7", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.670190Z", - "iopub.status.busy": "2024-07-25T06:16:57.670055Z", - "iopub.status.idle": "2024-07-25T06:16:57.672698Z", - "shell.execute_reply": "2024-07-25T06:16:57.672414Z" + "iopub.execute_input": "2025-08-30T21:15:34.238902Z", + "iopub.status.busy": "2025-08-30T21:15:34.238813Z", + "iopub.status.idle": "2025-08-30T21:15:34.241179Z", + "shell.execute_reply": "2025-08-30T21:15:34.240959Z" } }, "outputs": [ @@ -315,9 +315,9 @@ "data": { "text/html": [ "\n", - "

method Run.add_tag (self, tag: str)[source]

Add a tag to this `Run`.

Note that if the tag is already a system tag, it is not added as a user tag,
and no error is thrown.

Parameters
----------
tag : str
    Tag to add.

\n", + "

method Run.add_tag (self, tag: str)[source]

Add a tag to this `Run`.

Note that if the tag is already a system tag, it is not added as a user tag,
and no error is thrown.

Parameters
----------
tag : str
    Tag to add.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -328,7 +328,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 6, @@ -346,10 +346,10 @@ "id": "037da6e3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.675043Z", - "iopub.status.busy": "2024-07-25T06:16:57.674938Z", - "iopub.status.idle": "2024-07-25T06:16:57.677501Z", - "shell.execute_reply": "2024-07-25T06:16:57.677260Z" + "iopub.execute_input": "2025-08-30T21:15:34.242875Z", + "iopub.status.busy": "2025-08-30T21:15:34.242797Z", + "iopub.status.idle": "2025-08-30T21:15:34.244842Z", + "shell.execute_reply": "2025-08-30T21:15:34.244649Z" } }, "outputs": [ @@ -357,9 +357,9 @@ "data": { "text/html": [ "\n", - "

method Run.add_tags (self, tags: Iterable[str])[source]

Add one or more tags to this `Run`.

Note that if any tag is already a system tag, it is not added as a user tag
and no error is thrown.

Parameters
----------
tags : Iterable[str]
    Tags to add.

\n", + "

method Run.add_tags (self, tags: Iterable[str])[source]

Add one or more tags to this `Run`.

Note that if any tag is already a system tag, it is not added as a user tag
and no error is thrown.

Parameters
----------
tags : Iterable[str]
    Tags to add.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -370,7 +370,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 7, @@ -388,10 +388,10 @@ "id": "0ebd8693", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.679621Z", - "iopub.status.busy": "2024-07-25T06:16:57.679535Z", - "iopub.status.idle": "2024-07-25T06:16:57.682013Z", - "shell.execute_reply": "2024-07-25T06:16:57.681753Z" + "iopub.execute_input": "2025-08-30T21:15:34.246887Z", + "iopub.status.busy": "2025-08-30T21:15:34.246804Z", + "iopub.status.idle": "2025-08-30T21:15:34.248806Z", + "shell.execute_reply": "2025-08-30T21:15:34.248624Z" } }, "outputs": [ @@ -399,9 +399,9 @@ "data": { "text/html": [ "\n", - "

method Run.remove_tag (self, tag: str)[source]

Remove one tag from this `Run`.

Removing a system tag is an error. Removing a non-existent
user tag is a no-op.

Parameters
----------
tag : str
    Tag to remove.

\n", + "

method Run.remove_tag (self, tag: str)[source]

Remove one tag from this `Run`.

Removing a system tag is an error. Removing a non-existent
user tag is a no-op.

Parameters
----------
tag : str
    Tag to remove.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -412,7 +412,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 8, @@ -430,10 +430,10 @@ "id": "0dd3c17a", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.683864Z", - "iopub.status.busy": "2024-07-25T06:16:57.683778Z", - "iopub.status.idle": "2024-07-25T06:16:57.686125Z", - "shell.execute_reply": "2024-07-25T06:16:57.685886Z" + "iopub.execute_input": "2025-08-30T21:15:34.250993Z", + "iopub.status.busy": "2025-08-30T21:15:34.250920Z", + "iopub.status.idle": "2025-08-30T21:15:34.252901Z", + "shell.execute_reply": "2025-08-30T21:15:34.252691Z" } }, "outputs": [ @@ -441,9 +441,9 @@ "data": { "text/html": [ "\n", - "

method Run.remove_tags (self, tags: Iterable[str])[source]

Remove one or more tags from this `Run`.

Removing a system tag will result in an error. Removing a non-existent
user tag is a no-op.

Parameters
----------
tags : Iterable[str]
    Tags to remove.

\n", + "

method Run.remove_tags (self, tags: Iterable[str])[source]

Remove one or more tags from this `Run`.

Removing a system tag will result in an error. Removing a non-existent
user tag is a no-op.

Parameters
----------
tags : Iterable[str]
    Tags to remove.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -454,7 +454,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 9, @@ -472,10 +472,10 @@ "id": "2e97ed7e", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.688756Z", - "iopub.status.busy": "2024-07-25T06:16:57.688669Z", - "iopub.status.idle": "2024-07-25T06:16:57.691195Z", - "shell.execute_reply": "2024-07-25T06:16:57.690885Z" + "iopub.execute_input": "2025-08-30T21:15:34.255209Z", + "iopub.status.busy": "2025-08-30T21:15:34.255117Z", + "iopub.status.idle": "2025-08-30T21:15:34.257286Z", + "shell.execute_reply": "2025-08-30T21:15:34.257079Z" } }, "outputs": [ @@ -483,9 +483,9 @@ "data": { "text/html": [ "\n", - "

method Run.replace_tag (self, tag_to_remove: str, tag_to_add: str)[source]

Remove a tag and add a tag atomically. Removal is done first.
The rules for `Run.add_tag` and `Run.remove_tag` also apply here.

Parameters
----------
tag_to_remove : str
    Tag to remove.
tag_to_add : str
    Tag to add.

\n", + "

method Run.replace_tag (self, tag_to_remove: str, tag_to_add: str)[source]

Remove a tag and add a tag atomically. Removal is done first.
The rules for `Run.add_tag` and `Run.remove_tag` also apply here.
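
How these rules combine can be sketched with a small in-memory model. This is an illustration only, not Metaflow's code: the `TagSet` class is hypothetical, and `ValueError` is a placeholder for whatever exception the client actually raises.

```python
class TagSet:
    # Hypothetical in-memory model of the documented tag rules; not Metaflow code.
    def __init__(self, system_tags, user_tags=()):
        self.system_tags = frozenset(system_tags)
        self.user_tags = set(user_tags)

    def replace_tags(self, tags_to_remove, tags_to_add):
        to_remove = set(tags_to_remove)
        # Removing a system tag is an error (exception type is an assumption).
        if to_remove & self.system_tags:
            raise ValueError('cannot remove system tags')
        # Removal happens first; removing a non-existent user tag is a no-op.
        updated = self.user_tags - to_remove
        # Tags that are already system tags are skipped without an error.
        updated |= set(tags_to_add) - self.system_tags
        # Assign once at the end so a failed call changes nothing.
        self.user_tags = updated


tags = TagSet(system_tags={'user:alice'}, user_tags={'draft'})
tags.replace_tags(['draft', 'missing'], ['reviewed', 'user:alice'])
after_first = set(tags.user_tags)

# Because removal is done first, a tag in both lists ends up present.
tags.replace_tags(['reviewed'], ['reviewed'])
after_second = set(tags.user_tags)
```

The single assignment at the end mimics the all-or-nothing behavior that the word "atomically" implies; a real backend would have to enforce this against concurrent writers as well.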

Parameters
----------
tag_to_remove : str
    Tag to remove.
tag_to_add : str
    Tag to add.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -497,7 +497,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 10, @@ -515,10 +515,10 @@ "id": "782ee581", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.693713Z", - "iopub.status.busy": "2024-07-25T06:16:57.693614Z", - "iopub.status.idle": "2024-07-25T06:16:57.696060Z", - "shell.execute_reply": "2024-07-25T06:16:57.695825Z" + "iopub.execute_input": "2025-08-30T21:15:34.259876Z", + "iopub.status.busy": "2025-08-30T21:15:34.259753Z", + "iopub.status.idle": "2025-08-30T21:15:34.262200Z", + "shell.execute_reply": "2025-08-30T21:15:34.261987Z" } }, "outputs": [ @@ -526,9 +526,9 @@ "data": { "text/html": [ "\n", - "

method Run.replace_tags (self, tags_to_remove: Iterable[str], tags_to_add: Iterable[str])[source]

Remove and add tags atomically; the removal is done first.
The rules for `Run.add_tag` and `Run.remove_tag` also apply here.

Parameters
----------
tags_to_remove : Iterable[str]
    Tags to remove.
tags_to_add : Iterable[str]
    Tags to add.

\n", + "

method Run.replace_tags (self, tags_to_remove: Iterable[str], tags_to_add: Iterable[str])[source]

Remove and add tags atomically; the removal is done first.
The rules for `Run.add_tag` and `Run.remove_tag` also apply here.

Parameters
----------
tags_to_remove : Iterable[str]
    Tags to remove.
tags_to_add : Iterable[str]
    Tags to add.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -540,7 +540,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 11, @@ -566,10 +566,10 @@ "id": "e56c7d2d", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.698624Z", - "iopub.status.busy": "2024-07-25T06:16:57.698526Z", - "iopub.status.idle": "2024-07-25T06:16:57.708117Z", - "shell.execute_reply": "2024-07-25T06:16:57.707865Z" + "iopub.execute_input": "2025-08-30T21:15:34.264307Z", + "iopub.status.busy": "2025-08-30T21:15:34.264231Z", + "iopub.status.idle": "2025-08-30T21:15:34.292867Z", + "shell.execute_reply": "2025-08-30T21:15:34.292628Z" } }, "outputs": [ @@ -577,9 +577,9 @@ "data": { "text/html": [ "\n", - "

class Step pathspec[source]

A `Step` represents a user-defined step, that is, a method annotated with the `@step` decorator.

It contains `Task` objects associated with the step, that is, all executions of the
`Step`. The step may contain multiple `Task`s in the case of a foreach step.

Attributes
----------
task : Task
    The first `Task` object in this step. This is a shortcut for retrieving the only
    task contained in a non-foreach step.
finished_at : datetime
    Time when the latest `Task` of this step finished. Note that in the case of foreaches,
    this time may change during execution of the step.
environment_info : Dict[str, Any]
    Information about the execution environment.

\n", + "

class Step pathspec[source]

A `Step` represents a user-defined step, that is, a method annotated with the `@step` decorator.

It contains `Task` objects associated with the step, that is, all executions of the
`Step`. The step may contain multiple `Task`s in the case of a foreach step.

Attributes
----------
task : Task
    The first `Task` object in this step. This is a shortcut for retrieving the only
    task contained in a non-foreach step.
finished_at : datetime
    Time when the latest `Task` of this step finished. Note that in the case of foreaches,
    this time may change during execution of the step.
environment_info : Dict[str, Any]
    Information about the execution environment.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -592,7 +592,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 12, @@ -618,10 +618,10 @@ "id": "43db4359", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.710327Z", - "iopub.status.busy": "2024-07-25T06:16:57.710242Z", - "iopub.status.idle": "2024-07-25T06:16:57.721477Z", - "shell.execute_reply": "2024-07-25T06:16:57.721224Z" + "iopub.execute_input": "2025-08-30T21:15:34.295342Z", + "iopub.status.busy": "2025-08-30T21:15:34.295232Z", + "iopub.status.idle": "2025-08-30T21:15:34.307456Z", + "shell.execute_reply": "2025-08-30T21:15:34.307239Z" } }, "outputs": [ @@ -629,9 +629,9 @@ "data": { "text/html": [ "\n", - "

class Task pathspec, attempt=None[source]

A `Task` represents an execution of a `Step`.

It contains all `DataArtifact` objects produced by the task as
well as metadata related to execution.

Note that the `@retry` decorator may cause multiple attempts of
the task to be present. Usually you want the latest attempt, which
is what instantiating a `Task` object returns by default. If
you need to e.g. retrieve logs from a failed attempt, you can
explicitly get information about a specific attempt by using the
following syntax when creating a task:

`Task('flow/run/step/task', attempt=<attempt>)`

where `attempt=0` corresponds to the first attempt etc.

Attributes
----------
metadata : List[Metadata]
    List of all metadata events associated with the task.
metadata_dict : Dict[str, str]
    A condensed version of `metadata`: a dictionary whose keys are the names of
    metadata events and whose values are the latest corresponding events.
data : MetaflowData
    Container of all data artifacts produced by this task. Note that this
    call downloads all data locally, so it can be slower than accessing
    artifacts individually. See `MetaflowData` for more information.
artifacts : MetaflowArtifacts
    Container of `DataArtifact` objects produced by this task.
successful : bool
    True if the task completed successfully.
finished : bool
    True if the task completed.
exception : object
    Exception raised by this task if there was one.
finished_at : datetime
    Time this task finished.
runtime_name : str
    Runtime this task was executed on.
stdout : str
    Standard output for the task execution.
stderr : str
    Standard error output for the task execution.
code : MetaflowCode
    Code package for this task (if present). See `MetaflowCode`.
environment_info : Dict[str, str]
    Information about the execution environment.

\n", + "

class Task pathspec, attempt=None[source]

A `Task` represents an execution of a `Step`.

It contains all `DataArtifact` objects produced by the task as
well as metadata related to execution.

Note that the `@retry` decorator may cause multiple attempts of
the task to be present. Usually you want the latest attempt, which
is what instantiating a `Task` object returns by default. If
you need to e.g. retrieve logs from a failed attempt, you can
explicitly get information about a specific attempt by using the
following syntax when creating a task:

`Task('flow/run/step/task', attempt=<attempt>)`

where `attempt=0` corresponds to the first attempt etc.

Attributes
----------
metadata : List[Metadata]
    List of all metadata events associated with the task.
metadata_dict : Dict[str, str]
    A condensed version of `metadata`: a dictionary whose keys are the names of
    metadata events and whose values are the latest corresponding events.
data : MetaflowData
    Container of all data artifacts produced by this task. Note that this
    call downloads all data locally, so it can be slower than accessing
    artifacts individually. See `MetaflowData` for more information.
artifacts : MetaflowArtifacts
    Container of `DataArtifact` objects produced by this task.
successful : bool
    True if the task completed successfully.
finished : bool
    True if the task completed.
exception : object
    Exception raised by this task if there was one.
finished_at : datetime
    Time this task finished.
runtime_name : str
    Runtime this task was executed on.
stdout : str
    Standard output for the task execution.
stderr : str
    Standard error output for the task execution.
code : MetaflowCode
    Code package for this task (if present). See `MetaflowCode`.
environment_info : Dict[str, str]
    Information about the execution environment.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -654,7 +654,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 13, @@ -672,10 +672,10 @@ "id": "fdb93b31", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.723708Z", - "iopub.status.busy": "2024-07-25T06:16:57.723614Z", - "iopub.status.idle": "2024-07-25T06:16:57.726284Z", - "shell.execute_reply": "2024-07-25T06:16:57.726052Z" + "iopub.execute_input": "2025-08-30T21:15:34.309351Z", + "iopub.status.busy": "2025-08-30T21:15:34.309246Z", + "iopub.status.idle": "2025-08-30T21:15:34.312013Z", + "shell.execute_reply": "2025-08-30T21:15:34.311807Z" } }, "outputs": [ @@ -683,9 +683,9 @@ "data": { "text/html": [ "\n", - "

method Task.loglines (self, stream: str, as_unicode: bool = True, meta_dict: Optional[Dict[str, Any]] = None) -> Iterator[Tuple[datetime.datetime, str]][source]

Return an iterator over (utc_timestamp, logline) tuples.

Parameters
----------
stream : str
    Either 'stdout' or 'stderr'.
as_unicode : bool, default: True
    If as_unicode=False, each logline is returned as a bytes object. Otherwise,
    it is returned as a (unicode) string.

Yields
------
Tuple[datetime, str]
    Tuple of timestamp, logline pairs.

\n", + "

method Task.loglines (self, stream: str, as_unicode: bool = True, meta_dict: Optional[Dict[str, Any]] = None) -> Iterator[Tuple[datetime.datetime, str]][source]

Return an iterator over (utc_timestamp, logline) tuples.

Parameters
----------
stream : str
    Either 'stdout' or 'stderr'.
as_unicode : bool, default: True
    If as_unicode=False, each logline is returned as a bytes object. Otherwise,
    it is returned as a (unicode) string.

Yields
------
Tuple[datetime, str]
    Tuple of timestamp, logline pairs.
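
Because the iterator yields `(timestamp, logline)` tuples, it can be filtered like any other iterable. The following is a minimal sketch using stand-in data; `lines_since` is a hypothetical helper, and the sample timestamps are naive datetimes, which is an assumption about what the real iterator yields.

```python
from datetime import datetime
from typing import Iterable, Iterator, Tuple

LogLine = Tuple[datetime, str]


def lines_since(loglines: Iterable[LogLine], cutoff: datetime) -> Iterator[str]:
    """Yield the text of log lines whose timestamp is at or after `cutoff`."""
    for timestamp, line in loglines:
        if timestamp >= cutoff:
            yield line


# Stand-in data; with Metaflow this would come from Task(...).loglines('stderr').
sample = [
    (datetime(2025, 1, 1, 12, 0), "starting"),
    (datetime(2025, 1, 1, 12, 5), "retrying download"),
]
recent = list(lines_since(sample, datetime(2025, 1, 1, 12, 1)))
print(recent)  # ['retrying download']
```

The cutoff should have the same timezone awareness as the timestamps being compared, since Python refuses to compare naive and aware datetimes.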

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -700,7 +700,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 14, @@ -726,10 +726,10 @@ "id": "5dad835f", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.728300Z", - "iopub.status.busy": "2024-07-25T06:16:57.728224Z", - "iopub.status.idle": "2024-07-25T06:16:57.756648Z", - "shell.execute_reply": "2024-07-25T06:16:57.756377Z" + "iopub.execute_input": "2025-08-30T21:15:34.314169Z", + "iopub.status.busy": "2025-08-30T21:15:34.314096Z", + "iopub.status.idle": "2025-08-30T21:15:34.322538Z", + "shell.execute_reply": "2025-08-30T21:15:34.322307Z" } }, "outputs": [ @@ -737,9 +737,9 @@ "data": { "text/html": [ "\n", - "

class DataArtifact pathspec[source]

A single data artifact and associated metadata. Note that this object does
not contain other objects as it is the leaf object in the hierarchy.

Attributes
----------
data : object
    The data contained in this artifact, that is, the object produced during
    execution of this run.
sha : string
    A unique ID of this artifact.
finished_at : datetime
    Corresponds roughly to the `Task.finished_at` time of the parent `Task`.
    An alias for `DataArtifact.created_at`.

\n", + "

class DataArtifact pathspec[source]

A single data artifact and associated metadata. Note that this object does
not contain other objects as it is the leaf object in the hierarchy.

Attributes
----------
data : object
    The data contained in this artifact, that is, the object produced during
    execution of this run.
sha : string
    A unique ID of this artifact.
finished_at : datetime
    Corresponds roughly to the `Task.finished_at` time of the parent `Task`.
    An alias for `DataArtifact.created_at`.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -752,7 +752,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 15, @@ -786,10 +786,10 @@ "id": "7c6ef99c", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.759463Z", - "iopub.status.busy": "2024-07-25T06:16:57.759351Z", - "iopub.status.idle": "2024-07-25T06:16:57.767882Z", - "shell.execute_reply": "2024-07-25T06:16:57.767632Z" + "iopub.execute_input": "2025-08-30T21:15:34.324196Z", + "iopub.status.busy": "2025-08-30T21:15:34.324134Z", + "iopub.status.idle": "2025-08-30T21:15:34.332486Z", + "shell.execute_reply": "2025-08-30T21:15:34.332290Z" } }, "outputs": [ @@ -797,9 +797,9 @@ "data": { "text/html": [ "\n", - "

class MetaflowData [source]

Container of data artifacts produced by a `Task`. This object is
instantiated through `Task.data`.

`MetaflowData` allows results to be retrieved by their name
through a convenient dot notation:

```python
Task(...).data.my_object
```

You can also test for the existence of an object:

```python
if 'my_object' in Task(...).data:
    print('my_object found')
```

Note that this container relies on the local cache to load all data
artifacts. If your `Task` contains a lot of data, a more efficient
approach is to load artifacts individually like so

```
Task(...)['my_object'].data
```

\n", + "

class MetaflowData [source]

Container of data artifacts produced by a `Task`. This object is
instantiated through `Task.data`.

`MetaflowData` allows results to be retrieved by their name
through a convenient dot notation:

```python
Task(...).data.my_object
```

You can also test for the existence of an object:

```python
if 'my_object' in Task(...).data:
    print('my_object found')
```

Note that this container relies on the local cache to load all data
artifacts. If your `Task` contains a lot of data, a more efficient
approach is to load artifacts individually like so

```
Task(...)['my_object'].data
```

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -808,7 +808,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 16, @@ -837,10 +837,10 @@ "id": "5f454538", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.770085Z", - "iopub.status.busy": "2024-07-25T06:16:57.769991Z", - "iopub.status.idle": "2024-07-25T06:16:57.777954Z", - "shell.execute_reply": "2024-07-25T06:16:57.777737Z" + "iopub.execute_input": "2025-08-30T21:15:34.334413Z", + "iopub.status.busy": "2025-08-30T21:15:34.334334Z", + "iopub.status.idle": "2025-08-30T21:15:34.342310Z", + "shell.execute_reply": "2025-08-30T21:15:34.342123Z" } }, "outputs": [ @@ -848,9 +848,9 @@ "data": { "text/html": [ "\n", - "

class MetaflowCode [source]

Snapshot of the code used to execute this `Run`. Instantiate the object through
`Run(...).code` (if any step is executed remotely) or `Task(...).code` for an
individual task. The code package is the same for all steps of a `Run`.

`MetaflowCode` includes a package of the user-defined `FlowSpec` class and supporting
files, as well as a snapshot of the Metaflow library itself.

Currently, `MetaflowCode` objects are stored only for `Run`s that have at least one `Step`
executing outside the user's local environment.

The `TarFile` for the `Run` is given by `Run(...).code.tarball`

Attributes
----------
path : str
    Location (in the datastore provider) of the code package.
info : Dict[str, str]
    Dictionary of information related to this code-package.
flowspec : str
    Source code of the file containing the `FlowSpec` in this code package.
tarball : TarFile
    Python standard library `tarfile.TarFile` archive containing all the code.

\n", + "

class MetaflowCode [source]

Snapshot of the code used to execute this `Run`. Instantiate the object through
`Run(...).code` (if any step is executed remotely) or `Task(...).code` for an
individual task. The code package is the same for all steps of a `Run`.

`MetaflowCode` includes a package of the user-defined `FlowSpec` class and supporting
files, as well as a snapshot of the Metaflow library itself.

Currently, `MetaflowCode` objects are stored only for `Run`s that have at least one `Step`
executing outside the user's local environment.

The `TarFile` for the `Run` is given by `Run(...).code.tarball`

Attributes
----------
path : str
    Location (in the datastore provider) of the code package.
info : Dict[str, str]
    Dictionary of information related to this code-package.
flowspec : str
    Source code of the file containing the `FlowSpec` in this code package.
tarball : TarFile
    Python standard library `tarfile.TarFile` archive containing all the code.
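
Since `tarball` is a standard `tarfile.TarFile`, it can be inspected with the standard library alone. The sketch below lists the Python files in an archive; `python_files` is a hypothetical helper, and the in-memory archive is a stand-in for `Run(...).code.tarball`.

```python
import io
import tarfile
from typing import List


def python_files(tarball: tarfile.TarFile) -> List[str]:
    """Return the sorted names of regular `.py` files in the archive."""
    return sorted(
        member.name
        for member in tarball.getmembers()
        if member.isfile() and member.name.endswith(".py")
    )


# Build a small in-memory archive as a stand-in for a real code package.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
    for name, body in [("flow.py", b"print('hi')\n"), ("README.txt", b"notes\n")]:
        info = tarfile.TarInfo(name=name)
        info.size = len(body)
        archive.addfile(info, io.BytesIO(body))
buffer.seek(0)

with tarfile.open(fileobj=buffer, mode="r:gz") as demo:
    names = python_files(demo)
print(names)  # ['flow.py']
```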

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -864,7 +864,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 17, @@ -906,10 +906,10 @@ "id": "b3817f2e", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.780159Z", - "iopub.status.busy": "2024-07-25T06:16:57.780077Z", - "iopub.status.idle": "2024-07-25T06:16:57.782016Z", - "shell.execute_reply": "2024-07-25T06:16:57.781773Z" + "iopub.execute_input": "2025-08-30T21:15:34.344395Z", + "iopub.status.busy": "2025-08-30T21:15:34.344323Z", + "iopub.status.idle": "2025-08-30T21:15:34.346104Z", + "shell.execute_reply": "2025-08-30T21:15:34.345924Z" } }, "outputs": [ @@ -930,7 +930,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 18, @@ -948,10 +948,10 @@ "id": "9192a187", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.784284Z", - "iopub.status.busy": "2024-07-25T06:16:57.784201Z", - "iopub.status.idle": "2024-07-25T06:16:57.785987Z", - "shell.execute_reply": "2024-07-25T06:16:57.785765Z" + "iopub.execute_input": "2025-08-30T21:15:34.348085Z", + "iopub.status.busy": "2025-08-30T21:15:34.348008Z", + "iopub.status.idle": "2025-08-30T21:15:34.349803Z", + "shell.execute_reply": "2025-08-30T21:15:34.349627Z" } }, "outputs": [ @@ -972,7 +972,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 19, @@ -990,10 +990,10 @@ "id": "97c20808", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.788415Z", - "iopub.status.busy": "2024-07-25T06:16:57.788332Z", - "iopub.status.idle": "2024-07-25T06:16:57.790307Z", - "shell.execute_reply": "2024-07-25T06:16:57.790084Z" + "iopub.execute_input": "2025-08-30T21:15:34.351773Z", + "iopub.status.busy": "2025-08-30T21:15:34.351703Z", + "iopub.status.idle": "2025-08-30T21:15:34.353443Z", + "shell.execute_reply": "2025-08-30T21:15:34.353245Z" } }, "outputs": [ @@ -1014,7 +1014,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 20, @@ -1032,10 +1032,10 @@ "id": "ba6912f9", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.792604Z", - "iopub.status.busy": "2024-07-25T06:16:57.792522Z", - "iopub.status.idle": "2024-07-25T06:16:57.794500Z", - "shell.execute_reply": "2024-07-25T06:16:57.794291Z" + "iopub.execute_input": "2025-08-30T21:15:34.355344Z", + "iopub.status.busy": "2025-08-30T21:15:34.355268Z", + "iopub.status.idle": "2025-08-30T21:15:34.357097Z", + "shell.execute_reply": "2025-08-30T21:15:34.356915Z" } }, "outputs": [ @@ -1056,7 +1056,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 21, @@ -1074,10 +1074,10 @@ "id": "dc2fb830", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.796609Z", - "iopub.status.busy": "2024-07-25T06:16:57.796507Z", - "iopub.status.idle": "2024-07-25T06:16:57.799713Z", - "shell.execute_reply": "2024-07-25T06:16:57.799471Z" + "iopub.execute_input": "2025-08-30T21:15:34.359290Z", + "iopub.status.busy": "2025-08-30T21:15:34.359208Z", + "iopub.status.idle": "2025-08-30T21:15:34.362117Z", + "shell.execute_reply": "2025-08-30T21:15:34.361899Z" } }, "outputs": [ @@ -1098,7 +1098,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 22, @@ -1126,10 +1126,10 @@ "id": "249a8772", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.801634Z", - "iopub.status.busy": "2024-07-25T06:16:57.801535Z", - "iopub.status.idle": "2024-07-25T06:16:57.804565Z", - "shell.execute_reply": "2024-07-25T06:16:57.804334Z" + "iopub.execute_input": "2025-08-30T21:15:34.363785Z", + "iopub.status.busy": "2025-08-30T21:15:34.363708Z", + "iopub.status.idle": "2025-08-30T21:15:34.366285Z", + "shell.execute_reply": "2025-08-30T21:15:34.366095Z" } }, "outputs": [ @@ -1153,7 +1153,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 23, @@ -1179,10 +1179,10 @@ "id": "a8ff9d8a", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.806732Z", - "iopub.status.busy": "2024-07-25T06:16:57.806634Z", - "iopub.status.idle": "2024-07-25T06:16:57.809064Z", - "shell.execute_reply": "2024-07-25T06:16:57.808806Z" + "iopub.execute_input": "2025-08-30T21:15:34.368385Z", + "iopub.status.busy": "2025-08-30T21:15:34.368310Z", + "iopub.status.idle": "2025-08-30T21:15:34.370367Z", + "shell.execute_reply": "2025-08-30T21:15:34.370185Z" } }, "outputs": [ @@ -1190,9 +1190,9 @@ "data": { "text/html": [ "\n", - "

function namespace (ns: Optional[str]) -> Optional[str][source]

metaflow

Switch namespace to the one provided.

This call has a global effect. No objects outside this namespace
will be accessible. To access all objects regardless of namespaces,
pass None to this call.

Parameters
----------
ns : str, optional
    Namespace to switch to or None to ignore namespaces.

Returns
-------
str, optional
    Namespace set (result of get_namespace()).

\n", + "

function namespace (ns: Optional[str]) -> Optional[str][source]

metaflow

Switch namespace to the one provided.

This call has a global effect. No objects outside this namespace
will be accessible. To access all objects regardless of namespaces,
pass None to this call.

Parameters
----------
ns : str, optional
    Namespace to switch to or None to ignore namespaces.

Returns
-------
str, optional
    Namespace set (result of get_namespace()).
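
Because the switch is global, it can be convenient to restore the previous namespace once a block of code is done. The sketch below shows one way to do that with a context manager; `temporary_namespace` is a hypothetical helper that takes the getter and setter as arguments, and the `fake_*` functions are stand-ins so the example runs without Metaflow.

```python
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, Optional


@contextmanager
def temporary_namespace(
    get_ns: Callable[[], Optional[str]],
    set_ns: Callable[[Optional[str]], Optional[str]],
    ns: Optional[str],
) -> Iterator[None]:
    """Switch to `ns` for the duration of the block, then restore the old value."""
    previous = get_ns()
    set_ns(ns)
    try:
        yield
    finally:
        # Restore even if the block raised an exception.
        set_ns(previous)


# Stand-ins that mimic the documented signatures of get_namespace() and namespace().
_state: Dict[str, Optional[str]] = {"ns": "user:alice"}


def fake_get_namespace() -> Optional[str]:
    return _state["ns"]


def fake_namespace(ns: Optional[str]) -> Optional[str]:
    _state["ns"] = ns
    return ns


with temporary_namespace(fake_get_namespace, fake_namespace, None):
    inside = fake_get_namespace()
after = fake_get_namespace()
print(inside, after)  # None user:alice
```

With Metaflow installed, the same helper would presumably be called with `get_namespace` and `namespace` imported from `metaflow` in place of the stand-ins.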

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -1206,7 +1206,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 24, @@ -1226,10 +1226,10 @@ "id": "864e244f", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.811079Z", - "iopub.status.busy": "2024-07-25T06:16:57.810986Z", - "iopub.status.idle": "2024-07-25T06:16:57.813253Z", - "shell.execute_reply": "2024-07-25T06:16:57.813026Z" + "iopub.execute_input": "2025-08-30T21:15:34.372382Z", + "iopub.status.busy": "2025-08-30T21:15:34.372311Z", + "iopub.status.idle": "2025-08-30T21:15:34.374187Z", + "shell.execute_reply": "2025-08-30T21:15:34.373997Z" } }, "outputs": [ @@ -1237,9 +1237,9 @@ "data": { "text/html": [ "\n", - "

function get_namespace () -> Optional[str][source]

Return the namespace that is currently being used to filter objects.

The namespace is a tag associated with all objects in Metaflow.

Returns
-------
str, optional
    The current namespace used to filter objects.

\n", + "

function get_namespace () -> Optional[str][source]

Return the namespace that is currently being used to filter objects.

The namespace is a tag associated with all objects in Metaflow.

Returns
-------
str, optional
    The current namespace used to filter objects.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -1250,7 +1250,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 25, @@ -1268,10 +1268,10 @@ "id": "af4a31de", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.815463Z", - "iopub.status.busy": "2024-07-25T06:16:57.815378Z", - "iopub.status.idle": "2024-07-25T06:16:57.817523Z", - "shell.execute_reply": "2024-07-25T06:16:57.817314Z" + "iopub.execute_input": "2025-08-30T21:15:34.376172Z", + "iopub.status.busy": "2025-08-30T21:15:34.376103Z", + "iopub.status.idle": "2025-08-30T21:15:34.377994Z", + "shell.execute_reply": "2025-08-30T21:15:34.377792Z" } }, "outputs": [ @@ -1279,9 +1279,9 @@ "data": { "text/html": [ "\n", - "

function default_namespace () -> str[source]

Resets the namespace used to filter objects to the default one, i.e. the one that was
used prior to any `namespace` calls.

Returns
-------
str
    The result of get_namespace() after the namespace has been reset.

\n", + "

function default_namespace () -> str[source]

Resets the namespace used to filter objects to the default one, i.e. the one that was
used prior to any `namespace` calls.

Returns
-------
str
    The result of get_namespace() after the namespace has been reset.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -1292,7 +1292,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 26, @@ -1318,10 +1318,10 @@ "id": "4b040aaf", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.819632Z", - "iopub.status.busy": "2024-07-25T06:16:57.819555Z", - "iopub.status.idle": "2024-07-25T06:16:57.822062Z", - "shell.execute_reply": "2024-07-25T06:16:57.821854Z" + "iopub.execute_input": "2025-08-30T21:15:34.379743Z", + "iopub.status.busy": "2025-08-30T21:15:34.379670Z", + "iopub.status.idle": "2025-08-30T21:15:34.381764Z", + "shell.execute_reply": "2025-08-30T21:15:34.381588Z" } }, "outputs": [ @@ -1329,9 +1329,9 @@ "data": { "text/html": [ "\n", - "

function metadata (ms: str) -> str[source]

Switch Metadata provider.

This call has a global effect. Selecting the local metadata will,
for example, not allow access to information stored in remote
metadata providers.

Note that you don't typically have to call this function directly. Usually
the metadata provider is set through the Metaflow configuration file. If you
need to switch between multiple providers, you can use the `METAFLOW_PROFILE`
environment variable to switch between configurations.

Parameters
----------
ms : str
    Can be a path (selects local metadata), a URL starting with http (selects
    the service metadata) or an explicit specification <metadata type>@<info>; as an
    example, you can specify local@<path> or service@<url>.

Returns
-------
str
    The description of the metadata selected (equivalent to the result of
    get_metadata()).

\n", + "

function metadata (ms: str) -> str[source]

Switch Metadata provider.

This call has a global effect. Selecting the local metadata will,
for example, not allow access to information stored in remote
metadata providers.

Note that you don't typically have to call this function directly. Usually
the metadata provider is set through the Metaflow configuration file. If you
need to switch between multiple providers, you can use the `METAFLOW_PROFILE`
environment variable to switch between configurations.

Parameters
----------
ms : str
    Can be a path (selects local metadata), a URL starting with http (selects
    the service metadata) or an explicit specification <metadata type>@<info>; as an
    example, you can specify local@<path> or service@<url>.

Returns
-------
str
    The description of the metadata selected (equivalent to the result of
    get_metadata()).

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -1345,7 +1345,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 27, @@ -1365,10 +1365,10 @@ "id": "940db9bf", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.824057Z", - "iopub.status.busy": "2024-07-25T06:16:57.823981Z", - "iopub.status.idle": "2024-07-25T06:16:57.826138Z", - "shell.execute_reply": "2024-07-25T06:16:57.825906Z" + "iopub.execute_input": "2025-08-30T21:15:34.383755Z", + "iopub.status.busy": "2025-08-30T21:15:34.383679Z", + "iopub.status.idle": "2025-08-30T21:15:34.385722Z", + "shell.execute_reply": "2025-08-30T21:15:34.385540Z" } }, "outputs": [ @@ -1376,9 +1376,9 @@ "data": { "text/html": [ "\n", - "

function get_metadata () -> str[source]

Returns the current Metadata provider.

If this is not set explicitly using `metadata`, the default value is
determined through the Metaflow configuration. You can use this call to
check that your configuration is set up properly.

If multiple configuration profiles are present, this call returns the one
selected through the `METAFLOW_PROFILE` environment variable.

Returns
-------
str
    Information about the currently selected Metadata provider. This typically
    includes provider-specific details (such as the URL for remote providers or the
    local path for local providers).

\n", + "

function get_metadata () -> str[source]

Returns the current Metadata provider.

If this is not set explicitly using `metadata`, the default value is
determined through the Metaflow configuration. You can use this call to
check that your configuration is set up properly.

If multiple configuration profiles are present, this call returns the one
selected through the `METAFLOW_PROFILE` environment variable.

Returns
-------
str
    Information about the currently selected Metadata provider. This typically
    includes provider-specific details (such as the URL for remote providers or the
    local path for local providers).

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -1389,7 +1389,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 28, @@ -1407,10 +1407,10 @@ "id": "4fa4b981", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.828277Z", - "iopub.status.busy": "2024-07-25T06:16:57.828192Z", - "iopub.status.idle": "2024-07-25T06:16:57.830401Z", - "shell.execute_reply": "2024-07-25T06:16:57.830156Z" + "iopub.execute_input": "2025-08-30T21:15:34.387335Z", + "iopub.status.busy": "2025-08-30T21:15:34.387263Z", + "iopub.status.idle": "2025-08-30T21:15:34.389237Z", + "shell.execute_reply": "2025-08-30T21:15:34.389046Z" } }, "outputs": [ @@ -1418,9 +1418,9 @@ "data": { "text/html": [ "\n", - "

function default_metadata () -> str[source]

Resets the Metadata provider to the default value, that is, to the value
that was used prior to any `metadata` calls.

Returns
-------
str
    The result of get_metadata() after resetting the provider.

\n", + "

function default_metadata () -> str[source]

Resets the Metadata provider to the default value, that is, to the value
that was used prior to any `metadata` calls.

Returns
-------
str
    The result of get_metadata() after resetting the provider.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -1431,7 +1431,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 29, diff --git a/docs/api/client.md b/docs/api/client.md index ccbc679e..7bc31a2f 100644 --- a/docs/api/client.md +++ b/docs/api/client.md @@ -57,9 +57,9 @@ This module accesses all objects through the current metadata provider - either ### Metaflow - + - + @@ -71,7 +71,7 @@ This module accesses all objects through the current metadata provider - either ### Flow - + @@ -84,7 +84,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -101,7 +101,7 @@ This module accesses all objects through the current metadata provider - either ### Run - + @@ -119,7 +119,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -131,7 +131,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -143,7 +143,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -155,7 +155,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -167,7 +167,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -180,7 +180,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -195,7 +195,7 @@ This module accesses all objects through the current metadata provider - either ### Step - + @@ -211,7 +211,7 @@ This module accesses all objects through the current metadata provider - either ### Task - + @@ -235,7 +235,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -253,7 +253,7 @@ This module accesses all objects through the current metadata provider - either ### DataArtifact - + @@ -271,7 +271,7 @@ This module accesses all objects through the current metadata provider - either ### MetaflowData - + @@ -283,7 +283,7 @@ This module accesses all objects through the current metadata provider - either ### 
MetaflowCode - + @@ -383,7 +383,7 @@ This module accesses all objects through the current metadata provider - either ## Namespace functions - + @@ -398,7 +398,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -410,7 +410,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -424,7 +424,7 @@ This module accesses all objects through the current metadata provider - either ## Metadata functions - + @@ -439,7 +439,7 @@ This module accesses all objects through the current metadata provider - either - + @@ -451,7 +451,7 @@ This module accesses all objects through the current metadata provider - either - + diff --git a/docs/api/current.ipynb b/docs/api/current.ipynb index a038653d..62104611 100644 --- a/docs/api/current.ipynb +++ b/docs/api/current.ipynb @@ -24,10 +24,10 @@ "id": "d4c02781", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:55.863821Z", - "iopub.status.busy": "2024-07-25T06:16:55.863687Z", - "iopub.status.idle": "2024-07-25T06:16:56.135122Z", - "shell.execute_reply": "2024-07-25T06:16:56.134781Z" + "iopub.execute_input": "2025-08-30T21:15:33.477483Z", + "iopub.status.busy": "2025-08-30T21:15:33.477378Z", + "iopub.status.idle": "2025-08-30T21:15:33.691974Z", + "shell.execute_reply": "2025-08-30T21:15:33.691701Z" } }, "outputs": [], @@ -182,10 +182,10 @@ "id": "7fdb7184", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.137589Z", - "iopub.status.busy": "2024-07-25T06:16:56.137412Z", - "iopub.status.idle": "2024-07-25T06:16:56.141437Z", - "shell.execute_reply": "2024-07-25T06:16:56.141114Z" + "iopub.execute_input": "2025-08-30T21:15:33.694080Z", + "iopub.status.busy": "2025-08-30T21:15:33.693946Z", + "iopub.status.idle": "2025-08-30T21:15:33.697285Z", + "shell.execute_reply": "2025-08-30T21:15:33.697054Z" } }, "outputs": [ @@ -204,7 +204,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, @@ -222,10 +222,10 @@ "id": 
"fa61bd9b", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.143388Z", - "iopub.status.busy": "2024-07-25T06:16:56.143306Z", - "iopub.status.idle": "2024-07-25T06:16:56.145573Z", - "shell.execute_reply": "2024-07-25T06:16:56.145253Z" + "iopub.execute_input": "2025-08-30T21:15:33.699015Z", + "iopub.status.busy": "2025-08-30T21:15:33.698947Z", + "iopub.status.idle": "2025-08-30T21:15:33.700861Z", + "shell.execute_reply": "2025-08-30T21:15:33.700644Z" } }, "outputs": [ @@ -244,7 +244,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 3, @@ -262,10 +262,10 @@ "id": "16a90815", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.147932Z", - "iopub.status.busy": "2024-07-25T06:16:56.147823Z", - "iopub.status.idle": "2024-07-25T06:16:56.150134Z", - "shell.execute_reply": "2024-07-25T06:16:56.149852Z" + "iopub.execute_input": "2025-08-30T21:15:33.702968Z", + "iopub.status.busy": "2025-08-30T21:15:33.702879Z", + "iopub.status.idle": "2025-08-30T21:15:33.704809Z", + "shell.execute_reply": "2025-08-30T21:15:33.704599Z" } }, "outputs": [ @@ -284,7 +284,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 4, @@ -302,10 +302,10 @@ "id": "f278667f", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.152471Z", - "iopub.status.busy": "2024-07-25T06:16:56.152367Z", - "iopub.status.idle": "2024-07-25T06:16:56.154533Z", - "shell.execute_reply": "2024-07-25T06:16:56.154202Z" + "iopub.execute_input": "2025-08-30T21:15:33.706887Z", + "iopub.status.busy": "2025-08-30T21:15:33.706808Z", + "iopub.status.idle": "2025-08-30T21:15:33.708595Z", + "shell.execute_reply": "2025-08-30T21:15:33.708410Z" } }, "outputs": [ @@ -324,7 +324,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 5, @@ -342,10 +342,10 @@ "id": "f4be98ad", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.157036Z", - "iopub.status.busy": "2024-07-25T06:16:56.156950Z", - "iopub.status.idle": 
"2024-07-25T06:16:56.158869Z", - "shell.execute_reply": "2024-07-25T06:16:56.158635Z" + "iopub.execute_input": "2025-08-30T21:15:33.710806Z", + "iopub.status.busy": "2025-08-30T21:15:33.710718Z", + "iopub.status.idle": "2025-08-30T21:15:33.712612Z", + "shell.execute_reply": "2025-08-30T21:15:33.712370Z" } }, "outputs": [ @@ -364,7 +364,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 6, @@ -382,10 +382,10 @@ "id": "a97ad305", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.161044Z", - "iopub.status.busy": "2024-07-25T06:16:56.160962Z", - "iopub.status.idle": "2024-07-25T06:16:56.163040Z", - "shell.execute_reply": "2024-07-25T06:16:56.162796Z" + "iopub.execute_input": "2025-08-30T21:15:33.714609Z", + "iopub.status.busy": "2025-08-30T21:15:33.714522Z", + "iopub.status.idle": "2025-08-30T21:15:33.716373Z", + "shell.execute_reply": "2025-08-30T21:15:33.716154Z" } }, "outputs": [ @@ -404,7 +404,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 7, @@ -422,10 +422,10 @@ "id": "b73e163c", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.165295Z", - "iopub.status.busy": "2024-07-25T06:16:56.165198Z", - "iopub.status.idle": "2024-07-25T06:16:56.167437Z", - "shell.execute_reply": "2024-07-25T06:16:56.167116Z" + "iopub.execute_input": "2025-08-30T21:15:33.718313Z", + "iopub.status.busy": "2025-08-30T21:15:33.718223Z", + "iopub.status.idle": "2025-08-30T21:15:33.720087Z", + "shell.execute_reply": "2025-08-30T21:15:33.719896Z" } }, "outputs": [ @@ -444,7 +444,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 8, @@ -462,10 +462,10 @@ "id": "16dd0a74", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.169732Z", - "iopub.status.busy": "2024-07-25T06:16:56.169604Z", - "iopub.status.idle": "2024-07-25T06:16:56.171888Z", - "shell.execute_reply": "2024-07-25T06:16:56.171650Z" + "iopub.execute_input": "2025-08-30T21:15:33.722134Z", + "iopub.status.busy": 
"2025-08-30T21:15:33.722045Z", + "iopub.status.idle": "2025-08-30T21:15:33.723923Z", + "shell.execute_reply": "2025-08-30T21:15:33.723719Z" } }, "outputs": [ @@ -484,7 +484,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 9, @@ -502,10 +502,10 @@ "id": "8f758f03", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.174219Z", - "iopub.status.busy": "2024-07-25T06:16:56.174131Z", - "iopub.status.idle": "2024-07-25T06:16:56.176053Z", - "shell.execute_reply": "2024-07-25T06:16:56.175835Z" + "iopub.execute_input": "2025-08-30T21:15:33.726059Z", + "iopub.status.busy": "2025-08-30T21:15:33.725968Z", + "iopub.status.idle": "2025-08-30T21:15:33.727802Z", + "shell.execute_reply": "2025-08-30T21:15:33.727602Z" } }, "outputs": [ @@ -524,7 +524,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 10, @@ -542,10 +542,10 @@ "id": "feaa6830", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.178305Z", - "iopub.status.busy": "2024-07-25T06:16:56.178226Z", - "iopub.status.idle": "2024-07-25T06:16:56.180109Z", - "shell.execute_reply": "2024-07-25T06:16:56.179858Z" + "iopub.execute_input": "2025-08-30T21:15:33.729842Z", + "iopub.status.busy": "2025-08-30T21:15:33.729766Z", + "iopub.status.idle": "2025-08-30T21:15:33.731528Z", + "shell.execute_reply": "2025-08-30T21:15:33.731338Z" } }, "outputs": [ @@ -564,7 +564,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 11, @@ -582,10 +582,10 @@ "id": "66c47292", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.182576Z", - "iopub.status.busy": "2024-07-25T06:16:56.182485Z", - "iopub.status.idle": "2024-07-25T06:16:56.184502Z", - "shell.execute_reply": "2024-07-25T06:16:56.184224Z" + "iopub.execute_input": "2025-08-30T21:15:33.733498Z", + "iopub.status.busy": "2025-08-30T21:15:33.733429Z", + "iopub.status.idle": "2025-08-30T21:15:33.735157Z", + "shell.execute_reply": "2025-08-30T21:15:33.734963Z" } }, "outputs": [ @@ -604,7 +604,7 @@ "" ], 
"text/plain": [ - "" + "" ] }, "execution_count": 12, @@ -642,10 +642,10 @@ "id": "437b253d", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.187401Z", - "iopub.status.busy": "2024-07-25T06:16:56.187297Z", - "iopub.status.idle": "2024-07-25T06:16:56.189596Z", - "shell.execute_reply": "2024-07-25T06:16:56.189229Z" + "iopub.execute_input": "2025-08-30T21:15:33.737099Z", + "iopub.status.busy": "2025-08-30T21:15:33.737026Z", + "iopub.status.idle": "2025-08-30T21:15:33.738815Z", + "shell.execute_reply": "2025-08-30T21:15:33.738589Z" } }, "outputs": [ @@ -664,7 +664,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 13, @@ -682,10 +682,10 @@ "id": "fe5e9e0a", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.194090Z", - "iopub.status.busy": "2024-07-25T06:16:56.193991Z", - "iopub.status.idle": "2024-07-25T06:16:56.196057Z", - "shell.execute_reply": "2024-07-25T06:16:56.195807Z" + "iopub.execute_input": "2025-08-30T21:15:33.740858Z", + "iopub.status.busy": "2025-08-30T21:15:33.740785Z", + "iopub.status.idle": "2025-08-30T21:15:33.742463Z", + "shell.execute_reply": "2025-08-30T21:15:33.742312Z" } }, "outputs": [ @@ -704,7 +704,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 14, @@ -722,10 +722,10 @@ "id": "1097fb92", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.198372Z", - "iopub.status.busy": "2024-07-25T06:16:56.198283Z", - "iopub.status.idle": "2024-07-25T06:16:56.200219Z", - "shell.execute_reply": "2024-07-25T06:16:56.199992Z" + "iopub.execute_input": "2025-08-30T21:15:33.744409Z", + "iopub.status.busy": "2025-08-30T21:15:33.744331Z", + "iopub.status.idle": "2025-08-30T21:15:33.746102Z", + "shell.execute_reply": "2025-08-30T21:15:33.745904Z" } }, "outputs": [ @@ -744,7 +744,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 15, @@ -762,10 +762,10 @@ "id": "836d028b", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.202408Z", - 
"iopub.status.busy": "2024-07-25T06:16:56.202327Z", - "iopub.status.idle": "2024-07-25T06:16:56.204279Z", - "shell.execute_reply": "2024-07-25T06:16:56.204025Z" + "iopub.execute_input": "2025-08-30T21:15:33.748027Z", + "iopub.status.busy": "2025-08-30T21:15:33.747956Z", + "iopub.status.idle": "2025-08-30T21:15:33.749633Z", + "shell.execute_reply": "2025-08-30T21:15:33.749464Z" } }, "outputs": [ @@ -784,7 +784,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 16, @@ -802,10 +802,10 @@ "id": "96348e52", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.206534Z", - "iopub.status.busy": "2024-07-25T06:16:56.206443Z", - "iopub.status.idle": "2024-07-25T06:16:56.208449Z", - "shell.execute_reply": "2024-07-25T06:16:56.208178Z" + "iopub.execute_input": "2025-08-30T21:15:33.751575Z", + "iopub.status.busy": "2025-08-30T21:15:33.751507Z", + "iopub.status.idle": "2025-08-30T21:15:33.753175Z", + "shell.execute_reply": "2025-08-30T21:15:33.752997Z" } }, "outputs": [ @@ -824,7 +824,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 17, @@ -853,10 +853,10 @@ "id": "640602e7", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.210527Z", - "iopub.status.busy": "2024-07-25T06:16:56.210441Z", - "iopub.status.idle": "2024-07-25T06:16:56.212903Z", - "shell.execute_reply": "2024-07-25T06:16:56.212663Z" + "iopub.execute_input": "2025-08-30T21:15:33.755069Z", + "iopub.status.busy": "2025-08-30T21:15:33.754992Z", + "iopub.status.idle": "2025-08-30T21:15:33.757088Z", + "shell.execute_reply": "2025-08-30T21:15:33.756902Z" } }, "outputs": [ @@ -880,7 +880,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 18, @@ -898,10 +898,10 @@ "id": "0709cf75", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.214782Z", - "iopub.status.busy": "2024-07-25T06:16:56.214699Z", - "iopub.status.idle": "2024-07-25T06:16:56.216878Z", - "shell.execute_reply": "2024-07-25T06:16:56.216624Z" + 
"iopub.execute_input": "2025-08-30T21:15:33.759175Z", + "iopub.status.busy": "2025-08-30T21:15:33.759100Z", + "iopub.status.idle": "2025-08-30T21:15:33.761211Z", + "shell.execute_reply": "2025-08-30T21:15:33.760982Z" } }, "outputs": [ @@ -923,7 +923,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 19, @@ -941,10 +941,10 @@ "id": "7ef69ab6", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.218955Z", - "iopub.status.busy": "2024-07-25T06:16:56.218859Z", - "iopub.status.idle": "2024-07-25T06:16:56.220971Z", - "shell.execute_reply": "2024-07-25T06:16:56.220735Z" + "iopub.execute_input": "2025-08-30T21:15:33.763603Z", + "iopub.status.busy": "2025-08-30T21:15:33.763523Z", + "iopub.status.idle": "2025-08-30T21:15:33.765462Z", + "shell.execute_reply": "2025-08-30T21:15:33.765269Z" } }, "outputs": [ @@ -965,7 +965,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 20, @@ -983,10 +983,10 @@ "id": "f667ebcd", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.223189Z", - "iopub.status.busy": "2024-07-25T06:16:56.223109Z", - "iopub.status.idle": "2024-07-25T06:16:56.225221Z", - "shell.execute_reply": "2024-07-25T06:16:56.224988Z" + "iopub.execute_input": "2025-08-30T21:15:33.767715Z", + "iopub.status.busy": "2025-08-30T21:15:33.767622Z", + "iopub.status.idle": "2025-08-30T21:15:33.769603Z", + "shell.execute_reply": "2025-08-30T21:15:33.769420Z" } }, "outputs": [ @@ -1007,7 +1007,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 21, @@ -1025,10 +1025,10 @@ "id": "6d27efa0", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.227442Z", - "iopub.status.busy": "2024-07-25T06:16:56.227348Z", - "iopub.status.idle": "2024-07-25T06:16:56.229481Z", - "shell.execute_reply": "2024-07-25T06:16:56.229231Z" + "iopub.execute_input": "2025-08-30T21:15:33.771806Z", + "iopub.status.busy": "2025-08-30T21:15:33.771729Z", + "iopub.status.idle": "2025-08-30T21:15:33.773607Z", + 
"shell.execute_reply": "2025-08-30T21:15:33.773413Z" } }, "outputs": [ @@ -1047,7 +1047,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 22, @@ -1065,10 +1065,10 @@ "id": "a7914220", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.231731Z", - "iopub.status.busy": "2024-07-25T06:16:56.231634Z", - "iopub.status.idle": "2024-07-25T06:16:56.233757Z", - "shell.execute_reply": "2024-07-25T06:16:56.233513Z" + "iopub.execute_input": "2025-08-30T21:15:33.775686Z", + "iopub.status.busy": "2025-08-30T21:15:33.775603Z", + "iopub.status.idle": "2025-08-30T21:15:33.777598Z", + "shell.execute_reply": "2025-08-30T21:15:33.777388Z" } }, "outputs": [ @@ -1087,7 +1087,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 23, @@ -1116,10 +1116,10 @@ "id": "2ad67aa6", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.235976Z", - "iopub.status.busy": "2024-07-25T06:16:56.235885Z", - "iopub.status.idle": "2024-07-25T06:16:56.237887Z", - "shell.execute_reply": "2024-07-25T06:16:56.237650Z" + "iopub.execute_input": "2025-08-30T21:15:33.779602Z", + "iopub.status.busy": "2025-08-30T21:15:33.779527Z", + "iopub.status.idle": "2025-08-30T21:15:33.781399Z", + "shell.execute_reply": "2025-08-30T21:15:33.781186Z" } }, "outputs": [ @@ -1138,7 +1138,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 24, diff --git a/docs/api/flow-decorators/conda_base.ipynb b/docs/api/flow-decorators/conda_base.ipynb index 7f0fc10e..679a75cf 100644 --- a/docs/api/flow-decorators/conda_base.ipynb +++ b/docs/api/flow-decorators/conda_base.ipynb @@ -18,10 +18,10 @@ "id": "8d5bb116", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:04.514901Z", - "iopub.status.busy": "2024-07-25T06:17:04.514765Z", - "iopub.status.idle": "2024-07-25T06:17:04.822231Z", - "shell.execute_reply": "2024-07-25T06:17:04.821959Z" + "iopub.execute_input": "2025-08-30T21:15:42.204021Z", + "iopub.status.busy": "2025-08-30T21:15:42.203933Z", + 
"iopub.status.idle": "2025-08-30T21:15:42.418765Z", + "shell.execute_reply": "2025-08-30T21:15:42.418440Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "29af6ee3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:04.824518Z", - "iopub.status.busy": "2024-07-25T06:17:04.824364Z", - "iopub.status.idle": "2024-07-25T06:17:04.831032Z", - "shell.execute_reply": "2024-07-25T06:17:04.830771Z" + "iopub.execute_input": "2025-08-30T21:15:42.420813Z", + "iopub.status.busy": "2025-08-30T21:15:42.420687Z", + "iopub.status.idle": "2025-08-30T21:15:42.426806Z", + "shell.execute_reply": "2025-08-30T21:15:42.426571Z" } }, "outputs": [ @@ -51,9 +51,9 @@ "data": { "text/html": [ "\n", - "

decorator @conda_base (...)[source]

metaflow

Specifies the Conda environment for all steps of the flow.

Use `@conda_base` to set common libraries required by all
steps and use `@conda` to specify step-specific additions.

Parameters
----------
packages : Dict[str, str], default {}
    Packages to use for this flow. The key is the name of the package
    and the value is the version to use.
libraries : Dict[str, str], default {}
    Supported for backward compatibility. When used with packages, packages will take precedence.
python : str, optional, default None
    Version of Python to use, e.g. '3.7.4'. A default value of None implies
    that the version used will correspond to the version of the Python interpreter used to start the run.
disabled : bool, default False
    If set to True, disables Conda.

\n", + "

decorator @conda_base (...)[source]

metaflow

Specifies the Conda environment for all steps of the flow.

Use `@conda_base` to set common libraries required by all
steps and use `@conda` to specify step-specific additions.

Parameters
----------
packages : Dict[str, str], default {}
    Packages to use for this flow. The key is the name of the package
    and the value is the version to use.
libraries : Dict[str, str], default {}
    Supported for backward compatibility. When used with packages, packages will take precedence.
python : str, optional, default None
    Version of Python to use, e.g. '3.7.4'. A default value of None implies
    that the version used will correspond to the version of the Python interpreter used to start the run.
disabled : bool, default False
    If set to True, disables Conda.
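
The precedence rule for `libraries` and `packages` can be pictured as a plain dictionary merge. This is a minimal sketch, not Metaflow's implementation: the helper name `effective_packages` is hypothetical and the version strings are made up for illustration.

```python
from typing import Dict, Optional


def effective_packages(
    packages: Optional[Dict[str, str]] = None,
    libraries: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: merge the two arguments so that `packages` wins."""
    # Start from the backward-compatible `libraries` mapping...
    merged = dict(libraries or {})
    # ...then let `packages` override any package named in both.
    merged.update(packages or {})
    return merged


# Illustrative values only; they are not taken from the Metaflow docs.
resolved = effective_packages(
    packages={"pandas": "2.1.0"},
    libraries={"pandas": "1.5.3", "numpy": "1.26.0"},
)
print(resolved)
```

Under this reading, a package listed only in `libraries` is still installed, while a package listed in both takes its version from `packages`.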

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -67,7 +67,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/flow-decorators/conda_base.md b/docs/api/flow-decorators/conda_base.md index fbce3bef..03f6c25a 100644 --- a/docs/api/flow-decorators/conda_base.md +++ b/docs/api/flow-decorators/conda_base.md @@ -7,7 +7,7 @@ The libraries are installed from [Conda repositories](https://anaconda.org/). Fo - + diff --git a/docs/api/flow-decorators/project.ipynb b/docs/api/flow-decorators/project.ipynb index bc7201a0..98be5a36 100644 --- a/docs/api/flow-decorators/project.ipynb +++ b/docs/api/flow-decorators/project.ipynb @@ -22,10 +22,10 @@ "id": "8d5bb116", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:05.193031Z", - "iopub.status.busy": "2024-07-25T06:17:05.192942Z", - "iopub.status.idle": "2024-07-25T06:17:05.451383Z", - "shell.execute_reply": "2024-07-25T06:17:05.451056Z" + "iopub.execute_input": "2025-08-30T21:15:42.661148Z", + "iopub.status.busy": "2025-08-30T21:15:42.661058Z", + "iopub.status.idle": "2025-08-30T21:15:42.884728Z", + "shell.execute_reply": "2025-08-30T21:15:42.884414Z" } }, "outputs": [], @@ -44,10 +44,10 @@ "id": "29af6ee3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:05.453564Z", - "iopub.status.busy": "2024-07-25T06:17:05.453413Z", - "iopub.status.idle": "2024-07-25T06:17:05.460020Z", - "shell.execute_reply": "2024-07-25T06:17:05.459767Z" + "iopub.execute_input": "2025-08-30T21:15:42.886778Z", + "iopub.status.busy": "2025-08-30T21:15:42.886633Z", + "iopub.status.idle": "2025-08-30T21:15:42.892851Z", + "shell.execute_reply": "2025-08-30T21:15:42.892669Z" } }, "outputs": [ @@ -59,9 +59,9 @@ "@@ Returns \n", "------- in \n", "Specifies what flows belong to the same project.\n", - "... in the docstring of ProjectDecorator in /Users/ville/src/madhur-metaflow/metaflow/plugins/project_decorator.py.\n", + "... 
in the docstring of ProjectDecorator in /Users/ville/src/metaflow/metaflow/plugins/project_decorator.py.\n", " warn(msg)\n", - "/Users/ville/mambaforge/envs/docs/lib/python3.11/site-packages/numpydoc/docscrape.py:434: UserWarning: Unknown section Mf Add To Current in the docstring of ProjectDecorator in /Users/ville/src/madhur-metaflow/metaflow/plugins/project_decorator.py.\n", + "/Users/ville/mambaforge/envs/docs/lib/python3.11/site-packages/numpydoc/docscrape.py:434: UserWarning: Unknown section Mf Add To Current in the docstring of ProjectDecorator in /Users/ville/src/metaflow/metaflow/plugins/project_decorator.py.\n", " warn(msg)\n" ] }, @@ -69,7 +69,7 @@ "data": { "text/html": [ "\n", - "

decorator @project (...)[source]

metaflow

Specifies what flows belong to the same project.

A project-specific namespace is created for all flows that
use the same `@project(name)`.

Parameters
----------
name : str
    Project name. Make sure that the name is unique amongst all
    projects that use the same production scheduler. The name may
    contain only lowercase alphanumeric characters and underscores.

MF Add To Current
-----------------
project_name -> str
    The name of the project assigned to this flow, i.e. `X` in `@project(name=X)`.

    @@ Returns
    -------
    str
        Project name.

project_flow_name -> str
    The flow name prefixed with the current project and branch. This name identifies
    the deployment on a production scheduler.

    @@ Returns
    -------
    str
        Flow name prefixed with project information.

branch_name -> str
    The current branch, i.e. `X` in `--branch=X` set during deployment or run.

    @@ Returns
    -------
    str
        Branch name.

is_user_branch -> bool
    True if the flow is deployed without a specific `--branch` or a `--production`
    flag.

    @@ Returns
    -------
    bool
        True if the deployment does not correspond to a specific branch.

is_production -> bool
    True if the flow is deployed with the `--production` flag.

    @@ Returns
    -------
    bool
        True if the flow is deployed with `--production`.

\n", + "

decorator @project (...)[source]

metaflow

Specifies what flows belong to the same project.

A project-specific namespace is created for all flows that
use the same `@project(name)`.

Parameters
----------
name : str
    Project name. Make sure that the name is unique amongst all
    projects that use the same production scheduler. The name may
    contain only lowercase alphanumeric characters and underscores.

branch : Optional[str], default None
    The branch to use. If not specified, the branch is set to
    `user.<username>` unless `production` is set to `True`. This can
    also be set on the command line using `--branch` as a top-level option.
    It is an error to specify `branch` in the decorator and on the command line.

production : bool, default False
    Whether or not the branch is the production branch. This can also be set on the
    command line using `--production` as a top-level option. It is an error to specify
    `production` in the decorator and on the command line.
    The project branch name will be:
      - if `branch` is specified:
        - if `production` is True: `prod.<branch>`
        - if `production` is False: `test.<branch>`
      - if `branch` is not specified:
        - if `production` is True: `prod`
        - if `production` is False: `user.<username>`
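
The naming rules listed above can be written as a small function. This is a sketch under stated assumptions, not Metaflow's own code: it assumes the suffixes are the branch name and the user name, and both the function name `project_branch_name` and the `username` default are hypothetical.

```python
from typing import Optional


def project_branch_name(
    branch: Optional[str] = None,
    production: bool = False,
    username: str = "alice",  # hypothetical user name, for illustration only
) -> str:
    """Hypothetical helper: resolve the project branch name from the rules above."""
    if branch is not None:
        # An explicit branch is prefixed with `prod.` or `test.`.
        prefix = "prod" if production else "test"
        return f"{prefix}.{branch}"
    # Without an explicit branch, production deployments use the bare `prod`
    # branch and everything else falls back to a per-user branch.
    return "prod" if production else f"user.{username}"


for kwargs in (
    {},
    {"production": True},
    {"branch": "staging"},
    {"branch": "staging", "production": True},
):
    print(kwargs, "->", project_branch_name(**kwargs))
```

The four calls cover the four cases in the list, which makes it easy to check a deployment name against the expected branch.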

MF Add To Current
-----------------
project_name -> str
    The name of the project assigned to this flow, i.e. `X` in `@project(name=X)`.

    @@ Returns
    -------
    str
        Project name.

project_flow_name -> str
    The flow name prefixed with the current project and branch. This name identifies
    the deployment on a production scheduler.

    @@ Returns
    -------
    str
        Flow name prefixed with project information.

branch_name -> str
    The current branch, i.e. `X` in `--branch=X` set during deployment or run.

    @@ Returns
    -------
    str
        Branch name.

is_user_branch -> bool
    True if the flow is deployed without a specific `--branch` or a `--production`
    flag.

    @@ Returns
    -------
    bool
        True if the deployment does not correspond to a specific branch.

is_production -> bool
    True if the flow is deployed with the `--production` flag.

    @@ Returns
    -------
    bool
        True if the flow is deployed with `--production`.

\n", "
\n", "\n", "\n", @@ -78,11 +78,13 @@ "\n", "\n", "\t\n", + "\t` unless `production` is set to `True`. This can\\nalso be set on the command line using `--branch` as a top-level option.\\nIt is an error to specify `branch` in the decorator and on the command line.\" />\n", + "\t`\\n - if `production` is False: `test.`\\n - if `branch` is not specified:\\n - if `production` is True: `prod`\\n - if `production` is False: `user.`\" />\n", "\n", "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/flow-decorators/project.md b/docs/api/flow-decorators/project.md index 57990a2c..188ed85e 100644 --- a/docs/api/flow-decorators/project.md +++ b/docs/api/flow-decorators/project.md @@ -18,6 +18,8 @@ For more information, see [Coordinating Larger Metaflow Projects](/production/co + +
diff --git a/docs/api/flow-decorators/schedule.ipynb b/docs/api/flow-decorators/schedule.ipynb index 38f368f5..4d5d3ed4 100644 --- a/docs/api/flow-decorators/schedule.ipynb +++ b/docs/api/flow-decorators/schedule.ipynb @@ -18,10 +18,10 @@ "id": "8d5bb116", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:05.741334Z", - "iopub.status.busy": "2024-07-25T06:17:05.741233Z", - "iopub.status.idle": "2024-07-25T06:17:05.999865Z", - "shell.execute_reply": "2024-07-25T06:17:05.999582Z" + "iopub.execute_input": "2025-08-30T21:15:43.160855Z", + "iopub.status.busy": "2025-08-30T21:15:43.160748Z", + "iopub.status.idle": "2025-08-30T21:15:43.396555Z", + "shell.execute_reply": "2025-08-30T21:15:43.396248Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "29af6ee3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:06.002165Z", - "iopub.status.busy": "2024-07-25T06:17:06.002025Z", - "iopub.status.idle": "2024-07-25T06:17:06.007063Z", - "shell.execute_reply": "2024-07-25T06:17:06.006812Z" + "iopub.execute_input": "2025-08-30T21:15:43.399331Z", + "iopub.status.busy": "2025-08-30T21:15:43.399179Z", + "iopub.status.idle": "2025-08-30T21:15:43.404109Z", + "shell.execute_reply": "2025-08-30T21:15:43.403864Z" } }, "outputs": [ @@ -68,7 +68,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/flow-decorators/trigger.ipynb b/docs/api/flow-decorators/trigger.ipynb index 42e6c8b1..3a25a4b6 100644 --- a/docs/api/flow-decorators/trigger.ipynb +++ b/docs/api/flow-decorators/trigger.ipynb @@ -18,10 +18,10 @@ "id": "8d5bb116", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:06.204233Z", - "iopub.status.busy": "2024-07-25T06:17:06.204133Z", - "iopub.status.idle": "2024-07-25T06:17:06.488531Z", - "shell.execute_reply": "2024-07-25T06:17:06.488256Z" + "iopub.execute_input": "2025-08-30T21:15:43.661779Z", + "iopub.status.busy": "2025-08-30T21:15:43.661665Z", + "iopub.status.idle": "2025-08-30T21:15:43.884519Z", + "shell.execute_reply": "2025-08-30T21:15:43.884202Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "29af6ee3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:06.490772Z", - "iopub.status.busy": "2024-07-25T06:17:06.490640Z", - "iopub.status.idle": "2024-07-25T06:17:06.497814Z", - "shell.execute_reply": "2024-07-25T06:17:06.497586Z" + "iopub.execute_input": "2025-08-30T21:15:43.886677Z", + "iopub.status.busy": "2025-08-30T21:15:43.886528Z", + "iopub.status.idle": "2025-08-30T21:15:43.895643Z", + "shell.execute_reply": "2025-08-30T21:15:43.895399Z" } }, "outputs": [ @@ -55,9 +55,9 @@ "@@ Returns \n", "------- in \n", "Specifies the event(s) that this flow depends on.\n", - "... in the docstring of TriggerDecorator in /Users/ville/src/madhur-metaflow/metaflow/plugins/events_decorator.py.\n", + "... 
in the docstring of TriggerDecorator in /Users/ville/src/metaflow/metaflow/plugins/events_decorator.py.\n", " warn(msg)\n", - "/Users/ville/mambaforge/envs/docs/lib/python3.11/site-packages/numpydoc/docscrape.py:434: UserWarning: Unknown section Mf Add To Current in the docstring of TriggerDecorator in /Users/ville/src/madhur-metaflow/metaflow/plugins/events_decorator.py.\n", + "/Users/ville/mambaforge/envs/docs/lib/python3.11/site-packages/numpydoc/docscrape.py:434: UserWarning: Unknown section Mf Add To Current in the docstring of TriggerDecorator in /Users/ville/src/metaflow/metaflow/plugins/events_decorator.py.\n", " warn(msg)\n" ] }, @@ -65,9 +65,9 @@ "data": { "text/html": [ "\n", - "

decorator @trigger (...)[source]

metaflow

Specifies the event(s) that this flow depends on.

```
@trigger(event='foo')
```
or
```
@trigger(events=['foo', 'bar'])
```

Additionally, you can specify the parameter mappings
to map event payload to Metaflow parameters for the flow.
```
@trigger(event={'name':'foo', 'parameters':{'flow_param': 'event_field'}})
```
or
```
@trigger(events=[{'name':'foo', 'parameters':{'flow_param_1': 'event_field_1'}},
                 {'name':'bar', 'parameters':{'flow_param_2': 'event_field_2'}}])
```

'parameters' can also be a list of strings and tuples like so:
```
@trigger(event={'name':'foo', 'parameters':['common_name', ('flow_param', 'event_field')]})
```
This is equivalent to:
```
@trigger(event={'name':'foo', 'parameters':{'common_name': 'common_name', 'flow_param': 'event_field'}})
```

Parameters
----------
event : Union[str, Dict[str, Any]], optional, default None
    Event dependency for this flow.
events : List[Union[str, Dict[str, Any]]], default []
    Events dependency for this flow.
options : Dict[str, Any], default {}
    Backend-specific configuration for tuning eventing behavior.

MF Add To Current
-----------------
trigger -> metaflow.events.Trigger
    Returns `Trigger` if the current run is triggered by an event

    @@ Returns
    -------
    Trigger
        `Trigger` if triggered by an event

\n", + "

decorator @trigger (...)[source]

metaflow

Specifies the event(s) that this flow depends on.

```
@trigger(event='foo')
```
or
```
@trigger(events=['foo', 'bar'])
```

Additionally, you can specify the parameter mappings
to map event payload to Metaflow parameters for the flow.
```
@trigger(event={'name':'foo', 'parameters':{'flow_param': 'event_field'}})
```
or
```
@trigger(events=[{'name':'foo', 'parameters':{'flow_param_1': 'event_field_1'}},
                 {'name':'bar', 'parameters':{'flow_param_2': 'event_field_2'}}])
```

'parameters' can also be a list of strings and tuples like so:
```
@trigger(event={'name':'foo', 'parameters':['common_name', ('flow_param', 'event_field')]})
```
This is equivalent to:
```
@trigger(event={'name':'foo', 'parameters':{'common_name': 'common_name', 'flow_param': 'event_field'}})
```
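
The equivalence between the list form and the dictionary form of 'parameters' can be sketched as a normalization step. This is an illustration only: the helper `normalize_parameters` is hypothetical and is not part of the Metaflow API.

```python
from typing import Dict, List, Tuple, Union

ParameterSpec = Union[Dict[str, str], List[Union[str, Tuple[str, str]]]]


def normalize_parameters(parameters: ParameterSpec) -> Dict[str, str]:
    """Hypothetical helper: map flow parameter names to event field names."""
    if isinstance(parameters, dict):
        # Already in {flow_param: event_field} form; return a copy.
        return dict(parameters)
    mapping: Dict[str, str] = {}
    for item in parameters:
        if isinstance(item, str):
            # A bare string means the flow parameter and event field share a name.
            mapping[item] = item
        else:
            # A tuple is (flow_param, event_field).
            flow_param, event_field = item
            mapping[flow_param] = event_field
    return mapping


print(normalize_parameters(["common_name", ("flow_param", "event_field")]))
```

Passing a dictionary through the same function returns an equal dictionary, so both spellings end up in one canonical shape.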

Parameters
----------
event : Union[str, Dict[str, Any]], optional, default None
    Event dependency for this flow.
events : List[Union[str, Dict[str, Any]]], default []
    Events dependency for this flow.
options : Dict[str, Any], default {}
    Backend-specific configuration for tuning eventing behavior.

MF Add To Current
-----------------
trigger -> metaflow.events.Trigger
    Returns `Trigger` if the current run is triggered by an event

    @@ Returns
    -------
    Trigger
        `Trigger` if triggered by an event

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -80,7 +80,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/flow-decorators/trigger.md b/docs/api/flow-decorators/trigger.md index 4bc09c42..ae7fa99d 100644 --- a/docs/api/flow-decorators/trigger.md +++ b/docs/api/flow-decorators/trigger.md @@ -7,7 +7,7 @@ Read more in [Triggering Flows Based on External Events](/production/event-trigg - + diff --git a/docs/api/flow-decorators/trigger_on_finish.ipynb b/docs/api/flow-decorators/trigger_on_finish.ipynb index 2af718b6..a4edb18e 100644 --- a/docs/api/flow-decorators/trigger_on_finish.ipynb +++ b/docs/api/flow-decorators/trigger_on_finish.ipynb @@ -18,10 +18,10 @@ "id": "8d5bb116", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:06.568582Z", - "iopub.status.busy": "2024-07-25T06:17:06.568499Z", - "iopub.status.idle": "2024-07-25T06:17:06.834174Z", - "shell.execute_reply": "2024-07-25T06:17:06.833777Z" + "iopub.execute_input": "2025-08-30T21:15:44.163175Z", + "iopub.status.busy": "2025-08-30T21:15:44.162638Z", + "iopub.status.idle": "2025-08-30T21:15:44.387772Z", + "shell.execute_reply": "2025-08-30T21:15:44.387508Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "29af6ee3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:06.836450Z", - "iopub.status.busy": "2024-07-25T06:17:06.836311Z", - "iopub.status.idle": "2024-07-25T06:17:06.843909Z", - "shell.execute_reply": "2024-07-25T06:17:06.843568Z" + "iopub.execute_input": "2025-08-30T21:15:44.389878Z", + "iopub.status.busy": "2025-08-30T21:15:44.389685Z", + "iopub.status.idle": "2025-08-30T21:15:44.397918Z", + "shell.execute_reply": "2025-08-30T21:15:44.397662Z" } }, "outputs": [ @@ -55,9 +55,9 @@ "@@ Returns \n", "------- in \n", "Specifies the flow(s) that this flow depends on.\n", - "... in the docstring of TriggerOnFinishDecorator in /Users/ville/src/madhur-metaflow/metaflow/plugins/events_decorator.py.\n", + "... 
in the docstring of TriggerOnFinishDecorator in /Users/ville/src/metaflow/metaflow/plugins/events_decorator.py.\n", " warn(msg)\n", - "/Users/ville/mambaforge/envs/docs/lib/python3.11/site-packages/numpydoc/docscrape.py:434: UserWarning: Unknown section Mf Add To Current in the docstring of TriggerOnFinishDecorator in /Users/ville/src/madhur-metaflow/metaflow/plugins/events_decorator.py.\n", + "/Users/ville/mambaforge/envs/docs/lib/python3.11/site-packages/numpydoc/docscrape.py:434: UserWarning: Unknown section Mf Add To Current in the docstring of TriggerOnFinishDecorator in /Users/ville/src/metaflow/metaflow/plugins/events_decorator.py.\n", " warn(msg)\n" ] }, @@ -65,9 +65,9 @@ "data": { "text/html": [ "\n", - "

decorator @trigger_on_finish (...)[source]

metaflow

Specifies the flow(s) that this flow depends on.

```
@trigger_on_finish(flow='FooFlow')
```
or
```
@trigger_on_finish(flows=['FooFlow', 'BarFlow'])
```
This decorator respects the @project decorator and triggers the flow
when upstream runs within the same namespace complete successfully.

Additionally, you can specify project-aware upstream flow dependencies
by specifying the fully qualified project_flow_name.
```
@trigger_on_finish(flow='my_project.branch.my_branch.FooFlow')
```
or
```
@trigger_on_finish(flows=['my_project.branch.my_branch.FooFlow', 'BarFlow'])
```

You can also specify just the project or project branch (other values will be
inferred from the current project or project branch):
```
@trigger_on_finish(flow={\"name\": \"FooFlow\", \"project\": \"my_project\", \"project_branch\": \"branch\"})
```

Note that `branch` is typically one of:
  - `prod`
  - `user.bob`
  - `test.my_experiment`
  - `prod.staging`

Parameters
----------
flow : Union[str, Dict[str, str]], optional, default None
    Upstream flow dependency for this flow.
flows : List[Union[str, Dict[str, str]]], default []
    Upstream flow dependencies for this flow.
options : Dict[str, Any], default {}
    Backend-specific configuration for tuning eventing behavior.

MF Add To Current
-----------------
trigger -> metaflow.events.Trigger
    Returns `Trigger` if the current run is triggered by an event

    @@ Returns
    -------
    Trigger
        `Trigger` if triggered by an event

\n", + "

decorator @trigger_on_finish (...)[source]

metaflow

Specifies the flow(s) that this flow depends on.

```
@trigger_on_finish(flow='FooFlow')
```
or
```
@trigger_on_finish(flows=['FooFlow', 'BarFlow'])
```
This decorator respects the @project decorator and triggers the flow
when upstream runs within the same namespace complete successfully.

Additionally, you can specify project-aware upstream flow dependencies
by specifying the fully qualified project_flow_name.
```
@trigger_on_finish(flow='my_project.branch.my_branch.FooFlow')
```
or
```
@trigger_on_finish(flows=['my_project.branch.my_branch.FooFlow', 'BarFlow'])
```

You can also specify just the project or project branch (other values will be
inferred from the current project or project branch):
```
@trigger_on_finish(flow={\"name\": \"FooFlow\", \"project\": \"my_project\", \"project_branch\": \"branch\"})
```

Note that `branch` is typically one of:
  - `prod`
  - `user.bob`
  - `test.my_experiment`
  - `prod.staging`

Parameters
----------
flow : Union[str, Dict[str, str]], optional, default None
    Upstream flow dependency for this flow.
flows : List[Union[str, Dict[str, str]]], default []
    Upstream flow dependencies for this flow.
options : Dict[str, Any], default {}
    Backend-specific configuration for tuning eventing behavior.

MF Add To Current
-----------------
trigger -> metaflow.events.Trigger
    Returns `Trigger` if the current run is triggered by an event

    @@ Returns
    -------
    Trigger
        `Trigger` if triggered by an event

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -80,7 +80,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/flow-decorators/trigger_on_finish.md b/docs/api/flow-decorators/trigger_on_finish.md index be654615..c1a795d2 100644 --- a/docs/api/flow-decorators/trigger_on_finish.md +++ b/docs/api/flow-decorators/trigger_on_finish.md @@ -7,7 +7,7 @@ Read more in [Triggering Flows Based on Other Flows](/production/event-triggerin - + diff --git a/docs/api/flowspec.ipynb b/docs/api/flowspec.ipynb index b0511480..77ba864a 100644 --- a/docs/api/flowspec.ipynb +++ b/docs/api/flowspec.ipynb @@ -46,10 +46,10 @@ "id": "a5ef9454", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:58.863742Z", - "iopub.status.busy": "2024-07-25T06:16:58.863656Z", - "iopub.status.idle": "2024-07-25T06:16:59.136317Z", - "shell.execute_reply": "2024-07-25T06:16:59.136010Z" + "iopub.execute_input": "2025-08-30T21:15:34.465794Z", + "iopub.status.busy": "2025-08-30T21:15:34.465697Z", + "iopub.status.idle": "2025-08-30T21:15:34.685321Z", + "shell.execute_reply": "2025-08-30T21:15:34.685068Z" } }, "outputs": [ @@ -96,10 +96,10 @@ "id": "5ff62112", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.138672Z", - "iopub.status.busy": "2024-07-25T06:16:59.138509Z", - "iopub.status.idle": "2024-07-25T06:16:59.143884Z", - "shell.execute_reply": "2024-07-25T06:16:59.143633Z" + "iopub.execute_input": "2025-08-30T21:15:34.687580Z", + "iopub.status.busy": "2025-08-30T21:15:34.687431Z", + "iopub.status.idle": "2025-08-30T21:15:34.693074Z", + "shell.execute_reply": "2025-08-30T21:15:34.692839Z" } }, "outputs": [ @@ -107,13 +107,13 @@ "data": { "text/html": [ "\n", - "

method FlowSpec.next *dsts, foreach=None[source]

Indicates the next step to execute after this step has completed.

This statement should appear as the last statement of each step, except
the end step.

There are several valid formats to specify the next step:

- Straight-line connection: `self.next(self.next_step)` where `next_step` is a method in
  the current class decorated with the `@step` decorator.

- Static fan-out connection: `self.next(self.step1, self.step2, ...)` where `stepX` are
  methods in the current class decorated with the `@step` decorator.

- Foreach branch:
  ```
  self.next(self.foreach_step, foreach='foreach_iterator')
  ```
  In this situation, `foreach_step` is a method in the current class decorated with the
  `@step` decorator and `foreach_iterator` is a variable name in the current class that
  evaluates to an iterator. A task will be launched for each value in the iterator and
  each task will execute the code specified by the step `foreach_step`.

Parameters
----------
dsts : Callable[..., None]
    One or more methods annotated with `@step`.

Raises
------
InvalidNextException
    Raised if the format of the arguments does not match one of the ones given above.

\n", + "

method FlowSpec.next *dsts, foreach=None, condition=None[source]

Indicates the next step to execute after this step has completed.

This statement should appear as the last statement of each step, except
the end step.

There are several valid formats to specify the next step:

- Straight-line connection: `self.next(self.next_step)` where `next_step` is a method in
  the current class decorated with the `@step` decorator.

- Static fan-out connection: `self.next(self.step1, self.step2, ...)` where `stepX` are
  methods in the current class decorated with the `@step` decorator.

- Foreach branch:
  ```
  self.next(self.foreach_step, foreach='foreach_iterator')
  ```
  In this situation, `foreach_step` is a method in the current class decorated with the
  `@step` decorator and `foreach_iterator` is a variable name in the current class that
  evaluates to an iterator. A task will be launched for each value in the iterator and
  each task will execute the code specified by the step `foreach_step`.

- Switch statement:
  ```
  self.next({\"case1\": self.step_a, \"case2\": self.step_b}, condition='condition_variable')
  ```
  In this situation, `step_a` and `step_b` are methods in the current class decorated
  with the `@step` decorator and `condition_variable` is a variable name in the current
  class. The value of the condition variable determines which step to execute. If the
  value doesn't match any of the dictionary keys, a RuntimeError is raised.

Parameters
----------
dsts : Callable[..., None]
    One or more methods annotated with `@step`.

Raises
------
InvalidNextException
    Raised if the format of the arguments does not match one of the ones given above.

\n", "
\n", - "\n", + "\n", "\n", - "\n", + "\n", "\n", - "\n", + "\n", "\n", "\t\n", "\n", @@ -123,7 +123,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, @@ -132,7 +132,7 @@ } ], "source": [ - "ShowDoc(FlowSpec.next, spoofstr=('*dsts, foreach=None'))" + "ShowDoc(FlowSpec.next, spoofstr=('*dsts, foreach=None, condition=None'))" ] }, { @@ -151,10 +151,10 @@ "id": "64fcbf33", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.145679Z", - "iopub.status.busy": "2024-07-25T06:16:59.145585Z", - "iopub.status.idle": "2024-07-25T06:16:59.147692Z", - "shell.execute_reply": "2024-07-25T06:16:59.147438Z" + "iopub.execute_input": "2025-08-30T21:15:34.694981Z", + "iopub.status.busy": "2025-08-30T21:15:34.694886Z", + "iopub.status.idle": "2025-08-30T21:15:34.696944Z", + "shell.execute_reply": "2025-08-30T21:15:34.696668Z" } }, "outputs": [ @@ -173,7 +173,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 3, @@ -191,10 +191,10 @@ "id": "13dacb8d", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.149936Z", - "iopub.status.busy": "2024-07-25T06:16:59.149848Z", - "iopub.status.idle": "2024-07-25T06:16:59.152064Z", - "shell.execute_reply": "2024-07-25T06:16:59.151791Z" + "iopub.execute_input": "2025-08-30T21:15:34.699341Z", + "iopub.status.busy": "2025-08-30T21:15:34.699243Z", + "iopub.status.idle": "2025-08-30T21:15:34.701394Z", + "shell.execute_reply": "2025-08-30T21:15:34.701143Z" } }, "outputs": [ @@ -213,7 +213,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 4, @@ -231,10 +231,10 @@ "id": "684b4601", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.154283Z", - "iopub.status.busy": "2024-07-25T06:16:59.154195Z", - "iopub.status.idle": "2024-07-25T06:16:59.156811Z", - "shell.execute_reply": "2024-07-25T06:16:59.156579Z" + "iopub.execute_input": "2025-08-30T21:15:34.703589Z", + "iopub.status.busy": "2025-08-30T21:15:34.703503Z", + "iopub.status.idle": "2025-08-30T21:15:34.705892Z", + "shell.execute_reply": "2025-08-30T21:15:34.705672Z" } }, "outputs": [ @@ -242,9 +242,9 @@ "data": { "text/html": [ "\n", - "

method FlowSpec.foreach_stack (self) -> Optional[List[Tuple[int, int, Any]]][source]

Returns the current stack of foreach indexes and values for the current step.

Use this information to understand what data is being processed in the current
foreach branch. For example, consider the following code:
```
@step
def root(self):
    self.split_1 = ['a', 'b', 'c']
    self.next(self.nest_1, foreach='split_1')

@step
def nest_1(self):
    self.split_2 = ['d', 'e', 'f', 'g']
    self.next(self.nest_2, foreach='split_2')

@step
def nest_2(self):
    foo = self.foreach_stack()
```

`foo` will take the following values in the various tasks for nest_2:
```
    [(0, 3, 'a'), (0, 4, 'd')]
    [(0, 3, 'a'), (1, 4, 'e')]
    ...
    [(0, 3, 'a'), (3, 4, 'g')]
    [(1, 3, 'b'), (0, 4, 'd')]
    ...
```
where each tuple corresponds to:

- The index of the task for that level of the loop.
- The number of splits for that level of the loop.
- The value for that level of the loop.

Note that the last tuple returned in a task corresponds to:

- 1st element: value returned by `self.index`.
- 3rd element: value returned by `self.input`.

Returns
-------
List[Tuple[int, int, Any]]
    An array describing the current stack of foreach steps.

\n", + "

method FlowSpec.foreach_stack (self) -> Optional[List[Tuple[int, int, Any]]][source]

Returns the current stack of foreach indexes and values for the current step.

Use this information to understand what data is being processed in the current
foreach branch. For example, consider the following code:
```
@step
def root(self):
    self.split_1 = ['a', 'b', 'c']
    self.next(self.nest_1, foreach='split_1')

@step
def nest_1(self):
    self.split_2 = ['d', 'e', 'f', 'g']
    self.next(self.nest_2, foreach='split_2')

@step
def nest_2(self):
    foo = self.foreach_stack()
```

`foo` will take the following values in the various tasks for nest_2:
```
    [(0, 3, 'a'), (0, 4, 'd')]
    [(0, 3, 'a'), (1, 4, 'e')]
    ...
    [(0, 3, 'a'), (3, 4, 'g')]
    [(1, 3, 'b'), (0, 4, 'd')]
    ...
```
where each tuple corresponds to:

- The index of the task for that level of the loop.
- The number of splits for that level of the loop.
- The value for that level of the loop.

Note that the last tuple returned in a task corresponds to:

- 1st element: value returned by `self.index`.
- 3rd element: value returned by `self.input`.

Returns
-------
List[Tuple[int, int, Any]]
    An array describing the current stack of foreach steps.
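
The stack values listed above can be reproduced with a small standalone simulation. This sketch assumes every outer branch iterates over the same inner list, as in the example; `simulate_foreach_stacks` is a hypothetical helper and does not call Metaflow.

```python
def simulate_foreach_stacks(*levels):
    """Enumerate (index, num_splits, value) stacks for nested foreach levels.

    Hypothetical helper: each argument is the list iterated at one
    nesting level, outermost first.
    """
    stacks = [[]]
    for values in levels:
        # Extend every existing stack with one tuple per value at this level.
        stacks = [
            stack + [(index, len(values), value)]
            for stack in stacks
            for index, value in enumerate(values)
        ]
    return stacks


stacks = simulate_foreach_stacks(['a', 'b', 'c'], ['d', 'e', 'f', 'g'])
for stack in stacks[:2]:
    print(stack)
```

The last tuple of each stack carries the values that the docstring associates with `self.index` (first element) and `self.input` (third element) for the innermost task.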

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -255,7 +255,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 5, @@ -273,10 +273,10 @@ "id": "e45999eb", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.159193Z", - "iopub.status.busy": "2024-07-25T06:16:59.159095Z", - "iopub.status.idle": "2024-07-25T06:16:59.162186Z", - "shell.execute_reply": "2024-07-25T06:16:59.161965Z" + "iopub.execute_input": "2025-08-30T21:15:34.708202Z", + "iopub.status.busy": "2025-08-30T21:15:34.708118Z", + "iopub.status.idle": "2025-08-30T21:15:34.711204Z", + "shell.execute_reply": "2025-08-30T21:15:34.710996Z" } }, "outputs": [ @@ -284,9 +284,9 @@ "data": { "text/html": [ "\n", - "

method FlowSpec.merge_artifacts (self, inputs: metaflow.datastore.inputs.Inputs, exclude: Optional[List[str]] = None, include: Optional[List[str]] = None) -> None[source]

Helper function for merging artifacts in a join step.

This function takes all the artifacts coming from the branches of a
join point and assigns them to self in the calling step. Only artifacts
not set in the current step are considered. If, for a given artifact, different
values are present on the incoming edges, an error will be thrown and the artifacts
that conflict will be reported.

As an example, consider a simple graph in which A splits into B and C, which join in D:
```
A:
  self.x = 5
  self.y = 6
B:
  self.b_var = 1
  self.x = from_b
C:
  self.x = from_c

D:
  merge_artifacts(inputs)
```
In D, the following artifacts are set:
  - `y` (value: 6), `b_var` (value: 1)
  - if `from_b` and `from_c` are the same, `x` will be accessible and have value `from_b`
  - if `from_b` and `from_c` are different, an error will be thrown. To prevent this error,
    you need to manually set `self.x` in D to a merged value (for example the max) prior to
    calling `merge_artifacts`.

Parameters
----------
inputs : Inputs
    Incoming steps to the join point.
exclude : List[str], optional, default None
    If specified, do not consider merging artifacts with a name in `exclude`.
    Cannot specify if `include` is also specified.
include : List[str], optional, default None
    If specified, only merge artifacts specified. Cannot specify if `exclude` is
    also specified.

Raises
------
MetaflowException
    This exception is thrown if this is not called in a join step.
UnhandledInMergeArtifactsException
    This exception is thrown in case of unresolved conflicts.
MissingInMergeArtifactsException
    This exception is thrown in case an artifact specified in `include` cannot
    be found.

\n", + "

method FlowSpec.merge_artifacts (self, inputs: metaflow.datastore.inputs.Inputs, exclude: Optional[List[str]] = None, include: Optional[List[str]] = None) -> None[source]

Helper function for merging artifacts in a join step.

This function takes all the artifacts coming from the branches of a
join point and assigns them to self in the calling step. Only artifacts
not set in the current step are considered. If, for a given artifact, different
values are present on the incoming edges, an error will be thrown and the artifacts
that conflict will be reported.

As an example, consider a simple graph in which A splits into B and C, which join in D:
```
A:
  self.x = 5
  self.y = 6
B:
  self.b_var = 1
  self.x = from_b
C:
  self.x = from_c

D:
  merge_artifacts(inputs)
```
In D, the following artifacts are set:
  - `y` (value: 6), `b_var` (value: 1)
  - if `from_b` and `from_c` are the same, `x` will be accessible and have value `from_b`
  - if `from_b` and `from_c` are different, an error will be thrown. To prevent this error,
    you need to manually set `self.x` in D to a merged value (for example the max) prior to
    calling `merge_artifacts`.

Parameters
----------
inputs : Inputs
    Incoming steps to the join point.
exclude : List[str], optional, default None
    If specified, do not consider merging artifacts with a name in `exclude`.
    Cannot specify if `include` is also specified.
include : List[str], optional, default None
    If specified, only merge artifacts specified. Cannot specify if `exclude` is
    also specified.

Raises
------
MetaflowException
    This exception is thrown if this is not called in a join step.
UnhandledInMergeArtifactsException
    This exception is thrown in case of unresolved conflicts.
MissingInMergeArtifactsException
    This exception is thrown in case an artifact specified in `include` cannot
    be found.
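
The merge rule from the A/B/C/D example can be sketched with plain dictionaries. This is a simplified model under stated assumptions: each branch is a dict of artifact names to values, `already_set` holds artifacts assigned in the join step before merging, and `include`/`exclude` are not modeled. `simulate_merge` and `MergeConflict` are hypothetical names, not Metaflow APIs.

```python
class MergeConflict(Exception):
    """Stand-in for the conflict error described above (hypothetical)."""


def simulate_merge(branches, already_set=None):
    """Merge artifacts from `branches`, skipping names in `already_set`."""
    already_set = dict(already_set or {})
    merged = dict(already_set)
    conflicts = []
    names = sorted({name for branch in branches for name in branch})
    for name in names:
        if name in already_set:
            # Artifacts set in the join step are not considered for merging.
            continue
        values = [branch[name] for branch in branches if name in branch]
        if any(value != values[0] for value in values[1:]):
            conflicts.append(name)
        else:
            merged[name] = values[0]
    if conflicts:
        raise MergeConflict("Conflicting artifacts: %s" % ", ".join(conflicts))
    return merged


# Branch B and branch C as seen at the join step D (A's artifacts propagate).
branch_b = {"x": 10, "y": 6, "b_var": 1}
branch_c = {"x": 20, "y": 6}

# Resolving the conflict first, here with max, lets the merge succeed.
resolved = simulate_merge([branch_b, branch_c], already_set={"x": max(10, 20)})
print(resolved)
```

Calling `simulate_merge([branch_b, branch_c])` without `already_set` raises `MergeConflict` for `x`, mirroring the error case in the docstring.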

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -304,7 +304,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 6, @@ -334,10 +334,10 @@ "id": "bd1c2814", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.163926Z", - "iopub.status.busy": "2024-07-25T06:16:59.163856Z", - "iopub.status.idle": "2024-07-25T06:16:59.169464Z", - "shell.execute_reply": "2024-07-25T06:16:59.169261Z" + "iopub.execute_input": "2025-08-30T21:15:34.713491Z", + "iopub.status.busy": "2025-08-30T21:15:34.713413Z", + "iopub.status.idle": "2025-08-30T21:15:34.720039Z", + "shell.execute_reply": "2025-08-30T21:15:34.719834Z" }, "scrolled": true }, @@ -346,25 +346,25 @@ "data": { "text/html": [ "\n", - "

class Parameter (name: str, default: Union[str, float, int, bool, Dict[str, Any], Callable[[], Union[str, float, int, bool, Dict[str, Any]]], NoneType] = None, type: Union[Type[str], Type[float], Type[int], Type[bool], metaflow.parameters.JSONTypeClass, NoneType] = None, help: Optional[str] = None, required: bool = False, show_default: bool = True, **kwargs: Dict[str, Any])[source]

Defines a parameter for a flow.

Parameters must be instantiated as class variables in flow classes, e.g.
```
class MyFlow(FlowSpec):
    param = Parameter('myparam')
```
in this case, the parameter is specified on the command line as
```
python myflow.py run --myparam=5
```
and its value is accessible through a read-only artifact like this:
```
print(self.param == 5)
```
Note that the user-visible parameter name, `myparam` above, can be
different from the artifact name, `param` above.

The parameter value is converted to a Python type based on the `type`
argument or to match the type of `default`, if it is set.

Parameters
----------
name : str
    User-visible parameter name.
default : str or float or int or bool or `JSONType` or a function.
    Default value for the parameter. Use a special `JSONType` class to
    indicate that the value must be a valid JSON object. A function
    implies that the parameter corresponds to a *deploy-time parameter*.
    The type of the default value is used as the parameter `type`.
type : Type, default None
    If `default` is not specified, define the parameter type. Specify
    one of `str`, `float`, `int`, `bool`, or `JSONType`. If None, defaults
    to the type of `default` or `str` if none specified.
help : str, optional
    Help text to show in `run --help`.
required : bool, default False
    Require that the user specifies a value for the parameter.
    `required=True` implies that the `default` is not used.
show_default : bool, default True
    If True, show the default value in the help text.

\n", + "

class Parameter (name: str, default: Union[str, float, int, bool, Dict[str, Any], Callable[[metaflow.parameters.ParameterContext], Union[str, float, int, bool, Dict[str, Any]]], NoneType] = None, type: Union[Type[str], Type[float], Type[int], Type[bool], metaflow.parameters.JSONTypeClass, NoneType] = None, help: Optional[str] = None, required: Optional[bool] = None, show_default: Optional[bool] = None, **kwargs: Dict[str, Any])[source]

Defines a parameter for a flow.

Parameters must be instantiated as class variables in flow classes, e.g.
```
class MyFlow(FlowSpec):
    param = Parameter('myparam')
```
in this case, the parameter is specified on the command line as
```
python myflow.py run --myparam=5
```
and its value is accessible through a read-only artifact like this:
```
print(self.param == 5)
```
Note that the user-visible parameter name, `myparam` above, can be
different from the artifact name, `param` above.

The parameter value is converted to a Python type based on the `type`
argument or to match the type of `default`, if it is set.

Parameters
----------
name : str
    User-visible parameter name.
default : Union[str, float, int, bool, Dict[str, Any],
            Callable[
                [ParameterContext], Union[str, float, int, bool, Dict[str, Any]]
            ],
        ], optional, default None
    Default value for the parameter. Use a special `JSONType` class to
    indicate that the value must be a valid JSON object. A function
    implies that the parameter corresponds to a *deploy-time parameter*.
    The type of the default value is used as the parameter `type`.
type : Type, default None
    If `default` is not specified, define the parameter type. Specify
    one of `str`, `float`, `int`, `bool`, or `JSONType`. If None, defaults
    to the type of `default` or `str` if none specified.
help : str, optional, default None
    Help text to show in `run --help`.
required : bool, optional, default None
    Require that the user specifies a value for the parameter. Note that if
    a default is provided, the required flag is ignored.
    A value of None is equivalent to False.
show_default : bool, optional, default None
    If True, show the default value in the help text. A value of None is equivalent
    to True.

\n", "
\n", - "\n", + "\n", "\n", - "\n", + "\n", "\n", "\n", "\n", "\t\n", - "\t\n", + "\t\n", "\t\n", - "\t\n", - "\t\n", - "\t\n", + "\t\n", + "\t\n", + "\t\n", "\n", "" ], "text/plain": [ - "" + "" ] }, "execution_count": 7, @@ -382,10 +382,10 @@ "id": "46b05162", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.171289Z", - "iopub.status.busy": "2024-07-25T06:16:59.171208Z", - "iopub.status.idle": "2024-07-25T06:16:59.172794Z", - "shell.execute_reply": "2024-07-25T06:16:59.172555Z" + "iopub.execute_input": "2025-08-30T21:15:34.721776Z", + "iopub.status.busy": "2025-08-30T21:15:34.721711Z", + "iopub.status.idle": "2025-08-30T21:15:34.723247Z", + "shell.execute_reply": "2025-08-30T21:15:34.723062Z" } }, "outputs": [], @@ -435,10 +435,10 @@ "id": "0b2d482c", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.174707Z", - "iopub.status.busy": "2024-07-25T06:16:59.174616Z", - "iopub.status.idle": "2024-07-25T06:16:59.176984Z", - "shell.execute_reply": "2024-07-25T06:16:59.176727Z" + "iopub.execute_input": "2025-08-30T21:15:34.724966Z", + "iopub.status.busy": "2025-08-30T21:15:34.724888Z", + "iopub.status.idle": "2025-08-30T21:15:34.727078Z", + "shell.execute_reply": "2025-08-30T21:15:34.726857Z" } }, "outputs": [ @@ -461,7 +461,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 9, @@ -489,10 +489,10 @@ "id": "3b674cdb", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.179209Z", - "iopub.status.busy": "2024-07-25T06:16:59.179111Z", - "iopub.status.idle": "2024-07-25T06:16:59.184913Z", - "shell.execute_reply": "2024-07-25T06:16:59.184699Z" + "iopub.execute_input": "2025-08-30T21:15:34.729486Z", + "iopub.status.busy": "2025-08-30T21:15:34.729386Z", + "iopub.status.idle": "2025-08-30T21:15:34.734749Z", + "shell.execute_reply": "2025-08-30T21:15:34.734529Z" } }, "outputs": [ @@ -500,9 +500,9 @@ "data": { "text/html": [ "\n", - "

class IncludeFile name, **kwargs[source]

Includes a local file as a parameter for the flow.

`IncludeFile` behaves like `Parameter` except that it reads its value from a file instead of
the command line. The user provides a path to a file on the command line. The file contents
are saved as a read-only artifact which is available in all steps of the flow.

Parameters
----------
name : str
    User-visible parameter name.
default : Union[str, Callable[[ParameterContext], str]]
    Default path to a local file. A function
    implies that the parameter corresponds to a *deploy-time parameter*.
is_text : bool, default True
    Convert the file contents to a string using the provided `encoding`.
    If False, the artifact is stored in `bytes`.
encoding : str, optional, default 'utf-8'
    Use this encoding to decode the file contents if `is_text=True`.
required : bool, default False
    Require that the user specifies a value for the parameter.
    `required=True` implies that the `default` is not used.
help : str, optional
    Help text to show in `run --help`.
show_default : bool, default True
    If True, show the default value in the help text.

\n", + "

class IncludeFile name, **kwargs[source]

Includes a local file as a parameter for the flow.

`IncludeFile` behaves like `Parameter` except that it reads its value from a file instead of
the command line. The user provides a path to a file on the command line. The file contents
are saved as a read-only artifact which is available in all steps of the flow.

Parameters
----------
name : str
    User-visible parameter name.
default : Union[str, Callable[[ParameterContext], str]]
    Default path to a local file. A function
    implies that the parameter corresponds to a *deploy-time parameter*.
is_text : bool, optional, default None
    Convert the file contents to a string using the provided `encoding`.
    If False, the artifact is stored in `bytes`. A value of None is equivalent to
    True.
encoding : str, optional, default None
    Use this encoding to decode the file contents if `is_text=True`. A value of None
    is equivalent to utf-8.
required : bool, optional, default None
    Require that the user specifies a value for the parameter.
    `required=True` implies that the `default` is not used. A value of None is
    equivalent to False.
help : str, optional
    Help text to show in `run --help`.
show_default : bool, default True
    If True, show the default value in the help text. A value of None is equivalent
    to True.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -510,16 +510,16 @@ "\n", "\t\n", "\t\n", - "\t\n", - "\t\n", - "\t\n", + "\t\n", + "\t\n", + "\t\n", "\t\n", - "\t\n", + "\t\n", "\n", "" ], "text/plain": [ - "" + "" ] }, "execution_count": 10, @@ -542,7 +542,7 @@ ], "metadata": { "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, diff --git a/docs/api/flowspec.md b/docs/api/flowspec.md index 84873f32..f41d593b 100644 --- a/docs/api/flowspec.md +++ b/docs/api/flowspec.md @@ -33,11 +33,11 @@ To query and manipulate the currently executing run inside your flow, see the [` Annotate methods that are a part of your Metaflow workflow with [the `@step` decorator](/api/step-decorators/step). Use `FlowSpec.next` to define transitions between steps: - + - + - + @@ -72,7 +72,7 @@ Use the operations below, `FlowSpec.input`, `FlowSpec.index`, and `FlowSpec.fore - + @@ -84,7 +84,7 @@ Use the operations below, `FlowSpec.input`, `FlowSpec.index`, and `FlowSpec.fore - + @@ -109,18 +109,18 @@ The `Parameter` class is used to define parameters for a flow. The `Parameter` objects must be defined as class variables inside a flow. The parameter values are available as read-only artifacts in all steps of the flow. For instructions, see [How to define parameters for flows](/metaflow/basics#how-to-define-parameters-for-flows). - + - + - + - - - + + + @@ -160,7 +160,7 @@ The function called gets a parameter `context` that contains attributes about th The `IncludeFile` object is a special `Parameter` that reads its value from a local file. For an example, see [Data in Local Files](/scaling/data#data-in-local-files). 
- + @@ -168,11 +168,11 @@ The `IncludeFile` object is a special `Parameter` that reads its value from a lo - - - + + + - + diff --git a/docs/api/runner.ipynb b/docs/api/runner.ipynb index c2e7497d..c7055efc 100644 --- a/docs/api/runner.ipynb +++ b/docs/api/runner.ipynb @@ -22,10 +22,10 @@ "id": "d4c02781", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:56.846165Z", - "iopub.status.busy": "2024-07-25T06:16:56.846050Z", - "iopub.status.idle": "2024-07-25T06:16:57.124578Z", - "shell.execute_reply": "2024-07-25T06:16:57.124277Z" + "iopub.execute_input": "2025-08-30T21:15:34.988476Z", + "iopub.status.busy": "2025-08-30T21:15:34.988399Z", + "iopub.status.idle": "2025-08-30T21:15:35.210301Z", + "shell.execute_reply": "2025-08-30T21:15:35.210016Z" } }, "outputs": [], @@ -70,10 +70,10 @@ "id": "7fdb7184", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.127133Z", - "iopub.status.busy": "2024-07-25T06:16:57.126981Z", - "iopub.status.idle": "2024-07-25T06:16:57.135119Z", - "shell.execute_reply": "2024-07-25T06:16:57.134789Z" + "iopub.execute_input": "2025-08-30T21:15:35.212401Z", + "iopub.status.busy": "2025-08-30T21:15:35.212261Z", + "iopub.status.idle": "2025-08-30T21:15:35.220479Z", + "shell.execute_reply": "2025-08-30T21:15:35.220273Z" } }, "outputs": [ @@ -81,25 +81,26 @@ "data": { "text/html": [ "\n", - "

class Runner flow_file, show_output=True, profile=None, env=None, cwd=None, **kwargs[source]

Metaflow's Runner API that presents a programmatic interface
to run flows and perform other operations either synchronously or asynchronously.
The class expects a path to the flow file along with optional arguments
that match top-level options on the command-line.

This class works as a context manager, calling `cleanup()` to remove
temporary files at exit.

Example:
```python
with Runner('slowflow.py', pylint=False) as runner:
    result = runner.run(alpha=5, tags=[\"abc\", \"def\"], max_workers=5)
    print(result.run.finished)
```

Parameters
----------
flow_file : str
    Path to the flow file to run
show_output : bool, default True
    Show 'stdout' and 'stderr' on the console by default.
    Only applicable to the synchronous 'run' and 'resume' functions.
profile : Optional[str], default None
    Metaflow profile to use to run this run. If not specified, the default
    profile is used (or the one already set using `METAFLOW_PROFILE`)
env : Optional[Dict], default None
    Additional environment variables to set for the Run. This overrides the
    environment set for this process.
cwd : Optional[str], default None
    The directory to run the subprocess in; if not specified, the current
    directory is used.
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` before
    the `run` command.

\n", + "

class Runner flow_file, show_output=True, profile=None, env=None, cwd=None, **kwargs[source]

Metaflow's Runner API that presents a programmatic interface
to run flows and perform other operations either synchronously or asynchronously.
The class expects a path to the flow file along with optional arguments
that match top-level options on the command-line.

This class works as a context manager, calling `cleanup()` to remove
temporary files at exit.

Example:
```python
with Runner('slowflow.py', pylint=False) as runner:
    result = runner.run(alpha=5, tags=[\"abc\", \"def\"], max_workers=5)
    print(result.run.finished)
```

Parameters
----------
flow_file : str
    Path to the flow file to run, relative to the current directory.
show_output : bool, default True
    Show 'stdout' and 'stderr' on the console by default.
    Only applicable to the synchronous 'run' and 'resume' functions.
profile : str, optional, default None
    Metaflow profile to use to run this run. If not specified, the default
    profile is used (or the one already set using `METAFLOW_PROFILE`)
env : Dict[str, str], optional, default None
    Additional environment variables to set for the Run. This overrides the
    environment set for this process.
cwd : str, optional, default None
    The directory to run the subprocess in; if not specified, the current
    directory is used.
file_read_timeout : int, default 3600
    Maximum time, in seconds, to keep trying to read the runner attribute file.
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` before
    the `run` command.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", "\n", "\n", - "\t\n", + "\t\n", "\t\n", - "\t\n", - "\t\n", - "\t\n", + "\t\n", + "\t\n", + "\t\n", + "\t\n", "\t\n", "\n", "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, @@ -117,10 +118,10 @@ "id": "84c83820", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.137507Z", - "iopub.status.busy": "2024-07-25T06:16:57.137393Z", - "iopub.status.idle": "2024-07-25T06:16:57.139946Z", - "shell.execute_reply": "2024-07-25T06:16:57.139696Z" + "iopub.execute_input": "2025-08-30T21:15:35.222262Z", + "iopub.status.busy": "2025-08-30T21:15:35.222173Z", + "iopub.status.idle": "2025-08-30T21:15:35.224441Z", + "shell.execute_reply": "2025-08-30T21:15:35.224231Z" } }, "outputs": [ @@ -128,9 +129,9 @@ "data": { "text/html": [ "\n", - "

method Runner.cleanup (self)[source]

Delete any temporary files created during execution.

\n", + "

method Runner.cleanup (self)[source]

Delete any temporary files created during execution.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -139,7 +140,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 3, @@ -167,10 +168,10 @@ "id": "8c61365e", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.142811Z", - "iopub.status.busy": "2024-07-25T06:16:57.142681Z", - "iopub.status.idle": "2024-07-25T06:16:57.145665Z", - "shell.execute_reply": "2024-07-25T06:16:57.145412Z" + "iopub.execute_input": "2025-08-30T21:15:35.226707Z", + "iopub.status.busy": "2025-08-30T21:15:35.226623Z", + "iopub.status.idle": "2025-08-30T21:15:35.228936Z", + "shell.execute_reply": "2025-08-30T21:15:35.228733Z" } }, "outputs": [ @@ -178,9 +179,9 @@ "data": { "text/html": [ "\n", - "

method Runner.run (self, **kwargs) -> metaflow.runner.metaflow_runner.ExecutingRun[source]

Blocking execution of the run. This method will wait until
the run has completed execution.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `run` command, in particular, any parameters accepted by the flow.

Returns
-------
ExecutingRun
    ExecutingRun containing the results of the run.

\n", + "

method Runner.run (self, **kwargs) -> metaflow.runner.metaflow_runner.ExecutingRun[source]

Blocking execution of the run. This method will wait until
the run has completed execution.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `run` command, in particular, any parameters accepted by the flow.

Returns
-------
ExecutingRun
    ExecutingRun containing the results of the run.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -194,7 +195,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 4, @@ -212,10 +213,10 @@ "id": "4fe8757c", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.147705Z", - "iopub.status.busy": "2024-07-25T06:16:57.147607Z", - "iopub.status.idle": "2024-07-25T06:16:57.150376Z", - "shell.execute_reply": "2024-07-25T06:16:57.150071Z" + "iopub.execute_input": "2025-08-30T21:15:35.231042Z", + "iopub.status.busy": "2025-08-30T21:15:35.230953Z", + "iopub.status.idle": "2025-08-30T21:15:35.233324Z", + "shell.execute_reply": "2025-08-30T21:15:35.233126Z" } }, "outputs": [ @@ -223,9 +224,9 @@ "data": { "text/html": [ "\n", - "

method Runner.resume (self, **kwargs)[source]

Blocking resume execution of the run.
This method will wait until the resumed run has completed execution.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python ./myflow.py` after
    the `resume` command.

Returns
-------
ExecutingRun
    ExecutingRun containing the results of the resumed run.

\n", + "

method Runner.resume (self, **kwargs) -> metaflow.runner.metaflow_runner.ExecutingRun[source]

Blocking resume execution of the run.
This method will wait until the resumed run has completed execution.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python ./myflow.py` after
    the `resume` command.

Returns
-------
ExecutingRun
    ExecutingRun containing the results of the resumed run.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -239,7 +240,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 5, @@ -265,10 +266,10 @@ "id": "ec91c1c9", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.152518Z", - "iopub.status.busy": "2024-07-25T06:16:57.152422Z", - "iopub.status.idle": "2024-07-25T06:16:57.155118Z", - "shell.execute_reply": "2024-07-25T06:16:57.154859Z" + "iopub.execute_input": "2025-08-30T21:15:35.235528Z", + "iopub.status.busy": "2025-08-30T21:15:35.235434Z", + "iopub.status.idle": "2025-08-30T21:15:35.237830Z", + "shell.execute_reply": "2025-08-30T21:15:35.237625Z" } }, "outputs": [ @@ -276,9 +277,9 @@ "data": { "text/html": [ "\n", - "

method Runner.async_run (self, **kwargs) -> metaflow.runner.metaflow_runner.ExecutingRun[source]

Non-blocking execution of the run. This method will return as soon as the
run has launched.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `run` command, in particular, any parameters accepted by the flow.

Returns
-------
ExecutingRun
    ExecutingRun representing the run that was started.

\n", + "

method Runner.async_run (self, **kwargs) -> metaflow.runner.metaflow_runner.ExecutingRun[source]

Non-blocking execution of the run. This method will return as soon as the
run has launched.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `run` command, in particular, any parameters accepted by the flow.

Returns
-------
ExecutingRun
    ExecutingRun representing the run that was started.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -292,7 +293,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 6, @@ -310,10 +311,10 @@ "id": "87f8530a", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.157634Z", - "iopub.status.busy": "2024-07-25T06:16:57.157548Z", - "iopub.status.idle": "2024-07-25T06:16:57.160071Z", - "shell.execute_reply": "2024-07-25T06:16:57.159813Z" + "iopub.execute_input": "2025-08-30T21:15:35.239548Z", + "iopub.status.busy": "2025-08-30T21:15:35.239465Z", + "iopub.status.idle": "2025-08-30T21:15:35.241826Z", + "shell.execute_reply": "2025-08-30T21:15:35.241641Z" } }, "outputs": [ @@ -321,9 +322,9 @@ "data": { "text/html": [ "\n", - "

method Runner.async_resume (self, **kwargs)[source]

Non-blocking resume execution of the run.
This method will return as soon as the resume has launched.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `resume` command.

Returns
-------
ExecutingRun
    ExecutingRun representing the resumed run that was started.

\n", + "

method Runner.async_resume (self, **kwargs) -> metaflow.runner.metaflow_runner.ExecutingRun[source]

Non-blocking resume execution of the run.
This method will return as soon as the resume has launched.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `resume` command.

Returns
-------
ExecutingRun
    ExecutingRun representing the resumed run that was started.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -337,7 +338,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 7, @@ -365,10 +366,10 @@ "id": "9fb445c1", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.162889Z", - "iopub.status.busy": "2024-07-25T06:16:57.162793Z", - "iopub.status.idle": "2024-07-25T06:16:57.167115Z", - "shell.execute_reply": "2024-07-25T06:16:57.166894Z" + "iopub.execute_input": "2025-08-30T21:15:35.243968Z", + "iopub.status.busy": "2025-08-30T21:15:35.243888Z", + "iopub.status.idle": "2025-08-30T21:15:35.247743Z", + "shell.execute_reply": "2025-08-30T21:15:35.247549Z" } }, "outputs": [ @@ -376,9 +377,9 @@ "data": { "text/html": [ "\n", - "

class NBRunner flow, show_output=True, profile=None, env=None, base_dir=None, **kwargs[source]

A wrapper over `Runner` for executing flows defined in a Jupyter
notebook cell.

Instantiate this class on the last line of a notebook cell where
a `flow` is defined. In contrast to `Runner`, this class is not
meant to be used in a context manager. Instead, use a blocking helper
function like `nbrun` (which calls `cleanup()` internally) or call
`cleanup()` explicitly when using non-blocking APIs.

```python
run = NBRunner(FlowName).nbrun()
```

Parameters
----------
flow : FlowSpec
    Flow defined in the same cell
show_output : bool, default True
    Show the 'stdout' and 'stderr' to the console by default.
    Only applicable for synchronous 'run' and 'resume' functions.
profile : Optional[str], default None
    Metaflow profile to use to run this run. If not specified, the default
    profile is used (or the one already set using `METAFLOW_PROFILE`)
env : Optional[Dict], default None
    Additional environment variables to set for the Run. This overrides the
    environment set for this process.
base_dir : Optional[str], default None
    The directory to run the subprocess in; if not specified, a temporary
    directory is used.
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` before
    the `run` command.

\n", + "

class NBRunner flow, show_output=True, profile=None, env=None, base_dir=None, **kwargs[source]

A wrapper over `Runner` for executing flows defined in a Jupyter
notebook cell.

Instantiate this class on the last line of a notebook cell where
a `flow` is defined. In contrast to `Runner`, this class is not
meant to be used in a context manager. Instead, use a blocking helper
function like `nbrun` (which calls `cleanup()` internally) or call
`cleanup()` explicitly when using non-blocking APIs.

```python
run = NBRunner(FlowName).nbrun()
```

Parameters
----------
flow : FlowSpec
    Flow defined in the same cell
show_output : bool, default True
    Show the 'stdout' and 'stderr' to the console by default.
    Only applicable for synchronous 'run' and 'resume' functions.
profile : str, optional, default None
    Metaflow profile to use to run this run. If not specified, the default
    profile is used (or the one already set using `METAFLOW_PROFILE`)
env : Dict[str, str], optional, default None
    Additional environment variables to set for the Run. This overrides the
    environment set for this process.
base_dir : str, optional, default None
    The directory to run the subprocess in; if not specified, the current
    working directory is used.
file_read_timeout : int, default 3600
    The timeout until which we try to read the runner attribute file (in seconds).
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` before
    the `run` command.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -386,15 +387,16 @@ "\n", "\t\n", "\t\n", - "\t\n", - "\t\n", - "\t\n", + "\t\n", + "\t\n", + "\t\n", + "\t\n", "\t\n", "\n", "" ], "text/plain": [ - "" + "" ] }, "execution_count": 8, @@ -420,10 +422,10 @@ "id": "8971ea07", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.168964Z", - "iopub.status.busy": "2024-07-25T06:16:57.168888Z", - "iopub.status.idle": "2024-07-25T06:16:57.171468Z", - "shell.execute_reply": "2024-07-25T06:16:57.171219Z" + "iopub.execute_input": "2025-08-30T21:15:35.249416Z", + "iopub.status.busy": "2025-08-30T21:15:35.249349Z", + "iopub.status.idle": "2025-08-30T21:15:35.251464Z", + "shell.execute_reply": "2025-08-30T21:15:35.251243Z" } }, "outputs": [ @@ -431,9 +433,9 @@ "data": { "text/html": [ "\n", - "

method NBRunner.nbrun (self, **kwargs)[source]

Blocking execution of the run. This method will wait until
the run has completed execution.

Note that in contrast to `run`, this method returns a
`metaflow.Run` object directly and calls `cleanup()` internally
to support a common notebook pattern of executing a flow and
retrieving its results immediately.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `run` command, in particular, any parameters accepted by the flow.

Returns
-------
Run
    A `metaflow.Run` object representing the finished run.

\n", + "

method NBRunner.nbrun (self, **kwargs)[source]

Blocking execution of the run. This method will wait until
the run has completed execution.

Note that in contrast to `run`, this method returns a
`metaflow.Run` object directly and calls `cleanup()` internally
to support a common notebook pattern of executing a flow and
retrieving its results immediately.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `run` command, in particular, any parameters accepted by the flow.

Returns
-------
Run
    A `metaflow.Run` object representing the finished run.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -447,7 +449,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 9, @@ -465,10 +467,10 @@ "id": "95a8c5b3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.173679Z", - "iopub.status.busy": "2024-07-25T06:16:57.173586Z", - "iopub.status.idle": "2024-07-25T06:16:57.176168Z", - "shell.execute_reply": "2024-07-25T06:16:57.175862Z" + "iopub.execute_input": "2025-08-30T21:15:35.253539Z", + "iopub.status.busy": "2025-08-30T21:15:35.253467Z", + "iopub.status.idle": "2025-08-30T21:15:35.255678Z", + "shell.execute_reply": "2025-08-30T21:15:35.255491Z" } }, "outputs": [ @@ -476,9 +478,9 @@ "data": { "text/html": [ "\n", - "

method NBRunner.nbresume (self, **kwargs)[source]

Blocking resuming of a run. This method will wait until
the resumed run has completed execution.

Note that in contrast to `resume`, this method returns a
`metaflow.Run` object directly and calls `cleanup()` internally
to support a common notebook pattern of executing a flow and
retrieving its results immediately.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `resume` command.

Returns
-------
Run
    A `metaflow.Run` object representing the resumed run.

\n", + "

method NBRunner.nbresume (self, **kwargs)[source]

Blocking resuming of a run. This method will wait until
the resumed run has completed execution.

Note that in contrast to `resume`, this method returns a
`metaflow.Run` object directly and calls `cleanup()` internally
to support a common notebook pattern of executing a flow and
retrieving its results immediately.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `resume` command.

Returns
-------
Run
    A `metaflow.Run` object representing the resumed run.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -492,7 +494,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 10, @@ -518,10 +520,10 @@ "id": "e9360c97", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.178155Z", - "iopub.status.busy": "2024-07-25T06:16:57.178052Z", - "iopub.status.idle": "2024-07-25T06:16:57.180530Z", - "shell.execute_reply": "2024-07-25T06:16:57.180242Z" + "iopub.execute_input": "2025-08-30T21:15:35.257784Z", + "iopub.status.busy": "2025-08-30T21:15:35.257705Z", + "iopub.status.idle": "2025-08-30T21:15:35.259740Z", + "shell.execute_reply": "2025-08-30T21:15:35.259557Z" } }, "outputs": [ @@ -529,9 +531,9 @@ "data": { "text/html": [ "\n", - "

method NBRunner.async_run (self, **kwargs)[source]

Non-blocking execution of the run. This method will return as soon as the
run has launched. This method is equivalent to `Runner.async_run`.

Note that this method is asynchronous and needs to be `await`ed.


Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `run` command, in particular, any parameters accepted by the flow.

Returns
-------
ExecutingRun
    ExecutingRun representing the run that was started.

\n", + "

method NBRunner.async_run (self, **kwargs)[source]

Non-blocking execution of the run. This method will return as soon as the
run has launched. This method is equivalent to `Runner.async_run`.

Note that this method is asynchronous and needs to be `await`ed.


Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `run` command, in particular, any parameters accepted by the flow.

Returns
-------
ExecutingRun
    ExecutingRun representing the run that was started.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -545,7 +547,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 11, @@ -563,10 +565,10 @@ "id": "dd48a525", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.182821Z", - "iopub.status.busy": "2024-07-25T06:16:57.182745Z", - "iopub.status.idle": "2024-07-25T06:16:57.185040Z", - "shell.execute_reply": "2024-07-25T06:16:57.184827Z" + "iopub.execute_input": "2025-08-30T21:15:35.262077Z", + "iopub.status.busy": "2025-08-30T21:15:35.261984Z", + "iopub.status.idle": "2025-08-30T21:15:35.264154Z", + "shell.execute_reply": "2025-08-30T21:15:35.263966Z" } }, "outputs": [ @@ -574,9 +576,9 @@ "data": { "text/html": [ "\n", - "

method NBRunner.async_resume (self, **kwargs)[source]

Non-blocking resume execution of the run. This method will return as soon as the
resume has launched. This method is equivalent to `Runner.async_resume`.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `resume` command.

Returns
-------
ExecutingRun
    ExecutingRun representing the run that was started.

\n", + "

method NBRunner.async_resume (self, **kwargs)[source]

Non-blocking resume execution of the run. This method will return as soon as the
resume has launched. This method is equivalent to `Runner.async_resume`.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
**kwargs : Any
    Additional arguments that you would pass to `python myflow.py` after
    the `resume` command.

Returns
-------
ExecutingRun
    ExecutingRun representing the run that was started.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -590,7 +592,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 12, @@ -608,10 +610,10 @@ "id": "4dcbcf6a", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.187052Z", - "iopub.status.busy": "2024-07-25T06:16:57.186966Z", - "iopub.status.idle": "2024-07-25T06:16:57.189166Z", - "shell.execute_reply": "2024-07-25T06:16:57.188955Z" + "iopub.execute_input": "2025-08-30T21:15:35.266132Z", + "iopub.status.busy": "2025-08-30T21:15:35.266060Z", + "iopub.status.idle": "2025-08-30T21:15:35.267982Z", + "shell.execute_reply": "2025-08-30T21:15:35.267789Z" } }, "outputs": [ @@ -619,9 +621,9 @@ "data": { "text/html": [ "\n", - "

method NBRunner.cleanup (self)[source]

Delete any temporary files created during execution.

Call this method after using `async_run` or `async_resume`. You don't
have to call this after `nbrun` or `nbresume`.

\n", + "

method NBRunner.cleanup (self)[source]

Delete any temporary files created during execution.

Call this method after using `async_run` or `async_resume`. You don't
have to call this after `nbrun` or `nbresume`.
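
The cleanup rule above can be sketched as a `try`/`finally` around the non-blocking call. This is a minimal sketch, not the library's own helper: `launch_and_cleanup` is a hypothetical name, and `nb_runner` is assumed to be an `NBRunner`-like object exposing `async_run` and `cleanup` as documented here.

```python
async def launch_and_cleanup(nb_runner, **params):
    """Start a run without blocking, wait for it, and always clean up.

    `nb_runner` is assumed to behave like the NBRunner described above
    (hypothetical helper, shown for illustration only).
    """
    try:
        # async_run returns as soon as the run has launched.
        running = await nb_runner.async_run(**params)
        # Wait for completion before reading the final status.
        await running.wait()
        return running.status
    finally:
        # Temporary files are removed even if the run or the wait raises.
        nb_runner.cleanup()
```

In a notebook cell this would typically be invoked as `await launch_and_cleanup(NBRunner(FlowName))`, assuming a running event loop.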

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -630,7 +632,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 13, @@ -656,10 +658,10 @@ "id": "ca407a51", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.191237Z", - "iopub.status.busy": "2024-07-25T06:16:57.191153Z", - "iopub.status.idle": "2024-07-25T06:16:57.195115Z", - "shell.execute_reply": "2024-07-25T06:16:57.194853Z" + "iopub.execute_input": "2025-08-30T21:15:35.270116Z", + "iopub.status.busy": "2025-08-30T21:15:35.270041Z", + "iopub.status.idle": "2025-08-30T21:15:35.273706Z", + "shell.execute_reply": "2025-08-30T21:15:35.273505Z" } }, "outputs": [ @@ -667,9 +669,9 @@ "data": { "text/html": [ "\n", - "

class ExecutingRun [source]

This class contains a reference to a `metaflow.Run` object representing
the currently executing or finished run, as well as metadata related
to the process.

`ExecutingRun` is returned by methods in `Runner` and `NBRunner`. It is not
meant to be instantiated directly.

This class works as a context manager, allowing you to use a pattern like
```python
with Runner(...).run() as running:
    ...
```
Note that you should use either this object as the context manager or
`Runner`, not both in a nested manner.

\n", + "

class ExecutingRun [source]

This class contains a reference to a `metaflow.Run` object representing
the currently executing or finished run, as well as metadata related
to the process.

`ExecutingRun` is returned by methods in `Runner` and `NBRunner`. It is not
meant to be instantiated directly.

This class works as a context manager, allowing you to use a pattern like
```python
with Runner(...).run() as running:
    ...
```
Note that you should use either this object as the context manager or
`Runner`, not both in a nested manner.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -678,7 +680,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 14, @@ -696,10 +698,10 @@ "id": "f127a019", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.197432Z", - "iopub.status.busy": "2024-07-25T06:16:57.197357Z", - "iopub.status.idle": "2024-07-25T06:16:57.199514Z", - "shell.execute_reply": "2024-07-25T06:16:57.199252Z" + "iopub.execute_input": "2025-08-30T21:15:35.275972Z", + "iopub.status.busy": "2025-08-30T21:15:35.275896Z", + "iopub.status.idle": "2025-08-30T21:15:35.277790Z", + "shell.execute_reply": "2025-08-30T21:15:35.277585Z" } }, "outputs": [ @@ -718,7 +720,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 15, @@ -736,10 +738,10 @@ "id": "777973d3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.201633Z", - "iopub.status.busy": "2024-07-25T06:16:57.201538Z", - "iopub.status.idle": "2024-07-25T06:16:57.203616Z", - "shell.execute_reply": "2024-07-25T06:16:57.203359Z" + "iopub.execute_input": "2025-08-30T21:15:35.280004Z", + "iopub.status.busy": "2025-08-30T21:15:35.279927Z", + "iopub.status.idle": "2025-08-30T21:15:35.281762Z", + "shell.execute_reply": "2025-08-30T21:15:35.281551Z" } }, "outputs": [ @@ -747,18 +749,18 @@ "data": { "text/html": [ "\n", - "

property ExecutingRun.status [source]

Returns the status of the underlying subprocess that is responsible
for executing the run.

The return value is one of the following strings:
- `running` indicates a currently executing run.
- `failed` indicates a failed run.
- `successful` a successful run.

Returns
-------
str
    The current status of the run.

\n", + "

property ExecutingRun.status [source]

Returns the status of the underlying subprocess that is responsible
for executing the run.

The return value is one of the following strings:
- `timeout` indicates that the run timed out.
- `running` indicates a currently executing run.
- `failed` indicates a failed run.
- `successful` indicates a successful run.

Returns
-------
str
    The current status of the run.
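
As an illustration of handling the four status strings listed above, here is a minimal sketch. `describe_status` is a hypothetical helper and the messages are placeholders; only the status values themselves come from the docstring.

```python
def describe_status(status: str) -> str:
    """Map an ExecutingRun.status string to a short human-readable message."""
    messages = {
        "running": "run is still executing",
        "successful": "run finished successfully",
        "failed": "run failed",
        "timeout": "run timed out",
    }
    # Unknown values are reported instead of raising, in case the set of
    # statuses differs between releases.
    return messages.get(status, f"unrecognized status: {status!r}")
```

A caller could pass `running.status` directly, for example `describe_status(running.status)`.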

\n", "
\n", "\n", "\n", - "\n", + "\n", "\n", "\n", "\n", "" ], "text/plain": [ - "" + "" ] }, "execution_count": 16, @@ -776,10 +778,10 @@ "id": "ca5b461d", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.205776Z", - "iopub.status.busy": "2024-07-25T06:16:57.205682Z", - "iopub.status.idle": "2024-07-25T06:16:57.207721Z", - "shell.execute_reply": "2024-07-25T06:16:57.207475Z" + "iopub.execute_input": "2025-08-30T21:15:35.284097Z", + "iopub.status.busy": "2025-08-30T21:15:35.284013Z", + "iopub.status.idle": "2025-08-30T21:15:35.285910Z", + "shell.execute_reply": "2025-08-30T21:15:35.285723Z" } }, "outputs": [ @@ -798,7 +800,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 17, @@ -816,10 +818,10 @@ "id": "c6b98164", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.209801Z", - "iopub.status.busy": "2024-07-25T06:16:57.209713Z", - "iopub.status.idle": "2024-07-25T06:16:57.211588Z", - "shell.execute_reply": "2024-07-25T06:16:57.211359Z" + "iopub.execute_input": "2025-08-30T21:15:35.288269Z", + "iopub.status.busy": "2025-08-30T21:15:35.288187Z", + "iopub.status.idle": "2025-08-30T21:15:35.290143Z", + "shell.execute_reply": "2025-08-30T21:15:35.289934Z" } }, "outputs": [ @@ -838,7 +840,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 18, @@ -864,10 +866,10 @@ "id": "f716f347", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.214150Z", - "iopub.status.busy": "2024-07-25T06:16:57.214062Z", - "iopub.status.idle": "2024-07-25T06:16:57.216506Z", - "shell.execute_reply": "2024-07-25T06:16:57.216287Z" + "iopub.execute_input": "2025-08-30T21:15:35.292838Z", + "iopub.status.busy": "2025-08-30T21:15:35.292751Z", + "iopub.status.idle": "2025-08-30T21:15:35.295038Z", + "shell.execute_reply": "2025-08-30T21:15:35.294829Z" }, "scrolled": true }, @@ -876,16 +878,16 @@ "data": { "text/html": [ "\n", - "

method ExecutingRun.wait (self, timeout: Optional[float] = None, stream: Optional[str] = None) -> 'ExecutingRun'[source]

Wait for this run to finish, optionally with a timeout
and optionally streaming its output.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
timeout : Optional[float], default None
    The maximum time to wait for the run to finish.
    If the timeout is reached, the run is terminated
stream : Optional[str], default None
    If specified, the specified stream is printed to stdout. `stream` can
    be one of `stdout` or `stderr`.

Returns
-------
ExecutingRun
    This object, allowing you to chain calls.

\n", + "

method ExecutingRun.wait (self, timeout: Optional[float] = None, stream: Optional[str] = None) -> 'ExecutingRun'[source]

Wait for this run to finish, optionally with a timeout
and optionally streaming its output.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
timeout : float, optional, default None
    The maximum time, in seconds, to wait for the run to finish.
    If the timeout is reached, the run is terminated. If not specified, wait
    forever.
stream : str, optional, default None
    If specified, the specified stream is printed to stdout. `stream` can
    be one of `stdout` or `stderr`.

Returns
-------
ExecutingRun
    This object, allowing you to chain calls.
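
Because `wait` returns the same object, the status can be read from its result. The following is a minimal sketch under that assumption; `wait_with_deadline` is a hypothetical helper, and `running` is assumed to be an `ExecutingRun` obtained from `async_run` or `async_resume`.

```python
async def wait_with_deadline(running, seconds: float) -> str:
    """Wait up to `seconds` for the run, echoing stdout, then return its status."""
    # `timeout` is in seconds; `stream` selects which output is printed.
    finished = await running.wait(timeout=seconds, stream="stdout")
    # wait() returns the ExecutingRun itself, so the lookup can be chained.
    return finished.status
```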

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", "\n", "\n", - "\t\n", - "\t\n", + "\t\n", + "\t\n", "\n", "\n", "\t\n", @@ -893,7 +895,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 19, @@ -911,10 +913,10 @@ "id": "4b97379d", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:57.218736Z", - "iopub.status.busy": "2024-07-25T06:16:57.218656Z", - "iopub.status.idle": "2024-07-25T06:16:57.220972Z", - "shell.execute_reply": "2024-07-25T06:16:57.220747Z" + "iopub.execute_input": "2025-08-30T21:15:35.297864Z", + "iopub.status.busy": "2025-08-30T21:15:35.297758Z", + "iopub.status.idle": "2025-08-30T21:15:35.300154Z", + "shell.execute_reply": "2025-08-30T21:15:35.299910Z" } }, "outputs": [ @@ -922,16 +924,16 @@ "data": { "text/html": [ "\n", - "

method ExecutingRun.stream_log (self, stream: str, position: Optional[int] = None) -> Iterator[Tuple[int, str]][source]

Asynchronous iterator to stream logs from the subprocess line by line.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
stream : str
    The stream to stream logs from. Can be one of `stdout` or `stderr`.
position : Optional[int], default None
    The position in the log file to start streaming from. If None, it starts
    from the beginning of the log file. This allows resuming streaming from
    a previously known position.

Yields
------
Tuple[int, str]
    A tuple containing the position in the log file and the line read. The
    position returned can be used to feed into another `stream_log` call
    for example.

\n", + "

method ExecutingRun.stream_log (self, stream: str, position: Optional[int] = None) -> Iterator[Tuple[int, str]][source]

Asynchronous iterator to stream logs from the subprocess line by line.

Note that this method is asynchronous and needs to be `await`ed.

Parameters
----------
stream : str
    The stream to stream logs from. Can be one of `stdout` or `stderr`.
position : int, optional, default None
    The position in the log file to start streaming from. If None, it starts
    from the beginning of the log file. This allows resuming streaming from
    a previously known position.

Yields
------
Tuple[int, str]
    A tuple containing the position in the log file and the line read. The
    position returned can be used to feed into another `stream_log` call
    for example.
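
Resuming from a known position, as described above, might look like the following sketch. `collect_lines` and its `limit` parameter are hypothetical; the only assumptions about `running` are that `stream_log` is consumed with `async for` and yields `(position, line)` tuples.

```python
async def collect_lines(running, stream="stdout", position=None, limit=None):
    """Read log lines from `running`, returning (last_position, lines).

    The returned position can be passed back in to continue where this
    call stopped.
    """
    lines = []
    async for position, line in running.stream_log(stream, position):
        lines.append(line)
        # Stop early once `limit` lines have been read, if a limit was given.
        if limit is not None and len(lines) >= limit:
            break
    return position, lines
```

A second call such as `await collect_lines(running, position=last_position)` would then pick up after the lines already read.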

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", "\n", "\n", "\t\n", - "\t\n", + "\t\n", "\n", "\n", "\t\n", @@ -939,7 +941,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 20, diff --git a/docs/api/runner.md b/docs/api/runner.md index 4b148f48..b53ad1d4 100644 --- a/docs/api/runner.md +++ b/docs/api/runner.md @@ -24,24 +24,25 @@ with await Runner(...).async_run() as running: If you don't use `Runner` as a context manager, remember to call `Runner.cleanup()` to remove any leftover temp files. - + - + - - - + + + + - + @@ -55,7 +56,7 @@ If you don't use `Runner` as a context manager, remember to call `Runner.cleanup These calls block until the command completes. - + @@ -70,7 +71,7 @@ These calls block until the command completes. - + @@ -87,7 +88,7 @@ These calls block until the command completes. ### Non-Blocking API - + @@ -102,7 +103,7 @@ These calls block until the command completes. - + @@ -121,7 +122,7 @@ These calls block until the command completes. `NBRunner` is a wrapper over `Runner` which allows you to refer to a flow defined in a notebook cell instead of a file. For examples, see [Running flows in a notebook](/metaflow/managing-flows/notebook-runs). - + @@ -129,9 +130,10 @@ These calls block until the command completes. - - - + + + + @@ -140,7 +142,7 @@ These calls block until the command completes. ### Blocking API - + @@ -155,7 +157,7 @@ These calls block until the command completes. - + @@ -172,7 +174,7 @@ These calls block until the command completes. ### Non-Blocking API - + @@ -187,7 +189,7 @@ These calls block until the command completes. - + @@ -202,7 +204,7 @@ These calls block until the command completes. - + @@ -214,7 +216,7 @@ These calls block until the command completes. ## ExecutingRun - + @@ -236,7 +238,7 @@ These calls block until the command completes. - + @@ -266,14 +268,14 @@ These calls block until the command completes. ### Non-Blocking API - + - - + + @@ -282,16 +284,17 @@ These calls block until the command completes. 
- + - + + diff --git a/docs/api/step-decorators/batch.ipynb b/docs/api/step-decorators/batch.ipynb index 050f84df..6bcccc0a 100644 --- a/docs/api/step-decorators/batch.ipynb +++ b/docs/api/step-decorators/batch.ipynb @@ -24,10 +24,10 @@ "id": "a5ef9454", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.400491Z", - "iopub.status.busy": "2024-07-25T06:16:59.400293Z", - "iopub.status.idle": "2024-07-25T06:16:59.685244Z", - "shell.execute_reply": "2024-07-25T06:16:59.684870Z" + "iopub.execute_input": "2025-08-30T21:15:37.986569Z", + "iopub.status.busy": "2025-08-30T21:15:37.986325Z", + "iopub.status.idle": "2025-08-30T21:15:38.223739Z", + "shell.execute_reply": "2025-08-30T21:15:38.223456Z" } }, "outputs": [], @@ -46,10 +46,10 @@ "id": "09970e68", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.687754Z", - "iopub.status.busy": "2024-07-25T06:16:59.687597Z", - "iopub.status.idle": "2024-07-25T06:16:59.696794Z", - "shell.execute_reply": "2024-07-25T06:16:59.696531Z" + "iopub.execute_input": "2025-08-30T21:15:38.225915Z", + "iopub.status.busy": "2025-08-30T21:15:38.225753Z", + "iopub.status.idle": "2025-08-30T21:15:38.234985Z", + "shell.execute_reply": "2025-08-30T21:15:38.234740Z" } }, "outputs": [ @@ -57,9 +57,9 @@ "data": { "text/html": [ "\n", - "

decorator @batch (...)[source]

metaflow

Specifies that this step should execute on [AWS Batch](https://aws.amazon.com/batch/).

Parameters
----------
cpu : int, default 1
    Number of CPUs required for this step. If `@resources` is
    also present, the maximum value from all decorators is used.
gpu : int, default 0
    Number of GPUs required for this step. If `@resources` is
    also present, the maximum value from all decorators is used.
memory : int, default 4096
    Memory size (in MB) required for this step. If
    `@resources` is also present, the maximum value from all decorators is
    used.
image : str, optional, default None
    Docker image to use when launching on AWS Batch. If not specified, and
    METAFLOW_BATCH_CONTAINER_IMAGE is specified, that image is used. If
    not, a default Docker image mapping to the current version of Python is used.
queue : str, default METAFLOW_BATCH_JOB_QUEUE
    AWS Batch Job Queue to submit the job to.
iam_role : str, default METAFLOW_ECS_S3_ACCESS_IAM_ROLE
    AWS IAM role that AWS Batch container uses to access AWS cloud resources.
execution_role : str, default METAFLOW_ECS_FARGATE_EXECUTION_ROLE
    AWS IAM role that AWS Batch can use [to trigger AWS Fargate tasks]
    (https://docs.aws.amazon.com/batch/latest/userguide/execution-IAM-role.html).
shared_memory : int, optional, default None
    The value for the size (in MiB) of the /dev/shm volume for this step.
    This parameter maps to the `--shm-size` option in Docker.
max_swap : int, optional, default None
    The total amount of swap memory (in MiB) a container can use for this
    step. This parameter is translated to the `--memory-swap` option in
    Docker where the value is the sum of the container memory plus the
    `max_swap` value.
swappiness : int, optional, default None
    This allows you to tune memory swappiness behavior for this step.
    A swappiness value of 0 causes swapping not to happen unless absolutely
    necessary. A swappiness value of 100 causes pages to be swapped very
    aggressively. Accepted values are whole numbers between 0 and 100.
use_tmpfs : bool, default False
    This enables an explicit tmpfs mount for this step. Note that tmpfs is
    not available on Fargate compute environments
tmpfs_tempdir : bool, default True
    sets METAFLOW_TEMPDIR to tmpfs_path if set for this step.
tmpfs_size : int, optional, default None
    The value for the size (in MiB) of the tmpfs mount for this step.
    This parameter maps to the `--tmpfs` option in Docker. Defaults to 50% of the
    memory allocated for this step.
tmpfs_path : str, optional, default None
    Path to tmpfs mount for this step. Defaults to /metaflow_temp.
inferentia : int, default 0
    Number of Inferentia chips required for this step.
trainium : int, default None
    Alias for inferentia. Use only one of the two.
efa : int, default 0
    Number of elastic fabric adapter network devices to attach to container
ephemeral_storage : int, default None
    The total amount, in GiB, of ephemeral storage to set for the task, 21-200GiB.
    This is only relevant for Fargate compute environments
log_driver: str, optional, default None
    The log driver to use for the Amazon ECS container.
log_options: List[str], optional, default None
    List of strings containing options for the chosen log driver. The configurable values
    depend on the `log driver` chosen. Validation of these options is not supported yet.
    Example: [`awslogs-group:aws/batch/job`]

\n", + "

decorator @batch (...)[source]

metaflow

Specifies that this step should execute on [AWS Batch](https://aws.amazon.com/batch/).

Parameters
----------
cpu : int, default 1
    Number of CPUs required for this step. If `@resources` is
    also present, the maximum value from all decorators is used.
gpu : int, default 0
    Number of GPUs required for this step. If `@resources` is
    also present, the maximum value from all decorators is used.
memory : int, default 4096
    Memory size (in MB) required for this step. If
    `@resources` is also present, the maximum value from all decorators is
    used.
image : str, optional, default None
    Docker image to use when launching on AWS Batch. If not specified, and
    METAFLOW_BATCH_CONTAINER_IMAGE is specified, that image is used. If
    not, a default Docker image mapping to the current version of Python is used.
queue : str, default METAFLOW_BATCH_JOB_QUEUE
    AWS Batch Job Queue to submit the job to.
iam_role : str, default METAFLOW_ECS_S3_ACCESS_IAM_ROLE
    AWS IAM role that AWS Batch container uses to access AWS cloud resources.
execution_role : str, default METAFLOW_ECS_FARGATE_EXECUTION_ROLE
    AWS IAM role that AWS Batch can use [to trigger AWS Fargate tasks]
    (https://docs.aws.amazon.com/batch/latest/userguide/execution-IAM-role.html).
shared_memory : int, optional, default None
    The value for the size (in MiB) of the /dev/shm volume for this step.
    This parameter maps to the `--shm-size` option in Docker.
max_swap : int, optional, default None
    The total amount of swap memory (in MiB) a container can use for this
    step. This parameter is translated to the `--memory-swap` option in
    Docker where the value is the sum of the container memory plus the
    `max_swap` value.
swappiness : int, optional, default None
    This allows you to tune memory swappiness behavior for this step.
    A swappiness value of 0 causes swapping not to happen unless absolutely
    necessary. A swappiness value of 100 causes pages to be swapped very
    aggressively. Accepted values are whole numbers between 0 and 100.
use_tmpfs : bool, default False
    This enables an explicit tmpfs mount for this step. Note that tmpfs is
    not available on Fargate compute environments.
tmpfs_tempdir : bool, default True
    Sets METAFLOW_TEMPDIR to tmpfs_path if set for this step.
tmpfs_size : int, optional, default None
    The value for the size (in MiB) of the tmpfs mount for this step.
    This parameter maps to the `--tmpfs` option in Docker. Defaults to 50% of the
    memory allocated for this step.
tmpfs_path : str, optional, default None
    Path to tmpfs mount for this step. Defaults to /metaflow_temp.
inferentia : int, default 0
    Number of Inferentia chips required for this step.
trainium : int, default None
    Alias for inferentia. Use only one of the two.
efa : int, default 0
    Number of Elastic Fabric Adapter (EFA) network devices to attach to the container.
ephemeral_storage : int, default None
    The total amount, in GiB, of ephemeral storage to set for the task (21-200 GiB).
    This is only relevant for Fargate compute environments.
log_driver: str, optional, default None
    The log driver to use for the Amazon ECS container.
log_options: List[str], optional, default None
    List of strings containing options for the chosen log driver. The configurable values
    depend on the `log_driver` chosen. Validation of these options is not supported yet.
    Example: [`awslogs-group:aws/batch/job`]
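
The memory arithmetic in the `max_swap` and `tmpfs_size` descriptions above can be sketched in plain Python. This is an illustration only, not Metaflow's implementation: the helper name `derived_memory_settings` is hypothetical, the sketch assumes `memory`, `max_swap` and `tmpfs_size` can be treated as the same unit, and it assumes "50%" rounds down to a whole number.

```python
from typing import Dict, Optional


def derived_memory_settings(
    memory: int = 4096,
    max_swap: Optional[int] = None,
    tmpfs_size: Optional[int] = None,
) -> Dict[str, Optional[int]]:
    """Hypothetical helper restating two rules from the parameter list."""
    # The `--memory-swap` value is the container memory plus `max_swap`;
    # when `max_swap` is not given, no swap limit is derived here.
    memory_swap = None if max_swap is None else memory + max_swap
    # `tmpfs_size` falls back to 50% of the memory allocated for the step
    # (rounded down, which is an assumption of this sketch).
    effective_tmpfs = memory // 2 if tmpfs_size is None else tmpfs_size
    return {"memory_swap": memory_swap, "tmpfs_size": effective_tmpfs}


settings = derived_memory_settings(memory=8192, max_swap=1024)
print(settings)
```

With `memory=8192` and `max_swap=1024`, the sketch derives a swap limit of 9216 and a tmpfs size of 4096.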

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -89,7 +89,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/batch.md b/docs/api/step-decorators/batch.md index 2a2c6412..578a31a7 100644 --- a/docs/api/step-decorators/batch.md +++ b/docs/api/step-decorators/batch.md @@ -7,7 +7,7 @@ The `@batch` decorator sends a step for execution on the [AWS Batch](https://aws Note that while `@batch` doesn't allow mounting arbitrary disk volumes on the fly, you can create in-memory filesystems easily with `tmpfs` options. For more details, see [using `metaflow.S3` for in-memory processing](/scaling/data#using-metaflows3-for-in-memory-processing). - + diff --git a/docs/api/step-decorators/card.ipynb b/docs/api/step-decorators/card.ipynb index 1b4dfe16..8ab207e2 100644 --- a/docs/api/step-decorators/card.ipynb +++ b/docs/api/step-decorators/card.ipynb @@ -16,10 +16,10 @@ "id": "20310c8b", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:02.423148Z", - "iopub.status.busy": "2024-07-25T06:17:02.422995Z", - "iopub.status.idle": "2024-07-25T06:17:02.722995Z", - "shell.execute_reply": "2024-07-25T06:17:02.722698Z" + "iopub.execute_input": "2025-08-30T21:15:39.898583Z", + "iopub.status.busy": "2025-08-30T21:15:39.898479Z", + "iopub.status.idle": "2025-08-30T21:15:40.106685Z", + "shell.execute_reply": "2025-08-30T21:15:40.106363Z" } }, "outputs": [], @@ -38,10 +38,10 @@ "id": "7f2bc67e", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:02.725533Z", - "iopub.status.busy": "2024-07-25T06:17:02.725351Z", - "iopub.status.idle": "2024-07-25T06:17:02.733139Z", - "shell.execute_reply": "2024-07-25T06:17:02.732917Z" + "iopub.execute_input": "2025-08-30T21:15:40.108681Z", + "iopub.status.busy": "2025-08-30T21:15:40.108552Z", + "iopub.status.idle": "2025-08-30T21:15:40.117210Z", + "shell.execute_reply": "2025-08-30T21:15:40.116977Z" } }, "outputs": [ @@ -53,9 +53,9 @@ "@@ Returns \n", "------- in \n", "Creates 
a human-readable report, a Metaflow Card, after this step completes.\n", - "... in the docstring of CardDecorator in /Users/ville/src/madhur-metaflow/metaflow/plugins/cards/card_decorator.py.\n", + "... in the docstring of CardDecorator in /Users/ville/src/metaflow/metaflow/plugins/cards/card_decorator.py.\n", " warn(msg)\n", - "/Users/ville/mambaforge/envs/docs/lib/python3.11/site-packages/numpydoc/docscrape.py:434: UserWarning: Unknown section Mf Add To Current in the docstring of CardDecorator in /Users/ville/src/madhur-metaflow/metaflow/plugins/cards/card_decorator.py.\n", + "/Users/ville/mambaforge/envs/docs/lib/python3.11/site-packages/numpydoc/docscrape.py:434: UserWarning: Unknown section Mf Add To Current in the docstring of CardDecorator in /Users/ville/src/metaflow/metaflow/plugins/cards/card_decorator.py.\n", " warn(msg)\n" ] }, @@ -63,9 +63,9 @@ "data": { "text/html": [ "\n", - "

decorator @card (...)[source]

metaflow

Creates a human-readable report, a Metaflow Card, after this step completes.

Note that you may add multiple `@card` decorators in a step with different parameters.

Parameters
----------
type : str, default 'default'
    Card type.
id : str, optional, default None
    If multiple cards are present, use this id to identify this card.
options : Dict[str, Any], default {}
    Options passed to the card. The contents depend on the card type.
timeout : int, default 45
    Interrupt reporting if it takes more than this many seconds.

MF Add To Current
-----------------
card -> metaflow.plugins.cards.component_serializer.CardComponentCollector
    The `@card` decorator makes the cards available through the `current.card`
    object. If multiple `@card` decorators are present, you can add an `ID` to
    distinguish between them using `@card(id=ID)` as the decorator. You will then
    be able to access that specific card using `current.card[ID].

    Methods available are `append` and `extend`

    @@ Returns
    -------
    CardComponentCollector
        The or one of the cards attached to this step.

\n", + "

decorator @card (...)[source]

metaflow

Creates a human-readable report, a Metaflow Card, after this step completes.

Note that you may add multiple `@card` decorators in a step with different parameters.

Parameters
----------
type : str, default 'default'
    Card type.
id : str, optional, default None
    If multiple cards are present, use this id to identify this card.
options : Dict[str, Any], default {}
    Options passed to the card. The contents depend on the card type.
timeout : int, default 45
    Interrupt reporting if it takes more than this many seconds.

MF Add To Current
-----------------
card -> metaflow.plugins.cards.component_serializer.CardComponentCollector
    The `@card` decorator makes the cards available through the `current.card`
    object. If multiple `@card` decorators are present, you can add an `ID` to
    distinguish between them using `@card(id=ID)` as the decorator. You will then
    be able to access that specific card using `current.card[ID]`.

    Methods available are `append` and `extend`.

    @@ Returns
    -------
    CardComponentCollector
        The card, or one of the cards, attached to this step.

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -79,7 +79,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/card.md b/docs/api/step-decorators/card.md index 235793a9..cedd1227 100644 --- a/docs/api/step-decorators/card.md +++ b/docs/api/step-decorators/card.md @@ -5,7 +5,7 @@ Creates a report card after the step completes. For more information, see [Visua - + diff --git a/docs/api/step-decorators/catch.ipynb b/docs/api/step-decorators/catch.ipynb index 5e4039ba..60ad5586 100644 --- a/docs/api/step-decorators/catch.ipynb +++ b/docs/api/step-decorators/catch.ipynb @@ -18,10 +18,10 @@ "id": "627487a4", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:01.010753Z", - "iopub.status.busy": "2024-07-25T06:17:01.010654Z", - "iopub.status.idle": "2024-07-25T06:17:01.277105Z", - "shell.execute_reply": "2024-07-25T06:17:01.276760Z" + "iopub.execute_input": "2025-08-30T21:15:38.692739Z", + "iopub.status.busy": "2025-08-30T21:15:38.692660Z", + "iopub.status.idle": "2025-08-30T21:15:38.913789Z", + "shell.execute_reply": "2025-08-30T21:15:38.913472Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "af50a6d2", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:01.279312Z", - "iopub.status.busy": "2024-07-25T06:17:01.279168Z", - "iopub.status.idle": "2024-07-25T06:17:01.284680Z", - "shell.execute_reply": "2024-07-25T06:17:01.284461Z" + "iopub.execute_input": "2025-08-30T21:15:38.915990Z", + "iopub.status.busy": "2025-08-30T21:15:38.915771Z", + "iopub.status.idle": "2025-08-30T21:15:38.921468Z", + "shell.execute_reply": "2025-08-30T21:15:38.921266Z" } }, "outputs": [ @@ -65,7 +65,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/conda.ipynb b/docs/api/step-decorators/conda.ipynb index 36037f48..072f884c 100644 --- a/docs/api/step-decorators/conda.ipynb +++ b/docs/api/step-decorators/conda.ipynb @@ -18,10 +18,10 @@ "id": "8d5bb116", "metadata": { 
"execution": { - "iopub.execute_input": "2024-07-25T06:17:00.001856Z", - "iopub.status.busy": "2024-07-25T06:17:00.001763Z", - "iopub.status.idle": "2024-07-25T06:17:00.270101Z", - "shell.execute_reply": "2024-07-25T06:17:00.269792Z" + "iopub.execute_input": "2025-08-30T21:15:36.476365Z", + "iopub.status.busy": "2025-08-30T21:15:36.476294Z", + "iopub.status.idle": "2025-08-30T21:15:36.696018Z", + "shell.execute_reply": "2025-08-30T21:15:36.695744Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "29af6ee3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:00.272427Z", - "iopub.status.busy": "2024-07-25T06:17:00.272277Z", - "iopub.status.idle": "2024-07-25T06:17:00.279931Z", - "shell.execute_reply": "2024-07-25T06:17:00.279670Z" + "iopub.execute_input": "2025-08-30T21:15:36.698107Z", + "iopub.status.busy": "2025-08-30T21:15:36.697973Z", + "iopub.status.idle": "2025-08-30T21:15:36.705867Z", + "shell.execute_reply": "2025-08-30T21:15:36.705622Z" } }, "outputs": [ @@ -51,9 +51,9 @@ "data": { "text/html": [ "\n", - "

decorator @conda (...)[source]

metaflow

Specifies the Conda environment for the step.

Information in this decorator will augment any
attributes set in the `@conda_base` flow-level decorator. Hence,
you can use `@conda_base` to set packages required by all
steps and use `@conda` to specify step-specific overrides.

Parameters
----------
packages : Dict[str, str], default {}
    Packages to use for this step. The key is the name of the package
    and the value is the version to use.
libraries : Dict[str, str], default {}
    Supported for backward compatibility. When used with packages, packages will take precedence.
python : str, optional, default None
    Version of Python to use, e.g. '3.7.4'. A default value of None implies
    that the version used will correspond to the version of the Python interpreter used to start the run.
disabled : bool, default False
    If set to True, disables @conda.

\n", + "

decorator @conda (...)[source]

metaflow

Specifies the Conda environment for the step.

Information in this decorator will augment any
attributes set in the `@conda_base` flow-level decorator. Hence,
you can use `@conda_base` to set packages required by all
steps and use `@conda` to specify step-specific overrides.

Parameters
----------
packages : Dict[str, str], default {}
    Packages to use for this step. The key is the name of the package
    and the value is the version to use.
libraries : Dict[str, str], default {}
    Supported for backward compatibility. When used with packages, packages will take precedence.
python : str, optional, default None
    Version of Python to use, e.g. '3.7.4'. A default value of None implies
    that the version used will correspond to the version of the Python interpreter used to start the run.
disabled : bool, default False
    If set to True, disables @conda.
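
The override behaviour described above can be pictured as a dictionary merge. The following is a minimal sketch under stated assumptions, not Metaflow's code: `merge_conda_packages` is a hypothetical helper, and it assumes that "augment" means a key-wise merge in which step-level entries win over `@conda_base` entries and `packages` wins over `libraries`.

```python
from typing import Dict, Optional


def merge_conda_packages(
    base_packages: Dict[str, str],
    step_packages: Optional[Dict[str, str]] = None,
    step_libraries: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Hypothetical helper: combine flow-level and step-level package pins."""
    merged = dict(base_packages)        # start from the @conda_base packages
    merged.update(step_libraries or {})  # legacy `libraries` applied first
    merged.update(step_packages or {})   # `packages` takes precedence
    return merged


base = {"pandas": "2.1.4", "numpy": "1.26.4"}
step = merge_conda_packages(
    base,
    step_packages={"numpy": "2.0.0", "scikit-learn": "1.5.0"},
    step_libraries={"numpy": "1.25.0"},
)
print(step)
```

In this example the step keeps `pandas` from the flow-level pins, overrides `numpy` with the step-level `packages` value, and adds `scikit-learn`. The package versions are placeholders chosen for illustration.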

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -67,7 +67,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/conda.md b/docs/api/step-decorators/conda.md index f6804060..a98ae76b 100644 --- a/docs/api/step-decorators/conda.md +++ b/docs/api/step-decorators/conda.md @@ -7,7 +7,7 @@ The libraries are installed from [Conda repositories](https://anaconda.org/). Fo - + diff --git a/docs/api/step-decorators/environment.ipynb b/docs/api/step-decorators/environment.ipynb index c0aa77aa..1b59d1a6 100644 --- a/docs/api/step-decorators/environment.ipynb +++ b/docs/api/step-decorators/environment.ipynb @@ -24,10 +24,10 @@ "id": "627487a4", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:03.511324Z", - "iopub.status.busy": "2024-07-25T06:17:03.510816Z", - "iopub.status.idle": "2024-07-25T06:17:03.798763Z", - "shell.execute_reply": "2024-07-25T06:17:03.798427Z" + "iopub.execute_input": "2025-08-30T21:15:41.403605Z", + "iopub.status.busy": "2025-08-30T21:15:41.403531Z", + "iopub.status.idle": "2025-08-30T21:15:41.609980Z", + "shell.execute_reply": "2025-08-30T21:15:41.609645Z" } }, "outputs": [], @@ -46,10 +46,10 @@ "id": "af50a6d2", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:03.800957Z", - "iopub.status.busy": "2024-07-25T06:17:03.800802Z", - "iopub.status.idle": "2024-07-25T06:17:03.805689Z", - "shell.execute_reply": "2024-07-25T06:17:03.805444Z" + "iopub.execute_input": "2025-08-30T21:15:41.612041Z", + "iopub.status.busy": "2025-08-30T21:15:41.611905Z", + "iopub.status.idle": "2025-08-30T21:15:41.616866Z", + "shell.execute_reply": "2025-08-30T21:15:41.616657Z" } }, "outputs": [ @@ -70,7 +70,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/kubernetes.ipynb b/docs/api/step-decorators/kubernetes.ipynb index 2ec23e28..7b757402 100644 --- a/docs/api/step-decorators/kubernetes.ipynb +++ b/docs/api/step-decorators/kubernetes.ipynb @@ 
-18,10 +18,10 @@ "id": "e5133463", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:04.019492Z", - "iopub.status.busy": "2024-07-25T06:17:04.019392Z", - "iopub.status.idle": "2024-07-25T06:17:04.308718Z", - "shell.execute_reply": "2024-07-25T06:17:04.308378Z" + "iopub.execute_input": "2025-08-30T21:15:41.654468Z", + "iopub.status.busy": "2025-08-30T21:15:41.654376Z", + "iopub.status.idle": "2025-08-30T21:15:41.879230Z", + "shell.execute_reply": "2025-08-30T21:15:41.878884Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "d7399f58", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:04.311427Z", - "iopub.status.busy": "2024-07-25T06:17:04.311186Z", - "iopub.status.idle": "2024-07-25T06:17:04.322020Z", - "shell.execute_reply": "2024-07-25T06:17:04.321710Z" + "iopub.execute_input": "2025-08-30T21:15:41.881329Z", + "iopub.status.busy": "2025-08-30T21:15:41.881188Z", + "iopub.status.idle": "2025-08-30T21:15:41.892047Z", + "shell.execute_reply": "2025-08-30T21:15:41.891818Z" } }, "outputs": [ @@ -51,9 +51,9 @@ "data": { "text/html": [ "\n", - "

decorator @kubernetes (...)[source]

metaflow

Specifies that this step should execute on Kubernetes.

Parameters
----------
cpu : int, default 1
    Number of CPUs required for this step. If `@resources` is
    also present, the maximum value from all decorators is used.
memory : int, default 4096
    Memory size (in MB) required for this step. If
    `@resources` is also present, the maximum value from all decorators is
    used.
disk : int, default 10240
    Disk size (in MB) required for this step. If
    `@resources` is also present, the maximum value from all decorators is
    used.
image : str, optional, default None
    Docker image to use when launching on Kubernetes. If not specified, and
    METAFLOW_KUBERNETES_CONTAINER_IMAGE is specified, that image is used. If
    not, a default Docker image mapping to the current version of Python is used.
image_pull_policy: str, default KUBERNETES_IMAGE_PULL_POLICY
    If given, the imagePullPolicy to be applied to the Docker image of the step.
service_account : str, default METAFLOW_KUBERNETES_SERVICE_ACCOUNT
    Kubernetes service account to use when launching pod in Kubernetes.
secrets : List[str], optional, default None
    Kubernetes secrets to use when launching pod in Kubernetes. These
    secrets are in addition to the ones defined in `METAFLOW_KUBERNETES_SECRETS`
    in Metaflow configuration.
namespace : str, default METAFLOW_KUBERNETES_NAMESPACE
    Kubernetes namespace to use when launching pod in Kubernetes.
gpu : int, optional, default None
    Number of GPUs required for this step. A value of zero implies that
    the scheduled node should not have GPUs.
gpu_vendor : str, default KUBERNETES_GPU_VENDOR
    The vendor of the GPUs to be used for this step.
tolerations : List[str], default []
    The default is extracted from METAFLOW_KUBERNETES_TOLERATIONS.
    Kubernetes tolerations to use when launching pod in Kubernetes.
use_tmpfs : bool, default False
    This enables an explicit tmpfs mount for this step.
tmpfs_tempdir : bool, default True
    sets METAFLOW_TEMPDIR to tmpfs_path if set for this step.
tmpfs_size : int, optional, default: None
    The value for the size (in MiB) of the tmpfs mount for this step.
    This parameter maps to the `--tmpfs` option in Docker. Defaults to 50% of the
    memory allocated for this step.
tmpfs_path : str, optional, default /metaflow_temp
    Path to tmpfs mount for this step.
persistent_volume_claims : Dict[str, str], optional, default None
    A map (dictionary) of persistent volumes to be mounted to the pod for this step. The map is from persistent
    volumes to the path to which the volume is to be mounted, e.g., `{'pvc-name': '/path/to/mount/on'}`.
shared_memory: int, optional
    Shared memory size (in MiB) required for this step
port: int, optional
    Port number to specify in the Kubernetes job object

\n", + "

decorator @kubernetes (...)[source]

metaflow

Specifies that this step should execute on Kubernetes.

Parameters
----------
cpu : int, default 1
    Number of CPUs required for this step. If `@resources` is
    also present, the maximum value from all decorators is used.
memory : int, default 4096
    Memory size (in MB) required for this step. If
    `@resources` is also present, the maximum value from all decorators is
    used.
disk : int, default 10240
    Disk size (in MB) required for this step. If
    `@resources` is also present, the maximum value from all decorators is
    used.
image : str, optional, default None
    Docker image to use when launching on Kubernetes. If not specified, and
    METAFLOW_KUBERNETES_CONTAINER_IMAGE is specified, that image is used. If
    not, a default Docker image mapping to the current version of Python is used.
image_pull_policy: str, default KUBERNETES_IMAGE_PULL_POLICY
    If given, the imagePullPolicy to be applied to the Docker image of the step.
image_pull_secrets: List[str], default []
    The default is extracted from METAFLOW_KUBERNETES_IMAGE_PULL_SECRETS.
    Kubernetes image pull secrets to use when pulling container images
    in Kubernetes.
service_account : str, default METAFLOW_KUBERNETES_SERVICE_ACCOUNT
    Kubernetes service account to use when launching pod in Kubernetes.
secrets : List[str], optional, default None
    Kubernetes secrets to use when launching pod in Kubernetes. These
    secrets are in addition to the ones defined in `METAFLOW_KUBERNETES_SECRETS`
    in Metaflow configuration.
node_selector: Union[Dict[str,str], str], optional, default None
    Kubernetes node selector(s) to apply to the pod running the task.
    Can be passed in as a comma separated string of values e.g.
    'kubernetes.io/os=linux,kubernetes.io/arch=amd64' or as a dictionary
    {'kubernetes.io/os': 'linux', 'kubernetes.io/arch': 'amd64'}
namespace : str, default METAFLOW_KUBERNETES_NAMESPACE
    Kubernetes namespace to use when launching pod in Kubernetes.
gpu : int, optional, default None
    Number of GPUs required for this step. A value of zero implies that
    the scheduled node should not have GPUs.
gpu_vendor : str, default KUBERNETES_GPU_VENDOR
    The vendor of the GPUs to be used for this step.
tolerations : List[Dict[str,str]], default []
    The default is extracted from METAFLOW_KUBERNETES_TOLERATIONS.
    Kubernetes tolerations to use when launching pod in Kubernetes.
labels: Dict[str, str], default: METAFLOW_KUBERNETES_LABELS
    Kubernetes labels to use when launching pod in Kubernetes.
annotations: Dict[str, str], default: METAFLOW_KUBERNETES_ANNOTATIONS
    Kubernetes annotations to use when launching pod in Kubernetes.
use_tmpfs : bool, default False
    This enables an explicit tmpfs mount for this step.
tmpfs_tempdir : bool, default True
    Sets METAFLOW_TEMPDIR to tmpfs_path if set for this step.
tmpfs_size : int, optional, default: None
    The value for the size (in MiB) of the tmpfs mount for this step.
    This parameter maps to the `--tmpfs` option in Docker. Defaults to 50% of the
    memory allocated for this step.
tmpfs_path : str, optional, default /metaflow_temp
    Path to tmpfs mount for this step.
persistent_volume_claims : Dict[str, str], optional, default None
    A map (dictionary) of persistent volumes to be mounted to the pod for this step. The map is from persistent
    volumes to the path to which the volume is to be mounted, e.g., `{'pvc-name': '/path/to/mount/on'}`.
shared_memory: int, optional
    Shared memory size (in MiB) required for this step.
port: int, optional
    Port number to specify in the Kubernetes job object.
compute_pool : str, optional, default None
    Compute pool to be used for this step.
    If not specified, any accessible compute pool within the perimeter is used.
hostname_resolution_timeout: int, default 10 * 60
    Timeout in seconds for the worker tasks in the gang-scheduled cluster to resolve the hostname of the control task.
    Only applicable when @parallel is used.
qos: str, default: Burstable
    Quality of Service class to assign to the pod. Supported values are: Guaranteed, Burstable, BestEffort

security_context: Dict[str, Any], optional, default None
    Container security context. Applies to the task container. Allows the following keys:
    - privileged: bool, optional, default None
    - allow_privilege_escalation: bool, optional, default None
    - run_as_user: int, optional, default None
    - run_as_group: int, optional, default None
    - run_as_non_root: bool, optional, default None
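
The `node_selector` description above accepts either a comma-separated string or a dictionary. A minimal sketch of how the two forms relate is shown below; `normalize_node_selector` is a hypothetical helper written for this illustration, not a Metaflow function, and its error handling is an assumption.

```python
from typing import Dict, Optional, Union


def normalize_node_selector(
    node_selector: Optional[Union[Dict[str, str], str]],
) -> Dict[str, str]:
    """Hypothetical helper: turn either accepted form into a dictionary."""
    if node_selector is None:
        return {}
    if isinstance(node_selector, dict):
        return dict(node_selector)  # copy so the caller's dict is untouched
    selectors: Dict[str, str] = {}
    for pair in node_selector.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate stray commas
        key, sep, value = pair.partition("=")
        if not sep:
            raise ValueError("expected key=value, got %r" % pair)
        selectors[key.strip()] = value.strip()
    return selectors


as_string = "kubernetes.io/os=linux,kubernetes.io/arch=amd64"
as_dict = {"kubernetes.io/os": "linux", "kubernetes.io/arch": "amd64"}
print(normalize_node_selector(as_string) == as_dict)
```

Both forms from the docstring normalize to the same dictionary, which is why they can be used interchangeably in the decorator arguments.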

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -64,12 +64,16 @@ "\t\n", "\t\n", "\t\n", + "\t\n", "\t\n", "\t\n", + "\t\n", "\t\n", "\t\n", "\t\n", - "\t\n", + "\t\n", + "\t\n", + "\t\n", "\t\n", "\t\n", "\t\n", @@ -77,11 +81,15 @@ "\t\n", "\t\n", "\t\n", + "\t\n", + "\t\n", + "\t\n", + "\t\n", "
\n", "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/kubernetes.md b/docs/api/step-decorators/kubernetes.md index c62cdd74..18d75059 100644 --- a/docs/api/step-decorators/kubernetes.md +++ b/docs/api/step-decorators/kubernetes.md @@ -7,7 +7,7 @@ For options related to `tmpfs`, see [Using `metaflow.S3` for in-memory processin - + @@ -18,12 +18,16 @@ For options related to `tmpfs`, see [Using `metaflow.S3` for in-memory processin + + - + + + @@ -31,6 +35,10 @@ For options related to `tmpfs`, see [Using `metaflow.S3` for in-memory processin + + + +
diff --git a/docs/api/step-decorators/pypi.ipynb b/docs/api/step-decorators/pypi.ipynb index bb18fed5..4e44aaaa 100644 --- a/docs/api/step-decorators/pypi.ipynb +++ b/docs/api/step-decorators/pypi.ipynb @@ -18,10 +18,10 @@ "id": "8d5bb116", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:02.673485Z", - "iopub.status.busy": "2024-07-25T06:17:02.673407Z", - "iopub.status.idle": "2024-07-25T06:17:03.004750Z", - "shell.execute_reply": "2024-07-25T06:17:03.004331Z" + "iopub.execute_input": "2025-08-30T21:15:40.150261Z", + "iopub.status.busy": "2025-08-30T21:15:40.150165Z", + "iopub.status.idle": "2025-08-30T21:15:40.382655Z", + "shell.execute_reply": "2025-08-30T21:15:40.382348Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "29af6ee3", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:03.009355Z", - "iopub.status.busy": "2024-07-25T06:17:03.009178Z", - "iopub.status.idle": "2024-07-25T06:17:03.015047Z", - "shell.execute_reply": "2024-07-25T06:17:03.014828Z" + "iopub.execute_input": "2025-08-30T21:15:40.384732Z", + "iopub.status.busy": "2025-08-30T21:15:40.384588Z", + "iopub.status.idle": "2025-08-30T21:15:40.390145Z", + "shell.execute_reply": "2025-08-30T21:15:40.389963Z" } }, "outputs": [ @@ -65,7 +65,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/resources.ipynb b/docs/api/step-decorators/resources.ipynb index 257e1354..9c688f31 100644 --- a/docs/api/step-decorators/resources.ipynb +++ b/docs/api/step-decorators/resources.ipynb @@ -18,10 +18,10 @@ "id": "c8aea2b0", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:58.369242Z", - "iopub.status.busy": "2024-07-25T06:16:58.369161Z", - "iopub.status.idle": "2024-07-25T06:16:58.665405Z", - "shell.execute_reply": "2024-07-25T06:16:58.665060Z" + "iopub.execute_input": "2025-08-30T21:15:35.489270Z", + "iopub.status.busy": "2025-08-30T21:15:35.489175Z", + "iopub.status.idle": "2025-08-30T21:15:35.701945Z", + "shell.execute_reply": "2025-08-30T21:15:35.701660Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "1d16a1c7", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:58.668000Z", - "iopub.status.busy": "2024-07-25T06:16:58.667823Z", - "iopub.status.idle": "2024-07-25T06:16:58.673649Z", - "shell.execute_reply": "2024-07-25T06:16:58.673353Z" + "iopub.execute_input": "2025-08-30T21:15:35.704097Z", + "iopub.status.busy": "2025-08-30T21:15:35.703901Z", + "iopub.status.idle": "2025-08-30T21:15:35.709338Z", + "shell.execute_reply": "2025-08-30T21:15:35.709118Z" } }, "outputs": [ @@ -51,7 +51,7 @@ "data": { "text/html": [ "\n", - "

decorator @resources (...)[source]

metaflow

Specifies the resources needed when executing this step.

Use `@resources` to specify the resource requirements
independently of the specific compute layer (`@batch`, `@kubernetes`).

You can choose the compute layer on the command line by executing e.g.
```
python myflow.py run --with batch
```
or
```
python myflow.py run --with kubernetes
```
which executes the flow on the desired system using the
requirements specified in `@resources`.

Parameters
----------
cpu : int, default 1
    Number of CPUs required for this step.
gpu : int, default 0
    Number of GPUs required for this step.
disk : int, optional, default None
    Disk size (in MB) required for this step. Only applies on Kubernetes.
memory : int, default 4096
    Memory size (in MB) required for this step.
shared_memory : int, optional, default None
    The value for the size (in MiB) of the /dev/shm volume for this step.
    This parameter maps to the `--shm-size` option in Docker.

\n", + "

decorator @resources (...)[source]

metaflow

Specifies the resources needed when executing this step.

Use `@resources` to specify the resource requirements
independently of the specific compute layer (`@batch`, `@kubernetes`).

You can choose the compute layer on the command line by executing e.g.
```
python myflow.py run --with batch
```
or
```
python myflow.py run --with kubernetes
```
which executes the flow on the desired system using the
requirements specified in `@resources`.

Parameters
----------
cpu : int, default 1
    Number of CPUs required for this step.
gpu : int, optional, default None
    Number of GPUs required for this step.
disk : int, optional, default None
    Disk size (in MB) required for this step. Only applies on Kubernetes.
memory : int, default 4096
    Memory size (in MB) required for this step.
shared_memory : int, optional, default None
    The value for the size (in MiB) of the /dev/shm volume for this step.
    This parameter maps to the `--shm-size` option in Docker.

\n", "
\n", "\n", "\n", @@ -60,7 +60,7 @@ "\n", "\n", "\t\n", - "\t\n", + "\t\n", "\t\n", "\t\n", "\t\n", @@ -68,7 +68,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/resources.md b/docs/api/step-decorators/resources.md index 820ec44e..715cf6c3 100644 --- a/docs/api/step-decorators/resources.md +++ b/docs/api/step-decorators/resources.md @@ -14,7 +14,7 @@ Note that `@resources` takes effect only when combined with another decorator li - + diff --git a/docs/api/step-decorators/retry.ipynb b/docs/api/step-decorators/retry.ipynb index 79cca018..48a7e62b 100644 --- a/docs/api/step-decorators/retry.ipynb +++ b/docs/api/step-decorators/retry.ipynb @@ -18,10 +18,10 @@ "id": "627487a4", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.552838Z", - "iopub.status.busy": "2024-07-25T06:16:59.552756Z", - "iopub.status.idle": "2024-07-25T06:16:59.832917Z", - "shell.execute_reply": "2024-07-25T06:16:59.832549Z" + "iopub.execute_input": "2025-08-30T21:15:37.478143Z", + "iopub.status.busy": "2025-08-30T21:15:37.478052Z", + "iopub.status.idle": "2025-08-30T21:15:37.706804Z", + "shell.execute_reply": "2025-08-30T21:15:37.706547Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "af50a6d2", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:16:59.835195Z", - "iopub.status.busy": "2024-07-25T06:16:59.835052Z", - "iopub.status.idle": "2024-07-25T06:16:59.840908Z", - "shell.execute_reply": "2024-07-25T06:16:59.840631Z" + "iopub.execute_input": "2025-08-30T21:15:37.709095Z", + "iopub.status.busy": "2025-08-30T21:15:37.708941Z", + "iopub.status.idle": "2025-08-30T21:15:37.713902Z", + "shell.execute_reply": "2025-08-30T21:15:37.713679Z" } }, "outputs": [ @@ -65,7 +65,7 @@ "
" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/secrets.ipynb b/docs/api/step-decorators/secrets.ipynb index fe87e2d3..11ce70e5 100644 --- a/docs/api/step-decorators/secrets.ipynb +++ b/docs/api/step-decorators/secrets.ipynb @@ -16,10 +16,10 @@ "id": "627487a4", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:01.505025Z", - "iopub.status.busy": "2024-07-25T06:17:01.504912Z", - "iopub.status.idle": "2024-07-25T06:17:02.122202Z", - "shell.execute_reply": "2024-07-25T06:17:02.121862Z" + "iopub.execute_input": "2025-08-30T21:15:39.399793Z", + "iopub.status.busy": "2025-08-30T21:15:39.399717Z", + "iopub.status.idle": "2025-08-30T21:15:39.606129Z", + "shell.execute_reply": "2025-08-30T21:15:39.605815Z" } }, "outputs": [], @@ -38,10 +38,10 @@ "id": "af50a6d2", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:02.126004Z", - "iopub.status.busy": "2024-07-25T06:17:02.125801Z", - "iopub.status.idle": "2024-07-25T06:17:02.133368Z", - "shell.execute_reply": "2024-07-25T06:17:02.132965Z" + "iopub.execute_input": "2025-08-30T21:15:39.608147Z", + "iopub.status.busy": "2025-08-30T21:15:39.608009Z", + "iopub.status.idle": "2025-08-30T21:15:39.613887Z", + "shell.execute_reply": "2025-08-30T21:15:39.613682Z" } }, "outputs": [ @@ -49,20 +49,21 @@ "data": { "text/html": [ "\n", - "

decorator @secrets (...)[source]

metaflow

Specifies secrets to be retrieved and injected as environment variables prior to
the execution of a step.

Parameters
----------
sources : List[Union[str, Dict[str, Any]]], default: []
    List of secret specs, defining how the secrets are to be retrieved

\n", + "

decorator @secrets (...)[source]

metaflow

Specifies secrets to be retrieved and injected as environment variables prior to
the execution of a step.

Parameters
----------
sources : List[Union[str, Dict[str, Any]]], default: []
    List of secret specs, defining how the secrets are to be retrieved
role : str, optional, default: None
    Role to use for fetching secrets

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", "\n", "\n", "\t\n", + "\t\n", "\n", "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/secrets.md b/docs/api/step-decorators/secrets.md index 86028eb3..b2a3a2fb 100644 --- a/docs/api/step-decorators/secrets.md +++ b/docs/api/step-decorators/secrets.md @@ -5,13 +5,14 @@ The `@secrets` decorator allows you to access secrets, such as database credenti - + + diff --git a/docs/api/step-decorators/step.ipynb b/docs/api/step-decorators/step.ipynb index 049cd566..7c1d6210 100644 --- a/docs/api/step-decorators/step.ipynb +++ b/docs/api/step-decorators/step.ipynb @@ -18,10 +18,10 @@ "id": "627487a4", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:00.752153Z", - "iopub.status.busy": "2024-07-25T06:17:00.752043Z", - "iopub.status.idle": "2024-07-25T06:17:01.014513Z", - "shell.execute_reply": "2024-07-25T06:17:01.014212Z" + "iopub.execute_input": "2025-08-30T21:15:38.147424Z", + "iopub.status.busy": "2025-08-30T21:15:38.147351Z", + "iopub.status.idle": "2025-08-30T21:15:38.392184Z", + "shell.execute_reply": "2025-08-30T21:15:38.391902Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "af50a6d2", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:01.017025Z", - "iopub.status.busy": "2024-07-25T06:17:01.016865Z", - "iopub.status.idle": "2024-07-25T06:17:01.021112Z", - "shell.execute_reply": "2024-07-25T06:17:01.020885Z" + "iopub.execute_input": "2025-08-30T21:15:38.394428Z", + "iopub.status.busy": "2025-08-30T21:15:38.394270Z", + "iopub.status.idle": "2025-08-30T21:15:38.398172Z", + "shell.execute_reply": "2025-08-30T21:15:38.397923Z" } }, "outputs": [ @@ -51,9 +51,9 @@ "data": { "text/html": [ "\n", - "

function step [source]

metaflow

Marks a method in a FlowSpec as a Metaflow Step. Note that this
decorator needs to be placed as close to the method as possible (i.e.,
directly above the method, so that it is applied before other decorators).

In other words, this is valid:
```
@batch
@step
def foo(self):
    pass
```

whereas this is not:
```
@step
@batch
def foo(self):
    pass
```

Parameters
----------
f : Union[Callable[[FlowSpecDerived], None], Callable[[FlowSpecDerived, Any], None]]
    Function to make into a Metaflow Step

Returns
-------
Union[Callable[[FlowSpecDerived, StepFlag], None], Callable[[FlowSpecDerived, Any, StepFlag], None]]
    Function that is a Metaflow Step

\n", + "

function step [source]

metaflow

Marks a method in a FlowSpec as a Metaflow Step. Note that this
decorator needs to be placed as close to the method as possible (i.e.,
directly above the method, so that it is applied before other decorators).

In other words, this is valid:
```
@batch
@step
def foo(self):
    pass
```

whereas this is not:
```
@step
@batch
def foo(self):
    pass
```
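
One way to read this rule: Python applies stacked decorators bottom-up, so the decorator nearest the `def` runs first. The sketch below illustrates only that ordering. The names `step_like` and `batch_like` are hypothetical stand-ins, not Metaflow's `@step` or `@batch`, and the check inside `batch_like` is an assumption made for illustration rather than a description of how Metaflow validates decorators.

```python
# Hypothetical stand-ins; these are NOT Metaflow's @step or @batch.
applied = []  # records the order in which decorators are applied


def step_like(f):
    applied.append("step_like")
    f.is_step = True  # mark the plain function as a step
    return f


def batch_like(f):
    applied.append("batch_like")
    # A decorator layered on top can verify that it received a marked step.
    if not getattr(f, "is_step", False):
        raise TypeError("batch_like must wrap a function already marked as a step")
    return f


# Valid order: step_like is nearest the function, so it is applied first.
@batch_like
@step_like
def foo():
    pass


order_valid = list(applied)  # snapshot of the application order for foo

# Reversed order: batch_like is applied first and sees an unmarked function.
try:
    @step_like
    @batch_like
    def bar():
        pass
except TypeError as err:
    wrong_order_error = str(err)
```

In this sketch, the reversed stacking fails because the outer-layer decorator runs before the function has been marked, which mirrors why the step marker has to sit closest to the method.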

Parameters
----------
f : Union[Callable[[FlowSpecDerived], None], Callable[[FlowSpecDerived, Any], None]]
    Function to make into a Metaflow Step

Returns
-------
Union[Callable[[FlowSpecDerived, StepFlag], None], Callable[[FlowSpecDerived, Any, StepFlag], None]]
    Function that is a Metaflow Step

\n", "
\n", - "\n", + "\n", "\n", "\n", "\n", @@ -67,7 +67,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/api/step-decorators/step.md b/docs/api/step-decorators/step.md index 87f50c39..446064c0 100644 --- a/docs/api/step-decorators/step.md +++ b/docs/api/step-decorators/step.md @@ -7,7 +7,7 @@ Use `@step` to construct Metaflow workflows. For more information, see [Basics o - + diff --git a/docs/api/step-decorators/timeout.ipynb b/docs/api/step-decorators/timeout.ipynb index f57fd574..e09f2d09 100644 --- a/docs/api/step-decorators/timeout.ipynb +++ b/docs/api/step-decorators/timeout.ipynb @@ -18,10 +18,10 @@ "id": "627487a4", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:03.018541Z", - "iopub.status.busy": "2024-07-25T06:17:03.018463Z", - "iopub.status.idle": "2024-07-25T06:17:03.292350Z", - "shell.execute_reply": "2024-07-25T06:17:03.292035Z" + "iopub.execute_input": "2025-08-30T21:15:40.899541Z", + "iopub.status.busy": "2025-08-30T21:15:40.899459Z", + "iopub.status.idle": "2025-08-30T21:15:41.104548Z", + "shell.execute_reply": "2025-08-30T21:15:41.104217Z" } }, "outputs": [], @@ -40,10 +40,10 @@ "id": "af50a6d2", "metadata": { "execution": { - "iopub.execute_input": "2024-07-25T06:17:03.294731Z", - "iopub.status.busy": "2024-07-25T06:17:03.294553Z", - "iopub.status.idle": "2024-07-25T06:17:03.300684Z", - "shell.execute_reply": "2024-07-25T06:17:03.300430Z" + "iopub.execute_input": "2025-08-30T21:15:41.106598Z", + "iopub.status.busy": "2025-08-30T21:15:41.106465Z", + "iopub.status.idle": "2025-08-30T21:15:41.112213Z", + "shell.execute_reply": "2025-08-30T21:15:41.111997Z" } }, "outputs": [ @@ -66,7 +66,7 @@ "" ], "text/plain": [ - "" + "" ] }, "execution_count": 2, diff --git a/docs/index.md b/docs/index.md index 0339205a..e44dc369 100644 --- a/docs/index.md +++ b/docs/index.md @@ -21,14 +21,14 @@ Metaflow makes it easy to build and manage real-life data science, AI, and ML pr ## Getting Started - [Installing 
Metaflow locally](getting-started/install) -- [Setting Up the Dev Stack](getting-started/devstack) ✨*New*✨ +- [Setting Up the Dev Stack](getting-started/devstack) - [Deploying Infrastructure for Metaflow](getting-started/infrastructure) - [Quickstart Tutorial](getting-started/tutorials/) ## I. Flow Development - [Introduction to Developing with Metaflow](metaflow/introduction) -- [Creating Flows](metaflow/basics) +- [Creating Flows](metaflow/basics) ✨*New: support for conditional and recursive steps*✨ - [Inspecting Flows and Results](metaflow/client) - [Managing Flows in Notebooks and Scripts](metaflow/managing-flows/introduction) - [Debugging Flows](metaflow/debugging) diff --git a/docs/metaflow/basics.md b/docs/metaflow/basics.md index 788b7937..a8247ff8 100644 --- a/docs/metaflow/basics.md +++ b/docs/metaflow/basics.md @@ -38,7 +38,7 @@ another one. Here is a graph with two linear transitions: -![](/assets/graph_linear.png) +![](/assets/dag-linear.png) The corresponding Metaflow script looks like this: @@ -109,7 +109,7 @@ transitions to two parallel steps, `a` and `b`. Any number of parallel steps are allowed. A benefit of a branch like this is performance: Metaflow can execute `a` and `b` over multiple CPU cores or over multiple instances in the cloud. -![](/assets/graph_branch.png) +![](/assets/dag-branch.png) ```python from metaflow import FlowSpec, step @@ -169,7 +169,7 @@ methods, many parallel copies of steps inside a foreach loop are executed. A foreach loop can iterate over any list like `titles` below. -![](/assets/graph_foreach.png) +![](/assets/dag-foreach.png) ```python from metaflow import FlowSpec, step @@ -217,6 +217,138 @@ task. You can nest foreaches and combine them with branches and linear steps arbitrarily. +### Conditionals + +:::info +This is a new feature in Metaflow 2.18. +::: + +Conditional branches allow you to choose *one* branch among multiple +options, based on the value of an artifact. 
Similar to `foreach`, you +pass a `condition` keyword argument to `self.next`, pointing to an artifact +that determines which branch to follow. Unlike regular branches, where +multiple paths can run in parallel, conditional branching ensures that +exactly one branch is executed. + +![](/assets/dag-conditional.png) + +Specify the candidate branches as a Python dictionary, similar to a +`switch` statement in many languages. Dictionary keys can be any valid type +(e.g. strings, integers, booleans), and the values must be valid `@step` methods. +The keys may be specified +via [a `Config` file](/metaflow/configuring-flows/introduction), +e.g. `{cfg.first_choice: self.head}`. Note that you must specify a key for +every possible value of the `condition` artifact. + +This flow flips a coin and executes a heads or tails branch accordingly: + +```python +from metaflow import FlowSpec, step +import random + +class CoinFlipFlow(FlowSpec): + + @step + def start(self): + self.choice = random.choice(['head', 'tails']) + self.next({"head": self.head, "tails": self.tails}, condition='choice') + + @step + def head(self): + print('Head!') + self.next(self.end) + + @step + def tails(self): + print('Tails!') + self.next(self.end) + + @step + def end(self): + print("This is the end") + +if __name__ == '__main__': + CoinFlipFlow() +``` + +Note that you don't need a join step with conditionals, since only +one branch is executed, so there are no multiple sets of artifacts +to be merged. + +### Recursion + +:::info +This is a new feature in Metaflow 2.18. +::: + +While Metaflow doesn't support arbitrary loops across the flow - +the flows are *Directed* **Acyclic** *Graphs* after all - as a special +case, you can execute a single step multiple times recursively. + +![](/assets/dag-recursion.png) + +Similar to `foreach`, this construct causes a single step to spawn multiple tasks, +each producing its own artifacts. 
Recursion is particularly useful when you want +to *snapshot and persist artifacts* before continuing processing. + +If the flow +fails during recursion, you can use the +[`resume` command](/metaflow/debugging#how-to-use-the-resume-command) to pick up +from the latest successful iteration instead of starting over. And, you are able +to use [the Client API](/metaflow/client) to inspect any iteration afterwards - +for instance, a particular parametrization of a hyperparameter optimization loop. + +Define recursion simply by creating a conditional where one of the branches points +at the step itself. For instance, the example below implements the +following `while` loop + +```python +x = 1 +while x < 5: + x += 1 +``` + +...as the `loop` step: + +```python +from metaflow import FlowSpec, step + +class WhileFlow(FlowSpec): + + @step + def start(self): + self.next(self.loop) + + @step + def loop(self): + self.x = getattr(self, 'x', 0) + 1 + print('X is', self.x) + self.again = self.x < 5 + self.next({True: self.loop, False: self.end}, condition='again') + + @step + def end(self): + print("The final X is", self.x) + +if __name__ == '__main__': + WhileFlow() +``` + +You can interrupt the flow mid-execution and try `python while.py resume` to observe +how the execution resumes from the latest successful iteration instead of starting +again. + +### Nesting constructs + +You can combine and nest the above constructs almost arbitrarily. + +For example, the following flow orchestrates [multiple agents working in +parallel](https://outerbounds.com/blog/agentic-metaflow) using a **foreach**. +The agent state is persisted through **recursive steps**, and the agent can +continue along a suitable path through **a conditional**. + +![](/assets/dag-nested.png) + ## What should be a step? 
There is not a single right way of structuring code as a graph of steps but here are diff --git a/docs/production/scheduling-metaflow-flows/scheduling-with-airflow.md b/docs/production/scheduling-metaflow-flows/scheduling-with-airflow.md index 817e2b9c..02fdd5f6 100644 --- a/docs/production/scheduling-metaflow-flows/scheduling-with-airflow.md +++ b/docs/production/scheduling-metaflow-flows/scheduling-with-airflow.md @@ -43,6 +43,13 @@ multiple people, multiple workflows, or it is becoming business-critical, check section around [coordinating larger Metaflow projects](../coordinating-larger-metaflow-projects.md). +:::info Note +[Conditional and recursive steps](/metaflow/basics#conditionals), +introduced in Metaflow 2.18, are not yet supported +on Airflow deployments. Contact [the Metaflow Slack](http://slack.outerbounds.co) if +you have a use case for this feature. +::: + ## Pushing a flow to production Let's use [the flow from the section about diff --git a/docs/production/scheduling-metaflow-flows/scheduling-with-aws-step-functions.md b/docs/production/scheduling-metaflow-flows/scheduling-with-aws-step-functions.md index 1f3df50d..c8e18233 100644 --- a/docs/production/scheduling-metaflow-flows/scheduling-with-aws-step-functions.md +++ b/docs/production/scheduling-metaflow-flows/scheduling-with-aws-step-functions.md @@ -35,6 +35,14 @@ You can interact with Step Functions programmatically using the `Deployer` API - more about it here](/metaflow/managing-flows/deployer). ::: +:::info Note +[Conditional and recursive steps](/metaflow/basics#conditionals), +introduced in Metaflow 2.18, are not yet supported +on Step Functions deployments. Contact +[the Metaflow Slack](http://slack.outerbounds.co) if +you have a use case for this feature. 
+::: + ## Pushing a flow to production Let's use [the flow from the section about diff --git a/static/assets/dag-branch.png b/static/assets/dag-branch.png new file mode 100644 index 00000000..a872c53f Binary files /dev/null and b/static/assets/dag-branch.png differ diff --git a/static/assets/dag-conditional.png b/static/assets/dag-conditional.png new file mode 100644 index 00000000..f6f85228 Binary files /dev/null and b/static/assets/dag-conditional.png differ diff --git a/static/assets/dag-foreach.png b/static/assets/dag-foreach.png new file mode 100644 index 00000000..1d211711 Binary files /dev/null and b/static/assets/dag-foreach.png differ diff --git a/static/assets/dag-linear.png b/static/assets/dag-linear.png new file mode 100644 index 00000000..7b422639 Binary files /dev/null and b/static/assets/dag-linear.png differ diff --git a/static/assets/dag-nested.png b/static/assets/dag-nested.png new file mode 100644 index 00000000..ec5b4259 Binary files /dev/null and b/static/assets/dag-nested.png differ diff --git a/static/assets/dag-recursion.png b/static/assets/dag-recursion.png new file mode 100644 index 00000000..a011b031 Binary files /dev/null and b/static/assets/dag-recursion.png differ