Releases: flyteorg/flyte
Flyte v1.5.1 milestone release
Flyte 1.5.1 Patch Release
This is a patch release that contains only one change: flyteorg/flyteadmin#560,
which cherry-picks flyteorg/flyteadmin#554.
PR #554 adds a migration that remediates an issue we discovered in very old installations of Flyte. The `node_executions` table has a self-referencing foreign key: the table's main `id` column is a `bigint`, whereas the self-foreign-key `parent_id` column was an `int`. This was rooted in an early version of gorm and should not affect most users. Out of an abundance of caution, however, we are adding a migration that patches this issue in a manner that minimizes locking.
To Deploy
When you deploy this release of Flyte, you should make sure that you have more than one Admin pod running. (If you are running the flyte-binary Helm chart, this patch release does not apply to you at all; those deployments should already have the correct column type.) When the two new migrations added in #554 run, the first one may take an extended period of time (hours). However, this is entirely non-blocking as long as another Admin instance is available to serve traffic.
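For example, on a flyte-core installation you can temporarily scale Admin up before upgrading. The `flyteadmin` deployment name and `flyte` namespace below are the chart defaults; adjust them to match your install:

```
$ kubectl -n flyte scale deployment/flyteadmin --replicas=2
```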
The second migration is locking, but even on very large tables it completed in ~5 seconds, so you should not see any noticeable downtime.
Before running, the migration will also check that your database falls into this category (i.e., that the `parent_id` and `id` columns in `node_executions` are mismatched). You can also check this yourself using psql. If the migration is not needed, it will simply mark itself as complete and be a no-op.
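A minimal sketch of that check against the Admin database (run it through psql with your own connection flags; the query uses the standard `information_schema` catalog):

```sql
-- Compare the column types; the migration only applies when they differ
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'node_executions'
  AND column_name IN ('id', 'parent_id');
```

If both columns report `bigint`, the migration is a no-op for your installation.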
Flyte v1.6.0-b1 milestone release
Flyte v1.6.0-b1
What's Changed
Console
- feat: show launchplan in execution table by @pradithya in flyteorg/flyteconsole#738
- feat: show launch plan information in workflow's schedules by @pradithya in flyteorg/flyteconsole#739
- fix: passthrough runtime env vars by @ursucarina in flyteorg/flyteconsole#741
- chore: add fallback to task execution link by @ursucarina in flyteorg/flyteconsole#743
- chore: allow custom subnav by @ursucarina in flyteorg/flyteconsole#734
- fix: force node executions to pull their status by @ursucarina in flyteorg/flyteconsole#737
- chore: fix details panel card padding by @ursucarina in flyteorg/flyteconsole#745
- chore: fix crash by @ursucarina in flyteorg/flyteconsole#746
- [UI Feature] Add full-list log output to execution detail panel by @james-union in flyteorg/flyteconsole#744
- TLM add log-message window to left panel by @james-union in flyteorg/flyteconsole#748
- [Snyk] Upgrade eslint from 8.31.0 to 8.33.0 by @EngHabu in flyteorg/flyteconsole#695
- chore: [tlm] comprehensive node execution query by @ursucarina in flyteorg/flyteconsole#749
- chore: guard against /tasks failing by @ursucarina in flyteorg/flyteconsole#750
- chore: propagate dynamic parent id by @ursucarina in flyteorg/flyteconsole#751
- Add support fetching description entity by @pingsutw in flyteorg/flyteconsole#735
Admin
- Infer GOOS and GOARCH from environment by @jeevb in flyteorg/flyteadmin#550
- Enrich TerminateExecution error to tell propeller the execution already terminated by @EngHabu in flyteorg/flyteadmin#551
- Address resolution by @wild-endeavor in flyteorg/flyteadmin#546
- Add migration to turn `parent_id` column into `bigint` only if necessary by @eapolinario in flyteorg/flyteadmin#554
Propeller
- Moved controller-runtime start out of webhook Run function by @hamersaw in flyteorg/flytepropeller#546
- Fixing recovering of SKIPPED nodes by @hamersaw in flyteorg/flytepropeller#551
- Remove resource injection on the node for container task by @ByronHsu in flyteorg/flytepropeller#544
- Infer GOOS and GOARCH from environment by @jeevb in flyteorg/flytepropeller#552
- fix makefile to read variables from environment and overrides by @jeevb in flyteorg/flytepropeller#554
- Remove BarrierTick by @hamersaw in flyteorg/flytepropeller#545
- Check for TerminateExecution error and eat Precondition status by @EngHabu in flyteorg/flytepropeller#553
- Setting primaryContainerName by default on Pod plugin by @hamersaw in flyteorg/flytepropeller#555
- Implement ability to specify additional/override annotations when using Vault Secret Manager by @pradithya in flyteorg/flytepropeller#556
- Maintaining Interruptible and OverwriteCache for reference launchplans by @hamersaw in flyteorg/flytepropeller#557
- Added support for aborting task nodes reported as failures by @hamersaw in flyteorg/flytepropeller#541
- Added support for EnvironmentVariables on ExecutionConfig by @hamersaw in flyteorg/flytepropeller#558
- Fast fail if task resource requests exceed k8s resource limits by @hamersaw in flyteorg/flytepropeller#488
New Contributors
- @ByronHsu made their first contribution in flyteorg/flytepropeller#544
Flyte v1.5.0 milestone release
Flyte 1.5 release
Platform
We're laying the foundation for an improved experience to help performance investigations. Stay tuned for more details!
We can now submit Ray jobs to separate clusters (other than the one flytepropeller is running on), thanks to flyteorg/flyteplugins#321.
Several bug fixes, including:
- Fix fast-cache bug on first node event
- Split flyte-binary services into http and grpc in helm charts
Database Migrations
One of the planned improvements requires us to clean up our database migrations. We have done so in this release, so you will see a series of new migrations.
These should have zero impact if you are otherwise up-to-date on migrations (which is why they are all labeled `noop`), but please be aware that they will add a minute or so to the init container/command that runs the migrations in the default Helm charts. Notably, because these should be no-ops, they also do not come with any rollback commands.
If you experience any issues, please let us know.
Flytekit
Python 3.11 is now officially supported.
Revamped Data subsystem
The data persistence layer was completely revamped. We now rely exclusively on fsspec to handle IO.
Most users will benefit from a more performant IO subsystem; no change is needed in user code.
This change opened the door for flytekit to rely on fsspec's streaming capabilities. For example, to stream a file we can now write:

```python
@task
def copy_file(ff: FlyteFile) -> FlyteFile:
    new_file = FlyteFile.new_remote_file(ff.remote_path)
    with ff.open("r", cache_type="simplecache", cache_options={}) as r:
        with new_file.open("w") as w:
            w.write(r.read())
    return new_file
```
This feature is marked as experimental. We'd love feedback on the API!
Limited support for partial tasks
We can use `functools.partial` to "freeze" some task arguments. Let's take a look at an example where we partially fix a parameter of a task:

```python
import functools

from flytekit import task, workflow

@task
def t1(a: int, b: str) -> str:
    return f"{a} -> {b}"

t1_fixed_b = functools.partial(t1, b="hello")

@workflow
def wf(a: int) -> str:
    return t1_fixed_b(a=a)
```

Notice how calls to `t1_fixed_b` do not need to specify the `b` parameter.
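Outside of Flyte, this "freezing" is exactly stdlib `functools.partial` behavior; a minimal sketch with a plain function (`greet` here is a hypothetical stand-in for a task function):

```python
import functools

def greet(a: int, b: str) -> str:
    return f"{a} -> {b}"

# b is frozen to "hello"; callers only supply a
greet_fixed_b = functools.partial(greet, b="hello")
print(greet_fixed_b(1))  # → 1 -> hello
```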
This also works for MapTasks in a limited capacity. For example:

```python
from typing import List

from flytekit import task, workflow, partial, map_task

@task
def t1(x: int, y: float) -> float:
    return x + y

@workflow
def wf(y: List[float]):
    partial_t1 = partial(t1, x=5)
    return map_task(partial_t1)(y=y)
```
We are currently seeking feedback on this feature, and as a result, it is labeled as experimental for now.
Also worth mentioning that fixing parameters of type list is not currently supported. For example, if we try to register this workflow:

```python
from functools import partial
from typing import List

from flytekit import task, workflow, map_task

@task
def t(a: int, xs: List[int]) -> str:
    return f"{a} {xs}"

@workflow
def wf():
    partial_t = partial(t, xs=[1, 2, 3])
    map_task(partial_t)(a=[1, 2])
```

we're going to see this error:

```
❯ pyflyte run workflows/example.py wf
Failed with Unknown Exception <class 'ValueError'> Reason: Map tasks do not support partial tasks with lists as inputs.
Map tasks do not support partial tasks with lists as inputs.
```
Flyteconsole
Multiple bug fixes around waiting for external inputs.
Better support for dataclasses in the launch form.
Flyte v1.5.0-a0 milestone release
Flyte v1.5.0-a0 Changelog
*This is an alpha release to help test changes internally before the official release.*
- Noop migrations (flyteorg/flyteadmin#542)
- Tracking reasons time-series (flyteorg/flyteadmin#540)
- Fix fast-cache bug on first node event (flyteorg/flyteadmin#483)
- Bug fix: Branch operator not taking dependencies into account (#3512)
- Persist k8s plugin state between evaluations (flyteorg/flytepropeller#540)
- Add support for GCP Secret Manager (flyteorg/flytepropeller#547)
- Reject button on gate nodes (flyteorg/flyteconsole#733)
- Better error message in case of invalid json in launch form (flyteorg/flyteconsole#693)
- Split flyte-binary services into http and grpc (#3518)
Flyte v1.4.3 milestone release
Flyte 1.4.3 release
This patch release pulls in the compiler changes necessary to compile gate nodes and the fix for StructuredDataset in flyteorg/flyteadmin#541.
Flyte v1.4.2 milestone release
Flyte 1.4.2 release
This patch release pulls in flyteorg/flyteconsole#721
Flyte v1.4.1 milestone release
Flyte 1.4.1 release
A patch release containing a few bug fixes.
Flyte v1.4.0 milestone release
Flyte 1.4 release
The main features of the 1.4 release are:
- Support for `PodTemplate` at the task-level
- Revamped auth system in flytekit
As Python 3.7 reached end-of-life in December 2022, we dropped support for that version in this release.
Platform
Support for `PodTemplate` at the task-level
Users can now define a `PodTemplate` as part of the definition of a task. For example, note how we have access to a full `V1PodSpec` as part of the task definition:
```python
from flytekit import PodTemplate, task
from kubernetes.client import (
    V1Container,
    V1EnvVar,
    V1PodSpec,
    V1ResourceRequirements,
    V1Toleration,
    V1Volume,
)

@task(
    pod_template=PodTemplate(
        primary_container_name="primary",
        labels={"lKeyA": "lValA", "lKeyB": "lValB"},
        annotations={"aKeyA": "aValA", "aKeyB": "aValB"},
        pod_spec=V1PodSpec(
            containers=[
                V1Container(
                    name="primary",
                    image="repo/placeholderImage:0.0.0",
                    command=["echo"],
                    args=["wow"],
                    resources=V1ResourceRequirements(limits={"cpu": "999", "gpu": "999"}),
                    env=[V1EnvVar(name="eKeyC", value="eValC"), V1EnvVar(name="eKeyD", value="eValD")],
                ),
            ],
            volumes=[V1Volume(name="volume")],
            tolerations=[
                V1Toleration(
                    key="num-gpus",
                    operator="Equal",
                    value="1",
                    effect="NoSchedule",
                ),
            ],
        ),
    )
)
def t1(i: str):
    ...
```
We are working on more examples in our documentation. Stay tuned!
Flytekit
As promised in https://github.com/flyteorg/flytekit/releases/tag/v1.3.0, we're backporting important changes to the 1.2.x release branch. In the past month we had 2 releases: https://github.com/flyteorg/flytekit/releases/tag/v1.2.8 and https://github.com/flyteorg/flytekit/releases/tag/v1.2.9.
Here's some of the highlights of this release. For a full changelog please visit https://github.com/flyteorg/flytekit/releases/tag/v1.4.0.
Revamped auth system
In flyteorg/flytekit#1458 we introduced a new OAuth2 handling system based on client-side grpc interceptors.
New sandbox features
In this new release, `flytectl demo` brings the following new features:
- Support for specifying extra configuration for Flyte
- Support for specifying extra cluster resource templates for bootstrapping new namespaces
- Sandbox state (DB, buckets) is now persistent across restarts and upgrades
Flyteconsole
Flyte v1.4.0-b0 milestone release
Flyte v1.4.0-b0 Changelog
Pod Templates and changes to the sandbox experience (mainly around configuration reloading).
A full changelog is going to come in the official release.
Flyte v1.3.0 milestone release
Flyte v1.3.0
The main features of this 1.3 release are
- Databricks support as part of the Spark plugin
- New Helm chart that offers a simpler deployment using just one Flyte service
- Signaling/gate node support (human in the loop tasks)
- User documentation support (backend and flytekit only, limited types)
The latter two are pending some work in Flyte console; they will be piped through fully by the end of Q1. Setting and approving gate nodes is supported in `FlyteRemote`, though only a limited set of types can be passed in.
Notes
There are a couple things to point out with this release.
Caching on Structured Dataset
Please take a look at the flytekit PR notes for more information, but if you haven't bumped Propeller to version v1.1.36 (aka Flyte v1.2) or later, cached tasks that take a dataframe or structured dataset type as input will trigger a cache miss. If you've upgraded Propeller, they will not.
Flytekit Remote Types
In the `FlyteRemote` experience, fetched tasks and workflows will now be based on their respective "spec" classes in the IDL (task/wf) rather than the template. The spec messages are a superset of the template messages, so no information is lost. If you have code that accesses elements of the templates directly, however, it will need to be updated.
Usage Overview
Databricks
Please refer to the documentation for setting up Databricks.
Databricks is a subclass of the Spark task configuration, so you'll be able to use the new class in place of the more general `Spark` configuration.
```python
from flytekit import task
from flytekitplugins.spark import Databricks

@task(
    task_config=Databricks(
        spark_conf={
            "spark.driver.memory": "1000M",
            "spark.executor.memory": "1000M",
            "spark.executor.cores": "1",
            "spark.executor.instances": "2",
            "spark.driver.cores": "1",
        },
        databricks_conf={
            "run_name": "flytekit databricks plugin example",
            "new_cluster": {
                "spark_version": "11.0.x-scala2.12",
                "node_type_id": "r3.xlarge",
                "aws_attributes": {
                    "availability": "ON_DEMAND",
                    "instance_profile_arn": "arn:aws:iam::1237657460:instance-profile/databricks-s3-role",
                },
                "num_workers": 4,
            },
            "timeout_seconds": 3600,
            "max_retries": 1,
        },
    )
)
def my_databricks_task():
    # task body elided
    ...
```
New Deployment Type
A couple of releases ago, we introduced a new Flyte executable that combines all the functionality of Flyte's backend into one command. This simplifies deployment in that only one image needs to run. This approach is now our recommended way for newcomers to install and administer Flyte, and there is a new Helm chart as well. Documentation has been updated to take this into account. For new installations of Flyte (clusters that do not already have the `flyte-core` or `flyte` charts installed), users can run:

```shell
helm install flyte-server flyteorg/flyte-binary --namespace flyte --values your_values.yaml
```
New local demo environment
Users may have noticed that the environment provided by `flytectl demo start` has also been updated to use this new style of deployment, and internally now installs this new Helm chart. The demo cluster now also exposes an internal Docker registry on port 30000. That is, with the new demo cluster up, you can tag and push to `localhost:30000/yourimage:tag123` and the image will be accessible to the internal Docker daemon. The web interface is still at `localhost:30080`, Postgres has been moved to port 30001, and the Minio API (not the web server) has been moved to port 30002.
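For example, pushing a locally built image into the demo registry looks like the following (`yourimage:tag123` is a placeholder for your own image name):

```
$ docker tag yourimage:tag123 localhost:30000/yourimage:tag123
$ docker push localhost:30000/yourimage:tag123
```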
Human-in-the-loop Workflows
Users can now insert sleeps, approval, and input requests, in the form of gate nodes. Check out one of our earlier issues for background information.
```python
from datetime import timedelta

from flytekit import workflow, wait_for_input, approve, sleep

# t1 and t2 are tasks assumed to be defined elsewhere
@workflow
def mainwf(a: int):
    x = t1(a=a)
    s1 = wait_for_input("signal-name", timeout=timedelta(hours=1), expected_type=bool)
    s2 = wait_for_input("signal name 2", timeout=timedelta(hours=2), expected_type=int)
    z = t1(a=5)
    zzz = sleep(timedelta(seconds=10))
    y = t2(a=s2)
    q = t2(a=approve(y, "approvalfory", timeout=timedelta(hours=2)))
    x >> s1
    s1 >> z
    z >> zzz
    ...
```
These also work inside `@dynamic` tasks. Interacting with signals from flytekit's remote experience looks like:
```python
from flytekit.remote.remote import FlyteRemote
from flytekit.configuration import Config

r = FlyteRemote(
    Config.auto(config_file="/Users/ytong/.flyte/dev.yaml"),
    default_project="flytesnacks",
    default_domain="development",
)
r.list_signals("atc526g94gmlg4w65dth")
r.set_signal("signal-name", "execidabc123", True)
```
Overwritten Cached Values on Execution
Users can now configure a workflow execution to overwrite the cache. Each task in the workflow execution, regardless of previous cache status, will execute and write cached values, overwriting previous values if necessary. This allows previously corrupted cache values to be corrected without the tedious process of incrementing the `cache_version` and re-registering Flyte workflows / tasks.
Support for Dask
Users will be able to spawn ephemeral Dask clusters as part of their workflows, similar to the existing support for Ray and Spark.
Looking Ahead
In the coming release, we are focusing on...
- Out-of-core plugins: make backend plugins scalable and easy to author, with no code generation and no tools that MLEs and data scientists are not accustomed to using.
- Performance observability: we have made great progress on exposing both finer-grained runtime metrics and Flyte's orchestration metrics. This is important for better understanding workflow evaluation performance and mitigating inefficiencies.