[backend][sdk] Cannot assign values from Kubernetes Metadata to Environment Variables in V2 #10155

Closed · TristanGreathouse opened this issue Oct 25, 2023 · 1 comment · Fixed by #10496
Labels: area/backend, kind/bug, lifecycle/stale
TristanGreathouse commented Oct 25, 2023

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?
    Using the AWS distribution of Kubeflow.

  • KFP version:
    Built from the 1.8 branch the day before 1.8-rc2 was released.

  • KFP SDK version:

kfp                        2.3.0                  
kfp-kubernetes             1.0.0                  
kfp-pipeline-spec          0.2.2                  
kfp-server-api             2.0.2

Steps to reproduce

When using kfp==1.8.21 and kubernetes==26.1.0, I could do the following during pipeline compilation to expose Kubernetes metadata values as environment variables within a component.

import kfp
from kfp import dsl
from kfp.components import create_component_from_func
from kubernetes import client as k8s_client


def test() -> float:
    return 1.1


# Wrap the plain Python function as a reusable component.
test_component = create_component_from_func(test)


@dsl.pipeline(name="compile-pipeline")
def compile_pipeline():
    # Pull the execution cache key annotation off the pod via the downward API.
    eck_env_var = k8s_client.V1EnvVar(
        name='EXECUTION_CACHE_KEY',
        value_from=k8s_client.V1EnvVarSource(
            field_ref=k8s_client.V1ObjectFieldSelector(
                field_path="metadata.annotations['pipelines.kubeflow.org/execution_cache_key']"
            )
        )
    )

    # Instantiate the component as a task and attach the env var to it.
    test_task = test_component()
    test_task.add_env_variable(eck_env_var)


if __name__ == "__main__":
    kfp.compiler.Compiler().compile(compile_pipeline, "test_pipeline.yaml")

In kfp==2.3.0, however, there is no longer an add_env_variable method; it appears to have been replaced with set_env_variable. That method takes only two mandatory arguments, name and value, both plain strings, so there appears to be no way to populate an environment variable from Kubernetes metadata.

I tried a workaround: set placeholder values during compilation, then post-process the compiled YAML to add the Kubernetes metadata environment variables back in. This is shown in the code snippet below.

import yaml

import kfp
from kfp import dsl


@dsl.component
def test() -> float:
    return 1.1


@dsl.pipeline
def compile_pipeline():
    test_component = test()
    test_component.set_env_variable(name="EXECUTION_CACHE_KEY", value="PLACEHOLDER")
    test_component.set_env_variable(name="KFP_RUN_ID", value="PLACEHOLDER")
    test_component.set_env_variable(name="WORKFLOW_ID", value="PLACEHOLDER")


if __name__ == "__main__":
    kfp.compiler.Compiler().compile(compile_pipeline, "test_pipeline.yaml")

    k8s_resource_mapping = {
        "KFP_RUN_ID": {
            "fieldRef": {
                "apiVersion": "v1",
                "fieldPath": "metadata.labels['pipeline/runid']"
            }
        },
        "WORKFLOW_ID": {
            "fieldRef": {
                "apiVersion": "v1",
                "fieldPath": "metadata.labels['workflows.argoproj.io/workflow']"
            }
        },
        "EXECUTION_CACHE_KEY": {
            "fieldRef": {
                "apiVersion": "v1",
                "fieldPath": "metadata.annotations['pipelines.kubeflow.org/execution_cache_key']"
            }
        }
    }

    # Post-process the compiled YAML: swap each placeholder value for the
    # corresponding Kubernetes downward-API valueFrom block.
    with open("test_pipeline.yaml", 'r') as f:
        configs = list(yaml.safe_load_all(f))

    for config in configs:
        if "deploymentSpec" not in config:
            continue
        for executor in config["deploymentSpec"]["executors"].values():
            env_list = executor["container"].get("env", [])
            for idx, env in enumerate(env_list):
                if env["name"] in k8s_resource_mapping:
                    env_list[idx] = {
                        "name": env["name"],
                        "valueFrom": k8s_resource_mapping[env["name"]],
                    }

    with open("test_pipeline.yaml", 'w') as f:
        yaml.dump_all(configs, f)
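
For completeness, a quick inspection (my own verification step, not part of the pipeline itself) confirms that the rewritten YAML does carry the valueFrom blocks before it is uploaded:

import yaml

# Print every executor's env list from the post-processed YAML to confirm
# the valueFrom blocks survived the rewrite above.
with open("test_pipeline.yaml") as f:
    for doc in yaml.safe_load_all(f):
        executors = doc.get("deploymentSpec", {}).get("executors", {})
        for name, executor in executors.items():
            for env in executor.get("container", {}).get("env", []):
                print(name, env)
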

I was optimistic this would work, but when I started a run and logged the values of those environment variables, they were null-valued. If I inspect the pod with the following command:

kubectl get pod <pod_name> -o json

I see the following output. The environment variables I assigned are missing the intended valueFrom field, while other environment variables injected by non-user-specified processes (e.g. KFP_POD_NAME) do have a valueFrom field in exactly the same format I used in my compiled YAML.

{
    "name": "EXECUTION_CACHE_KEY"
},
{
    "name": "KFP_RUN_ID"
},
{
    "name": "WORKFLOW_ID"
},
{
    "name": "KFP_POD_NAME",
    "valueFrom": {
        "fieldRef": {
            "apiVersion": "v1",
             "fieldPath": "metadata.name"
         }
    }
}

Expected result

I should be able to assign Kubernetes metadata values to environment variables on a component using Kubeflow's DSL. If that is not possible, at the very least I should be able to post-process the compiled YAML to add these values the way they were added in V1, without the backend stripping the valueFrom fields. Perhaps the most logical place for this functionality is the new kfp-kubernetes SDK.
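
For illustration, a downward-API helper in kfp-kubernetes could look roughly like the following (a sketch modeled on the use_field_path_as_env helper that newer kfp-kubernetes releases expose; treat the exact signature here as an assumption):

from kfp import dsl, kubernetes


@dsl.component
def test() -> float:
    return 1.1


@dsl.pipeline
def compile_pipeline():
    task = test()
    # Expose a pod metadata field to the task via the Kubernetes downward API.
    kubernetes.use_field_path_as_env(
        task,
        env_name="KFP_RUN_ID",
        field_path="metadata.labels['pipeline/runid']",
    )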


Impacted by this bug? Give it a 👍.


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions bot added the lifecycle/stale label on Jan 25, 2024
KFP SDK Triage automation moved this from Awaits Contributor to Closed Mar 4, 2024
KFP Runtime Triage automation moved this from Awaits Contributor to Closed Mar 4, 2024