
Python Helm charts no longer deploy into EKS cluster #1565

Closed · benesch opened this issue May 6, 2021 · 18 comments
Labels: emergent (An issue that was added to the current milestone after planning), kind/bug (Some behavior is incorrect or out of spec), resolution/duplicate (This issue is a duplicate of another issue)

benesch commented May 6, 2021

Since #1539, when deploying a Helm v3 chart in Python, the resources in that chart are no longer deployed to the provider specified on the chart. This means that, for example, attempting to deploy a chart to an EKS cluster in Python silently deploys it to whatever cluster is in your kubeconfig instead.

Steps to reproduce

Assuming eks.cluster is an EKS cluster resource, run a program like:

k8s.helm.v3.Chart(
    "some-helm-chart",
    k8s.helm.v3.ChartOpts(
        chart="whatever",
        # ...
    ),
    opts=pulumi.ResourceOptions(provider=eks.cluster.provider),
)

The chart's resources will get deployed to whatever cluster kubectl is configured to use, rather than to the EKS cluster.

Likely cause

This is almost certainly fallout from #1539. v3.0.0 does not exhibit this bug, but v3.1.0 (which includes #1539) does.

cc @lukehoban @lblackstone

benesch added the kind/bug (Some behavior is incorrect or out of spec) label May 6, 2021
lukehoban self-assigned this May 6, 2021
lukehoban (Member) commented:

@benesch I just tried this, and I was not able to reproduce the issue.

With this program:

from pulumi import ResourceOptions
from pulumi_kubernetes import Provider
from pulumi_kubernetes.helm.v3 import Chart, LocalChartOpts

prov = Provider("p", context="docker-desktop")

Chart("foo", LocalChartOpts(path="../foo", namespace="lager"))
Chart("foo2", LocalChartOpts(path="../foo", namespace="lager2"), ResourceOptions(provider=prov))

and this requirements.txt:

pulumi>=3.0.0,<4.0.0
pulumi-kubernetes==3.1.0

My current context was set to a cluster that is not reachable, while a local docker-desktop context was available.

Running an update gives me the expected result:

Updating (dev)

View Live: https://app.pulumi.com/lukehoban/pychart/dev/updates/6

     Type                                     Name         Status                  Info
 +   pulumi:pulumi:Stack                      pychart-dev  **creating failed**     1 error
 +   ├─ kubernetes:helm.sh/v3:Chart           foo2         created                 
 +   │  ├─ kubernetes:core/v1:ServiceAccount  lager2/foo2  created                 
 +   │  ├─ kubernetes:core/v1:Service         lager2/foo2  created                 
 +   │  └─ kubernetes:apps/v1:Deployment      lager2/foo2  created                 
 +   ├─ pulumi:providers:kubernetes           p            created                 
 +   └─ kubernetes:helm.sh/v3:Chart           foo          created                 
 +      ├─ kubernetes:core/v1:ServiceAccount  lager/foo    **creating failed**     1 error
 +      ├─ kubernetes:core/v1:Service         lager/foo    **creating failed**     1 error
 +      └─ kubernetes:apps/v1:Deployment      lager/foo    **creating failed**     1 error
 

Do you have any more details on your repro for this?

lblackstone (Member) commented:

I was able to repro with the following program:

import pulumi
import pulumi_aws as aws
import pulumi_eks as eks
import os
from pulumi import ResourceOptions
from pulumi_kubernetes.helm.v3 import Chart, ChartOpts, FetchOpts

base_name = "demo"
profile = os.environ["AWS_PROFILE"]

# Create an AWS provider instance using the named profile creds
# and current region.
uswest2 = aws.Provider("uswest2", region="us-west-2", profile=profile)

kubeconfig_opts = eks.KubeconfigOptionsArgs(profile_name=profile)
myekscluster = eks.Cluster(
    base_name,
    provider_credential_opts=kubeconfig_opts,
    opts=pulumi.ResourceOptions(provider=uswest2)
)

Chart("nginx", ChartOpts(
    chart="nginx",
    values={"service": {"type": "ClusterIP"}},
    fetch_opts=FetchOpts(
        repo="https://charts.bitnami.com/bitnami"
    )
), ResourceOptions(provider=myekscluster.provider))

and this requirements.txt:

pulumi>=3.0.0,<4.0.0
pulumi-aws>=4.0.0,<5.0.0
pulumi-eks>=0.30.0
pulumi-kubernetes==3.1.0

Here are the relevant parts of the state file:

Chart

{
    "urn": "urn:pulumi:dev::eks-py-test::kubernetes:helm.sh/v3:Chart::nginx",
    "custom": false,
    "type": "kubernetes:helm.sh/v3:Chart",
    "outputs": {
        "resources": {
            "apps/v1/Deployment:nginx": {
                "4dabf18193072939515e22adb298388d": "5cf8f73096256a8f31e491e813e4eb8e",
                "id": "default/nginx",
                "packageVersion": "",
                "urn": "urn:pulumi:dev::eks-py-test::kubernetes:helm.sh/v3:Chart$kubernetes:apps/v1:Deployment::nginx"
            },
            "v1/ConfigMap:nginx-server-block": {
                "4dabf18193072939515e22adb298388d": "5cf8f73096256a8f31e491e813e4eb8e",
                "id": "default/nginx-server-block",
                "packageVersion": "",
                "urn": "urn:pulumi:dev::eks-py-test::kubernetes:helm.sh/v3:Chart$kubernetes:core/v1:ConfigMap::nginx-server-block"
            },
            "v1/Service:nginx": {
                "4dabf18193072939515e22adb298388d": "5cf8f73096256a8f31e491e813e4eb8e",
                "id": "default/nginx",
                "packageVersion": "",
                "urn": "urn:pulumi:dev::eks-py-test::kubernetes:helm.sh/v3:Chart$kubernetes:core/v1:Service::nginx"
            }
        }
    },
    "parent": "urn:pulumi:dev::eks-py-test::pulumi:pulumi:Stack::eks-py-test-dev",
    "aliases": [
        "urn:pulumi:dev::eks-py-test::kubernetes:helm.sh/v2:Chart::nginx"
    ]
},

Deployment (Chart sub-resource)

"parent": "urn:pulumi:dev::eks-py-test::kubernetes:helm.sh/v3:Chart::nginx",
"provider": "urn:pulumi:dev::eks-py-test::pulumi:providers:kubernetes::default_3_1_0::2077f3de-aa61-4ebf-8d8c-ce6739f0b7ae",

Note that the sub-resource was assigned the default kubernetes provider (i.e., the ambient kubeconfig) rather than the EKS cluster's provider.

lblackstone (Member) commented:

Also confirmed that it works as expected with pulumi_kubernetes==3.0.0.

Here's the updated Deployment state with the 3.0.0 provider:

"parent": "urn:pulumi:dev::eks-py-test::kubernetes:helm.sh/v3:Chart::nginx",
"provider": "urn:pulumi:dev::eks-py-test::eks:index:Cluster$pulumi:providers:kubernetes::demo-provider::c99dc918-8274-4f42-b90c-3daf6ca53bc3",

benesch commented May 7, 2021

Thanks for piecing together a standalone repro, @lblackstone!

lukehoban (Member) commented:

Debugging this, I'm seeing that the Chart object looks like this when the child Deployment is being created (reading the deployment's parent):

 {'_transformations': [], '_name': 'nginx', '_providers': {<pulumi.output.Output object at 0x10e3f2280>: <pulumi.output.Output object at 0x10e3eac40>}, '_protect': False, '_aliases': [<pulumi.output.Output object at 0x10e3f26a0>], 'urn': <pulumi.output.Output object at 0x10e3f2a90>, 'id': None, 'resources': <pulumi.output.Output object at 0x10e3f2e80>}

Note that the key in the _providers map is an Output, even though the type of _providers and implementation of get_provider assume it is a str. See for example https://github.com/pulumi/pulumi/blob/master/sdk/python/lib/pulumi/resource.py#L792:L808.

It seems that something, most likely related to multi-language components, is populating the providers map incorrectly, which breaks parent provider inheritance. As a result, the child believes there is no kubernetes provider to inherit from the parent, even though there is.
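
To illustrate why that breaks inheritance, here is a minimal sketch (not Pulumi's actual implementation; FakeOutput stands in for an unresolved Output): get_provider does a plain dict lookup keyed by the package name string, so an Output key can never match.

class FakeOutput:
    """Stands in for an unresolved pulumi.Output used as a dict key."""

# The broken state observed above: the key should be the str "kubernetes".
providers = {FakeOutput(): "<the EKS cluster's kubernetes provider>"}

def get_provider(pkg: str):
    # A str can never equal an Output key, so the lookup always misses.
    return providers.get(pkg)

print(get_provider("kubernetes"))  # None -> the child inherits no provider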

benesch commented May 7, 2021

A bit of a shot in the dark, but I filed pulumi/pulumi#6693 a few weeks back about other RemoteComponent-related weirdness. Using eks.Cluster from Python definitely seems to have a bunch of weird edge cases.

lukehoban (Member) commented:

The issue appears to be that myekscluster.provider is an Output, but the implementation of the Resource base class does not correctly handle Output[Provider] arguments. In particular, this line causes a property on the Output[Provider] to be accessed; the result is itself an Output[str], which is then used as the key for the dict entry.

https://github.com/pulumi/pulumi/blob/master/sdk/python/lib/pulumi/runtime/resource.py#L618

Either Resource needs to handle Output[Provider] inputs, or multi-lang components need to ensure that true Provider instances are returned as outputs instead of Output[Provider].
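
The mechanism can be sketched as follows (illustrative only; FakeProvider is a stand-in, not the SDK's internals). Attribute access on an Output is lifted, so reading a property off an Output[Provider] yields another Output, and that Output ends up as the dict key:

import pulumi

class FakeProvider:
    # Stand-in for a ProviderResource; the real property accessed by the
    # SDK resolves to the package name, e.g. "kubernetes".
    package = "kubernetes"

# The provider arrives wrapped in an Output, as eks.Cluster().provider does.
provider_output = pulumi.Output.from_input(FakeProvider())

# Lifted attribute access: this is an Output[str], not the str "kubernetes".
pkg = provider_output.package

# Using it as a dict key reproduces the broken _providers map shown earlier.
providers = {pkg: provider_output}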

leezen added this to the 0.56 milestone May 7, 2021
lblackstone assigned lblackstone and unassigned lukehoban May 7, 2021
lblackstone commented May 7, 2021

I think this regression was caused by the changes to fix pulumi/pulumi-eks#555

I'll go through all of the related changes and make sure Outputs are being handled properly.

Edit: This wasn't the problem.

lblackstone (Member) commented:

After some more digging, I realized that my repro program contains a type error. In the Python, .NET, and Go SDKs, the provider attribute is typed as an Output, rather than as a prompt value as in the NodeJS SDK.

Everything works as expected when I unwrap the provider Output with an apply:

import pulumi
import pulumi_aws as aws
import pulumi_eks as eks
import os
from pulumi import ResourceOptions
from pulumi_kubernetes.helm.v3 import Chart, ChartOpts, FetchOpts

base_name = "demo"
profile = os.environ["AWS_PROFILE"]

# Create an AWS provider instance using the named profile creds
# and current region.
uswest2 = aws.Provider("uswest2", region="us-west-2", profile=profile)

kubeconfig_opts = eks.KubeconfigOptionsArgs(profile_name=profile)
myekscluster = eks.Cluster(
    base_name,
    provider_credential_opts=kubeconfig_opts,
    opts=pulumi.ResourceOptions(provider=uswest2)
)

def chart(provider):
    Chart("nginx", ChartOpts(
        chart="nginx",
        values={"service": {"type": "ClusterIP"}},
        fetch_opts=FetchOpts(
            repo="https://charts.bitnami.com/bitnami"
        )
    ), ResourceOptions(provider=provider))


myekscluster.provider.apply(lambda p: chart(p))

lblackstone (Member) commented:

I narrowed down the change in behavior between v3.0.0 and v3.1.0 to this: https://github.com/pulumi/pulumi-kubernetes/pull/1539/files?file-filters%5B%5D=.py#diff-f2b60d028397820398a9fa1ebca34ef73fe594525259f511f39bfce01ef24e9fL177-L178

With this change reverted, the Output<Provider> implicitly resolves even though the types don't match.

lblackstone (Member) commented:

@benesch We're currently investigating a few options to fix this. As a workaround in the meantime, you can create a new Provider instance using the EKS cluster's kubeconfig, and it will work as you'd expect:

import pulumi
import pulumi_aws as aws
import pulumi_eks as eks
import os
from pulumi import ResourceOptions
from pulumi_kubernetes.helm.v3 import Chart, ChartOpts, FetchOpts
from pulumi_kubernetes import Provider, ProviderArgs

base_name = "demo"
profile = os.environ["AWS_PROFILE"]

# Create an AWS provider instance using the named profile creds
# and current region.
uswest2 = aws.Provider("uswest2", region="us-west-2", profile=profile)

kubeconfig_opts = eks.KubeconfigOptionsArgs(profile_name=profile)
myekscluster = eks.Cluster(
    base_name,
    provider_credential_opts=kubeconfig_opts,
    opts=pulumi.ResourceOptions(provider=uswest2)
)

provider = Provider("k8s", ProviderArgs(
    kubeconfig=myekscluster.kubeconfig
))
Chart("nginx", ChartOpts(
    chart="nginx",
    values={"service": {"type": "ClusterIP"}},
    fetch_opts=FetchOpts(
        repo="https://charts.bitnami.com/bitnami"
    )
), ResourceOptions(provider=provider))

lblackstone added the kind/bug (Some behavior is incorrect or out of spec) and emergent (An issue that was added to the current milestone after planning) labels and removed the kind/bug and priority/P1 labels May 7, 2021
benesch commented May 7, 2021

Thanks, @lblackstone. For now we're happy to stick on v3.0.0, but I'll try out your workaround if we need to upgrade.

lblackstone (Member) commented:

pulumi/pulumi#7012 tracks making the cluster.provider work for SDKs other than NodeJS.

leezen modified the milestones: 0.56, 0.57 May 17, 2021
leezen removed this from the 0.57 milestone Jun 8, 2021
leezen added the resolution/duplicate (This issue is a duplicate of another issue) label Jun 8, 2021
leezen commented Jun 8, 2021

Closing as this is now tracked in pulumi/pulumi#7012.

leezen closed this as completed Jun 8, 2021
benesch commented Jul 7, 2021

I don't mean to sound ungrateful, but I'd like to advocate for reopening this issue until pulumi/pulumi#7012 is fixed or some guardrails are put in place here! While I understand that the aforementioned issue is tracking the root cause, it's not discoverable for folks who are just looking to understand why pulumi_eks and pulumi_kubernetes are not playing well together.

In particular, pulumi/pulumi#3383 means the bug presents as silently deploying into whatever cluster is currently active in the user's kubeconfig. This is the kind of bug that can take down prod. (We were lucky, and it only took down our staging environment, but that's only because I happened to have staging as my active cluster, not prod.)

Since pulumi/pulumi#7012 looks like it's not a quick fix, it'd be great to get some assertions in the SDK to prevent disaster, or at the very least some warnings in the docs.

@lblackstone, thanks for the workaround, but I don't think it'll work for us, since swapping the provider out like that results in Pulumi attempting to replace all the resources in the cluster. I tried to work around this with provider aliases, but it looks like those aren't wired up yet (pulumi/pulumi#3979).
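
For illustration, an assertion of the kind suggested above might look like this (a hypothetical helper, not an existing SDK API): fail fast when opts.provider is not a real ProviderResource instead of silently ignoring it.

import pulumi

def require_real_provider(opts: pulumi.ResourceOptions) -> pulumi.ResourceOptions:
    # Hypothetical guardrail: an Output[Provider] (or anything else that is
    # not a ProviderResource) would otherwise be silently dropped, sending
    # resources to whatever cluster the ambient kubeconfig points at.
    if opts.provider is not None and not isinstance(opts.provider, pulumi.ProviderResource):
        raise TypeError(
            f"opts.provider must be a ProviderResource, got {type(opts.provider).__name__}; "
            "did you pass cluster.provider (an Output) directly?"
        )
    return opts

Wrapping the options, e.g. require_real_provider(ResourceOptions(provider=myekscluster.provider)), would then raise instead of silently targeting the wrong cluster.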

benesch commented Jul 7, 2021

Here's my workaround for now:

import pulumi
import pulumi_eks as eks


class LazyResource:
    """Wraps a resource that may arrive as an Output, deferring urn/id resolution."""

    def __init__(self, resource):
        self.resource = pulumi.Output.from_input(resource)

    @property
    def urn(self):
        return self.resource.apply(lambda r: r.urn)

    @property
    def id(self):
        return self.resource.apply(lambda r: r.id)

    @property
    def __class__(self):
        # Bypass https://github.com/pulumi/pulumi/blob/b7d403204/sdk/python/lib/pulumi/resource.py#L460-L462.
        return pulumi.Resource


class LazyProvider(LazyResource):
    def __init__(self, package, resource):
        super().__init__(resource)
        self.package = package

    @property
    def __class__(self):
        # Make isinstance checks treat this wrapper as a real ProviderResource.
        return pulumi.ProviderResource


cluster = eks.Cluster(...)
_provider = cluster.provider
# Swap the Output[Provider] attribute for a wrapper the SDK will accept.
cluster.__dict__["provider"] = LazyProvider("kubernetes", _provider)

eliskovets commented:

Hi guys,
With my limited understanding of Pulumi, even if this issue is related to pulumi/pulumi#7012, the change here still seems like a problem to me; it hasn't been reverted and still exists in version 3.9.0.

@benesch Could you please tell me if you still use 3.0.0 to mitigate this issue, or if you found another workaround? How do you pass the kubernetes provider to the rest of the Chart resources?

benesch commented Nov 21, 2021

@eliskovets we've been using the workaround provided by @lblackstone in #1565 (comment) to good effect for a while now.
