
Grpc.Core.RpcException: "Failed to deserialize response message." when using "cert-manager.crds.yaml" for ConfigFile #4224

Closed
sfmskywalker opened this issue Mar 29, 2020 · 24 comments
Labels: kind/bug (Some behavior is incorrect or out of spec), language/dotnet

@sfmskywalker

When using the following code in my C# Pulumi project:

var customResourceDefinitions = new ConfigFile("cert-manager-crds", new ConfigFileArgs
{
    File = "https://github.com/jetstack/cert-manager/releases/download/v0.14.0/cert-manager.crds.yaml"
});

and do pulumi up, the process fails with:

Diagnostics:
  pulumi:pulumi:Stack (ya-infra-shared-dev):
    error: Running program 'C:\Projects\Ya\Shared\infra\ya-infra-shared\bin\Debug\netcoreapp3.1\Ya.Infra.Shared.dll' failed with an unhandled exception:
    Grpc.Core.RpcException: Status(StatusCode=Internal, Detail="Failed to deserialize response message.")
       at Pulumi.GrpcMonitor.InvokeAsync(InvokeRequest request)
       at Pulumi.Deployment.InvokeAsync[T](String token, InvokeArgs args, InvokeOptions options, Boolean convertResult)
       at Pulumi.Output`1.ApplyHelperAsync[U](Task`1 dataTask, Func`2 func)
       at Pulumi.Output`1.ApplyHelperAsync[U](Task`1 dataTask, Func`2 func)
       at Pulumi.Output`1.ApplyHelperAsync[U](Task`1 dataTask, Func`2 func)
       at Pulumi.Output`1.Pulumi.IOutput.GetDataAsync()
       at Pulumi.Serialization.Serializer.SerializeAsync(String ctx, Object prop)
       at Pulumi.Deployment.SerializeFilteredPropertiesAsync(String label, IDictionary`2 args, Predicate`1 acceptKey)
       at Pulumi.Deployment.SerializeAllPropertiesAsync(String label, IDictionary`2 args)
       at Pulumi.Deployment.RegisterResourceOutputsAsync(Resource resource, Output`1 outputs)
       at Pulumi.Deployment.Runner.WhileRunningAsync()

I'm trying to automate the installation of cert-manager as described here. One of its first steps mentions I should install some custom resource definitions separately.

Perhaps I should be using a different class to get the cert-manager.crds.yaml applied?

@lblackstone
Member

This worked using the NodeJS SDK, so it must be specific to .NET. Can you take a look @mikhailshilkov?

new k8s.yaml.ConfigFile("guestbook2",
    {
        file: "https://github.com/jetstack/cert-manager/releases/download/v0.14.0/cert-manager.crds.yaml"
    }
);

[Screenshot: successful deployment of the ConfigFile using the NodeJS SDK, Mar 30, 2020]

@lblackstone lblackstone added kind/bug Some behavior is incorrect or out of spec language/dotnet labels Mar 30, 2020
@mikhailshilkov
Member

Here is a minimal repro YAML to get the same exception:

apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
spec:
  validation:
    openAPIV3Schema:
      properties:
        spec:
          properties:
            solver:
              properties:
                http01:
                  properties:
                    ingress:
                      properties:
                        podTemplate:
                          properties:
                            spec:
                              properties:
                                affinity:
                                  properties:
                                    podAffinity:
                                      properties:
                                        preferredDuringSchedulingIgnoredDuringExecution:
                                          items:
                                            properties:
                                              podAffinityTerm:
                                                properties:
                                                  labelSelector:
                                                    properties:
                                                      matchExpressions:
                                                        items:
                                                          properties:
                                                            values:
                                                              type: array

The C# gRPC layer swallows the real exception, but I captured it with a debugger:

Google.Protobuf.InvalidProtocolBufferException: Protocol message had too many levels of nesting.  May be malicious.  Use CodedInputStream.SetRecursionLimit() to increase the depth limit.
   at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
   at Google.Protobuf.FieldCodec.<>c__DisplayClass32_0`1.<ForMessage>b__0(CodedInputStream input)
   at Google.Protobuf.FieldCodec`1.Read(CodedInputStream input)
... (a very long call stack follows)

Unfortunately, I don't see a setting to configure the recursion limit in the gRPC library - the protobuf setting mentioned in the exception doesn't appear to be exposed. I'll open an issue and ask there.

@mikhailshilkov
Member

The issue in gRPC: grpc/grpc#22682

@sfmskywalker
Author

I was wondering if anyone could recommend an alternative approach to installing CertManager, even as a temporary workaround (other than manually invoking helm)?

@gitfool
Contributor

gitfool commented Oct 14, 2020

I've got the same problem using dotnet/c# with cert manager. This is a showstopper! 😬

@sam-cogan

I'm seeing the same issue when using the option to install the cert-manager CRDs as part of the Helm chart. So, as far as I can see, there is currently no way to install cert-manager with Pulumi.

@gitfool
Contributor

gitfool commented Oct 18, 2020

I'm still puzzled why the minimal repro YAML above and cert-manager.yaml cause ParsingPrimitivesMessages.ReadMessage to throw "too many levels of nesting" when DefaultRecursionLimit is 100:

  • the minimal repro YAML has 32 nested elements
  • the cert-manager YAML has at most 37 nested elements (e.g. line 9593)

(using YAML indentation as a guide)

What am I missing?
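For reference, the indentation-based counting method can be sketched in a few lines of Python (this is my own illustration, not code from the thread; the 2-space indent is an assumption that holds for the YAML above):

```python
def max_nesting(yaml_text, indent=2):
    """Estimate YAML nesting depth from leading spaces, using indentation as a guide."""
    depth = 0
    for line in yaml_text.splitlines():
        stripped = line.lstrip(" ")
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        level = (len(line) - len(stripped)) // indent + 1
        depth = max(depth, level)
    return depth

sample = """\
spec:
  validation:
    openAPIV3Schema:
      properties:
        type: array
"""
print(max_nesting(sample))  # 5
```

Running this over the full minimal repro YAML gives the 32 levels quoted above.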

@NArnott

NArnott commented Oct 29, 2020

We just ran into this too, as part of setting up the AWS Load Balancer Controller in our EKS environment. We are fully integrated with Pulumi and need a solution.

@lblackstone
Member

@mikhailshilkov Any ideas on a workaround here?

@mikhailshilkov
Member

mikhailshilkov commented Oct 30, 2020

I don't have any workarounds that wouldn't require a fix in our SDK. Options for how this could be resolved:

None of the options look particularly appealing but forking may be the least evil. Ideas?

@mikhailshilkov
Member

@gitfool

minimal repro yaml has 32 nested elements
What am I missing?

So, for every layer of YAML, the ReadMessage method gets called three times: once with the message being a Struct, once with a MapField<string, Value>, and once with a Value. I guess that's just how untyped dictionaries are represented in protobuf.
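A quick back-of-the-envelope sketch of that explanation (my own arithmetic; the number of extra wrapper messages around the request is an assumption, not something measured in the thread):

```python
# Each YAML mapping level deserializes as three nested protobuf messages:
# Value -> Struct -> MapField<string, Value> entry.
yaml_levels = 32        # depth of the minimal repro YAML
messages_per_level = 3
nested_messages = yaml_levels * messages_per_level
print(nested_messages)  # 96 -- only 4 below protobuf's DefaultRecursionLimit of 100,
                        # so a handful of outer wrapper messages tips it over the limit
```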

@gitfool
Contributor

gitfool commented Nov 2, 2020

@mikhailshilkov Given the latest response, I’d expect the upstream issues to take too long to wait for, and we need a fix now, so I’d fork to increase the default recursion limit in the meantime.

@mikhailshilkov
Member

I’d fork to increase the default recursion limit in the meantime.

Is there a best practice for publishing forked NuGet packages? Any examples in the wild?

@gitfool
Contributor

gitfool commented Nov 2, 2020

I don’t know about a best practice, but in the past we’ve done it by publishing to an internal NuGet repository and configuring NuGet.config to reference the internal feed before the standard NuGet feed.

You could do the same with a public repository and feed, which could be hosted on MyGet, GitHub Packages, etc. (Note: GitHub Packages requires auth for private packages, and I’m not sure whether it also requires auth for public packages, which would make it useless here.)
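The feed ordering described above would look something like this in a NuGet.config (a sketch; the "internal" feed name and URL are placeholders, not a real feed):

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <packageSources>
    <!-- internal feed listed first so the forked package is picked up there -->
    <add key="internal" value="https://nuget.example.com/v3/index.json" />
    <add key="nuget.org" value="https://api.nuget.org/v3/index.json" />
  </packageSources>
</configuration>
```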

@mikhailshilkov
Member

Before resorting to a fork, I tried changing the YAML-parsing invokes to return a JSON string rather than a raw untyped protobuf map.

This fixed the issue at invoke time, but the same problem reappeared at resource registration time: each CRD in that YAML is only one level shallower than the YAML overall, so resource creation hits the same recursion limit. So, I hit

Grpc.Core.RpcException: Status(StatusCode=Internal, Detail="Failed to deserialize response message.")
       at Pulumi.GrpcMonitor.RegisterResourceAsync(Resource resource, RegisterResourceRequest request)
       at Pulumi.Deployment.RegisterResourceAsync(Resource resource, Boolean remote, Func`2 newDependency, ResourceArgs args, ResourceOptions options)
       at Pulumi.Deployment.ReadOrRegisterResourceAsync(Resource resource, Boolean remote, Func`2 newDependency, ResourceArgs args, ResourceOptions options)

@mikhailshilkov
Member

Here is the draft of the forking plan:

  1. Fork protocolbuffers/protobuf to Pulumi org (done): https://github.com/pulumi/protobuf
  2. Checkout the latest stable tag (v3.13.0.1 as of today).
  3. Create a pulumi-fork branch.
  4. Adjust DefaultRecursionLimit in two spots from 100 to 1000.
  5. Adjust the csproj file to something like
    <GeneratePackageOnBuild>true</GeneratePackageOnBuild>
    <Description>Pulumi fork to override DefaultRecursionLimit - C# runtime library for Protocol Buffers - Google's data interchange format.</Description>
  6. Build the package.
  7. Publish it under Pulumi organization in NuGet as Pulumi.Protobuf.
  8. Update the reference in pulumi/pulumi to the fork.
  9. Publish a new version of Pulumi NuGet.
  10. Update Pulumi.Kubernetes to use it.

I ran this sequence locally, without any publishing, and it does fix the original problem. I am able to deploy the CRDs from https://github.com/jetstack/cert-manager/releases/download/v0.14.0/cert-manager.crds.yaml.

Any concerns about this proposal, technically or from the IP perspective?

@gitfool
Contributor

gitfool commented Nov 9, 2020

Sounds good. My only concern would be ensuring you keep up with upstream updates.

Also, I thought Google.Protobuf was a transitive dependency rather than a direct one; in that case, won’t the forked package need to keep the same name?

@mikhailshilkov
Member

mikhailshilkov commented Nov 19, 2020

Thanks to everyone for waiting for so long: it took a while to investigate the issue, find an issue in the Protobuf .NET library, discuss it with maintainers, and finally come up with a workaround.

The issue is now fixed in the latest alpha release (dotnet add package Pulumi.Kubernetes --version 2.8.0-alpha.1605734708) and will ship in the next release. cc @lblackstone - we may want to cut ~~2.8.1~~ 2.7.2 whenever that works for you.

@sfmskywalker
Author

That’s great news! Thanks for the effort that went into this.

@NArnott

NArnott commented Nov 19, 2020

Awesome! Can't wait to get this built out.

@sam-cogan

Thanks for sorting this, @mikhailshilkov. I can confirm it works as expected with the alpha release; looking forward to getting it in once 2.8.1 is out!

@mikhailshilkov
Member

I think the next release is going to be 2.7.2 - I messed it up due to our alpha naming. I'll post an update here whenever the fix is released.

lblackstone added a commit to pulumi/pulumi-kubernetes that referenced this issue Nov 19, 2020
This release fixes pulumi/pulumi#4224
where users were unable to install large CRDs using .NET.
@lblackstone
Member

https://github.com/pulumi/pulumi-kubernetes/releases/tag/v2.7.2 is out now with the fix

@gitfool
Contributor

gitfool commented Nov 19, 2020

I also want to thank @mikhailshilkov for implementing the workaround for this unfortunate issue with .NET gRPC! ❤️
