BigTable Autoscaling - HTTP 499 error #673

Closed
3 tasks done
tommycouri-bestbuy opened this issue Jun 24, 2022 · 10 comments
Labels
bug Something isn't working

Comments

@tommycouri-bestbuy

Checklist

Bug Description

When trying to deploy BigTable with autoscaling through Config Connector, I'm seeing the following error:

Cancelled (HTTP 499): Operation successfully rolled back : Both manual scaling (serve_nodes) and autoscaling (cluster_autoscaling_config) enabled. Exactly one must be set for CreateInstance/CreateCluster

Because of this, the BigTable instance does not deploy at all. When I try to deploy an instance without autoscaling enabled, it deploys without any issue. I have also opened GCP support case 29934946 regarding this issue. Their recommendation was to open a GitHub issue here, as they believe the issue sits within Config Connector.

I have tried deploying this both with and without the numNodes value. When numNodes is not present, the instance fails to deploy without any error message in the activity log. I am running Config Connector v1.88.

Additional Diagnostic Information

Here is my sample yaml file that I'm trying to deploy:

  apiVersion: bigtable.cnrm.cloud.google.com/v1beta1
  kind: BigtableInstance
  metadata:
    name: bigtableinstance-auto
  spec:
    displayName: BigtableAuto
    cluster:
      - autoscalingConfig:
          cpuTarget: 60
          maxNodes: 3
          minNodes: 1
        numNodes: 1
        clusterId: bigtable-auto
        zone: us-central1-a

Kubernetes Cluster Version

1.21.0

Config Connector Version

1.88

Config Connector Mode

namespaced mode (default)

Log Output

Output from kubectl describe. I have removed my Namespace information:

API Version:  bigtable.cnrm.cloud.google.com/v1beta1
Kind:         BigtableInstance
Metadata:
  Creation Timestamp:  2022-06-23T19:15:55Z
  Finalizers:
    cnrm.cloud.google.com/finalizer
    cnrm.cloud.google.com/deletion-defender
  Generation:  1
  Managed Fields:
    API Version:  bigtable.cnrm.cloud.google.com/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:finalizers:
          .:
          v:"cnrm.cloud.google.com/deletion-defender":
          v:"cnrm.cloud.google.com/finalizer":
      f:status:
        .:
        f:conditions:
        f:observedGeneration:
    Manager:      cnrm-controller-manager
    Operation:    Update
    Time:         2022-06-23T19:15:55Z
    API Version:  bigtable.cnrm.cloud.google.com/v1beta1
    Fields Type:  FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .:
          f:kubectl.kubernetes.io/last-applied-configuration:
      f:spec:
        .:
        f:cluster:
        f:displayName:
    Manager:         kubectl-client-side-apply
    Operation:       Update
    Time:            2022-06-23T19:15:55Z
  Resource Version:  220808017
  UID:               ef40cdb9-da1e-4333-ae84-cbbdaa4e82c8
Spec:
  Cluster:
    Autoscaling Config:
      Cpu Target:  60
      Max Nodes:   3
      Min Nodes:   1
    Cluster Id:    bigtable-auto
    Num Nodes:     1
    Zone:          us-central1-a
  Display Name:    BigtableAuto
Status:
  Conditions:
    Last Transition Time:  2022-06-23T19:15:55Z
    Message:               Update call failed: error applying desired state: summary: Error creating instance. rpc error: code = Canceled desc = Operation successfully rolled back : Both manual scaling (serve_nodes) and autoscaling (cluster_autoscaling_config) enabled. Exactly one must be set for CreateInstance/CreateCluster
    Reason:                UpdateFailed
    Status:                False
    Type:                  Ready
  Observed Generation:     1
Events:
  Type    Reason    Age                    From                         Message
  ----    ------    ----                   ----                         -------
  Normal  Updating  5m27s (x570 over 19h)  bigtableinstance-controller  Update in progress

Steps to reproduce the issue

Deploying the following YAML file should reproduce the issue.

YAML snippets

apiVersion: bigtable.cnrm.cloud.google.com/v1beta1
kind: BigtableInstance
metadata:
  name: bigtableinstance-auto
spec:
  displayName: BigtableAuto
  cluster:
    - autoscalingConfig:
        cpuTarget: 60
        maxNodes: 3
        minNodes: 1
      numNodes: 1
      clusterId: bigtable-auto
      zone: us-central1-a

tommycouri-bestbuy added the bug label Jun 24, 2022

@caieo
Contributor

caieo commented Jun 28, 2022

Hi @tommycouri-bestbuy, normally you can only specify either numNodes or autoscalingConfig, not both. Can you try recreating the BigtableInstance after removing whichever one you don't need? We'll look into improving documentation for this field description to clarify this in the future.
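
For reference, a minimal sketch of the manual-scaling variant of the reporter's manifest, with autoscalingConfig removed and only numNodes set. The reporter noted that deploying without autoscaling works; the resource name, display name, and cluster ID below are hypothetical and chosen only for illustration.

```yaml
apiVersion: bigtable.cnrm.cloud.google.com/v1beta1
kind: BigtableInstance
metadata:
  name: bigtableinstance-manual   # hypothetical name
spec:
  displayName: BigtableManual     # hypothetical display name
  cluster:
    - clusterId: bigtable-manual  # hypothetical cluster ID
      zone: us-central1-a
      # Manual scaling only: numNodes is set and autoscalingConfig is omitted
      numNodes: 1
```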

@tommycouri-bestbuy
Author

Hi @caieo - the issue is that when I'm deploying without numNodes specified, I get the following error:

Warning UpdateFailed 85s (x7 over 3m29s) bigtableinstance-controller Update call failed: error applying desired state: summary: Error: cluster.numNodes cannot be less than 1

@caieo
Contributor

caieo commented Jun 29, 2022

@tommycouri-bestbuy, I was able to reproduce this bug. Let me try to determine the cause of this issue and I'll update this thread with more info when I get to it.

@caieo
Contributor

caieo commented Jun 29, 2022

Opened a bug on the TF team's side to see if this is behavior they have run into. It looks like leaving numNodes unset triggers the validation function. However, the original error you provided indicates that the API does not allow setting of both numNodes and autoscalingConfig.

@tommycouri-bestbuy
Author

Hi @caieo - just checking in to see if there is an update on this one.

@caieo
Contributor

caieo commented Aug 16, 2022

Hi @tommycouri-bestbuy, after talking with the Terraform team, we decided that this is an issue that should be fixed on our side. We added this into our queue of bugs to fix but unfortunately don't have an update for you.

Can you also clarify whether this issue is a blocker or a friction point for you? I noticed you checked "If this issue is time-sensitive, I have submitted a corresponding issue with GCP support." in the template, but I have not seen a support ticket assigned to us related to this issue.

@tommycouri-bestbuy
Author

Hey @caieo - GCP support ticket 29934946 was opened up regarding this issue.

@caieo
Contributor

caieo commented Aug 16, 2022

@tommycouri-bestbuy thanks for the additional info. I added this to our upcoming sprint to tackle as a quick fix, but because we are operating at very limited capacity, we can't guarantee any timeline. Feel free to respond in the GCP support ticket or here if the priority of this issue changes.

@kevinsi4508

kevinsi4508 commented Sep 13, 2022

numNodes is optional when autoscalingConfig is set. This should be fixed in the next Config Connector release.

@diviner524
Collaborator

This is now supported in v1.94.0. Please give it a try and let us know if you see any issues.

Also thank you @kevinsi4508 for making the change!
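
Based on the comments above (numNodes is optional when autoscalingConfig is set, supported from v1.94.0), the reporter's manifest would presumably reduce to the following autoscaling-only form. This is a sketch derived from the original YAML with numNodes removed; it has not been verified in this thread.

```yaml
apiVersion: bigtable.cnrm.cloud.google.com/v1beta1
kind: BigtableInstance
metadata:
  name: bigtableinstance-auto
spec:
  displayName: BigtableAuto
  cluster:
    - clusterId: bigtable-auto
      zone: us-central1-a
      # Autoscaling only: numNodes is omitted (assumes Config Connector v1.94.0 or later)
      autoscalingConfig:
        cpuTarget: 60
        maxNodes: 3
        minNodes: 1
```

On v1.88, as reported earlier in the thread, the same manifest fails with "cluster.numNodes cannot be less than 1".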
