
Implement zoneTracker for coordinating downscales across statefulsets using a file in object storage for the prepare_downscale admission webhook #96

Conversation


@JordanRushing JordanRushing commented Nov 14, 2023

This PR introduces the idea of a zoneTracker, which uses a JSON-encoded file stored in object storage to save the timestamps that the prepare-downscale admission webhook uses to coordinate downscales between statefulsets.

The file's contents:

{
  "ingester-zone-a": {
    "lastDownscaled": "2023-12-14T23:37:41Z"
  },
  "ingester-zone-b": {
    "lastDownscaled": "2023-12-14T23:37:41Z"
  },
  "ingester-zone-c": {
    "lastDownscaled": "2023-12-14T23:37:41Z"
  }
}
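For illustration, the file can be decoded with types along these lines (a minimal sketch; the zoneInfo name and exact structure are assumptions, not necessarily what the PR uses):

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// zoneInfo is a hypothetical representation of one entry in the zone file.
type zoneInfo struct {
	LastDownscaled time.Time `json:"lastDownscaled"`
}

func main() {
	data := []byte(`{"ingester-zone-a": {"lastDownscaled": "2023-12-14T23:37:41Z"}}`)

	// The file maps each statefulset name to its last-downscale timestamp.
	zones := make(map[string]zoneInfo)
	if err := json.Unmarshal(data, &zones); err != nil {
		panic(err)
	}
	fmt.Println(zones["ingester-zone-a"].LastDownscaled)
}
```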

When rolling out the prepare-downscale admission webhook to support ingester autoscaling in Loki, we encountered differences between the major cloud service providers' managed Kubernetes control planes in how they handled the last-downscale annotation that the rollout-operator was applying to ingester statefulsets.

HPA controllers in EKS/AKS fail to scale the relevant ingester statefulset after a valid AdmissionRequest is approved by the prepare-downscale webhook, due to resourceVersion mismatch errors related to the last-downscale annotation being applied to the ingester statefulsets. We currently believe this is caused by differences in managed control plane configurations between the providers, but we were unable to pinpoint a root cause even after working with the relevant AWS support team.


Initially, we attempted to submit the annotation patch as a JSONPatch within the AdmissionResponse itself to solve this problem, but even that patch was not correctly applied to the ingester statefulsets, so downscales were not correctly coordinated.
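For context, the shape of that attempt would have been roughly the following (a sketch only; the annotation key and patch path are assumptions, not necessarily what the webhook used):

```go
import (
	admissionv1 "k8s.io/api/admission/v1"
)

// buildPatchResponse sketches returning the annotation patch as a JSONPatch
// inside the AdmissionResponse. The annotation key is illustrative; note that
// "/" in a JSON Patch path is escaped as "~1".
func buildPatchResponse() *admissionv1.AdmissionResponse {
	patch := []byte(`[{"op": "add", "path": "/metadata/annotations/grafana.com~1last-downscale", "value": "2023-12-14T23:37:41Z"}]`)
	pt := admissionv1.PatchTypeJSONPatch
	return &admissionv1.AdmissionResponse{
		Allowed:   true,
		Patch:     patch,
		PatchType: &pt,
	}
}
```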

As a workaround, we considered tracking this state in a file instead of a Kubernetes annotation. A draft PR was opened to do this, and it worked with supporting changes in the Loki API, but with trade-offs from saving this state on local disk.

After further consideration, we decided to store this file in object storage instead of on PVs, to simplify and fortify this state tracking with the strong read-after-write consistency guarantees provided by the three largest cloud providers' object storage services.
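As a sketch of what the tracker does with that file, assuming the thanos-io/objstore Bucket interface used elsewhere in this PR and an illustrative object key:

```go
import (
	"bytes"
	"context"
	"io"

	"github.com/thanos-io/objstore"
)

// loadZoneFile fetches the shared zone file; writeZoneFile replaces it.
// The key "zone-tracker/zones.json" is illustrative only.
func loadZoneFile(ctx context.Context, bkt objstore.Bucket) ([]byte, error) {
	rc, err := bkt.Get(ctx, "zone-tracker/zones.json")
	if err != nil {
		return nil, err
	}
	defer rc.Close()
	return io.ReadAll(rc)
}

func writeZoneFile(ctx context.Context, bkt objstore.Bucket, data []byte) error {
	// Read-after-write consistency means the next Get sees this write,
	// which is what makes a single shared file workable across zones.
	return bkt.Upload(ctx, "zone-tracker/zones.json", bytes.NewReader(data))
}
```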

This has been successfully tested on AWS/GCP/Azure as a drop-in replacement in existing namespaces that previously used the annotation to coordinate downscales (it essentially adds two conditionals to the existing code). An in-memory bucket option could also be added and made configurable if desired.


go mod vendor & go mod tidy were run as well.


TODO

  • Initial zone file creation
  • Bucket configuration and support for s3/gcs/abs
  • More tests

… using a file in object storage for the prepare_downscale admission webhook

Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
…itial zone file if not present

Signed-off-by: JordanRushing <rushing.jordan@gmail.com>

@cstyan cstyan left a comment


Personally I'd look for a way to start breaking prepareDownscale up into some more functions. Especially with the amount of nested conditionals, it's getting a bit difficult to read. Maybe everything that happens in the block for when the zoneTracker file is being used could be a separate function?

Comment on lines 38 to 45
var zt *zoneTracker
if useZoneTracker {
	// TODO(jordanrushing): fix bucket creation semantics and wire-in config supporting multiple CSPs
	bkt := objstore.NewInMemBucket()
	zt = newZoneTracker(bkt, "testkey")
}

return prepareDownscale(ctx, logger, ar, api, client, zt)

Like I mentioned in our 1:1, I think the easiest option for now is to turn the zoneTracker param into a string. If it's not empty, require that it be one of gcs, s3, whatever the Azure storage service is called, etc. That's probably easier than trying to wire up a config file, a configmap that populates that config file, etc., just so there can be a zoneTracker storage config in a file that specifies the storage backend.
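To make the suggestion concrete, a minimal sketch of dispatching on such a string (only the in-memory backend is wired here; real deployments would construct buckets from the objstore provider packages):

```go
import (
	"fmt"

	"github.com/thanos-io/objstore"
)

// newBucket picks an object storage backend from a plain string, as suggested
// above. The provider cases are placeholders for the s3/gcs/azure packages
// under github.com/thanos-io/objstore/providers.
func newBucket(backend string) (objstore.Bucket, error) {
	switch backend {
	case "inmem":
		return objstore.NewInMemBucket(), nil
	case "s3", "gcs", "azure":
		return nil, fmt.Errorf("backend %q not wired in this sketch", backend)
	default:
		return nil, fmt.Errorf("unsupported zoneTracker backend %q", backend)
	}
}
```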

JordanRushing (Contributor, Author) replied:

I ended up adding the necessary config, with flags passed to the zoneTracker instantiation, before passing zt through to prepareDownscale.
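A minimal sketch of what that flag wiring could look like (the flag and field names are assumptions, not the PR's actual flags):

```go
import "flag"

// config holds hypothetical zoneTracker settings registered as CLI flags.
type config struct {
	useZoneTracker     bool
	zoneTrackerBackend string
	zoneTrackerBucket  string
}

func (cfg *config) registerFlags(fs *flag.FlagSet) {
	fs.BoolVar(&cfg.useZoneTracker, "use-zone-tracker", false, "Coordinate downscales via a zone file in object storage instead of statefulset annotations.")
	fs.StringVar(&cfg.zoneTrackerBackend, "zone-tracker.backend", "s3", "Object storage backend: s3, gcs, or azure.")
	fs.StringVar(&cfg.zoneTrackerBucket, "zone-tracker.bucket-name", "", "Bucket that holds the zone file.")
}
```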

Comment on lines +180 to +182
level.Warn(logger).Log("msg", "downscale not allowed due to error while parsing downscale timestamps from the zone file", "err", err)
return deny(
	"downscale of %s/%s in %s from %d to %d replicas is not allowed because parsing downscale timestamps from the zone file failed.",

Do we need both a log line here and the returning of a warning? Couldn't it just return the result, or an empty result and an error? Then we'd only deal with logging once, wherever we're calling prepareDownscale, if the result is that we're denying the downscale.

JordanRushing (Contributor, Author) replied:

I think that's fair; generally I'm just trying to follow the pattern we've implemented elsewhere in this webhook. As far as I can tell the extra logging has no real negative trade-offs, and this way you have visibility into the root cause of the failure whether you're looking at the operator directly or at the admission responses.

} else {
	foundSts, err := findDownscalesDoneMinTimeAgo(stsList, ar.Request.Name)
	if err != nil {
		level.Warn(logger).Log("msg", "downscale not allowed due to error while parsing downscale annotations", "err", err)

same here

pkg/admission/zone_tracker.go (outdated review thread, resolved)
}
// If using zoneTracker instead of downscale annotations, check the zone file for recent downscaling.
// Otherwise, check the downscale annotations.
if zt != nil {
A contributor commented:

You could make 2 types here (one with the zones file and one with the existing logic) that implement the same interface. That way the code for them is not intermingled as it is here.
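For example, something along these lines (hypothetical names, just to illustrate the suggestion):

```go
import (
	"context"
	"time"
)

// downscaleCoordinator abstracts where last-downscale timestamps live. An
// annotation-backed type and a zone-file-backed type would both implement it,
// letting prepareDownscale stop branching on zt != nil.
type downscaleCoordinator interface {
	lastDownscaled(ctx context.Context, sts string) (time.Time, error)
	setDownscaled(ctx context.Context, sts string, t time.Time) error
}
```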

JordanRushing (Contributor, Author) replied:

Good suggestion for a follow-up PR IMO.

pkg/admission/zone_tracker.go (outdated review thread, resolved)
		return deny(msg)
	}
} else {
	foundSts, err := findDownscalesDoneMinTimeAgo(stsList, ar.Request.Name)
A contributor commented:

Do we still need the old way? If the initial load of the zones adds the waiting time to the initial times, won't that be enough to migrate?

JordanRushing (Contributor, Author) replied:

The two conditionals maintain compatibility for users of the prepare_downscale webhook who don't have issues with the Kubernetes annotations for this metadata and don't want to use the new zoneTracker.

@cstyan cstyan commented Nov 21, 2023

Mentioned this on Slack but posting here too to see if anyone else agrees; it might make sense to change the naming a bit? The name zoneTracker is a bit ambiguous in this case imo, and I think it doesn't really match what's actually happening, right?

What we're actually tracking is more like the time at which a zone was last autoscaled in order to prevent downscaling too frequently.

My first thought is that zoneTracker implies we're tracking the zones, like which zones are active for use or something like that. zoneScalingTracker, scalingEventTracker, or downscaleTracker would all be less ambiguous imo, but I don't feel strongly enough to block you.
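Whatever the name, the check the file enables is conceptually simple. Reusing the hypothetical zoneInfo type sketched earlier:

```go
import "time"

// allowDownscale denies a downscale if any zone was downscaled less than
// minTimeBetween ago. A sketch of the concept, not the PR's actual code.
func allowDownscale(zones map[string]zoneInfo, now time.Time, minTimeBetween time.Duration) bool {
	for _, z := range zones {
		if now.Sub(z.LastDownscaled) < minTimeBetween {
			return false
		}
	}
	return true
}
```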

…storage; update zoneTracker tests; run `go mod vendor` & `go mod tidy`

Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
…on; update tests; go mod vendor & go mod tidy

Signed-off-by: JordanRushing <rushing.jordan@gmail.com>

Update mockBucket instantiation in test; deny admission requests when failing to bootstrap the s3 bucket client

Signed-off-by: JordanRushing <rushing.jordan@gmail.com>

Clean up s3 config struct

Signed-off-by: JordanRushing <rushing.jordan@gmail.com>

Change s3 config creation semantics for zoneTracker

Signed-off-by: JordanRushing <rushing.jordan@gmail.com>

Add support for gcs and azure object storage clients; go mod vendor & go mod tidy

Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
@JordanRushing JordanRushing changed the title [WIP] Implement zoneTracker for coordinating downscales across statefulsets using a file in object storage for the prepare_downscale admission webhook Implement zoneTracker for coordinating downscales across statefulsets using a file in object storage for the prepare_downscale admission webhook Dec 6, 2023
@JordanRushing JordanRushing marked this pull request as ready for review December 6, 2023 22:10
@JordanRushing JordanRushing requested a review from a team as a code owner December 6, 2023 22:10
Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
@JordanRushing JordanRushing (Contributor, Author) commented Dec 6, 2023

> Personally I'd look for a way to start breaking up prepareDownscale into some more functions. Especially with the amount of nested conditionals, it's getting a bit difficult to read. Maybe everything that happens in the block for if the zoneTracker file is being used can be a separate function?

@cstyan I think that's valuable feedback and I'd be happy to open a separate PR to simplify the webhook code and break things out into better functions.

@@ -87,6 +103,32 @@ func (cfg config) validate() error {
	if (cfg.kubeAPIURL == "") != (cfg.kubeConfigFile == "") {
		return errors.New("either configure both Kubernetes API URL and config file or none of them")
	}
	if cfg.useZoneTracker {
A contributor commented:

The code in this branch belongs better in zone_tracker.go

JordanRushing (Contributor, Author) replied:

I don't know if I agree here; this validates the required config fields to launch with the zoneTracker, following the pattern we use to validate the required kube config as well.
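For reference, that validation branch would be on the order of the following (reusing the hypothetical config fields from the flag sketch above):

```go
import "errors"

// validateZoneTracker mirrors how the kube API URL / config file pair is
// validated in the same function. Field names are illustrative.
func (cfg config) validateZoneTracker() error {
	if !cfg.useZoneTracker {
		return nil
	}
	if cfg.zoneTrackerBackend == "" || cfg.zoneTrackerBucket == "" {
		return errors.New("zone tracker requires both a storage backend and a bucket name")
	}
	return nil
}
```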

@jhalterman jhalterman (Member) commented:

Left a few comments we should consider, but this seems nice overall. Might be nice for @56quarters or @andyasp to weigh in on this as well.

@56quarters 56quarters (Contributor) commented:

> This PR introduces the idea of a zoneTracker using a JSON-encoded file stored in object storage to save timestamps for coordinating downscales between statefulsets with the prepare-downscale admission webhook.

Has any consideration been given to using a configmap in Kubernetes instead of introducing a dependency on object storage for the rollout-operator?

@JordanRushing JordanRushing (Contributor, Author) replied:

> This PR introduces the idea of a zoneTracker using a JSON-encoded file stored in object storage to save timestamps for coordinating downscales between statefulsets with the prepare-downscale admission webhook.
>
> Has any consideration been given to using a configmap in Kubernetes instead of introducing a dependency on object storage for the rollout-operator?

Great question @56quarters! Yes, I originally considered using a configmap after some deliberation with the team. But as far as I understand, etcd runs as part of the managed control plane in EKS/AKS, and the differences between providers' managed control planes, plus poor visibility into them, are what motivated this PR, so I wanted to move this metadata outside of that.

I wonder if it might be better to move the annotation code path to using a configmap rather than the zoneTracker?

Do you have any concerns about introducing an object storage dependency to an optional code path here? The original code path is currently unchanged.
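For comparison, the ConfigMap alternative raised above would read the same JSON payload from a ConfigMap key via client-go, roughly like this (a sketch only; the ConfigMap and key names are made up, and PR #107 is where this approach was actually pursued):

```go
import (
	"context"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// readZoneData sketches keeping the zone timestamps in a ConfigMap instead of
// an object storage file. Names are illustrative only.
func readZoneData(ctx context.Context, client kubernetes.Interface, namespace string) (string, error) {
	cm, err := client.CoreV1().ConfigMaps(namespace).Get(ctx, "rollout-operator-zone-tracker", metav1.GetOptions{})
	if err != nil {
		return "", err
	}
	return cm.Data["zones"], nil
}
```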

…ones; fix deadlock in lastDownscaled

Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
…g; update comment on loadZones to indicate unique sts name assumption for keys
Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
Signed-off-by: JordanRushing <rushing.jordan@gmail.com>
@jhalterman jhalterman (Member) commented:

Talking to Jordan, he plans to follow up with an attempt at accomplishing this via ConfigMap, which sounds good.

@JordanRushing JordanRushing (Contributor, Author) replied:

> Talking to Jordan, he plans to follow up with an attempt at accomplishing this via ConfigMap, which sounds good.

I think we can close this PR in favor of #107!
