Autoscale up on memory usage #160

rigazilla · 2019-09-25T12:11:01Z

No description provided.

rigazilla · 2019-10-17T13:18:17Z

If the k8s platform has metrics enabled, this can be achieved via HorizontalPodAutoscaler.
I have 2 options for this:

1. Document how to do it and let the user do the work.
2. Docs as in option 1, plus provide an optional configuration field in Infinispan.Spec for basic autoscale settings (CPU and memory based).

Anyone have preferences?
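
For reference, doing this by hand means the user creates a HorizontalPodAutoscaler that targets memory utilization. Below is a minimal sketch of such a manifest, built as a Python dict and dumped to JSON so it can be piped to `kubectl apply -f -`. The StatefulSet target kind, the `example-infinispan` name and the 80% threshold are assumptions for illustration only, and `autoscaling/v2` may need to be an older beta version on older clusters. Utilization targets also require the pods to declare memory requests.

```python
import json


def memory_hpa_manifest(target_name, min_replicas, max_replicas,
                        memory_utilization_percent=80):
    """Build a HorizontalPodAutoscaler manifest that scales on memory.

    target_name and the StatefulSet target kind are hypothetical; point
    scaleTargetRef at whatever workload actually runs the Infinispan pods.
    """
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": f"{target_name}-memory"},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "StatefulSet",
                "name": target_name,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [{
                "type": "Resource",
                "resource": {
                    "name": "memory",
                    # Average memory utilization across pods, as a
                    # percentage of the pods' memory requests.
                    "target": {
                        "type": "Utilization",
                        "averageUtilization": memory_utilization_percent,
                    },
                },
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(memory_hpa_manifest("example-infinispan", 2, 6), indent=2))
```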

oraNod · 2019-10-22T07:49:25Z

imo option 2 sounds like a winner

tristantarrant · 2019-10-22T08:23:33Z

This MUST be automated. So option 2 all the way

rigazilla · 2020-01-24T15:38:32Z

A good starting point for working on a memory usage metric is the method calculateRequiredMinimumNumberOfNodes.

From the code above we can derive some assertions:

- Local and replicated caches get no benefit from scaling up for memory.
- Invalidation mode caches may benefit from scaling up, but they can't scale down.
- Distributed and scattered caches should be OK with scaling for memory.
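
The three assertions can be written down as a small lookup, which may help when deciding what constraints to put on users. This is only a sketch: the enum names and the "weakest cache wins" rule for a cluster that hosts several caches are assumptions, not operator code.

```python
from enum import Enum, IntEnum


class CacheMode(Enum):
    LOCAL = "local"
    REPLICATED = "replicated"
    INVALIDATION = "invalidation"
    DISTRIBUTED = "distributed"
    SCATTERED = "scattered"


class MemoryScaling(IntEnum):
    NONE = 0         # scaling up brings no memory benefit
    UP_ONLY = 1      # may benefit from scaling up, cannot scale down
    UP_AND_DOWN = 2  # should be fine in both directions


# Direct transcription of the assertions listed above.
SUPPORT = {
    CacheMode.LOCAL: MemoryScaling.NONE,
    CacheMode.REPLICATED: MemoryScaling.NONE,
    CacheMode.INVALIDATION: MemoryScaling.UP_ONLY,
    CacheMode.DISTRIBUTED: MemoryScaling.UP_AND_DOWN,
    CacheMode.SCATTERED: MemoryScaling.UP_AND_DOWN,
}


def cluster_memory_scaling(cache_modes):
    """Return the scaling level that every cache in the cluster supports.

    Assumption: the most restrictive cache decides for the whole cluster.
    An empty cluster is reported as NONE.
    """
    levels = [SUPPORT[mode] for mode in cache_modes]
    return min(levels) if levels else MemoryScaling.NONE
```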

How do we want to address these different cases? Should we specify constraints that a user must meet to be able to do autoscaling?
For distributed and scattered caches, is the memory usage computation suitable for us?
Currently the memory usage is a per-cache value. Can Infinispan provide a cluster-wide memory usage measure, or does the operator need to collect and process the data?

rigazilla · 2020-01-24T15:41:58Z

@tristantarrant I tried above to summarize our discussion about memory autoscaling. Please add anything you think could be useful to the discussion.

Also @danberindei, @wburns, if you want to add your thoughts, thanks!

wburns · 2020-01-24T16:31:27Z

An invalidation and LOCAL cache don't really make sense in a server with a cluster, period. Since you can't guarantee which node you will talk to. So I am not sure if we have to worry about those at all.

Replicated doesn't make sense for memory scaling as you mentioned. Requires vertical scaling, which this can't do afaik.

> For distributed and scattered caches, is the memory usage computation suitable for us?
> Currently the memory usage is a per-cache value. Can Infinispan provide a cluster-wide memory usage measure, or does the operator need to collect and process the data?

I guess we need to detail what we want to do. Cause you could have the issue that a single cache could run into eviction early, which might be fine? What is the goal here again, I forgot? Should it be based solely on JVM heap size for example?

rigazilla · 2020-01-28T17:23:35Z

> I guess we need to detail what we want to do. Cause you could have the issue that a single cache could run into eviction early, which might be fine? What is the goal here again, I forgot? Should it be based solely on JVM heap size for example?

I can see two cases:

1. Scale to prevent out of memory: in this case we need the percentage of used memory (heap and off-heap, I suppose).
2. Scale to prevent eviction (preserve data): in this case we can reuse the per-cache usage computation.

I would go for 1 as a first implementation, though I don't know if we have something in place that produces a good metric. @tristantarrant?

rigazilla · 2020-02-17T16:08:47Z

@anistor PR infinispan/infinispan#7857 changes the metric names.

rigazilla · 2020-03-05T09:23:50Z

I did some more tests. It seems that the standard memory metric is OK for measuring off-heap usage, whether memory usage is increasing or decreasing. Heap memory is handled by the JVM, and it's usually not released to the OS in a predictable way, or not released at all, due to the JVM's greedy attitude.

So I'll do the following:

OFFHEAP USE CASE
I'll implement autoscaling based on the standard metrics.

HEAP USE CASE
I can see two approaches for this:

1. My preferred one: find a suitable set of JVM options that convinces the JVM to release unused memory earlier. This could also be a useful setup for users who want to save resources. I'm currently experimenting on OpenJDK 8 with these settings: -XX:+UseSerialGC -XX:MinHeapFreeRatio=5 -XX:MaxHeapFreeRatio=10 -XX:GCTimeRatio=4 -XX:AdaptiveSizePolicyWeight=90, so far without any appreciable results.
2. Set up an autoscaler based on the exposed metrics base.memory.usedHeap and base.memory.maxHeap.
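
An autoscaler driven by base.memory.usedHeap and base.memory.maxHeap would essentially watch the ratio of the two on each pod. A rough sketch of that computation follows; the input shape (a dict of metric values per pod), the function names and the 0.8 threshold are assumptions, and fetching the values from the pods is left out.

```python
USED_HEAP = "base.memory.usedHeap"
MAX_HEAP = "base.memory.maxHeap"


def heap_usage_ratio(metrics):
    """Return used heap as a fraction of max heap for one pod.

    metrics is a hypothetical dict holding the two metric values in bytes.
    """
    max_heap = metrics[MAX_HEAP]
    if max_heap <= 0:
        # Without a positive limit there is nothing to compare against.
        raise ValueError("max heap is undefined or not positive")
    return metrics[USED_HEAP] / max_heap


def pods_over_threshold(per_pod_metrics, threshold=0.8):
    """List the pods whose heap usage ratio is at or above the threshold."""
    return sorted(
        pod for pod, metrics in per_pod_metrics.items()
        if heap_usage_ratio(metrics) >= threshold
    )
```

For example, with one pod at 900 of 1000 bytes and another at 100 of 1000, only the first is reported at the default threshold.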

rigazilla · 2020-04-23T08:03:22Z

Kubernetes HPA only does a linear computation:
https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#algorithm-details
which doesn't fit ISPN memory consumption.
I'm going to implement an autoscale algorithm in the operator.
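
For context, the linear computation described on that page is desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue), skipped when the ratio is close enough to 1. A small sketch of it is below; the function name is made up, and the 0.1 tolerance is the documented default rather than anything specific to this operator.

```python
import math


def hpa_desired_replicas(current_replicas, current_value, target_value,
                         tolerance=0.1):
    """Replica count the HPA formula would ask for.

    The result is proportional to current_value / target_value, which is
    the linearity the comment above refers to.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    ratio = current_value / target_value
    if abs(ratio - 1.0) <= tolerance:
        # Close enough to the target: no scaling.
        return current_replicas
    return math.ceil(current_replicas * ratio)
```

With 3 replicas at 90% usage against a 60% target this asks for 5 replicas. The formula implicitly assumes the per-pod metric falls in proportion as replicas are added.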

rigazilla changed the title ~~Autoscale up on memory used~~ Autoscale up on memory usage Oct 1, 2019

rigazilla self-assigned this Oct 17, 2019

rigazilla added this to the 2.0.0 milestone Feb 24, 2020

rigazilla added a commit to rigazilla/infinispan-operator that referenced this issue Apr 24, 2020

Memory autoscaling infinispan#160

af3c84e

rigazilla added a commit to rigazilla/infinispan-operator that referenced this issue Jun 1, 2020

Memory autoscaling infinispan#160

aa3445a

rigazilla added a commit to rigazilla/infinispan-operator that referenced this issue Jun 5, 2020

Memory autoscaling infinispan#160

1c5d362

rigazilla added a commit to rigazilla/infinispan-operator that referenced this issue Jun 5, 2020

Memory autoscaling infinispan#160

1de0f64

rigazilla added a commit to rigazilla/infinispan-operator that referenced this issue Jun 8, 2020

Memory autoscaling infinispan#160

e2677da

rigazilla added a commit that referenced this issue Jun 15, 2020

Memory autoscaling #160

5ef50da

rigazilla closed this as completed Jul 27, 2020
