JRE memory calculation doesn't take the maximum number of threads in to account #157

@lhotari

Our application gets killed under load, and the events log shows this error: index: 0, reason: CRASHED, exit_description: out of memory, exit_status: 255.
After digging into it, I think it's caused by a bug in the JVM memory calculation. A process in the Warden container gets killed by the Linux cgroups OOM killer when it goes over its resource limits. I found the location where the "out of memory" event is created. I also found the docs for cgroups memory limits. This behaviour is also defined in the Warden protocol.

Back to the actual issue:

I'd expect the formula for calculating the total amount of memory to be something like this:
heap + metaspace/permgen + native + (thread_stack_size * maximum_number_of_threads)

However, there isn't a documented way to specify the maximum number of threads.
In the Tomcat configuration, the default maxThreads setting is 200.
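
To make the size of the thread term concrete, here is a rough sketch of the formula above. The heap, metaspace, native and stack sizes are made-up values for illustration, not java-buildpack defaults; only the maxThreads value of 200 comes from the Tomcat default mentioned above.

```ruby
# Rough estimate of total JVM process memory, following the formula above.
# All sizes are in megabytes and are illustrative assumptions.
def estimated_total_mb(heap_mb:, metaspace_mb:, native_mb:, thread_stack_mb:, max_threads:)
  heap_mb + metaspace_mb + native_mb + (thread_stack_mb * max_threads)
end

total = estimated_total_mb(
  heap_mb: 384,
  metaspace_mb: 64,
  native_mb: 32,
  thread_stack_mb: 1, # assumed stack size of 1 MB per thread
  max_threads: 200    # Tomcat's default maxThreads
)
puts total
```

With these assumed numbers the thread stacks alone can account for up to 200 MB if every thread is in use, which is easy to miss when only the heap is sized against the container limit.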

There is this comment in stack_memory_bucket.rb

   # This class represents a memory bucket for stack memory. This is treated differently to other memory buckets
   # which have absolute sizes since stack memory is specified in terms of the size of an individual stack with no
   # definition of how many stacks may exist.

I found a reference to num_threads in the source code:

   def normalise_stack_bucket(stack_bucket, buckets)
     stack_memory = weighted_proportion(stack_bucket, buckets)
     num_threads = [stack_memory / stack_bucket.default_size, 1].max
     normalised_bucket = MemoryBucket.new('normalised stack', stack_bucket.weighting,
                                          stack_bucket.range * num_threads)
     [normalised_bucket, num_threads]
   end
This doesn't make sense to me. Is something missing?
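
As far as I can tell, the method quoted above derives num_threads from the memory assigned to the stack bucket rather than from any configured limit. Below is a simplified stand-in for that arithmetic. It assumes weighted_proportion returns the stack bucket's weighted share of total memory, which I haven't verified, and the container limit, weighting and default stack size are hypothetical values.

```ruby
# Simplified stand-in for the num_threads arithmetic in the buildpack code.
# Assumption: the stack bucket receives its weighted share of total memory.
def estimated_num_threads(total_memory_kb, stack_weighting, total_weighting, default_stack_kb)
  stack_memory_kb = total_memory_kb * stack_weighting / total_weighting
  # Same shape as the quoted code: at least one thread is always assumed.
  [stack_memory_kb / default_stack_kb, 1].max
end

# Hypothetical inputs: 512 MB container, stack weighting 5 out of 100, 1 MB stacks.
threads = estimated_num_threads(512 * 1024, 5, 100, 1024)
puts threads
```

Under these assumptions the estimate comes out far below Tomcat's default of 200 threads, which would explain why the process can exceed the container limit under load.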

I'd like to be able to specify the maximum number of threads in the JRE memory configuration. The memory calculation could then take this into account when calculating the memory limits and applying the heuristics.
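
A minimal sketch of what such a calculation could look like, assuming a configured maximum thread count is available as an input. The method name and all parameters are hypothetical and are not part of the buildpack.

```ruby
# Hypothetical calculation: reserve stack memory for the configured maximum
# number of threads first, then give what is left to the heap.
# All sizes are in megabytes.
def heap_budget_mb(limit_mb:, metaspace_mb:, native_mb:, thread_stack_mb:, max_threads:)
  stack_total_mb = thread_stack_mb * max_threads
  heap_mb = limit_mb - metaspace_mb - native_mb - stack_total_mb
  # Fail early instead of letting the OOM killer find the problem under load.
  raise ArgumentError, "memory limit too small for #{max_threads} threads" if heap_mb <= 0

  heap_mb
end

puts heap_budget_mb(limit_mb: 512, metaspace_mb: 64, native_mb: 32, thread_stack_mb: 1, max_threads: 200)
```

Raising an error when the limit cannot accommodate the requested thread count would surface the misconfiguration at staging time rather than as a crash in production.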

The maximum thread limit in the Tomcat configuration should be adjusted accordingly so that you don't run into trouble with the default java-buildpack configuration.


Some comments about how Warden handles memory limits:
It would be nice if the application got some advance notification before going over the memory limit. Linux cgroups support memory threshold notifications. Technically, Warden could be implemented so that it doesn't rely on the OOM killer to kill processes (see LWN.net, "Toward reliable user-space OOM handling"). A grace period before the process is killed would make better crash reports and logging possible.
Currently this is defined in the Warden protocol and probably won't change:

The memory limit is specified in number of bytes. It is enforced using the control group associated with the container. When a container exceeds this limit, one or more of its processes will be killed by the kernel. Additionally, the Warden will be notified that an OOM happened and it subsequently tears down the container.
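
To illustrate the pre-notification idea mentioned above, here is a rough sketch that polls the cgroup v1 memory files instead of using the kernel's threshold notification mechanism (which needs eventfd and is not shown here). The default directory is an assumption about where the memory controller is mounted, and the 90% threshold is arbitrary.

```ruby
# Returns true once usage has reached the given fraction of the limit.
def near_memory_limit?(usage_bytes, limit_bytes, threshold = 0.9)
  usage_bytes >= limit_bytes * threshold
end

# Reads current usage and limit from a cgroup v1 memory controller directory.
# The default path is an assumption; it may differ inside a Warden container.
def read_cgroup_memory(dir = '/sys/fs/cgroup/memory')
  usage = Integer(File.read(File.join(dir, 'memory.usage_in_bytes')).strip)
  limit = Integer(File.read(File.join(dir, 'memory.limit_in_bytes')).strip)
  [usage, limit]
end
```

A monitoring thread could call read_cgroup_memory periodically and log a warning or dump diagnostics when near_memory_limit? returns true, which would at least leave a trace before the container is killed.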

I don't like that the OOM killer just kills a process and then the container is torn down. Where should I issue my complaint about the Warden protocol? :)
