Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix container CPU runtime metrics #38

Merged
merged 3 commits into from
Jun 6, 2019
Merged

Commits on May 29, 2019

  1. Move CPU_SYS_NUMBER_OF_FIELDS const to mod os

    It's only used in the os module, so it only has to be defined there.
    tombruijn committed May 29, 2019
    Configuration menu
    Copy the full SHA
    b64eca7 View commit details
    Browse the repository at this point in the history
  2. Split CPU code into separate modules

    The new logic for cgroups is going to differ so much that there's very
    little advantage to keeping them all in the same module. It will start
    to become confusing what function is related to which implementation.
    Which is why I'm splitting the code now.
    tombruijn committed May 29, 2019
    Configuration menu
    Copy the full SHA
    d28024f View commit details
    Browse the repository at this point in the history
  3. Fix CPU metrics calculations for containers

    ## The problem
    
    The previous implementation of CPU metrics for containers was
    inaccurate. It reported trends (more/less CPU usage), but didn't
    accurately report the CPU metrics.
    
    This was the implementation was using the "total CPU usage" as the
    "total CPU time", which is not the same. The "total CPU usage" is how
    much the CPU was used during our measurements, but does not account for
    the CPU idle time, which the "total CPU time" does include.
    
    This caused the CPU metrics being reported to combine towards a total of
    a 100% usage, while the container itself may have only used 1/2% of the
    CPU at that time. The breakdown between `user` and `system` only applied
    to the `total usage` and not the "total CPU time".
    
    The implementation didn't account for the `idle` time which a CPU has,
    because the `/sys/fs` virtual file system does not report this value. It
    reports a total usage (combined `user` group, `system` group and
    whatever it doesn't expose in separate groups), and a breakdown for the
    `user` and `system` groups.
    
    ## The solution
    
    Instead we can use the timeframe in which the measurements were done, 60
    seconds, as the "total CPU time". These values can be compare because
    the `total usage`, `user` and `system` values are also reported in
    nanoseconds. (Nanoseconds for `total usage` and `seconds / 100` for the
    `user` and `system` groups.)
    
    60 seconds is the time in which we measure.
    We will use this as the "total CPU time". The delta values of the values
    reported for `total usage`, `user` and `system`, the time how long they
    spent on these groups within those 60 seconds, can then be used to
    compare the percentages for those values.
    
    Using the calculation:
    `delta value / total CPU time (60 seconds) * 100 = value in %`
    
    For example:
    
    ```
    time frame  = 60 seconds = 100% (60 / 60 * 100 = 100)
    total usage = 30 seconds =  50% (30 / 60 * 100 =  50)
    user        = 24 seconds =  40% (24 / 60 * 100 =  40)
    system      =  6 seconds =  10% ( 6 / 60 * 100 =  10)
    ```
    
    Where with the previously implementation the result would have been:
    
    ```
    total usage = 30 seconds =  100% (30 / 30 * 100 = 50)
    user        = 24 seconds =   80% (24 / 30 * 100 = 80)
    system      =  6 seconds =   20% ( 6 / 30 * 100 = 20)
    ```
    
    In addition we now report the `total_usage` of the container CPU, rather
    than only the breakdown between the `user` and `system` groups. This
    implementation differs from the non-container implementation for CPU
    metrics, where we do not have a `total_usage` metric.
    
    ### When a container has multiple CPU cores
    
    When a container has multiple CPU cores the measurement time does not
    change. But the container can use more resources during that time,
    counting towards a higher reported usage value. This means that a
    container with 2 CPU cores running both cores to their maximum during
    the measurement time will report 200% in total usage.
    
    This implementation will report the same metrics as the output from
    `docker stats`. We've decided to keep the reported values the same so
    there is less confusion about what is the accurate metric.
    
    This creates a difference in reporting for CPU usage between container
    systems and non-container systems. For non-containers all the reported
    CPU group metrics combine together towards a 100% value regardless of
    how many cores are available, this is because we know the "total CPU
    time" there as the idle time and other groups are exposed.
    See:
    https://github.com/appsignal/probes-rs/blob/e2ac0412f6357d64423073adf82009a116ae226b/src/cpu.rs#L141-L162
    
    If we would want to always report a 100% value no matter how many CPU
    cores a container has available, we can divide the reported values by
    the number of CPU cores. This is what we did here in our test script:
    https://gist.github.com/tombruijn/5f5e0c34af40cfb3967eca81ad8b5317#file-appsignal_stats-rb-L141
    and is what `ctop -scale-cpu` (ctop (https://bcicen.github.io/ctop/)
    being an alternative to `docker stats`).
    
    But please consider the following section as well as that will not fix
    the problem with multiple containers on one host system.
    
    ### About container neighbors
    
    While we report the same values as `docker stats`, `docker stats`
    doesn't actually report the container's CPU usage. Instead it shows the
    container's CPU usage impact on the host system. This is not immediately
    visible when using one container on the host system, but when multiple
    containers are running at the same time this problem becomes more clear.
    
    Multiple containers running on the same host system will impact the
    readings of `docker stats` as the host's resources will be shared among
    those containers.
    
    For example: We have a Docker system on our host machine that has 3 CPU
    cores available. We then start containers without any specific CPU
    limits.
    - In the scenario of one container maxing out its CPU cores it will be
      reported as 300% CPU usage by `docker stats`. Which is the total CPU
      time available.
    - When three containers are maxing out their CPU cores it will be
      reported by `docker stats` as 100% CPU usage for every container as
      the resources are shared amongst them evenly.
    
    The problem here remains that we do not know the container's actual
    "total CPU time" it has available, total usage + idle time, isolated to
    the container itself.
    
    To make it more complicated, if all container would max out their CPU,
    the containers' CPU time gets throttled as well. Which is something we
    may want to expose in the future as it shows when a container has less
    resources available at which time.
    
    ## Acknowledgements and sources
    
    Based on the implementation of this Ruby script written together with
    Robert Beekman @matsimitsu:
    https://gist.github.com/tombruijn/5f5e0c34af40cfb3967eca81ad8b5317
    Thanks Robert!
    
    Based on the docs as described here:
    https://docs.docker.com/config/containers/runmetrics/
    
    Co-authored-by: Robert Beekman <robert@matsimitsu.nl>
    tombruijn and matsimitsu committed May 29, 2019
    Configuration menu
    Copy the full SHA
    413b4ea View commit details
    Browse the repository at this point in the history