docs: benchmark stage1 systemd resource usage #1788
Comments
/cc @yifan-gu @dchen1107 |
cc @vishh |
I wrote a small go program that runs rkt and monitors it and its children's cpu and memory usage through proc. For lack of a better place, I put it in a new repo: https://github.com/dgonyeo/rkt-monitor. If others like it perhaps a subdirectory in the rkt repo would be a better home for it. Right now some of the logic is kinda hacky (panics everywhere), but it could be cleaned up fairly easily. It has access to a lot of information, but right now only prints a small subset of it. That could be easily changed, if more detailed reports would be useful. I imagine some CI could be set up to build/run this against rkt with a handful of ACIs, and flag any PRs that result in Sample output:
|
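For anyone following along, here is a minimal sketch of the kind of per-process sampling through proc described above. It is not the rkt-monitor code: `procSample` and `parseStat` are hypothetical names invented for this illustration, and the field positions are the ones documented in proc(5) for `/proc/<pid>/stat`. The parent PID is included because a caller could use it to walk a process's children by scanning `/proc`.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// procSample holds the few /proc/<pid>/stat fields used in this sketch.
// The type and its field names are hypothetical, not taken from rkt-monitor.
type procSample struct {
	PPid     int    // parent PID (stat field 4)
	CPUTicks uint64 // utime + stime in clock ticks (stat fields 14 and 15)
	RSSPages int64  // resident set size in pages (stat field 24)
}

// parseStat parses one line in /proc/<pid>/stat format. The comm field
// (field 2) may contain spaces or parentheses, so the numeric fields are
// located relative to the last ")" in the line.
func parseStat(line string) (procSample, error) {
	var s procSample
	end := strings.LastIndex(line, ")")
	if end < 0 {
		return s, fmt.Errorf("malformed stat line: no closing parenthesis")
	}
	// rest[0] is field 3 (state), so stat field N is rest[N-3].
	rest := strings.Fields(line[end+1:])
	if len(rest) < 22 {
		return s, fmt.Errorf("malformed stat line: %d fields after comm", len(rest))
	}
	ppid, err := strconv.Atoi(rest[1])
	if err != nil {
		return s, err
	}
	utime, err := strconv.ParseUint(rest[11], 10, 64)
	if err != nil {
		return s, err
	}
	stime, err := strconv.ParseUint(rest[12], 10, 64)
	if err != nil {
		return s, err
	}
	rss, err := strconv.ParseInt(rest[21], 10, 64)
	if err != nil {
		return s, err
	}
	s.PPid, s.CPUTicks, s.RSSPages = ppid, utime+stime, rss
	return s, nil
}

func main() {
	// Sample this process itself; Linux only.
	data, err := os.ReadFile("/proc/self/stat")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /proc/self/stat:", err)
		os.Exit(1)
	}
	s, err := parseStat(string(data))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	rssBytes := s.RSSPages * int64(os.Getpagesize())
	fmt.Printf("ppid=%d cpu_ticks=%d rss_bytes=%d\n", s.PPid, s.CPUTicks, rssBytes)
}
```

CPU usage over an interval would come from the difference between two `CPUTicks` samples divided by the clock tick rate, which this sketch does not look up.
|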
In this example, I am running The first pod takes 7.2M memory in total but the applications When several pods are running with the same stage1, surely all the pods are running the same version of When checking the The flavor |
For information: over the next two weeks I will be working on adding Clear Containers bits to the kvm flavor (a shared base image with the systemd/sshd/init/enter/gc binaries, and then DAX), which should greatly reduce memory overhead in this flavor. |
About
|
@dgonyeo bump |
I'm looking at how to use I did try to omit
|
@dgonyeo I think it is easier to enable the accounting options in |
That does have the caveat that you'll only be able to use this thing on properly configured systems. That's easier for me though, so unless someone knows how to properly configure |
I've got systemd-cgtop reporting memory/cpu usage now, but some things don't seem quite right to me: The labels for the columns on the right are: Procs, %CPU, Memory, Input/s, Output/s
This is with the log-stresser ACI, that just spits out timestamps in an infinite loop to generate log messages. On my machine |
Well until I figure out the I also was unsure whether to display memory usage based on the VMS or RSS of a given process, but based on this SO post I just changed it to RSS. CPU reporting is also currently broken. I know how to fix it, but it's not worth the time if I'm ditching this implementation soon anyway.
|
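To make the VMS versus RSS distinction concrete, here is a small sketch that reads both values for a process from `/proc/<pid>/status`. The `VmSize` and `VmRSS` keys are the ones documented in proc(5); the helper name `statusKB` is hypothetical and this is not how rkt-monitor itself gathers the numbers.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"strings"
)

// statusKB looks up a "Key:   N kB" line in the contents of
// /proc/<pid>/status and returns N. It returns false if the key is
// missing (kernel threads have no Vm* lines) or the value is malformed.
// The function name is hypothetical, chosen for this sketch.
func statusKB(status, key string) (uint64, bool) {
	for _, line := range strings.Split(status, "\n") {
		if !strings.HasPrefix(line, key+":") {
			continue
		}
		fields := strings.Fields(line[len(key)+1:])
		if len(fields) == 0 {
			return 0, false
		}
		v, err := strconv.ParseUint(fields[0], 10, 64)
		return v, err == nil
	}
	return 0, false
}

func main() {
	// Inspect this process itself; Linux only.
	data, err := os.ReadFile("/proc/self/status")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read /proc/self/status:", err)
		os.Exit(1)
	}
	status := string(data)
	vms, okVMS := statusKB(status, "VmSize")
	rss, okRSS := statusKB(status, "VmRSS")
	if !okVMS || !okRSS {
		fmt.Fprintln(os.Stderr, "VmSize or VmRSS not found")
		os.Exit(1)
	}
	// VmSize counts all mapped address space; VmRSS counts only the
	// pages currently resident in RAM, so it is normally much smaller.
	fmt.Printf("VmSize=%d kB VmRSS=%d kB\n", vms, rss)
}
```

RSS still counts shared pages once per process, so summing it across the processes in a pod can overstate the pod's real footprint.
|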
CPU usage should be an easy fix right? IMHO let's get that going so we have a basic sample set of the whole picture. |
ping @dgonyeo |
Ok, I finally had time to revisit this. I think the numbers are accurate, and I got the CPU reporting fixed:
What else do you guys want to see? Maybe 95th percentile CPU and memory usage? More configuration over what it can print, or how rkt is called? I'd like to add this to rkt's benchmark tests once we're happy with the state of this, ideally ASAP. |
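As a rough illustration of the 95th percentile idea, here is a sketch that computes a percentile over a series of usage samples with the nearest-rank method. The choice of nearest-rank and the function name `percentile` are assumptions made for this example; nothing in the thread says which definition the tool would use.

```go
package main

import (
	"errors"
	"fmt"
	"math"
	"sort"
)

// percentile returns the p-th percentile (0 < p <= 100) of samples using
// the nearest-rank method: the value at 1-based rank ceil(p*n/100) in the
// sorted data. The input slice is not modified.
// The name and the choice of method are assumptions for this sketch.
func percentile(samples []float64, p float64) (float64, error) {
	if len(samples) == 0 {
		return 0, errors.New("no samples")
	}
	if p <= 0 || p > 100 {
		return 0, errors.New("percentile must be in (0, 100]")
	}
	sorted := append([]float64(nil), samples...)
	sort.Float64s(sorted)
	rank := int(math.Ceil(p * float64(len(sorted)) / 100))
	return sorted[rank-1], nil
}

func main() {
	// Made-up memory samples in MB, one per polling interval.
	memMB := []float64{7.1, 7.2, 7.2, 7.3, 7.2, 9.8, 7.2, 7.4, 7.3, 7.2}
	p95, err := percentile(memMB, 95)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("p95 memory: %.1f MB\n", p95)
}
```

With only a handful of samples the nearest-rank p95 is simply the maximum, so the figure only becomes more informative than a peak value once a run collects a few dozen samples or more.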
Can you describe what the workload does? Ideally, we will need the base On Mon, Jan 25, 2016 at 2:32 PM, Derek Gonyeo notifications@github.com
|
I was imagining something along the lines of a |
@dgonyeo can you pick this up for the next release? |
I'll be working on this this week. I'm going to switch over to using cadvisor (which, until a PR is merged, will be a specific branch in @sjpotter's fork). Scenarios I'm going to get benchmarks for:
And I imagine it'll by default run these benchmarks with each stage1 available. If anyone has additional scenarios I should be looking at, please speak up. |
If the rkt-monitor stuff is already working can we just land that as-is and consider moving to cadvisor later after things have settled over there? |
Made a PR for it: #2324 |
@dgonyeo before closing this out - could you produce some example stats please? e.g. against the last release and against master? |
Certainly.
v1.4.0
log-stresser.aci
mem-stresser.aci
cpu-stresser.aci
too-many-apps.podmanifest
Master (98da3e5)
log-stresser.aci
mem-stresser.aci
cpu-stresser.aci
too-many-apps.podmanifest
|
Want those in the docs anywhere, or is this issue good enough? |
Somewhere, please. Open to suggestions for what it looks like exactly. Some On Wed, Apr 27, 2016 at 12:48 AM, Derek Gonyeo notifications@github.com
|
cc @dchen1107 |
Inside the two default stage1 backends (the standard systemd-nspawn-based one and the LKVM one), rkt uses systemd to manage the process lifecycle and provide some other features like logging (via systemd-journald). This means there is some resource usage overhead associated with each pod, beyond just the resource consumption of the constituent apps. This is by design and a known consequence of rkt's daemonless execution model.

For the most part, this overhead should be minimal - on the order of a few megabytes of memory and negligible CPU. But it's important to have a solid understanding of what exactly this overhead is, and to track it over time (e.g. to ensure we don't have any regressions which dramatically increase this overhead), particularly when looking to integrate rkt with cluster manager systems like Kubernetes.