Part 2 of rkt support #1154

sjpotter · 2016-03-10T06:13:50Z

The actual handler

Since I originally wrote this, it has been split in two and an ignore-metrics map was created. I think I captured it, but I'm not 100% sure.

The design captures two of the cgroups as a rkt pod and rkt container pair.

i.e. under the cgroup fs we'd have

machine-slice//system.slice/

In terms of raw containers, this will ignore any system.slice directory under the rkt pod, so that directory will not be tracked by cadvisor under either handler. I think this is correct, but feedback would be appreciated.

I'm also not convinced by the approach of splitting the fs path. I was recommended to do this rather than use a regex, but again feedback would be appreciated.

sjpotter · 2016-03-10T06:14:08Z

@vishh @timstclair @yifan-gu @jonboulle

k8s-bot · 2016-03-10T06:15:23Z

Jenkins GCE e2e

Build/test passed for commit 5838ee2.

Build Log

vishh · 2016-03-10T18:37:04Z

The raw driver has been simple so far - it handles any cgroups that cannot be handled by other handlers. Any specific reason to change that?
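The "raw handles whatever nobody else claims" arrangement described above can be pictured with a minimal sketch. This is not cadvisor's actual factory interface: the `factory` type, `pick`, `defaultFactories`, and the string checks are all hypothetical simplifications made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// factory is a simplified, hypothetical stand-in for a handler factory.
type factory struct {
	name      string
	canHandle func(cgroup string) bool
}

// pick returns the name of the first factory that claims the cgroup.
// Order matters: the catch-all raw factory must come last.
func pick(factories []factory, cgroup string) string {
	for _, f := range factories {
		if f.canHandle(cgroup) {
			return f.name
		}
	}
	return "" // not reached when a catch-all factory is registered
}

// defaultFactories builds a toy registry. The string checks are
// illustrative assumptions, not cadvisor's real matching rules.
func defaultFactories() []factory {
	return []factory{
		{"docker", func(c string) bool { return strings.HasPrefix(c, "/docker/") }},
		{"rkt", func(c string) bool { return strings.Contains(c, "machine-rkt") }},
		{"raw", func(string) bool { return true }},
	}
}

func main() {
	fs := defaultFactories()
	for _, c := range []string{"/docker/abc123", "/machine.slice/machine-rkt-1.scope", "/user.slice"} {
		fmt.Printf("%-40s -> %s\n", c, pick(fs, c))
	}
}
```

With this ordering, specialized factories get the first chance at a cgroup and raw only sees the leftovers, which is the property the question is about preserving.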

sjpotter · 2016-03-10T18:53:55Z

It still basically does that. The idea is that "system.slice" cgroup itself isn't a "real" cgroup.

i.e. given a cgroup path

/sys/fs/cgroup/cpuset//machine.slice/machine-rkt\\x2d3c642e7b\\x2d7e44\\x2d42a1\\x2dba5f\\x2d2da88df3909c.scope/system.slice/alpine-sh.service

I want to consider the cgroups under machine.slice to be the "pod" (i.e. it will be the only one to report network stats as they are global to the pod) and the cgroups under system.slice to be the "containers". And I want those containers to be the direct subcontainers of the pod.

So yes, I could say that the rkt handler doesn't accept or handle "system.slice", but then it wouldn't have any parent or child among the raw containers and would just disappear. It seems better to say explicitly that it is ignored.
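The pod / ignored / container split described in this comment could be sketched by splitting the cgroup name on "/". This is only an illustration, not the code in the PR: `classify`, the `cgroupKind` constants, and the `machine-rkt` prefix check are assumptions drawn from the example path above, and the sketch assumes the default `machine.slice` layout.

```go
package main

import (
	"fmt"
	"strings"
)

// cgroupKind is a hypothetical label for how a cgroup path is treated.
type cgroupKind int

const (
	kindOther     cgroupKind = iota // not under a rkt pod; left to other handlers
	kindPod                         // machine.slice/<pod scope>
	kindIgnored                     // the intermediate system.slice directory itself
	kindContainer                   // machine.slice/<pod scope>/system.slice/<unit>
)

// classify splits a cgroup name (relative to the subsystem mount) on "/"
// and decides which role it plays. Empty components from doubled slashes
// are dropped. The "machine-rkt" prefix is an assumption for this sketch.
func classify(name string) (cgroupKind, string, string) {
	parts := strings.FieldsFunc(name, func(r rune) bool { return r == '/' })
	if len(parts) < 2 || parts[0] != "machine.slice" || !strings.HasPrefix(parts[1], "machine-rkt") {
		return kindOther, "", ""
	}
	pod := parts[1]
	switch {
	case len(parts) == 2:
		return kindPod, pod, ""
	case len(parts) == 3 && parts[2] == "system.slice":
		return kindIgnored, pod, ""
	case len(parts) == 4 && parts[2] == "system.slice":
		return kindContainer, pod, parts[3]
	}
	return kindOther, "", ""
}

func main() {
	// Raw string literals keep the systemd "\x2d" escapes untouched.
	for _, p := range []string{
		`/machine.slice/machine-rkt\x2dexample.scope`,
		`/machine.slice/machine-rkt\x2dexample.scope/system.slice`,
		`/machine.slice/machine-rkt\x2dexample.scope/system.slice/alpine-sh.service`,
		`/system.slice/sshd.service`,
	} {
		kind, pod, container := classify(p)
		fmt.Printf("kind=%d pod=%q container=%q\n", kind, pod, container)
	}
}
```

Returning an explicit "ignored" kind, rather than treating `system.slice` as simply unhandled, mirrors the argument made here: the directory is acknowledged and deliberately skipped so that containers attach directly to the pod.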

vishh · 2016-03-10T23:23:48Z

container/rkt/handler.go

+
+// Watches the specified directory and all subdirectories. Returns whether the path was
+// already being watched and an error (if any).
+func (self *rktContainerHandler) watchDirectory(dir string, containerName string) (bool, error) {


vishh (Mar 10): Why is the rkt handler watching for new containers? I'd expect to have a global watcher which then hands off new containers to one of the handlers.

sjpotter (Mar 11): This is exactly the same code as in the raw container handler; I didn't know whether it was recursive or not. If it is, and it just works because of the raw container support, that's great.

vishh (Mar 11): It is not needed here. IIUC, this logic is needed in the raw container handler mainly to watch all the sub-containers of /.
This PR exposes one more abstraction issue. The raw handler must be made specialized, or the watching of subcontainers must be moved out of the handler interface into something else, which the raw handler can implement.

sjpotter (Mar 14): OK, removed and tested, and you're right, it works just fine.

vishh · 2016-03-14T21:47:26Z

Ping @sjpotter..

sjpotter · 2016-03-14T21:53:02Z

making progress, will update later today hopefully

k8s-bot · 2016-03-15T17:58:58Z

Jenkins GCE e2e

Build/test failed for commit aa78191.

Build Log

k8s-bot · 2016-03-15T18:32:51Z

Jenkins GCE e2e

Build/test failed for commit 12b5d62.

Build Log

k8s-bot · 2016-03-15T18:37:11Z

Jenkins GCE e2e

Build/test passed for commit 12b5d62.

Build Log

jonboulle · 2016-03-22T11:35:03Z

container/rkt/client.go

+
+	resp, err := client.GetInfo(context.Background(), &rktapi.GetInfoRequest{})
+	if err != nil {
+		return "", fmt.Errorf("couldn't GetInfo from rkt api servie: %v", err)


api service

vishh · 2016-03-23T18:34:10Z

container/rkt/client.go

+)
+
+func Client() (rktapi.PublicAPIClient, error) {
+	once.Do(func() {


Would we want to retry? What if the api service were to be restarting while cAdvisor is being started?
Not for this PR - This is currently an issue with docker client as well, and it needs to be fixed there too.
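One way the retry question could be approached is a small bounded retry with doubling waits around the connection attempt. This is a sketch under assumptions, not what the PR or the docker client does: `retry`, `errUnavailable`, and the attempt and wait values are hypothetical, and the failing service is simulated rather than being a real rkt api service.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errUnavailable is a hypothetical error used to simulate a restarting service.
var errUnavailable = errors.New("rkt api service unavailable")

// retry calls connect up to attempts times, doubling the wait after
// each failure, and returns the last error if every attempt fails.
func retry(attempts int, wait time.Duration, connect func() error) error {
	var err error
	for i := 0; i < attempts; i++ {
		if err = connect(); err == nil {
			return nil
		}
		if i < attempts-1 {
			time.Sleep(wait)
			wait *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %v", attempts, err)
}

func main() {
	calls := 0
	// Simulated service: fails twice, then comes back.
	err := retry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errUnavailable
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

Note that a `sync.Once` runs its function only once even when that run ends in an error, so any retry would have to happen inside the function passed to `once.Do`, or the `Once` would have to be replaced by something that can be re-armed.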

vishh · 2016-03-23T18:53:17Z

Completed another review pass. Ping me once you address all the comments @sjpotter.
The main issue with the PR is the lack of resiliency to rkt apiservice failures. We can address that in a subsequent PR though...
How are we planning to add test coverage for the rkt integration?

k8s-bot · 2016-03-24T02:04:46Z

Jenkins GCE e2e

Build/test failed for commit ee52fdf.

Build Log

vishh · 2016-03-24T19:32:29Z

LGTM. I didn't get a chance to test this PR yet. @sjpotter: What is the plan for testing rkt integration?

sjpotter · 2016-03-24T20:07:25Z

@vishh I need to figure that out; I'm not 100% sure how to do that. Looking at the docker test cases, there isn't much there to inspire me.

I can probably test individual functions (such as the one I use to parse the cgroups), but I'm unsure how to test every part of the e2e process.

In practice, maybe I can do integration tests like the ones you already have (but aren't running).

vishh · 2016-03-24T20:51:39Z

We can add a few nodes to the e2e setup that run rkt and have the e2e framework run either docker or rkt tests based on the environment.
The other option is for you to set up an independent e2e and post the status on GitHub.

vishh · 2016-03-25T20:16:08Z

@k8s-bot test this

k8s-bot · 2016-03-25T20:18:58Z

Jenkins GCE e2e

Build/test failed for commit ee52fdf.

Build Log

sjpotter · 2016-03-29T00:21:23Z

I've changed this PR a bit to use the new rkt api service support to look up rkt pods by cgroup path, but it has to wait until the 1.3 release this week (to be vendored properly). This is necessary for it to be reliable in the presence of arbitrary slice definitions and pod cgroup names.

sjpotter · 2016-03-30T18:37:17Z

@k8s-bot test this

k8s-bot · 2016-03-30T18:38:52Z

Jenkins GCE e2e

Build/test failed for commit ee52fdf.

Build Log

k8s-bot · 2016-03-30T21:59:00Z

Jenkins GCE e2e

Build/test passed for commit ee52fdf.

Build Log

vishh · 2016-03-31T23:15:15Z

Tests are passing. Merging this PR. We can continue iterating on it.

vishh reviewed Mar 10, 2016

jonboulle reviewed Mar 22, 2016

yifan-gu mentioned this pull request Mar 22, 2016

Let rkt api service provide a way to filter pods on cgroup path? rkt/rkt#2316

Closed

vishh reviewed Mar 23, 2016

address jon and vish comments

ee52fdf

vishh merged commit 5d7c71a into google:master Mar 31, 2016

sjpotter mentioned this pull request May 3, 2016

Use raw as a bases for a rkt container handler #1120

Closed

sjpotter deleted the rkt branch May 5, 2016 13:23
