Part 2 of rkt support #1154
Conversation
The raw driver has been simple so far - it handles any cgroups that cannot be handled by other handlers. Any specific reason to change that?
It still basically does that. The idea is that "system.slice" cgroup itself isn't a "real" cgroup. i.e. given a cgroup path
I want to consider the cgroups under machine.slice to be the "pod" (i.e. it will be the only one to report network stats, as those are global to the pod) and the cgroups under system.slice to be the "containers". And I want those containers to be the direct subcontainers of the pod. So, yes, I could say that the rkt handler doesn't accept or handle "system.slice", but then it wouldn't have any parent or child from a raw container and would just disappear; it seems better to say explicitly that it is ignored.
// Watches the specified directory and all subdirectories. Returns whether the path was
// already being watched and an error (if any).
func (self *rktContainerHandler) watchDirectory(dir string, containerName string) (bool, error) {
Why is the rkt handler watching for new containers? I'd expect to have a global watcher which then hands off new containers to one of the handlers.
So this is the exact same code as the raw container; I didn't know whether it was recursive or not. If it is, and it just works because of the raw container support, that's great.
It is not needed here. IIUC, this logic is needed in the raw container handler mainly to watch all the sub-containers of /.
This PR exposes one more abstraction issue. Either the raw handler must be made specialized, or the watching of subcontainers must be moved out of the handler interface into something else, which the raw handler can implement.
OK, removed and tested; you're right, it works just fine.
Ping @sjpotter.
Making progress, will update later today hopefully.
resp, err := client.GetInfo(context.Background(), &rktapi.GetInfoRequest{})
if err != nil {
	return "", fmt.Errorf("couldn't GetInfo from rkt api servie: %v", err)
api service
fixed
)

func Client() (rktapi.PublicAPIClient, error) {
	once.Do(func() {
Would we want to retry? What if the api service were to be restarting while cAdvisor is being started?
Not for this PR - This is currently an issue with docker client as well, and it needs to be fixed there too.
Completed another review pass. Ping me once you address all the comments @sjpotter.
LGTM. I didn't get a chance to test this PR yet. @sjpotter: What is the plan for testing rkt integration?
@vishh need to figure that out, I'm not 100% sure how to do that. Looking at the docker test cases, there isn't much there to inspire me. I can probably test individual functions (such as what I use to parse the cgroups), but I'm unsure how to test every part of the e2e process. In practice maybe we can do integration tests like the ones you already have (but aren't running).
We can add a few nodes to the e2e setup that run rkt and have the e2e framework run either docker or rkt tests based on the environment.
@k8s-bot test this
I've changed this PR a bit to use the new rkt api service support to look up rkt pods by cgroup path, but it has to wait till the 1.3 release this week (to be vendored properly). This is necessary for it to be reliable in the presence of random slice definitions and pod cgroup names.
@k8s-bot test this |
Tests are passing. Merging this PR. We can continue iterating on it. |
The actual handler
Since I originally wrote this and split it in two, an ignore-metrics map was added. I think I captured it correctly, but I'm not 100% sure.
The design of this is to capture two of the cgroups as a rkt pod and rkt container pair, i.e. under the cgroup fs we'd have
machine-slice//system.slice/
In terms of raw containers, this will ignore any system.slice directory under the rkt pod, which will then not be tracked by cAdvisor either. I think this is correct, but feedback would be appreciated.
I'm also not fully convinced by the approach of splitting the fs path; I was recommended to do this over a regex, but again feedback would be appreciated.