@sjpotter
Contributor

@sjpotter sjpotter commented May 4, 2016

This PR tries to enable other methods of detecting container creation that can run in parallel with cgroup-based detection, and to let such a method override a cgroup-based handler (for example to operate outside the cgroup hierarchy, as might be the case with VMs).

The idea is to get to the point where we can determine when rkt pods are created not via cgroup creation but via the rkt API service itself.

@sjpotter
Contributor Author

sjpotter commented May 4, 2016

@timstclair

@k8s-bot
Collaborator

k8s-bot commented May 4, 2016

Jenkins GCE e2e

Build/test failed for commit e7322ca.

@sjpotter
Contributor Author

sjpotter commented May 4, 2016

It contains two parts:

  1. Pass a "detection" type into the CanHandleAndAccept function, i.e. "raw" or "rkt", with room for more in the future.

  2. Break createContainer/destroyContainer into unlocked versions (along with wrappers that lock them), and add an updateContainer method that can destroy an old handler while creating a new one.
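
A minimal sketch of what part 2 could look like, assuming a heavily simplified manager that maps container names to handlers. Only the names createContainer, destroyContainer and updateContainer come from this PR; the `manager` fields, the `*Locked` helper names, `handler` and `kindOf` are hypothetical and chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// handler is a hypothetical stand-in for a container handler.
type handler struct{ kind string }

// manager is a simplified, hypothetical manager keyed by container name.
type manager struct {
	mu         sync.Mutex
	containers map[string]*handler
}

func newManager() *manager {
	return &manager{containers: make(map[string]*handler)}
}

// createContainerLocked expects the caller to hold m.mu.
// It does nothing if the name already has a handler.
func (m *manager) createContainerLocked(name, kind string) {
	if _, ok := m.containers[name]; ok {
		return
	}
	m.containers[name] = &handler{kind: kind}
}

// destroyContainerLocked expects the caller to hold m.mu.
func (m *manager) destroyContainerLocked(name string) {
	delete(m.containers, name)
}

// createContainer is the locking wrapper around createContainerLocked.
func (m *manager) createContainer(name, kind string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.createContainerLocked(name, kind)
}

// destroyContainer is the locking wrapper around destroyContainerLocked.
func (m *manager) destroyContainer(name string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.destroyContainerLocked(name)
}

// updateContainer replaces the handler while taking the lock only once,
// so no other goroutine sees the name without a handler in between.
func (m *manager) updateContainer(name, kind string) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.destroyContainerLocked(name)
	m.createContainerLocked(name, kind)
}

// kindOf reports the handler kind for name, or "" if there is none.
func (m *manager) kindOf(name string) string {
	m.mu.Lock()
	defer m.mu.Unlock()
	if h, ok := m.containers[name]; ok {
		return h.kind
	}
	return ""
}

func main() {
	m := newManager()
	m.createContainer("/pod1", "raw")
	m.createContainer("/pod1", "rkt") // skipped: "/pod1" already has a handler
	fmt.Println(m.kindOf("/pod1"))
	m.updateContainer("/pod1", "rkt") // destroy and create under one lock
	fmt.Println(m.kindOf("/pod1"))
}
```

Splitting out the unlocked helpers is what lets updateContainer reuse both steps without re-acquiring a non-reentrant mutex.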

@k8s-bot
Collaborator

k8s-bot commented May 4, 2016

Jenkins GCE e2e

Build/test passed for commit e1b0275.

}

// As we are proposing having multiple ways to detect a container (in addition to cgroups)
// need the ability to delete
Contributor

I don't understand this comment.

Contributor Author

Currently we detect containers via raw cgroups (what I'll call the "raw" method). I'm proposing the ability to detect containers via other methods, such as the rkt API service.

If a raw handler was already created for a rkt container's cgroup path (via raw detection), we want the rkt path to overwrite/update it with the rkt handler. Or at least that's what I thought you wanted. That is, the raw handler will be destroyed and a rkt handler will take its place, all in one function so the swap is atomic.

If the rkt handler is created first, the raw handler won't be created, because createContainer() skips a cgroup that is already a key in the map.

Contributor

Ok, update the comment with that explanation? :)
I think the comment is not a proper sentence.

Contributor Author

done

// Enables overwriting an existing containerData/Handler object for a given containerName
// createContainer just returns if a given containerName has a handler already
// Ex: rkt handler will want to take priority over the raw handler, but the raw handler might be created first

@timstclair
Contributor

I don't like passing around the detection type. I think it would be better if factories are registered per detection mechanism instead. Rather than registering the factories against a global (https://github.com/google/cadvisor/blob/master/container/factory.go#L76), we could create a "ContainerWatcher" interface, something like:

type ContainerWatcher interface {
  Start()
  Stop()
  RegisterFactory(factory ContainerHandlerFactory)
}
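
One way the proposed interface might be used, as a rough sketch: each watcher keeps its own list of factories, so no detection type has to be passed around. The `ContainerWatcher` method set is taken from the snippet above; the two-method `ContainerHandlerFactory`, `prefixFactory` and `staticWatcher` are hypothetical simplifications, not cAdvisor types.

```go
package main

import (
	"fmt"
	"strings"
)

// ContainerHandlerFactory is a pared-down, hypothetical factory interface.
type ContainerHandlerFactory interface {
	CanHandle(name string) bool
	NewHandler(name string) string // returns a label instead of a real handler
}

// ContainerWatcher has the method set proposed in the comment above.
type ContainerWatcher interface {
	Start()
	Stop()
	RegisterFactory(factory ContainerHandlerFactory)
}

// prefixFactory is a hypothetical factory that accepts names under a prefix.
type prefixFactory struct{ kind, prefix string }

func (f prefixFactory) CanHandle(name string) bool {
	return strings.HasPrefix(name, f.prefix)
}

func (f prefixFactory) NewHandler(name string) string {
	return f.kind + ":" + name
}

// staticWatcher is a hypothetical watcher that "detects" a fixed list of
// names when started and asks only its own factories for handlers.
type staticWatcher struct {
	names     []string
	factories []ContainerHandlerFactory
	handlers  []string
}

// Compile-time check that staticWatcher satisfies the interface.
var _ ContainerWatcher = (*staticWatcher)(nil)

func (w *staticWatcher) RegisterFactory(factory ContainerHandlerFactory) {
	w.factories = append(w.factories, factory)
}

func (w *staticWatcher) Start() {
	for _, name := range w.names {
		for _, f := range w.factories {
			if f.CanHandle(name) {
				w.handlers = append(w.handlers, f.NewHandler(name))
				break // first matching factory wins
			}
		}
	}
}

// Stop drops the handlers this watcher created.
func (w *staticWatcher) Stop() { w.handlers = nil }

func main() {
	w := &staticWatcher{names: []string{"/machine.slice/pod1", "/system.slice/sshd"}}
	w.RegisterFactory(prefixFactory{kind: "rkt", prefix: "/machine.slice/"})
	w.Start()
	fmt.Println(w.handlers)
	w.Stop()
}
```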

@sjpotter
Contributor Author

sjpotter commented May 4, 2016

OK, I'll try to work something up tomorrow; it will basically mirror the existing factory system.

@k8s-bot
Collaborator

k8s-bot commented May 5, 2016

Jenkins GCE e2e

Build/test passed for commit bf746e7.

@sjpotter
Contributor Author

sjpotter commented May 5, 2016

I'm not sure this commit is exactly where you want it, but instead of passing the detectionType all the way through, it's now passed only to container.NewContainerHandler (i.e. the factory method).

When we register factories we say which types of detection they support (I used a slice so they can technically support more than one, but I'm unsure that's really needed).

I also expanded the SubcontainerEvent to include the detection type, so that, I hope, we can have multiple "detectors" writing to the same channel. The next commit will add a new event called Update, which will use the update path from the original commit.
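
A rough sketch of the registration scheme described in this comment, assuming factories are matched purely on the detection type carried by the event. The `registry` type, its methods, and the string-valued handler result are hypothetical; only the idea of a slice of supported detection types and a `DetectionType` field on the event comes from the discussion.

```go
package main

import "fmt"

// SubcontainerEvent carries the container name plus who detected it.
type SubcontainerEvent struct {
	Name          string
	DetectionType string
}

// factory is a hypothetical record of a registered factory and the
// detection types it supports (a slice, so it can support several).
type factory struct {
	name           string
	detectionTypes []string
}

// registry is a hypothetical stand-in for the factory list.
type registry struct{ factories []factory }

func (r *registry) register(name string, detectionTypes ...string) {
	r.factories = append(r.factories, factory{name: name, detectionTypes: detectionTypes})
}

// newContainerHandler returns the name of the first registered factory
// that supports the event's detection type.
func (r *registry) newContainerHandler(ev SubcontainerEvent) (string, error) {
	for _, f := range r.factories {
		for _, t := range f.detectionTypes {
			if t == ev.DetectionType {
				return f.name, nil
			}
		}
	}
	return "", fmt.Errorf("no factory accepts %q detected via %q", ev.Name, ev.DetectionType)
}

func main() {
	r := &registry{}
	r.register("rkt", "rkt")
	r.register("raw", "raw")

	events := []SubcontainerEvent{
		{Name: "/machine.slice/pod1", DetectionType: "raw"},
		{Name: "/machine.slice/pod1", DetectionType: "rkt"},
	}
	for _, ev := range events {
		h, err := r.newContainerHandler(ev)
		fmt.Println(h, err)
	}
}
```

The same container name resolves to different factories depending only on which detector reported it, which is the behaviour the detection type is meant to provide.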

@k8s-bot
Collaborator

k8s-bot commented May 5, 2016

Jenkins GCE e2e

Build/test passed for commit fe2b317.

@k8s-bot
Collaborator

k8s-bot commented May 5, 2016

Jenkins GCE e2e

Build/test passed for commit be19bcf.

@sjpotter
Contributor Author

sjpotter commented May 5, 2016

@timstclair @vishh so lots of surgery, but this moves the raw cgroup container watcher to its own package and enables registering other watchers with the manager to use the same channel to write on.

I have a polling rkt implementation here https://github.com/sjpotter/cadvisor/tree/new-handler-detection-with-rkt/
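
To illustrate several watchers writing to the same channel, here is a small sketch under simplifying assumptions: each "watcher" is just a goroutine that emits a fixed list of names, and the consumer only counts events per source. The `emit` and `collect` functions are hypothetical; the buffer size of 16 matches the value used for the events channel in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// SubcontainerEvent is reduced to the two fields needed here.
type SubcontainerEvent struct {
	Name        string
	WatchSource string
}

// emit is a hypothetical watcher body: it reports each name on the
// shared channel, labelled with the watcher's source.
func emit(source string, names []string, events chan<- SubcontainerEvent, wg *sync.WaitGroup) {
	defer wg.Done()
	for _, n := range names {
		events <- SubcontainerEvent{Name: n, WatchSource: source}
	}
}

// collect starts one goroutine per watcher, all writing to one channel,
// and returns how many events arrived from each source.
func collect(watchers map[string][]string) map[string]int {
	events := make(chan SubcontainerEvent, 16)
	var wg sync.WaitGroup
	for source, names := range watchers {
		wg.Add(1)
		go emit(source, names, events, &wg)
	}
	// Close the channel once every watcher has finished sending.
	go func() {
		wg.Wait()
		close(events)
	}()

	counts := make(map[string]int)
	for ev := range events {
		counts[ev.WatchSource]++
	}
	return counts
}

func main() {
	counts := collect(map[string][]string{
		"raw": {"/a", "/b", "/c"},
		"rkt": {"/a"},
	})
	fmt.Println(counts["raw"], counts["rkt"])
}
```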

@k8s-bot
Collaborator

k8s-bot commented May 5, 2016

Jenkins GCE e2e

Build/test passed for commit 5e1622b.

Name string

//who detected it
DetectionType string
Contributor

I still don't like having "detection type". It just feels messy and not the right abstraction. Why is it necessary? I don't think container factories should care how the container was detected, their job is to just create the handler for it. It should be up to the manager (or a child module) to ensure that the correct factories are hooked up to the right detection sources.

Contributor Author

I'm trying to minimize code duplication, so they all go through the same event loop, which feeds the same factory loop.

However, since we are dealing with the same container names, if we don't distinguish where a container name came from, rkt wouldn't be able to tell whether the name was meant for it and would have to wait to make sure. The detection type just labels which watcher found it.

I'm not convinced this is right, but I'm trying to keep the code simple.

@k8s-bot
Collaborator

k8s-bot commented May 5, 2016

Jenkins GCE e2e

Build/test failed for commit 76a5ae5.

@sjpotter
Contributor Author

sjpotter commented May 5, 2016

@k8s-bot test this

@k8s-bot
Collaborator

k8s-bot commented May 5, 2016

Jenkins GCE e2e

Build/test passed for commit 76a5ae5.


// Maintain the watch for the new or deleted container.
switch {
case eventType == container.SubcontainerAdd:
Contributor

switch on eventType
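
For reference, the suggested form is a tagged switch, where the value appears once after `switch` and each case lists a constant. A small sketch, with a hypothetical `SubcontainerEventType` definition and `describe` function standing in for the real event handling:

```go
package main

import "fmt"

// SubcontainerEventType is a hypothetical stand-in for the event type enum.
type SubcontainerEventType int

const (
	SubcontainerAdd SubcontainerEventType = iota
	SubcontainerDelete
)

// describe switches on eventType directly instead of repeating
// "eventType == ..." in every case of a tagless switch.
func describe(eventType SubcontainerEventType) string {
	switch eventType {
	case SubcontainerAdd:
		return "add"
	case SubcontainerDelete:
		return "delete"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(describe(SubcontainerAdd))
	fmt.Println(describe(SubcontainerDelete))
}
```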

@k8s-bot
Collaborator

k8s-bot commented May 5, 2016

Jenkins GCE e2e

Build/test passed for commit 634421a.

@k8s-bot
Collaborator

k8s-bot commented May 5, 2016

Jenkins GCE e2e

Build/test passed for commit 8a446b8.

// The full container name of the container where the event occurred.
Name string

//who detected it
Contributor

More description on the comment?

@k8s-bot
Collaborator

k8s-bot commented May 8, 2016

Jenkins GCE e2e

Build/test failed for commit 436db30.

@k8s-bot
Collaborator

k8s-bot commented May 8, 2016

Jenkins GCE e2e

Build/test failed for commit 918145c.

@k8s-bot
Collaborator

k8s-bot commented May 9, 2016

Jenkins GCE e2e

Build/test failed for commit 1e407cb.

}

// Register for new subcontainers.
eventsChannel := make(chan watcher.SubcontainerEvent, 16)
Contributor

How was the channel size (16) decided? @sjpotter

Contributor

Ok, I saw that you copied from the old code.

@yifan-gu
Contributor

LGTM overall, with some nits. So this PR splits the watcher from the container handler, right? @sjpotter

@sjpotter
Contributor Author

sjpotter commented May 11, 2016

@yifan-gu correct, this is just the refactor; the rkt support will be a separate drop. It will be a small rework of the handler along with a new "watcher" mechanism that uses the rkt API service.

@k8s-bot
Collaborator

k8s-bot commented May 11, 2016

Jenkins GCE e2e

Build/test passed for commit 49a0b93.

@timstclair timstclair self-assigned this May 11, 2016
case event.EventType == watcher.SubcontainerDelete:
err = self.destroyContainer(event.Name)
case event.EventType == watcher.SubcontainerOverride:
err = self.overrideContainer(event.Name, event.WatchSource)
Contributor

I have some concerns around how override is handled, but it might make more sense once the rkt watcher is added. Can we just remove the override event & method from this PR, and I'll review it when it's actually used?

Contributor Author

removed the event for now

@k8s-bot
Collaborator

k8s-bot commented May 11, 2016

Jenkins GCE e2e

Build/test passed for commit c61c711.

@k8s-bot
Collaborator

k8s-bot commented May 11, 2016

Jenkins GCE e2e

Build/test passed for commit 05a9b07.

@k8s-bot
Collaborator

k8s-bot commented May 11, 2016

Jenkins GCE e2e

Build/test failed for commit bb10c38.

@k8s-bot
Collaborator

k8s-bot commented May 11, 2016

Jenkins GCE e2e

Build/test passed for commit 1382470.


if containerData.handler.String() != "raw" {
return nil
}
Contributor

Why is this check needed?

Contributor Author

It's now removed from this branch and is part of the rkt add-ons branch, but the reason is to only allow overriding a raw handler.

I.e. a cgroup gets a raw handler, then we say it should get a rkt handler; the check is just there to enforce that. If you don't think it's necessary, I can remove it.

@timstclair
Contributor

Ok, I still don't like passing around the WatchSource but I think fixing it would be a more substantial change, so I'm willing to unblock this with the WatchSource. If you clean up the remaining issues, we can get this merged. Also, please update the PR title.

// Ex: rkt handler will want to take priority over the raw handler, but the raw handler might be created first

// Only allow raw handler to be overriden
func (m *manager) overrideContainer(containerName string, watchSource watcher.ContainerWatchSource) error {
Contributor

I thought I said this elsewhere, but can't find the comment... Can we remove overrideContainer for now (since it's unused), so that I can review it when I can see how it's actually used?

Contributor Author

you did, and I did, and just pushed it.

@k8s-bot
Collaborator

k8s-bot commented May 12, 2016

Jenkins GCE e2e

Build/test passed for commit 55b640f.

@sjpotter sjpotter changed the title from "pondering how to create new way to detect rkt containers without cgroups" to "Refactor container watching out of raw handler into its own inteface / package" May 12, 2016
@k8s-bot
Collaborator

k8s-bot commented May 12, 2016

Jenkins GCE e2e

Build/test passed for commit 522895c.

for _, watcher := range self.containerWatchers {
err := watcher.Stop()
if err != nil {
errors = append(errors, fmt.Sprintf("%v", err))
Contributor

just do err.Error()
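
A sketch of the loop with that change applied, assuming the collected messages are joined into a single error at the end. The `stopper` interface, `fakeWatcher` and `stopAll` are hypothetical names used only to make the example self-contained.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// stopper is a hypothetical slice of the watcher interface: just Stop.
type stopper interface{ Stop() error }

// fakeWatcher fails on Stop when failure is non-empty.
type fakeWatcher struct{ failure string }

func (w fakeWatcher) Stop() error {
	if w.failure == "" {
		return nil
	}
	return errors.New(w.failure)
}

// stopAll stops every watcher and reports all failures together.
func stopAll(watchers []stopper) error {
	var msgs []string
	for _, w := range watchers {
		if err := w.Stop(); err != nil {
			// err.Error() already yields the message string, so there
			// is no need for fmt.Sprintf("%v", err).
			msgs = append(msgs, err.Error())
		}
	}
	if len(msgs) > 0 {
		return errors.New(strings.Join(msgs, "; "))
	}
	return nil
}

func main() {
	err := stopAll([]stopper{
		fakeWatcher{},
		fakeWatcher{failure: "inotify close failed"},
	})
	fmt.Println(err)
}
```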

@k8s-bot
Collaborator

k8s-bot commented May 12, 2016

Jenkins GCE e2e

Build/test passed for commit 9159322.

@k8s-bot
Collaborator

k8s-bot commented May 12, 2016

Jenkins GCE e2e

Build/test passed for commit 6770081.

Start()

//Name of handler
String() string
Contributor

Can we remove this?

Contributor Author

We could for now, but it's needed for identifying handlers later on.

Contributor

Let's add it when it's needed.

Contributor Author

ok, in progress

@k8s-bot
Collaborator

k8s-bot commented May 12, 2016

Jenkins GCE e2e

Build/test passed for commit ac0e403.

@k8s-bot
Collaborator

k8s-bot commented May 12, 2016

Jenkins GCE e2e

Build/test passed for commit e026324.

@yifan-gu
Contributor

@timstclair Are we able to get those rkt support PRs in and bump cadvisor in 7 days?

@timstclair
Contributor

Yes. Please make sure that any remaining issues / PRs are marked with the kubernetes v1.3 milestone

@sjpotter
Contributor Author

Once this PR is merged, I have another PR with the rkt implementation ready to post.

Testing it with k8s now.
On May 12, 2016 2:15 PM, "Tim St. Clair" notifications@github.com wrote:

Yes. Please make sure that any remaining issues / PRs are marked with the kubernetes
v1.3 milestone
https://github.com/google/cadvisor/milestones/Kubernetes%20v1.3

