Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix namespace switch issues and add ipvlan, macvlan, and bridge e2e testing #211

Merged
merged 5 commits into from May 20, 2016

Conversation

dcbw
Copy link
Member

@dcbw dcbw commented May 18, 2016

A subset of #167 while we discuss how the mocking should actually work.

Fixes #179.

@dcbw
Copy link
Member Author

dcbw commented May 18, 2016

cc @rosenhouse

@dcbw
Copy link
Member Author

dcbw commented May 18, 2016

cc @steveej

@steveej steveej self-assigned this May 18, 2016
@steveej steveej added this to the v0.3.0 milestone May 18, 2016
@dcbw
Copy link
Member Author

dcbw commented May 18, 2016

@rosenhouse one thing I'm not 100% sure about is the removal of LockOSThread() from the plugins init() functions themselves; that all works for the testcases because they run all the checks with Do(), and they run the plugin's cmdAdd/cmdDel from inside the Do() for the original namespace too. But when the plugins are run standalone from kubernetes or rkt or whatever, should they get the current host namespace and wrap inital setup with Do()?

@dcbw dcbw changed the title ipvlan, macvlan, and bridge e2e testing fix namespace switch issues and add ipvlan, macvlan, and bridge e2e testing May 18, 2016
@dcbw dcbw changed the title fix namespace switch issues and add ipvlan, macvlan, and bridge e2e testing Fix namespace switch issues and add ipvlan, macvlan, and bridge e2e testing May 18, 2016
@steveej
Copy link
Member

steveej commented May 18, 2016

But when the plugins are run standalone from kubernetes or rkt or whatever, should they get the current host namespace and wrap inital setup with Do()?
AFAIU this shouldn't be a problem, reasoning below.

If the plugins are invoked, the given network namespace will be explicitly set whenever necessary using NetNS.Do(). AFAICT the assumption is that the rest of the code is supposed to run in the caller's netns, and unless the caller doesn't mess with setting the netns themselves, CNI shouldn't be messing with it either.

@dcbw
Copy link
Member Author

dcbw commented May 18, 2016

@steveej yeah, that should be true. I guess the testcases have issues here (which were fixed by wrapping everything in Do()) because they create a completely new namespace to ensure they don't affect the host. So I guess we're OK.

@steveej
Copy link
Member

steveej commented May 18, 2016

yeah, that should be true. I guess the testcases have issues here (which were fixed by wrapping everything in Do()) because they create a completely new namespace to ensure they don't affect the host. So I guess we're OK.

Agreed! At this point, thank you again for this excellent contribution. I lack the overview if all of @rosenhouse's issues related to the code are addressed, @dcbw can you help here?

@dcbw
Copy link
Member Author

dcbw commented May 19, 2016

rebased to resolve the merge conflict

@dcbw
Copy link
Member Author

dcbw commented May 19, 2016

@steveej @rosenhouse I believe they were all addressed in the previous mocking PR; the PR here doesn't change anything from that other PR except for dropping the 'netops' object. Should get @rosenhouse to confirm though.

@steveej
Copy link
Member

steveej commented May 19, 2016

As discussed in the CNI meeting, this will be merged after bringing the locking back to the plugins init() functions.

@dcbw
Copy link
Member Author

dcbw commented May 19, 2016

@steveej reverted the init() changes; PTAL.

@rosenhouse
Copy link
Contributor

So, like I said at the CNI meeting, I'm traveling on PTO, and then at a conference and therefore unable to look at this closely until late next week.

However, at first glance, I think I see instances of the issue I mentioned on the call: given how we're trying to isolate namespace switches to a goroutine, nested calls to netns.Do are problematic. Specifically, you've got test code which calls netns.Do and then inside its callback, you directly invoke the plugin code. And inside the plugin code (e.g. in the bridge plugin) there is another call to netns.Do. That is likely the cause of the test flakes you're seeing. Calling LockOSThread at the start of the test suite, as I suggested on the call, won't help here.

I think the easiest way to fix those flakes would be to invoke the plugins as a separate OS process, rather than running them in the same process as the tests.

I'll leave it up to the other maintainers to decide on merging.

@steveej
Copy link
Member

steveej commented May 19, 2016

reverted the init() changes; PTAL.

I've checked the test cases before the revert, so I trust the green lights with this now.

@steveej
Copy link
Member

steveej commented May 19, 2016

@rosenhouse

Specifically, you've got test code which calls netns.Do and then inside its callback, you directly invoke the plugin code. And inside the plugin code (e.g. in the bridge plugin) there is another call to netns.Do. That is likely the cause of the test flakes you're seeing. Calling LockOSThread at the start of the test suite, as I suggested on the call, won't help here.

Let me try to repeat what you've explained to confirm and also lay the information open on the issue.
The first Do()'s goroutine is locked correctly, but in spite of this the nested goroutine could potentially be scheduled on a new OS thread because GOMAXPROCS has not been exhausted yet?

@rosenhouse
Copy link
Contributor

rosenhouse commented May 19, 2016

@steveej exactly -- that new goroutine may or may not end up on a new OS thread. If a new OS thread is created, then it will inherit the namespaces of its parent. So the new OS thread's "original" namespace is actually the "switched-to" special namespace of the first call. From that point on, the newly created OS thread is "polluted".

I've updated #183 with a description of this issue and to recommend against nested calls.

It's still not clear to me how to best expose namespace switching at a higher level of abstraction in a safe way.

For now, the only thing I can suggest is to carefully review code to check that there are no nested calls to netns.Do

@steveej
Copy link
Member

steveej commented May 19, 2016

@rosenhouse

If a new OS thread is created, then it will inherit the namespaces of its parent. So the new OS thread's "original" namespace is actually the "switched-to" special namespace of the first call. From that point on, the newly created OS thread is "polluted".

This seems to go very deep down to Go's thread/routine handling mechanism and I would appreciate the source to your knowledge.

@rosenhouse
Copy link
Contributor

@steveej I don't have a written reference for the fact that threads inherit the namespace context of their parent thread. However, another member of our CloudFoundry container networking team (@sykesm) has investigated this issue. I can share the results of his work here.

He wrote code that switched the namespace of the main goroutine and spun up several more goroutines from there. All the goroutines waited while the program reported the namespace associated with each OS thread. (code here) Here are the results:

lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15832/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15833/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15834/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15835/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15836/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15838/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15839/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15840/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15841/ns/net -> net:[4026533940]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15842/ns/net -> net:[4026533940]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15843/ns/net -> net:[4026533940]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15844/ns/net -> net:[4026533940]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15845/ns/net -> net:[4026533940]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15846/ns/net -> net:[4026533940]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15848/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/task/15849/ns/net -> net:[4026533719]
lrwxrwxrwx 1 root root 0 May  9 16:12 /proc/15832/ns/net -> net:[4026533719]

As you can see, the first few threads all share the original netns of the process, but several others are on the new netns.

As I mentioned earlier, I don't have time to dig into this now. But it might be interesting to perform a more detailed investigation to show how this state of the world evolves, perhaps reporting the full set of thread:netns pairings at each step of the loop.

}
type NetNS interface {
// Executes the passed closure in this object's network namespace,
// restoring the original namespace afterwards. If the closure returns
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as discussed OOB, we can't guarantee that the original namespace is restored. please add an explanation and maybe link to further docs, e.g. https://github.com/golang/go/wiki/LockOSThread

@dcbw dcbw force-pushed the e2e-testing branch 2 times, most recently from 94a7273 to 4be10d7 Compare May 20, 2016 18:57
### Namespace Switching
Switching namespaces with the `ns.Set()` method is not recommended without additional strategies to prevent unexpected namespace changes when your goroutines switch OS threads.

Go provides the `runtime.LockOSThread()` function to ensure a specific goroutine executes on its current OS thread and prevents any other goroutine from running in that thread until the locked one exits. Careful usage of `LockOSThread()` and short goroutines can provide good control over which network namespace a given goroutine executes in.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"short goroutines" worries me a little, is that really necessary? A lock's purpose is to synchronize concurrent tasks and prevent any unwanted runtime behavior.

@dcbw dcbw force-pushed the e2e-testing branch 2 times, most recently from 0629cf0 to 137b64f Compare May 20, 2016 20:54

### Do() The Recommended Thing
The `ns.Do()` method provides control over network namespaces for you by implementing these strategies. All code dependent on a particular network namespace should be wrapped in the `ns.Do()` method to ensure the correct namespace is selected for the duration of your code. For example:

Copy link
Member

@steveej steveej May 20, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please wrap the code snippet in a ```go ... block

dcbw added 5 commits May 20, 2016 17:10
…sues

Add a namespace object interface for somewhat cleaner code when
creating and switching between network namespaces.  All created
namespaces are now mounted in /var/run/netns to ensure they
have persistent inodes and paths that can be passed around
between plugin components without relying on the current namespace
being correct.

Also remove the thread-locking arguments from the ns package
per containernetworking#183 by doing all the namespace
changes in a separate goroutine that locks/unlocks itself, instead of
the caller having to track OS thread locking.
@steveej steveej merged commit d30040f into containernetworking:master May 20, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants