Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor run command networking options for Windows support. #1953

Merged
merged 1 commit into from
Mar 22, 2023

Conversation

aznashwan
Copy link
Contributor

Refactor the loading and application of the networking-related
arguments for the nerdctl run command in order to facilitate
Windows support.

Fixes/alleviates #559 and #1680.

Signed-off-by: Nashwan Azhari nazhari@cloudbasesolutions.com

@aznashwan aznashwan force-pushed the hyperv-netns branch 3 times, most recently from dff5c2d to 7751426 Compare January 30, 2023 14:36
@AkihiroSuda
Copy link
Member

compose: show detailed errors again

This is now merged in the main, please rebase

go.mod Outdated Show resolved Hide resolved
@aznashwan
Copy link
Contributor Author

Just pushed some updates, and despite still have some issues which will need ironing out, I would like to offer some notes for any pre-emptive reviews due to the initially excessive-looking amount of abstractions which are added:

  1. this initial implementation only offers CNI networking on Windows (host mode networking should be doable as well)
  2. OCI hooks (which is the current mechanism through which nerdctl sets up and tears down CNI networking) are not currently supported on Windows, so the only route forwards is to have the main nerdctl run/rm process set up and tear down the container's networking itself. (this has lead to potential dangling interface definitions and port mappings which I'll need to add some more explicit tests for)
  3. the nerdctl/pkg/netutil package only saves/reads CNI v1.0.0 configurations, but Windows container networking can only handle version up to v0.4.0.

Although that last one isn't as big of a problem as it sounds (it only means networks created by nerdctl cannot be read/used by HCS, so only affects tests which create their own network), I'm still looking into a better way to handle them without bypassing netutil.

Any feedback is appreciated in the meantime.

@dcantah
Copy link
Member

dcantah commented Feb 16, 2023

@aznashwan I don't think there's a technical reason why OCI hooks couldn't work on Windows, I just think we never had a reason to really implement them, but I feel like this is as good of a reason as they come hahah

@dcantah
Copy link
Member

dcantah commented Feb 16, 2023

Maybe worth exploring 😳

@aznashwan
Copy link
Contributor Author

Just pushed some updates, I believe I have managed to completely bypass the CNI versioning issue (at least in nerdctl itself while creating the container), and at this stage nerdctl seems to reliably create and clean up networking for Windows containers.

I still have to port/fix some tests, but this is the point where any reviews would be extra-appreciated!

@aznashwan aznashwan force-pushed the hyperv-netns branch 2 times, most recently from 93c1a7c to 9c1bd98 Compare February 27, 2023 14:22
@AkihiroSuda AkihiroSuda marked this pull request as draft February 28, 2023 05:31
@aznashwan
Copy link
Contributor Author

aznashwan commented Feb 28, 2023

LE: The cleanup errors were actually due to a failure to remove containers created with --rm on non-Windows which was introduced in this patch, but should be fixed now.

This PR should be review-able, and it looks like the rest of the failing tests are somehow related to this recent BuildKit layer pruning issue.

Failing tests are:

I'm pretty stumped at this stage on what the cause may be because as far as I can tell my patch should not modify/manipulate IPFS settings at all. (it's strictly related to the networking arguments for the containers themselves)

It's probably something obvious I'm missing, but I'm not familiar enough with how IPFS registries should work to know if any of my modifications to container creation even can bear any relevance to those tests?

@AkihiroSuda @fahedouch any pointers you may have for the IPFS failure would be greatly appreciated.

@dcantah @jsturtevant if you could please have a look over the Windows side of the code, that would be greatly appreciated as well!

cmd/nerdctl/container_run.go Outdated Show resolved Hide resolved
cmd/nerdctl/container_run.go Outdated Show resolved Hide resolved
cmd/nerdctl/container_run.go Outdated Show resolved Hide resolved
cmd/nerdctl/container_run.go Outdated Show resolved Hide resolved
pkg/containerutil/container_network_manager_unix.go Outdated Show resolved Hide resolved
func (m *containerNetworkManager) VerifyNetworkOptions(_ context.Context) error {
// TODO: check host OS, not client-side OS.
if runtime.GOOS != "linux" {
return errors.New("container networking mode is currently only supported on Linux")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was surprised to see this. Maybe I don't see how this is used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initial PR's scope was abstracting the networking logic for all platforms and only adding minimal CNI networking support for Windows for now as that is the only network setup which is tested on Windows.

@aznashwan aznashwan force-pushed the hyperv-netns branch 4 times, most recently from 7e76e18 to c110fe4 Compare March 6, 2023 16:44
@aznashwan
Copy link
Contributor Author

@AkihiroSuda could I please get a recheck whenever you can spare the time?

@@ -26,4 +26,16 @@ const (
// use gcr.io/k8s-staging-e2e-test-images/busybox:1.29-2-windows-amd64-ltsc2022 locally on windows 11
// https://github.com/microsoft/Windows-Containers/issues/179
CommonImage = WindowsNano

// NOTE(aznashwan): the upstream e2e Nginx test image is actually based on BusyBox.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can change the constant name to match the reality. Can be another PR.

return "", errors.New("namespace is required")
}
if strings.Contains(globalOptions.Namespace, "/") {
return "", errors.New("namespace with '/' is unsupported")
Copy link
Member

@AkihiroSuda AkihiroSuda Mar 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Windows you may want to validate \ (and :?) too

go.mod Outdated Show resolved Hide resolved
return err
}

// NOTE: only currently supported network type on Windows is nat:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is only nat supported?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main goal of the patch was to get minimal CNI networking running on Windows, enable as many tests which use networking (e.g. tests which set up a temporary local registry), and then add other network modes later.

Seeing as though nat was the default network type set up by the CI, and is the only one that gets tested, I added this explicit constraint for now.

var isErr bool
if errE := os.RemoveAll(stateDir); errE != nil {
isErr = true
if containerErr == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need && runtime.GOOS == "windows" here too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable netSetupErr is initially set to nil and will only be non-nil if runtime.GOOS == "windows". Therefore, it is unnecessary IMO.

Copy link
Contributor

@jsturtevant jsturtevant Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, Thanks! I see that now. Just a thought, It isn't immediately obvious that this code is only executing for Windows when reading through. Adding something here to indicate that would could be helpful.

cmd/nerdctl/container_create_linux_test.go Outdated Show resolved Hide resolved
cmd/nerdctl/container_run_log_driver_syslog_test.go Outdated Show resolved Hide resolved
cmd/nerdctl/container_run_log_driver_syslog_test.go Outdated Show resolved Hide resolved
cmd/nerdctl/container_run_log_driver_syslog_test.go Outdated Show resolved Hide resolved
@jsturtevant
Copy link
Contributor

lgtm for Windows
I agree with @dcantah the ideal path would be OCI hooks since the project heavily uses it. It is technically feasible for Windows as far as I know but would take some effort in the hcsshim project.

@AkihiroSuda
Copy link
Member

CI is failing. Please try rebasing.

Refactor the loading and application of the networking-related
arguments for the `nerdctl run` command in order to enable
Windows container networking support through CNI.

To facilitate this, the following major changes were made:
    * moved all networking-related container spec definitions
      to pkg/containerutil/container_network_manager_*
    * refactored network-related argiment loading and application
      in cmd/nerdctl/container_run.go
    * enabled some of the networking-related tests on Windows

Signed-off-by: Nashwan Azhari <nazhari@cloudbasesolutions.com>
@aznashwan
Copy link
Contributor Author

@AkihiroSuda looks all green now, thanks!

Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@AkihiroSuda AkihiroSuda merged commit 41ea1c7 into containerd:main Mar 22, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/refactor platform/Windows/Non-WSL2 Microsoft Windows (non-WSL2)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants