Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Windows] Fix agent initialization error by adjusting VMSwitch commands #3169

Merged
merged 1 commit into from Jan 17, 2022

Conversation

XinShuYang
Copy link
Contributor

@XinShuYang XinShuYang commented Jan 4, 2022

Set/Get-VMSwitch could return hostname resolution error in some environments.
The root cause is the remote DNS name didn’t match hostname, then Set/Get-VMSwitch
could return error once it needed to get hostname from DNS name.

This issue can be fixed by adding hostname to configuration.
e.g. Get-VMSwitch -ComputerName $(hostname) -Name %s

Signed-off-by: Shuyang Xin gavinx@vmware.com

@codecov-commenter
Copy link

codecov-commenter commented Jan 4, 2022

Codecov Report

Merging #3169 (c8babe9) into main (089a008) will decrease coverage by 6.79%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #3169      +/-   ##
==========================================
- Coverage   58.36%   51.56%   -6.80%     
==========================================
  Files         295      297       +2     
  Lines       24907    25434     +527     
==========================================
- Hits        14536    13116    -1420     
- Misses       8792    10836    +2044     
+ Partials     1579     1482      -97     
Flag Coverage Δ
kind-e2e-tests 30.58% <ø> (-14.09%) ⬇️
unit-tests 40.32% <ø> (-0.02%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/controller/types/networkpolicy.go 0.00% <0.00%> (-100.00%) ⬇️
pkg/agent/controller/networkpolicy/reject.go 0.00% <0.00%> (-87.91%) ⬇️
pkg/apis/controlplane/v1beta2/helper.go 33.33% <0.00%> (-66.67%) ⬇️
...formers/externalversions/security/v1alpha1/tier.go 0.00% <0.00%> (-64.29%) ⬇️
...ers/externalversions/core/v1alpha2/clustergroup.go 0.00% <0.00%> (-64.29%) ⬇️
...s/externalversions/core/v1alpha2/externalentity.go 0.00% <0.00%> (-64.29%) ⬇️
...xternalversions/security/v1alpha1/networkpolicy.go 0.00% <0.00%> (-64.29%) ⬇️
...versions/security/v1alpha1/clusternetworkpolicy.go 0.00% <0.00%> (-64.29%) ⬇️
pkg/controller/networkpolicy/mutate.go 0.00% <0.00%> (-62.07%) ⬇️
pkg/agent/controller/networkpolicy/packetin.go 6.56% <0.00%> (-60.59%) ⬇️
... and 97 more

Copy link
Contributor

@wenyingd wenyingd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good to me, and please add more comments in the commitments to explain the issue.

@XinShuYang
Copy link
Contributor Author

Code looks good to me, and please add more comments in the commitments to explain the issue.

Updated.

Set/Get-VMSwitch could return hostname resolution error in some environments.
The root cause is the remote DNS name didn’t match hostname, then Set/Get-VMSwitch
could return error once it needed to get hostname from DNS name.

This issue can be fixed by adding hostname to configuration.
e.g. Get-VMSwitch -ComputerName $(hostname) -Name %s

Signed-off-by: Shuyang Xin <gavinx@vmware.com>
Copy link
Contributor

@wenyingd wenyingd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wenyingd wenyingd requested a review from tnqn January 11, 2022 09:30
@XinShuYang
Copy link
Contributor Author

/test-e2e

@@ -177,7 +177,7 @@ func RemoveManagementInterface(networkName string) error {
var err error
var maxRetry = 3
var i = 0
cmd := fmt.Sprintf("Get-VMSwitch -Name %s | Set-VMSwitch -AllowManagementOS $false ", networkName)
cmd := fmt.Sprintf("Get-VMSwitch -ComputerName $(hostname) -Name %s | Set-VMSwitch -ComputerName $(hostname) -AllowManagementOS $false ", networkName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, could you explain more about the issue? Doesn't ComputerName already default to local computer?

-ComputerName
Specifies one or more Hyper-V hosts on which the virtual switch is to be configured. NetBIOS names, IP addresses, and fully qualified domain names are allowable. The default is the local computer. Use localhost or a dot (.) to specify the local computer explicitly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command could fail in TKG cluster environment, and we found the different DNS name can cause this problem. Since we don't know much about Hyper-V‘s implementation,we guess it has a cache-like design to get hostname from DNS name periodically if the default name is invalid. On the contrary, using command with hostname never failed in our experiment. @tnqn

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tnqn
Copy link
Member

tnqn commented Jan 14, 2022

/test-windows-all
/skip-networkpolicy
/skip-conformance
/skip-e2e
/skip-integration

@XinShuYang
Copy link
Contributor Author

/test-windows-all

@XinShuYang
Copy link
Contributor Author

/test-windows-networkpolicy

2 similar comments
@XinShuYang
Copy link
Contributor Author

/test-windows-networkpolicy

@XinShuYang
Copy link
Contributor Author

/test-windows-networkpolicy

@tnqn
Copy link
Member

tnqn commented Jan 17, 2022

/skip-e2e as Linux is not affected

@tnqn tnqn merged commit c2f3670 into antrea-io:main Jan 17, 2022
@tnqn tnqn added the action/backport Indicates a PR that requires backports. label Jan 25, 2022
@tnqn
Copy link
Member

tnqn commented Jan 25, 2022

@XinShuYang if the issue could happen in release-1.2~1.4, could you backport it, following https://github.com/antrea-io/antrea/blob/main/docs/contributors/cherry-picks.md?

@XinShuYang
Copy link
Contributor Author

Sure.

yanjunz97 pushed a commit to yanjunz97/antrea that referenced this pull request Feb 14, 2022
…ds (antrea-io#3169)

Set/Get-VMSwitch could return hostname resolution error in some environments.
The root cause is the remote DNS name didn’t match hostname, then Set/Get-VMSwitch
could return error once it needed to get hostname from DNS name.

This issue can be fixed by adding hostname to configuration.
e.g. Get-VMSwitch -ComputerName $(hostname) -Name %s

Signed-off-by: Shuyang Xin <gavinx@vmware.com>
XinShuYang added a commit to XinShuYang/antrea that referenced this pull request Feb 23, 2022
CI testbed building could still failed randomly with unmodified commands.

This issue can be fixed by adding hostname to configuration.
e.g. Get-VMSwitch -ComputerName $(hostname) -Name %s

Refer to antrea-io#3169

Signed-off-by: Shuyang Xin <gavinx@vmware.com>
XinShuYang added a commit to XinShuYang/antrea that referenced this pull request Feb 23, 2022
CI testbed building still failed randomly with unmodified commands.

This issue can be fixed by adding hostname to configuration.
e.g. Get-VMSwitch -ComputerName $(hostname) -Name %s

Refer to antrea-io#3169

Signed-off-by: Shuyang Xin <gavinx@vmware.com>
tnqn pushed a commit that referenced this pull request Feb 24, 2022
CI testbed building still failed randomly with unmodified commands.

This issue can be fixed by adding hostname to configuration.
e.g. Get-VMSwitch -ComputerName $(hostname) -Name %s

Refer to #3169

Signed-off-by: Shuyang Xin <gavinx@vmware.com>
bangqipropel pushed a commit to bangqipropel/antrea that referenced this pull request Mar 2, 2022
CI testbed building still failed randomly with unmodified commands.

This issue can be fixed by adding hostname to configuration.
e.g. Get-VMSwitch -ComputerName $(hostname) -Name %s

Refer to antrea-io#3169

Signed-off-by: Shuyang Xin <gavinx@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/backport Indicates a PR that requires backports.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants