Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

daemon: fix race in config handler #17413

Merged
merged 1 commit into from Sep 23, 2021
Merged

daemon: fix race in config handler #17413

merged 1 commit into from Sep 23, 2021

Conversation

h3llix
Copy link
Contributor

@h3llix h3llix commented Sep 16, 2021

Fixes: #17321

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 16, 2021
@h3llix h3llix force-pushed the fixRace branch 2 times, most recently from 305e1be to 6ad0f25 Compare September 18, 2021 04:38
@h3llix h3llix marked this pull request as ready for review September 18, 2021 04:39
@h3llix h3llix requested review from a team and joamaki September 18, 2021 04:39
Copy link
Contributor

@joamaki joamaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! One minor comment.

How did you figure out that "ConfigPatchMutex" shouldn't be serialized? Wondering if there's other fields that should be skipped and whether there's a better way than manually skipping specific fields...

daemon/cmd/config.go Outdated Show resolved Hide resolved
@joestringer
Copy link
Member

joestringer commented Sep 20, 2021

How did you figure out that "ConfigPatchMutex" shouldn't be serialized? Wondering if there's other fields that should be skipped and whether there's a better way than manually skipping specific fields...

I had prompted this, because "ConfigPatchMutex" isn't a configuration option, it's a mutex used internally inside the daemon to synchronize state. So it doesn't make sense to me that we would try to serialize a synchronization primitive onto the wire. Interesting question though, for instance when I peruse the DaemonConfig structure I see a couple of other fields that aren't actually agent configuration options, they're just conveniently placed directly into this structure. I'm looking at things like DryMode which is only used for unit tests. Maybe CreationTime falls under this category also. It probably makes sense for us to collect these non-option fields together in a dedicated section maybe at the start of the DaemonConfig structure (or even in a dedicated structure that's embedded directly into DaemonConfig) so that it's easier to identify these and that they should not be serialized.

Signed-off-by: Gaurav Genani <h3llix.pvt@gmail.com>
@h3llix
Copy link
Contributor Author

h3llix commented Sep 21, 2021

@joestringer That makes sense. We could have embedded structs containing configurations which do not need to be serialized. Simply skipping the struct at the time of mapping.

Copy link
Member

@joestringer joestringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix LGTM, thanks.

@joestringer joestringer added the release-note/misc This PR makes changes that have no direct user impact. label Sep 21, 2021
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Sep 21, 2021
@joestringer
Copy link
Member

joestringer commented Sep 21, 2021

test-me-please

Job 'Cilium-PR-K8s-1.16-net-next' failed and has not been observed before, so may be related to your PR:

Click to show.

Test Name

K8sServicesTest Checks service across nodes Tests NodePort BPF Tests with direct routing Tests NodePort

Failure Output

FAIL: Can not connect to service "tftp://192.168.36.11:31670/hello" from outside cluster (1/10)

If it is a flake, comment /mlh new-flake Cilium-PR-K8s-1.16-net-next so I can create a new GitHub issue to track it.

Copy link
Contributor

@joamaki joamaki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now, thanks!

@joestringer
Copy link
Member

ci-aks failure is likely a variation of the flake #16938.
test-1.16-netnext failure looks like #12511.
ci-l4lb failure appears to be #17305.

No need to block this PR on the test failures since they are unrelated to the PR.

@joestringer joestringer added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Sep 21, 2021
@jibi jibi merged commit f3887bd into cilium:master Sep 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
4 participants