Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fqdn: serialize requests per-name #30109

Merged
merged 1 commit into from
Jan 9, 2024
Merged

Conversation

squeed
Copy link
Contributor

@squeed squeed commented Jan 4, 2024

This re-creates similar behavior as in v1.14 - FQDN requests are serialized on a per-name basis. This was removed as part of the larger fqdn refactor but, it turns out, that was premature. The difference between v1.14 and now is that this locks per-name, rather than per-IP. It preserves the prime-sized count of bucketed locks.

This is needed because we release the lock on the NameManager before policy updates have been fully pushed out to all endpoints. If another endpoint requests the same name, it may have the DNS response returned to it before policy updates are complete. However, the NameManager will detect the noop case, take a shortcut, and release the DNS response before policy updates are potentially complete.

For precise details, see the comment in fqdn.go.

Fixes: #30014
Fixes: 507259f

@squeed squeed added kind/bug This is a bug in the Cilium logic. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies. release-note/misc This PR makes changes that have no direct user impact. area/fqdn Affects the FQDN policies feature labels Jan 4, 2024
@squeed squeed requested review from a team as code owners January 4, 2024 22:00
@maintainer-s-little-helper
Copy link

Commit 8984191 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Jan 4, 2024
@maintainer-s-little-helper
Copy link

Commit 8984191 does not match "(?m)^Signed-off-by:".

Please follow instructions provided in https://docs.cilium.io/en/stable/contributing/development/contributing_guide/#developer-s-certificate-of-origin

@squeed squeed removed the dont-merge/needs-sign-off The author needs to add signoff to their commits before merge. label Jan 4, 2024
This re-creates similar behavior as in v1.14 - FQDN requests are
serialized on a per-name basis. This was removed as part of the larger
fqdn refactor but, it turns out, that was premature. The difference
between v1.14 and now is that this locks per-name, rather than per-IP.
It preserves the prime-sized count of bucketed locks.

This is needed because we release the lock on the NameManager before
policy updates have been fully pushed out to all endpoints. If another
endpoint requests the same name, it may have the DNS response returned
to it before policy updates are complete. However, the NameManager will
detect the noop case, take a shortcut, and release the DNS response
before policy updates are potentially complete.

For precise details, see the comment in fqdn.go.

Fixes: cilium#30014
Fixes: 507259f
Signed-off-by: Casey Callendrello <cdc@isovalent.com>
@squeed squeed added the needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch label Jan 4, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot added this to Needs backport from main in v1.15.0-rc.1 Jan 4, 2024
@squeed
Copy link
Contributor Author

squeed commented Jan 4, 2024

/test

Copy link
Member

@mhofstetter mhofstetter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sig-agent & cli ✔️

Copy link
Contributor

@lambdanis lambdanis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs ok

Copy link
Contributor

@danehans danehans left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should a test similar to Benchmark_notifyOnDNSMsg() be created to validate the fix?

Copy link
Member

@pippolo84 pippolo84 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 💯

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jan 9, 2024
@julianwiedmann julianwiedmann added this pull request to the merge queue Jan 9, 2024
Merged via the queue into cilium:main with commit 20975c9 Jan 9, 2024
62 checks passed
@jibi jibi mentioned this pull request Jan 12, 2024
32 tasks
@jibi jibi added backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. and removed needs-backport/1.15 This PR / issue needs backporting to the v1.15 branch labels Jan 12, 2024
@giorio94 giorio94 added backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. and removed backport-pending/1.15 The backport for Cilium 1.15.x for this PR is in progress. labels Jan 29, 2024
@maintainer-s-little-helper maintainer-s-little-helper bot moved this from Needs backport from main to Backport done to v1.15 in v1.15.0-rc.1 Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/fqdn Affects the FQDN policies feature backport-done/1.15 The backport for Cilium 1.15.x for this PR is done. kind/bug This is a bug in the Cilium logic. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact. sig/policy Impacts whether traffic is allowed or denied based on user-defined policies.
Projects
No open projects
v1.15.0-rc.1
Backport done to v1.15
Development

Successfully merging this pull request may close these issues.

(Possible issue) race condition between multiple endpoints and a FQDN selector.
8 participants