Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dnsproxy: Remove usage of regex lru #22584

Merged
merged 1 commit into from Jun 12, 2023

Conversation

odinuge
Copy link
Member

@odinuge odinuge commented Dec 6, 2022

Since we now have a separate reference counted cache storing compiled regexes. This will both make more space for other values in the cache, as well as making the huge regexes used in the dnsproxy avaible to gc when they are no longer used.

Signed-off-by: Odin Ugedal ougedal@palantir.com

Fixes: #22241

dnsproxy: stop using the regex lru in the dns proxy to avoid keeping large unused regex in memory when no longer needed

@maintainer-s-little-helper maintainer-s-little-helper bot added the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Dec 6, 2022
@github-actions github-actions bot added the kind/community-contribution This was a contribution made by a community member. label Dec 6, 2022
@odinuge
Copy link
Member Author

odinuge commented Dec 6, 2022

Ref. #22241 this will make all users able to utilize the benefits from #21288 when it comes to memory usage of unused regexes.

Not sure if we should remove all of it at once tho. Any thoughts @christarazi ?

pkg/fqdn/dnsproxy/proxy.go Outdated Show resolved Hide resolved
@odinuge odinuge marked this pull request as ready for review December 6, 2022 21:15
@odinuge odinuge requested review from a team as code owners December 6, 2022 21:15
@@ -347,7 +347,7 @@ func (c regexCache) lookupOrCompileRegex(pattern string) (*regexp.Regexp, error)
entry.referenceCount += 1
return entry.regex, nil
}
regex, err := re.CompileRegex(pattern)
regex, err := regexp.Compile(pattern)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can update the godoc on re.CompileRegex to say it's deprecated and not to use it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additionally, searching for all instances of re.CompileRegex and replacing them could work. Or alternatively, you could just make re.CompileRegex simply be a wrapper for regexp.Compile, that way you don't risk missing any paths.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can then remove this code in the v1.14 release cycle after we've deprecated it properly here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally back from 🌴 now, so ready to keep pushing these!

Do you have a preference here? I think a cache for the other usage of re.CompileRegex might be valuable to avoid compiling them all the time, but I don't have all the context on that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On second thought, maybe we don't want to entirely remove the LRU.

The problem with caching regexes as we've learned is if we don't clean them up. In your fixes, you've attached a refcount to the regex in the DNS proxy. But there are other areas too, where we can attach cleanup logic.

The other areas for regex compilation is policy YAML validation, aka pkg/fqdn/matchpattern.Validate() and during each DNS request in the DNS proxy. I see value in caching the regex in these cases.

In the former case, we could mitigate the issue of keeping the regexes around by hooking into the policy-delete event and removing the names (which map to regexes) in the to-be-deleted policy from the cache. In the latter case, we can use an LRU. I feel an LRU is suitable in this case because the access pattern will be based on the DNS requests that the workloads are performing.

Should these cases share the same LRU? Well, that was kinda the idea behind the initial LRU implementation to begin with. I think as long as we manage the lifecycle and ensure we delete "stale" entries, then we will probably be fine regarding all the memory issues we've had in the past (and that you've helped fix 🙂 ).

@jrajahalme
Copy link
Member

checkpatch fails due to a typo in the commit message:

  Warning: WARNING:TYPO_SPELLING: 'avaible' may be misspelled - perhaps 'available'?
  #8: 
  as well as making the huge regexes used in the dnsproxy avaible to gc

@jrajahalme jrajahalme added the release-note/misc This PR makes changes that have no direct user impact. label Jan 4, 2023
@maintainer-s-little-helper maintainer-s-little-helper bot removed the dont-merge/needs-release-note-label The author needs to describe the release impact of these changes. label Jan 4, 2023
@jrajahalme
Copy link
Member

/test

@christarazi
Copy link
Member

Seems that the Jenkins tests failed due to what seems to be a rebasing issue. Will rebase and re-run.

@christarazi christarazi added kind/enhancement This would improve or streamline existing functionality. area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. labels Jan 4, 2023
@christarazi
Copy link
Member

/test

@jibi jibi removed their request for review January 24, 2023 13:23
@christarazi christarazi marked this pull request as draft January 25, 2023 19:37
@github-actions
Copy link

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. and removed stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. labels Feb 27, 2023
@christarazi christarazi added the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Mar 6, 2023
@github-actions
Copy link

github-actions bot commented Apr 6, 2023

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Apr 6, 2023
@github-actions github-actions bot removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label Apr 19, 2023
@github-actions
Copy link

This pull request has been automatically marked as stale because it
has not had recent activity. It will be closed if no further activity
occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label May 20, 2023
@odinuge odinuge force-pushed the ou-disable-lru-dnsproxy branch 2 times, most recently from f9950f6 to edf9613 Compare May 23, 2023 08:27
@odinuge
Copy link
Member Author

odinuge commented May 23, 2023

Forgot about this since I thought #23429 mitigated it, but seems like that only conflicted with one of the changes.

Pushed an updated version now

@odinuge odinuge marked this pull request as ready for review May 23, 2023 08:28
@christarazi
Copy link
Member

CI pointing out more places to remove

 Error: pkg/fqdn/dnsproxy/proxy_test.go:197:9: undefined: re

@github-actions github-actions bot removed the stale The stale bot thinks this issue is old. Add "pinned" label to prevent this from becoming stale. label May 24, 2023
@odinuge odinuge force-pushed the ou-disable-lru-dnsproxy branch 2 times, most recently from cf2e56f to 7138f62 Compare May 24, 2023 08:13
Copy link
Member

@christarazi christarazi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One nit on the commit title / msg, is that we are not entirely removing the usage of the LRU, but simply avoiding it in certain cases, see https://github.com/cilium/cilium/pull/22584/files#r1063772535. I think we should clarify that because it can certainly cause confusion for future readers. Maybe an additional code comment on the LRU and non-LRU code paths to explicitly state why a code path uses one or the other.

@odinuge
Copy link
Member Author

odinuge commented May 24, 2023

Hmm. The commit message body is in the context of the title with the dnsproxy: prefix, so I think it makes sense. But yeah, I can rewrite it tomorrow to make it even more explicit if its a bit ambiguous.

Since we now have a separate reference counted cache storing compiled
regexes, we no longer need the lru here. This will both make more space
for other values in the cache, as well as making the huge regexes used
in the dnsproxy available to gc when they are no longer used.

The regex LRU is still being used elsewhere in the cilium codebase,
where reusing compiled regexes makes sense, while reference counting is
non-trivial.

Signed-off-by: Odin Ugedal <ougedal@palantir.com>
Signed-off-by: Odin Ugedal <odin@uged.al>
@christarazi christarazi removed the dont-merge/needs-rebase This PR needs to be rebased because it has merge conflicts. label Jun 6, 2023
@christarazi
Copy link
Member

/test

@odinuge
Copy link
Member Author

odinuge commented Jun 9, 2023

Mind retesting this @christarazi ? Thanks

@christarazi
Copy link
Member

/ci-e2e

@christarazi
Copy link
Member

/ci-multicluster

@maintainer-s-little-helper maintainer-s-little-helper bot added the ready-to-merge This PR has passed all tests and received consensus from code owners to merge. label Jun 12, 2023
@christarazi christarazi merged commit 3b258a3 into cilium:main Jun 12, 2023
61 of 62 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/proxy Impacts proxy components, including DNS, Kafka, Envoy and/or XDS servers. kind/community-contribution This was a contribution made by a community member. kind/enhancement This would improve or streamline existing functionality. ready-to-merge This PR has passed all tests and received consensus from code owners to merge. release-note/misc This PR makes changes that have no direct user impact.
Projects
None yet
4 participants