Significant regression in API server due to policy code path #9368
Comments
@smarterclayton How about holding on the re-run until I've had time to plumb through shared caches? |
This seems a straight up regression, unless there's something I don't know. |
/sub |
There are a couple spots to do with scoped tokens that bypass the cache since we didn't have clean plumbing for them. Fixing that is near the top of my list. I also want to re-wire our existing caches to use the new cache style we have. |
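To make the cache-bypass concern concrete, here is a minimal sketch of a read-through cache in front of a rule resolver, compared against a code path that skips it. Everything in it (`Rule`, `RuleResolver`, `slowResolver`, `cachingResolver`, `backendCalls`) is hypothetical and made up for illustration; it is not the plumbing in this repository, and it ignores invalidation entirely.

```go
package main

import (
	"fmt"
	"sync"
)

// Rule is a hypothetical stand-in for a policy rule.
type Rule struct {
	Verb     string
	Resource string
}

// RuleResolver is a hypothetical interface for anything that can compute
// the effective rules for a user in a namespace.
type RuleResolver interface {
	EffectiveRules(namespace, user string) []Rule
}

// slowResolver simulates the uncached path and counts how often it runs.
type slowResolver struct {
	calls int
}

func (s *slowResolver) EffectiveRules(namespace, user string) []Rule {
	s.calls++
	return []Rule{{Verb: "get", Resource: "pods"}}
}

// cacheKey identifies one cached answer.
type cacheKey struct{ namespace, user string }

// cachingResolver wraps another resolver with a read-through cache.
// Assumption: entries never go stale; a real cache needs invalidation.
type cachingResolver struct {
	mu       sync.Mutex
	delegate RuleResolver
	rules    map[cacheKey][]Rule
}

func newCachingResolver(delegate RuleResolver) *cachingResolver {
	return &cachingResolver{delegate: delegate, rules: map[cacheKey][]Rule{}}
}

func (c *cachingResolver) EffectiveRules(namespace, user string) []Rule {
	key := cacheKey{namespace, user}
	c.mu.Lock()
	defer c.mu.Unlock()
	if r, ok := c.rules[key]; ok {
		return r // cache hit: the delegate is not consulted
	}
	r := c.delegate.EffectiveRules(namespace, user)
	c.rules[key] = r
	return r
}

// backendCalls performs n identical lookups and reports how many of them
// reached the slow resolver, with or without the cache in front of it.
func backendCalls(n int, useCache bool) int {
	backend := &slowResolver{}
	var r RuleResolver = backend
	if useCache {
		r = newCachingResolver(backend)
	}
	for i := 0; i < n; i++ {
		r.EffectiveRules("demo", "alice")
	}
	return backend.calls
}

func main() {
	fmt.Println("bypassing cache:", backendCalls(1000, false))
	fmt.Println("through cache:  ", backendCalls(1000, true))
}
```

In this toy version, any caller holding the bare resolver instead of the wrapper pays the full cost on every request, which is the shape of problem described for the scoped-token paths.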
Ok. Will keep this as a high priority so we don't lose sight. We should make sure performance knows not to start testing until we land this. @timothysc |
ack /cc @mffiedler |
|
This is @ 7c4fc5a |
which I think is the same, is the bulk of them. It looks like we make ~1-3 per mutation call. |
How easy/hard is it to get the comparison mem chart for 1.2? Do you have a script? Given that it's the GetEffectivePolicyRules call, I'd suspect this copy: |
I think the command has worked since 1.0. |
Clayton thinks he's got this. |
Almost resolved with #9814 |
Wasn't that pre-existing? |
The fix in there fixes conversions. |
... the bad news is that we're allocating too much in policy again :)
From a memory profile of a test-cmd run, it looks like we are bypassing or flushing the policy cache very aggressively (in an N^2 or N^3 pattern, maybe?), so we are allocating lots of objects. A test-cmd run makes ~40k API calls, but we are allocating ~26.5 million objects. It looks like confirmNoEscalation is the source of the GetClusterPolicy and GetEffectivePolicyRules calls, and it does not appear to be hitting caches. Should be reproducible with the following command (will dig in tomorrow and look for anything obvious).
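For scale, ~26.5 million objects over ~40k API calls is roughly 660 objects per call. The sketch below shows one way that kind of per-call cost could be measured in isolation, using `testing.AllocsPerRun` from the Go standard library to compare a lookup that copies its result on every call with one that returns a shared slice. The `Rule` type and the functions `rulesWithCopy`, `rulesShared`, `allocsPerCall`, and `objectsPerCall` are hypothetical names for illustration, not code from this repository.

```go
package main

import (
	"fmt"
	"testing"
)

// Rule is a hypothetical stand-in for a policy rule.
type Rule struct {
	Verb     string
	Resource string
}

// cached holds rules that were computed once, as a cache would.
var cached = []Rule{
	{Verb: "get", Resource: "pods"},
	{Verb: "list", Resource: "pods"},
	{Verb: "create", Resource: "builds"},
}

// rulesWithCopy returns a fresh copy on every call, which is safe against
// mutation by callers but allocates each time.
func rulesWithCopy() []Rule {
	out := make([]Rule, len(cached))
	copy(out, cached)
	return out
}

// rulesShared hands out the cached slice directly. Assumption: callers
// treat the result as read-only.
func rulesShared() []Rule {
	return cached
}

// sink keeps results reachable so the compiler cannot drop the calls.
var sink []Rule

// allocsPerCall reports the average number of heap allocations per call to f.
func allocsPerCall(f func() []Rule) float64 {
	return testing.AllocsPerRun(1000, func() { sink = f() })
}

// objectsPerCall is the back-of-the-envelope ratio used above.
func objectsPerCall(objects, calls float64) float64 {
	return objects / calls
}

func main() {
	fmt.Println("objects per API call:", objectsPerCall(26.5e6, 40000))
	fmt.Println("allocs per call, copying:", allocsPerCall(rulesWithCopy))
	fmt.Println("allocs per call, shared: ", allocsPerCall(rulesShared))
}
```

The numbers from a toy like this say nothing about the real code path; the point is only that a copy taken on every lookup multiplies quickly when the lookup happens several times per request.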