Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of GET requests #271

Open
wants to merge 13 commits into
base: main
Choose a base branch
from
Open

Conversation

moio
Copy link
Contributor

@moio moio commented Jan 22, 2024

👋 hamilton maintainers!

I am opening this to ask whether it would make sense to include some new code, or patch existing code, in order to improve resource consumption of the Get method - particularly memory usage but also CPU.

I came to this problem as part of my day work at SUSE for Rancher, which uses hamilton to implement Entra ID (formerly Azure AD) integration. In some high-scale use tests we have users with principals in the thousands, groups in the millions and group assignments per principal in the hundreds - and enough traffic to need multiple (tens of) goroutines to periodically refresh all these objects in parallel. In extreme cases we have seen hundreds of MBs to GiBs worth of heap objects, which tend to cause out-of-memory situations.

Here is an example of a similar flame graph:
image

Of course our usage pattern isn't typical, and we are working at ways to change the way we interact with the library. Nevertheless according to my analysis Get (and code called by it) does quite a bit of needless computation:

  • OData json.Unmarshal calls are amplified by a factor of 2 due to the error handling in OData.UnmarshalJSON
  • FromResponse calls, which drive OData.UnmarshalJSON calls, are amplified by a factor of 2 by performRequest
  • Get needlessly re-unmarshal responses once after FasterResponse (again with the amplification factors above), and especially
  • Get un-marshals and subsequently re-marshals responses if they are paginated. This unmarshaling-marshaling extra round is repeated by the number of pages

This PR cuts down unmarshaling to one pass only (divided in sub-passes via json.RawMessage) and eliminates remarshaling altogether.

I could not find ways to achieve this without altering the OData struct and structs/methods which use it (ConsistencyFailureFunc, ValidStatusFunc, GetHttpRequestInput), therefore, I opted for a completely new code path in fasterclient.go. The approach can be changed if API changes are acceptable.

For now and for the sake of discussion, I only patched the one endpoint that really hurts my use case: ListGroupMemberships. Changes are minimal other than using the FasterGet method in place of Get - ideally this could be extended to all other (internal) usages of Get.

I am also contributing code to exercise ListGroupMemberships more thoroughly so that the effects of the patch are evident. Furthermore I added a utility method that will output CPU and memory usage during the test run.

My results are as follows:

main faster_get
Run 1 610 ms 273 ms
Run 2 817 ms 122 ms
Run 3 643 ms 353 ms
Average 690 ms 249 ms

CPU load is ~36.14% of the original

main faster_get
Run 1 21 MiB 3 MiB
Run 2 22 MiB 3 MiB
Run 3 23 MiB 3 MiB
Average 22 MiB 3 MiB

Memory consumption is ~13.64% of the original

Is there any interest to merge such an improvement? What would be needed to have a similar PR accepted?

Thanks in advance, I hope this helps

Signed-off-by: Silvio Moioli <silvio@moioli.net>
Signed-off-by: Silvio Moioli <silvio@moioli.net>
Signed-off-by: Silvio Moioli <silvio@moioli.net>
Signed-off-by: Silvio Moioli <silvio@moioli.net>
Signed-off-by: Silvio Moioli <silvio@moioli.net>
Signed-off-by: Silvio Moioli <silvio@moioli.net>
Cascades to GetHttpRequestInput and related functions

Signed-off-by: Silvio Moioli <silvio@moioli.net>
Use json.RawMessage to unmarshal OData values separately from the rest of the OData structure,
and directly into the a desired result slice.

Signed-off-by: Silvio Moioli <silvio@moioli.net>
Unmarshal errors separately from the rest so that unmarshaling with two
different types can be avoided.

Signed-off-by: Silvio Moioli <silvio@moioli.net>
Signed-off-by: Silvio Moioli <silvio@moioli.net>
moio added a commit to rancher/rancher that referenced this pull request Jan 22, 2024
Replace the hamilton library with a 0.46 version with this PR merged:

manicminer/hamilton#271

Signed-off-by: Silvio Moioli <silvio@moioli.net>
moio added a commit to rancher/rancher that referenced this pull request Jan 23, 2024
Replace the hamilton library with a 0.46 version with this PR merged:

manicminer/hamilton#271

Signed-off-by: Silvio Moioli <silvio@moioli.net>
Signed-off-by: Silvio Moioli <silvio@moioli.net>
Signed-off-by: Silvio Moioli <silvio@moioli.net>
… re-run if response middlewares are set

Signed-off-by: Silvio Moioli <silvio@moioli.net>
@moio
Copy link
Contributor Author

moio commented Feb 23, 2024

@manicminer do you have any impressions to share about this work?

moio added a commit to rancher/rancher that referenced this pull request Mar 1, 2024
includes manicminer/hamilton#271

Signed-off-by: Silvio Moioli <silvio@moioli.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

1 participant