[BUG] High load when updating internal users using PATCH #4008
Comments
@Jakob3xD Thanks for filing this issue - really appreciate the test configuration and the graphs on the issue.
Maybe this is related: https://stackoverflow.com/questions/36471723/bcrypt-performance-deterioration
I have looked into it but the
[Triage] Hi @Jakob3xD, thanks for filing this issue. We can go ahead and mark this as triaged, with a closure criterion of either correcting the CPU usage if it is directly in our control, or at least pointing to the dependency or issue that is causing this behavior.
I am not sure I understand your sentence correctly. Can you please rephrase it for me? I opened the issue because I am not familiar with Java and have no further ideas on how to debug this or add more useful information.
Hi @Jakob3xD, the comment I left was part of the weekly triaging process the contributors go through. You can read more about it in the TRIAGING markdown file. It is not clear whether OpenSearch or something we depend on is causing the CPU use. We may not be able to correct the issue if it comes from a dependency, but we can at least point to the root issue if that is the case.
I reproduced it on a 1-node cluster; the issue, as @peternied mentioned, is related to
What is the bug?
We are currently experiencing the following behavior: one of our clusters goes to 100% CPU usage on all nodes when we send a PATCH request changing internal_users. It does not matter whether we add new users, delete them, or delete non-existent users. After the PATCH request is answered/returned, the load stays high for a few minutes, from time to time causing node failures or rejected requests.
On other clusters we also see a load increase after the PATCH request returns, but for a much shorter time frame.
This seems to be caused by BCrypt; see the jstack output below.
How can one reproduce the bug?
I could not reproduce the long-lasting high load with these steps, only the shorter load spike (~20 s).
Steps to reproduce the behavior:
Golang file to create base load:
What is the expected behavior?
I would expect a lower load increase on all nodes when users change.
What is your host/environment?
Do you have any screenshots?
(Screenshot: CPU usage graph.)
Do you have any additional context?
Jstack during the high load time frame:
opensearch-high-load-jstack.txt