add filter to curl request to prevent oom exceptions #45

Open · wants to merge 1 commit into base: master

Conversation

@c-kr commented Apr 13, 2022

Sometimes the API response gets very long because it logs every HTTP request by default. On our test machine, the following grep command filled up the 24 GB of memory within seconds:

if [[ -n $user ]] || [[ -n $(echo $esstatus | grep -i authentication) ]] ; then

We were able to mitigate it by adding a filter to the API requests so that only the values that are actually needed are included. This probably needs some more testing before merging; we only tested the memory mode.
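For illustration, a filtered request could look roughly like this. This is a minimal sketch assuming the filter is Elasticsearch's filter_path query parameter; the host, credentials, port, and selected fields are placeholders, not necessarily the exact ones used in this PR:

# Ask Elasticsearch to return only the fields the check actually needs.
# filter_path takes a comma-separated list of dotted paths; wildcards are allowed.
esstatus=$(curl -s -u "monitoring:secret" \
  "https://es.example.com:9243/_cluster/stats?filter_path=status,nodes.jvm.mem.*")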

@Napsty (Owner) commented Apr 20, 2022

Let's give #42 a bit more time to rebase the PR, then adjust this PR and get a new version out.
Never mind, we'll go ahead with this one first and then we can do #42.
Very interesting that your Elasticsearch shows so much data, though. Could a misconfigured ES be the cause?
On which check type did you see the huge data in the API response?

@c-kr (Author) commented Apr 28, 2022

I am not an Elastic expert, but I don't think it is misconfigured, just heavily used, which leads to a lot of open HTTP connections being logged in the API output. It only happens a few times a day.

We see this huge amount of data in all checks that use the getstatus() method with the /stats endpoint.
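For reference, one generic way to see how much a filter shrinks the response is to compare the unfiltered and filtered sizes. These commands are illustrative only (placeholder host and credentials, and assuming a cluster stats endpoint), not taken from this PR:

# Unfiltered response size in bytes
curl -s -u "monitoring:secret" "https://es.example.com:9243/_cluster/stats" | wc -c
# Filtered response size in bytes
curl -s -u "monitoring:secret" "https://es.example.com:9243/_cluster/stats?filter_path=status" | wc -c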

@Napsty (Owner) commented Apr 28, 2022

Thanks for the info @c-kr

We see this huge amount of data in all checks that use the getstatus() method with the /stats endpoint.

That is certainly strange. I will do some tests with my Elasticsearch instances and maybe I can reproduce this sometime.
What kind of Elasticsearch did you see that on? Local self-installed or a Cloud instance? Which version?

Nevertheless, I think your contribution makes sense and will be merged after tests. 👍

@c-kr (Author) commented Apr 30, 2022

What kind of Elasticsearch did you see that on? Local self-installed or a Cloud instance? Which version?

Self-installed Elasticsearch, version 7.17.1.

Nevertheless, I think your contribution makes sense and will be merged after tests. 👍

Thanks

@Napsty self-requested a review May 13, 2022 12:41
@Napsty (Owner) commented May 13, 2022

The disk check does not work anymore with that change.

Before PR:

$ ./check_es_system.sh -H myes.eu-central-1.aws.cloud.es.io -u monitoring -p secret -S -P 9243 -t disk
ES SYSTEM OK - Disk usage is at 9% (41 G from 450 G)|es_disk=44708304152B;386547056640;459024629760;0;483183820800

When applying the PR:

$ ./check_es_system.sh -H myes.eu-central-1.aws.cloud.es.io -u monitoring -p secret -S -P 9243 -t disk
expr: non-integer argument
expr: non-integer argument
expr: non-integer argument
expr: non-integer argument
./check_es_system.sh: line 390: [: 44708304152: unary operator expected
./check_es_system.sh: line 393: [: 44708304152: unary operator expected
ES SYSTEM OK - Disk usage is at % (41 G from  G)|es_disk=44708304152B;;;0;null

Maybe the filter needs to be adjusted. Can you check please?
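The errors suggest the filter strips fields the disk check still reads: the used bytes are present (44708304152), but the total and threshold values come back empty, so expr and the [ tests fail. One possible adjustment is sketched below, under the assumption that the disk check reads the nodes.fs fields from the cluster stats response and that the PR filters with filter_path; the exact field list would need to be verified against the script:

# Keep the memory fields and also include the filesystem fields used by the disk check
# (field names are assumptions based on the _cluster/stats response layout)
filter="filter_path=status,nodes.jvm.mem.*,nodes.fs.*"
esstatus=$(curl -s -u "monitoring:secret" "https://es.example.com:9243/_cluster/stats?${filter}")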

@Napsty (Owner) left a comment

Needs some changes so that the "disk" check works again.
