High memory working set in azure function #1339

Closed
andrew-vdb opened this issue Sep 26, 2019 · 12 comments

Comments

@andrew-vdb

@paulbatum you need to train your support staff; they keep discussing and investigating "memory working set" instead of "private bytes".
I assume the cost calculation is based on "private bytes" rather than "memory working set"?

Because I have this strange graph
image
The memory working set goes high, and the first peak corresponds to a lighter workload than the second peak.

It is only when I check the private bytes that the numbers make more sense:
image

Another problem is that auto-healing (when they restart our application) is based on "memory working set".

A bit off topic: how can I have a high memory working set in an Azure Function? What might be the cause of this? Is it caused by Application Insights or the WebJobs dashboard?

@andrew-vdb
Author

Original issue is here: Azure/azure-functions-durable-extension#886

Fact 1:
Initially I thought the retry logic of the durable function was causing the memory working set to spike.
That is no longer true, because I can make it spike without any retry logic in the durable function.

Fact 2:
The initial spike is always higher than the second spike, even though the first load is lower than the second load.
For example, I process 5 files on the first load and 10 files on the second load.
Same files, same orchestration in the durable function.

Fact 3:
It also happens even without load; the memory working set can go higher for no apparent reason.

Last update from Azure support:
They loaded the memory dump, found nothing under "ScriptHost", and more or less concluded that there is no problem or memory leak in Azure Functions.

When I load the memory dump, I just see a lot of Regex objects in memory.
My application has no regex at all.

@paulbatum
Member

@andrew-vdb thanks for filing a separate issue for this. I would like to assist support with the investigation. Do you have a support case number you can share here?

@andrew-vdb
Author

@paulbatum Thank you! I hope we will find the cause of this memory spike

119080522001676
This is the case where we suspected the durable function retry was causing the memory spike. Initially we could spike the memory, as shown above, by forcing the orchestration to retry. Somehow we can no longer reproduce the spike reliably, and I have already concluded that retry is not the real issue.

119072422000746
This is the case where RaiseEvent is not called. There is no error on the caller side, and no sign on the Azure Functions side that the call was ever received.
During the investigation with Azure support, they saw that auto-healing restarted our application at exactly the time RaiseEvent was being called. That is when we started investigating memory on the Azure Functions side. As I said, the discussion with Azure support keeps pointing to the memory working set; they even just said to auto-scale our App Service plan. The case is pretty stalled and nobody knows what to do anymore, which is why I commented on a GitHub issue asking for memory usage per function. Azure support keeps blaming our application, which is fine with me, but back that claim with data: which function is causing this memory spike? Now we know it is impossible to find out, too.

Private bytes is what I expect from my application, assuming that "m" means "MB".
Memory working set is very scary: it spikes from 150 MB to 625 MB just to process 5 files (the initial load).
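
For context on the RaiseEvent call discussed above, here is a minimal sketch of the Durable Functions external-event pattern, assuming the 1.x C# API that was current at the time; the function, queue, and event names are hypothetical and are not taken from this thread.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public static class ExternalEventSample
{
    // Orchestrator: pauses until a client raises the "FilesUploaded" event.
    // If the host is recycled (for example by auto-healing) while the event is
    // being raised, the caller may see no error even though the orchestration
    // never receives the event.
    [FunctionName("WaitForUploadOrchestrator")]
    public static async Task RunOrchestrator(
        [OrchestrationTrigger] DurableOrchestrationContext context)
    {
        string payload = await context.WaitForExternalEvent<string>("FilesUploaded");
        await context.CallActivityAsync("ProcessFiles", payload);
    }

    // Client: raises the event against a running orchestration instance.
    [FunctionName("RaiseUploadEvent")]
    public static Task RaiseEvent(
        [QueueTrigger("upload-notifications")] string instanceId,
        [OrchestrationClient] DurableOrchestrationClient client)
    {
        return client.RaiseEventAsync(instanceId, "FilesUploaded", "file batch ready");
    }
}
```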

@paulbatum
Member

I have provided support some assistance with this case, but just in the interests of sharing information for anyone else that comes across this issue and is curious:

  1. Working set tracks how much of the application is currently loaded in physical memory. This number can vary significantly as the OS makes decisions around paging. It is expected that when under load, working set should increase significantly.
  2. Private bytes tracks how much memory is allocated to the application in total, regardless of whether it is in physical memory or paged to disk. This is what is used to track billable memory usage in the consumption plan. In dedicated plans, your memory usage does not impact your bill, unless the memory usage is so high that it would force you to scale up to a higher SKU. The numbers shared in this issue are not high enough for that to be the case.
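
To make the distinction concrete, here is a small C# sketch (not from this thread) that reads both counters for the current process via System.Diagnostics; the working set is the physically resident portion, while private bytes is the total committed allocation.

```csharp
using System;
using System.Diagnostics;

class MemoryCounters
{
    static void Main()
    {
        using (Process proc = Process.GetCurrentProcess())
        {
            // Physical memory currently assigned to the process; this can shrink
            // when the OS trims or pages memory out, and grow sharply under load.
            long workingSetMb = proc.WorkingSet64 / (1024 * 1024);

            // Memory committed to the process, whether resident or paged out;
            // this is the counter that corresponds to "private bytes".
            long privateBytesMb = proc.PrivateMemorySize64 / (1024 * 1024);

            Console.WriteLine($"Working set:   {workingSetMb} MB");
            Console.WriteLine($"Private bytes: {privateBytesMb} MB");
        }
    }
}
```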

@andrew-vdb
Author

Chart from "metrics",
image

image

Chart from "log analytics" 24 hours
image

Chart from "log analytics" 48 hours
image

  • The system is not really under heavy load.
  • The charts from Metrics and Log Analytics produce different values?
    I trust Metrics; when it shows 0, that is when I switched the Azure Function off.
  • Is this "spike" a "normal" thing?

I have already optimized the function that may be causing the spike (chunked upload to MS Graph),
but it still does not fix the issue.

The next thing I will do is optimize the durable function replay; see the sketch below.
Some people report that the history object is not cleaned up:
Azure/azure-functions-durable-extension#340
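
As a hedged illustration of the replay concern mentioned above (not code from this thread), the sketch below shows an orchestrator that passes small references such as blob names instead of file contents; every orchestration input and activity result is persisted to the history table and loaded back into memory on each replay, so keeping those payloads small keeps replay memory down. The function and activity names are hypothetical and assume the Durable Functions 1.x C# API.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;

public static class ProcessFilesOrchestrator
{
    [FunctionName("ProcessFilesOrchestrator")]
    public static async Task Run(
        [OrchestrationTrigger] DurableOrchestrationContext context)
    {
        // Inputs and activity results are written to the orchestration history
        // and re-deserialized on every replay, so keep them small:
        // pass blob names rather than the file bytes themselves.
        string[] blobNames = context.GetInput<string[]>();

        foreach (string blobName in blobNames)
        {
            // Hypothetical activity; it downloads and processes the blob itself
            // and returns only a short status string.
            await context.CallActivityAsync<string>("ProcessSingleFile", blobName);
        }
    }
}
```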

@andrew-vdb
Author

Memory dump
image

image

image

@paulbatum
Member

@andrew-vdb Thanks for sharing those screenshots. 110 MB for RegexInterpreter definitely seems suspicious. What is even more suspicious is the fact that your last screenshot does not show any of your code, or Functions runtime code, referencing the RegexInterpreter.

In contrast, it should look more like this:

image

@paulbatum
Member

paulbatum commented Oct 16, 2019

Here's something you could try - select the large RegexInterpreter and then go to the referenced objects view:

image

See if the large amount of memory usage can be attributed to a particular type. Maybe there is a really large string? My screenshot above shows a 90K string. If you see a large string, you can inspect it. This might give you a clue of what is creating this memory usage.

@andrew-vdb
Author

image

It keeps saying "internal error in the expression evaluator".
I use Visual Studio 2017; I have already googled this, and changing the debugging options does not fix anything. Is there any other tool I can use to inspect the value?
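
One alternative to the Visual Studio evaluator (besides WinDbg, which is used in the next comment) is a short script built on the Microsoft.Diagnostics.Runtime (ClrMD) NuGet package. The sketch below is only an assumption-laden example, written against the ClrMD 1.x API for a full .NET Framework dump, and simply lists the largest System.String instances on the managed heap.

```csharp
using System;
using System.Linq;
using Microsoft.Diagnostics.Runtime;

class DumpStrings
{
    static void Main(string[] args)
    {
        // args[0] is the path to the memory dump file.
        using (DataTarget target = DataTarget.LoadCrashDump(args[0]))
        {
            ClrRuntime runtime = target.ClrVersions[0].CreateRuntime();
            ClrHeap heap = runtime.Heap;

            // Walk the heap and report the ten largest strings with a preview
            // of their contents.
            var largestStrings = heap.EnumerateObjectAddresses()
                .Select(addr => new { Address = addr, Type = heap.GetObjectType(addr) })
                .Where(o => o.Type != null && o.Type.IsString)
                .Select(o => new
                {
                    o.Address,
                    Size = o.Type.GetSize(o.Address),
                    Value = (string)o.Type.GetValue(o.Address)
                })
                .OrderByDescending(o => o.Size)
                .Take(10);

            foreach (var s in largestStrings)
            {
                string preview = s.Value.Length > 80 ? s.Value.Substring(0, 80) : s.Value;
                Console.WriteLine($"{s.Address:x} {s.Size} bytes: {preview}");
            }
        }
    }
}
```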

@andrew-vdb
Author

Using WinDbg:
image

I see some string values; I will try to fix that part.

@andrew-vdb
Author

I think I found the culprit. I will wait until my tester confirms it for sure. Thanks for your help @paulbatum!

@paulbatum
Member

@andrew-vdb glad to hear it!
