Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime memory leak using Actor timers #6517

Closed
alicejgibbons opened this issue Jun 14, 2023 · 11 comments · Fixed by #6523
Closed

Runtime memory leak using Actor timers #6517

alicejgibbons opened this issue Jun 14, 2023 · 11 comments · Fixed by #6523
Assignees
Labels
kind/bug Something isn't working
Milestone

Comments

@alicejgibbons
Copy link
Contributor

In what area(s)?

/area runtime

What version of Dapr?

1.10.7

Expected Behavior

Actor timers should not put pressure on the Dapr sidecar.

Actual Behavior

The actor app's Dapr sidecar continuously grows in memory (see graph below at 23.5 GiB) until it is eventually killed. There is only one actor per app with a single timer however the RegisterTimerAsync method is called periodically with the same actor name in order to achieve the effect of a timer with a "floating window".

image

Steps to Reproduce the Problem

Instantiate an actor and then continuously call RegisterTimerAsync method with the actor name to watch how its memory behaves.

Release Note

RELEASE NOTE: FIX Runtime Memory leak with Actor reminders.

@alicejgibbons alicejgibbons added the kind/bug Something isn't working label Jun 14, 2023
@yaron2 yaron2 added this to the v1.12 milestone Jun 14, 2023
@ItalyPaleAle
Copy link
Contributor

Can you confirm if this is about timers or reminders?

The title mentions reminders, and #6503 is about workflows so reminders.

However, in the text above you mention timers.

Also, what's the delay set in the timers?

@artursouza
Copy link
Member

This is for timers, every time a message arrives (randomly) the timer is recreated with a 15min trigger. I suggest we create a test where a single timer is recreated "forever" and verify the sidecar never crashes due to out-of-memory from K8s limit.

@alicejgibbons alicejgibbons changed the title Runtime memory leak using Actor reminders Runtime memory leak using Actor timers Jun 14, 2023
@yaron2
Copy link
Member

yaron2 commented Jun 14, 2023

Thanks for the clarification, we'll track this separately

@ItalyPaleAle
Copy link
Contributor

@alicejgibbons could you please share the code used to repro this issue?

Also what do you mean with continuously here?

Instantiate an actor and then continuously call RegisterTimerAsync method with the actor name to watch how its memory behaves.

@artursouza
Copy link
Member

artursouza commented Jun 15, 2023

I have reproduced an issue that might be related to this. It basically re-registers the same timer over and over and the number of go-routines grows forever - seems to be one per request. Memory did not seem to grow at the same rate, but I cannot discard this is the same issue yet. Repro steps:

Run the app in dapr/dapr:

dapr run --dapr-http-port=3500 --metrics-port=9090  --app-port=3000 -- bash -c "cd tests/apps/actorfeatures && go run app.go"

Register same timer forever:

curl --location 'http://localhost:3500/v1.0/actors/testactorfeatures/1234/method/SayHello' --header 'Content-Type: application/json' --data '{
    "interval": 10,
    "feature": "hello world"
}'


while [[ true ]]; do curl --location --request PUT 'http://localhost:3500/v1.0/actors/testactorfeatures/1234/timers/SetTimer' --header 'Content-Type: application/json' --data '{
  "dueTime":"0h0m0s0ms",
  "period":"0h0m9s0ms"
}'; done

Then, fetch the number of go routines over time and see the number quickly grow to 10K:

curl http://localhost:9090/metrics | grep go_goroutines | grep -v \# | cut -d\  -f2

Used 1.11 runtime.

@ItalyPaleAle
Copy link
Contributor

@artursouza that is actually expected.

Your timers have no TTL, so they are repeated forever. This means that for each call, you are adding new ones.

The current implementation for timers (and in this case, for reminders too), creates a goroutine for each timer (and reminder) that is just waiting for the timer/reminder to fire. The goroutine ends only when the timer ends.

I started working on a fix for this, but more urgent things came up and I never got to complete that: #6040

@artursouza
Copy link
Member

Member

It might be expected as per implementation, but users expect a timer to override the previous one when using the same name.

@ItalyPaleAle
Copy link
Contributor

Oh, I missed the part where the name is the same. Agree there for sure.

@ItalyPaleAle
Copy link
Contributor

I think I have a fix in #6523

@ItalyPaleAle ItalyPaleAle self-assigned this Jun 15, 2023
@berndverst
Copy link
Member

@ItalyPaleAle do we need to cherry pick #6523 into 1.11 and 1.10?

@ItalyPaleAle
Copy link
Contributor

@berndverst yes, when it's merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants