
Error R14 (Memory quota exceeded) on _web_ dyno #2377

Closed
eddierubeiz opened this issue Sep 26, 2023 · 10 comments

eddierubeiz (Contributor) commented Sep 26, 2023

The first errors occurred over the weekend of Sept 23-24. This issue is to track these errors.

@jrochkind jrochkind changed the title Error R14 (Memory quota exceeded) over the weekend of Sept 23-24. Error R14 (Memory quota exceeded) on _web_ dyno over the weekend of Sept 23-24. Sep 26, 2023
jrochkind (Contributor) commented Sep 26, 2023

We get R14 errors in scheduler/worker all the time, which is also annoying, but that's a different problem.

Our web dyno actually has a lot of RAM, so this is disturbing and unfortunate. But I think it would take a lot of investigation to have anything at all to say about it (and probably still nothing conclusive), and I don't at present plan to spend today doing that.

Perhaps just hope it doesn't happen again?

One easy thing we can do is try implementing the jemalloc patch again; some people find it reduces Rails memory consumption, including on Heroku (need to google for references/links on this subject).

eddierubeiz (Contributor, Author) commented:

Jemalloc work from last fall: #1797

eddierubeiz (Contributor, Author) commented Sep 26, 2023

[Screenshot: recent web dyno memory usage, from the web dyno memory use dashboard]

Roughly speaking, memory usage was hovering between 2 and 2.5 GB for most of Thursday through Sunday of last week.

@jrochkind jrochkind added the 1-2-days 1-2 day of developer time estimate label Sep 27, 2023
jrochkind (Contributor) commented Oct 2, 2023

Happened again Monday Oct 2 between 7am and 9am.

It would be interesting to see if we had more traffic than usual then. Our rate-limiter didn't seem to catch anything during that period.

Nothing obvious in Heroku metrics either.

jrochkind (Contributor) commented:

Our Heroku production app runs on a single performance-m dyno, which has 2.5GB of RAM.

We currently run 5 puma processes on it, each with 5 threads.

So with 2.5GB split across 5 processes, I guess our puma processes have edged over an average of 512MB each, at least in some cases.

We could first try jemalloc.

We could reduce to 4 puma processes, reducing our capacity by 20%.

Or we could upgrade our dyno; however, the next step up is a performance-l dyno, at an additional $250/month. That would give us significantly more capacity: 14GB of RAM instead of 2.5GB, and around 4x the CPU capacity.
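
For reference, here's a minimal sketch of how the 5-worker / 5-thread setup is usually expressed in config/puma.rb, and where a one-variable change would drop us to 4 workers. (WEB_CONCURRENCY and RAILS_MAX_THREADS are the common Rails/puma conventions, assumed here rather than taken from our actual config.)

```ruby
# config/puma.rb -- sketch only; env var names are assumptions, not from our repo.

# 5 worker processes today; setting WEB_CONCURRENCY=4 would cut capacity ~20%
# and remove one process's share of the 2.5GB dyno.
workers ENV.fetch("WEB_CONCURRENCY", 5).to_i

# 5 threads per worker process.
threads_count = ENV.fetch("RAILS_MAX_THREADS", 5).to_i
threads threads_count, threads_count

# Load the app before forking so workers share copy-on-write memory pages.
preload_app!
```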

jrochkind (Contributor) commented Oct 2, 2023

R14's continued this morning, Monday Oct 2, past 9am as well, and may be ongoing.

I'm going to try deploying the jemalloc thing....

heroku buildpacks:add --index 1 https://github.com/gaffneyc/heroku-buildpack-jemalloc.git -r production
heroku config:set JEMALLOC_ENABLED=true -r production

Then deploy. Let's see how the graph looks after this.
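
If useful for monitoring: as far as I understand, the gaffneyc buildpack works by preloading libjemalloc via LD_PRELOAD when JEMALLOC_ENABLED is true, so a quick sanity check from a one-off dyno (e.g. `heroku run rails console -r production`) could be something like:

```ruby
# Sketch of a sanity check -- assumes the buildpack's LD_PRELOAD mechanism.
# If jemalloc is active, LD_PRELOAD should mention libjemalloc.
puts ENV.fetch("LD_PRELOAD", "(not set -- jemalloc probably not preloaded)")
```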

If we keep this, make sure to document it in the wiki on Heroku config.

@jrochkind jrochkind self-assigned this Oct 2, 2023
@jrochkind jrochkind changed the title Error R14 (Memory quota exceeded) on _web_ dyno over the weekend of Sept 23-24. Error R14 (Memory quota exceeded) on _web_ dyno Oct 2, 2023
eddierubeiz (Contributor, Author) commented:

Not sure if this is helpful, but Scout does have some memory-related diagnostics.

jrochkind (Contributor) commented:

The last R14 error today was at 10:53:33AM Oct 2. So it's possible our jemalloc change improved things; or it may just have been the dyno restart that happened when we deployed.

We'll have to keep an eye on things; I'll leave this ticket open as "in progress" for a bit to remind us to monitor.

jrochkind (Contributor) commented:

The past week with jemalloc does seem to have saved 300-500MB of RAM on average -- and there have been no R14 errors in that time.

We're going to say this is a real improvement, sufficient for now, and close this ticket. It can be re-opened (or a new ticket filed) if the problem recurs.
