Processes reading from cache blocked by generational gc process #197

szajbus · 2023-03-13T21:10:29Z

We use nebulex to cache responses to web requests, with Nebulex.Adapters.Local and the :ets backend.

We observed periodic spikes in response times for a small number of requests. The spikes correlated in time with generational garbage collection.

I believe the problem has a similar root cause to #121, i.e. a race condition: gc may be started while there are still processes accessing the ets table.

In #121 this was mitigated by delaying the deletion of the ets table (so that other processes can still access it) and removing all of its data instead.

However, :ets.delete_all_objects/1 is an "atomic and isolated" operation, which means that the same processes that used to crash before #121 will now need to wait until this operation finishes.

In our case, with a cache of almost 1GB, deleting all objects takes > 1 second, which unfortunately is a noticeable problem.
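To check how long the flush takes on a given machine, a rough timing sketch like the one below may help. `FlushTimingSketch`, the table name, and the 100-byte placeholder values are hypothetical and not part of Nebulex. The real cost depends on the size and shape of the cached data.

```elixir
defmodule FlushTimingSketch do
  # Hypothetical helper, not part of Nebulex: fills a throwaway ets table
  # and measures how long :ets.delete_all_objects/1 takes to empty it.
  def time_flush(entries) when is_integer(entries) and entries > 0 do
    table = :ets.new(:flush_timing_sketch, [:set, :public, read_concurrency: true])

    # Insert placeholder entries; the 100-byte values are an arbitrary choice.
    Enum.each(1..entries, fn i ->
      :ets.insert(table, {i, :binary.copy(<<0>>, 100)})
    end)

    # :timer.tc/1 returns {elapsed_microseconds, result_of_fun}.
    {micros, true} = :timer.tc(fn -> :ets.delete_all_objects(table) end)

    remaining = :ets.info(table, :size)
    :ets.delete(table)
    {micros, remaining}
  end
end
```

Calling `FlushTimingSketch.time_flush(1_000_000)` returns the elapsed time in microseconds together with the number of entries left in the table after the flush.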

I'm happy to submit a PR with a solution, but I'm not sure what would be the best approach, some options:

- instead of calling :ets.delete_all_objects/1 immediately, schedule it to happen after some time (perhaps configurable), when a race condition is highly unlikely
- don't call :ets.delete_all_objects/1 at all and wait until the next gc run, which simply deletes the backend table
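A minimal sketch of the first option, assuming the retired table is `:public` (or owned by the scheduling process) so that it can be flushed from there. `FlushLaterSketch`, `retire/2`, and the `:flush_delay` option are hypothetical names used for illustration only; this is not how Nebulex implements it.

```elixir
defmodule FlushLaterSketch do
  # Hypothetical process that flushes a retired ets table after a delay
  # instead of immediately, so in-flight readers are unlikely to be blocked.
  use GenServer

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)

  # Marks `table` as retired; its data is removed after the configured delay.
  def retire(server, table), do: GenServer.cast(server, {:retire, table})

  @impl true
  def init(opts) do
    # :flush_delay (milliseconds) is an assumed, configurable option.
    {:ok, %{flush_delay: Keyword.get(opts, :flush_delay, 10_000)}}
  end

  @impl true
  def handle_cast({:retire, table}, %{flush_delay: delay} = state) do
    Process.send_after(self(), {:flush, table}, delay)
    {:noreply, state}
  end

  @impl true
  def handle_info({:flush, table}, state) do
    # The table may already have been deleted by a later gc run.
    if :ets.info(table) != :undefined do
      :ets.delete_all_objects(table)
    end

    {:noreply, state}
  end
end
```

With this shape, the memory of the retired generation is released after `:flush_delay` milliseconds rather than at the next gc run, at the cost of holding the old data a little longer.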


cabol · 2023-03-22T19:59:54Z

This is a bit tricky. I agree we should avoid calling :ets.delete_all_objects/1, and regarding the options, I'd prefer the first one: schedule it to happen after some time (perhaps configurable). The problem with the other one is that the GC may take some time to run again, depending on the configured value, which means memory is not released when it is supposed to be. So the first option seems better to me, but since we schedule the deletion, perhaps we can use :ets.delete/1 instead, which would be much better.

szajbus mentioned this issue Mar 13, 2023

Release space asynchronously when removing older generation #196

Closed

szajbus mentioned this issue Jul 14, 2023

Delay flushing ets table to avoid blocking processes using it #210

Merged

cabol closed this as completed in #210 Jul 14, 2023
