Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ERL-560: VM not yielding on long ets:inserts #4205

Closed
OTP-Maintainer opened this issue Feb 5, 2018 · 4 comments
Closed

ERL-560: VM not yielding on long ets:inserts #4205

OTP-Maintainer opened this issue Feb 5, 2018 · 4 comments
Assignees
Labels
bug Issue is reported as a bug priority:medium team:VM Assigned to OTP team VM
Milestone

Comments

@OTP-Maintainer
Copy link

Original reporter: yetihehe
Affected version: OTP-19.1.1
Fixed in version: OTP-23.0
Component: stdlib
Migrated from: https://bugs.erlang.org/browse/ERL-560


When inserting data into bag table, when there is many objects with the same key, whole VM is blocked for many seconds. VM uses 100% of one core and is not responsive, no other process can run. How to replicate:

{code:erlang}
ets:new(speed_test,[bag,named_table]).
List=[ {test,X} || X <- lists:seq(0,120000) ].
timer:tc(fun() -> ets:insert(speed_test,List) end).
{code}

ets:insert will block whole VM for about minute. Temporary fix - inserting records one by one with erlang:yield() after each one.
@OTP-Maintainer
Copy link
Author

lukas said:

The (somewhat) problematic thing about fixing this is this guarantee given in the manual "The entire operation is guaranteed to be atomic and isolated, even when a list of objects is inserted.".

@OTP-Maintainer
Copy link
Author

yetihehe said:

This would not be a problem if inserting duplicated keys didn't take 1500x as long as normal insert. When you insert different keys at once, it only takes tens of milliseconds. AFAIK it's because vm has to update list for that bucket for each element, which takes more time for each next element (for each object there is linear scan over all previous objects and then insert, it's a painter's algorithm).

@OTP-Maintainer
Copy link
Author

sverker said:

This is two separate problems:
# Bags are bad at handling lots of object with identical keys. Basically every operation you do will be linear scanning O(N) where N is number of objects with the key. This is a big job to fix, as it requires an extension of the bag design with some sort of secondary lookup index. *Workaround*: Don't put lots of identical key in your bags.
# ets:insert promise atomicity and does not yield. We have an idea of how to fix this that may make it into OTP-21. *Workaround*: Don't pass long lists to ets:insert, especially if it's just of convenience and you don't really need the atomicity.

When these two problems are combined, like in your example, you get a quadratic O(N*N) blocking time of the scheduler.

I will consider this report addressing the ets:insert yielding issue.

@OTP-Maintainer
Copy link
Author

sverker said:

In master branch planned for OTP-23 is now  an implementation that makes {{ets:insert/2}} and {{ets:insert_new/2}} yield the scheduler for long lists of records to insert.

The problem of quadratic time for insert of records with identical keys into bags is still unsolved. But at least it will yield and no longer block the scheduler.

 

@OTP-Maintainer OTP-Maintainer added bug Issue is reported as a bug team:VM Assigned to OTP team VM priority:medium labels Feb 10, 2021
@OTP-Maintainer OTP-Maintainer added this to the OTP-23.0 milestone Feb 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue is reported as a bug priority:medium team:VM Assigned to OTP team VM
Projects
None yet
Development

No branches or pull requests

2 participants