New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
vinyl: Don't bloat tx manager with rows when doing space:pairs() #2534
Comments
Blocked by #2700 |
alyapunov
added a commit
that referenced
this issue
Aug 21, 2017
RO TXs was removed by 9c57ed6. That caused uncontrolled memory overusage by simple space:pairs. Reimplement back read-only transactions. Send select/get/pairs to read view if read-write TX was not explicitly started before. Unfortunatelly the patch disables cache filling by autocommit TXs. Fixes #2534
locker
added a commit
that referenced
this issue
Dec 14, 2017
Currently, if tx is not NULL, tx->read_view is used, otherwise the global read view is used. For the sake of #2534 (read view for read only autocommit statements), we will need to pass an arbitrary read view to this function. So let's add the corresponding argument. This also allows us to drop env from the argument list. While we are at it, let's also inline vy_index_full_by_stmt() as it is a trivial wrapper around vy_index_get(). Needed for #2534
locker
added a commit
that referenced
this issue
Dec 14, 2017
Every iteration over a secondary index tracks a point in the transaction manager (due to lookup in the primary index). As a result, if the user calls 'select' or 'pairs' over a huge data set, it will consume a lot of memory due to this tracked points, even if the user doesn't uses transactions. To mitigate this, let's send all read only transactions to read view immediately so that tracking is disabled completely during iteration. Closes #2534
kostja
pushed a commit
that referenced
this issue
Dec 19, 2017
Currently, if tx is not NULL, tx->read_view is used, otherwise the global read view is used. For the sake of #2534 (read view for read only autocommit statements), we will need to pass an arbitrary read view to this function. So let's add the corresponding argument. This also allows us to drop env from the argument list. While we are at it, let's also inline vy_index_full_by_stmt() as it is a trivial wrapper around vy_index_get(). Needed for #2534
locker
added a commit
that referenced
this issue
Dec 19, 2017
Every iteration over a secondary index tracks a point in the transaction manager (due to lookup in the primary index). As a result, if the user calls 'select' or 'pairs' over a huge data set, it will consume a lot of memory due to this tracked points, even if the user doesn't uses transactions. To mitigate this, let's send all read only transactions to read view immediately so that tracking is disabled completely during iteration. Note, with this patch select() called outside a transaction doesn't populate the cache any more, but it seems to be OK as caching large select() requests results in cache thrashing. Closes #2534
locker
added a commit
that referenced
this issue
Dec 19, 2017
Every iteration over a secondary index tracks a point in the transaction manager (due to lookup in the primary index). As a result, if the user calls 'select' or 'pairs' over a huge data set, it will consume a lot of memory due to this tracked points, even if the user doesn't uses transactions. To mitigate this, let's send all read only transactions to read view immediately so that tracking is disabled completely during iteration. Note, with this patch select() called outside a transaction doesn't populate the cache any more, but it seems to be OK as caching large select() requests results in cache thrashing. Closes #2534
kostja
pushed a commit
that referenced
this issue
Dec 19, 2017
Every iteration over a secondary index tracks a point in the transaction manager (due to lookup in the primary index). As a result, if the user calls 'select' or 'pairs' over a huge data set, it will consume a lot of memory due to this tracked points, even if the user doesn't uses transactions. To mitigate this, let's send all read only transactions to read view immediately so that tracking is disabled completely during iteration. Note, with this patch select() called outside a transaction doesn't populate the cache any more, but it seems to be OK as caching large select() requests results in cache thrashing. Closes #2534
Fixed by a31c2c1 |
locker
added a commit
that referenced
this issue
Mar 7, 2018
When scanning a secondary index, we actually track each tuple in the transaction manager twice - as a part of the interval read from the secondary index and as a point in the primary index when retrieving the full tuple. This bloats the read set - instead of storing just one interval for a range request, we also store each tuple returned by it, which may count to thousands. There's no point in this extra tracking, because whenever we change a tuple in the primary index, we also update it in all secondary indexes. So let's remove it to save us some memory and cpu cycles. This is an alternative fix for #2534 It should also mitigate #3197
locker
added a commit
that referenced
this issue
Mar 7, 2018
When scanning a secondary index, we actually track each tuple in the transaction manager twice - as a part of the interval read from the secondary index and as a point in the primary index when retrieving the full tuple. This bloats the read set - instead of storing just one interval for a range request, we also store each tuple returned by it, which may count to thousands. There's no point in this extra tracking, because whenever we change a tuple in the primary index, we also update it in all secondary indexes. So let's remove it to save us some memory and cpu cycles. This is an alternative fix for #2534 It should also mitigate #3197
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Since the conflict manager still keeps every row, and doesn't have gaps, don't add every row read through a consistent read view to the conflict manager, otherwise you can easily run out of memory when iterating over a medium size space.
These rows are useless in the tx manager anyway.
The text was updated successfully, but these errors were encountered: