This repository has been archived by the owner on Dec 22, 2020. It is now read-only.

Batch inserts #68

Open
wants to merge 1 commit into base: master
Conversation

@macobo (Contributor) commented Sep 18, 2014

This pull request adds support for batching sequential INSERTs when tailing, speeding up tailing under favorable workloads while never being slower than the current state. See also issue #47.

r? @nelhage
cc @snoble
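For context, the speedup comes from turning N single-row INSERTs into one multi-row statement. A minimal, hypothetical sketch of building such a statement (not the PR's actual code; quoting and type handling are simplified for illustration):

```ruby
# Hypothetical helper: build one multi-row INSERT for a batch of buffered
# rows, instead of issuing one statement per row. Returns the SQL string
# plus the flattened parameter list for a parameterized execute.
def build_batch_insert(table, columns, rows)
  placeholders = rows.each_with_index.map do |row, i|
    "(" + row.each_index.map { |j| "$#{i * columns.size + j + 1}" }.join(", ") + ")"
  end.join(", ")
  sql = "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{placeholders}"
  [sql, rows.flatten]
end

sql, params = build_batch_insert("users", ["_id", "name"],
                                 [["1", "a"], ["2", "b"]])
# sql    => "INSERT INTO users (_id, name) VALUES ($1, $2), ($3, $4)"
# params => ["1", "a", "2", "b"]
```

One round trip and one statement parse for the whole batch is where the measured wins below come from.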

The basic strategy is to batch consecutive inserts together per namespace. A batch gets flushed whenever:

  • an update or delete hits the same namespace as the buffered inserts;
  • after streaming (up to) 1000 ops from the oplog, more than 5 seconds have passed since the last batch flush;
  • more than a threshold of inserts have accumulated in the namespace;
  • the program is exiting or streaming stops.
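The flush policy above can be sketched as a small buffering class. This is a hypothetical illustration, not the PR's code; `BATCH_LIMIT`, `FLUSH_INTERVAL`, and the class/method names are made up:

```ruby
BATCH_LIMIT    = 1000 # illustrative per-namespace size threshold
FLUSH_INTERVAL = 5    # seconds since last flush

class InsertBatcher
  def initialize(&flush)
    @buffers = Hash.new { |h, ns| h[ns] = [] }
    @flush = flush # callback receiving (namespace, rows)
    @last_flush = Time.now
  end

  # Buffer an insert; flush the namespace once it crosses the threshold.
  def insert(ns, row)
    @buffers[ns] << row
    flush_namespace(ns) if @buffers[ns].size >= BATCH_LIMIT
  end

  # Updates/deletes force the namespace's pending inserts out first,
  # so operations within a namespace stay ordered.
  def update_or_delete(ns)
    flush_namespace(ns)
  end

  # Called periodically from the tailing loop, e.g. after each stream(1000).
  def maybe_flush_all
    flush_all if Time.now - @last_flush > FLUSH_INTERVAL
  end

  # Also called on shutdown / when streaming stops.
  def flush_all
    @buffers.keys.each { |ns| flush_namespace(ns) }
    @last_flush = Time.now
  end

  private

  def flush_namespace(ns)
    rows = @buffers.delete(ns)
    @flush.call(ns, rows) if rows && !rows.empty?
  end
end
```

Keyed buffering per namespace is what lets an update to one collection flush only that collection's inserts while other batches keep growing.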

Some handwavy measurements for tailing 20000 oplog entries:

  • Alternating inserts and updates: roughly the same speed as current master (~350s on my local machine)
  • 10 inserts per update: ~4.6x faster (76s on my local machine)
  • 20 inserts per update: ~7.4x faster (47s)
  • 50 inserts per update: ~11x faster (32s)
  • 1000 inserts per update: ~31x faster (~11.1s, though probably running into measurement overhead here)

Notes on potential future work (that I may or may not work on soonish):

The next low-hanging performance fruit after this would be optimizing updates, though that wouldn't have as large an effect.

Some ideas on how this could be done: $set entries in the oplog can be translated directly into Postgres queries that update only the columns mentioned. Updates without $set can replace the current row in Postgres with the data in the oplog entry. The tricky part is figuring out if/how this applies to TokuMX even after mongoriver does oplog entry translation (if it supports any other $ operations), and handling $unset.
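The $set idea above can be sketched as a translation from a $set document to a column-targeted UPDATE. Hypothetical helper, not the PR's code; it assumes field names map one-to-one to column names and skips quoting:

```ruby
# Hypothetical sketch: translate an oplog $set document into a parameterized
# UPDATE touching only the mentioned columns, keyed by _id.
def set_to_update(table, id, set_fields)
  assignments = set_fields.keys.each_with_index
                           .map { |col, i| "#{col} = $#{i + 2}" }
                           .join(", ")
  sql = "UPDATE #{table} SET #{assignments} WHERE _id = $1"
  [sql, [id] + set_fields.values]
end

sql, params = set_to_update("users", "abc", { "name" => "x", "age" => 3 })
# sql    => "UPDATE users SET name = $2, age = $3 WHERE _id = $1"
# params => ["abc", "x", 3]
```

This avoids re-serializing the whole document when the oplog already tells us exactly which columns changed.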

Another performance improvement would be to run multiple tailers, in separate threads or processes, split by namespace. This would however require keeping multiple tailing states in the database (one per namespace), and I'm not quite sure what the performance implications are for Mongo of querying the same oplog (with filters?) from multiple processes.

@@ -170,14 +174,31 @@ def optail
if tail_from.is_a? Time
tail_from = tailer.most_recent_position(tail_from)
end

last_batch_insert = Time.now
tailer.tail(:from => tail_from)
until @done
tailer.stream(1000) do |op|
Contributor:
I haven't done the digging to confirm this, but I'm pretty sure the contract on Tailer by default is that once your block returns, the op is considered to have been handled, and the timestamp may be persisted to postgres. However, with batched inserts, we haven't actually processed the op until we've flushed the inserts, so this could result in data loss if we save a timestamp before flushing the inserts.

mongoriver does have a batch mode, which allows you to explicitly mark batches and tell mongoriver when you're done with a batch. Unfortunately I've forgotten the details, so you'll probably have to source-dive :(

Contributor (Author):
Good point!
My original assumption was that if the process is told to stop, it would flush via the signal handler. In hindsight you're right: if something catastrophic happens, the buffered data would not get flushed, resulting in data loss.

Contributor:

Yeah, we can't assume we'll get to shut down gracefully -- we need to handle the case where the machine dies, the program gets killed via SIGKILL, whatever.
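The ordering the review converges on can be sketched as: make the batched inserts durable first, persist the oplog timestamp second, so a crash at any point causes at worst re-processing, never data loss. Illustrative names only; this is not mongoriver's actual batch API:

```ruby
# Hypothetical checkpoint routine: flush pending batched inserts to
# Postgres BEFORE recording how far into the oplog we have read.
# If we crash between the two steps, we merely replay already-applied
# ops on restart; saving the timestamp first could silently drop them.
def checkpoint(batcher, tailer_state, timestamp)
  batcher.flush_all            # 1. inserts hit Postgres and are durable
  tailer_state.save(timestamp) # 2. only then advance the saved position
end
```

This assumes applying an oplog op twice is safe (idempotent upserts), which is the usual trade-off for at-least-once processing.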


Since the author has been inactive for a long time, I've created a branch based on this one with @nelhage's suggestions added: #137
I hope someone sees this.

@nelhage (Contributor) commented Sep 23, 2014

Modulo the concerns around making sure we don't update timestamps too early, I think this lgtm.

@barretod

Did you figure out how to address the concerns around timestamps? We really need this optimization in our environment.
