Optimize sequential insert into memtable - Part 2: Implementation #1449

yiwu-arbug · 2016-10-31T20:53:09Z

Summary:

Implement an insert hint in the skip-list to hint the insert position. This is
to optimize for write workloads with multiple streams of
sequential writes. For example, there is a stream of keys a1, a2,
a3... but also b1, b2, b3... Each stream is not necessarily strictly
sequential, but can get reordered a little bit. The user can specify a prefix
extractor, and the SkipListRep can thus maintain a hint for each
stream for fast insert into the memtable.
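
To illustrate the idea of one hint per key stream, here is a minimal sketch that uses `std::set::emplace_hint` as a stand-in for the skip-list hint. `HintedSet`, the one-character prefix extractor, and the hint map are assumptions made for illustration; this is not the `SkipListRep` code in this PR.

```cpp
#include <cassert>
#include <iterator>
#include <set>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a memtable: a sorted set plus one insert hint
// (an iterator) per key prefix. Not the RocksDB SkipListRep API.
class HintedSet {
 public:
  void Insert(const std::string& key) {
    // Assumed prefix extractor: the first character of the key.
    const std::string prefix = key.substr(0, 1);
    auto it = hints_.find(prefix);
    // Without a stored hint, end() is a valid (if unhelpful) hint.
    std::set<std::string>::iterator pos =
        (it == hints_.end()) ? keys_.end() : it->second;
    // emplace_hint treats pos only as a suggestion: the key always lands in
    // sorted order, and the insert is cheap when it belongs just before pos.
    auto inserted = keys_.emplace_hint(pos, key);
    // For a mostly ascending stream, the next key of this prefix most likely
    // belongs right after the one just inserted.
    hints_[prefix] = std::next(inserted);
  }

  const std::set<std::string>& keys() const { return keys_; }

 private:
  std::set<std::string> keys_;
  // std::set iterators stay valid across inserts, so they can be stored.
  std::unordered_map<std::string, std::set<std::string>::iterator> hints_;
};
```

Interleaving `a1, b1, a2, b2, ...` keeps each stream's hint close to its next insert position, and an out-of-order key such as `a0` is still placed correctly because the hint is advisory.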

This is the internal implementation part. See #1419 for the interface part.
See inline comments for details.

Test Plan:
See the new tests.

yiwu-arbug · 2016-10-31T20:54:14Z

cc @nbronson @mvm3k @al13n321

facebook-github-bot · 2016-10-31T21:20:29Z

@yiwu-arbug has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

siying · 2016-10-31T22:44:50Z

Will take a look. By the way, the Windows build failed.

yiwu-arbug · 2016-11-02T18:43:08Z

Discussed offline with @siying @IslamAbdelRahman @lightmark yesterday. Will try the following approach:

Only store a few levels of prev for binary search in a small range.
If prev is not helping, do a full binary search.
This way the code will be much simpler and take less memory overhead, and hopefully have comparable performance.

siying

I didn't finish reviewing it yet. Some comments I have so far.

siying · 2016-11-03T23:30:06Z

db/inlineskiplist.h

+template <class Comparator>
+void InlineSkipList<Comparator>::InsertWithHint(
+    const char* key, InsertHint** hint_ptr) {
+  hint_valid_.store(false, std::memory_order_relaxed);


Same as in InsertConcurrently, any further insert not using prev_ breaks the prev_ optimization in Insert().

siying · 2016-11-03T23:37:55Z

db/inlineskiplist.h

-    // NoBarrier_SetNext() suffices since we will add a barrier when
-    // we publish a pointer to "x" in prev[i].
-    x->NoBarrier_SetNext(i, prev_[i]->NoBarrier_Next(i));
-    prev_[i]->SetNext(i, x);


Can we avoid slowdown in existing Insert()? It does look like PushLast() is more expensive than SetNext(). Insert() is a very critical code path. Please make sure the performance doesn't regress.

In fact, I suggest we keep the Insert() logic as it is. Even the logic like hint_valid_ may cause performance regression.

PushLast() captures line 567-569 and line 581-585 in the existing code and they are doing exactly the same thing.

yiwu-arbug

Don't review the current code. I'll update the PR later today, which looks quite different than this version.

yiwu-arbug · 2016-11-04T00:33:17Z

db/inlineskiplist.h

-    // NoBarrier_SetNext() suffices since we will add a barrier when
-    // we publish a pointer to "x" in prev[i].
-    x->NoBarrier_SetNext(i, prev_[i]->NoBarrier_Next(i));
-    prev_[i]->SetNext(i, x);


PushLast() captures line 567-569 and line 581-585 in the existing code and they are doing exactly the same thing.

yiwu-arbug · 2016-11-04T00:34:40Z

db/inlineskiplist.h

+template <class Comparator>
+void InlineSkipList<Comparator>::InsertWithHint(
+    const char* key, InsertHint** hint_ptr) {
+  hint_valid_.store(false, std::memory_order_relaxed);


Same as in InsertConcurrently, any further insert not using prev_ breaks the prev_ optimization in Insert().

facebook-github-bot · 2016-11-08T02:35:42Z

@yiwu-arbug updated the pull request - view changes - changes since last import

yiwu-arbug · 2016-11-08T02:37:01Z

PR updated. Will update inline comment and prepare docs to explain the implementation.

siying · 2016-11-08T17:48:13Z

db/inlineskiplist.h

+    return false;
+  }
+  Node* next = n->NoBarrier_Next(level);
+  return next == nullptr || compare_(key, n->Next(level)->Key()) < 0;


Why not next->key()?

siying · 2016-11-08T18:34:55Z

db/inlineskiplist.h

+  Node* FindLessThan(const char* key, Node** prev, Node* top, int start_level,
+                     int stop_level) const;
+
+  void FindWithHint(


Add comments explaining what start_level and stop_level mean.

siying · 2016-11-08T18:51:50Z

db/inlineskiplist.h

+  if (p != nullptr && h > hint->num_levels) {
+    hint->prev[hint->num_levels] = p;
+    hint->prev_height[hint->num_levels] = h;
+    hint->num_levels++;


This function is too hard for me to understand. Is there a way we go with a simple solution that treats hint->prev the same way as prev_?

I'm getting ~40% fewer comparisons with this approach per my benchmark (inserting 5M keys into a skip-list; there are 10000 prefixes, and for each prefix keys are mostly inserted sequentially but can be reordered by up to 5 positions. Each prefix gets its own hint). The benchmark is probably in favor of the current approach, but the perf gain looks significant.

I'm not able to understand mathematically how there can be a 40% saving. There is only a 1/4 chance that prev[0] is more than one level. In this case, perhaps 6 extra comparisons will be made. For the other 3/4 of cases, the average is perhaps 8 comparisons.

In your benchmark, how many average comparisons are issued per insert?

The 40% gain is compared with a naive solution I ran earlier but didn't send a PR for. Comparing this version with the version I presented last week, there's a 20% gain, with average comparisons per insert being ~6.4 vs ~5.3.
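
For readers who want to reproduce a workload of the shape described in this thread, here is a hedged sketch of a key-stream generator. It is not the benchmark code used for the numbers above (that code is not part of this thread); `MakeStream`, the zero-padded key format, and the window-shuffle approach are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Builds one stream of keys for `prefix`: sequence numbers 0..n-1 in
// ascending order, then shuffled inside consecutive windows of `window`
// keys, so no key moves more than window-1 positions from its sorted slot.
std::vector<std::string> MakeStream(const std::string& prefix, std::size_t n,
                                    std::size_t window, std::mt19937& rng) {
  // Assumption: keys are zero-padded to 8 digits, so n must fit in 8 digits.
  assert(window > 0 && n <= 100000000);
  std::vector<std::string> keys;
  keys.reserve(n);
  for (std::size_t i = 0; i < n; ++i) {
    const std::string num = std::to_string(i);
    // Zero-pad so lexicographic order matches numeric order.
    keys.push_back(prefix + std::string(8 - num.size(), '0') + num);
  }
  for (std::size_t begin = 0; begin < n; begin += window) {
    const std::size_t end = std::min(n, begin + window);
    std::shuffle(keys.begin() + static_cast<std::ptrdiff_t>(begin),
                 keys.begin() + static_cast<std::ptrdiff_t>(end), rng);
  }
  return keys;
}
```

A window of 6 bounds the displacement at 5 positions. Interleaving the streams of many prefixes and counting comparator calls is left to the caller.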

siying · 2016-11-08T23:09:02Z

In terms of correctness, does it make sense to write a validation function to validate the skip list and use it in the unit test? It's very hard to prove correctness just by code review.

yiwu-arbug · 2016-11-08T23:59:27Z

@siying that's my plan. I wanted to send the PR before finishing the test to get early feedback.

facebook-github-bot · 2016-11-09T00:30:34Z

@yiwu-arbug updated the pull request - view changes - changes since last import

yiwu-arbug · 2016-11-09T00:33:17Z

Updated with inline comments, addressed comments, and fixed test failures.

Pending unit test. Will send benchmark code in a separate PR.

@siying If you want I can prepare a quip doc with better explanation.

siying · 2016-11-08T23:30:04Z

db/inlineskiplist.h

+      break;
+    }
+  }
+  if (level >= hint->num_levels) {


I'm confused here. How can level > hint->num_levels?

I mean if (level == hint->num_levels) here. Will update.

siying · 2016-11-09T00:05:50Z

db/inlineskiplist.h

+  if (level > stop_level) {
+    FindLessThan(key, hint->prev, hint->prev[level], level, stop_level);
+  }
+}


This function confused me. I see the function is used in two places and serves very different cases. Can we have two functions instead?

siying · 2016-11-09T00:43:59Z

db/inlineskiplist.h

+    }
+  }
+  if (level >= hint->num_levels) {
+    FindWithHint(key, hint, std::max<int>(level, height - 1), level);


This is too hard for me to understand. If I understand correctly, in this case we basically start from the root. Can we write specific code for it?

siying · 2016-11-09T00:46:19Z

db/inlineskiplist.h

+      hint_max_height = std::max<int>(hint_max_height, hint->prev_height[i]);
+    }
+    if (height > hint_max_height) {
+      FindWithHint(key, hint, height - 1, hint_max_height);


We are looking for prev from root, right? Can we write more specific code for that, rather than a general FindWithHint()?

In both cases where I call FindWithHint(), I'm looking for the lowest level where hint->prev is a valid prev, and search from that level. In the worst case it can search from root.

yiwu-arbug · 2016-11-09T07:17:05Z

@siying I added some inline comments which hopefully give a better explanation. Hope they help, or we can discuss offline.

siying

We should discuss offline about how we can make it easier to maintain.

siying · 2016-11-09T18:02:40Z

db/inlineskiplist.h

+  // [stop_level, start_level]. Using previous value of hint->prev to help
+  // speed-up the search.
+  void FindWithHint(const char* key, InsertHint* hint, int start_level,
+                    int stop_level) const;


It is still not clear to me what this function does after reading the comment. We can discuss offline.

Also maybe rename it to something like AdjustHintInPrev().

siying · 2016-11-09T18:28:37Z

db/inlineskiplist.h

+    while (current_level < height && current_level < hint->prev_height[i]) {
+      assert(KeyIsAfterNode(key, hint->prev[i]));
+      assert(!KeyIsAfterNode(key, hint->prev[i]->Next(current_level)));
+      x->InsertAfter(hint->prev[i], current_level);


One thing that made it hard for me to understand the code is the dual function of hint->prev. It is used as the location to insert the key here, but some part of it is also the position to insert the new entry.

If we can separate the two, using another local array for the position to insert, just like tmp in lines 828 to 835, it may be easier to understand.

yiwu-arbug · 2016-11-09T18:48:34Z

@siying I agree with you. After reading your comment, I think the upper half of prev together with FindWithHint() gives little benefit. I'm going to remove them and make it cleaner.

facebook-github-bot · 2016-11-10T00:26:04Z

@yiwu-arbug updated the pull request - view changes - changes since last import

yiwu-arbug · 2016-11-10T00:26:54Z

Removed FindWithHint() and related logic.

facebook-github-bot · 2016-11-10T02:55:43Z

@yiwu-arbug updated the pull request - view changes - changes since last import

yiwu-arbug · 2016-11-10T02:56:31Z

Make InsertHint struct public. Add unit tests to inlineskiplist_test.

yiwu-arbug · 2016-11-10T07:03:51Z

clang-format; Fix windows build.

facebook-github-bot · 2016-11-10T07:04:29Z

@yiwu-arbug updated the pull request - view changes - changes since last import

siying

It's much clearer to me. I don't have further comment for the code. A comment about the test validation.

siying · 2016-11-10T18:41:32Z

db/inlineskiplist_test.cc

+      iter.Next();
+    }
+    ASSERT_FALSE(iter.Valid());
+  }


This validation only validates the link of the next level is correct.

To validate that it is a valid skip list, we also need to validate that every level is in the correct order and that every node with a higher-level link is included in the lower levels too.

siying · 2016-11-10T18:42:58Z

db/inlineskiplist_test.cc

+  }
+  Validate(list);
+}
+


I also suggest we create a randomize test and validate the skip list after that.

Both InsertWithHint_MultipleHintsRandom and InsertWithHintAndWithoutHint have some randomness in them. Any other random test you want me to add?

It's fine then. Sorry I missed that.

yiwu-arbug

@siying thank you for reviewing the complex diff!

yiwu-arbug · 2016-11-10T19:31:38Z

db/inlineskiplist_test.cc

+  }
+  Validate(list);
+}
+


Both InsertWithHint_MultipleHintsRandom and InsertWithHintAndWithoutHint have some randomness in them. Any other random test you want me to add?

yiwu-arbug · 2016-11-10T19:31:54Z

db/inlineskiplist_test.cc

+      iter.Next();
+    }
+    ASSERT_FALSE(iter.Valid());
+  }


facebook-github-bot · 2016-11-12T01:18:16Z

@yiwu-arbug updated the pull request - view changes - changes since last import

yiwu-arbug · 2016-11-12T01:19:19Z

Tests validate the skiplist on all levels.

facebook-github-bot · 2016-11-12T01:20:47Z

@yiwu-arbug updated the pull request - view changes - changes since last import

siying · 2016-11-12T01:30:25Z

db/inlineskiplist.h

+    nodes[i] = head_;
+  }
+  while (nodes[0] != nullptr) {
+    nodes[0] = nodes[0]->Next(0);


Should assert nodes[0] is smaller than nodes[0]->Next(0).

siying · 2016-11-12T01:38:18Z

db/inlineskiplist.h

+  }
+  for (int i = 1; i < max_height; i++) {
+    assert(nodes[i]->Next(i) == nullptr);
+  }


I think you also need to verify all the levels in all the nodes are used. Otherwise, you may get something like this:

[ASCII diagram of a malformed skip list with links drawn at three levels; the layout was lost in extraction]

which will not get you the correct result.

Lines 918-920 verify it. I only advance nodes[i] when the node it links to appears on level 0. If nodes[i]->Next(i) is nullptr at the end, that means all nodes on level i do appear on the levels below i.
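
The cursor-per-level walk described in this thread can be sketched on a toy structure. This is not the validation code from `inlineskiplist.h`; the `Node` struct with an explicit `next` vector and the function `ValidateLevels` are hypothetical, and the sketch assumes no node links back to the head.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal node: next[i] is the successor on level i, and
// next.size() is the node's height. The head's key is never compared.
struct Node {
  int key;
  std::vector<Node*> next;
};

// Walks level 0 once while keeping one cursor per level. Returns true if
// level 0 is strictly increasing and every node linked on a level above 0
// is also reached on level 0, in the same order.
bool ValidateLevels(Node* head) {
  const std::size_t max_height = head->next.size();
  std::vector<Node*> cursor(max_height, head);
  while (cursor[0]->next[0] != nullptr) {
    Node* nxt = cursor[0]->next[0];
    // Level 0 must be strictly increasing (the head carries no key).
    if (cursor[0] != head && !(cursor[0]->key < nxt->key)) return false;
    cursor[0] = nxt;
    // Advance a higher-level cursor only when its link points at the node
    // that was just reached on level 0.
    for (std::size_t i = 1; i < max_height; ++i) {
      if (cursor[i]->next[i] == nxt) {
        if (nxt->next.size() <= i) return false;  // linked above its height
        cursor[i] = nxt;
      }
    }
  }
  // A leftover link on level i points at a node that the level-0 walk never
  // reached after the cursor, so the level is inconsistent with level 0.
  for (std::size_t i = 1; i < max_height; ++i) {
    if (cursor[i]->next[i] != nullptr) return false;
  }
  return true;
}
```

Because higher-level cursors only move forward along the level-0 walk, a link that points backwards or at a node missing from level 0 leaves its cursor stuck with a non-null successor, which the final loop reports.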

facebook-github-bot · 2016-11-12T01:48:12Z

@yiwu-arbug updated the pull request - view changes - changes since last import

yiwu-arbug · 2016-11-12T01:48:24Z

Assert nodes[0] < nodes[0]->Next().

siying

Good to go!

siying · 2016-11-12T02:13:29Z

db/inlineskiplist.h

+        //   * no other nodes less than prev[level-1] has height greater than
+        //     current_level, and prev[level-1] > key.
+        assert(KeyIsAfterNode(key, hint->prev[i]));
+        assert(!KeyIsAfterNode(key, hint->prev[i]->Next(current_level)));


I still feel we should turn those asserts to actual check, just to be safe. If the check fails, simply fall back to normal Insert().

I don't think safety alone is a good reason to turn the asserts into actual checks. Anything wrong in the skiplist will crash quite fatally with random tests. But if we can remove the requirement "keys with the same hint have to be consecutive", i.e. if we detect that keys don't follow the requirement, invalidate the hint and start over, then it makes sense.

Will work on it in a separate PR.
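
As a rough picture of what "check the hint, and fall back to a normal insert when it does not hold" could look like, here is a sketch on a sorted singly linked list standing in for level 0 of a skip list. The names `KeyIsAfterNode`, `FindLessThan`, and `InsertWithHint` echo the identifiers quoted in this thread, but the bodies below are simplified assumptions, not the code from this PR or the follow-up.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct ListNode {
  std::string key;
  ListNode* next;
};

// Sorted singly linked list with a dummy head (hypothetical stand-in).
class SortedList {
 public:
  ~SortedList() {
    for (ListNode* p = head_.next; p != nullptr;) {
      ListNode* dead = p;
      p = p->next;
      delete dead;
    }
  }

  // Inserts after *hint when the hint brackets the key; otherwise discards
  // the hint and searches from the head. *hint is updated to the new node.
  // Returns true if the hint was used.
  bool InsertWithHint(const std::string& key, ListNode** hint) {
    ListNode* prev = *hint;
    const bool hint_ok = prev != nullptr && KeyIsAfterNode(key, prev) &&
                         !KeyIsAfterNode(key, prev->next);
    if (!hint_ok) prev = FindLessThan(key);  // fallback: normal insert path
    ListNode* node = new ListNode{key, prev->next};
    prev->next = node;
    *hint = node;
    return hint_ok;
  }

  std::vector<std::string> Keys() const {
    std::vector<std::string> out;
    for (const ListNode* p = head_.next; p != nullptr; p = p->next) {
      out.push_back(p->key);
    }
    return out;
  }

 private:
  // True if key sorts strictly after n. The head precedes every key and a
  // null node follows every key.
  bool KeyIsAfterNode(const std::string& key, const ListNode* n) const {
    if (n == nullptr) return false;
    return n == &head_ || n->key < key;
  }

  // Last node whose key is strictly less than key (the head if none).
  ListNode* FindLessThan(const std::string& key) {
    ListNode* p = &head_;
    while (p->next != nullptr && p->next->key < key) p = p->next;
    return p;
  }

  ListNode head_{"", nullptr};
};
```

With this shape, a caller that breaks the "consecutive keys per hint" assumption pays for one full search and gets a fresh hint, instead of tripping an assert.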

facebook-github-bot · 2016-11-13T20:07:06Z

@yiwu-arbug updated the pull request - view changes - changes since last import

yiwu-arbug · 2016-11-13T20:07:27Z

Fix lint error.

yiwu-arbug assigned siying Oct 31, 2016

siying suggested changes Nov 3, 2016

yiwu-arbug commented Nov 4, 2016

yiwu-arbug force-pushed the insert_hint branch from b8a2774 to 6560d3f Compare November 8, 2016 02:35

siying reviewed Nov 8, 2016

yiwu-arbug force-pushed the insert_hint branch from 6560d3f to 0c4de23 Compare November 9, 2016 00:30

siying reviewed Nov 9, 2016

yiwu-arbug force-pushed the insert_hint branch from 0c4de23 to cb5edc3 Compare November 10, 2016 00:25

yiwu-arbug force-pushed the insert_hint branch from cb5edc3 to 984f09b Compare November 10, 2016 02:55

yiwu-arbug force-pushed the insert_hint branch from 984f09b to c4c7114 Compare November 10, 2016 07:03

siying reviewed Nov 10, 2016

yiwu-arbug commented Nov 10, 2016

yiwu-arbug force-pushed the insert_hint branch from c4c7114 to b1dea02 Compare November 12, 2016 01:18

yiwu-arbug force-pushed the insert_hint branch from b1dea02 to ae0d98d Compare November 12, 2016 01:20

yiwu-arbug mentioned this pull request Nov 12, 2016

Optimize sequential insert into memtable - Part 1: Interface #1419

Closed

siying reviewed Nov 12, 2016

yiwu-arbug force-pushed the insert_hint branch from ae0d98d to 391e8ec Compare November 12, 2016 01:47

siying approved these changes Nov 12, 2016

siying reviewed Nov 12, 2016

yiwu-arbug force-pushed the insert_hint branch from 391e8ec to 4169336 Compare November 13, 2016 20:06

facebook-github-bot closed this in df5eeb8 Nov 13, 2016

yiwu-arbug deleted the insert_hint branch November 13, 2016 23:02
