Bigtable doesn't have a magical prefix range helper? #1790

jgeewax · 2016-11-14T18:40:57Z

A common thing with Bigtable is the ability to scan based on a prefix, which is basically a range with start=prefix and end=someAlteration(prefix), but it's tricky to do manually.

Any chance we can add a prefix filter that "just works"?

Go does this a bit trickily (see https://github.com/GoogleCloudPlatform/google-cloud-go/blob/master/bigtable/bigtable.go#L307). The gist seems to be (for those not familiar with Go):

if the string is empty, the end range marker should just be the empty string
otherwise, find the last character in the string that can be incremented (i.e., abcde can increment position 4, abc\xff\xff can increment position 2)
using that position n as the pivot, take start.substring(0, n) and append the incremented character (start[n]+1)

I think -- in JS -- this would look something like....

var getPrefixEndRange = function(start) {
  var maxChar = String.fromCharCode(0xff);
  var position = start.length - 1;

  // Walk backwards until we get to a character we can increment.
  // Check the bounds first so we never index past the start of the string.
  while (position >= 0 && start[position] === maxChar) position--;

  // If the position is -1, there is no reasonable end range for the prefix.
  if (position === -1) return '';

  // Drop everything after the pivot and bump the pivot character by one.
  var nextChar = String.fromCharCode(start.charCodeAt(position) + 1);
  return start.substring(0, position) + nextChar;
};

Some test cases...

getPrefixEndRange('start'); // -> 'staru'
getPrefixEndRange('X' + String.fromCharCode(0xff)); // -> 'Y'
getPrefixEndRange('xoo' + String.fromCharCode(0xff)); // -> 'xop'
getPrefixEndRange('com.google.'); // -> 'com.google/'
getPrefixEndRange(String.fromCharCode(0xff)); // -> ''
getPrefixEndRange(''); // -> ''
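Since Bigtable row keys are byte strings, the same idea can also be written against bytes rather than JS characters. The following is only a sketch under assumptions: `prefixEndKey` is a hypothetical name (not part of any library), it assumes Node.js `Buffer`, and it assumes string prefixes should be encoded as UTF-8.

```javascript
// Hypothetical helper (not a library API): byte-wise variant of
// getPrefixEndRange. Accepts a string (encoded as UTF-8) or a Buffer.
function prefixEndKey(prefix) {
  // Buffer.from copies its input, so mutating `bytes` is safe.
  const bytes = Buffer.from(prefix);
  let position = bytes.length - 1;

  // Walk backwards past trailing 0xff bytes, which cannot be incremented.
  while (position >= 0 && bytes[position] === 0xff) position--;

  // Empty prefix or all 0xff: there is no finite end key.
  if (position === -1) return Buffer.alloc(0);

  // Keep everything up to and including the pivot, then bump the pivot byte.
  const end = bytes.subarray(0, position + 1);
  end[position] += 1;
  return end;
}

console.log(prefixEndKey('start').toString()); // 'staru'
console.log(prefixEndKey(Buffer.from([0x58, 0xff])).toString()); // 'Y'
console.log(prefixEndKey(Buffer.from([0xff])).length); // 0
```

Working on bytes avoids surprises when a prefix contains characters above 0xff, where incrementing a UTF-16 code unit does not correspond to incrementing a byte of the encoded key.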

callmehiphop · 2016-11-14T19:06:40Z

I can dig a prefix option. I might misunderstand, but couldn't we just use a row key filter?

Using the current implementation, I believe it would look similar to this:

table.getRows({
  filter: {
    key: /^start/
  }
}, function(err, rows) {});

mbrukman · 2016-11-14T19:50:34Z

Implementing a prefix row scan via a row key regex filter is inefficient, because it will scan the entire table but return only the rows that match. A prefix row scan is typically implemented by computing the exact (start, end) rows, so it will only scan the subset of the table that will actually be returned to the caller.
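To make the difference concrete, here is a toy model rather than anything Bigtable-specific: the "table" is just a sorted in-memory array of keys, and `scanWithRegex`, `scanRange`, and `lowerBound` are hypothetical names made up for this sketch. It counts how many keys each approach has to look at.

```javascript
// Toy model: row keys kept in lexicographic order, like a sorted table.
const keys = ['apple', 'start', 'start1', 'start2', 'stop', 'zebra'];

// Filter approach: every key is examined, matching or not.
function scanWithRegex(sortedKeys, regex) {
  let examined = 0;
  const rows = [];
  for (const key of sortedKeys) {
    examined++;
    if (regex.test(key)) rows.push(key);
  }
  return { rows, examined };
}

// Binary search for the index of the first key >= target.
function lowerBound(sortedKeys, target) {
  let lo = 0;
  let hi = sortedKeys.length;
  while (lo < hi) {
    const mid = (lo + hi) >>> 1;
    if (sortedKeys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Range approach: jump to `start`, read forward, stop at the first key >= end.
// An empty `end` means "no upper bound". Binary-search probes are not counted.
function scanRange(sortedKeys, start, end) {
  let examined = 0;
  const rows = [];
  for (let i = lowerBound(sortedKeys, start); i < sortedKeys.length; i++) {
    examined++;
    if (end !== '' && sortedKeys[i] >= end) break;
    rows.push(sortedKeys[i]);
  }
  return { rows, examined };
}

console.log(scanWithRegex(keys, /^start/).examined); // 6
console.log(scanRange(keys, 'start', 'staru').examined); // 4
```

Both calls return the same three rows; the range scan reads those rows plus the one key that tells it to stop, while the regex scan reads every key in the array.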

callmehiphop · 2016-11-14T19:55:36Z

Gotcha, ok, I'll get started on this then :)

jgeewax · 2016-11-15T11:18:41Z

Yes -- what @mbrukman said. We want to look at keys only, and stop when we get to "the next one". The little snippet I jotted down gets us "where to stop" (i.e., the first key in lexicographic order that no longer matches the prefix).
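Put differently, the end marker turns "starts with prefix" into a half-open range check, prefix <= key < end. A minimal sketch, where `inPrefixRange` is a hypothetical name used only for illustration and the keys are assumed to compare the way JS strings do:

```javascript
// Hypothetical predicate: is `key` inside the half-open range [prefix, end)?
// An empty `end` is treated as "no upper bound", as in the snippet earlier
// in this thread.
function inPrefixRange(key, prefix, end) {
  return key >= prefix && (end === '' || key < end);
}

const prefix = 'start';
const end = 'staru'; // the value listed for 'start' in the test cases

console.log(inPrefixRange('start', prefix, end)); // true
console.log(inPrefixRange('start' + String.fromCharCode(0xff), prefix, end)); // true
console.log(inPrefixRange('staru', prefix, end)); // false (the first key past the prefix)
console.log(inPrefixRange('stars', prefix, end)); // false (sorts before the prefix)
```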

jgeewax added the api: bigtable and enhancement labels Nov 14, 2016

callmehiphop mentioned this issue Nov 17, 2016

Bigtable prefix #1802

Merged

stephenplusplus closed this as completed in #1802 Nov 17, 2016

JustinBeckwith assigned stephenplusplus Feb 1, 2021
