Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bigtable doesn't have a magical prefix range helper? #1790

Closed
jgeewax opened this issue Nov 14, 2016 · 4 comments
Closed

Bigtable doesn't have a magical prefix range helper? #1790

jgeewax opened this issue Nov 14, 2016 · 4 comments
Assignees
Labels
api: bigtable Issues related to the Bigtable API.

Comments

@jgeewax
Copy link
Contributor

jgeewax commented Nov 14, 2016

A common thing with Bigtable is the ability to scan based on a prefix, which is basically a range with start=prefix and end=someAlteration(prefix), but its tricky to do manually.

Any chance we can add a prefix filter that "just works"?


Go does this a bit trickily (see https://github.com/GoogleCloudPlatform/google-cloud-go/blob/master/bigtable/bigtable.go#L307), the gist seems to be (for those not familiar with go):

  • if the string is empty, the end range marker should just be empty string
  • otherwise, find the last character in the string that can be incremented (ie, abcde can increment position 4, abc\xff\xff can increment position 2.
  • using that pivot, take start.substr(0, n-1) and append the incremented character (start[n]+1)

I think -- in JS -- this would look something like....

var getPrefixEndRange = function(start) {
  var maxChar = String.fromCharCode(0xff);
  var position = start.length-1;

  // Walk backwards until we get to a character we can increment.
  while (start[position] == maxChar && position >= 0) position--;

  // If the position is -1, there is no reasonable end range for the prefix.
  if (position == -1) return '';

  var nextChar = String.fromCharCode(start.charCodeAt(position)+1)
  return start.substring(0, position) + nextChar;
}

Some test cases...

getPrefixEndRange('start'); // -> 'staru'
getPrefixEndRange('X' + String.fromCharCode(0xff)); // -> 'Y'
getPrefixEndRange('xoo' + String.fromCharCode(0xff)); // -> 'xop'
getPrefixEndRange('com.google.'); // -> 'com.google/'
getPrefixEndRange(String.fromCharCode(0xff)); // -> ''
getPrefixEndRange(''); // -> ''
@jgeewax jgeewax added api: bigtable Issues related to the Bigtable API. enhancement labels Nov 14, 2016
@callmehiphop
Copy link
Contributor

I can dig a prefix option. I might misunderstand, but couldn't we just use a row key filter?

Using the current implementation, I believe it would look similar to this

table.getRows({
  filter: {
    key: /^start/
  }
}, function(err, rows) {});

@mbrukman
Copy link
Contributor

Implementing a prefix row scan via row key regex filter is inefficient, because it will scan the entire table, but return only the rows that match. A prefix row scan is typically implemented by computing the exact (start, end) rows, so it will only scan the subset of the table that will actually be returned back to the caller.

@callmehiphop
Copy link
Contributor

Gotcha, ok, I'll get started on this then :)

@jgeewax
Copy link
Contributor Author

jgeewax commented Nov 15, 2016

Yes -- what @mbrukman said. We want to look at keys only, and stop when we get to "the next one". The little snippet I jotted down gets us "where to stop" (ie, the first non-matching lexicographic key prefix)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigtable Issues related to the Bigtable API.
Projects
None yet
Development

No branches or pull requests

4 participants