Commit

feat: Add String type with Utf8Raw encoding to Bigtable API
Bigtable will allow users to configure the type of a column family with string type

PiperOrigin-RevId: 636631633
Google APIs authored and Copybara-Service committed May 23, 2024
1 parent 1834a96 commit 89a8364
Showing 1 changed file with 31 additions and 3 deletions.
34 changes: 31 additions & 3 deletions google/bigtable/admin/v2/types.proto
@@ -41,18 +41,18 @@ option ruby_package = "Google::Cloud::Bigtable::Admin::V2";
// * Natural sort: Does the encoded value sort consistently with the original
// typed value? Note that Bigtable will always sort data based on the raw
// encoded value, *not* the decoded type.
- // - Example: STRING values sort in the same order as their UTF-8 encodings.
+ // - Example: BYTES values sort in the same order as their raw encodings.
// - Counterexample: Encoding INT64 to a fixed-width STRING does *not*
// preserve sort order when dealing with negative numbers.
// INT64(1) > INT64(-1), but STRING("-00001") > STRING("00001").
- // - The overall encoding chain sorts naturally if *every* link does.
+ // - The overall encoding chain has this property if *every* link does.
// * Self-delimiting: If we concatenate two encoded values, can we always tell
// where the first one ends and the second one begins?
// - Example: If we encode INT64s to fixed-width STRINGs, the first value
// will always contain exactly N digits, possibly preceded by a sign.
// - Counterexample: If we concatenate two UTF-8 encoded STRINGs, we have
// no way to tell where the first one ends.
- // - The overall encoding chain is self-delimiting if *any* link is.
+ // - The overall encoding chain has this property if *any* link does.
// * Compatibility: Which other systems have matching encoding schemes? For
// example, does this encoding have a GoogleSQL equivalent? HBase? Java?
message Type {
@@ -78,6 +78,31 @@ message Type {
Encoding encoding = 1;
}

+ // String
+ // Values of type `String` are stored in `Value.string_value`.
+ message String {
+ // Rules used to convert to/from lower level types.
+ message Encoding {
+ // UTF-8 encoding
+ // * Natural sort? No (ASCII characters only)
+ // * Self-delimiting? No
+ // * Compatibility?
+ // - BigQuery Federation `TEXT` encoding
+ // - HBase `Bytes.toBytes`
+ // - Java `String#getBytes(StandardCharsets.UTF_8)`
+ message Utf8Raw {}
+
+ // Which encoding to use.
+ oneof encoding {
+ // Use `Utf8Raw` encoding.
+ Utf8Raw utf8_raw = 1;
+ }
+ }
+
+ // The encoding to use when converting to/from lower level types.
+ Encoding encoding = 1;
+ }
+
// Int64
// Values of type `Int64` are stored in `Value.int_value`.
message Int64 {
@@ -140,6 +165,9 @@ message Type {
// Bytes
Bytes bytes_type = 1;

+ // String
+ String string_type = 2;
+
// Int64
Int64 int64_type = 5;
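
The "natural sort" and "self-delimiting" properties described in the comments of this diff can be tried out with a small sketch. This is plain Python with no Bigtable client involved; `encode_utf8_raw`, `encode_fixed_width`, and `decode_first_fixed_width` are hypothetical helpers invented for illustration, and the 5-digit width is an assumption. Under byte-wise comparison `-` (0x2D) sorts before `0` (0x30), so the sketch uses a pair of negative values, where the order inversion is easy to observe.

```python
from typing import Tuple


def encode_utf8_raw(value: str) -> bytes:
    """Hypothetical stand-in for a raw UTF-8 encoding: just the UTF-8 bytes."""
    return value.encode("utf-8")


def encode_fixed_width(value: int, digits: int = 5) -> bytes:
    """Hypothetical fixed-width text encoding: optional '-' then `digits` digits."""
    sign = "-" if value < 0 else ""
    return (sign + str(abs(value)).zfill(digits)).encode("ascii")


# Natural sort: data is ordered by the raw encoded bytes, so compare the
# order of the typed values with the order of their encodings.
ints = [-2, -1, 1]
by_value = sorted(ints)                          # numeric order
by_bytes = sorted(ints, key=encode_fixed_width)  # byte-wise order of encodings
natural_sort_holds = by_value == by_bytes        # differs for the negative pair

# Self-delimiting: two different pairs of raw UTF-8 strings concatenate to the
# same bytes, so the boundary between the values cannot be recovered.
ambiguous = (
    encode_utf8_raw("ab") + encode_utf8_raw("c")
    == encode_utf8_raw("a") + encode_utf8_raw("bc")
)


def decode_first_fixed_width(buf: bytes, digits: int = 5) -> Tuple[int, bytes]:
    """Split the first fixed-width value off a concatenation of encodings."""
    # The value occupies exactly `digits` bytes, plus one if a sign is present.
    width = digits + 1 if buf[:1] == b"-" else digits
    return int(buf[:width].decode("ascii")), buf[width:]
```

Because the fixed-width form always has a known length, `decode_first_fixed_width` can peel values off a concatenated buffer one at a time, which is not possible for the raw UTF-8 form without an extra length prefix or terminator.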
