Key-Value store #5196

Closed
doublex opened this issue Mar 12, 2016 · 13 comments

Comments

@doublex

doublex commented Mar 12, 2016

Hello,
I would like to start some tests with this database.
Is it possible to use CockroachDB as key-value store?
If yes, is there any documentation?
Thanks a lot
Marcus

@doublex
Author

doublex commented Mar 13, 2016

A lot of companies use Riak or Cassandra (or HBase). A key-value interface similar to Riak (especially the 2i index) would allow clients to switch from Riak/Cassandra to CockroachDB.

@Frank-Jin

We have a similar requirement: we would like to replace Cassandra, if CockroachDB can support the same interface.

@doublex
Author

doublex commented Mar 14, 2016

We use "Riak". We don't need exactly the same interface - but a simple key/value API.
IMHO "CockroachDB" competes with "HBase", "Hypertable", "Riak", "Cassandra", ... - and not with "MySQL".

@vivekmenezes
Contributor

Thanks for your interest. You can create a table with two columns, Key and Value, and use it like a KV store if that is what you want. Perhaps you have a more complex schema, which can also be accommodated using SQL CREATE TABLE.

Thanks
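
A minimal sketch of the two-column approach described above. It uses Python's built-in sqlite3 module purely as a runnable stand-in for a SQL database; the table name `kv` and the helpers `put`, `get`, and `delete` are assumptions for illustration, and the upsert syntax will likely differ on CockroachDB.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a SQL backend (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)")

def put(key, value):
    # INSERT OR REPLACE is SQLite syntax; other databases spell upsert differently.
    conn.execute("INSERT OR REPLACE INTO kv (key, value) VALUES (?, ?)", (key, value))
    conn.commit()

def get(key):
    # Returns the stored value, or None when the key is absent.
    row = conn.execute("SELECT value FROM kv WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None

def delete(key):
    conn.execute("DELETE FROM kv WHERE key = ?", (key,))
    conn.commit()

put("john_smith", '{"userData":"data"}')
print(get("john_smith"))
```

The primary key on `key` is what makes point lookups cheap; everything else is an ordinary SQL table.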


@JackKrupansky

To some extent it depends on what you mean by key-value store. Sure, way down at the bottom of the storage layer there is a BigTable-like key-value store in Cassandra, Riak (via LevelDB), and CockroachDB (via RocksDB), but... in none of these systems is key-value the intended client-level API. Cassandra in particular goes to great efforts to offer an SQL-like data model (CQL). CockroachDB offers a true SQL data model (well... when JOIN is added for 1.0).

So, if your goal with key-value is key-value STORAGE, CockroachDB is already there.

But if you have some sort of idea that the client-level API is simply key-value pairs, neither Cassandra nor CockroachDB is focused on that, although as you have seen, you can always create an SQL table that has just a key and a value column - which will actually translate into a key-value pair down at the RocksDB storage level.

If you are using Cassandra today, what interface are you using? The current, more modern CQL and Java driver interface (or one of the other languages that DataStax offers drivers for), or the older, deprecated, and soon-to-be-removed Thrift interface? Or Astyanax? But in any case, there is no raw key-value interface for Cassandra - nor is there any need for one.

Riak - or Riak KV, as it is being renamed (in addition to Riak TS and Riak S2) - is more key-value oriented. As their web site says, "With a key/value design that delivers powerful – yet simple – data models for storing massive amounts of unstructured data..." That's the key (ha ha!) to their value (ha ha!) - unstructured data, while both Cassandra and CockroachDB are focused on structured data. Sure, you can store JSON data in a Cassandra or CockroachDB column as well, but that's not the primary focus of their data models. Cassandra has support for semi-structured data as well (explicit JSON support, collection data types including key/values with map, basic keyword text search with the new SASI feature, and DataStax Enterprise Search, which supports full Solr search, including full-text search).

That said, are there specific features of Riak (or Cassandra) that are missing from CockroachDB that would make it easier to migrate to CockroachDB - short of a fully-compatible API?

Cassandra in particular uses a binary native protocol to communicate from the client-level drivers to the nodes of a Cassandra cluster. Sure, a comparable, compatible API could be implemented for CockroachDB, but... I doubt that is likely in the near future - unless there is strong demand. Also, the Cassandra driver goes to great lengths to direct queries to the nodes that own the primary key being referenced by the query - to avoid an extra hop from the gateway node to the owning node, by using a hashed partition key, which CockroachDB does not have. Not that a smart CockroachDB client could not do something comparable, but it would not likely be exactly compatible with Cassandra and its client drivers.
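
To illustrate the routing idea in the paragraph above, here is a deliberately simplified sketch of how a client might map a hashed partition key to an owning node. The node names and the `owner` function are hypothetical, and MD5 with a modulo is a stand-in: a real Cassandra driver uses the cluster's partitioner and token ring rather than this scheme.

```python
import hashlib

# Hypothetical node names; a real driver discovers these from the cluster.
NODES = ["node-a", "node-b", "node-c"]

def owner(partition_key, nodes=NODES):
    # Hash the key to a stable integer token (MD5 is only a stand-in hash).
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    # Modulo placement is a simplification of a token ring.
    return nodes[token % len(nodes)]

print(owner("john_smith"))
```

Because the mapping is deterministic, every client computes the same owner for a given key and can send the request there directly instead of through an arbitrary gateway node.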

@doublex
Author

doublex commented Mar 14, 2016

Creating a table with two columns (key + value) is not enough for a true key-value database.
In Riak for example you can add a variable number of secondary indexes for a record, e.g.:

curl -XPOST localhost:8098/types/indexes/buckets/users/keys/john_smith \
  -H 'x-riak-index-twitter_bin: jsmith123' \
  -H 'x-riak-index-email_bin: jsmith@basho.com' \
  -H 'Content-Type: application/json' \
  -d '{"userData":"data"}'

This gives you all the freedom you need, for example if you want to index N-grams ("TESTER" -> "TES" "EST" "STE" "TER"). You don't know how many secondary keys you need when you create the table, simply because each record has a different number of secondary indexes (there are many such cases).
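
For reference, the N-gram split mentioned above can be sketched in a few lines of Python. The function name `ngrams` is an assumption for illustration; the number of index entries it yields depends on the length of each value, which is why the count is not known up front.

```python
def ngrams(text, n=3):
    """Return all contiguous substrings of length n, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

print(ngrams("TESTER"))  # ['TES', 'EST', 'STE', 'TER']
```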

@JackKrupansky

You make my point - there is no clear agreement on what constitutes a... "true key-value database." A key-value store is obvious, but if you try to extend the concept to databases, there is no obvious consensus. Again, are you talking about key-value as a storage-level concept or as a client-level data model? And... the topic title is simply "key-value store", with no hint as to true intent and focus.

To be clear, CockroachDB has no support for n-gram indexing today. Exactly what the full-text search features will look like remains to be seen - it is listed on the roadmap for 1.0 release, but the current focus is on the beta release, especially stability and performance.

If Riak meets your needs, then you are all set. If not, what might those unmet needs be?

Riak seems more focused on some specialized niches, rather than being a... true database.

@doublex
Author

doublex commented Mar 14, 2016

You are right - there is no "true key-value database". Sorry.
Riak has a great API - but a bad design for secondary indexes (they are stored on the same machine as the record).

Is there a chance to index a variable number of secondary keys in CockroachDB? For example, is there a chance to index the values "name1", "name2", "name3" for the record "ABC"? If you ask the database-index "name" for "name2", the result would be "ABC"?
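
One common relational way to get a variable number of secondary keys per record is a separate index table with one row per (secondary key, record) pair. The sketch below shows that idea with Python's built-in sqlite3 as a runnable stand-in; the table names `records` and `name_index` and the helpers `put` and `lookup` are assumptions, and nothing here is specific to CockroachDB.

```python
import sqlite3

# In-memory SQLite database as a stand-in for a SQL backend (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (key TEXT PRIMARY KEY, value TEXT)")
# One row per (name, record) pair, so a record can carry any number of names.
conn.execute("""
    CREATE TABLE name_index (
        name TEXT NOT NULL,
        record_key TEXT NOT NULL,
        PRIMARY KEY (name, record_key)
    )
""")

def put(key, value, names):
    # Write the record and replace its index entries in one transaction.
    with conn:
        conn.execute("INSERT OR REPLACE INTO records VALUES (?, ?)", (key, value))
        conn.execute("DELETE FROM name_index WHERE record_key = ?", (key,))
        conn.executemany(
            "INSERT INTO name_index (name, record_key) VALUES (?, ?)",
            [(name, key) for name in names],
        )

def lookup(name):
    # Return the keys of all records indexed under this name.
    rows = conn.execute(
        "SELECT record_key FROM name_index WHERE name = ? ORDER BY record_key",
        (name,),
    )
    return [row[0] for row in rows]

put("ABC", "some value", ["name1", "name2", "name3"])
print(lookup("name2"))  # ['ABC']
```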

Another example where multiple indexes are needed is a geospatial index (if you want to find a value with one hit, e.g. via a hierarchical triangular mesh).

@doublex
Author

doublex commented Mar 14, 2016

Hello,
Is there a chance to add a "list" datatype?
E.g.:

CREATE TABLE table_name
(
key BIGINT,
value TEXT,
name_index STRINGLIST
);

If "name_index" contains "name1", "name2" and "name3", then all three values get indexed?

Thanks a lot!
Marcus

@JackKrupansky

Geo spatial indexing is also on the CockroachDB roadmap for 1.0:
#2132

Multiple indexes per row? Absolutely - this is SQL. Each column can be indexed. But if you wish to pursue specific questions on indexing (or other topics), best to do that on the user list:
https://groups.google.com/forum/#!forum/cockroach-db

@JackKrupansky

List? Array is on the roadmap for 1.0, although I personally would prefer the Cassandra collection features for list and map.

@doublex
Author

doublex commented Mar 14, 2016

Hi Jack,
Thanks a lot for your answer - and sorry for disturbing.
If you need a function to calculate the "hierarchical triangular mesh" id (+ 12 surrounding triangles) - I could give you this function (but it is written in C++).
Again - thanks a lot
Marcus

@doublex doublex closed this as completed Mar 14, 2016

@ggaaooppeenngg
Contributor

I am not sure if it's worth mentioning, but I am using CockroachDB as a key-value store and wrote a package that wraps SQL as a KV client.

https://godoc.org/github.com/ggaaooppeenngg/crkv
