SPQR 2.0 roadmap: breaking changes and stuff. #431

reshke · 2024-01-19T21:27:58Z

reshke
Jan 19, 2024
Maintainer

At the very beginning of the SPQR project, we made a major design decision: to develop a simple PostgreSQL sharding solution that would be easy to configure and run. The main feature was meant to be query auto-routing: the aim of the whole thing was a transactional router that parses your query and resolves the shards responsible for executing it. To keep things as easy as possible, we rejected all schema-like proposals and settled on SHARDING RULE routing schemas.

A SHARDING RULE is simply a column name (which can be restricted to a particular relation name), defined in some Dataspace. A Dataspace is simply a set of rules and key ranges. So, the router logic for now is something like: parse the query and try to match it to some sharding rule. After that, try to extract constants or parameters from the query and route it to a shard based on the key range definitions.

While this solution fits some of our clients' requests perfectly and can actually be used for query routing, it is very hard to implement some other features with this design, among them multidimensional Key Ranges and multi-column sharding schemas. The other big problem is ambiguity in the router logic when dealing with constant values. As a sharding rule contains no information or hints about the sharding column type, constants become hard to handle. For example, one can define a sharding rule and key ranges, then try to route both INTEGER and VARCHAR queries with them. As integers are compared to each other differently than strings, this can (and does) cause problems in deploying and developing such a solution.
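To make the type ambiguity concrete, here is a minimal Go sketch (not SPQR code; the helper names lessAsText and lessAsInt are made up for illustration). It compares the same two key values once as text and once as integers, and the two orderings disagree, so a key range bound cannot be interpreted without knowing the column type.

```go
package main

import (
	"fmt"
	"strconv"
)

// lessAsText compares two key values as strings (byte-wise lexicographic order).
func lessAsText(a, b string) bool {
	return a < b
}

// lessAsInt parses the same two values as 64-bit integers and compares them numerically.
func lessAsInt(a, b string) (bool, error) {
	x, err := strconv.ParseInt(a, 10, 64)
	if err != nil {
		return false, err
	}
	y, err := strconv.ParseInt(b, 10, 64)
	if err != nil {
		return false, err
	}
	return x < y, nil
}

func main() {
	// The same pair of values, ordered under two different type assumptions.
	asText := lessAsText("9", "10")
	asInt, err := lessAsInt("9", "10")
	if err != nil {
		panic(err)
	}
	fmt.Println("9 < 10 as text:", asText) // false: '9' sorts after '1'
	fmt.Println("9 < 10 as int: ", asInt)  // true
}
```

With an untyped rule, a bound such as 10 could therefore send the constant 9 to different shards depending on which comparison the router happens to apply.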

So, we would like to propose a breaking change in our soon-to-be-released SPQR 2.0.

We would like to introduce a new way to configure your sharding. It is a Dataspace-based sharding schema.

The main idea is to use DATASPACE to store information about sharding column types. Moreover, we will now require all rules within one dataspace to have exactly the same number of sharding columns. Furthermore, we will refuse to route your query to any shard if it contains no relations for which sharding columns are defined, except for some easy-to-support cases like select 1 or create table.
Key ranges will no longer be a global structure: each key range will be attached to some dataspace and required to have exactly the same number of key bounds as the dataspace has sharding columns.
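The arity and type constraints described above could be checked roughly as follows. This is only a sketch under assumptions: the Dataspace struct, the ColumnType constants and checkBound are hypothetical names invented here, not SPQR's actual types, and bound values are modelled as Go string and int64 values.

```go
package main

import "fmt"

// ColumnType is a hypothetical tag for a sharding column type.
type ColumnType string

const (
	TypeVarchar ColumnType = "varchar"
	TypeInt     ColumnType = "int"
)

// Dataspace is a hypothetical model: an ID plus the ordered sharding column types.
type Dataspace struct {
	ID          string
	ColumnTypes []ColumnType
}

// checkBound verifies that a key range bound has exactly one value per
// sharding column and that each value matches the declared column type.
func checkBound(ds Dataspace, bound []any) error {
	if len(bound) != len(ds.ColumnTypes) {
		return fmt.Errorf("dataspace %s expects %d key components, got %d",
			ds.ID, len(ds.ColumnTypes), len(bound))
	}
	for i, v := range bound {
		switch ds.ColumnTypes[i] {
		case TypeVarchar:
			if _, ok := v.(string); !ok {
				return fmt.Errorf("component %d: expected varchar, got %T", i, v)
			}
		case TypeInt:
			if _, ok := v.(int64); !ok {
				return fmt.Errorf("component %d: expected int, got %T", i, v)
			}
		default:
			return fmt.Errorf("component %d: unsupported column type %q", i, ds.ColumnTypes[i])
		}
	}
	return nil
}

func main() {
	ds := Dataspace{ID: "ds1", ColumnTypes: []ColumnType{TypeVarchar, TypeInt}}

	// A well-formed two-component bound is accepted.
	fmt.Println(checkBound(ds, []any{"aa", int64(1)}))
	// A one-component bound is rejected: wrong number of key components.
	fmt.Println(checkBound(ds, []any{"aa"}))
	// Swapped components are rejected: types do not match the dataspace.
	fmt.Println(checkBound(ds, []any{int64(1), "aa"}))
}
```

Rejecting such bounds when the key range is created, rather than at routing time, is one possible design choice; the proposal itself does not say when validation happens.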

So, the new syntax will look like this (see [1]):

CREATE DATASPACE ds1 SHARDING COLUMN TYPES varchar, int RELATIONS t1(id, id2), t2(indx, indx2);
CREATE KEY RANGE IN DATASPACE ds1 krid2 FROM ("aa", 1) TO ("zz", 200) ROUTE TO shard1;

In this example we create a dataspace for two-dimensional keys. Any PostgreSQL query that accesses relation t1 and contains columns id and id2 can be routed to some shard. For example, SELECT * from users join t1 on true where id = 'ab' and id2 = 7 can be routed. Another example is DELETE from t2 where indx = 'ac' and indx2 = 7. If a query does not contain sharded relations (t1 and t2 here), or does not specify enough columns, it will be considered un-routable (multishard).
UPDATE t2 SET colname = 'lolkek' where indx = 'ac' is one such example.
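The routing decision for the examples above can be sketched in Go as follows. Several things here are assumptions, not part of the proposal: the Key, KeyRange and route names are hypothetical, two-component keys are compared lexicographically (first component, then second), and the range is treated as half-open, FROM inclusive and TO exclusive. The query is represented only by the equality constants already extracted from it.

```go
package main

import "fmt"

// Key is a hypothetical two-component sharding key matching (varchar, int).
type Key struct {
	Text string
	Num  int64
}

// compare orders keys lexicographically: by Text first, then by Num.
// This ordering is an assumption made for the sketch.
func compare(a, b Key) int {
	if a.Text != b.Text {
		if a.Text < b.Text {
			return -1
		}
		return 1
	}
	switch {
	case a.Num < b.Num:
		return -1
	case a.Num > b.Num:
		return 1
	}
	return 0
}

// KeyRange is a hypothetical model of a key range attached to a dataspace.
type KeyRange struct {
	ID       string
	From, To Key
	Shard    string
}

// route builds a key from the equality constants found for the relation's
// sharding columns and looks up the range containing it. It returns false
// when a column is missing, has the wrong type, or no range matches.
func route(ranges []KeyRange, cols [2]string, consts map[string]any) (string, bool) {
	text, okText := consts[cols[0]].(string)
	num, okNum := consts[cols[1]].(int64)
	if !okText || !okNum {
		return "", false // not enough sharding columns specified
	}
	key := Key{Text: text, Num: num}
	for _, kr := range ranges {
		if compare(kr.From, key) <= 0 && compare(key, kr.To) < 0 {
			return kr.Shard, true
		}
	}
	return "", false
}

func main() {
	ranges := []KeyRange{{
		ID:    "krid2",
		From:  Key{Text: "aa", Num: 1},
		To:    Key{Text: "zz", Num: 200},
		Shard: "shard1",
	}}
	t2 := [2]string{"indx", "indx2"}

	// DELETE from t2 where indx = 'ac' and indx2 = 7
	shard, ok := route(ranges, t2, map[string]any{"indx": "ac", "indx2": int64(7)})
	fmt.Println(shard, ok)

	// UPDATE t2 SET colname = 'lolkek' where indx = 'ac' (indx2 is missing)
	shard, ok = route(ranges, t2, map[string]any{"indx": "ac"})
	fmt.Println(shard, ok)
}
```

Note that under the lexicographic assumption a key such as ("ab", 500) also falls inside the range, because the first component alone decides the comparison. Whether multidimensional key ranges should behave that way is left open by the proposal.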

SHARDING RULE would no longer serve any purpose, so it can be dropped entirely.

[1] #416

Please ask questions if anything is unclear.
Maybe I'll answer. Who knows.

Denchick · 2024-01-22T08:41:38Z

Denchick
Jan 22, 2024
Maintainer

Some notes after our offline discussion.

Naming
@x4m thinks that Keyspace is a better name for Dataspace.

Better syntax

CREATE KEYSPACE ds1 COLUMN TYPES varchar, int [RELATIONS t1(id, id2), t2(indx, indx2)];
CREATE KEY RANGE [krid2] IN KEY SPACE ds1 FROM ("aa", 1) [TO ("zz", 200)] ROUTE TO shard1;
ALTER KEY SPACE ds1 ATTACH RELATION t1(id, id2);
ALTER KEY SPACE ds1 DETACH RELATION t1(id, id2);

Branches
Create a spqr-v1 branch for continued development of the old SPQR version.


reshke · 2024-01-25T20:58:03Z

reshke
Jan 25, 2024
Maintainer Author

We decided to rename DATASPACE, as this name does not reflect its meaning.
The proposed new name is DISTRIBUTION.

#431


Denchick · 2024-03-25T13:08:26Z

Denchick
Mar 25, 2024
Maintainer

It looks like most of the work has already been done. What remains is to implement typed and multidimensional keys.

