SPQR 2.0 roadmap: breaking changes and stuff. #431
reshke
announced in
Announcements
Replies: 3 comments
-
Some notes after our offline discussion. Naming Better syntax
Branches |
-
We decided to rename |
-
It looks like most of the work has already been done; what remains is to implement typed and multidimensional keys.
-
At the very beginning of the SPQR project, we made a major design decision: to develop a simple PostgreSQL sharding solution that would be easy to configure and run. The main feature is query auto-routing: the goal was a transactional router that parses your query and resolves the shards responsible for executing it. To keep things as easy as possible, we rejected all schema-like proposals and moved to SHARDING RULE routing schemas. A SHARDING RULE is simply a column name (which can be restricted to some relation name), defined in some Dataspace. A Dataspace is simply a set of rules and key ranges. So the router logic for now is something like: parse the query and try to match it to some sharding rule. After that, try to extract constants or parameters from the query and route it to a shard, based on the key range definitions.
While this solution fits some of our client requests perfectly and can actually be used for query routing, it is very hard to implement some other features with this design. One of them is multidimensional Key Ranges and multi-column sharding schemas. The other big problem is ambiguity in the router logic when dealing with constant values. Since a sharding rule contains no information or hints about the sharding column type, constants become hard to deal with. For example, one can define a sharding rule and key ranges, then try to route both INTEGER and VARCHAR queries with it. As integers are compared to each other differently than strings, it can (and does) cause problems in the deployment and development of such a solution.
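To make the type ambiguity concrete, here is a minimal Python sketch, not SPQR code. The function name `route_by_lower_bound`, the shard names, and the representation of key ranges as a sorted list of lower bounds are all assumptions made for illustration. It shows the same pair of bounds sending the same key to different shards depending on whether values are compared as integers or as strings.

```python
from bisect import bisect_right


def route_by_lower_bound(key, lower_bounds, shards):
    """Pick the shard whose range starts at the greatest lower bound <= key.

    Hypothetical helper: `lower_bounds` must be sorted under the comparison
    rules of the key type, and shards[i] owns keys from lower_bounds[i] up to
    (but not including) lower_bounds[i + 1].
    """
    idx = bisect_right(lower_bounds, key) - 1
    if idx < 0:
        return None  # key is below every range
    return shards[idx]


shards = ["sh1", "sh2"]

# Integer comparison: 1 <= 9 < 50, so key 9 belongs to the first range.
print(route_by_lower_bound(9, [1, 50], shards))        # sh1

# String comparison: "9" > "50" lexicographically, so the same key
# written as text lands in the second range.
print(route_by_lower_bound("9", ["1", "50"], shards))  # sh2
```

Without a declared column type the router cannot know which of the two answers is intended, which is the problem typed dataspaces are meant to remove.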
So, we would like to propose a breaking change in our soon-to-be-released SPQR 2.0.
We would like to introduce a new way to configure your sharding. It is a Dataspace-based sharding schema.
The main idea is to use DATASPACE to store information about sharding column types. Moreover, we will now require all rules within one dataspace to have the exact same number of sharding columns. Furthermore, we will refuse to route your query to any shard if it contains no relations for which sharding columns are defined, except for some easy-to-support cases like select 1 or create table.
Key ranges will no longer be a global structure: each key range will be attached to some dataspace and required to have the exact same number of key bounds as the dataspace has sharding columns.
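The constraints described above can be sketched as a small data model. This is a hedged illustration in Python, not the SPQR implementation: the class names `Dataspace` and `KeyRange`, the `PY_TYPES` mapping, and the method `add_key_range` are all hypothetical. The sketch only checks that every relation lists as many sharding columns as the dataspace has column types, and that every key range bound has the same arity and matching value types.

```python
from dataclasses import dataclass, field

# Assumed mapping from declared column type names to Python types.
PY_TYPES = {"varchar": str, "int": int}


@dataclass
class KeyRange:
    krid: str
    lower: tuple
    upper: tuple
    shard: str


@dataclass
class Dataspace:
    name: str
    column_types: tuple  # e.g. ("varchar", "int")
    relations: dict      # relation name -> tuple of sharding column names
    key_ranges: list = field(default_factory=list)

    def __post_init__(self):
        # Every relation must have one sharding column per declared type.
        for relation, columns in self.relations.items():
            if len(columns) != len(self.column_types):
                raise ValueError(
                    f"{relation}: expected {len(self.column_types)} sharding columns"
                )

    def _check_bound(self, bound):
        # A bound must have the dataspace's arity and value types.
        if len(bound) != len(self.column_types):
            raise ValueError(f"bound {bound!r} has the wrong number of values")
        for value, type_name in zip(bound, self.column_types):
            if not isinstance(value, PY_TYPES[type_name]):
                raise TypeError(f"{value!r} is not a {type_name}")

    def add_key_range(self, key_range):
        # Key ranges live inside the dataspace instead of a global list.
        self._check_bound(key_range.lower)
        self._check_bound(key_range.upper)
        self.key_ranges.append(key_range)
```

Attaching key ranges to the dataspace is what makes the arity and type checks possible at definition time, rather than at query time.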
So, the new syntax will look like this (see [1]):
CREATE DATASPACE ds1 SHARDING COLUMN TYPES varchar, int RELATIONS t1(id, id2), t2(indx, indx2);
CREATE KEY RANGE IN DATASPACE ds1 krid2 FROM ("aa", 1) TO ("zz", 200) ROUTE TO shard1;
In this example we create a dataspace for 2-dimensional keys. Any PostgreSQL query that accesses relation t1 and contains columns id and id2 can be routed to some shard. For example,
SELECT * from users join t1 on true where id = 'ab' and id2 = 7
can be routed. Another example is
DELETE from t2 where indx = 'ac' and indx2 = 7
If a query does not contain sharded relations (t1 and t2 here), or does not have enough columns specified, it will be considered un-routable (multishard).
UPDATE t2 SET colname = 'lolkek' where indx = 'ac'
is one example.
SHARDING RULE would have no use at all, so it can be dropped entirely.
[1] #416
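The routing decision for the three example queries can be sketched as follows. This is an illustrative Python model under stated assumptions, not the router's actual logic: query parsing is skipped (the relations and the column = constant predicates are passed in directly), `route_query`, `RELATIONS` and `KEY_RANGES` are hypothetical names, and multidimensional keys are assumed to compare lexicographically with inclusive bounds, which the proposal does not specify.

```python
# Hypothetical in-memory view of the dataspace from the example.
RELATIONS = {"t1": ("id", "id2"), "t2": ("indx", "indx2")}
KEY_RANGES = [(("aa", 1), ("zz", 200), "shard1")]  # (lower, upper, shard)


def route_query(query_relations, predicates):
    """Return the target shard, or None if the query is un-routable.

    query_relations: relation names the query accesses.
    predicates: column name -> constant taken from the WHERE clause.
    """
    for relation in query_relations:
        columns = RELATIONS.get(relation)
        if columns is None:
            continue  # not a sharded relation
        if any(col not in predicates for col in columns):
            continue  # not every sharding column is pinned to a value
        key = tuple(predicates[col] for col in columns)
        for lower, upper, shard in KEY_RANGES:
            # Assumption: tuples compare lexicographically, bounds inclusive.
            if lower <= key <= upper:
                return shard
    return None


# SELECT * from users join t1 on true where id = 'ab' and id2 = 7
print(route_query(["users", "t1"], {"id": "ab", "id2": 7}))   # shard1

# DELETE from t2 where indx = 'ac' and indx2 = 7
print(route_query(["t2"], {"indx": "ac", "indx2": 7}))        # shard1

# UPDATE t2 SET colname = 'lolkek' where indx = 'ac'
print(route_query(["t2"], {"indx": "ac"}))                    # None
```

Under the lexicographic assumption the second component only matters when the first one sits exactly on a bound, so a key like ('ab', 500) would still fall inside the range; a different comparison rule would change that.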
Please ask your question if anything is unclear.
Maybe I'll answer. Who knows.