Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YCQL] comparable TIMEUUID #801

Open
patrick-li opened this issue Jan 25, 2019 · 12 comments
Open

[YCQL] comparable TIMEUUID #801

patrick-li opened this issue Jan 25, 2019 · 12 comments
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature kind/question This is a question priority/medium Medium priority issue
Projects

Comments

@patrick-li
Copy link

patrick-li commented Jan 25, 2019

Jira Link: DB-4712
There is a data type TIMEUUID. and two functions totimestamp() and tounixtimestamp. it acts like an unique timestamp.
My question is whether it can be sorted/order by, >='2019-01-24T05:26:51' , <= '2019-01-24T05:26:51', or as clustering key.

Thanks.

@kmuthukk kmuthukk added the kind/question This is a question label Jan 25, 2019
@kmuthukk kmuthukk added this to To Do in YCQL via automation Jan 25, 2019
@m-iancu
Copy link
Contributor

m-iancu commented Jan 25, 2019

@patrick-li A TIMEUUID (Uuid type 1: https://tools.ietf.org/html/rfc4122) is formed from a timestamp component (most-significant-bytes) and a node/clock component (least significant bytes). It will be sorted by the timestamp component first and then (assuming timestamp is equal) by the (typically random) node/clock component.

So it will be sorted as you expect on a clustering key, and it is also possible (but see details below) to "compare" with a timestamp value.

For the latter use-case, one must be careful though since multiple timeuuids can have the same timestamp value and then they will be compared by the clock/node component.
That's why CQL provides two functions minTimeUUID(<timestamp>)
and maxTimeUUID(<timestamp>) that should be used in the right hand side for such comparisons -- they generate corresponding timeuuids with min/max node/clock component so that they include all regular timeuuids with that timestamp in the comparison.
See this test for an example of how to use it.
E.g. something like:

SELECT * FROM test_minmax WHERE h = 2 AND r >= minTimeuuid('2019-01-25 00:34:32+0000') 
     AND r < maxTimeuuid('2019-01-25 00:40:47.758+0000');

Also see this comment for some more details.

Also, just to note since you mentioned them, totimestamp and tounixtimestamp are almost the same except that the first returns a value of type timestamp and the second of type bigint (but they logically represent the same value).

@patrick-li
Copy link
Author

@m-iancu , thanks for quick response and reminding. You may update your online docs 'API Reference/YCQL/UUID & TIMEUUID'.
A further question is whether there is a function, like this: TIMEUUID('2019-01-24T05:26:51.123'), so that when we insert a record with a timestamp we already have, we can directly insert a timestamp as TIMEUUID.
An extra one is the Date ISO String supports 'Z', not only '+0000'. in Nodejs, date.toISOString() will generate a string like : '2019-01-24T05:26:51.123Z'. Hope Yugabyte can accept it sooner.

@patrick-li
Copy link
Author

'minTimeuuid()' looks like what I want ? TIMEUUID = minTimeuuid() ?

@patrick-li
Copy link
Author

Will twice call 'minTimeuuid()' return same TIMEUUID?
minTimeuuid('2019-01-24T05:26:51.123') =?= minTimeuuid('2019-01-24T05:26:51.123')

@m-iancu
Copy link
Contributor

m-iancu commented Jan 25, 2019

@patrick-li Yes calling minTimeuuid twice should return exactly the same timeuuid. It should generally not be used to insert values for the following reasons:

  1. min/maxTimeUUID produce "pseudo-timeuuids" that do not technically conform to the RFC spec (i.e. clock/node component has some extra properties/restrictions too).
  2. Since they always produce the same clock/node component chance of collision is high (otherwise one could just use timestamp.
  3. They comparisons behavior may get confusing because now results may be weird when using e.g. <= vs < with minTimeuuid.

If I understand correctly, you needs something like the function now() but that actually takes a timestamp as an argument. Let me get back to you on that soon.

@kmuthukk
Copy link
Collaborator

I forked off this sub-issue into its own issue (#802), as a separate enhancement request.

@patrick-li
Copy link
Author

no, not current time. but an existing timestamp like '2019-01-24T05:26:51.123'

@patrick-li
Copy link
Author

I guess my requirement may break the TIMEUUID uniqueness. the length of the TIMEUUID is longer/stronger enough for a specific timestamp, no duplicate TIMEUUID can be generated or most likely. if there is a constructor with a timestamp, sometime the TIMEUUID will be duplicate. OK. Thank you guys.

@kmuthukk
Copy link
Collaborator

hi @patrick-li

a) I think it is possible for us to implement a builtin function/constructor for timeuuid which takes a timestamp and returns a unique timeuuid whose timestamp component is the specified timestamp.

It is just that we don't have such a built-in yet. Hence, filed #802 as an enhancement request. I couldn't find anything in JDK 8 also where you could do the same in the application and insert as a timeuuid into the database.

b) In the interim, want to understand your requirement better. Suppose your table is:

CREATE TABLE stock_profile(
       stock_symbol text,
       version timeuuid,
       details text,
   PRIMARY KEY ((stock_symbol), version))
   WITH CLUSTERING ORDER BY (version DESC);

and assume we had a builtin timeuuidFromTimestamp(), and recall that YCQL/Cassandra's INSERT really has UPSERT semantics by default on colliding primary keys.

So the question is, do you want:

INSERT INTO stock_profile(stock_symbol, version, details) 
    VALUES ('ORCL', timeuuidFromTimestamp('2019-01-25 00:40:47.758+0000'), 'details-1');
INSERT INTO stock_profile(stock_symbol, version, details) 
    VALUES ('ORCL', timeuuidFromTimestamp('2019-01-25 00:40:47.758+0000'), 'details-2');

to end up with 2 rows or 1 row (where the second one overwrites the first row because the primary key matches)?

I am guessing you want the above to end up as two different rows, correct?

@patrick-li
Copy link
Author

Yes, exactly.
CREATE TABLE test_table (
id1 text,
id2 text,
found_at timeuuid,
details text,
PRIMARY KEY ((id1, id2), found_at))
WITH CLUSTERING ORDER BY (found_at DESC);

Sometime, id1 + id2 + timestamp got to be duplicate. then missing record(s), only the last one saved. I wrote a local one = timestamp(UTC number) * 100 + random() * 100. if TIMEUUID works, that's the best.

@kmuthukk
Copy link
Collaborator

How about this instead, where you use a 2-part clustering column based on a timestamp (ts) and uuid (rand_uuid). See example below.

You could bind in a application generated uuid (using Java) as the tie-breaker for the rand_uuid column. [You could also use timeuuid type instead for rand_uuid column and use now() as the tiebreaker.]

CREATE TABLE IF NOT EXISTS test_table(
    id1 text,
    id2 text,
    ts timestamp,
    rand_uuid uuid,
    details text,
 PRIMARY KEY ((id1, id2), ts, uuid))
 WITH CLUSTERING ORDER BY (ts DESC);

But your queries wouldn't have to care about the rand_uuid column. They could simply be like:

SELECT * from test_table 
WHERE ts < '2017-07-04 12:31:30 UTC' 
AND id1='1' and id2='1';

@patrick-li
Copy link
Author

cool. let me have a try.

@rthallamko3 rthallamko3 added the area/ysql Yugabyte SQL (YSQL) label Dec 29, 2022
@yugabyte-ci yugabyte-ci added the priority/medium Medium priority issue label Dec 29, 2022
@yugabyte-ci yugabyte-ci changed the title comparable TIMEUUID [YCQL] comparable TIMEUUID Jan 4, 2023
@yugabyte-ci yugabyte-ci added kind/new-feature This is a request for a completely new feature status/awaiting-triage Issue awaiting triage kind/enhancement This is an enhancement of an existing feature and removed kind/new-feature This is a request for a completely new feature status/awaiting-triage Issue awaiting triage labels Jan 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/enhancement This is an enhancement of an existing feature kind/question This is a question priority/medium Medium priority issue
Projects
YCQL
  
To Do
Status: No status
Development

No branches or pull requests

5 participants