Skip to content

Commit a831265

Browse files
committed
Days 65-66
1 parent 93296fd commit a831265

5 files changed

+361
-0
lines changed
Lines changed: 243 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,243 @@
1+
Originally from: [tweet](https://twitter.com/samokhvalov/status/1730107298369171943), [LinkedIn post]().
2+
3+
---
4+
5+
# UUID v7 and partitioning (TimescaleDB)
6+
7+
> I post a new PostgreSQL "howto" article every day. Join me in this
8+
> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share!
9+
10+
Ok, you've asked – here it is, a draft recipe to use UUIDv7 and partitioning (we'll
11+
use [@TimescaleDB](https://twitter.com/TimescaleDB)). It's not super elegant, might be not the best, and requires some
12+
effort to reach efficient plans (with partition pruning involved). If you have an alternative or improvement ideas in
13+
mind – let me know.
14+
15+
We'll take the function by [@DanielVerite](https://twitter.com/DanielVerite) to generate UUIDv7 as basis:
16+
17+
```sql
18+
create or replace function uuid_generate_v7() returns uuid
19+
as $$
20+
-- use random v4 uuid as starting point (which has the same variant we need)
21+
-- then overlay timestamp
22+
-- then set version 7 by flipping the 2 and 1 bit in the version 4 string
23+
select encode(
24+
set_bit(
25+
set_bit(
26+
overlay(
27+
uuid_send(gen_random_uuid())
28+
placing substring(int8send(floor(extract(epoch from clock_timestamp()) * 1000)::bigint) from 3)
29+
from 1 for 6
30+
),
31+
52, 1
32+
),
33+
53, 1
34+
),
35+
'hex')::uuid;
36+
$$ language SQL volatile;
37+
```
38+
39+
## Helper functions, UUIDv7 <-> timestamptz
40+
41+
Next, we'll create two functions:
42+
43+
1. `ts_to_uuid_v7` – generate UUIDv7 based on any arbitrary `timestamptz` value, and
44+
2. `uuid_v7_to_ts` – extract `timestamptz` from the existing UUIDv7 value.
45+
46+
Note that this approach is not what the authors of revised RFC4122 (that will likely be finalized soon) would encourage;
47+
see [this discussion and the words](https://postgresql.org/message-id/flat/C80B8FDB-8D9E-48A2-82A2-48863987A1B1%40yandex-team.ru#074a05d31c9ce38bee2f8c8097877485)
48+
by
49+
[@x4mmmmmm](https://twitter.com/x4mmmmmm):
50+
51+
> ... as far as I know, RFC discourages extracting timestamps from UUIDs.
52+
53+
Anyway, let's just do it:
54+
55+
```sql
56+
create extension pgcrypto;
57+
58+
create or replace function ts_to_uuid_v7(timestamptz) returns uuid
59+
as $$
60+
select encode(
61+
set_bit(
62+
set_bit(
63+
overlay(
64+
uuid_send(gen_random_uuid())
65+
placing substring(int8send(floor(extract(epoch from $1) * 1000)::bigint) from 3)
66+
from 1 for 6
67+
),
68+
52, 1
69+
),
70+
53, 1
71+
),
72+
'hex')::uuid;
73+
$$ language SQL volatile;
74+
75+
create or replace function uuid_v7_to_ts(uuid_v7 uuid) returns timestamptz
76+
as $$
77+
select
78+
to_timestamp(
79+
(
80+
'x' || substring(
81+
encode(uuid_send(uuid_v7), 'hex')
82+
from 1 for 12
83+
)
84+
)::bit(48)::bigint / 1000.0
85+
)::timestamptz;
86+
$$ language sql;
87+
```
88+
89+
Checking the functions:
90+
91+
```sql
92+
test=# select now(), ts_to_uuid_v7(now() - interval '1y');
93+
now | ts_to_uuid_v7
94+
-------------------------------+--------------------------------------
95+
2023-11-30 05:36:32.205093+00 | 0184c709-63cd-7bd1-99c3-a4773ab1e697
96+
(1 row)
97+
98+
test=# select uuid_v7_to_ts('0184c709-63cd-7bd1-99c3-a4773ab1e697');
99+
uuid_v7_to_ts
100+
----------------------------
101+
2022-11-30 05:36:32.205+00
102+
(1 row)
103+
```
104+
105+
Pretending that we haven't noticed the loss of microseconds, we continue.
106+
107+
> 🎯 **TODO:** :
108+
> 1) may it be the case when we need that precision?
109+
> 2) timezones
110+
111+
## Hypertable
112+
113+
Create a table, where we'll store ID as UUID, but additionally have a `timestamptz` column – this column will be used as
114+
partitioning key when we convert the table to partitioned table ("hypertable" in TimescaleDB's terminology):
115+
116+
```sql
117+
create table my_table (
118+
id uuid not null
119+
default '00000000-0000-0000-0000-000000000000'::uuid,
120+
payload text,
121+
uuid_ts timestamptz not null default clock_timestamp() -- or now(), depending on goals
122+
);
123+
```
124+
125+
The default value `00000000-...00` for "id" is "fake" – it will always be replaced in trigger, based on the timestamp:
126+
127+
```sql
128+
create or replace function t_update_uuid() returns trigger
129+
as $$
130+
begin
131+
if new.id is null or new.id = '00000000-0000-0000-0000-000000000000'::uuid then
132+
new.id := ts_to_uuid_v7(new.uuid_ts);
133+
end if;
134+
135+
return new;
136+
end;
137+
$$ language plpgsql;
138+
139+
create trigger t_update_uuid_before_insert_update
140+
before insert or update on my_table
141+
for each row execute function t_update_uuid();
142+
```
143+
144+
Now, use TimescaleDB partitioning:
145+
146+
```sql
147+
create extension timescaledb;
148+
149+
select create_hypertable(
150+
relation := 'my_table',
151+
time_column_name := 'uuid_ts',
152+
-- !! very small interval is just for testing
153+
chunk_time_interval := '1 minute'::interval
154+
);
155+
```
156+
157+
## Test data - fill the chunks
158+
159+
And now insert some test data – some rows for the "past" and some "current" rows:
160+
161+
```sql
162+
insert into my_table(payload, uuid_ts)
163+
select random()::text, ts
164+
from generate_series(
165+
timestamptz '2000-01-01 00:01:00',
166+
timestamptz '2000-01-01 00:05:00',
167+
interval '5 second'
168+
) as ts;
169+
170+
insert into my_table(payload)
171+
select random()::text
172+
from generate_series(1, 10000);
173+
174+
vacuum analyze my_table;
175+
```
176+
177+
Checking the structure of `my_table` in psql using `\d+` we now see that multiple partitions ("chunks") were created by
178+
TimescaleDB:
179+
180+
```sql
181+
test=# \d+ my_table
182+
...
183+
Child tables: _timescaledb_internal._hyper_2_3_chunk,
184+
_timescaledb_internal._hyper_2_4_chunk,
185+
_timescaledb_internal._hyper_2_5_chunk,
186+
_timescaledb_internal._hyper_2_6_chunk,
187+
_timescaledb_internal._hyper_2_7_chunk,
188+
_timescaledb_internal._hyper_2_8_chunk,
189+
_timescaledb_internal._hyper_2_9_chunk
190+
```
191+
192+
## Test queries – partition pruning
193+
194+
Now we just need to remember that `created_at` should always participate in queries, to let planner deal with as few
195+
partitions as possible – but knowing the `id` values, we can always reconstruct the `created_at` values, using
196+
`uuid_v7_to_ts()`:
197+
198+
```sql
199+
test=# explain select * from my_table where created_at = uuid_v7_to_ts('00dc6ad0-9660-7b92-a95e-1d7afdaae659');
200+
QUERY PLAN
201+
--------------------------------------------------------------------------------------------------------------------------
202+
Append (cost=0.14..8.16 rows=1 width=41)
203+
-> Index Scan using _hyper_5_11_chunk_my_table_created_at_idx on _hyper_5_11_chunk (cost=0.14..8.15 rows=1 width=41)
204+
Index Cond: (created_at = '2000-01-01 00:01:00+00'::timestamp with time zone)
205+
(3 rows)
206+
207+
test=# explain select * from my_table
208+
where created_at >= uuid_v7_to_ts('018c1ecb-d3b7-75b1-add9-62878b5152c7')
209+
order by created_at desc limit 10;
210+
QUERY PLAN
211+
-----------------------------------------------------------------------------------------------------------------------------------
212+
Limit (cost=0.29..1.17 rows=10 width=41)
213+
-> Custom Scan (ChunkAppend) on my_table (cost=0.29..11.49 rows=126 width=41)
214+
Order: my_table.created_at DESC
215+
-> Index Scan using _hyper_5_16_chunk_my_table_created_at_idx on _hyper_5_16_chunk (cost=0.29..11.49 rows=126 width=41)
216+
Index Cond: (created_at >= '2023-11-30 05:55:23.703+00'::timestamp with time zone)
217+
(5 rows)
218+
```
219+
220+
– partition pruning in play, although it will require certain effort to have it in various queries. But it works.
221+
222+
223+
--------
224+
225+
## Postscript
226+
227+
Also read the following comment by [@jamessewell](https://twitter.com/jamessewell), originaly posted
228+
[here](https://twitter.com/jamessewell/status/1730125437903450129):
229+
230+
> If update your `create_hypertable` call with:
231+
>
232+
> ```
233+
> time_column_name => 'id'
234+
> time_partitioning_func => 'uuid_v7_to_ts'
235+
> ```
236+
>
237+
> Then you'll be able to drop the `uuid_ts` col and your trigger!
238+
>
239+
> ```sql
240+
> SELECT * FROM my_table WHERE id = '018c1ecb-d3b7-75b1-add9-62878b5152c7';
241+
> ```
242+
>
243+
> Will just work 🪄
Lines changed: 116 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,116 @@
1+
Originally from: [tweet](https://twitter.com/samokhvalov/status/1730609033860858080), [LinkedIn post]().
2+
3+
---
4+
5+
# How many tuples can be inserted in a page
6+
7+
> I post a new PostgreSQL "howto" article every day. Join me in this
8+
> journey – [subscribe](https://twitter.com/samokhvalov/), provide feedback, share!
9+
10+
In Postgres, all tables have hidden, system columns; `ctid` being one of them. Reading it, we can see physical
11+
location of the tuple (tuple = row physical version), the page number and offset inside it:
12+
13+
```sql
14+
nik=# create table t0 as select 1 as id;
15+
SELECT 1
16+
17+
nik=# select ctid, id from t0;
18+
ctid | id
19+
-------+----
20+
(0,1) | 1
21+
(1 row)
22+
```
23+
24+
👉 page 0, position 1.
25+
26+
A single PostgreSQL page, which is 8 KiB by default, and can be checked by looking at `block_size`:
27+
28+
```sql
29+
nik=# show block_size;
30+
block_size
31+
------------
32+
8192
33+
(1 row)
34+
```
35+
36+
How many tuples can fit into a single page? Let's see:
37+
38+
```sql
39+
nik=# create table t0 as select i
40+
from generate_series(1, 1000) as i;
41+
SELECT 1000
42+
43+
nik=# select count(*)
44+
from t0
45+
where (ctid::text::point)[0] = 0;
46+
count
47+
-------
48+
226
49+
(1 row)
50+
51+
nik=# select pg_column_size(i) from t0 limit 1;
52+
pg_column_size
53+
----------------
54+
4
55+
(1 row)
56+
```
57+
58+
👉 If we use 4-byte numbers, then it's 226 tuples. Here I used `(ctid::text::point)[0]` to convert `ctid` value to
59+
"point" to get the first its component, then (the page number).
60+
61+
If we use 2-byte numbers or, say, 1-byte `boolean` values (yes, boolean needs 1 byte, not 1 bit), the number is the
62+
same:
63+
64+
```sql
65+
nik=# drop table t0;
66+
DROP TABLE
67+
68+
nik=# create table t0 as select true
69+
from generate_series(1, 1000) as i;
70+
SELECT 1000
71+
72+
nik=# select count(*)
73+
from t0
74+
where (ctid::text::point)[0] = 0;
75+
count
76+
-------
77+
226
78+
(1 row)
79+
```
80+
81+
Why 226 again? The thing is that, the size of the value doesn't matter here; it just needs to be less or equal to 8
82+
bytes. For every row, alignment padding adds zeroes, so we'll always have 8 bytes for each row. Math:
83+
84+
![floor((8192 - 24) / (4 + 24 + 8)) = 226](./files/0066-formula-1.gif)
85+
86+
👉 What we have counted here:
87+
88+
1. A single 24-byte page header (`PageHeaderData`).
89+
2. N pointers to each tuple – 4 bytes each (`ItemIdData`).
90+
3. N tuple headers – 23 bytes each, padded to 24 (`HeapTupleHeaderData`).
91+
4. N tuple values – if <= 8 bytes, then it's padded to 8 bytes.
92+
93+
Source code defining the
94+
structures (for [PG16](https://github.com/postgres/postgres/blob/REL_16_STABLE/src/include/storage/bufpage.h)).
95+
96+
**Can we fit even MORE tuples?**
97+
98+
The answer is YES. Postgres allows tables without columns (!) In this case, the math is:
99+
100+
![floor((8192 - 24) / (4 + 24)) = 291](./files/0066-formula-2.gif)
101+
102+
Let's see (note the empty column list in the `SELECT` clause):
103+
104+
```sql
105+
nik=# create table t0 as select
106+
from generate_series(1, 1000) as i;
107+
SELECT 1000
108+
109+
nik=# select count(*)
110+
from t0
111+
where (ctid::text::point)[0] = 0;
112+
count
113+
-------
114+
291
115+
(1 row)
116+
```

README.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -91,6 +91,8 @@ As an example, first 2 rows:
9191
- 0062 [How to create an index, part 2](./0062_how_to_create_an_index_part_2.md)
9292
- 0063 [How to help others](./0063_how_to_help_others.md)
9393
- 0064 [How to use UUID](./0064_how_to_use_uuid.md)
94+
- 0065 [UUID v7 and partitioning (TimescaleDB)](./0065_uuid_v7_and_partitioning_timescaledb.md)
95+
- 0066 [How many tuples can be inserted in a page](./0066_how_many_tuples_can_be_inserted_in_a_page.md)
9496
- ...
9597

9698
## Contributors

files/0066-formula-1.gif

962 Bytes
Loading

files/0066-formula-2.gif

886 Bytes
Loading

0 commit comments

Comments
 (0)