Hardware requirements #3852
baek-sang
started this conversation in
Anything Else
Replies: 1 comment
-
The following is based on 2023 event data for https://plausible.io/plausible.io (taken from the staging mirror) There are two main tables that store event data, events_v2 (which is close to the raw events) and sessions_v2. Everything else is mostly irrelevant as it doesn't grow as events come in. select hostname from events_v2 limit 1;
-- ┌─hostname─────┐
-- │ plausible.io │
-- └──────────────┘
select min(timestamp), max(timestamp) from events_v2;
-- ┌──────min(timestamp)─┬──────max(timestamp)─┐
-- │ 2023-02-09 13:54:27 │ 2024-01-23 12:25:04 │
-- └─────────────────────┴─────────────────────┘
with e as (select toYYYYMM(timestamp) as month, count(*) as events from events_v2 group by month),
p as (select partition, bytes_on_disk from system.parts where table='events_v2' and active)
select month, events, formatReadableSize(bytes_on_disk), formatReadableSize(bytes_on_disk / events) from e
left join p on toString(e.month) = p.partition
order by month asc;
-- ┌──month─┬──events─┬─formatReadableSize(bytes_on_disk)─┬─formatReadableSize(divide(bytes_on_disk, events))─┐
-- │ 202302 │ 446740 │ 5.01 MiB │ 11.77 B │
-- │ 202303 │ 835341 │ 10.66 MiB │ 13.38 B │
-- │ 202304 │ 601907 │ 7.92 MiB │ 13.80 B │
-- │ 202305 │ 825908 │ 8.99 MiB │ 11.41 B │
-- │ 202306 │ 773004 │ 9.55 MiB │ 12.95 B │
-- │ 202307 │ 757485 │ 9.61 MiB │ 13.31 B │
-- │ 202308 │ 1009596 │ 10.64 MiB │ 11.06 B │
-- │ 202310 │ 3635510 │ 19.54 MiB │ 5.64 B │
-- │ 202311 │ 5158640 │ 27.36 MiB │ 5.56 B │
-- │ 202312 │ 4320582 │ 22.42 MiB │ 5.44 B │
-- │ 202401 │ 2591880 │ 12.61 MiB │ 5.10 B │
-- └────────┴─────────┴───────────────────────────────────┴───────────────────────────────────────────────────┘
with e as (select toYYYYMM(timestamp) as month, count(*) as sessions from sessions_v2 group by month),
p as (select partition, bytes_on_disk from system.parts where table='sessions_v2' and active)
select month, sessions, formatReadableSize(bytes_on_disk), formatReadableSize(bytes_on_disk / sessions) from e
left join p on toString(e.month) = p.partition
order by month asc;
-- ┌──month─┬─sessions─┬─formatReadableSize(bytes_on_disk)─┬─formatReadableSize(divide(bytes_on_disk, sessions))─┐
-- │ 202302 │ 96036 │ 3.73 MiB │ 40.78 B │
-- │ 202303 │ 222797 │ 8.69 MiB │ 40.90 B │
-- │ 202304 │ 170025 │ 6.59 MiB │ 40.66 B │
-- │ 202305 │ 177303 │ 6.90 MiB │ 40.79 B │
-- │ 202306 │ 196727 │ 7.66 MiB │ 40.83 B │
-- │ 202307 │ 201426 │ 7.88 MiB │ 41.03 B │
-- │ 202308 │ 198089 │ 7.81 MiB │ 41.35 B │
-- │ 202310 │ 363769 │ 13.52 MiB │ 38.96 B │
-- │ 202311 │ 518834 │ 19.12 MiB │ 38.64 B │
-- │ 202312 │ 417271 │ 15.13 MiB │ 38.03 B │
-- │ 202401 │ 221666 │ 8.26 MiB │ 39.07 B │
-- └────────┴──────────┴───────────────────────────────────┴─────────────────────────────────────────────────────┘ As for your question, I'd need to extrapolate. It seems like MAU would roughly correspond to sessions (see https://plausible.io/docs/metrics-definitions#unique-visitors for the difference) 100M sessions would be ~4GiB with avg as (
select (select sum(bytes_on_disk) from system.parts where table='sessions_v2' and active) /
(select count(*) from sessions_v2) as bytes)
select formatReadableSize(100000000 * bytes) from avg;
-- ┌─formatReadableSize(multiply(100000000, bytes))─┐
-- │ 3.69 GiB │
-- └────────────────────────────────────────────────┘ and since each session has on average (for plausible.io) 8 events select (select count(*) from events_v2) / (select count(*) from sessions_v2);
-- ┌─divide(_subquery31, _subquery32)─┐
-- │ 7.527665975919765 │
-- └──────────────────────────────────┘ it would add an additional ~5GiB with avg as (
select (select sum(bytes_on_disk) from system.parts where table='events_v2' and active) /
(select count(*) from events_v2) as bytes)
select formatReadableSize(7.5 * 100000000 * bytes) from avg;
-- ┌─formatReadableSize(multiply(multiply(7.5, 100000000), bytes))─┐
-- │ 5.04 GiB │
-- └───────────────────────────────────────────────────────────────┘ So MAU 100M = 10GB |
Beta Was this translation helpful? Give feedback.
0 replies
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
this
Is there a guide for SSD (HDD) capacity?(postgresql, clickhouse)
I was wondering how much capacity I would need roughly depending on how many users I have on my site. ex) MAU 100M = 100G
Beta Was this translation helpful? Give feedback.
All reactions