Compatibility with ClickHouse 24.2 #3855
Comments
Can you please run show create on events_v2 and sessions_v2 tables? I'll try out ClickHouse 24.2 on my instance later today.
nevermind, got it
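For anyone following along, the table definitions requested above can be printed from clickhouse-client with the statements below. This is a minimal sketch, assuming the table names mentioned in this thread live in the currently selected database; the SAMPLE BY clause in the output shows the sampling key that the SAMPLE queries rely on.

```sql
-- Print the full DDL of both tables, including ENGINE, ORDER BY and SAMPLE BY.
-- Assumes the current database is the one Plausible writes to.
SHOW CREATE TABLE events_v2;
SHOW CREATE TABLE sessions_v2;
```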
Thank you! I've been able to reproduce it, but I'm not sure what the cause is yet.
7eeb32dddd7a :) select sum(s.pageviews * s._sample_factor) from sessions_v2 s sample 100;
-- ┌─sum(multiply(pageviews, _sample_factor))─┐
-- │ 390594.55999999976 │
-- └──────────────────────────────────────────┘
-- 1 row in set. Elapsed: 0.006 sec. Processed 20.73 thousand rows, 248.77 KB (3.22 million rows/s., 38.60 MB/s.)
-- Peak memory usage: 0.00 B.
7eeb32dddd7a :) select version();
-- ┌─version()───┐
-- │ 24.2.1.2248 │
-- └─────────────┘
If the fix would involve changing the queries, it'd be unlikely to be merged until Plausible Cloud switches to the version requiring it.
Hm, this query seems to work:
SELECT
toStartOfHour(toTimeZone(e0.timestamp, 'Europe/Berlin')),
toUInt64(round(uniq(e0.user_id) * any(_sample_factor)))
FROM events_v2 AS e0
SAMPLE 20000000
GROUP BY toStartOfHour(toTimeZone(e0.timestamp, 'Europe/Berlin'))
ORDER BY toStartOfHour(toTimeZone(e0.timestamp, 'Europe/Berlin')) ASC
-- ┌─toStartOfHour(toTimeZone(timestamp, 'Europe/Berlin'))─┬─toUInt64(round(multiply(uniq(user_id), any(_sample_factor))))─┐
-- │ 2022-08-10 17:00:00 │ 1 │
-- │ 2022-08-10 18:00:00 │ 1 │
-- │ 2022-08-11 11:00:00 │ 1 │
-- │ 2022-08-11 12:00:00 │ 1 │
-- etc.
-- 981 rows in set. Elapsed: 0.039 sec. Processed 599.36 thousand rows, 7.19 MB (15.34 million rows/s., 184.13 MB/s.)
but adding a WHERE clause makes it fail:
SELECT
toStartOfHour(toTimeZone(e0.timestamp, 'Europe/Berlin')),
toUInt64(round(uniq(e0.user_id) * any(_sample_factor)))
FROM events_v2 AS e0
SAMPLE 20000000
WHERE (e0.site_id = 2) AND ((e0.timestamp >= '2024-03-01 23:00:00') AND (e0.timestamp < '2024-03-02 23:00:00'))
GROUP BY toStartOfHour(toTimeZone(e0.timestamp, 'Europe/Berlin'))
ORDER BY toStartOfHour(toTimeZone(e0.timestamp, 'Europe/Berlin')) ASC
-- Received exception from server (version 24.2.1):
-- Code: 10. DB::Exception: Received from localhost:9000. DB::Exception: Not found column _sample_factor: in block timestamp DateTime UInt32(size = 0), site_id UInt64 UInt64(size = 0), user_id UInt64 UInt64(size = 0). (NOT_FOUND_COLUMN_IN_BLOCK)
Seems to be more of a ClickHouse bug, I'll try opening an issue.
I'll re-open this issue if it turns out to not be a bug.
Thanks a lot, @ruslandoga! I've subscribed to that issue.
FYI, I tried the workaround in the linked ClickHouse issue and it worked for me by setting
Past Issues Searched
Issue is a Bug Report
Using official Plausible Cloud hosting or self-hosting?
Self-hosting
Describe the bug
hey 👋🏻
I tried to upgrade my self-hosted instance of ClickHouse, and it looks like Plausible is not yet ready for it.
This is the exception I'm seeing on the console:
Expected behavior
Compatible with CH changes
Screenshots
No response
Environment
Deployment configuration (I'm using Kamal)