Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FieldFilter variables mapped to DateTime datatype don't show results correctly #229

Open
anaklime opened this issue Mar 19, 2024 · 12 comments
Assignees
Labels
bug Something isn't working

Comments

@anaklime
Copy link

anaklime commented Mar 19, 2024

Describe the bug

Hello. I didn't knew how to lable it - bug or suggestion, but IMHO this looks like a bug.

I've discovered, that Field Filter doesn't work "correctly" if being mapped to DateTime datatype. All Filter Widget types for dates are affected, except one - Single Date type.

Steps to reproduce

We are checking result of the following query for Date Filter Filter Widget type:

select date_time
from date_filter
[[where {{date_time}}]]
order by date_time desc
Filter option Where clause Actual Result Expected Result
Today testing.date_filter.date_time = toDate('2024-03-19') We got data only for midnight time: 19/3/2024, 00:00(check first screenshot 1) Should show all timestamps for 19/3/2024
Yesterday testing.date_filter.date_time = toDate('2024-03-18') We got data only for midnight time: 18/3/2024, 00:00(check first screenshot 2) Should show all timestamps for 18/3/2024
Last 7 days testing.date_filter.date_time BETWEEN toDate('2024-03-12') AND toDate('2024-03-18') For 18/3/2024 shows only midnight time: 18/3/2024, 00:00(check first screenshot 3) Should show all timestamps between 11/3/2024 till 18/3/2024 including timestamps from 18/3/2024
Specific dates -> Between without time testing.date_filter.date_time BETWEEN toDate('2024-03-11') AND toDate('2024-03-12') For 12/3/2024 shows only midnight time: 12/3/2024, 00:00(check first screenshot 4) Should show all timestamps for 12/3/2024
Specific dates -> Between with time testing.date_filter.date_time BETWEEN '2024-03-11 00:00:00.000' AND '2024-03-12 23:59:00.000' It didn't showed record for 2024-03-12 where time was 23:59:59(check first screenshot 5) Should show all timestamps for 2024-03-12 23:59
Specific dates -> After testing.date_filter.date_time > toDate('2024-03-18') We're missing midnight for 2024-03-18, and show other records for that day(screenshot 6) Shouldn't show any timestamps for 2024-03-18
Specific dates -> On CAST(testing.date_filter.date_time AS date) = toDate('2024-03-12') As expected - we got all records for 2024-03-12 (check screenshot7) All timestamps for 2024-03-12 had been shown

Following table will be all affected with same issue as above, so i'll just mention

Filter type (& option) Where clause
Month and Year testing.date_filter.date_time BETWEEN toDate('2023-05-01') AND toDate('2023-05-31')
Quater and Year where testing.date_filter.date_time BETWEEN toDate('2023-07-01') AND toDate('2023-09-30')
Data Range without time testing.date_filter.date_time BETWEEN toDate('2024-03-18') AND toDate('2024-03-19')
Data Range with time testing.date_filter.date_time BETWEEN '2024-03-18 00:00:00.000' AND '2024-03-19 23:59:00.000'
Data Range single date testing.date_filter.date_time = toDate('2024-03-18')
Relative date Basically the same as for Date Filter options

And here is table for Single Date Filter Widget type:

Filter option Where clause Result
Just date CAST(testing.date_filter.date_time AS date) = toDate('2024-03-19') As we wanted - all records for 2024-03-19 are shown by this query (screenshot 8)
Date with time toStartOfMinute(testing.date_filter.date_time) = '2024-03-19 23:59:00.000' It showed every record for 23:59, as we expected(screenshot 9)

These works perfectly with DateTime datatype, and negate all "issues" mentioned above.
Can we somehow implement same "methods" for other Filter Widgets?

Configuration

Environment

  • metabase-clickhouse-driver version: 1.4.0
  • metabase-clickhouse-driver configuration: default
  • Metabase version: 0.49.0
  • OS: Oracle Linux Server 8.6

ClickHouse server

  • ClickHouse Server version: 23.8.9.54 (official build)
  • CREATE TABLE statements for tables involved:

CREATE TABLE date_filter
(
date_time DateTime('Europe/Moscow'),
)
ENGINE = TinyLog

This is example of DateTime records from the table - 5 records with different time for a day:

2023-03-25 00:00:00
2023-03-25 06:30:34
2023-03-25 12:00:00
2023-03-25 18:45:58
2023-03-25 23:59:59
2023-03-26 00:00:00

Where max value for date_time is 2024-03-19 23:59:59 and min value for date_time is 2023-03-25 00:00:00

Screenshots:

Footnotes

  1. image

  2. image

  3. image

  4. image

  5. image

  6. image

  7. image

  8. image

  9. image

@anaklime anaklime added the bug Something isn't working label Mar 19, 2024
@anaklime anaklime changed the title FieldFilter variables mapped to DateTime datatype correctly FieldFilter variables mapped to DateTime datatype don't show results correctly Mar 19, 2024
@slvrtrn
Copy link
Collaborator

slvrtrn commented Mar 19, 2024

Environment
metabase-clickhouse-driver version:
metabase-clickhouse-driver configuration:
Metabase version:
OS:

Could you share these? What is the driver version? OS? What is the timezone configuration (OS, CH, JVM/Metabase)?

@anaklime
Copy link
Author

Yeah, i forgot to put this information there, sorry. Accidentally ctrl-enter before finished whole report .-.

@slvrtrn
Copy link
Collaborator

slvrtrn commented Mar 19, 2024

No worries. Thank you for the detailed report (emoji navigation is the next level).
In the latest driver (which requires MB 0.49.x), there was a fix related to this feature; see 1.4.0.

@anaklime
Copy link
Author

Added environment info.
TZ configuration is the same for all of OS, CH and Metabase - UTC +3

So yeah 1.4.0 is working perfectly fine, i was wondering is there any chance to implement this to other Filter Widgets?

@anaklime
Copy link
Author

I found that it could be made like this, but i have no idea how it will affect performance:

toStartOfMinute(date_time) BETWEEN '2024-03-17 23:59:00' AND '2024-03-19 23:59:00'
CAST(date_time as date) BETWEEN toDate('2024-03-18') AND toDate('2024-03-19')

@slvrtrn
Copy link
Collaborator

slvrtrn commented Mar 20, 2024

OK, I see that 1.4.0 is affected. I will have a look when I am back from vacation next week.

@ecyshor
Copy link

ecyshor commented Apr 1, 2024

@slvrtrn Just double checking on this one, hit the same issue as reported, where the filters basically exclude the start/end days of the intervals (so for example a last 7 days relative range filter will return the data only for 5 days).

@slvrtrn
Copy link
Collaborator

slvrtrn commented Apr 2, 2024

I tried to reproduce it with MB 0.49.2, driver version 1.4.0, ClickHouse 24.3, and this dataset (6 rows per day, with different times: 00:00, 00:59, 01:00, 06:30, 18:45, 23:59:59; i.e. expected 42 rows per week or 7 days). I also tried both UTC and Europe/Moscow timezones in ClickHouse + MB + DateTime column.

The results were correct with all filter variations in both time zones. I am pretty positive that this was fixed in #218 (similar report - #216). But if I am missing something, please let me know; I will double-check.

Here are my results:

(Single date) April 1st - all records for this date.

Screenshot from 2024-04-02 08-15-59

(Relative date) Past 7 days - as expected, 42 dates.

Screenshot from 2024-04-02 08-15-39

(Date Range with time) April 1st 12:30pm - April 2nd 12:30pm - filtered correctly.

Screenshot from 2024-04-02 08-15-13

(Date Range default) April 1st 00:00 - April 2nd 23:59

Screenshot from 2024-04-02 08-14-34

(Date filter) This week - 42 dates as expected.

Screenshot from 2024-04-02 08-13-54

(Date filter) Today (2nd of April) - only 6 dates, correct results.

Screenshot from 2024-04-02 08-13-44

(Date filter) Previous week - 42 dates as expected, March 25th - March 31st

Screenshot from 2024-04-02 08-13-34

(Date filter) Previous 7 days - 42 dates as expected, March 25th - March 31st

Screenshot from 2024-04-02 08-11-58

Regarding time zones setup: CH had either <timezone>UTC</timezone> or <timezone>Europe/Moscow</timezone> in config.xml; MB was running in Docker with either 'JAVA_TIMEZONE': 'UTC' or 'JAVA_TIMEZONE': 'Europe/Moscow', and the column was defined as either DateTime('UTC') or DateTime('Europe/Moscow').

@anaklime
Copy link
Author

anaklime commented Apr 8, 2024

As i said earlier - it seems that #218 fixed some particular cases, but not the issue itself.
I updated my
MB to 0.49.2 version:
image

driver version still 1.4.0:

[root@test ch_mt_driver]# cat metabase-plugin.yaml 
info:
  name: Metabase ClickHouse Driver
  version: 1.4.0

ClickHouse to 24.3.2 version:

[root@test clickhouse-server]# clickhouse-server --version
ClickHouse server version 24.3.2.23 (official build).

The issue itself that many filters using toDate instead of Cast, which leads to strange results.( see examples below)
Generally speaking - toDate('some_date') = some_dateT00:00 .

I used same dataset for this.

Here are all examples for all filters:

Filter and Widget Type Time Period Actual Result Expected Result Comment
Field Filter: Month and Year March 2024 Well, this query won't show any data beside with midnight timestamp for last day of the month 1 But if we use Cast it will show us all data we need 2
Field Filter: Quarter and Year Q1 2024 Same result as previous example - no data for last day of the quarter 3 And the same solution - using Case will solve this issue 2 It is pointless to make a new separate screenshot for this as queries are identical
Field Filter: Single Date 8 April Here we can see that, if we use Case there is no issue with showing all timestamps for one day 4 We got what we expected No additional comments here
Field Filter: Date Range 8 April - 10 April Well, this is default Date Range values here - it's using toDate without Cast so we're missing some data for 10/04/2024 5 Same range with Case will get us missed timestamps 6 I actually don't get it why we have different result for this one :<
Filed Filter: Date Range with time 8 April 00:00 - 10 April 18:45 Because in this clause absolute values used without any functions we are getting pretty much what we've expected 7 <--- It could be considered as an issue, but i actually don't know if it should be - any timestamps after 18:45 won't be shown, for example 18:45:00.001. This is expected because we used absolute values with BETWEEN clause.
Field Filter: Relative Date Today No Cast in clause - no expected result. Nothing, besides midnight is shown 8 We're expecting same result as in Single Date widget 4 No additional comments here
Field Filter: Relative Date Yesterday Same result as in previous example 9 Using Cast will fix this 4 No additional comments here
Field Filter: Relative Date Past 7 days Same result as earlier with combination of BETWEEN and toDate without Cast 10 As always Cast makes anything better 11 I don't understand - same tests, same versions - different results ¯_(ツ)_/¯
Field Filter: Date Filter Today Same result as for Relative Date widget 12 Using Cast will fix this 4 No additional comments here
Field Filter: Date Filter Last Week Again usage toDate without Cast 13 We need Cast to fix this 6 No additional comments here
Field Filter: Date Filter This Week Same result as Last Week widget 14 We need Cast to fix this 6 No additional comments here
Field Filter: Date Filter Previous 7 Days Same result as earlier 15 We need Cast to fix this 6 No additional comments here

It's very strange that we have different results, despite using identical versions of MD, Driver and CH.
Also, i'm a little bit lost in this - as far as i understood how widget creating clauses was full on driver side. Am i wrong about that?

I'll try full clean install later to be sure that i missed nothing while updating to newer versions.

And again apologies for my poor English - it's not my native language. I hope you could understand what i was writing about)

Screenshots:

Footnotes

  1. image

  2. image 2

  3. image

  4. image 2 3 4

  5. image

  6. image 2 3 4

  7. image

  8. image

  9. image

  10. image

  11. image

  12. image

  13. image

  14. image

  15. image

@slvrtrn
Copy link
Collaborator

slvrtrn commented Apr 9, 2024

Thanks for the detailed answer :)

how widget creating clauses was full on driver side. Am i wrong about that?

This is correct. This is mainly handled by Metabase itself via the common code; the only tweak for CH is this.

Considering CAST calls, we had this issue: #196, which is why these were removed and that bit in the QP sources was introduced.

Regarding the statements: If you run the queries in clickhouse-client (from CLI), are the results the same as those shown in Metabase?

I suspect that this could be due to the disabled TZ switch (Related: #200, see the discussion); however, this shouldn't be the case since everything is in the same TZ on your configuration.

Lastly, could you contact me in the community Slack? I think it might be easier/faster to debug that way.

@anaklime
Copy link
Author

anaklime commented Apr 9, 2024

So i was wondering if it's possible to "map" Cast(date_time as date) to toDate function, so it will appear only if toDate is used in these filters.

Yeah i saw that some issues had been mentioned as fixed in the changelog for 1.3.0 version.
The #196 issue is relatable to this issue. But issue there had been caused by Cast used when it shouldn't be - with absolute values we don't need to use any functions (at least i believe so =D ).

If i run queries in CLI i have the same result.

Also, i perform a clean install of MB, MB driver and CH - latest versions for all mentioned, and still got same results

I don't think that this could be related to TZ issue, even if TZ would differ in my config - i believe TZ issue would show results which differ from expected for + or - 3 hours. ( If only +/-3 UTC is the only difference between configured timezones )

Unfortunately, Slack is restricted in my country and i can't join this community. Is it possible to communicate via different method/messenger? Telegram, perhaps?

@slvrtrn
Copy link
Collaborator

slvrtrn commented Apr 10, 2024

Unfortunately, Slack is restricted in my country and i can't join this community. Is it possible to communicate via different method/messenger? Telegram, perhaps?

Feel free to contact me there. Same username.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

3 participants