Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time zone issues : another use case #5927

Open
Tracked by #27178
manurana opened this issue Sep 11, 2017 · 14 comments
Open
Tracked by #27178

Time zone issues : another use case #5927

manurana opened this issue Sep 11, 2017 · 14 comments

Comments

@manurana
Copy link

manurana commented Sep 11, 2017

Metabase 26rc1, running on a EC2 with UTC. I am in IST (+5.30)(Asia/Calcutta). Metabase client is set to use India/Calcutta as the timezone. The mySQL DB is on UTC. Java version is "1.8.0_141"
Time zone on the client is set to Asia/Calcutta

MB is running on the server without a time zone set
nohup java -Xmx1536m -Xms1536m -jar ~/current/metabase.jar &

Running it on the server with the time zone set makes no difference
#nohup java -Xmx1536m -Xms1536m -Duser.timezone=Asia/Calcutta -jar ~/current/metabase.jar &

This is a follow up to #2035

I have created a question which groups a certain metric by the hour of day. The field in question, "Appointment Dt", is a DATETIME data type in mySQL
There are 3 issues here

  1. The data being selected is of the day PREVIOUS to the one being requested. This can be also be seen from the log which i have attached at the very end
  2. The hours being shown on the client are not on the client time, but the server time (UTC). We know this because the pattern of the metrics is not correct. We do not get high numbers in the early morning hours.
  3. The data is not taking into account the 30 minute difference in times, it only considers whole hours. This is shown by the fact that even after compensating for the diff in hours between the MB question and the SQL, the numbers do not match up.

See the following screenshot :
image

This is the RIGHT query which should execute (which we are currently doing in SQL). I have deliberately used the PREVIOUS date so the figures can be compared for the third part of my question

SELECT Hour(Convert_tz(appointment_dt, "+00:00", "+05:30")) AS 'Hour of the day',
       Count(1) AS total_trips
FROM   appointment
WHERE  ( ( Date(Convert_tz(appointment_dt, "+00:00", "+05:30")) = '2016-03-30' )
         AND ( appointment.status IN ( 9 ) ) )
GROUP  BY Hour(Convert_tz(appointment_dt, "+00:00", "+05:30"))
ORDER  BY Hour(Convert_tz(appointment_dt, "+00:00", "+05:30")) ASC 

This is the result of the query
image

And finally, the query log

09-11 11:07:02 DEBUG middleware.log ::
QUERY: 😎
{:database 2,
 :type "query",
 :query {:source_table 6, :filter ["and" ["SEGMENT" 2] ["=" ["field-id" 48] "2016-03-31"]], :aggregation [["count"]], :breakout [["datetime-field" ["field-id" 48] "hour-of-day"]]},
 :parameters [],
 :constraints {:max-results 10000, :max-results-bare-rows 2000},
 :info
 {:executed-by 1,
  :context :ad-hoc,
  :card-id nil,
  :nested? false,
  :query-hash [57, -15, 107, -3, 47, 63, 43, 61, 55, -64, 106, -58, -36, -23, -36, 108, -14, 12, -92, 70, -22, 124, 121, 103, 119, -6, -84, 67, 18, -78, -26, -128],
  :query-type "MBQL"}}

09-11 11:07:02 DEBUG middleware.expand-macros ::

MACRO/SUBSTITUTED: 😻
{:database 2,
 :type "query",
 :query {:source_table 6, :filter ("and" ["AND" ["=" 84 9]] ["=" ["field-id" 48] "2016-03-31"]), :aggregation [["count"]], :breakout [["datetime-field" ["field-id" 48] "hour-of-day"]]},
 :parameters [],
 :constraints {:max-results 10000, :max-results-bare-rows 2000},
 :info
 {:executed-by 1,
  :context :ad-hoc,
  :card-id nil,
  :nested? false,
  :query-hash [57, -15, 107, -3, 47, 63, 43, 61, 55, -64, 106, -58, -36, -23, -36, 108, -14, 12, -92, 70, -22, 124, 121, 103, 119, -6, -84, 67, 18, -78, -26, -128],
  :query-type "MBQL"},
 :driver {},
 :settings {:report-timezone "Asia/Calcutta"}}

09-11 11:07:02 WARN middleware.expand :: Referring to fields by their bare ID (84) is deprecated in MBQL '98. Please use [:field-id 84] instead.
09-11 11:07:02 WARN middleware.expand :: Referring to fields by their bare ID (84) is deprecated in MBQL '98. Please use [:field-id 84] instead.
09-11 11:07:02 WARN middleware.expand :: You shouldn't specify an :and filter with only one subclause.
09-11 11:07:02 DEBUG middleware.log ::
PREPROCESSED/EXPANDED: 😻
{:constraints {:max-results 10000, :max-results-bare-rows 2000},
 :fk-field-ids #{},
 :table-ids #{6},
 :settings {:report-timezone "Asia/Calcutta"},
 :type "query",
 :info
 {:executed-by 1,
  :context :ad-hoc,
  :nested? false,
  :query-hash [57, -15, 107, -3, 47, 63, 43, 61, 55, -64, 106, -58, -36, -23, -36, 108, -14, 12, -92, 70, -22, 124, 121, 103, 119, -6, -84, 67, 18, -78, -26, -128],
  :query-type "MBQL"},
 :database
 {:name "Baxi Prod RO",
  :id 2,
  :engine :mysql,
  :details "😋 ",
  :timezone "UTC",
  :features #{:basic-aggregations :standard-deviation-aggregations :expression-aggregations :foreign-keys :native-parameters :nested-queries :expressions :set-timezone :binning}},
 :driver "MySQL",
 :query
 {:source-table {:name "appointment", :id 6},
  :filter
  {:compound-type :and,
   :subclauses
   [{:filter-type :=,
     :field
     {:base-type :type/Integer,
      :table-id 6,
      :special-type :type/Category,
      :field-name "status",
      :dimensions
      {:dimension-id 4,
       :field-id 84,
       :dimension-name "Status",
       :dimension-type :internal,
       :created-at #inst "2017-07-29T12:18:53.426000000-00:00",
       :updated-at #inst "2017-07-29T12:18:53.426000000-00:00"},
      :field-display-name "Status",
      :values
      {:field-value-id 23,
       :field-id 84,
       :values (0 1 3 4 5 6 7 8 9 10),
       :human-readable-values ("0" "1" "3" "4" "5" "6" "7" "8" "9" "10"),
       :created-at #inst "2016-02-09T08:46:44.913000000-00:00",
       :updated-at #inst "2017-08-10T06:21:13.074000000-00:00"},
      :visibility-type :normal,
      :field-id 84,
      :table-name "appointment"},
     :value
     {:value 9,
      :field
      {:base-type :type/Integer,
       :table-id 6,
       :special-type :type/Category,
       :field-name "status",
       :dimensions
       {:dimension-id 4,
        :field-id 84,
        :dimension-name "Status",
        :dimension-type :internal,
        :created-at #inst "2017-07-29T12:18:53.426000000-00:00",
        :updated-at #inst "2017-07-29T12:18:53.426000000-00:00"},
       :field-display-name "Status",
       :values
       {:field-value-id 23,
        :field-id 84,
        :values (0 1 3 4 5 6 7 8 9 10),
        :human-readable-values ("0" "1" "3" "4" "5" "6" "7" "8" "9" "10"),
        :created-at #inst "2016-02-09T08:46:44.913000000-00:00",
        :updated-at #inst "2017-08-10T06:21:13.074000000-00:00"},
       :visibility-type :normal,
       :field-id 84,
       :table-name "appointment"}}}
    {:filter-type :=,
     :field
     {:field
      {:base-type :type/DateTime,
       :table-id 6,
       :field-name "appointment_dt",
       :dimensions [],
       :field-display-name "Appointment Dt",
       :values [],
       :visibility-type :normal,
       :field-id 48,
       :table-name "appointment"},
      :unit :day},
     :value
     {:value #inst "2016-03-30T18:30:00.000000000-00:00",
      :field
      {:field
       {:base-type :type/DateTime,
        :table-id 6,
        :field-name "appointment_dt",
        :dimensions [],
        :field-display-name "Appointment Dt",
        :values [],
        :visibility-type :normal,
        :field-id 48,
        :table-name "appointment"},
       :unit :day}}}]},
  :aggregation [{:aggregation-type :count}],
  :breakout
  [{:field
    {:base-type :type/DateTime,
     :table-id 6,
     :field-name "appointment_dt",
     :dimensions [],
     :field-display-name "Appointment Dt",
     :values [],
     :visibility-type :normal,
     :field-id 48,
     :table-name "appointment"},
    :unit :hour-of-day}],
  :order-by
  ({:field
    {:field
     {:base-type :type/DateTime,
      :table-id 6,
      :field-name "appointment_dt",
      :dimensions [],
      :field-display-name "Appointment Dt",
      :values [],
      :visibility-type :normal,
      :field-id 48,
      :table-name "appointment"},
     :unit :hour-of-day},
    :direction :ascending})}}

09-11 11:07:02 DEBUG middleware.permissions :: Permissions Check 🔒 : Can User 1 access Table 6 (appointment)?
09-11 11:07:02 DEBUG middleware.permissions :: Permissions Check 🔓 : Yes ✅  because User 1 is a member of Group 2 (Administrators) which has permissions for '/'
09-11 11:07:02 DEBUG generic-sql.query-processor :: HoneySQL Form: 🍯
 {:from (:appointment),
 :select ([:%count.* "count"] [{:name :hour, :args (:appointment.appointment_dt)} :appointment_dt]),
 :group-by ({:name :hour, :args (:appointment.appointment_dt)}),
 :where [:and [:= :appointment.status 9] [:= {:name :date, :args (:appointment.appointment_dt)} {:name :date, :args (#inst "2016-03-30T18:30:00.000000000-00:00")}]],
 :order-by ([{:name :hour, :args (:appointment.appointment_dt)} :asc])}

09-11 11:07:02 DEBUG middleware.mbql-to-native :: NATIVE FORM: 😳
{:query
 "SELECT count(*) AS `count`, hour(`appointment`.`appointment_dt`) AS `appointment_dt` FROM `appointment` WHERE (`appointment`.`status` = 9 AND date(`appointment`.`appointment_dt`) = date(?)) GROUP BY hour(`appointment`.`appointment_dt`) ORDER BY hour(`appointment`.`appointment_dt`) ASC",
 :params (#inst "2016-03-30T18:30:00.000000000-00:00")}


09-11 11:07:02 DEBUG generic-sql.query-processor :: Setting timezone with statement: SET @@session.time_zone = 'Asia/Calcutta';
09-11 11:07:08 DEBUG query-processor.sort :: Sorted fields:
([:0-breakout 0 :2-other :appointment_dt]
 [:1-aggregation 2147483647 :2-other :count]
 [:3-other 2147483647 :2-other :appointment_dt]
 [:3-other 2147483647 :2-other :appointment_dt]
 [:3-other 2147483647 :2-other :appointment_dt])

09-11 11:07:08 DEBUG middleware.log :: Result Metadata:
 ({:table_id 6, :unit :hour-of-day, :name "appointment_dt", :source :breakout, :extra_info {}, :id 48, :visibility_type :normal, :display_name "Appointment Dt", :base_type :type/DateTime}
 {:special_type :type/Number, :name "count", :source :aggregation, :extra_info {}, :display_name "count", :base_type :type/Integer})

09-11 11:07:08 DEBUG metabase.middleware :: POST /api/dataset 200 (6 s) (15 DB calls)
@manurana manurana changed the title Time zone issues Time zone issues in 26rc1 Sep 11, 2017
@manurana manurana changed the title Time zone issues in 26rc1 Time zone issues : another use case Sep 11, 2017
@manurana
Copy link
Author

Went through #5902 and the associated writeup on timezone troubleshooting. Seems like the case of "Dates without an explicit time zone are being converted to another day". The suggested solution, to explicitly set the server time zone, does not seem to work.

Also realised that not ALL timezone fixes are in 26RC1, so I will wait till the 26 release to test this again.

@senior
Copy link
Contributor

senior commented Sep 11, 2017

@manurana It looks like you have specified a report timezone of Asia/Calcutta, is that true? Nevermind, I see it in the log output 👍

A JVM timezone is always specified. If one is provided via user.timezone, that is used, if not, the system's timezone is used. If you are using a report timezone, the JVM timezone should no longer matter.

I think I found a separate issue causing your problems. This code uses a default formatter from Joda Time. This will use UTC and it needs to be aware of the report timezone you have specified. I'll get that fixed.

@manurana
Copy link
Author

Thanks @senior ! I hope the fix is able to solve all 3 of my issues ! And thanks for the clarification on the report timezone vs JVM timezone. Maybe a candidate for the timezone doc ?

@senior
Copy link
Contributor

senior commented Sep 11, 2017

This bug seems like it would cause 1 and 2. I'm not sure about 3. What do you mean that it should consider the 30 minute differences? The "Hour of the Day" bucketing puts things together that occur in the same hour, not the same half hour. Could you elaborate a bit more on what you are looking for here?

@senior senior added this to the 0.26 milestone Sep 11, 2017
@senior senior added Type:Bug Product defects Misc/Timezones labels Sep 11, 2017
@manurana
Copy link
Author

Let me try to clarify what I mean by half hour.

If a row/metric has a time of 2016-03-29 18:36:56 (UTC), it should be counted in in the hr 1 bucket of 2016-03-30 for the report in IST.

For the report in IST, the hr 1 bucket of the day for 2016-03-30 should start at 2016-03-29 18:30:00 UTC and end at 2016-03-29 19:29:59 UTC, and the bucket label should show this as 1 AM (or 12 AM, depending on whether the system is using the start or end time of the bucket to label the hour of day)

All this is pretty straightforward, and I am sure the system takes care of it. I am just trying to explain away the discrepancy between the manual SQL numbers and the report, because they are not aligning. It seems to be just another manifestation of the same bug.

@salsakran salsakran modified the milestones: 0.26, 0.26.1 Sep 15, 2017
@salsakran
Copy link
Contributor

bumping this since we're in the process of cutting an RC2 for 0.26 and it doesn't seem like we've made enough headway to be sure it'll be fixed in the next day or two. If it is, we'll include it in 0.26.

@alirezastack
Copy link

We have encountered the similar problem of ignoring +30M in pulses. We are in IRDT (+3.30)(Asia/Tehran). And as we want to send pulses at midnight, it is sent at 00:30.

Will it be fixed with the issue addressed above by @manurana, @salsakran or it is a new issue?

@senior
Copy link
Contributor

senior commented Sep 20, 2017

@alirezastack Your issue sounds different than this one. Is the issue that the pulses are sent at the wrong time, or is it at the dates/timestamps in the pulse (the data) are in the wrong timezone?

@salsakran
Copy link
Contributor

@alirezastack what is your server/docker timezone and reporting timezone?

@alirezastack
Copy link

@salsakran Server is in UTC and Metabase timezone setting is set to Asia/Tehran.

@salsakran salsakran added the Priority:P2 Average run of the mill bug label Sep 25, 2017
@pdiniz13
Copy link

pdiniz13 commented Oct 4, 2017

This seems to be a similar issue to #3584, but in that case the issue is with timestamp

@SuryaSankar
Copy link

SuryaSankar commented Oct 7, 2017

Facing the same issue.

Server Timezone: UTC
Metabase Reporting Timezone: Asia/Calcutta
Database: MySQL 5.6
Column type: MySQL Datetime

I need a query to find the number of carts which were checked out today. When I set up a Question for the same via GUI, I get the answer as 10.
But I can see from our internal dashboard that 15 carts were checked out today. On viewing the generated sql, I can see that it does not have any convert_tz offset specified in it.

SELECT count(*) AS `count`
FROM `cart`
WHERE ((`cart`.`created_for_testing` = FALSE
   AND `cart`.`is_checked_out` = TRUE) AND date(`cart`.`checked_out_on`) = date(now()))

On manually adding the convert_tz line, it works perfectly and gives 15 as output.

SELECT count(*) AS `count`
FROM `cart`
WHERE ((`cart`.`created_for_testing` = FALSE
   AND `cart`.`is_checked_out` = TRUE) AND date(convert_tz(`cart`.`checked_out_on`, 'UTC', 'Asia/Calcutta')) = date(now()))

Isn't the Report timezone setting in Admin Panel supposed to add this offset to queries automatically? If that can be done, it will help us a lot - as we can let everyone in the company set up any question they want. If not, they have to depend on dev team to manually repair the SQL queries to get the correct data.
And also - it seems like X-Ray Features work on Timeline graphs only when they are generated via GUI. When I manually tweak the SQL to add the timezone offset - the useful features like X-Ray, group by week/month etc are not available anymore.

Metabase is an amazing tool which has given us an unlimited number of dashboards and queries to use for free, as opposed to other commercial alternatives which have expensive quota limits. But this one drawback is preventing full adoption by our team. Hope to see it fixed soon.

@senior
Copy link
Contributor

senior commented Nov 21, 2017

@manurana I spent some time digging into this problem yesterday and I think I have a pretty good idea of what the problem is, but unfortunately I think it is not an easy fix.

The issue is that you are using a datetime column in MySQL which doesn't have an associated timezone. Whatever timezone you specify for the database session, the datetime will always be the same. You show an example query above where you are adjusting from the database timezone (UTC) to your specified report timezone (Asia/Calcutta). As a user/owner of the data, you know what timezone the data was originally stored in and you have an expectation around how it should be adjusted to the report timezone. Other users might have their database in UTC, but are storing the datetime fields in Asia/Calcutta. In that example, they would not want their datetime fields converted like you have above.

This issue here is that you have the knowledge around how that is stored. If Metabase tried to guess what your intent was (UTC vs. Asia/Calcutta) we could be wrong. The real fix is that we need to allow you to tell us how to interpret datetime fields. I have created #6439 to cover this work. I think if we have that, it will solve your problem as we would be able to implement convert_tz changes in a way that would work for you and other users as well.

Another option would be to switch datatypes. I'm not sure how easy this is for you. MySQL has another datatype called timestamp that is similar to datetime but always stores the data in UTC. With that, it will automatically adjust the timestamp values from UTC to whatever timezone the database session is currently in. Using timestamp fields with the report timezone setting I think will achieve your desired result.

@manurana
Copy link
Author

Hi @senior !

Other users might have their database in UTC, but are storing the datetime fields in Asia/Calcutta. In that example, they would not want their datetime fields converted like you have above.

Is this not a very unlikely use case ? I thought (and I might be wrong here) that the whole purpose of having a database in UTC was to make sure that your datetimes are also stored in UTC. And that is also recommended practice for apps (everything in UTC was the mantra given to me once :))

If we do consider this an outlier use case, then we do have all the information necessary to deal with this data. Any time stored in a UTC DB can be assumed to be in UTC. And the report timezone is what I specify in one/both of metabase UI or JVM settings (I still don't fully understand the interaction between them)

The new issue you have created would still be valid for the outlier use case. If we can have a simpler solution to cover the generic use case (where we assume that the datetimes in a UTC db are stored in UTC), it might benefit more people quicker. If its the same effort, then my argument is moot. Would love to hear your thoughts on it.

@camsaul camsaul modified the milestones: 0.27, 0.28 Nov 28, 2017
@salsakran salsakran modified the milestones: 0.28, 0.28.1 Feb 6, 2018
@salsakran salsakran added this to the 0.28.2 milestone Feb 9, 2018
@salsakran salsakran modified the milestones: 0.28.2, 0.29 Mar 20, 2018
@salsakran salsakran modified the milestones: 0.29, 0.29.1 May 1, 2018
@salsakran salsakran modified the milestones: 0.29.1, 0.29.2, 0.30 May 11, 2018
@salsakran salsakran modified the milestones: 0.30, 0.31 Sep 14, 2018
@camsaul camsaul modified the milestones: 0.31, 1.0 Nov 21, 2018
@salsakran salsakran modified the milestones: 0.32.0, 0.33 Apr 1, 2019
@camsaul camsaul modified the milestones: 0.33, 0.34.0 Sep 18, 2019
@camsaul camsaul modified the milestones: 0.34.0, 0.35.0 Jan 7, 2020
@camsaul camsaul modified the milestones: 0.35.0, 0.36.0 May 11, 2020
@camsaul camsaul removed this from the 0.36.0 milestone Jul 16, 2020
@NevRA NevRA added .Backend .Team/Querying and removed .Team/QueryProcessor(deprecated) Use .Team/Querying instead labels Aug 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

9 participants