Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dttm format #506

Closed
wants to merge 22 commits into from
Closed

Dttm format #506

wants to merge 22 commits into from

Conversation

yxjames
Copy link
Contributor

@yxjames yxjames commented May 24, 2016

Add support for custom datetime format for tables.

@joshwalters
Copy link
Contributor

This should help with the following issues: #440, #411, #397, #397, #185

@landscape-bot
Copy link

Code Health
Repository health decreased by 0.01% when pulling 4336f83 on joshwalters:dttm_format into eb3bfb5 on airbnb:master.

@landscape-bot
Copy link

Code Health
Repository health decreased by 0.13% when pulling 34b6278 on joshwalters:dttm_format into eb3bfb5 on airbnb:master.

@landscape-bot
Copy link

Code Health
Repository health increased by 0.10% when pulling eba364d on joshwalters:dttm_format into eb3bfb5 on airbnb:master.

@landscape-bot
Copy link

Code Health
Repository health increased by 0.14% when pulling 2ff5cc1 on joshwalters:dttm_format into eb3bfb5 on airbnb:master.

@landscape-bot
Copy link

Code Health
Repository health increased by 0.14% when pulling 5816d01 on joshwalters:dttm_format into eb3bfb5 on airbnb:master.

@coveralls
Copy link

coveralls commented May 24, 2016

Coverage Status

Coverage increased (+0.03%) to 82.197% when pulling 5816d01 on joshwalters:dttm_format into eb3bfb5 on airbnb:master.

@landscape-bot
Copy link

Code Health
Repository health increased by 0.14% when pulling 7360fd2 on joshwalters:dttm_format into dee4c34 on airbnb:master.

@coveralls
Copy link

coveralls commented May 24, 2016

Coverage Status

Coverage increased (+0.03%) to 82.197% when pulling 7360fd2 on joshwalters:dttm_format into dee4c34 on airbnb:master.

@landscape-bot
Copy link

Code Health
Repository health increased by 0.14% when pulling adcafad on joshwalters:dttm_format into 7d27692 on airbnb:master.

@landscape-bot
Copy link

Code Health
Repository health increased by 0.14% when pulling adcafad on joshwalters:dttm_format into 7d27692 on airbnb:master.

This reverts commit 6897c38.
@coveralls
Copy link

coveralls commented May 24, 2016

Coverage Status

Coverage increased (+0.03%) to 82.197% when pulling 7218303 on joshwalters:dttm_format into 7d27692 on airbnb:master.

@landscape-bot
Copy link

Code Health
Repository health increased by 0.14% when pulling 7218303 on joshwalters:dttm_format into 7d27692 on airbnb:master.

@joshwalters
Copy link
Contributor

All fixes are done, ready for review.

A new input area is present when adding a table. You can use it to provide the date time format to use when generating queries and when rendering the UI.



def upgrade():
try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

img
failing would result in a broken app, let's fail early in the migration script if anywhere

@mistercrunch
Copy link
Member

Wait should this be a table attribute or a column attribute? Also, I don't think we need it on the Druid side. If formatting is assumed to be heterogenous across tables it can be just as much across date like columns within a table.

@joshwalters
Copy link
Contributor

For some reason, Landscape check is not running. This seems to affecting all pull requests right now.

@coveralls
Copy link

coveralls commented Jun 8, 2016

Coverage Status

Coverage increased (+0.03%) to 81.887% when pulling 9c5419a on joshwalters:dttm_format into ad5507c on airbnb:master.

@coveralls
Copy link

coveralls commented Jun 9, 2016

Coverage Status

Coverage increased (+0.03%) to 81.893% when pulling 646542a on joshwalters:dttm_format into 8a579e2 on airbnb:master.

@tarekrached
Copy link

FWIW, we have tables with multiple, differently formatted dttm columns, so I'd love to see this at the column level. Seems like this could sit beside the is_dttm flag on table_columns.

@mistercrunch
Copy link
Member

I think there's still a glitch where depending on the type in the database, things might need to be casted by a database function or not. Say for Presto, I could have:

  1. a column that has a text date formatted as iso (that works with the current code)
  2. a column that has a text date formatted as MM/DD/YYYY (that works with the current code, yippee)
  3. a DATETIME column, so that somehow it has to look something like date_col = to_date(...) (this does not work with the current code)
  4. an INT column with an epoch (does that work currently?)

Somehow it would be great to be able to make it work in all these situations. 3 is important.

@alanmcruickshank
Copy link
Contributor

The last one would be really useful for us. Especially if it could be and
INT or DECIMAL with an epoch. Currently we do a lot of casting to get round
that.

On Thu, 16 Jun 2016 06:27 Maxime Beauchemin, notifications@github.com
wrote:

I think there's still a glitch where depending on the type in the
database, things might need to be casted by a database function or not. Say
for Presto, I could have:

  1. a column that has a text date formatted as iso (that works with the
    current code)
  2. a column that has a text date formatted as MM/DD/YYYY (that works with
    the current code, yippee)
  3. a DATETIME column, so that somehow it has to look something like date_col
    = to_date(...) (this does not work with the current code)
  4. an INT column with an epoch (does that work currently?)

Somehow it would be great to be able to make it work in all these
situations. 3 is important.


You are receiving this because you are subscribed to this thread.

Reply to this email directly, view it on GitHub
#506 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AEdFuGZ_1Dijowa2MsAJDzyOHlKrH2h-ks5qMN7ZgaJpZM4Ik_ru
.


Alan Cruickshank
E: alanmcruickshank@gmail.com
T: +44 770 6851825

@mistercrunch
Copy link
Member

For the record, we want the casting or transformation to be applied to the constant and not the column. It's possible today to create whatever column expression and have Caravel do the time filtering against that expression. This isn't ideal because the database optimizer gets thrown off and can't do partition pruning or use indexes on the column.

The ideal scenario is where the user can tell Caravel knows how to prepare the right side of the equation to use against that time-like column.

@tarekrached
Copy link

@mistercrunch, that sounds solid. I'm a bit confused about your earlier comment regarding dates formatted as MM/DD/YYYY. I can see that you can compare those without raising an error (lexical string to string), but short of explicitly declaring a dttm_format, I don't understand how the comparisons could be valid.

self.results = self.datasource.query(**query_obj)
self.query = self.results.query
df = self.results.df
if df is None or df.empty:
raise Exception("No data, review your incantations!")
else:
if 'timestamp' in df.columns:
df.timestamp = pd.to_datetime(df.timestamp, utc=False)
df.timestamp = pd.to_datetime(
df.timestamp, utc=False, format=timestamp_format)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this here as opposed to relying on pandas's to_datetime magic, but it breaks down if we need to use a database side CAST or TO_DATE of some form

"""Returns a string that the database flavor understands as a date"""
default = "'{}'".format(dttm.strftime('%Y-%m-%d %H:%M:%S.%f'))
tf = tf or '%Y-%m-%d %H:%M:%S.%f'
default = "'{}'".format(dttm.strftime(tf))
iso = dttm.isoformat()
d = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think this previous approach had shortcomings to start with. It assumed that the casting required could be defined on a per-database basis where it cannot since even within a single table you could represent dates in different fashion (epoch, string, native DATETIME, DATE or TIMESTAMP types...).

@mistercrunch
Copy link
Member

What about two fields instead of one in Column model:

  • python_date_format : if the date is stored as a string in the database, this field stores a python date format expression as defined by strftime (link to it!) to be able to interact with it
  • database_expression: if the Caravel datetime constants need to be altered or casted on the database server side, this field should contain the casting expression as in TO_DATE('{dttm}', 'YYYYMMDD'), where {dttm} will contain the datetime string, formatted based on the python_date_format specified above

Does that cover all scenarios?

Note that we'll need to maintain the current behavior as much as possible, either by pre-populating these fields based on current database defaults in the db migration script, or by leaving the field empty and applying the current default at runtime where the fields are empty.

@tarekrached
Copy link

Intriguing - I like the way this is going. Wouldn't the database_expression be more appropriate at the Database level?

Thank you for your time on this, @mistercrunch!

@yxjames
Copy link
Contributor Author

yxjames commented Jun 16, 2016

@tarekrached Yeah I think database_expression should be added into Database model

@axeisghost
Copy link
Contributor

Thanks! The reply is valuable and we are now working on the problems according to the suggestion from @mistercrunch . We are trying to make changes base on the modification we made before. The python_date_format will work with the way similar to the dttm_format before and the datebase_expression will be added soon. I guess this solution should solve most of the senarios.

@mistercrunch
Copy link
Member

Superseeded by #652

zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 17, 2021
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 24, 2021
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 25, 2021
zhaoyongjie pushed a commit to zhaoyongjie/incubator-superset that referenced this pull request Nov 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants