Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Time grain support for unix-timestamp columns #1093

Merged
merged 3 commits into from
Sep 13, 2016
Merged

Time grain support for unix-timestamp columns #1093

merged 3 commits into from
Sep 13, 2016

Conversation

yejianye
Copy link
Contributor

Caravel has basic support for using unix-timestamp as time column when using 'epoch_s' or 'epoch_ms' in python date format. But those unix-timestamp columns don't work well with time grains such as hour/day/week/month, which makes them almost useless.

This PR adds time grain support for unix-timestamp columns. If a integer/real column stores unix timestamp data, you could toggle is temporal flag and set Python date format as epoch_s (or epoch_ms if in milliseconds). And now you could use it as time column in any time series visualization, and could select any time grain.

Below is a screenshot of using epoch_s to render a calendar view with example multi_format_time_series table
image

@yejianye
Copy link
Contributor Author

Also worth mentioning that, the other way to work with unix timestamps in Caravel is to create a custom column with a SQL expression converting timestamp to datetime. The main drawback for that approach is performance. We usually have index created on time columns, but It's impossible for DB to use index with the SQL function invoked on the left side of comparison in where clauses.

@xrmx
Copy link
Contributor

xrmx commented Sep 12, 2016

Patch looks good to me, i'd move the reference to the sqlite3 issue to the commit message instead of code as hopefully we can fix that in a timely way.

@yejianye
Copy link
Contributor Author

Thanks, @xrmx. I changed the code to use another datetime conversion function which won't be affected by literal_column escaping bug.

@@ -172,12 +172,9 @@ def get_df(self, query_obj=None):
raise Exception("No data, review your incantations!")
else:
if 'timestamp' in df.columns:
if timestamp_format == "epoch_s":
if timestamp_format in ("epoch_s", "epoch_ms"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the previous approach was more explicit and not relying on magic as much. If someone was to plot historical event in the far past or close to 1970 the magic (or pandas guessing the unit) might not hold as well.

I'd leave the file as is was if it works.

Copy link
Contributor Author

@yejianye yejianye Sep 12, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old way is more explicit, but won't work. That's because unix timestamp has been converted to a datetime type, so either unit='s' or unit='ms' will throw error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't explain very clearly in my previous comment.

Pandas doesn't need to guess whether it's in seconds or milliseconds here after my change, because the data has already be converted to datetime type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha

@mistercrunch
Copy link
Member

Apart from the inline comment I made this LGTM.

[outside the scope of this PR sidenote] we'll need to bring all the logic around different db engines in one place eventually as we're starting to have a fair amount of it all over the code base...

@yejianye
Copy link
Contributor Author

@mistercrunch I've replied your inline comments. There doesn't seem any further actions I need take. If my reply looks good to you, could you merge this branch?

Since we've already converted unix epoch to datetime type,
we shouldn't specify 'unit' parameter in pandas.to_datetime
@mistercrunch
Copy link
Member

Thanks for the PR!

@mistercrunch mistercrunch merged commit 2e6b4b1 into apache:master Sep 13, 2016
dennisobrien pushed a commit to dennisobrien/caravel that referenced this pull request Sep 19, 2016
* Add time grain support for time columnd in unix timestamp

* Fix datetime parsing for unix epoch

Since we've already converted unix epoch to datetime type,
we shouldn't specify 'unit' parameter in pandas.to_datetime

* Fix SQLite timestamp to datetime conversion
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.11.0 labels Feb 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.11.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants