Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[IBIS] FEAT: Need to add Ibis support for various timestamp component operations in OmniSci #56

Open
xmnlab opened this issue Aug 13, 2019 · 8 comments

Comments

@xmnlab
Copy link
Contributor

xmnlab commented Aug 13, 2019

issue: ibis-project/ibis#1916

@xmnlab xmnlab self-assigned this Aug 13, 2019
@xmnlab xmnlab removed the blocked label Oct 29, 2019
@xmnlab xmnlab added the todo label Jan 22, 2020
@xmnlab xmnlab changed the title FEAT: Need to add Ibis support for various timestamp component operations in OmniSci [IBIS] FEAT: Need to add Ibis support for various timestamp component operations in OmniSci Mar 11, 2020
@xmnlab
Copy link
Contributor Author

xmnlab commented Mar 23, 2020

WIP PR: ibis-project/ibis#2149

@xmnlab
Copy link
Contributor Author

xmnlab commented Mar 27, 2020

EPOCH Extraction

it seems OmniSciDB and PostgreSQL Epoch function doesn't have the same result:

================================== omniscidb ===================================
SELECT EXTRACT(EPOCH FROM "timestamp_of_birth") AS tmp
FROM salary
................................................................................
result: [  868410120  1634067208 -1895243000 ... -1189941608  -424284520
 -1515721064]
=================================== postgres ===================================
SELECT CAST(EXTRACT(epoch FROM t0.timestamp_of_birth) AS INTEGER) AS tmp 
FROM salary AS t0
................................................................................
result: [1262307661 1293843661 1325379661 ... 1546258332 1577794332 1609416732]

but it seems OmniSciDB DATEEPOCH has a similar result (I am assuming that it truncate the datetime by the date part) :

================================== omniscidb ===================================
SELECT EXTRACT(DATEEPOCH FROM "timestamp_of_birth") AS tmp
FROM salary
................................................................................
result: [1262304000 1293840000 1325376000 ... 1546214400 1577750400 1609372800]

@xmnlab
Copy link
Contributor Author

xmnlab commented Mar 27, 2020

WEEK Extraction

It seems OmniSciDB Extraction by WEEK is not a ISOWEEK, as used by PostgreSQL and Pandas.

I have the same result for a ISO WEEK using the follow expression:

================================== omniscidb ===================================
SELECT
  CASE
    WHEN CAST(floor(((EXTRACT(DOY FROM "timestamp_of_birth") - (EXTRACT(ISODOW FROM "timestamp_of_birth") - 1 + 1)) + 10) / 7) AS SMALLINT) < 1 THEN CASE WHEN ((mod((((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) + floor((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) / 4)) - floor((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) / 100)) + floor((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) / 400), 7)) = 4) OR ((mod(((((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) - 1) + floor(((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) - 1) / 4)) - floor(((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) - 1) / 100)) + floor(((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) - 1) / 400), 7)) = 3) THEN 1 ELSE 0 END + 52
    WHEN CAST(floor(((EXTRACT(DOY FROM "timestamp_of_birth") - (EXTRACT(ISODOW FROM "timestamp_of_birth") - 1 + 1)) + 10) / 7) AS SMALLINT) > (CASE WHEN ((mod(((EXTRACT(YEAR FROM "timestamp_of_birth") + floor(EXTRACT(YEAR FROM "timestamp_of_birth") / 4)) - floor(EXTRACT(YEAR FROM "timestamp_of_birth") / 100)) + floor(EXTRACT(YEAR FROM "timestamp_of_birth") / 400), 7)) = 4) OR ((mod((((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) + floor((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) / 4)) - floor((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) / 100)) + floor((EXTRACT(YEAR FROM "timestamp_of_birth") - 1) / 400), 7)) = 3) THEN 1 ELSE 0 END + 52) THEN 1
    ELSE CAST(floor(((EXTRACT(DOY FROM "timestamp_of_birth") - (EXTRACT(ISODOW FROM "timestamp_of_birth") - 1 + 1)) + 10) / 7) AS SMALLINT)
  END AS tmp
FROM salary
................................................................................
result: [53 52 52 ...  1  1 53]
expected: [53 52 52 ...  1  1 53]
=================================== postgres ===================================
SELECT CAST(EXTRACT(week FROM t0.timestamp_of_birth) AS SMALLINT) AS tmp 
FROM salary AS t0
................................................................................
result: [53 52 52 ...  1  1 53]
expected: [53 52 52 ...  1  1 53]

using WEEK on omniscidb the result is different:

sql = '''
SELECT EXTRACT(WEEK FROM "timestamp_of_birth") AS tmp
FROM salary
'''
cursor2df(con['omniscidb'].con.execute(sql)).tmp.values

array([ 1, 1, 1, ..., 53, 53, 53])

another example: https://github.com/Quansight/ivan-sandbox/blob/master/ibis/poc/omniscidb-woy-pandas-comparison.ipynb

@xmnlab
Copy link
Contributor Author

xmnlab commented Mar 27, 2020

NANOSECOND Extraction

It seems, currently, pymapd use datetime object to return a date time data. So as datetime doesn't store nanosecond information, the result from pymapd lost the nanosecond information.

Maybe it could use np.datetime64 for that.

@kcpevey
Copy link

kcpevey commented Jul 24, 2020

@xmnlab The PR referenced above ibis-project/ibis#2149 is closed. Can you provide an update with links to the relevant PRs or upstream issues? Thanks!

@xmnlab
Copy link
Contributor Author

xmnlab commented Jul 26, 2020

@kcpevey
Copy link

kcpevey commented Jul 27, 2020

For this issue, the last thing to be implemented is nanoseconds. Ibis uses pymapd==0.23, this needs pymapd==0.24.0, but this is breaking the CI due to packaging issues. This is blocked until ibis-project/ibis#2256 is merged. Also, 0.24 changes the sql operation on omniscid.

@xmnlab
Copy link
Contributor Author

xmnlab commented Aug 13, 2020

it is still blocked until new pymapd is used by ibis. it depends on ibis-project/ibis#2256 (but the PR is closed)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants