add support for TIMESTAMPTZ #9220

Closed
1 task done
djouallah opened this issue May 21, 2024 · 4 comments
Labels
bug Incorrect behavior inside of ibis deltalake Issues or PRs related to deltalake support io Issues related to input and/or output

Comments

@djouallah

What happened?

DUNIT = DUNIT.cast({"SETTLEMENTDATE": "TIMESTAMPTZ"})
generates an error.

What version of ibis are you using?

10.0.0.dev71

What backend(s) are you using, if any?

DuckDB and pyspark

Relevant log output

SignatureValidationError: Schema({'SETTLEMENTDATE': 'TIMESTAMPTZ'}) has failed due to the following errors:
  `fields`: {'SETTLEMENTDATE': 'TIMESTAMPTZ'} is not matching GenericMappingOf(key=InstanceOf(type=<class 'str'>), value=CoercedTo(type=<class 'ibis.expr.datatypes.core.DataType'>, func=<bound method DataType.__coerce__ of <class 'ibis.expr.datatypes.core.DataType'>>), type=CoercedTo(type=<class 'ibis.common.collections.FrozenOrderedDict'>, func=<class 'ibis.common.collections.FrozenOrderedDict'>))

Expected signature: Schema(fields: FrozenOrderedDict[str, DataType])

Code of Conduct

  • I agree to follow this project's Code of Conduct
@djouallah djouallah added the bug Incorrect behavior inside of ibis label May 21, 2024
@ncclementi
Contributor

Hi @djouallah thanks for opening this issue.

Can you give us more context on what you are trying to achieve? If possible, a minimal reproducible example?

What's the original type of the SETTLEMENTDATE column, and what are you trying to cast it to?

You can create an example using a memtable, this way we can reproduce it and help you out.

import ibis
t = ibis.memtable([{"a": 1}, {"a": 2}])

@djouallah
Author

Alright, I am trying ibis for the first time. My approach: load some data, do the transformation, and write back to a Delta table. That works fine, except that there is a subtle difference in how Arrow and PySpark write timestamps: I end up with two data types, one with a time zone and the other without.

see example here : AnalysisException: [DELTA_FAILED_TO_MERGE_FIELDS] Failed to merge fields 'SETTLEMENTDATE' and 'SETTLEMENTDATE'

https://colab.research.google.com/drive/18JGPcj3VaJ2tcxDZx1XB7UufTnEPjRfu#scrollTo=51jIBhgTXzrS

@jcrist jcrist added deltalake Issues or PRs related to deltalake support io Issues related to input and/or output labels May 22, 2024
@ncclementi
Contributor

@djouallah I see there are two different issues here.

The one initially reported, about "TIMESTAMPTZ". To provide a timezone you need to use the Ibis timestamp dtype.

import ibis.expr.datatypes as dt
# then you can do something like
DUNIT = DUNIT.cast({"SETTLEMENTDATE": dt.Timestamp(timezone="UTC")})

The other issue, which you report in your second comment, might be a bug. What seems to be happening is that the on-disk type we write is currently backend-specific, and there appear to be inconsistencies between backends when writing to disk. However, the code provided is not a minimal reproducible example, as it requires downloading some data from the internet. Would you be able to provide a minimal reproducer (ideally with ibis.memtable()) so we can track this down?

@djouallah
Author

@ncclementi thanks a lot, it was just me not reading the documentation. Problem solved, all good. Thanks for your help.

3 participants