
fix(databricks/pyspark): unify timestamp/timestamp_ntz behavior #11142


Open · cpcloud wants to merge 1 commit into main from databricks-pyspark-timezones
Conversation

cpcloud (Member) commented on Apr 21, 2025

Closes #11137.
Closes #11062.

github-actions bot added the tests and pyspark labels on Apr 21, 2025
cpcloud force-pushed the databricks-pyspark-timezones branch from 3e06c79 to e0beb04 on Apr 21, 2025 at 13:26
github-actions bot added the polars label on Apr 21, 2025
cpcloud force-pushed the databricks-pyspark-timezones branch from e0beb04 to 479f888 on Apr 21, 2025 at 13:35
github-actions bot added the sql label on Apr 21, 2025
cpcloud force-pushed the databricks-pyspark-timezones branch from 479f888 to 8558f8b on Apr 21, 2025 at 13:42
github-actions bot added the ci, dependencies, and nix labels on Apr 21, 2025
cpcloud force-pushed the databricks-pyspark-timezones branch from 39f33fa to 64c5194 on Apr 21, 2025 at 16:49
github-actions bot added the postgres and bigquery labels on Apr 22, 2025
cpcloud force-pushed the databricks-pyspark-timezones branch 2 times, most recently from 39a1e8d to 0ead096 on Jun 2, 2025 at 17:30
cpcloud force-pushed the databricks-pyspark-timezones branch from 0ead096 to 98df77b on Jun 2, 2025 at 18:01
```diff
@@ -559,7 +559,7 @@ def _read_in_memory(source: Any, table_name: str, _conn: Backend, **kwargs: Any)

 @_read_in_memory.register("ibis.expr.types.Table")
 def _table(source, table_name, _conn, **kwargs: Any):
-    _conn._add_table(table_name, source.to_polars())
+    _conn._add_table(table_name, _conn.to_polars(source))
```
cpcloud (Member, Author) commented:

This was a hidden use of the default backend.
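For context, here is a minimal sketch of the distinction (illustrative only, not the polars backend's code): calling `.to_polars()` on an expression that isn't bound to a connection, such as a memtable, resolves through `ibis.options.default_backend`, whereas `_conn.to_polars(source)` pins execution to the connection that was passed into `_read_in_memory`.

```python
# Illustrative sketch, not the backend's code: an unbound expression executed
# via its own .to_polars() goes through the default backend, while
# con.to_polars(expr) executes on the explicit connection.
import ibis

t = ibis.memtable({"x": [1, 2, 3]})  # not tied to any connection

df_implicit = t.to_polars()          # resolves ibis.options.default_backend

con = ibis.polars.connect()          # an explicit polars connection
df_explicit = con.to_polars(t)       # executes on `con`, no hidden default
```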

```diff
@@ -525,6 +525,7 @@ def test_roundtrip_delta(backend, con, alltypes, tmp_path, monkeypatch):
     ["databricks"], raises=AssertionError, reason="Only the devil knows"
 )
 @pytest.mark.notyet(["athena"], raises=PyAthenaOperationalError)
+@pytest.mark.xfail_version(pyspark=["pyspark<3.4"], raises=AssertionError)
```
cpcloud (Member, Author) commented:

This indicates the biggest behavior change here: timestamps in PySpark 3.3 are now always treated as having no timezone, because that's all that version of PySpark supported. PySpark >= 3.4 supports timestamps both with and without timezones.
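A hedged sketch of what that means for type mapping, assuming PySpark >= 3.4 (where `TimestampNTZType` is available); the helper name below is hypothetical and this is not the backend's actual mapping code, just the idea described above.

```python
# Sketch only: timezone-aware ibis timestamps -> Spark TIMESTAMP (instant
# semantics); timezone-naive ibis timestamps -> Spark TIMESTAMP_NTZ.
# On PySpark 3.3, per the comment above, everything is treated as naive.
import ibis.expr.datatypes as dt
import pyspark.sql.types as pt


def spark_timestamp_type(dtype: dt.Timestamp) -> pt.DataType:  # hypothetical helper
    if dtype.timezone is not None:
        return pt.TimestampType()
    return pt.TimestampNTZType()


assert isinstance(spark_timestamp_type(dt.Timestamp(timezone="UTC")), pt.TimestampType)
assert isinstance(spark_timestamp_type(dt.timestamp), pt.TimestampNTZType)
```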

cpcloud (Member, Author) commented on Jun 2, 2025

I'm going to consider this a breaking change and merge it in for 11.0.

I'll release 10.6.0 this week as the last release of the 10.x series, and I'll start to merge the breakages after that.

cpcloud added the breaking change label on Jun 4, 2025
Labels: bigquery · breaking change · ci · dependencies · nix · polars · postgres · pyspark · sql · tests