Enable passing Datafusion session state to WriteBuilder #1187

Merged
merged 1 commit into from
Mar 1, 2023

@gruuya gruuya commented Feb 27, 2023

Description

The session state can keep some context pertinent to the input plan, so passing it allows for building useful TaskContexts for executing the plan.

This is especially the case when the input plan references another Delta table that has previously registered its object store, which needs to be accessible during the execution of the physical plan.
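To illustrate why the session state matters here, the following is a minimal, self-contained model of the situation. It is not the DataFusion or delta-rs API: every type and method name below (`SessionState`, `TaskContext`, `ScanPlan`, `WriteBuilder`, `with_input_plan`, `with_input_session_state`, `write_from_scan`) is a hypothetical stand-in chosen for illustration. The sketch only shows the general idea: a scan resolves its object store through the task context at execution time, so a writer that builds the context from a fresh, empty state cannot execute a plan whose store was registered elsewhere.

```rust
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical stand-in for an object store backing a table.
#[derive(Debug)]
struct ObjectStore {
    root: String,
}

/// Hypothetical stand-in for session state: object stores registered by URL.
#[derive(Default, Clone)]
struct SessionState {
    stores: HashMap<String, Arc<ObjectStore>>,
}

impl SessionState {
    fn register_store(&mut self, url: &str, store: Arc<ObjectStore>) {
        self.stores.insert(url.to_string(), store);
    }
}

/// Hypothetical task context, derived from whatever state the writer holds.
struct TaskContext {
    stores: HashMap<String, Arc<ObjectStore>>,
}

impl From<&SessionState> for TaskContext {
    fn from(state: &SessionState) -> Self {
        TaskContext { stores: state.stores.clone() }
    }
}

/// Hypothetical scan plan that looks up its store only when executed.
struct ScanPlan {
    table_url: String,
}

impl ScanPlan {
    fn execute(&self, ctx: &TaskContext) -> Result<String, String> {
        ctx.stores
            .get(&self.table_url)
            .map(|store| format!("scanning {}", store.root))
            .ok_or_else(|| format!("no object store registered for {}", self.table_url))
    }
}

/// Hypothetical writer that executes an input plan.
#[derive(Default)]
struct WriteBuilder {
    state: Option<SessionState>,
    input: Option<ScanPlan>,
}

impl WriteBuilder {
    fn with_input_plan(mut self, plan: ScanPlan) -> Self {
        self.input = Some(plan);
        self
    }

    fn with_input_session_state(mut self, state: SessionState) -> Self {
        self.state = Some(state);
        self
    }

    fn run(self) -> Result<String, String> {
        let plan = self.input.ok_or_else(|| "no input plan".to_string())?;
        // Without a caller-supplied state, fall back to an empty one.
        let state = self.state.unwrap_or_default();
        plan.execute(&TaskContext::from(&state))
    }
}

/// Usage: write from a scan, with or without passing the caller's state.
fn write_from_scan(pass_state: bool) -> Result<String, String> {
    let url = "memory://source";
    let mut state = SessionState::default();
    state.register_store(url, Arc::new(ObjectStore { root: url.to_string() }));

    let builder = WriteBuilder::default().with_input_plan(ScanPlan { table_url: url.to_string() });
    if pass_state {
        builder.with_input_session_state(state).run()
    } else {
        builder.run()
    }
}
```

In this model, `write_from_scan(false)` fails because the writer's default state has no store for the source table, while `write_from_scan(true)` succeeds because the registered store travels with the state into the task context.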

Related Issue(s)

Closes #1186

Curiously, if in the test I use delta-0.8.0-partitioned the final SELECT query fails with:

External(Execution("Failed to map column projection for field year. Incompatible data types Dictionary(UInt16, Utf8) and Utf8"))

I'm not sure yet why this happens, but I think it has nothing to do with this change.

Documentation


@wjones127 wjones127 left a comment

This looks sensible to me 👍

@wjones127

> I'm not sure yet why this happens, but I think it has nothing to do with this change.

Yeah that seems to be another issue. I created #1194 so we can look into it further.

@wjones127 wjones127 merged commit b4d4c12 into delta-io:main Mar 1, 2023
@gruuya gruuya deleted the write-from-plan-task-context branch March 1, 2023 08:02
@roeap
roeap commented Mar 1, 2023

This is in fact sensible; one word of caution, though. Conflict resolution requires us to gain some insight into the query execution, which we may need to collect as metrics.

So while we will have to manage sessions and plans, there is a good chance we will need to think more about the APIs.

chitralverma pushed a commit to chitralverma/delta-rs that referenced this pull request Mar 17, 2023
# Description
The session state can keep some context pertinent to the [input
plan](https://github.com/delta-io/delta-rs/blob/main/rust/src/operations/write.rs#L153-L157),
so passing it allows for building useful `TaskContext`s for executing
the plan.

This is especially the case when the input plan references another Delta
table that has previously registered its object store, which needs to
be accessible during the execution of the physical plan.

# Related Issue(s)
Closes delta-io#1186 

Curiously, if in the test I use `delta-0.8.0-partitioned` the final
`SELECT` query fails with:
```
External(Execution("Failed to map column projection for field year. Incompatible data types Dictionary(UInt16, Utf8) and Utf8"))
```
I'm not sure yet why this happens, but I think it has nothing to do with
this change.

# Documentation
Labels
binding/rust Issues for the Rust crate rust
Successfully merging this pull request may close these issues.

Writing from a Delta table scan using WriteBuilder fails due to missing object store