
fix(pyspark): unwind catalog/database settings in same order they were set #9067

Merged: 2 commits into ibis-project:main from pyspark_catalog_horribleness on Apr 30, 2024

Conversation

gforsyth (Member)

We set the catalog using a context manager, and we also set the database using a context manager. There is a weird edge case:

set catalog to comms_media_dev (succeeds)
set database to dart_extensions within comms_media_dev (succeeds)

Then we write the table, great! Now we try to change the catalog and database back in reverse order and...

set database back to default (the previously saved value) within comms_media_dev (fails: you do not have permission to access that database)

set catalog back to spark_catalog (or the previous value), but we never get here because of the previous error

So what we need to do is instead:

set catalog
set database
write table
set catalog back
set database back

It would be really great if Spark allowed setting both of these values at the same time, but that is apparently not a thing.

xref #9038
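For illustration, here is a minimal sketch of a context manager that unwinds in the order described above. This is not the actual ibis implementation; it assumes a Spark >= 3.4 session, where the `currentCatalog`/`setCurrentCatalog` APIs exist:

```python
from contextlib import contextmanager

@contextmanager
def active_namespace(session, catalog, database):
    # Sketch only: `session` is assumed to be a pyspark.sql.SparkSession.
    prev_catalog = session.catalog.currentCatalog()
    prev_database = session.catalog.currentDatabase()
    session.catalog.setCurrentCatalog(catalog)
    session.catalog.setCurrentDatabase(database)
    try:
        yield session
    finally:
        # Unwind in the same order the values were set: catalog first, then
        # database, so the database reset happens inside the original catalog
        # rather than inside the one we just wrote to.
        session.catalog.setCurrentCatalog(prev_catalog)
        session.catalog.setCurrentDatabase(prev_database)
```

With that ordering, the write happens inside comms_media_dev.dart_extensions, and the restore never tries to select default while still pointed at comms_media_dev.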

gforsyth (Member, Author)

I forgot to include the catalog in the arguments to self.table, so we were looking for the table in a different database from the one we had just created it in.
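A hedged sketch of what that fix looks like (names are illustrative, and the exact `table` signature should be checked against the ibis version in use):

```python
# Illustrative only: look the table up in the same catalog/database it was
# written to, assuming `table` accepts a (catalog, database) tuple for its
# `database` argument.
t = con.table("t", database=("comms_media_dev", "dart_extensions"))
```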

gforsyth force-pushed the pyspark_catalog_horribleness branch from 567cbc0 to bf52f9c on April 29, 2024 at 20:35
cpcloud added this to the 9.0 milestone on Apr 30, 2024
cpcloud merged commit 962ee00 into ibis-project:main on Apr 30, 2024
82 checks passed
markdruffel-8451

@gforsyth Just a heads up: this is now working for my use case (multiple catalogs). Thanks so much for all your help, and definitely feel free to hit me up if you ever need help testing the pyspark backend 👍

gforsyth deleted the pyspark_catalog_horribleness branch on April 30, 2024 at 17:43