Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: remove the get_data_catalog() function #1941

Merged
merged 2 commits into from
Dec 11, 2023

Conversation

rtyler
Copy link
Member

@rtyler rtyler commented Dec 5, 2023

This is related to #1713 and #1860 insofar that it removes the get_data_catalog() function. This is being done separately from the sub-crates work since it's easy enough to remove and frankly doesn't work properly anyways for Python users, which is the only caller we currently have in our codebase

Fixes #1860

This is related to delta-io#1713 and delta-io#1860 insofar that it removes the
`get_data_catalog()` function. This is being done separately from the
sub-crates work since it's easy enough to remove and frankly doesn't
work properly anyways for Python users, which is the only caller we
currently have in our codebase

Fixes delta-io#1860
@github-actions github-actions bot added binding/python Issues for the Python package binding/rust Issues for the Rust crate crate/core labels Dec 5, 2023
Copy link

github-actions bot commented Dec 5, 2023

ACTION NEEDED

delta-rs follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

@ion-elgreco
Copy link
Collaborator

ion-elgreco commented Dec 5, 2023

It was also quite limited in functionality.. for example unity catalog basically meant you still need storage level access which in most situations you won't get.

@rtyler are we going to plan to support this in the future again?

Before we can merge we probably should add a bunch of deprecation messages in Python since it was working before at least for unity catalog

@rtyler rtyler changed the title Remove the get_data_catalog() function fix remove the get_data_catalog() function Dec 5, 2023
@rtyler rtyler marked this pull request as ready for review December 5, 2023 17:14
@rtyler rtyler changed the title fix remove the get_data_catalog() function fix: remove the get_data_catalog() function Dec 5, 2023
@rtyler
Copy link
Member Author

rtyler commented Dec 5, 2023

@ion-elgreco I dunno, I strongly doubt people are using that functionality. The Unity catalog functionality is practically worthless IMHO since you have to have credentials for Unity catalog to lookup a table's location, and then a second set of credentials to actually access that storage.

IMHO it is not worth slowing changes for a couple reasons:

  • I think Unity is highly unlikely to be used by Python users in this way
  • We're not a 1.0 crate or wheel yet

I however defer to @wjones127 here since he's much more familiar with the Python ecossytem than I am.

I do need this to merge before #1825 because there's this presumption in this API that all available catalogs will be compiled into deltalake-core which is I need to remove for sub-crates.

Copy link
Collaborator

@roeap roeap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

as you mentioned the catalogs in python are somewhat non functional right now anyhow ...

@wjones127
Copy link
Collaborator

Given they are not functional and don't have a maintainer right now, I'm fine removing them for now.

#1466

@rtyler rtyler enabled auto-merge (rebase) December 11, 2023 17:13
@ion-elgreco
Copy link
Collaborator

We need to remove the python code paths as well.

@rtyler rtyler merged commit ef84cff into delta-io:main Dec 11, 2023
24 checks passed
@dhirschfeld
Copy link
Contributor

I'm trying/struggling to understand the current state of Unity Catalog support.

  • Is it true that UC support has been removed from 0.15?
  • If so, is it planned to bring it back in future so I could connect to a table via UC without having direct access to the S3 bucket?
  • If it is planned to bring it back, are there any estimates of when that will be - e.g. 2024? Or are there no firm plans to bring back UC support at this stage?

@ion-elgreco
Copy link
Collaborator

@dhirschfeld I would love to get UC support but this is all in the hands of databricks. They simply made it impossible to read data that is behind a catalog outside of databricks...

@dhirschfeld
Copy link
Contributor

dhirschfeld commented Jan 7, 2024

@dhirschfeld I would love to get UC support but this is all in the hands of databricks. They simply made it impossible to read data that is behind a catalog outside of databricks...

Well, that's disappointing 😞
I thought it might be possible as they have a lot of docs around managing storage credentials - e.g.

...but maybe that just gives access from within the DataBricks UI and doesn't actually expose the underlying storage token 😞

@ion-elgreco
Copy link
Collaborator

ion-elgreco commented Jan 7, 2024

@dhirschfeld One of the requirements of grabbing the storage credential is: have some permission on the storage credential. If you have access to databricks and a managed credential you can try it out and let me know.

But I am not sure that would be the same as read access to a certain table, so you may still need some form of high level access, and I assume the people who have access to the managed credential they likely already have access directly to the object store.

If you find another way, we can definetely add it! :) I'm personally also hampered by the fact that I can't read delta tables in UC with delta-rs

@dhirschfeld
Copy link
Contributor

If you find another way, we can definetely add it! :)

I don't know much about Databricks, and less about Unity Catalog, but I'll have a poke about to see what I can find. If you have been unable to find a way to do it, I'm not that hopeful, but we'll see...

I opened #2066 to track my progress! 😄

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
binding/python Issues for the Python package binding/rust Issues for the Rust crate crate/core
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add DataCatalog support
5 participants