Document Azure and GCP integration #8662
Comments
I am clearly not the right expert to write this documentation.
I was confused about where to start on GCP. I haven't dug too deep because I know there is no write support yet for pyiceberg (and I do not have Spark installed at all), but I did want to test a few things. I know I need a catalog, and it looks like the easiest one might be my PostgreSQL Cloud SQL database:
But I did not see a description of how to create the actual table. Does the database need to be named specifically? What is the schema of the database table? Does it get created automatically if I just create an "iceberg" database with an iceberg user and password? Since I am on GCP, I know of Dataproc as well, and I have explored Dataplex before, but I am not sure whether I can use that as a catalog for pyiceberg. Even if I can, I am not sure whether it has any benefit over the PostgreSQL approach above. I might be missing something. I am basing this on the following documentation: https://py.iceberg.apache.org/configuration/
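For reference, here is a minimal sketch of the PostgreSQL-backed catalog properties discussed above. It assumes the `type: sql` catalog and the SQLAlchemy-style `uri` property described on the pyiceberg configuration page. The helper name `sql_catalog_properties`, the host, and the credentials are hypothetical placeholders. As far as I can tell, the database name in the URI is arbitrary, and whether the catalog's bookkeeping tables are created automatically depends on the pyiceberg version, so treat this as a starting point rather than a confirmed recipe.

```python
from urllib.parse import quote_plus


def sql_catalog_properties(user: str, password: str, host: str,
                           database: str, port: int = 5432) -> dict:
    """Build properties for a pyiceberg catalog of type "sql" on PostgreSQL.

    Hypothetical helper for illustration; it is not part of pyiceberg.
    """
    # Percent-encode credentials so characters like "@" or ":" do not break the URI.
    uri = (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )
    return {"type": "sql", "uri": uri}


# Placeholder values; replace them with the Cloud SQL instance details.
props = sql_catalog_properties("iceberg", "p@ss", "10.0.0.5", "iceberg")
print(props["uri"])

# With pyiceberg and a PostgreSQL driver installed, the same properties
# can be handed to the catalog loader:
#   from pyiceberg.catalog import load_catalog
#   catalog = load_catalog("default", **props)
```

The same two keys can instead be written under `catalog: default:` in `.pyiceberg.yaml`, which keeps the password out of application code.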
This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in the next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.
This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.
We have nice documentation for AWS (https://iceberg.apache.org/docs/latest/aws/), which explains which catalog configurations and dependencies are needed when the warehouse path is on S3.
Similar documentation is expected for Azure and GCP, covering the dependent libraries and all the authentication methods (for example, Azure's access key, SAS token, and service principal).