Big Query data source #638

Merged: 7 commits, Feb 28, 2022

Conversation

@gruuya (Contributor) commented Feb 25, 2022

Add another remote data source plugin, this time for GCP's Big Query.

CU-26udw0h

@gruuya gruuya requested a review from mildbyte February 25, 2022 14:34
@gruuya gruuya self-assigned this Feb 25, 2022
credentials_schema: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "credentials_path": {
Contributor

This should support passing in credentials as a JSON string instead of a file path, since we won't be able to use a path when e.g. adding this data source from a Web form (and this schema is used to generate it).

The commandline ergonomics will be awkward but we can add a special from_commandline method that loads and injects the JSON file when invoked from sgr mount, e.g. https://github.com/splitgraph/splitgraph/blob/c37291267ad60d085703b4a3068a8f39a70d2d7d/splitgraph/ingestion/csv/__init__.py#L299-L305

@gruuya (Contributor Author), Feb 25, 2022

I've now added the JSON string optional credentials parameter, and implemented from_commandline conversion.
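
As an illustration of accepting either form of credentials, here is a minimal sketch. It is not the code merged in this PR: the helper name `resolve_credentials` and its keyword arguments are hypothetical, chosen to mirror the `credentials` and `credentials_path` keys discussed above, and the sketch assumes the raw JSON string takes precedence when both are given.

```python
import json
from typing import Any, Dict, Optional


def resolve_credentials(
    credentials: Optional[str] = None,
    credentials_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Return parsed credentials from a raw JSON string or from a file path.

    Hypothetical helper: the raw JSON string wins if both are supplied.
    """
    if credentials is not None:
        raw = credentials
    elif credentials_path is not None:
        # Command-line style: read the JSON document from disk.
        with open(credentials_path, "r") as credentials_file:
            raw = credentials_file.read()
    else:
        raise ValueError("Either credentials or credentials_path must be provided")

    # json.loads raises json.JSONDecodeError (a ValueError subclass) on bad input.
    parsed = json.loads(raw)
    if not isinstance(parsed, dict):
        raise ValueError("Credentials must be a JSON object")
    return parsed
```

A web form would call this with `credentials=...`, while a command-line wrapper could pass `credentials_path=...`.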

        },
        "dataset_name": {
            "type": "string",
            "title": "Big Query dataset",
Contributor

It's branded as BigQuery -- can you change it in the descriptions, as well as change the plugin name / package names to bigquery instead of big_query?

Contributor Author

Certainly, I was split about that as well.

Convert json file creds parameter to the raw param when present. Also, align all entity names to bigquery, without underscore.

@classmethod
def get_name(cls) -> str:
    return "Google Big Query"
Contributor

Suggested change
- return "Google Big Query"
+ return "Google BigQuery"


@classmethod
def get_description(cls) -> str:
    return "Query data in GCP Big Query datasets"
Contributor

Suggested change
- return "Query data in GCP Big Query datasets"
+ return "Query data in GCP BigQuery datasets"

credentials_schema: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "credentials": {
Contributor

There's (currently) no point in letting users of the JSONSchema (which is used in form generation) pass credentials via a path. I think this could be simplified to treat the credential string passed on the commandline as a path, and the one passed via __init__ as a JSON-serialized credential.

JSONSchema:

"credentials": {
    "type": "string",
    "title": "GCP credentials",
    "description": "GCP credentials in JSON format",
}

commandline:

$ sgr mount bigquery bq -o@- <<EOF
{
    "credentials": "/path/to/my/creds.json",
    "project": "my-project-name",
    "dataset_name": "my_dataset"
}
EOF

...

credentials = Credentials({})

# Pop the path exactly once: a second pop of the same key would raise KeyError.
with open(params.pop("credentials"), "r") as credentials_file:
    credentials_str = credentials_file.read()

credentials["credentials"] = credentials_str
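
For reference, the conversion suggested above can be written as a self-contained function. This is only a sketch under stated assumptions: `split_commandline_params` is a hypothetical name, plain dicts stand in for Splitgraph's `Params` and `Credentials` types, and the real `from_commandline` signature in the codebase may differ.

```python
from typing import Any, Dict, Tuple


def split_commandline_params(
    params: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Treat params["credentials"] as a file path and load its contents.

    Returns (params without the path, credentials holding the file contents).
    Hypothetical helper; plain dicts are used instead of Splitgraph types.
    """
    remaining = dict(params)  # copy so the caller's dict is not mutated
    credentials_path = remaining.pop("credentials")  # popped exactly once

    with open(credentials_path, "r") as credentials_file:
        credentials_str = credentials_file.read()

    return remaining, {"credentials": credentials_str}
```

With the `sgr mount` input shown above, `remaining` would keep `project` and `dataset_name`, and the returned credentials dict would hold the contents of the JSON file rather than its path.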

@gruuya gruuya merged commit b4a7e77 into master Feb 28, 2022
@gruuya gruuya deleted the add-big-query-data-source-cu-26udw0h branch February 28, 2022 09:42