Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Add recommended GCP roles and privileges #220

Merged
merged 8 commits into from
Jul 19, 2024

Conversation

nj1973
Copy link
Collaborator

@nj1973 nj1973 commented Jul 18, 2024

No description provided.

@nj1973 nj1973 linked an issue Jul 18, 2024 that may be closed by this pull request
@nj1973 nj1973 marked this pull request as draft July 18, 2024 12:34
@nj1973 nj1973 marked this pull request as ready for review July 19, 2024 12:16
@nj1973 nj1973 requested a review from abb9979 July 19, 2024 12:16

### Service Account

A service account should be provisioned from the GCP project. This service account can be used by any service that will execute GOE commands, for example it could be attached to a GCE virtual machine.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assumes that readers know what GCP, GCE etc stand for (a fair assumption really). But is it worth defining them?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed, thanks

README.md Outdated
@@ -78,11 +78,11 @@ alter user goe_adm identified by ...;
alter user goe_app identified by ...;
```

# Building a custom package
# Building a Custom Package

If you want to test with the latest commits that have not yet been included in a Github release you can build a custom package from the repository.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GitHub

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

README.md Outdated
@@ -37,7 +37,7 @@ GOE_WHEEL="goe_framework-<goe-version>-py3-none-any.whl"
python3 -m pip install lib/${GOE_WHEEL}
```

## Configuration file
## Configuration File
Create `offload.env` in the Offload Home, this file contains the necessary configuration specific to your environment:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the Offload Home. This file contains...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


### Cloud Storage Bucket

A cloud storage bucket is required to stage data before ingesting it into BigQuery. Ensure the bucket is in a location compatible with the target BigQuery dataset.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use GCP and GCE above, do we need to use GCS given that this is BigQuery, and therefore GCS, specific? Else, "cloud storage" should be "Cloud Storage".

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


### Dataproc (Spark)

For non-trivially sized tables GOE uses Spark to copy data from the source database to cloud storage. In a GCP setting this is likely to be provided by one of two services:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For tables of a non-trivial size, GOE uses...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

--add-permissions=bigquery.tables.create,bigquery.tables.update
```

If GOE is permitted to drop tables then add these extra privileges:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extra

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

```

#### goe_bq_stg_role
Note that the role grant is bound to the staging BigQuery dataset (which has the same name as the target dataset but with an "_load" suffix), no project wide access is granted. The `bq` utility is used to grant the role because `gcloud` does not support these granular grants.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. typo: an "_load"
  2. ...suffix). No project-wide access...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

```
gcloud iam roles create goe_bq_app_role --project ${PROJECT} \
--title="GOE Data Update Access" \
--description="Grants GOE permissions to read and modify data in an application schema" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to say "dataset" instead of "schema"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

```
gcloud iam roles create goe_bq_stg_role --project ${PROJECT} \
--title="GOE Data Staging Access" \
--description="Grants GOE permissions to manage objects in a staging schema" \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to say "dataset" instead of "schema"?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

```

## Compute Engine Virtual Machine
Values supplied below are examples only, changes will likely be required for each use case:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...examples only. Changes are likely to be...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@nj1973
Copy link
Collaborator Author

nj1973 commented Jul 19, 2024

Ready for re-review. Thanks

@nj1973 nj1973 merged commit 202f9bb into main Jul 19, 2024
7 checks passed
@nj1973 nj1973 deleted the 208-recommended-gcp-roles-and-privileges branch July 19, 2024 15:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Document recommended GCP roles and privileges
2 participants