
Gwright99/1338 isolate fusion permissions for gcp batch#1341

Open
gwright99 wants to merge 8 commits into master from gwright99/1338-isolate-fusion-permissions-for-gcp-batch

Conversation

@gwright99
Member

This PR provides updated, more granular guidance on storage permissions in GCP.

Effort originally focused on fixing the GCP Batch documentation (in response to a customer bug report), but the scope grew to also encompass Google Cloud (single VM) and GKE once their content was reviewed.

NOTE:

  • Updated permissions were verified against minimal access configurations used by @ejseqera during rounds of testing in GCP.

  • The existing structure offers no mechanism to define common guidance once, so I sacrificed DRY to stay aligned with the overall structure of the docs site. The same language is therefore repeated in multiple places, with minor differences where relevant.

- Fixes #1338
- Made permission requirements more granular for GCS Fuse & Fusion runs, and deprioritized the suggestion to use the overly powerful `roles/storage.admin` role.
@netlify

netlify Bot commented Apr 22, 2026

Deploy Preview for seqera-docs ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 3f83d81 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/seqera-docs/deploys/69f3a0c4448b490008384478 |
| 😎 Deploy Preview | https://deploy-preview-1341--seqera-docs.netlify.app |
To edit notification comments on pull requests, go to your Netlify project configuration.

@gwright99
Member Author

@schaluva -- Can you please carve out some time to do a few test runs with the permissions I've laid out? Based on the sources I've pulled from, I'm fairly confident in the changes within this PR, but it would be good to have independent, up-to-date verification.

@gwright99
Member Author

Reviewers (other than @schaluva): please hold off on review for now -- I may have mixed up some of the content between "Permissions needed by Platform" and "Permissions needed by Fusion when running in the GCP CE". We're sorting this out now.

@schaluva

The documentation only defines a single "custom service account" and tells users to assign it a set of permissions.

However, this should be defined as two separate service-account identities in GCP Batch. "Permissions needed by Platform" maps to the credentials SA (the JSON key in Platform Credentials) -- the orchestration layer: Batch API, Compute API, Logging read, and reading the pipeline log file back from GCS to display in the UI. Beyond that log read, it needs no object-level storage permissions.

Permissions needed by Fusion/GCSFuse maps to the head job SA (the "Head Job Service Account" field in the CE form) — that's the runtime layer. The VM and Nextflow both run as this SA. All GCS data permissions live here, not on the Platform credential.
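The split above can be sketched in a few lines. The permission sets below are collected from this thread only (not from an authoritative GCP reference), and the check simply confirms that object-mutating GCS permissions live on the head-job SA and never on the Platform credential:

```python
# Illustrative model of the two service-account identities; permission
# contents are taken from this PR discussion, not from official docs.

# Credentials SA (JSON key in Platform Credentials): orchestration layer,
# read-only storage access for log/report display plus QOL listing.
CREDENTIALS_SA = frozenset({
    "storage.objects.get",        # download task logs and report TSVs
    "storage.objects.list",       # enumerate work-dir subfolders
    "storage.buckets.get",        # resolve bucket location
    "storage.buckets.list",       # work-directory dropdown (QOL)
    "compute.zones.list",         # zone dropdown (QOL)
    "serviceusage.services.use",  # project API-quota consumption
})

# Head-job SA ("Head Job Service Account" field in the CE form): runtime
# layer. The VM, Nextflow, and Fusion all run as this identity.
HEAD_JOB_SA = frozenset({
    "storage.objects.get",
    "storage.objects.list",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.buckets.get",  # Fusion probes bucket metadata at mount time
})

MUTATING = {"storage.objects.create", "storage.objects.delete",
            "storage.objects.update"}
assert not (CREDENTIALS_SA & MUTATING), "Platform credential must stay read-only"
print("mutating perms on head-job SA only:", sorted(HEAD_JOB_SA & MUTATING))
```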

The one Fusion-specific addition to GCSFuse is storage.buckets.get — Fusion probes bucket metadata at mount time on every bucket it touches (work-dir, inputs, publishDir).

One gap worth flagging: roles/storage.objectUser is missing storage.buckets.get, so it's not sufficient for the Fusion work-dir case on its own. No predefined GCP role covers all five Fusion permissions cleanly — a custom role might be the right call here.
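The gap check is mechanical: required minus granted. The permission list attributed to `roles/storage.objectUser` below is an assumption for illustration -- confirm it against the IAM predefined-roles reference (or `gcloud iam roles describe roles/storage.objectUser`):

```python
# The five Fusion permissions named in this thread.
FUSION_REQUIRED = {
    "storage.objects.get",
    "storage.objects.list",
    "storage.objects.create",
    "storage.objects.delete",
    "storage.buckets.get",  # bucket-metadata probe at mount time
}

# Assumed contents of roles/storage.objectUser (object-level only; verify).
ROLES_STORAGE_OBJECT_USER = {
    "storage.objects.create",
    "storage.objects.delete",
    "storage.objects.get",
    "storage.objects.list",
    "storage.objects.update",
}

missing = FUSION_REQUIRED - ROLES_STORAGE_OBJECT_USER
print("missing from roles/storage.objectUser:", sorted(missing))
# -> ['storage.buckets.get']
```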

Data Explorer is a third functional permission set and should be documented separately. For Data Explorer to work, the principal needs: storage.buckets.list, storage.buckets.get, storage.objects.list, storage.objects.get, and storage.objects.create if file upload is required.
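One way to document that third set is as a capability-to-permission map; a principal's requirement is the union over the capabilities it needs. The grouping below follows this comment and is illustrative, not official:

```python
# Data Explorer capability -> permission map, per this discussion.
DATA_EXPLORER = {
    "discover": {"storage.buckets.list", "storage.buckets.get"},
    "navigate": {"storage.objects.list", "storage.objects.get"},
    "upload":   {"storage.objects.create"},  # only if file upload is required
}

def required_perms(capabilities):
    """Union of the permission sets for the requested capabilities."""
    perms = set()
    for cap in capabilities:
        perms |= DATA_EXPLORER[cap]
    return perms

print(sorted(required_perms(["discover", "navigate", "upload"])))
```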

@gwright99
Member Author

Findings

The Terraform solution (TODO: add link) I used to verify the permutations revealed that roles/storage.bucketViewer was not required by either Fusion or GCS Fuse to publish successfully to a standalone publication bucket.

I will remove those lines from the PR.

Evidence

Launchpad entries were run sequentially from the top.

*(Screenshots: Launchpad run results.)*

@gwright99
Member Author

@schaluva -- Made a few updates, but a few outstanding items remain.

  1. Removed roles/storage.bucketViewer from publish dir permissions guidance in GCP Batch and GKE.
  2. Split out Platform Service Account from (optional) Nextflow Service Account in GCP Batch.

Todo:

  1. Confirm the minimum GCS Bucket permissions for Platform to work.
  2. Map the list of API permissions you compiled today to corresponding IAM roles.

@gwright99
Member Author

gwright99 commented Apr 24, 2026

(Semi-verified) storage permissions for Seqera Platform SA:

CE Creation Screen (GCP Batch, GCP Cloud) -- Semi-Validated

  1. Google Batch -- Validated
    • Controlled testing reveals that neither storage.* permissions nor serviceusage.services.use are required to successfully create a new Batch CE and run hello-world.
    • QOL:
      • storage.buckets.list (for Work directory dropdown),
      • compute.zones.list
  2. GKE
    • Need permission to get GKE cluster details. Suspect:
      • container.clusters.list
      • container.clusters.get
      • serviceusage.services.use
    • TBD storage
  3. Single VM -- Validated

Pipeline Run Screen -- Investigation Required

| Permission | Fulfills | Scope |
| --- | --- | --- |
| storage.objects.get | Download task logs and report TSVs | Work-dir bucket & every bucket the pipeline emits to |
| storage.objects.list | Enumerate work-dir subfolders for task log resolution | Work-dir bucket & every bucket the pipeline emits to |
| storage.buckets.get | Resolve bucket location | Work-dir bucket & every bucket the pipeline emits to |

Data Explorer (Auto-discovery) -- Validated

| Permission | Fulfills | Scope |
| --- | --- | --- |
| storage.buckets.list | Enumerate all buckets in project | Project |
| serviceusage.services.use | Authorize project API-quota consumption | Project |

Data Explorer (Manual) -- Validated

| Permission | Fulfills | Scope |
| --- | --- | --- |
| storage.buckets.get | Validate bucket exists and read location | Project or Bucket |

Data Explorer (Read / Navigate) -- Validated

| Permission | Fulfills | Scope |
| --- | --- | --- |
| storage.objects.get | Fetch object metadata, previews, download content | Bucket(s) |
| storage.objects.list | Browse folders / paginate | Bucket(s) |

Assumes the credential also satisfies the Data Explorer Auto-discovery or Manual mechanism above.

Data Explorer (Create / Download) -- Validated

| Permission | Fulfills | Scope |
| --- | --- | --- |
| storage.objects.create | Upload | Bucket(s) |

Assumes the credential also satisfies the Data Explorer Auto-discovery or Manual mechanism, plus the Read / Navigate mechanism.

Studios -- Requires Validation

Piggybacks on Data-Link permissions. Ensure storage.objects.create and storage.objects.delete are granted so notebooks can be saved to buckets.

Match to Pre-Existing Roles -- Requires Validation

roles/storage.objectUser + roles/storage.legacyBucketReader is a close approximation of the necessary permissions, but it still grants slightly more than needed (e.g. storage.objects.update). A custom role is cleaner.
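A minimal sketch of that custom-role alternative, building a role body in the JSON shape the IAM API accepts. The title, ID, and exact permission list here are illustrative assumptions (seeded from the five Fusion permissions discussed above) -- trim to whatever validation confirms. The same fields can go into a YAML file passed to `gcloud iam roles create --file`:

```python
import json

# Hypothetical custom role; permission list taken from this thread's
# Fusion discussion, not from an authoritative minimal-permission audit.
role_body = {
    "title": "Seqera Fusion Storage (custom)",
    "description": "Minimal GCS permissions for Fusion runs",
    "stage": "GA",
    "includedPermissions": sorted({
        "storage.objects.get",
        "storage.objects.list",
        "storage.objects.create",
        "storage.objects.delete",
        "storage.buckets.get",
    }),
}

print(json.dumps(role_body, indent=2))
```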


Development

Successfully merging this pull request may close these issues.

GCP Batch permissions commingle GCS permissions and Fusion permissions
