
feat: lazy S3 bucket provisioning on first upload #969

Merged
pyramation merged 4 commits into main from feat/lazy-s3-bucket-provisioning
Apr 9, 2026
Conversation

@pyramation
Contributor

@pyramation pyramation commented Apr 9, 2026

Summary

Instead of requiring S3 buckets to exist before any upload can happen, the presigned URL plugin now lazily provisions S3 buckets on the first requestUploadUrl call for each database.

How it works:

  • An in-memory Set<string> (provisionedBuckets) tracks which S3 bucket names have been seen this server process
  • Before generating a presigned PUT URL, ensureS3BucketExists() checks the set → if absent, calls the configured ensureBucketProvisioned callback → marks the bucket as provisioned
  • The callback (createEnsureBucketProvisioned in presigned-url-resolver.ts) uses BucketProvisioner.provision() to create the bucket with correct CORS, privacy policies, etc.
  • Subsequent requests for the same bucket skip provisioning entirely (no S3 round-trip)
  • On server restart the set resets, but provision() is idempotent (handles "already exists")

This closes the gap where the SQL seed trigger creates logical bucket rows (public/private/temp) per user but doesn't create actual S3 buckets — those are now created transparently on first upload.

Follows up on PR #968 (per-database bucket support).

CORS origin resolution (per-database)

CORS allowedOrigins resolution for lazy provisioning follows a two-tier hierarchy, matching the bucket provisioner plugin's pattern:

  1. Per-database allowed_origins from the storage_module table row
  2. Global fallback from SERVER_ORIGIN env var (via getAllowedOrigins() helper, defaults to ['http://localhost:3000'] for local dev)

The StorageModuleConfig type and SQL query now include allowedOrigins, which is passed through ensureS3BucketExists → ensureBucketProvisioned → BucketProvisioner.provision(). Each provision() call receives the resolved origins, so different databases can have different CORS configurations.

The pre-existing hardcoded ['http://localhost:3000'] in BucketProvisionerPreset was also replaced with getAllowedOrigins().
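A minimal sketch of the two-tier resolution, assuming a getAllowedOrigins helper shaped as this PR describes (the config interface and fallback behavior are taken from the summary above, not the library source):

```typescript
// Per-database storage config; allowedOrigins maps to the
// storage_module.allowed_origins column and may be NULL.
interface StorageModuleConfig {
  allowedOrigins: string[] | null;
}

// Tier 2: global fallback. Wraps the single SERVER_ORIGIN string into an
// array, defaulting to localhost for local dev when unset.
function getAllowedOrigins(): string[] {
  const origin = process.env.SERVER_ORIGIN;
  return origin ? [origin] : ["http://localhost:3000"];
}

// Tier 1 wins when the per-database column is populated.
function resolveAllowedOrigins(config: StorageModuleConfig): string[] {
  if (config.allowedOrigins && config.allowedOrigins.length > 0) {
    return config.allowedOrigins;
  }
  return getAllowedOrigins();
}
```

As the checklist notes below, the single-origin wrap is a limitation if a deployment needs multiple global origins.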

Review & Testing Checklist for Human

  • Concurrent upload race condition: Two simultaneous requestUploadUrl calls for a new database will both attempt provision(). The in-memory Set has no mutex, so both will pass the isS3BucketProvisioned check. Verify that BucketProvisioner.provision() truly handles "BucketAlreadyOwnedByYou" / "BucketAlreadyExists" gracefully without throwing.
  • Error propagation: If ensureBucketProvisioned throws (e.g., IAM permissions missing), confirm the error surfaces clearly to the caller and that the bucket is NOT marked as provisioned in the Set (so retries work). The current code calls markS3BucketProvisioned after the await, so a throw should skip the mark — but verify.
  • Per-database allowed_origins: The lazy provisioner reads allowed_origins from the storage_module table. Verify that this column is populated correctly for your databases. When NULL, falls back to SERVER_ORIGIN env var. Note: the bucket provisioner plugin has a 3-tier hierarchy (bucket → storage_module → global) but the lazy provisioner only uses 2 tiers (storage_module → global) since we're creating a new bucket, not updating an existing one.
  • SERVER_ORIGIN env var: Confirm this is set in production/staging. When unset, falls back to ['http://localhost:3000']. The helper wraps the single server.origin string into an array — verify this is sufficient if you need multiple origins.
  • Test plan: Exercise the upload flow (requestUploadUrl) against a database that has never had an upload. Confirm the S3 bucket gets created automatically and the presigned URL works. Then upload again and verify no provisioning call is made (check logs for absence of "Lazy-provisioning" message).
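One possible mitigation for the concurrent-provision race flagged in the checklist is to cache the in-flight Promise per bucket, so simultaneous callers share a single provision() call. This is an illustrative sketch of that pattern, not code from this PR:

```typescript
// Dedup concurrent provisioning: callers for the same bucket await the
// same in-flight Promise instead of each calling provision().
const inFlight = new Map<string, Promise<void>>();
const provisioned = new Set<string>();

function ensureBucketOnce(
  bucket: string,
  provision: (bucket: string) => Promise<void>
): Promise<void> {
  if (provisioned.has(bucket)) return Promise.resolve();
  let pending = inFlight.get(bucket);
  if (!pending) {
    pending = provision(bucket)
      .then(() => {
        provisioned.add(bucket); // cache only on success
      })
      .finally(() => {
        inFlight.delete(bucket); // allow a retry after failure
      });
    inFlight.set(bucket, pending);
  }
  return pending;
}
```

With this shape, "BucketAlreadyOwnedByYou" handling in provision() becomes a safety net rather than the primary defense against the race.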

Notes

  • Added @constructive-io/bucket-provisioner as a direct dependency of graphile-settings (was already an indirect dep via graphile-bucket-provisioner-plugin).
  • Only requestUploadUrl gets lazy provisioning — confirmUpload and download URL generation are unchanged, per request.
  • The clearStorageModuleCache() function also clears the provisioned-buckets set, which is useful for testing.
  • No new tests were added for the lazy provisioning path. The existing presigned URL plugin tests pass, but integration testing against a real S3/MinIO instance is recommended.

Link to Devin session: https://app.devin.ai/sessions/e47513cf8b974ae6985c42c0a657e4d7
Requested by: @pyramation

When a presigned PUT URL is requested for a database whose S3 bucket
doesn't exist yet, the plugin now automatically provisions it with
the correct privacy policies, CORS rules, and lifecycle settings.

Changes:
- Add EnsureBucketProvisioned callback type to plugin options
- Add in-memory Set cache (provisionedBuckets) in storage-module-cache
  to track which S3 buckets are known to exist — no TTL needed since
  buckets are never deleted; resets on server restart (idempotent)
- Wire ensureS3BucketExists into requestUploadUrl before generating
  the presigned PUT URL — first request provisions, subsequent skip
- Add createEnsureBucketProvisioned factory in presigned-url-resolver
  that uses BucketProvisioner for full bucket setup
- Wire into ConstructivePreset with same allowedOrigins as bucket
  provisioner plugin
@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@socket-security

socket-security bot commented Apr 9, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

| Diff  | Package                        | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
|-------|--------------------------------|-----------------------|---------------|---------|-------------|---------|
| Added | @tanstack/react-query@5.90.21  | 99                    | 100           | 88      | 100         | 100     |

View full report

…ardcoding

Replace hardcoded ['http://localhost:3000'] with getAllowedOrigins() helper
that reads SERVER_ORIGIN from the env/config system. Falls back to
localhost for local dev when SERVER_ORIGIN is not set.

Also fixes the pre-existing hardcoded value in BucketProvisionerPreset.
The lazy provisioner now follows the same CORS resolution as the
bucket provisioner plugin:
  1. Per-database allowed_origins from storage_module table
  2. Global fallback from SERVER_ORIGIN env var

Changes:
- Add allowedOrigins to StorageModuleConfig type + SQL query
- Add allowedOrigins param to EnsureBucketProvisioned callback
- Pass storageConfig.allowedOrigins through ensureS3BucketExists
- In createEnsureBucketProvisioned, use per-database origins when
  available, fall back to getAllowedOrigins() (SERVER_ORIGIN env)
@pyramation pyramation merged commit a500e3b into main Apr 9, 2026
49 checks passed
@pyramation pyramation deleted the feat/lazy-s3-bucket-provisioning branch April 9, 2026 22:52
devin-ai-integration bot pushed a commit that referenced this pull request Apr 10, 2026
Adds a new integration test that exercises the full upload pipeline
for both public and private files using real MinIO:

- requestUploadUrl → PUT to presigned URL → confirmUpload
- Tests public bucket (is_public=true) and private bucket (is_public=false)
- Tests content-hash deduplication
- Uses lazy S3 bucket provisioning (from PR #969)
- Uses per-database bucket naming (from PR #968)

Includes seed fixtures (simple-seed-storage) that create:
- jwt_private schema with current_database_id() function
- metaschema tables (database, schema, table, field)
- services tables (apis, domains, api_schemas)
- storage_module config row
- storage tables (buckets, files, upload_requests)
- Two buckets: public and private