
hosting-cli(gcp): add --max-instances, --allow-unauthenticated, --env, --envfile #6557

Open
Kastier1 wants to merge 3 commits into main from simon/gcp-cloud-run-max-instances-auth

@Kastier1 Kastier1 commented May 22, 2026

Summary

Four new flags for `reflex deploy --gcp`:

`--max-instances` (`IntRange(min=1)`, default `100`): caps autoscaling. The default equals Cloud Run's own 100-instance default, so behavior only changes when the flag is set. A CLI-level check rejects `max < min` so users get a clear error instead of an opaque gcloud one.

`--allow-unauthenticated / --no-allow-unauthenticated` (default true): today the deploy script unconditionally publishes the service to `allUsers`. The negated form makes the service private — callers then need `roles/run.invoker` to reach it (or front with IAP / a load balancer).

`--env KEY=VALUE` (`multiple=True`): set environment variables on the Cloud Run service. Parsed by the existing `hosting.process_envs` helper (validates key format). Repeat for multiple, matching the existing `reflex deploy` and `reflex secrets update` UX.

`--envfile PATH`: reads a .env file via `dotenv_values` (lazy import + install hint, same pattern as `secrets.py`). When both `--envfile` and `--env` are passed, `--envfile` wins with a warning — same precedence as the existing flows.
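The precedence rule for `--env` and `--envfile` can be sketched roughly as follows. This is an illustrative, stdlib-only approximation and not the code in this PR: `parse_env_pairs` and `resolve_envs` are hypothetical names, the key regex is an assumed stand-in for whatever `hosting.process_envs` actually checks, and `envfile_values` stands for the dict that `dotenv_values` would return. Mapping a missing value to an empty string is also an assumption.

```python
import re
import warnings

# Assumed key format; the real validation lives in hosting.process_envs.
_KEY_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def parse_env_pairs(pairs):
    """Turn ["KEY=VALUE", ...] into a dict, rejecting malformed entries."""
    envs = {}
    for pair in pairs:
        # Split on the first "=" only, so values may contain "=" themselves.
        key, sep, value = pair.partition("=")
        if not sep or not _KEY_RE.match(key):
            raise ValueError(f"invalid --env entry: {pair!r}")
        envs[key] = value
    return envs


def resolve_envs(env_pairs, envfile_values=None):
    """Return the env dict to deploy; a provided envfile wins over --env."""
    if envfile_values is not None:
        if env_pairs:
            warnings.warn("--envfile is set; ignoring --env values", stacklevel=2)
        # A .env key without a value parses to None; treated as empty here.
        return {key: (value or "") for key, value in envfile_values.items()}
    return parse_env_pairs(env_pairs)
```

With both inputs present, `resolve_envs(["A=1"], {"B": "2"})` returns only the envfile contents and emits one warning.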

How env vars get to Cloud Run

The parsed dict is written to a YAML tempfile (`json.dumps` per value so quotes/backslashes/newlines round-trip), and the path is forwarded to the deploy script as `REFLEX_ENV_VARS_FILE`. The script hands it to `gcloud run deploy --env-vars-file=...`. Tempfile lifecycle is bound to `contextlib.ExitStack` — created only when envs are present, cleaned up afterward.
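A minimal sketch of that flow, assuming one `KEY: "value"` line per variable and keys that have already been validated. The helper names `render_env_vars_yaml` and `build_deploy_env` are hypothetical; only `REFLEX_ENV_VARS_FILE`, `json.dumps`, and `contextlib.ExitStack` come from the description above.

```python
import contextlib
import json
import os
import tempfile


def render_env_vars_yaml(envs):
    """One line per variable; json.dumps quotes and escapes each value."""
    return "".join(f"{key}: {json.dumps(value)}\n" for key, value in envs.items())


def build_deploy_env(envs, stack):
    """Write envs to a tempfile (only if there are any) and return the
    extra environment to pass to the deploy script."""
    if not envs:
        return {}
    tmp = tempfile.NamedTemporaryFile(
        "w", suffix=".yaml", delete=False, encoding="utf-8"
    )
    # Register cleanup first so the file is removed even if the write fails.
    stack.callback(os.unlink, tmp.name)
    with tmp:
        tmp.write(render_env_vars_yaml(envs))
    return {"REFLEX_ENV_VARS_FILE": tmp.name}


with contextlib.ExitStack() as stack:
    extra = build_deploy_env({"GREETING": 'say "hi"\nbye'}, stack)
    path = extra["REFLEX_ENV_VARS_FILE"]
    with open(path, encoding="utf-8") as fh:
        body = fh.read()
# Leaving the with-block runs the callback, so the tempfile no longer exists.
```

Binding the cleanup to the stack, rather than to the file object, keeps the file on disk for as long as the deploy script needs to read it.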

Help text on `--env` calls out that these become plain Cloud Run env vars (visible to anyone with `roles/run.viewer`) and points at Secret Manager for sensitive values — matches the security posture of the existing Reflex/Fly deploy flow.

Dry-run output now shows the env-vars YAML body so users can preview what's about to ship.

Requires the matching backend change so the deploy script honors the new env vars: reflex-dev/flexgen#3748

Test plan

  • `pytest tests/units/reflex_cli/v2/test_gcp.py` — 33 passed. New tests cover: max value forwarded, max default, max<min rejection, allow-unauth default, --no-allow-unauthenticated, --env tempfile, --envfile via dotenv, --envfile precedence over --env, YAML escaping of quotes/backslashes/newlines/empty, tempfile cleanup after run.
  • After flexgen PR merges: `reflex deploy --gcp --gcp-project p --max-instances 3 --no-allow-unauthenticated --env DB_URL=... --env FEATURE=on`; verify in the Cloud Run console that the service is private, capped at 3, and has the env vars.

🤖 Generated with Claude Code

Two new flags for `reflex deploy --gcp`:

- --max-instances (IntRange(min=1), default 100): caps autoscaling so
  cost-conscious deploys don't run open-ended against Cloud Run's 100-
  instance default. CLI-level validation rejects max < min so users
  get a clear error instead of an opaque gcloud one.

- --allow-unauthenticated / --no-allow-unauthenticated (default true):
  today the deploy script unconditionally publishes the service to
  allUsers. The negated form makes the service private — callers then
  need a roles/run.invoker IAM binding to reach it (or front it with
  IAP / a load balancer with IAM auth). Help text calls this out.

Forwarded as CLOUD_RUN_MAX_INSTANCES (string int) and
CLOUD_RUN_ALLOW_UNAUTHENTICATED ("true"/"false"). Requires the
matching backend change so the deploy script honors them.
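As a rough illustration of the validation and forwarding described above (not the PR's actual code): `cloud_run_settings` and its `min_instances` parameter are hypothetical names, and in the real CLI the lower bound of 1 is enforced by `IntRange(min=1)` rather than by hand.

```python
def cloud_run_settings(min_instances, max_instances=100, allow_unauthenticated=True):
    """Validate instance bounds and return the env vars for the deploy script."""
    if max_instances < 1:
        raise ValueError("max instances must be at least 1")
    if max_instances < min_instances:
        # Fail early with a readable message instead of letting gcloud reject it.
        raise ValueError(
            f"max instances ({max_instances}) must be >= min instances ({min_instances})"
        )
    return {
        # Both values are forwarded as strings, as environment variables must be.
        "CLOUD_RUN_MAX_INSTANCES": str(max_instances),
        "CLOUD_RUN_ALLOW_UNAUTHENTICATED": "true" if allow_unauthenticated else "false",
    }
```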

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Kastier1 Kastier1 requested a review from a team as a code owner May 22, 2026 16:44

greptile-apps Bot commented May 22, 2026

Greptile Summary

This PR adds four new flags to reflex deploy --gcp: --max-instances, --allow-unauthenticated/--no-allow-unauthenticated, --env, and --envfile. Env vars are serialized to a YAML tempfile managed by contextlib.ExitStack and forwarded to the deploy script via a new env var pointing to the file path.

  • --max-instances adds a cross-validation guard (max >= min) and defaults to 100, matching Cloud Run's own default.
  • --allow-unauthenticated defaults to true (preserving existing public behavior); the negated form sets the service private at the backend.
  • --env/--envfile mirror the existing reflex deploy precedence: envfile wins over env flags when both are supplied, with a warning. Values are JSON-encoded in the YAML output so quotes, backslashes, and newlines round-trip safely.

Confidence Score: 5/5

Safe to merge; the new flags are additive, defaults preserve existing behavior, and the tempfile lifecycle is correctly scoped.

All new code is additive. Defaults match Cloud Run's own defaults and don't change any existing behavior. The env-vars YAML tempfile is created only when needed and cleaned up reliably via ExitStack. The one inconsistency (envfile keys not validated against the env-var name regex) would produce a gcloud error rather than silent bad state.

packages/reflex-hosting-cli/src/reflex_cli/v2/gcp.py — specifically the _parse_envs envfile branch which lacks key-name validation.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/reflex-hosting-cli/src/reflex_cli/v2/gcp.py | Adds --max-instances, --allow-unauthenticated, --env, and --envfile flags; wires parsed env vars through a YAML tempfile managed by ExitStack. Minor inconsistency: --envfile keys are not validated against the env-var name regex the way --env keys are. |
| tests/units/reflex_cli/v2/test_gcp.py | Adds 11 new test cases covering all new flags, tempfile lifecycle, YAML escaping, envfile precedence, and edge cases. Coverage looks thorough. |

Reviews (2): Last reviewed commit: "hosting-cli(gcp): add --env / --envfile ..."

Comment on lines +173 to +174
type=click.IntRange(min=1),
help="Maximum number of Cloud Run instances during autoscaling (sets CLOUD_RUN_MAX_INSTANCES). Caps cost under traffic spikes.",

P2 No upper bound is enforced on --max-instances. Cloud Run's documented maximum is 1000 instances; values above that will produce an opaque error from gcloud at deploy time rather than a clear CLI message — inconsistent with the min/max cross-validation added in this PR. Adding max=1000 keeps the UX consistent.

Suggested change
- type=click.IntRange(min=1),
- help="Maximum number of Cloud Run instances during autoscaling (sets CLOUD_RUN_MAX_INSTANCES). Caps cost under traffic spikes.",
+ type=click.IntRange(min=1, max=1000),
+ help="Maximum number of Cloud Run instances during autoscaling (sets CLOUD_RUN_MAX_INSTANCES). Caps cost under traffic spikes.",

Contributor Author

Declining this one. 1000 is Cloud Run's default per-service cap, but it's a soft quota — customers can request it raised via Cloud Quotas. Hard-coding max=1000 in the CLI would lock anyone with a raised quota out of using the CLI for higher counts. IntRange(min=1) keeps the floor sane; let gcloud be the authority on the ceiling.

Comment thread packages/reflex-hosting-cli/src/reflex_cli/v2/gcp.py

codspeed-hq Bot commented May 22, 2026

Merging this PR will not alter performance

✅ 24 untouched benchmarks


Comparing simon/gcp-cloud-run-max-instances-auth (15ceb22) with main (d611a5d)

Add user-supplied env vars to `reflex deploy --gcp`, mirroring the
existing `reflex deploy` and `reflex secrets update` flows:
- `--env KEY=VALUE` (multiple=True): repeatable; parsed by
  `hosting.process_envs` (validates key format).
- `--envfile PATH`: reads a .env file via `dotenv_values`; lazy
  import with the same install-hint as secrets.py.
- When both are passed, --envfile wins with a warning (same precedence
  as the existing flows).

Implementation: the parsed dict is written to a YAML tempfile (via
json.dumps per value, so any string round-trips safely) and the path
is forwarded to the deploy script as REFLEX_ENV_VARS_FILE. The script
hands it to `gcloud run deploy --env-vars-file=...`. The tempfile's
lifecycle is bound to a contextlib.ExitStack so it's only created when
envs are present and always cleaned up afterward.

Dry-run output now shows the env-vars YAML body so users can preview
what's about to ship to Cloud Run.

Help text calls out that these become plain Cloud Run env vars
(visible to roles/run.viewer) and points at Secret Manager for
sensitive values — matches existing Reflex/Fly deploy semantics.

Companion backend PR (the script-side --env-vars-file support):
reflex-dev/flexgen#3748.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Kastier1 Kastier1 changed the title from "hosting-cli(gcp): add --max-instances and --allow-unauthenticated toggle" to "hosting-cli(gcp): add --max-instances, --allow-unauthenticated, --env, --envfile" May 22, 2026
@Kastier1

@greptileai please re-review — the PR has expanded to include --env/--envfile flags (matching the existing reflex deploy / reflex secrets update UX, reusing hosting.process_envs and dotenv_values), wired through a YAML tempfile to the companion backend's new REFLEX_ENV_VARS_FILE support (reflex-dev/flexgen#3748).

Per Greptile review on #6557 (P1 + security): if a user upgrades the
CLI before the matching flexgen backend ships, passing
--no-allow-unauthenticated would be silently no-op'd by the older
deploy script (which still hard-codes --allow-unauthenticated),
producing a PUBLIC service when the user explicitly asked for a
private one. That's exactly the fail-silent privacy flip we defended
against on the script side.

Add a CLI-side check: after fetching the manifest, if the user passed
--no-allow-unauthenticated but the fetched deploy_script doesn't
reference CLOUD_RUN_ALLOW_UNAUTHENTICATED, abort with a clear error
naming the missing backend support.
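The guard could look roughly like the sketch below. It assumes the fetched deploy script is available as a string; `ensure_private_deploy_supported` is a hypothetical name and the error wording is illustrative, not the message the CLI prints.

```python
def ensure_private_deploy_supported(deploy_script, allow_unauthenticated):
    """Abort if a private deploy was requested but the fetched script
    predates CLOUD_RUN_ALLOW_UNAUTHENTICATED support."""
    if allow_unauthenticated:
        return  # Public deploys behave the same on old and new scripts.
    if "CLOUD_RUN_ALLOW_UNAUTHENTICATED" not in deploy_script:
        raise RuntimeError(
            "--no-allow-unauthenticated was requested, but the deploy script "
            "does not reference CLOUD_RUN_ALLOW_UNAUTHENTICATED; refusing to "
            "deploy a service that would be public."
        )
```

A substring check is deliberately coarse: it cannot prove the script honors the variable, only that the script is new enough to know about it, which is sufficient to rule out the silent public deploy described above.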

Declining the companion P2 (IntRange max=1000 on --max-instances):
1000 is a soft default per-service cap that customers can raise via
quota request; hard-coding it client-side would lock out users with
raised quotas. Let gcloud be the authority.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>