perf: Cloud Run CPU optimization, server-side pagination, GCS cache#4710

Merged
MarkusNeusinger merged 5 commits into main from perf/cloud-run-optimization on Mar 8, 2026

Conversation

@MarkusNeusinger (Owner)

Summary

  • Cloud Run CPU 2→1: Neither the frontend (nginx) nor the backend (uvicorn, async) needs 2 cores. The backend was also upgraded to the gen2 execution environment for better CPU and network performance.
  • Server-side pagination: /plots/filter now accepts limit and offset query params. Counts and totals are still computed from all filtered images. Fully backward-compatible — no params = all images.
  • GCS Cache-Control 1h→1d: All gsutil cp commands in impl-generate, impl-merge, and impl-repair workflows now set Cache-Control: public, max-age=86400.

Changed files

| File | Change |
| --- | --- |
| app/cloudbuild.yaml | CPU 2→1 |
| api/cloudbuild.yaml | CPU 2→1, add --execution-environment=gen2 |
| api/schemas.py | Add offset/limit fields to FilteredPlotsResponse |
| api/routers/plots.py | Add pagination query params, cache key update, slice logic |
| tests/unit/api/test_routers.py | 4 new pagination tests + cached mock update |
| .github/workflows/impl-generate.yml | Cache-Control header on gsutil cp |
| .github/workflows/impl-merge.yml | Cache-Control header on gsutil cp |
| .github/workflows/impl-repair.yml | Cache-Control header on gsutil cp |
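For reference, the workflow change takes roughly this form (the file and bucket names here are placeholders, not the actual paths from the workflows). gsutil's global `-h` flag sets response headers as object metadata on upload:

```shell
# Illustrative gsutil invocation with the new Cache-Control header; the source
# file and gs:// destination are placeholders. max-age=86400 lets browsers and
# CDNs cache the uploaded image for one day instead of one hour.
gsutil -h "Cache-Control: public, max-age=86400" cp plot.png gs://example-bucket/plots/
```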

Test plan

  • uv run ruff check passes on all changed Python files
  • uv run pytest tests/unit/api/test_routers.py — 101 tests pass
  • Verify /plots/filter?limit=10&offset=0 returns 10 images with correct total
  • Verify /plots/filter without params returns all images (backward compat)
  • Verify Cloud Run deploys successfully with new CPU/gen2 settings

🤖 Generated with Claude Code

MarkusNeusinger and others added 3 commits March 8, 2026 21:11
… API preconnect

- Lazy-load CatalogPage, LegalPage, McpPage, InteractivePage, DebugPage
- Switch react-syntax-highlighter to PrismLight with only Python grammar
- Add manualChunks to split vendor (React) and MUI into separate cached chunks
- Add preconnect hint for api.pyplots.ai

Closes #4704

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Added LazyFallback component for better user experience during lazy loading
- Updated router configuration to include HydrateFallback for lazy-loaded pages
- Improved manual chunking strategy in Vite config for better performance
- Reduce frontend Cloud Run CPU from 2 to 1 (nginx doesn't need 2 cores)
- Reduce backend Cloud Run CPU from 2 to 1, enable gen2 execution environment
- Add server-side pagination (limit/offset) to /plots/filter endpoint
- Increase GCS image Cache-Control from 1h to 1 day (86400s)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 8, 2026 20:52
codecov bot commented Mar 8, 2026

Codecov Report

❌ Patch coverage is 80.64516% with 6 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| api/routers/plots.py | 82.14% | 5 Missing ⚠️ |
| app/src/components/SpecTabs.tsx | 0.00% | 1 Missing ⚠️ |


- Replace .fn attribute access on decorated functions (no longer wrapped)
- Use list_tools() instead of removed get_tools() method

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI (Contributor) left a comment

Pull request overview

This PR focuses on performance optimizations across the deployment pipeline and runtime: reducing Cloud Run CPU allocation, adding pagination to the plots filtering endpoint, and improving caching behavior for generated assets uploaded to GCS.

Changes:

  • Reduced Cloud Run CPU allocation for frontend and backend; backend deployment now targets Gen2 execution environment.
  • Added limit/offset pagination support to /plots/filter (with response schema updates and new unit tests).
  • Increased GCS object cacheability by setting Cache-Control: public, max-age=86400 during workflow uploads/copies, and introduced frontend bundle-splitting / lazy-loading adjustments.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| app/cloudbuild.yaml | Reduce frontend Cloud Run CPU from 2 → 1. |
| api/cloudbuild.yaml | Reduce backend Cloud Run CPU from 2 → 1 and add --execution-environment=gen2. |
| api/schemas.py | Extend FilteredPlotsResponse with offset and limit. |
| api/routers/plots.py | Add limit/offset query params, pagination slicing, and cache key changes. |
| tests/unit/api/test_routers.py | Add unit tests covering pagination behavior and update cached response mock shape. |
| app/vite.config.ts | Add Rollup manualChunks configuration for bundle splitting. |
| app/src/router.tsx | Convert several routes to lazy-loaded route modules with a shared fallback. |
| app/src/components/SpecTabs.tsx | Switch syntax highlighter import to prism-light and register Python language explicitly. |
| app/index.html | Add `<link rel="preconnect">` for the API domain. |
| .github/workflows/impl-generate.yml | Upload artifacts with Cache-Control: public, max-age=86400. |
| .github/workflows/impl-merge.yml | Promote staging → production with Cache-Control: public, max-age=86400. |
| .github/workflows/impl-repair.yml | Upload repair artifacts with Cache-Control: public, max-age=86400. |


Comment thread api/routers/plots.py Outdated
Comment on lines 413 to 416
```python
cache_key = _build_cache_key(filter_groups, offset=offset, limit=limit)
try:
    cached = get_cache(cache_key)
    if cached:
```

Copilot AI commented Mar 8, 2026

Including offset/limit in the cache key means each paginated request creates a distinct entry in the in-process TTLCache (maxsize=1000). With real pagination (offset stepping through pages), this can quickly thrash/evict other cache entries and reduce overall cache effectiveness. Consider caching the unpaginated filtered result per filter_groups (or caching just the computed lookups/counts), then apply slicing after a cache hit so pages reuse the same cached payload.

Comment thread api/routers/plots.py Outdated
```diff
-        return "filter:all"
+        base = "filter:all"
     else:
         cache_parts = [f"{g['category']}={','.join(sorted(g['values']))}" for g in filter_groups]
```

Copilot AI commented Mar 8, 2026

Cache key construction currently depends on the incoming query parameter order because filter_groups is not normalized (groups/categories aren’t sorted before joining). That makes lib=matplotlib&plot=scatter and plot=scatter&lib=matplotlib produce different cache keys for the same logical filter, reducing cache hit rate. Consider sorting a copy of filter_groups by category (and optionally de-duping identical groups) when building the cache key so cache behavior is stable without changing response semantics.

Suggested change

```diff
-cache_parts = [f"{g['category']}={','.join(sorted(g['values']))}" for g in filter_groups]
+# Normalize group ordering so cache keys do not depend on query param order.
+normalized_groups = sorted(filter_groups, key=lambda g: g.get("category", ""))
+cache_parts = [f"{g['category']}={','.join(sorted(g['values']))}" for g in normalized_groups]
```
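A standalone illustration of why the normalization matters: with sorting by category, the same logical filter produces the same key regardless of query-param order. The `build_cache_key` wrapper and the `"filter:" + "&".join(...)` framing are assumptions for the example, mirroring the suggested change rather than quoting the real helper.

```python
# Hypothetical wrapper around the suggested normalization; only the sorting
# lines mirror the suggested change, the rest is illustrative scaffolding.
def build_cache_key(filter_groups: list[dict]) -> str:
    normalized = sorted(filter_groups, key=lambda g: g.get("category", ""))
    parts = [f"{g['category']}={','.join(sorted(g['values']))}" for g in normalized]
    return "filter:" + "&".join(parts)
```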

Comment thread app/vite.config.ts
Comment on lines +14 to +21
```typescript
build: {
  rollupOptions: {
    output: {
      manualChunks(id) {
        if (id.includes('node_modules/@mui/')) return 'mui';
        if (id.includes('node_modules/react/') || id.includes('node_modules/react-dom/') || id.includes('node_modules/react-router-dom/')) return 'vendor';
      },
    },
```

Copilot AI commented Mar 8, 2026

The PR description’s “Changed files” list and summary don’t mention these frontend bundle-splitting changes (manual Rollup chunking) even though they can affect load behavior and caching in production. Please update the PR description (and ideally the test plan) to include these frontend perf changes so reviewers know they’re intentional and can validate them during deploy smoke tests.

Address Copilot review feedback:
- Cache the full filtered result, apply pagination after cache hit
  (avoids cache thrashing with per-page entries)
- Sort filter groups by category in cache key so param order
  doesn't affect cache hits

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@MarkusNeusinger MarkusNeusinger merged commit b242d4f into main Mar 8, 2026
7 of 8 checks passed
@MarkusNeusinger MarkusNeusinger deleted the perf/cloud-run-optimization branch March 8, 2026 21:00