fix(seo): correct canonical URLs, compress oversized images, add cache headers #4168

Merged
emir-karabeg merged 2 commits into staging from improvement/seo
Apr 14, 2026
@emir-karabeg
Collaborator

Summary

  • Fix wrong canonical URLs across all public pages: https://sim.ai → https://www.sim.ai
  • Compress 6 oversized blog/landing images from about 2.6MB to roughly 300KB total (the per-file sizes listed under Image Compression sum to 2,604KB and 321KB)
  • Add Cache-Control headers for static assets
  • Add SEO regression test to prevent future canonical URL violations

Context

A third-party SEO audit flagged three issues: wrong canonical URLs making pages unindexable, oversized images (up to 1.1MB), and poor Core Web Vitals. This PR addresses all three.

Changes

Canonical URLs

  • Add SITE_URL constant (https://www.sim.ai) to lib/core/utils/urls.ts
  • Replace all hardcoded https://sim.ai references across 30+ files (layouts, pages, blog frontmatter, JSON-LD, RSS feeds, redirects)
  • Migrate models, integrations, and homepage metadata from getBaseUrl() to SITE_URL
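
A minimal sketch of how page metadata might derive its canonical URL from the constant. Only `SITE_URL` and its value come from this PR; `canonicalUrl` and `buildPageMetadata` are hypothetical helpers, and the returned object only mirrors the `alternates.canonical` and `openGraph.url` fields of Next.js metadata.

```typescript
// SITE_URL and its value are taken from the PR description.
export const SITE_URL = 'https://www.sim.ai'

// Hypothetical helper: builds an absolute canonical URL from a site-relative path.
export function canonicalUrl(path: string): string {
  // Ensure exactly one slash between the origin and the path.
  const normalized = path.startsWith('/') ? path : `/${path}`
  return `${SITE_URL}${normalized}`
}

// Hypothetical helper: the SEO-relevant part of a page's metadata object.
export function buildPageMetadata(path: string) {
  const url = canonicalUrl(path)
  return {
    alternates: { canonical: url },
    openGraph: { url },
  }
}
```

Because the constant never depends on the runtime environment, staging and preview builds would emit the same production canonical URL, which is presumably why `getBaseUrl()` is avoided for metadata here.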

Image Compression

  • mothership/cover: 1,122KB → 99KB (converted PNG → JPEG for photographic content)
  • v0-5/cover.png: 282KB → 66KB
  • executor/cover.png: 184KB → 81KB
  • series-a/cover.png: 123KB → 26KB
  • workflow.png: 109KB → 31KB
  • multiplayer-cover.png: 784KB → 18KB

Performance

  • Add Cache-Control: public, max-age=86400, stale-while-revalidate=604800 for static assets (images, fonts, icons)
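
For reference, a sketch of what such a rule could look like in `next.config.ts`. The header value is the one stated above; the extension list and the `source` pattern are assumptions (the pattern follows the shape of the example in the Next.js headers documentation), not the exact rule merged in this PR.

```typescript
// Durations in seconds, matching the header value in the PR description.
const ONE_DAY = 60 * 60 * 24 // 86400
const ONE_WEEK = ONE_DAY * 7 // 604800

const STATIC_ASSET_CACHE_CONTROL = `public, max-age=${ONE_DAY}, stale-while-revalidate=${ONE_WEEK}`

// Assumed extension list; the PR only says "images, fonts, icons".
const STATIC_EXTENSIONS = ['svg', 'jpg', 'jpeg', 'png', 'webp', 'ico', 'woff', 'woff2']

// Assumed rule shape, modeled on the Next.js docs example '/:all*(svg|jpg|png)'.
const staticAssetHeaderRule = {
  source: `/:all*(${STATIC_EXTENSIONS.join('|')})`,
  headers: [{ key: 'Cache-Control', value: STATIC_ASSET_CACHE_CONTROL }],
}

const nextConfig = {
  // Next.js calls headers() at build time to collect custom response headers.
  async headers() {
    return [staticAssetHeaderRule]
  },
}
```

With these values, a browser or CDN may serve a cached asset for one day without revalidating, and for up to seven further days while it refetches in the background.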

Regression Test

  • New seo.test.ts scans all public-facing directories for:
    • Hardcoded https://sim.ai without www
    • getBaseUrl() used in page metadata (should use SITE_URL)
    • SITE_URL constant value correctness
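
The first two checks can be sketched as a pure function over file contents. This is an illustration under assumptions, not the contents of seo.test.ts: `scanSource`, `Violation`, and the marker list are hypothetical, and the real test also walks directories and decides which files count as page metadata.

```typescript
// Hypothetical result shape for one offending line.
type Violation = { line: number; rule: 'bare-url' | 'get-base-url'; text: string }

// Assumed markers: a quote, backtick, or YAML canonical key followed by the bare host.
const BARE_URL_MARKERS = [
  '"https://sim.ai',
  "'https://sim.ai",
  '`https://sim.ai',
  'canonical: https://sim.ai',
]

// Scans one file's content and reports 1-based line numbers of violations.
export function scanSource(content: string): Violation[] {
  const violations: Violation[] = []
  content.split('\n').forEach((text, index) => {
    const line = index + 1
    if (BARE_URL_MARKERS.some((marker) => text.includes(marker))) {
      violations.push({ line, rule: 'bare-url', text: text.trim() })
    }
    if (text.includes('getBaseUrl()')) {
      violations.push({ line, rule: 'get-base-url', text: text.trim() })
    }
  })
  return violations
}
```

A test would then assert that `scanSource` returns an empty array for every scanned file, so a reintroduced bare URL fails CI with the file and line attached.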

Type of Change

  • Bug fix
  • Improvement

Testing

  • All 285 test files, 5,075 tests passing
  • Verified canonical URLs via curl on all flagged pages
  • Verified all compressed images render correctly in dev

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

…e headers

- Replace all hardcoded https://sim.ai with https://www.sim.ai via SITE_URL constant
- Migrate models, integrations, and homepage metadata from getBaseUrl() to SITE_URL
- Compress 6 blog/landing images from 2.6MB to 300KB total
- Convert mothership cover from PNG to JPEG (1.1MB → 99KB)
- Add Cache-Control headers for static assets (1d max-age, 7d stale-while-revalidate)
- Add SEO regression test scanning all public pages for canonical URL violations
@vercel

vercel bot commented Apr 14, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Skipped | Skipped | Apr 14, 2026 11:14pm |

@cursor

cursor bot commented Apr 14, 2026

PR Summary

Medium Risk
Touches SEO metadata/structured data across many public routes and content, which could affect indexing if any URL is miscomputed. Changes are mostly straightforward constant substitutions plus static-asset cache headers and should be easy to validate in production.

Overview
Standardizes all public-facing SEO URLs to use a single canonical base (SITE_URL = https://www.sim.ai) instead of hardcoded https://sim.ai or getBaseUrl() in metadata, updating Next.js metadataBase, canonical alternates, OpenGraph/Twitter URLs, JSON-LD, RSS, sitemap routes, and legacy redirects.

Adds a Vitest regression test (app/(landing)/seo.test.ts) that scans marketing/SEO surfaces to prevent reintroducing bare https://sim.ai and to ensure getBaseUrl() isn’t used for SEO metadata, and configures Cache-Control headers for static assets in next.config.ts.

Reviewed by Cursor Bugbot for commit 7d60d5e. Configure here.

@greptile-apps
Contributor

greptile-apps bot commented Apr 14, 2026

Greptile Summary

This PR fixes canonical URL mismatches (https://sim.ai → https://www.sim.ai) across 30+ public-facing files by introducing a SITE_URL constant, compresses six oversized blog/landing images, adds Cache-Control headers for static assets, and adds an SEO regression test. The changes are thorough and well-scoped: all changed files import and use SITE_URL correctly, and the MDX frontmatter canonical fields are updated to match.

Confidence Score: 5/5

Safe to merge — all remaining findings are P2 style suggestions with no correctness or security impact.

The PR is a thorough, well-executed SEO fix. SITE_URL is correctly introduced and applied across every affected file. The only finding is a P2 precision nit on the Cache-Control source glob pattern (missing escaped dot), which has no practical security impact given that authenticated file-serving routes always set their own Cache-Control headers explicitly.

apps/sim/next.config.ts — minor glob pattern precision for the Cache-Control source rule.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/lib/core/utils/urls.ts | Adds SITE_URL = 'https://www.sim.ai' constant; clean, well-documented addition alongside existing getBaseUrl(). |
| apps/sim/next.config.ts | Adds Cache-Control: public, max-age=86400, stale-while-revalidate=604800 for static asset extensions; the source glob pattern lacks a literal . before the extension group (minor precision concern), and redirects updated to www.sim.ai. |
| apps/sim/app/(landing)/seo.test.ts | New regression test scanning SEO-relevant directories for bare https://sim.ai references and getBaseUrl() in metadata exports; covers all scan paths from the PR and correctly excludes the test file itself. |
| apps/sim/lib/blog/seo.ts | All JSON-LD url and breadcrumb item fields migrated from hardcoded https://sim.ai strings to SITE_URL template literals; complete and consistent. |
| apps/sim/app/(landing)/components/structured-data.tsx | All @id, url, installUrl, and breadcrumb fields switched from hardcoded https://sim.ai to SITE_URL; thorough sweep of the large JSON-LD graph. |
| apps/sim/app/(landing)/models/[provider]/[model]/page.tsx | Switches baseUrl from getBaseUrl() to SITE_URL; also replaces a hardcoded href='https://sim.ai' with href='/' so the "Build with this model" CTA stays on the current environment in staging/dev. |
| apps/sim/ee/whitelabeling/metadata.ts | Keeps getBaseUrl for dynamic whitelabel tenant metadata while replacing the hardcoded https://sim.ai creator URL with SITE_URL; correct separation of concerns. |
| apps/sim/content/blog/mothership/index.mdx | Canonical URL updated to https://www.sim.ai/blog/mothership; ogImage switched from .png to .jpg to match the new compressed JPEG cover. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[lib/core/utils/urls.ts\nSITE_URL = 'https://www.sim.ai'] --> B[Metadata / metadataBase]
    A --> C[JSON-LD structured data\n@id, url, breadcrumbs]
    A --> D[Open Graph / alternates\ncanonical, og:url]
    A --> E[RSS & Sitemap feeds\nchannel link, loc]
    A --> F[next.config.ts\nredirects to www.sim.ai]

    B --> G[Next.js pages\nlayout.tsx, page.tsx]
    C --> H[structured-data.tsx\nblog/seo.ts]
    D --> I[blog / integrations\nmodels / partners pages]
    E --> J[rss.xml/route.ts\nsitemap-images.xml]

    K[Cache-Control headers\nnext.config.ts] --> L[Static assets\n.svg .jpg .png .woff etc.\nmax-age=86400 / swr=604800]

    M[seo.test.ts\nregression test] --> N{Scans SEO dirs}
    N --> O[Assert: no bare https://sim.ai]
    N --> P[Assert: no getBaseUrl in metadata]
    N --> Q[Assert: SITE_URL = https://www.sim.ai]

Reviews (2): Last reviewed commit: "fix(seo): replace hardcoded URLs with SI..." | Re-trigger Greptile

line.includes('"https://sim.ai/') ||
line.includes('`https://sim.ai/') ||
line.includes('`https://sim.ai`') ||
line.includes('canonical: https://sim.ai/')
SEO test misses bare URLs in MDX links

Low Severity

The regression test scans .mdx files for hardcoded https://sim.ai but only checks for URLs enclosed in JS quotes, backticks, or YAML canonical: prefix. It misses Markdown link syntax like (https://sim.ai) and bare URLs in template-literal XML. Three existing blog posts contain (https://sim.ai) in body content, demonstrating the blind spot is already producing false negatives on the first run.
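
One way to close this gap, sketched under assumptions: match the bare host anywhere on the line instead of only after a quote, backtick, or `canonical:` prefix. `BARE_SITE_URL` and `hasBareSiteUrl` are hypothetical names, not part of the PR.

```typescript
// Matches https://sim.ai wherever it appears, as long as the host really ends at "ai".
// The negative lookahead rejects longer hosts such as https://sim.aiden.dev;
// https://www.sim.ai never matches because "https://" is followed by "www".
const BARE_SITE_URL = /https:\/\/sim\.ai(?![A-Za-z0-9-])/

// Returns true when the line contains a bare (non-www) site URL in any syntax.
export function hasBareSiteUrl(line: string): boolean {
  return BARE_SITE_URL.test(line)
}
```

This would also flag body-content links in blog posts, so adopting it implies either updating those links or allowlisting them explicitly.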

Reviewed by Cursor Bugbot for commit 67a91e4. Configure here.

- Replace hardcoded https://www.sim.ai with SITE_URL in academy, changelog.xml, and whitelabeling
- Broaden getBaseUrl() detection in SEO test to match any variable name assignment
- Add ee/whitelabeling/metadata.ts to SEO test scan scope
@emir-karabeg
Collaborator Author

@cursor review

@emir-karabeg
Collaborator Author

@greptile

@cursor cursor bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

1 issue from previous review remains unresolved.

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 7d60d5e. Configure here.

@emir-karabeg emir-karabeg merged commit cbf0a13 into staging Apr 14, 2026
14 checks passed
@emir-karabeg emir-karabeg deleted the improvement/seo branch April 14, 2026 23:42