redesign: replace docs with new IA from Pixee-Marketing-OS PR #117 #256
Migrates 71 markdown pages from Pixee-Marketing-OS/10_execute_short_term/pixee_docs/ into a new 10-section information architecture, replacing the previous ~9 thin pages.
Content changes:
- 71 new pages across 10 sections: getting-started, platform, how-it-works, integrations, configuration, enterprise, languages, api, open-source, faq.
- New /integrations/contrast page authored to preserve coverage from the old /code-scanning-tools/contrast (PR #117 had no Contrast equivalent).
- Removes 9 stale top-level pages and the /code-scanning-tools/ section.
- Removes leftover Docusaurus demo src/pages/markdown-page.md.
Sidebar:
- Autogen sidebar with per-section _category_.json files providing track badges ([DEV] / [LEADER] / [BOTH]) in category labels and curated section ordering.
- Each section's overview page is set as the category landing via link.id.
- how-it-works uses a generated-index card list.
Frontmatter normalization (handled in migration):
- Numeric file prefixes dropped, replaced with sidebar_position in frontmatter.
- track field case normalized to lowercase across 71 files.
- Duplicate frontmatter keys deduped in 29 files (last-wins).
- meta_description renamed to Docusaurus-standard description field.
Redirects (docusaurus.config.js plugin-client-redirects, 17 rules):
- All old top-level URLs map to closest new equivalents.
- All /code-scanning-tools/* URLs map to new /integrations/* equivalents.
- Pre-existing /integrations alias rules updated to point at new IA.
SEO additions (config-only, no React components added):
- Site-wide Organization JSON-LD via headTags in docusaurus.config.js.
- docusaurus-plugin-llms generates llms.txt and llms-full.txt at build.
- static/robots.txt explicitly allows GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot-Extended, CCBot, OAI-SearchBot.
Verification:
- yarn build clean (72 docs processed, zero broken links).
- All 72 page slugs return 200 against yarn serve.
- Sidebar order matches the spec; track badges visible on category labels.
- Redirects emit correct meta-refresh + canonical link in build output.
Deferred to v2 (per scope agreement):
- React components: AudienceBadge, SchemaOrg, FeedbackWidget.
- Per-page FAQPage / HowTo JSON-LD.
- Raw .md alternates for AI agents.
- Algolia DocSearch and HubSpot lead capture.
Note: src/pages/index.js (the PixeeDocs hero landing) is unchanged in this PR. The new welcome page lives at /getting-started; whether to redirect / -> /getting-started or rework the React landing is a v2 decision.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
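The frontmatter normalization steps listed in this commit can be sketched roughly as follows. This is not the actual migrate.py; it is a minimal illustration that assumes flat `key: value` frontmatter and a numeric filename prefix of the form `03-` or `03_` (both assumptions), and the function name `normalize_page` is hypothetical.

```python
import re


def normalize_page(filename: str, text: str) -> tuple[str, str]:
    """Return (new_filename, new_text) for one markdown page.

    Assumes simple one-line `key: value` frontmatter between two '---' lines.
    """
    # Split a leading numeric prefix (assumed "NN-" or "NN_") off the filename.
    match = re.match(r"^(\d+)[-_](.+)$", filename)
    position = int(match.group(1)) if match else None
    new_name = match.group(2) if match else filename

    # Separate the frontmatter block from the body.
    parts = text.split("---\n", 2)
    if len(parts) < 3 or parts[0] != "":
        return new_name, text  # no frontmatter: leave the text untouched

    raw, body = parts[1], parts[2]
    fields: dict[str, str] = {}
    for line in raw.splitlines():
        if ":" not in line:
            continue
        key, value = line.split(":", 1)
        # Re-assigning an existing key overwrites it, so the last value wins.
        fields[key.strip()] = value.strip()

    # Rename meta_description to the Docusaurus-standard description field.
    if "meta_description" in fields:
        fields["description"] = fields.pop("meta_description")
    # Lowercase the track field.
    if "track" in fields:
        fields["track"] = fields["track"].lower()
    # Carry the dropped numeric prefix over as sidebar_position.
    if position is not None:
        fields["sidebar_position"] = str(position)

    frontmatter = "\n".join(f"{key}: {value}" for key, value in fields.items())
    return new_name, f"---\n{frontmatter}\n---\n{body}"
```

A real migration would more likely use a YAML parser; the hand-rolled parsing here only keeps the sketch dependency-free.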
Removes the pre-existing PixeeDocs React landing (src/pages/index.js + HomepageFeatures component) and promotes the welcome page to the site root by setting slug: / on docs/getting-started/getting-started.md. The "Getting Started" sidebar category now lands at / via its existing link.id reference; the welcome page's subpaths (/getting-started/github, /getting-started/gitlab, etc.) remain unchanged and still resolve.
Redirect updates in docusaurus.config.js:
- /intro, /installing, /supported-scms now point to / (was /getting-started)
- New rule: /getting-started -> / (catches stale links and old shares)
Body content updates: 5 internal markdown links from `](/getting-started)` rewritten to `](/)` so navigation goes directly to the welcome page rather than hitting the redirect chain.
Verification: yarn build clean. / returns "Welcome to Pixee" title; all 10 sidebar track-badged categories still render; spot-checked content pages return 200; legacy URLs redirect correctly via meta-refresh.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
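The body-link rewrite described above has one subtlety: links to the welcome page itself must change while subpaths such as /getting-started/github stay as they are. A minimal sketch of such a rewrite, assuming a regex-based pass (the helper name `rewrite_welcome_links` is hypothetical and this is not the script used in the PR):

```python
import re

# Matches a markdown link target that is exactly /getting-started, optionally
# followed by a trailing slash and/or an #anchor. Subpaths do not match.
WELCOME_LINK = re.compile(r"\]\(/getting-started/?(#[^)]*)?\)")


def rewrite_welcome_links(markdown: str) -> tuple[str, int]:
    """Return (rewritten_markdown, number_of_links_changed)."""
    return WELCOME_LINK.subn(
        lambda m: "](/" + (m.group(1) or "") + ")",
        markdown,
    )
```

Returning the substitution count makes it easy to compare against the expected number of changed links (5 in this commit).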
Removes the [DEV] / [LEADER] / [BOTH] suffixes from sidebar category labels. The track field stays in page frontmatter for potential v2 use (in-page audience badge), but the sidebar reads cleaner without them. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds migration/ at the repo root as a historical record of the 2026-05-05 docs redesign.
Contents:
- ASSESSMENT.md: planning and decision log, including the three-repo deploy flow, redirect table, SEO additions, and what was actually executed.
- migrate.py: one-shot Python script that ported PR #117 content into docs/docs/, normalized frontmatter, dropped numeric prefixes, generated _category_.json files.
- fixup_links.py: one-shot link-fixup pass that fixed 27 internal markdown links across 9 files after migrate.py.
- README.md: orientation for future readers, plus a clear DO-NOT-RE-RUN warning (migrate.py would wipe the manually-authored Contrast page and revert the welcome doc's slug: /).
Lives at the repo root rather than docs/migration/ so Docusaurus does not treat these files as published pages.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
b176d34 to 0c0b51f
Restructures the flat /integrations/<x> layout into two clean
subcategories that match the way the integrations actually divide:
SCM platforms (where Pixee delivers fixes) and scanning tools (where
findings come from). Moves URLs from /integrations/<x> to
/integrations/{scms,scanners}/<x> and adds redirects.
SCMs (4 pages) under docs/integrations/scms/:
- github.md (renamed from github-platform.md, content unchanged)
- gitlab.md (split out from scm-platform-reference.md)
- azure-devops.md (split out from scm-platform-reference.md)
- bitbucket.md (split out from scm-platform-reference.md)
Scanners (14 pages) under docs/integrations/scanners/:
- appscan.md, checkmarx.md, codeql.md, contrast.md, gitlab-sast.md,
semgrep.md, snyk-code.md, sonarqube.md, veracode.md (moved from flat)
- polaris.md, fortify.md (split out from commercial-scanners.md)
- trivy.md, defectdojo.md (split out from oss-aggregator-scanners.md)
- gitlab-sca.md (newly authored to match the SCA scope —
this content was missing from PR #117 and needs colleague review)
Removes three consolidated wrapper pages now that each scanner / SCM
has its own page: commercial-scanners.md, oss-aggregator-scanners.md,
scm-platform-reference.md.
Sidebar (autogen, no hand-built):
- /integrations/overview (sidebar_position: 1)
- /integrations/sarif-universal (sidebar_position: 2)
- Source Control subcategory (position: 3, generated-index landing)
- Scanning Tools subcategory (position: 4, generated-index landing)
Each subcategory gets a generated-index landing at /category/source-control
and /category/scanning-tools respectively, which renders a card list of the
pages inside.
Redirects (added to docusaurus.config.js):
- /integrations/<flat-scanner> -> /integrations/scanners/<x> (9 rules)
- /integrations/github -> /integrations/scms/github
- /integrations/{commercial-scanners,oss-aggregator-scanners,scm-platforms}
-> /integrations/overview
- Pre-existing /code-scanning-tools/* and /integrations/sonar redirects
retargeted to the new /integrations/scanners/<x> URLs.
Body content: 2 internal links updated from /integrations/codeql to
/integrations/scanners/codeql in the new github.md page.
Overview rewrite: integrations-overview.md updated to reflect the new
two-category structure, refreshed coverage matrix (13 scanners), and
new SCM links pointing at /integrations/scms/<x>.
Migration archive: migration/integrations_restructure.py captures the
mechanical operations (file moves, frontmatter updates, body-link
fixup, wrapper deletions) for posterity. Will not be re-run.
Verification:
- yarn build clean (77 docs processed; was 72 before this commit).
- yarn serve verified all 4 SCM pages, all 14 scanner pages, both
generated-index landings, 8 sample redirects, and sidebar order.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
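The flat-to-nested redirect rules listed in this commit follow a regular pattern, so they could be generated rather than typed by hand. The sketch below builds `{from, to}` pairs in the shape that the client-redirects plugin configuration uses; the function and constant names are hypothetical, and the slugs are taken from the commit message above.

```python
# Scanner pages that moved from /integrations/<x> to /integrations/scanners/<x>.
MOVED_SCANNERS = [
    "appscan", "checkmarx", "codeql", "contrast", "gitlab-sast",
    "semgrep", "snyk-code", "sonarqube", "veracode",
]

# Consolidated wrapper pages that were removed and now point at the overview.
RETIRED_WRAPPERS = ["commercial-scanners", "oss-aggregator-scanners", "scm-platforms"]


def build_integration_redirects() -> list[dict[str, str]]:
    """Build the redirect rules for the integrations restructure."""
    rules = [
        {"from": f"/integrations/{name}", "to": f"/integrations/scanners/{name}"}
        for name in MOVED_SCANNERS
    ]
    rules.append({"from": "/integrations/github", "to": "/integrations/scms/github"})
    rules += [
        {"from": f"/integrations/{name}", "to": "/integrations/overview"}
        for name in RETIRED_WRAPPERS
    ]
    return rules
```

Dumping the result with `json.dumps(build_integration_redirects(), indent=2)` gives a list that could be pasted into the config, assuming the config accepts plain object literals for its rules.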
daharmattan1
left a comment
Review: New IA Migration from PR #117
Reviewed across 7 dimensions: source fidelity, technical accuracy, redirect correctness, Docusaurus config, content tone, IA completeness, and migration archive.
Summary
Strong migration. All 71 source pages from PR #117 ported faithfully with correct frontmatter normalization. Redirects are comprehensive and all targets resolve. Content tone is clean — no marketing language leakage. The integrations restructure into scanners/ and scms/ subdirectories is a good IA improvement over the flat structure.
Blocking Issues
None.
Non-Blocking Issues
1. /running_on_public_github_repos redirect content gap (docusaurus.config.js)
The redirect to /configuration/repositories is functional but the target page doesn't cover the original content (step-by-step guide for running Pixee on public GitHub repos without tools). Neither does any other page in the new IA. Suggest adding a paragraph to /getting-started/github or the repositories config page covering this use case, or changing the redirect target to /getting-started/github which is a closer topical match.
Nits
2. Source sonarqube.md had duplicate frontmatter keys — PR correctly deduped title and slug that each appeared twice. Nice catch by the migration script. (No action needed, just noting.)
Dimension-by-Dimension Detail
Source Fidelity (5/5 pages sampled): fix-safety, security, agentic-security-engineering, sonarqube, enterprise-overview — all faithful. Body content identical. Frontmatter correctly normalized: meta_description → description, sidebar_position injected, track lowercased, duplicate keys deduped.
Technical Accuracy: Contrast page is well-structured, consistent with CodeQL and Semgrep integration pages. Tone is appropriate for docs.
Redirect Correctness (30+ rules validated): Every redirect's "to" target was confirmed to exist via slug frontmatter in the new file set. The expanded redirect set covers old top-level pages, code-scanning-tools/* → integrations/scanners/*, and flat integrations/<name> → integrations/scanners/<name> or integrations/scms/<name>. Comprehensive.
Docusaurus Config: Organization JSON-LD data looks correct. docusaurus-plugin-llms registered. _category_.json files sampled (integrations, how-it-works, platform) have correct labels, positions, and link references.
Content Tone (5 pages spot-checked): phased-rollout, faq-general, java, operations-config, commercial-scanners — all factual and neutral. No SEO keyword stuffing, no customer quotes, no JSON-LD in FAQ pages, no competitive comparison tables, no CTAs. Clean docs tone.
IA Completeness: Integrations restructured into scanners/ and scms/ subdirectories — good improvement. 5 new scanner pages added (DefectDojo, Fortify, GitLab SCA, Polaris, Trivy). Consolidated pages verified: operations-config.md covers scheduling + notifications + reporting.
Migration Archive: Located at repo root (migration/), not inside docs/. README has clear "Do not re-run" warnings with explanation of why scripts are destructive. ✅
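The redirect check described in this review (every target exists as a page slug) can be automated. The sketch below is an illustration only: it assumes the rules are available as `{from, to}` dictionaries and the known slugs as a set, and the function name `check_redirects` is hypothetical. It also flags chained redirects, where a target is itself a redirect source.

```python
def check_redirects(
    rules: list[dict[str, str]],
    slugs: set[str],
) -> tuple[list[str], list[str]]:
    """Return (missing, chained) lists of redirect sources.

    missing: sources whose target is not a known page slug.
    chained: sources whose target is itself redirected elsewhere.
    """
    sources = {rule["from"] for rule in rules}
    missing = [rule["from"] for rule in rules if rule["to"] not in slugs]
    chained = [rule["from"] for rule in rules if rule["to"] in sources]
    return missing, chained
```

Both lists being empty corresponds to the "all targets resolve" result reported above. Note that this checks existence only; whether a target covers the original page's content still needs a human read.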
  { from: "/open-pixee", to: "/open-source/overview" },
  {
    to: "/code-scanning-tools/overview",
    from: "/integrations",
The redirect works, but the original page had specific setup instructions for public repos without tools that don't exist anywhere in the new IA. Consider adding a paragraph to /getting-started/github covering this use case, or retargeting to /getting-started/github as a closer match.
Thanks for the close read across all 7 dimensions, Victor. On the non-blocking item: agreed. Pushing a follow-up commit to address it.
Will follow up here once the commit is in. The dedup nit needs no action; noted, thanks.
Addresses Victor's review feedback on PR #256.
The pre-migration site had a /running_on_public_github_repos page that walked new users through setting up Pixee on a public GitHub repo with no existing scanner: enable Issues for the dashboard, pick a free-tier scanner (CodeQL via GHAS or SonarQube Cloud), install Pixeebot. The initial migration redirected that URL to /configuration/repositories, which is functional as a redirect but does not actually cover the original use case.
This commit:
1. Adds a "Public Repositories Without an Existing Scanner" section to docs/getting-started/github.md covering the three steps (enable Issues, connect a free scanner, install Pixeebot), re-toned to match the new docs voice. Cross-links to the CodeQL and SonarQube scanner integration pages for deeper detail.
2. Retargets the redirect: /running_on_public_github_repos now points at /getting-started/github (was /configuration/repositories).
Verification: yarn build clean. Redirect HTML correctly points at the new target. New section renders in the production build.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Follow-up commit landed. CI green. Ready for another look when you have a minute.
Two redundancies surfaced by an overlap audit (>=70% Jaccard on 5-word shingles):
1. configuration/scheduling.md was a 90-line subset of the 249-line configuration/operations-config.md, both alive in the sidebar. The migration's operations-config consolidation was supposed to absorb scheduling but the standalone scheduling.md was never deleted. Removing it; the operations page covers everything it covered. Two inbound internal links (config-overview.md, sonarqube.md) repointed to /configuration/operations. Redirect added: /configuration/scheduling -> /configuration/operations.
2. platform/remediation.md and how-it-works/fix-safety.md shared three near-identical paragraphs about the independent fix evaluator (76%+ Jaccard). fix-safety.md is the canonical home (technical guide with the three-dimension rubric); the leader-track remediation page does not need the full detail. Replacing the three paragraphs in remediation.md with a one-paragraph summary that links to fix-safety.
Verification: yarn build clean.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
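For readers unfamiliar with the overlap metric named in this commit, here is a minimal sketch of Jaccard similarity over 5-word shingles. It is not the audit script itself: the tokenization (lowercased word characters) and the choice to compare whole strings are assumptions, and the audit may well have compared paragraphs rather than full pages.

```python
import re


def shingles(text: str, size: int = 5) -> set[tuple[str, ...]]:
    """Return the set of overlapping word n-grams (shingles) in the text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of two texts: |A ∩ B| / |A ∪ B| over their shingles."""
    set_a, set_b = shingles(a, size), shingles(b, size)
    if not set_a and not set_b:
        return 0.0  # neither text is long enough to form a shingle
    return len(set_a & set_b) / len(set_a | set_b)
```

A pair of passages would be flagged when `jaccard(a, b) >= 0.70`, matching the threshold quoted above.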
Summary
Replaces the existing ~9-page docs.pixee.ai with the 71-page redesigned IA from Pixee-Marketing-OS PR #117 (merged 2026-04-28).
- Frontmatter normalized: meta_description renamed to description, sidebar_position injected.
- Welcome page promoted to /. Pre-existing React landing (src/pages/index.js + HomepageFeatures component) deleted. Sidebar's "Getting Started" category now lands at root.
- /integrations/contrast authored from scratch (PR #117 dropped Contrast from the IA; we kept it because it's still in the new sidebar). Drafted in the new docs voice from public Contrast Security docs + the existing 4-line stub. Worth a careful read before merge.
- Redirects in docusaurus.config.js map every old URL to its closest new equivalent. Existing /integrations/* → /code-scanning-tools/* redirects flipped to point the new direction.
- Organization JSON-LD via headTags
- docusaurus-plugin-llms generates llms.txt + llms-full.txt at build
- static/robots.txt explicitly allows GPTBot, ClaudeBot, PerplexityBot, Google-Extended, etc.
- migration/ archive at repo root contains migrate.py, fixup_links.py, ASSESSMENT.md, and README.md: historical record only. Do not re-run.
Deferred to v2 (per scope agreement, not in this PR): in-page audience badge, <SchemaOrg> per-page JSON-LD (FAQPage / HowTo), raw-.md alternates for AI agents, Algolia DocSearch, HubSpot lead capture, GA4 custom events.
What to review
- docs/integrations/contrast.md: wholly new content, needs technical accuracy check
- docusaurus.config.js redirects: confirm /running_on_public_github_repos → /configuration/repositories is the right target, otherwise we should change to /getting-started/github
- migration/ASSESSMENT.md: captures all decisions and tradeoffs
Test plan
- yarn build clean: 72 docs processed, zero broken links
- yarn serve: all 72 page slugs return 200, all 10 category landings render with correct titles
- Redirects (/intro → /, /code-scanning-tools/sonar → /integrations/sonarqube, /faqs → /faq/general, etc.)
- /running_on_public_github_repos redirect target is correct
🤖 Generated with Claude Code