Open-source CLI for deterministic website audits and OpenAI-powered strategy reports using the modern Responses API.
ai-website-audit-cli turns a public website URL into a structured audit package:
- Extraction JSON with the facts the tool found on the page.
- Deterministic audit JSON with transparent rule-based checks and scores.
- Markdown report generated locally or with OpenAI.
- Optional standalone HTML report for sharing with clients or teams.
- OpenAI response metadata JSON when the AI report is used.
The project is intentionally practical: developers, freelancers, agencies, indie hackers, and small businesses can run a quick website audit from the terminal, inspect the raw evidence, and produce a prioritized improvement plan.
The tool is MIT-licensed and designed to be easy to extend with new extractors, checks, prompts, and report formats.
Many AI audit tools jump directly from a URL to vague LLM advice. This project separates the process into layers:
- Facts first: extract visible, inspectable data from public HTML.
- Rules second: run deterministic checks that can be tested and improved.
- AI last: use OpenAI to write a deeper report grounded in the saved evidence.
This makes the output easier to trust and easier for contributors to maintain.
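The three layers can be pictured as a small pipeline. The sketch below is illustrative only: `PageFacts`, `Finding`, `run_rules`, and `build_evidence` are hypothetical names, not the project's actual API. It shows facts feeding deterministic rules, and both being serialized as the only evidence handed to the AI layer.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class PageFacts:
    """Layer 1: facts read from public HTML (hypothetical, trimmed-down shape)."""
    url: str
    title: str
    h1_count: int


@dataclass
class Finding:
    """Layer 2: the result of one deterministic rule."""
    rule: str
    severity: str
    message: str


def run_rules(facts: PageFacts) -> list[Finding]:
    """Pure function over facts: the same input always yields the same findings."""
    findings = []
    if not facts.title.strip():
        findings.append(Finding("title-present", "high", "Page has no title."))
    if facts.h1_count != 1:
        findings.append(
            Finding("single-h1", "medium", f"Expected one H1, found {facts.h1_count}.")
        )
    return findings


def build_evidence(facts: PageFacts, findings: list[Finding]) -> str:
    """Layer 3 input: the AI layer only sees this serialized evidence."""
    payload = {"facts": asdict(facts), "findings": [asdict(f) for f in findings]}
    return json.dumps(payload, indent=2, sort_keys=True)


facts = PageFacts(url="https://example.com", title="", h1_count=2)
findings = run_rules(facts)
evidence = build_evidence(facts, findings)
print(evidence)
```

Because the rules are plain functions over saved facts, they can be unit-tested without any network or API access.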
The AI layer uses the OpenAI Python SDK and the Responses API through client.responses.create(...).
Default configuration:
OPENAI_MODEL=gpt-5.5
OPENAI_MAX_OUTPUT_TOKENS=6000
OPENAI_REASONING_EFFORT=medium
OPENAI_STORE_RESPONSES=false
OPENAI_SERVICE_TIER=auto

The default model is gpt-5.5 because this repository is meant to demonstrate a current production-style OpenAI integration. For cheaper or faster local testing, use:

ai-website-audit audit https://example.com --model gpt-5.4-mini

The OpenAI call lives in one clear file:
src/ai_website_audit/openai_client.py
It sends:
- the extracted public website facts
- deterministic scores and checks
- the selected report language
- strict system instructions to avoid hallucinated claims
It saves:
- generated Markdown report
- response ID when available
- model used
- API endpoint name
- usage metadata when returned by the SDK
- latency
- reasoning effort
- response storage setting
Privacy-first default: OPENAI_STORE_RESPONSES=false.
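A rough sketch of how such a request and its metadata could be assembled is shown below. The helper names `build_request_kwargs` and `collect_metadata` and the instruction wording are assumptions for illustration; the real code in openai_client.py may be organized differently. The actual network call is left as a comment so the sketch runs without an API key.

```python
from types import SimpleNamespace


def build_request_kwargs(evidence_json, language="en", model="gpt-5.5",
                         max_output_tokens=6000, reasoning_effort="medium",
                         store=False, service_tier="auto"):
    """Assemble keyword arguments for client.responses.create(...)."""
    instructions = (
        f"Write the audit report in language '{language}'. "
        "Use only the evidence provided. Do not invent facts about the site."
    )
    kwargs = {
        "model": model,
        "instructions": instructions,
        "input": evidence_json,  # extracted facts plus deterministic checks
        "max_output_tokens": max_output_tokens,
        "store": store,  # privacy-first default: False
        "service_tier": service_tier,
    }
    if reasoning_effort:
        kwargs["reasoning"] = {"effort": reasoning_effort}
    return kwargs


def collect_metadata(response, kwargs, latency_seconds):
    """Gather the metadata fields listed above from a response object."""
    usage = getattr(response, "usage", None)
    return {
        "response_id": getattr(response, "id", None),
        "model": kwargs["model"],
        "endpoint": "responses",
        "usage": None if usage is None else {
            "input_tokens": getattr(usage, "input_tokens", None),
            "output_tokens": getattr(usage, "output_tokens", None),
        },
        "latency_seconds": round(latency_seconds, 3),
        "reasoning_effort": kwargs.get("reasoning", {}).get("effort"),
        "store": kwargs["store"],
    }


kwargs = build_request_kwargs('{"facts": {}}', language="de", model="gpt-5.4-mini")

# With the OpenAI SDK installed and OPENAI_API_KEY set, the call would be:
#   from openai import OpenAI
#   response = OpenAI().responses.create(**kwargs)
#   report_markdown = response.output_text
# A stand-in object is used here so the sketch runs offline.
fake_response = SimpleNamespace(
    id="resp_demo", usage=SimpleNamespace(input_tokens=10, output_tokens=20)
)
metadata = collect_metadata(fake_response, kwargs, latency_seconds=1.23456)
print(metadata)
```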
Extraction:
- public HTML fetch with configurable user agent
- redirect-aware final URL capture
- title and title length
- meta description and length
- canonical URL
- meta robots
- viewport tag
- HTML language
- H1/H2/H3 extraction
- internal and external links
- broken/empty/javascript link counting
- image count
- missing alt text count
- lazy-loaded image count
- images without dimensions count
- forms, inputs and buttons
- likely CTA detection
- Open Graph tags
- Twitter/X card tags
- JSON-LD schema type detection
- email and phone mentions
- word count and paragraph count
- readable text extraction
- text-to-HTML ratio
- HTML response size
- initial fetch time
- cookie/privacy/terms/pricing/testimonial/FAQ signals
- simple technology hints like WordPress, Shopify, Webflow, Next.js, React and GTM
- optional conservative same-domain crawl
Deterministic audit:
- SEO score
- content score
- UX and conversion score
- accessibility score
- trust score
- performance score
- overall score
- rule-based findings with severity
- quick wins
- risk flags
- opportunities
- multi-page duplicate title/meta checks
OpenAI report:
- OpenAI Responses API
- current model support, including gpt-5.5 and smaller model overrides
- reasoning effort option for supported models
- explicit max output token control
- storage control with --store-response/--no-store-response
- English and German prompt templates
- evidence-grounded report structure
- developer implementation checklist
Outputs:
- Markdown report
- HTML report
- extraction JSON
- deterministic audit JSON
- OpenAI response metadata JSON
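To make the extraction items above concrete, here is a minimal sketch that pulls a few of them (title, meta description, H1 text, image and missing-alt counts) out of an HTML string using only the standard library. `FactParser` is a hypothetical name, and the project's extractor.py may well rely on a different parser and cover far more signals.

```python
from html.parser import HTMLParser


class FactParser(HTMLParser):
    """Collects a handful of inspectable facts from an HTML document."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.h1 = []
        self.image_count = 0
        self.images_missing_alt = 0
        self._capture = None  # tag whose text is currently being collected
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "h1"):
            self._capture = tag
            self._buffer = []
        elif tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content") or ""
        elif tag == "img":
            self.image_count += 1
            if "alt" not in attrs:  # alt="" is allowed for decorative images
                self.images_missing_alt += 1

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._capture:
            # Collapse runs of whitespace into single spaces.
            text = " ".join("".join(self._buffer).split())
            if tag == "title":
                self.title = text
            else:
                self.h1.append(text)
            self._capture = None


html = """<html lang="en"><head><title>  Example
 Domain </title><meta name="description" content="Demo page."></head>
<body><h1>Hello <em>world</em></h1>
<img src="a.png"><img src="b.png" alt="Logo"></body></html>"""

parser = FactParser()
parser.feed(html)
print(parser.title, len(parser.title), parser.h1, parser.images_missing_alt)
```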
git clone https://github.com/your-username/ai-website-audit-cli.git
cd ai-website-audit-cli
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -e .

ai-website-audit inspect https://example.com --html

This produces a deterministic local audit and does not require an API key.
cp .env.example .env
# Add OPENAI_API_KEY to .env
ai-website-audit audit https://example.com --language en --html

ai-website-audit audit https://example.com \
  --language en \
  --model gpt-5.4-mini \
  --max-output-tokens 4000 \
  --html

ai-website-audit audit https://example.com --language de --crawl --max-pages 3 --html

ai-website-audit show-config

reports/
├── example-com-audit.md
├── example-com-audit.html
├── example-com-extraction.json
├── example-com-deterministic-audit.json
└── example-com-openai-response.json
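The file names above appear to be derived from the audited host. A minimal sketch of one way to do that follows; `slug_from_url` and `report_paths` are hypothetical helpers, and the project's own slug logic in utils.py may treat ports, paths, or a `www.` prefix differently.

```python
import re
from urllib.parse import urlparse


def slug_from_url(url):
    """Turn the URL's host into a filesystem-friendly slug."""
    host = urlparse(url).hostname or ""  # hostname is lowercased, port removed
    return re.sub(r"[^a-z0-9]+", "-", host).strip("-")


def report_paths(url, out_dir="reports"):
    """List the artifact paths for one audited URL."""
    slug = slug_from_url(url)
    suffixes = [
        "audit.md",
        "audit.html",
        "extraction.json",
        "deterministic-audit.json",
        "openai-response.json",
    ]
    return [f"{out_dir}/{slug}-{suffix}" for suffix in suffixes]


for path in report_paths("https://example.com"):
    print(path)
```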
Deterministic Audit Score
┏━━━━━━━━━━━━━━━━━┳━━━━━━━━━┓
┃ Area ┃ Score ┃
┡━━━━━━━━━━━━━━━━━╇━━━━━━━━━┩
│ Overall │ 84/100 │
│ SEO │ 86/100 │
│ Content │ 91/100 │
│ UX & Conversion │ 82/100 │
│ Accessibility │ 88/100 │
│ Trust │ 76/100 │
│ Performance │ 85/100 │
└─────────────────┴─────────┘
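One simple way to arrive at per-area scores like these is to start each area at 100 and subtract a fixed penalty per finding. The sketch below illustrates that idea only: the penalty values and the equal weighting are invented for the example and are not the tool's actual scoring rules, so it will not reproduce the table above.

```python
# Assumed penalty per finding severity; not the project's real values.
SEVERITY_PENALTY = {"high": 15, "medium": 8, "low": 3}


def area_score(findings):
    """Start at 100 and subtract a penalty for each finding, never below 0."""
    penalty = sum(SEVERITY_PENALTY[f["severity"]] for f in findings)
    return max(0, 100 - penalty)


def overall_score(area_scores, weights=None):
    """Weighted average of area scores; equal weights unless specified."""
    weights = weights or {area: 1.0 for area in area_scores}
    total_weight = sum(weights[area] for area in area_scores)
    weighted_sum = sum(score * weights[area] for area, score in area_scores.items())
    return round(weighted_sum / total_weight)


scores = {
    "seo": area_score([{"severity": "medium"}, {"severity": "low"}]),
    "trust": area_score([{"severity": "high"}, {"severity": "medium"}]),
}
print(scores, overall_score(scores))
```

Keeping the penalties in one table makes the scoring easy to audit and to adjust in a single place.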
Create .env from .env.example:
OPENAI_API_KEY=sk-your-key
OPENAI_MODEL=gpt-5.5
OPENAI_MAX_OUTPUT_TOKENS=6000
OPENAI_REASONING_EFFORT=medium
OPENAI_STORE_RESPONSES=false
OPENAI_SERVICE_TIER=auto
OPENAI_TEMPERATURE=
REQUEST_TIMEOUT_SECONDS=20
USER_AGENT=AIWebsiteAuditCLI/1.0.0 (+https://github.com/your-username/ai-website-audit-cli)

| Variable | Required | Default | Description |
|---|---|---|---|
| OPENAI_API_KEY | only for AI reports | none | OpenAI API key |
| OPENAI_MODEL | no | gpt-5.5 | Model for AI reports |
| OPENAI_MAX_OUTPUT_TOKENS | no | 6000 | Maximum output tokens |
| OPENAI_REASONING_EFFORT | no | medium | Reasoning effort for supported models |
| OPENAI_STORE_RESPONSES | no | false | Whether OpenAI may store response objects |
| OPENAI_SERVICE_TIER | no | auto | OpenAI service tier |
| OPENAI_TEMPERATURE | no | empty | Optional temperature override |
| REQUEST_TIMEOUT_SECONDS | no | 20 | HTTP timeout for website fetches |
| USER_AGENT | no | project UA | User agent used for public requests |
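As an illustration of how these variables and defaults could be read, here is a small standard-library sketch. `Settings` and `load_settings` are hypothetical names; the project's config.py may use a dotenv or settings library instead, and this sketch does not parse the .env file itself.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    api_key: Optional[str]
    model: str
    max_output_tokens: int
    reasoning_effort: str
    store_responses: bool
    service_tier: str
    temperature: Optional[float]
    request_timeout_seconds: float


def parse_bool(value: str) -> bool:
    """Treat common truthy spellings as True; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, falling back to the documented defaults."""
    temperature = env.get("OPENAI_TEMPERATURE", "").strip()
    return Settings(
        api_key=env.get("OPENAI_API_KEY") or None,
        model=env.get("OPENAI_MODEL", "gpt-5.5"),
        max_output_tokens=int(env.get("OPENAI_MAX_OUTPUT_TOKENS", "6000")),
        reasoning_effort=env.get("OPENAI_REASONING_EFFORT", "medium"),
        store_responses=parse_bool(env.get("OPENAI_STORE_RESPONSES", "false")),
        service_tier=env.get("OPENAI_SERVICE_TIER", "auto"),
        # An empty value means "no override", so the API default applies.
        temperature=float(temperature) if temperature else None,
        request_timeout_seconds=float(env.get("REQUEST_TIMEOUT_SECONDS", "20")),
    )


settings = load_settings({"OPENAI_TEMPERATURE": "", "OPENAI_STORE_RESPONSES": "false"})
print(settings)
```

Passing the environment in as a mapping keeps the loader easy to test without touching real environment variables.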
Local deterministic audit only:
ai-website-audit inspect https://example.com --html

Full audit. Uses OpenAI unless --no-ai is passed:

ai-website-audit audit https://example.com --language en --html

Important options:
--language en|de
--model gpt-5.5
--reasoning-effort low|medium|high|xhigh
--max-output-tokens 6000
--store-response / --no-store-response
--crawl
--max-pages 5
--no-ai
--html
--verbose

ai-website-audit show-config

Prints runtime config without exposing secrets.
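Printing configuration without exposing secrets usually means replacing sensitive values before display. The sketch below shows one way to do that; `printable_config` and the `SECRET_KEYS` set are assumptions for illustration and are not taken from the project's cli.py.

```python
# Names of settings whose values must never be printed (assumed list).
SECRET_KEYS = {"OPENAI_API_KEY"}


def printable_config(config):
    """Return a copy of the config that is safe to print."""
    safe = {}
    for key, value in config.items():
        if key in SECRET_KEYS:
            # Report only whether the secret exists, never its characters.
            safe[key] = "set (hidden)" if value else "not set"
        else:
            safe[key] = value
    return safe


config = {"OPENAI_API_KEY": "sk-your-key", "OPENAI_MODEL": "gpt-5.5"}
for key, value in printable_config(config).items():
    print(f"{key}={value}")
```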
ai-website-audit-cli/
├── src/ai_website_audit/
│ ├── analyzer.py # deterministic checks and scoring
│ ├── cli.py # Typer CLI commands
│ ├── config.py # environment settings
│ ├── extractor.py # HTML extraction and crawler
│ ├── models.py # dataclasses for extraction/audit data
│ ├── openai_client.py # OpenAI Responses API integration
│ ├── prompts.py # prompt assembly
│ ├── report.py # Markdown/HTML/JSON outputs
│ └── utils.py # URL, slug and text helpers
├── prompts/ # English and German prompt templates
├── docs/ # architecture, CLI, OpenAI integration, roadmap
├── examples/ # sample reports and JSON artifacts
├── tests/ # pytest test suite
├── .github/ # issue templates and CI workflow
├── AGENTS.md # instructions for Codex/AI coding agents
├── Dockerfile
├── Makefile
└── README.md
make install
make test
make inspect

Or manually:

pip install -e . pytest
pytest

docker build -t ai-website-audit-cli .
docker run --rm ai-website-audit-cli inspect https://example.com --html

For OpenAI-powered audits:

docker run --rm -e OPENAI_API_KEY=$OPENAI_API_KEY ai-website-audit-cli audit https://example.com --language en

See docs/roadmap.md. Good first issues include:
- add Lighthouse JSON import
- add sitemap.xml discovery
- add robots.txt inspection
- add broken-link verification mode
- add CSV export
- add JUnit/SARIF output for CI pipelines
- add more language prompt templates
- add plugin-style custom checks
Contributions are welcome. See CONTRIBUTING.md.
Good contributions include:
- new deterministic checks
- better extraction logic
- improved prompts
- more tests
- bug reports with reproducible URLs
- documentation improvements
- examples from real public websites
See SECURITY.md.
Important defaults:
- the tool only fetches public URLs provided by the user
- API keys are read from environment variables
- .env is ignored by git
- OpenAI response storage is disabled by default
- raw extraction JSON is saved locally so users can inspect what was sent to the AI layer
MIT. See LICENSE.