Changelog

All notable changes to this project are documented in this file.

The format is based on Keep a Changelog,
and this project adheres to Semantic Versioning.

[0.6.4] - 2026-06-03

Added

WS-26 — Adaptive anti-bot identity rotation (new src/identity.rs module)
- 12-identity pool (4 browser families × 3 platforms) for adaptive rotation
- IdentityProfile::shuffled_headers() produces seed-deterministic header order
- IdentityPool::rotate_on_block() implements a 5-level cascade: same identity → same family/different platform → different family/same platform → different family+platform → random
- BrowserFamily and Platform enums with canonical English names
- 5 unit tests covering pool size, cascade level, determinism, header shape, tag stability
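
The five-level cascade listed above can be pictured roughly as follows. This is a minimal sketch, not the code in src/identity.rs: the `Family`, `Platform`, and `Identity` types and the `build_pool` and `next_identity` functions are hypothetical names chosen for illustration, and level 5 ("random") is replaced by a caller-supplied index so the sketch stays deterministic and free of external crates.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Family { Chrome, Edge, Firefox, Safari }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Platform { Windows, Mac, Linux }

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Identity { family: Family, platform: Platform }

/// Builds the 4 x 3 = 12 identity pool (hypothetical layout).
fn build_pool() -> Vec<Identity> {
    let families = [Family::Chrome, Family::Edge, Family::Firefox, Family::Safari];
    let platforms = [Platform::Windows, Platform::Mac, Platform::Linux];
    let mut pool = Vec::new();
    for &family in families.iter() {
        for &platform in platforms.iter() {
            pool.push(Identity { family, platform });
        }
    }
    pool
}

/// Picks the identity for a cascade level (1..=5).
/// Assumes `pool` is non-empty; `fallback_index` stands in for the random pick.
fn next_identity(pool: &[Identity], current: Identity, level: u8, fallback_index: usize) -> Identity {
    let chosen = match level {
        // Level 1: retry with the same identity.
        1 => Some(current),
        // Level 2: same family, different platform.
        2 => pool.iter().copied()
            .find(|id| id.family == current.family && id.platform != current.platform),
        // Level 3: different family, same platform.
        3 => pool.iter().copied()
            .find(|id| id.family != current.family && id.platform == current.platform),
        // Level 4: different family and different platform.
        4 => pool.iter().copied()
            .find(|id| id.family != current.family && id.platform != current.platform),
        // Level 5: anything from the pool.
        _ => None,
    };
    chosen.unwrap_or(pool[fallback_index % pool.len()])
}
```

Each level relaxes one more constraint than the previous one, so a block is first answered with the smallest possible change of fingerprint.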
New CLI flags (additive, no breaking changes)
- --probe — pre-flight health check (sends 1 minimal request, reports status/latency/Set-Cookie as JSON)
- --identity-profile — pin the session to a specific identity (auto, chrome-win, chrome-mac, chrome-linux, edge-win, firefox-linux, safari-mac). auto is default.
New JSON metadata fields (additive, Option + skip_serializing_if = "Option::is_none")
- metadados.identidade_usada — string tag of the identity that produced the response
- metadados.nivel_cascata — cascade level reached during the request

Changed

Version rollback: 0.7.0 (unpublished) → 0.6.4 to preserve the in-development feature set under a stable patch number
All existing CLI flags, JSON output schemas, and exit codes remain unchanged — strictly additive changes

Tests

5 new identity unit tests (313 total tests passing, up from 308)
All 224 lib tests + 83 integration tests + 6 doc tests pass
cargo clippy --lib --bins -- -D warnings clean
cargo fmt --check clean

[0.7.0] - 2026-06-01

Changed

Complete internationalization: ~600 identifiers renamed PT→EN across 15 source files (struct fields, local variables, parameters, production functions, test functions)
Module fetch_conteudo renamed to content_fetch
Test files integracao_*.rs renamed to integration_*.rs
Replaced anyhow with typed CliError across all 11 modules — zero external error crate dependency
output.rs: all formatting functions renamed (formatar_* → format_*, escrever_* → write_*)
config_init.rs: struct fields renamed with #[serde(rename)] to preserve JSON backwards compatibility
search.rs: RetryResult and AggregatedSearchResult fields renamed PT→EN
types.rs: Config fields perfil_browser/corresponde_plataforma_ua/caminho_chrome → browser_profile/match_platform_ua/chrome_path

Added

Loom concurrency tests (tests/loom_atomics.rs) — validates AtomicBool visibility across threads
Criterion benchmarks (benches/extraction_bench.rs) — HTML extraction performance baselines
Doc comments for all 70 previously undocumented public items — zero missing_docs warnings
.ingest-queue.sqlite added to .gitignore and Cargo.toml exclude

Fixed

RUSTSEC-2026-0097: updated rand 0.8.5 → 0.8.6
RUSTSEC-2026-0104: updated rustls-webpki 0.103.12 → 0.103.13

Security

deny.toml: added skip-tree for 30 transitive duplicate crates (chromiumoxide, scraper, console-subscriber ecosystems)

Known Limitations

Loom tests require RUSTFLAGS="--cfg loom" which conflicts with hyper-util — tests compile but cannot run until upstream resolves the cfg conflict
JSON output field names remain in Brazilian Portuguese (posicao, titulo, resultados, etc.). This is by design since v0.2.0.

[0.6.3] - 2026-04-17

Changed

Translated all 96 doc comments (/// and //!) across 19 source files from Portuguese to English, so docs.rs now renders fully in English for the international crates.io audience.
No code behavior, public API, or JSON output fields changed.

[0.6.2] - 2026-04-17

Added

19 new documentation files: full compliance with rules_rust_documentacao.md (28 gaps, G01-G28)
Bilingual EN+PT documentation: HOW_TO_USE, CROSS_PLATFORM, AGENTS-GUIDE, COOKBOOK.pt-BR, INTEGRATIONS.pt-BR
CODE_OF_CONDUCT.md + CODE_OF_CONDUCT.pt-BR.md: Contributor Covenant 2.1
README.pt-BR.md, CHANGELOG.pt-BR.md, CONTRIBUTING.pt-BR.md, SECURITY.pt-BR.md
docs/AGENTS.pt-BR.md: imperative guide for LLMs, in Portuguese
docs/AGENTS-GUIDE.md + docs/AGENTS-GUIDE.pt-BR.md: bilingual persuasive guide
llms.txt: compact orientation file for LLMs (< 50 KB)
llms-full.txt: full concatenation of the docs for long-context LLMs
eval-queries.json × 2: 20 EN + 20 PT-BR evaluation queries for skill testing

Changed

README.md: link to README.pt-BR.md + quick install before line 30
CONTRIBUTING.md: explicit MSRV (Rust 1.75) + 8-item PR checklist + branching strategy + nextest
SECURITY.md: version table specific to v0.6.2 + 90-day embargo policy + zero bold + zero emojis
skill/SKILL.md (EN+PT): Workflow section with 5 numbered, verifiable steps

[0.6.1] - 2026-04-17

Fixed

--timeout 0 now returns exit 2 (invalid config) instead of executing a search with zero timeout and returning exit 5.
--output /tmp/../../etc/passwd now returns exit 2 (invalid config) instead of exit 1 (runtime OS error) — path traversal validation moved to montar_configuracoes(), before the pipeline starts.

Added

validar_timeout_segundos() method on CliArgs — rejects values of 0 with a descriptive error.
Early path traversal check in montar_configuracoes() — calls paths::validate_output_path() at config validation time, not at write time.
2 E2E regression tests: timeout_zero_retorna_exit_2 and output_com_path_traversal_retorna_exit_2.
1 unit test: validar_timeout_segundos_rejeita_zero.
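
The early rejection of a zero timeout can be sketched as below. This is an illustration under assumptions, not the project's validar_timeout_segundos(): the function names, the `u64` parameter type, and the error message are hypothetical. Only the rule "0 is rejected" and the mapping to exit code 2 come from the entries above.

```rust
/// Exit code used for invalid configuration, per the entries above.
const EXIT_INVALID_CONFIG: i32 = 2;

/// Rejects a zero timeout before any request is sent (hypothetical signature).
fn validate_timeout_seconds(timeout_seconds: u64) -> Result<u64, String> {
    if timeout_seconds == 0 {
        return Err("--timeout must be at least 1 second".to_string());
    }
    Ok(timeout_seconds)
}

/// Maps the validation outcome to a process exit code.
fn exit_code_for_timeout(timeout_seconds: u64) -> i32 {
    match validate_timeout_seconds(timeout_seconds) {
        Ok(_) => 0,
        Err(_) => EXIT_INVALID_CONFIG,
    }
}
```

Validating at configuration time is what turns the old "search ran and found nothing" outcome (exit 5) into an immediate, explainable configuration error.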

[0.6.0] - 2026-04-16

Security

Per-family browser fingerprint profiles prevent detection by DuckDuckGo's anti-bot heuristics.
Per-family Sec-Fetch-* and Client Hints headers mimic a real browser session.
Accept-Language with RFC 7231 q-values eliminates the generic-UA fingerprint.
Silent-block detection with a 5 KB threshold prevents truncated results.

Added

BrowserFamily enum: variants Chrome, Firefox, Edge, Safari.
BrowserProfile struct: encapsulates the family, version, and per-family header set.
Per-family Sec-Fetch-Dest, Sec-Fetch-Mode, Sec-Fetch-Site headers in http.rs.
Client Hints (Sec-Ch-Ua, Sec-Ch-Ua-Mobile, Sec-Ch-Ua-Platform) for Chrome and Edge.
HTTP 202 anomaly detection in search.rs with automatic exponential backoff.
Silent-block detection: a response smaller than 5,000 bytes is treated as a block.
BrowserProfile propagated via Config to all pipeline modules.
Pagination headers with Sec-Fetch-Site: same-origin to mimic real navigation.
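
The two block signals in this release (HTTP 202 and an undersized body) and the exponential backoff can be sketched as follows. This is a hedged illustration, not the code in search.rs: `BlockSignal`, `classify_response`, and `backoff_ms` are hypothetical names, and the base and cap values passed to `backoff_ms` are assumptions. Only the 202 status and the 5,000-byte threshold come from the entries above.

```rust
/// Bodies smaller than this are treated as a silent block.
const SILENT_BLOCK_THRESHOLD_BYTES: usize = 5_000;

#[derive(Debug, PartialEq, Eq)]
enum BlockSignal {
    Clear,
    Http202Anomaly,
    SilentBlock,
}

/// Classifies a response by status code first, then by body size.
fn classify_response(status: u16, body: &[u8]) -> BlockSignal {
    if status == 202 {
        return BlockSignal::Http202Anomaly;
    }
    if body.len() < SILENT_BLOCK_THRESHOLD_BYTES {
        return BlockSignal::SilentBlock;
    }
    BlockSignal::Clear
}

/// Exponential backoff: base * 2^attempt, capped at `cap_ms`.
fn backoff_ms(attempt: u32, base_ms: u64, cap_ms: u64) -> u64 {
    // Clamp the exponent so the shift cannot overflow.
    let factor = 1u64 << attempt.min(20);
    base_ms.saturating_mul(factor).min(cap_ms)
}
```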

Changed

Accept-Language updated to pt-BR,pt;q=0.9,en-US;q=0.8,en;q=0.7 per RFC 7231.
Accept header now reflects the full per-family browser profile.
Pagination delays increased from 500–1,000 ms to 800–1,500 ms.
Silent-block threshold raised from 100 to 5,000 bytes.

[0.5.0] - 2026-04-16

Security

Path traversal validation on --output — rejects .. components and writes to system directories (/etc, /usr, C:\Windows).
Proxy credential masking — error messages no longer expose passwords from --proxy http://user:pass@host URLs.
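
The path traversal rule above (no `..` components, no writes under system directories) can be sketched with the standard library alone. This is an assumption-laden illustration, not the project's paths::validate_output_path(): the name `check_output_path` and the error strings are hypothetical, and only the Unix directories named above are checked (the C:\Windows case is left out of the sketch).

```rust
use std::path::{Component, Path};

/// Rejects output paths that contain `..` or point into /etc or /usr.
fn check_output_path(path: &Path) -> Result<(), String> {
    // Any parent-directory component is refused outright.
    if path.components().any(|c| matches!(c, Component::ParentDir)) {
        return Err(format!("output path contains '..': {}", path.display()));
    }
    // Path::starts_with compares whole components, so /etcetera is not /etc.
    const SYSTEM_DIRS: [&str; 2] = ["/etc", "/usr"];
    if SYSTEM_DIRS.iter().any(|dir| path.starts_with(dir)) {
        return Err(format!("output path is inside a system directory: {}", path.display()));
    }
    Ok(())
}
```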

Added

src/paths.rs — centralized path validation, parent directory creation, and Unix permission application.
src/signals.rs — centralized SIGPIPE restoration (Unix) and Ctrl+C/SIGINT handler (cross-platform).
ErroCliDdg enum with thiserror — 11 typed error variants with exit_code() and codigo_erro() methods.
mascarar_url_proxy() in http.rs — redacts credentials from proxy URLs in error context.
21 new unit tests across paths.rs, signals.rs, error.rs, and http.rs.
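
Credential redaction of the kind mascarar_url_proxy() performs can be sketched as below. This is not the project's implementation: the name `mask_proxy_url`, the `***:***` placeholder, and the string-based parsing are assumptions made to keep the example free of external crates.

```rust
/// Replaces `user:pass@` in a proxy URL with a fixed placeholder.
fn mask_proxy_url(url: &str) -> String {
    // Split off the scheme, if any.
    let (scheme, rest) = match url.find("://") {
        Some(i) => (&url[..i + 3], &url[i + 3..]),
        None => ("", url),
    };
    // The authority ends at the first '/', '?' or '#'.
    let authority_end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let (authority, tail) = rest.split_at(authority_end);
    // Credentials are everything before the last '@' in the authority.
    match authority.rfind('@') {
        Some(at) => format!("{}***:***@{}{}", scheme, &authority[at + 1..], tail),
        None => url.to_string(),
    }
}
```

Splitting on the last `@` inside the authority keeps passwords that themselves contain `@` from leaking part of the secret into the host portion.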

Changed

thiserror = "2" added to dependencies for structured domain errors.
src/main.rs reduced from 63 to 23 lines — signal handling extracted to signals.rs.
src/output.rs file writes now validate paths via paths::validate_output_path() before I/O.
deny.toml updated with RUSTSEC-2026-0097 exception (rand 0.8 unsound with custom logger — not applicable).

[0.4.4] - 2026-04-16

Fixed

SIGPIPE restored to SIG_DFL on Unix — pipes to jaq, head, and other consumers no longer lose stdout silently.
BrokenPipe errors detected in anyhow chain and treated as exit 0 (not exit 1) at all output boundaries.
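
The BrokenPipe rule above can be sketched with `std::io` alone. The function name `exit_code_for_write` is hypothetical, and the sketch inspects a plain `io::Result` rather than an anyhow chain. Restoring SIGPIPE to SIG_DFL needs a platform call and is not shown.

```rust
use std::io;

/// Maps the result of an output write to a process exit code.
/// A closed pipe (for example `| head -1`) is treated as success.
fn exit_code_for_write(result: io::Result<()>) -> i32 {
    match result {
        Ok(()) => 0,
        Err(e) if e.kind() == io::ErrorKind::BrokenPipe => 0,
        Err(_) => 1,
    }
}
```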

Added

--help now shows EXIT CODES (0–5) and PIPE USAGE sections via after_long_help.
3 E2E tests for pipe regression: exit codes in help, short help exclusion, stdout byte count.
README troubleshooting item 7: "Pipe to jaq/jq returns empty" with PIPESTATUS diagnostic (EN + PT).
docs_rules/rules_rust.md: SIGPIPE + BrokenPipe added to I/O checklist.
docs/AGENT_RULES.md: R24 pipe safety rule with PIPESTATUS diagnostic.
docs/COOKBOOK.md: Recipe 16 pipe diagnostic (EN + PT).
docs/INTEGRATIONS.md: pipe safety clause in baseline contract.
Exit code branching section in both skill files (EN + PT).

[0.4.3] - 2026-04-15

Changed

README.md: new persuasive "Agent Skill" section (EN + PT) placed
between the agents table and the Documentation section, at the reader's
peak of attention. AIDA copywriting highlights the bilingual skill packaged
in skill/: semantic auto-activation without a slash command, 14 canonical
MUST/NEVER sections, an anti-hallucination JSON contract, token savings on
every search turn, and one-command installation (git clone + cp -r).
Explicit benefits for LLMs (automatic decision on when to search) and
developers (zero prompt engineering, zero tool registration). The crates.io
tarball is unchanged; skills continue to live only on GitHub.

[0.4.2] - 2026-04-15

Added

skill/duckduckgo-search-cli-pt/SKILL.md and
skill/duckduckgo-search-cli-en/SKILL.md: bilingual skills for Claude
Code, Claude Agent SDK, and platforms compatible with Agent Skills. Each
skill carries YAML frontmatter with a name unique per language and a
description loaded with semantic triggers for auto-invocation, plus 14
canonical H2 sections (Mission, Invocation Contract, Absolute Prohibitions,
Parsing with jaq, JSON Schema, Exit Codes, Batch, Fetch-Content,
Endpoint, Retries, Recipes, Validation, Memory, Golden Rule).
Published on GitHub, excluded from the crates.io tarball.

Changed

docs/AGENT_RULES.md (833 lines, +7.6%): editorial rewrite applying
AIDA copywriting. Each rule opens with a measurable benefit, imperative
MUST/NEVER language is reinforced, with zero decorative narrative,
zero bold with double asterisks, and zero --- visual separators between
sections. Mirrored bilingual EN+PT with identical tone.
docs/COOKBOOK.md (1082 lines, −3.1%): each recipe opens with the
concrete gain before the command, short bullets of 8 to 15 words,
jaq + xh + sd pipelines preserved intact.
docs/INTEGRATIONS.md (1212 lines, +1.3%): 16 agents with a textual
comparison table, deterministic snippets per agent, zero emoji.

[0.4.1] - 2026-04-14

Added

docs/AGENT_RULES.md (773 lines): bilingual (EN+PT) imperative rules
with 30+ MUST/NEVER rules (R01..R30) for LLMs/agents invoking the CLI
in production. Covers: core invariants, JSON contract, rate limiting, error
handling, performance, security, anti-patterns. Quick Reference Card at the
end.
docs/COOKBOOK.md (1117 lines): 15 bilingual copy-paste recipes
combining duckduckgo-search-cli + jaq + xh + sd for real-world cases:
consolidated research, multi-query ETL, domain extraction, monitoring
with a time filter, content extraction with --fetch-content, top 5 vs
top 15 comparison, NDJSON for pipelines, function wrappers for bash.
docs/INTEGRATIONS.md (1196 lines): ready-made snippets for 16
agents/LLMs: Claude Code, OpenAI Codex, Gemini CLI, Cursor, Windsurf,
Aider, Continue.dev, MiniMax, OpenCode, Paperclip, OpenClaw, Google
Antigravity, GitHub Copilot CLI, Devin, Cline, Roo Code. Each agent
documents: pitch, shell mechanism, setup, basic snippet, multi-query
snippet, system prompt rule, caveats.
Documentation section in README.md (EN + PT) linking the 3 guides.

Fixed

README.md badge cluster and internal references checked against
daniloaguiarbr/duckduckgo-search-cli (canonical repo).

[0.4.0] - 2026-04-14

Changed (BREAKING)

Default for --num / -n: changed from "all results on the first
page" (~11) to 15, with auto-pagination. When the effective number
exceeds 10, the binary now fetches 2 pages per query to satisfy the
requested ceiling, provided --pages has not been customized by the user.
Auto-pagination: if --num > 10 (whether passed explicitly or because
the default of 15 was applied) AND --pages was not customized (still at
the default of 1), the binary raises --pages to ceil(num/10), respecting
the 5-page ceiling enforced by validar_paginas. Impact: more requests per
query (2x in the default case) and marginally higher latency, but with
full coverage of the requested results.
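
The auto-pagination rule above reduces to a small piece of arithmetic. A minimal sketch, assuming 10 results per page as the rule implies; the function name `effective_pages` and the constants are hypothetical and do not mirror the project's montar_configuracoes():

```rust
const RESULTS_PER_PAGE: u32 = 10;
const MAX_PAGES: u32 = 5;
const DEFAULT_PAGES: u32 = 1;

/// Returns the number of pages to fetch for a given --num and --pages.
/// Pages are raised only when --num exceeds one page and --pages is still
/// at its default, since an explicit `--pages 1` cannot be told apart.
fn effective_pages(num: u32, pages: u32) -> u32 {
    if num > RESULTS_PER_PAGE && pages == DEFAULT_PAGES {
        // ceil(num / 10); safe because num > 10 here.
        let needed = (num - 1) / RESULTS_PER_PAGE + 1;
        needed.min(MAX_PAGES)
    } else {
        pages
    }
}
```

With the new default of 15 this yields 2 pages, which matches the "2x requests in the default case" impact noted above.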

Added

Documentation in the --num flag comment in cli.rs describing the
new default and auto-pagination semantics.
4 new unit tests in lib.rs::testes:
montar_configuracoes_aplica_default_num_15_quando_omitido,
montar_configuracoes_respeita_pages_explicito_acima_de_1,
montar_configuracoes_auto_pagina_quando_num_maior_que_10,
montar_configuracoes_nao_auto_pagina_quando_num_10_ou_menos.
2 new wiremock tests in tests/integracao_wiremock.rs:
testa_default_num_15_auto_pagina_2_paginas,
testa_auto_paginacao_respeita_pages_explicito.

Migration Guide

To keep the old behavior (1 page, ~11 results): pass
--pages 1 --num 10 explicitly. An explicit --pages 1 is
indistinguishable from the default (accepted trade-off: paginas > 1 is the
only signal of "customization"), so the safest option is to combine it with
--num 10 to guarantee that nothing is auto-paginated.
If you already passed --num 5 (or any value <= 10): behavior
unchanged (no auto-pagination, 1 page).
If you already passed --num 20 --pages 2 or similar: behavior
unchanged (the explicit user value is respected).
If you relied on the default with no flags: you now receive up to 15
results instead of ~11, at the cost of 1 extra request per query. To restore
the old behavior, pass --pages 1 --num 10.

[0.3.0] - 2026-04-14

Changed (BREAKING)

JSON schema: field buscas_relacionadas REMOVED from SearchOutput and
MultiSearchOutput.buscas[i]. The html.duckduckgo.com/html/ endpoint does not
expose related searches in the current DOM; keeping an always-empty field was
noise. Pipelines that parsed .buscas_relacionadas need adjusting.
User-Agent pool: removed text-browser UAs (Lynx 2.9.0,
w3m/0.5.3, Links 2.29, ELinks 0.16.1.1) that caused DuckDuckGo to return
degraded HTML. Replaced with 6 modern UAs validated empirically
against the /html/ endpoint: Chrome 146 (Win/Mac/Linux), Edge 145 Windows,
Firefox 134 Linux, Safari 17.6 macOS. Firefox Win/Mac were REMOVED after
returning the HTTP 202 anomaly in real-world validation (DDG anti-bot heuristic).

Fixed

Snippet duplicated the title and URL at the start: the default selector had
a .result__body fallback (the parent container), which made the recursive
text() capture title+URL+snippet concatenated. Replaced with plain
.result__snippet. Pipelines such as jaq '.resultados[].snippet' now return
only the result's descriptive text.
Title "Official site": DuckDuckGo renders this literal text
as the label for verified domains (e.g. city halls). The scraper now
detects this case and substitutes url_exibicao (e.g. saofidelis.rj.gov.br).
The original text is preserved in the new optional field titulo_original
for auditing.
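
The "Official site" substitution can be sketched as below. This is an illustration under assumptions: the `TitleFields` struct and `resolve_title` function are hypothetical, and the exact-match-after-trim test is a guess at the heuristic. Only the field names titulo, url_exibicao, and titulo_original come from this changelog.

```rust
#[derive(Debug, PartialEq)]
struct TitleFields {
    titulo: String,
    titulo_original: Option<String>,
}

/// Replaces the literal "Official site" label with the displayed URL and
/// keeps the original text for auditing.
fn resolve_title(raw_title: &str, url_exibicao: &str) -> TitleFields {
    if raw_title.trim() == "Official site" {
        TitleFields {
            titulo: url_exibicao.to_string(),
            titulo_original: Some(raw_title.to_string()),
        }
    } else {
        TitleFields {
            titulo: raw_title.to_string(),
            titulo_original: None,
        }
    }
}
```

Downstream code that used to compare titulo against "Official site" can instead test `titulo_original.as_deref() == Some("Official site")`.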

Added

Field titulo_original: Option<String> on SearchResult. Present
only when the title was replaced by a heuristic (currently: the
"Official site" case). Serialized with #[serde(skip_serializing_if = "Option::is_none")],
so it does not appear in the JSON when absent.
Sponsored results (.result--ad) excluded from the default container
via the selector .result:not(.result--ad).

Removed

Function extrair_buscas_relacionadas in src/search.rs (dead code with a
hardcoded selector that never matched anything).
Section [related_searches] in the default selectors.

Migration Guide (v0.2.x → v0.3.0)

Pipelines using jaq '.buscas_relacionadas[]': the field no longer exists.
Remove it from the filter or handle null.
Expecting snippet with a title+URL prefix? It now contains only the descriptive
text; adjust downstream regex/parsing if needed.
Relying on titulo == "Official site" to detect verified sites?
Use titulo_original.as_deref() == Some("Official site").
LEGACY EXTERNAL CONFIG: users who ran init-config in earlier
versions have ~/.config/duckduckgo-search-cli/{selectors,user-agents}.toml
with the old defaults (snippet with .result__body + Lynx/w3m/etc. UAs).
These files OVERRIDE the embedded defaults. To apply the fixes in this
version, run the following AFTER updating:
```
duckduckgo-search-cli init-config --force
```
The --force flag overwrites the external files. A backup is recommended if
you edited them manually to hotfix selectors.

[0.2.0] - 2026-04-14

Changed (BREAKING)

The serialized JSON schema now uses Brazilian Portuguese field names,
aligned with the jaq examples in the README and with the INVIOLABLE invariant of the
project's blueprint v2 ("Logs and field names in Brazilian Portuguese").

Pipelines that depended on the English schema from v0.1.0 must update
their jaq selectors. Rename table:

| Before (v0.1.0) | After (v0.2.0) |
| --- | --- |
| `position` | `posicao` |
| `title` | `titulo` |
| `displayed_url` | `url_exibicao` |
| `content` | `conteudo` |
| `content_length` | `tamanho_conteudo` |
| `content_extraction_method` | `metodo_extracao_conteudo` |
| `execution_time_ms` | `tempo_execucao_ms` |
| `selectors_hash` | `hash_seletores` |
| `retries` | `retentativas` |
| `fallback_endpoint_used` | `usou_endpoint_fallback` |
| `concurrent_fetches` | `fetches_simultaneos` |
| `fetch_successes` | `sucessos_fetch` |
| `fetch_failures` | `falhas_fetch` |
| `chrome_used` | `usou_chrome` |
| `proxy_used` | `usou_proxy` |
| `engine` | `motor` |
| `region` | `regiao` |
| `results_count` | `quantidade_resultados` |
| `results` | `resultados` |
| `related_searches` | `buscas_relacionadas` |
| `pages_fetched` | `paginas_buscadas` |
| `error` | `erro` |
| `message` | `mensagem` |
| `metadata` | `metadados` |
| `queries_count` | `quantidade_queries` |
| `parallel` | `paralelismo` |
| `searches` | `buscas` |

Unchanged fields: url, snippet, query, endpoint, timestamp, user_agent.

Fixed

Pipelines documented in the README (jaq '.resultados[].titulo', etc.) now
work end-to-end. In v0.1.0 they returned null because of the schema mismatch
(user-reported bug).

Unreleased

Added

LICENSE-MIT and LICENSE-APACHE (dual-licensed per Cargo.toml, aligning the tarball with the SPDX declaration).
.pre-commit-config.yaml with three hook groups: (1) pre-commit-hooks standard (trailing whitespace, EOF, YAML/TOML validity, mixed line endings), (2) Rust hooks (cargo fmt + cargo clippy -D warnings), (3) local commit-msg hook blocking Co-authored-by: from AI agents (mirrors the CI commit_check job). Reduces CI round-trips for trivial violations.
.gitattributes forcing LF on .rs / .toml / .sh / .yml / .md / fixture HTML — prevents silent corruption when cloning on Windows with core.autocrlf=true (which would otherwise break shebangs, rustfmt, and content-extraction tests). Binary extensions (.png, .woff2, etc.) marked explicitly. Cargo.lock and target/ flagged linguist-generated to exclude from GitHub language stats.
.editorconfig normalizing UTF-8, LF, trailing-whitespace trim, and per-language indent (Rust/TOML 4, YAML/JSON/MD 2, Makefile tab) across VS Code, RustRover, vim, and other editors — eliminates spurious formatting diffs caused by per-dev settings drift.
.github/PULL_REQUEST_TEMPLATE.md with the 10-gate checklist + project-specific constraints (no cache, no MCP, rustls-only, println! confined to output.rs, PT-BR identifiers).
.github/ISSUE_TEMPLATE/bug_report.yml + feature_request.yml + config.yml — structured triage with platform dropdown (glibc/musl/NixOS/Flatpak/Snap/macOS ARM/macOS Intel/Windows/WSL), install method, and constraint verification. config.yml redirects security reports to Security Advisories and usage questions to Discussions.
Cross.toml enabling cross build --target <t> for ARM64/ARMv7 Linux targets (musl + glibc + hard-float) from any x86_64 host with Docker/Podman — complements the native CI pipeline for developers without a GitHub Actions runner.
CONTRIBUTING.md with the 10-gate validation matrix, coding standards (Brazilian Portuguese identifiers, rustls-only TLS, output.rs as the sole println! site), three-layer testing strategy, supply-chain guardrails, and the tag-driven release process.
.cargo/config.toml exposing 8 developer aliases (cargo check-all, cargo lint, cargo docs, cargo test-all, cargo cov, cargo cov-html, cargo publish-check, cargo pkg-list) — each mirrors a CI job for local reproduction.
Doctests in public API: pipeline::combine_and_dedup_queries, content_fetch::extract_host, and search::format_kl — compilable examples on docs.rs that double as regression tests.
SECURITY.md documenting the private-disclosure workflow via GitHub Security Advisories, response SLA (72 h), scope (HTTP/HTML parsing, credential leaks, path traversal, TLS) and security design assumptions (stateless, rustls-only, no JS for search).
.github/dependabot.yml enabling weekly automatic dependency updates for both cargo and github-actions ecosystems, with semantic grouping (dev-deps, tokio-ecosystem, tracing-ecosystem) and PR count limits.
rust-toolchain.toml pinning stable with rustfmt + clippy components for reproducible dev/CI builds.
.github/workflows/release.yml triggered by v*.*.* tags (and workflow_dispatch with dry_run) running the 5-stage release pipeline per rules_rust.md §19: validate → build_matrix (5 targets) → macos_universal (lipo) → github_release (with generated notes) → crates_io (publish gated on CRATES_IO_TOKEN secret).
msrv job in ci.yml extracting rust-version from Cargo.toml and running cargo check on that toolchain to detect MSRV drift on every PR.
.github/workflows/ci.yml enforcing the 10-gate validation matrix across Ubuntu, macOS, and Windows:
- cargo check / clippy -D warnings / fmt --check / doc -D warnings / test --all-features on all three OSes.
- cargo llvm-cov --fail-under-lines 80 dedicated job on Ubuntu.
- cargo audit + cargo deny check advisories licenses bans sources supply-chain gate.
- cargo publish --dry-run + cargo package --list sensitive-file guard.
- Static musl binary smoke test (x86_64-unknown-linux-musl) covering Alpine Linux and minimal containers.
- commit_check job blocking Co-authored-by: trailers from AI agents in PRs.
deny.toml with full four-axis supply-chain policy (advisories/licenses/bans/sources) and documented ignores for three transitive unmaintained advisories (RUSTSEC-2025-0057 fxhash, RUSTSEC-2025-0052 async-std, RUSTSEC-2026-0097 rand) with justification and revisit notes.
22 new tests raising coverage from 77.4% to 86.4% (lines): tests/integration_pipeline.rs (10), tests/integracao_fetch_conteudo.rs (3), and 9 inline tests for output.rs covering emit_ndjson, emit_stream_text, emit_stream_markdown, and the PipelineResult variants via tempfile.

Changed

parallel.rs coverage 50% → 81%; pipeline.rs 55% → 82%; content_fetch.rs 68% → 85%; output.rs 70% → 87%.

[0.1.0] - 2026-04-14

Added

Core search pipeline against DuckDuckGo HTML endpoint via pure HTTP (html.duckduckgo.com/html/).
Lite endpoint fallback via --endpoint lite for JavaScript-less pages.
Multi-query mode with automatic deduplication, positional args, --queries-file, and stdin.
Parallel fan-out of queries with --parallel (1..=20), bounded by tokio::JoinSet + Semaphore.
--pages (1..=5) to collect multiple result pages per query.
--fetch-content fetches each result URL via pure HTTP, applies readability, and embeds the cleaned text in the JSON output.
--max-content-length (1..=100_000) truncates extracted content respecting word boundaries.
Chrome headless fallback under --features chrome with cross-platform detection (Linux including Flatpak/Snap, macOS including Apple Silicon, Windows including registry paths) and stealth flags (--disable-blink-features=AutomationControlled, --window-size=1920,1080, --no-first-run, platform-specific --no-sandbox, --disable-gpu).
--chrome-path flag to manually specify the Chrome/Chromium executable.
--proxy URL + --no-proxy (HTTP/HTTPS/SOCKS5) with precedence over env vars.
--global-timeout (1..=3600 s) wraps the whole pipeline in tokio::time::timeout.
--per-host-limit (1..=10) rate-limits fetches per host via a per-host Semaphore map.
--match-platform-ua narrows the user-agent pool to the current platform.
--stream NDJSON mode emits one result per line as they are extracted.
Four output formats: json (default), text, markdown, auto (TTY-aware).
External configuration files: selectors.toml and user-agents.toml under XDG config dir, overriding embedded defaults.
Subcommand init-config with --force and --dry-run to bootstrap user config files.
Exit codes: 0 success, 1 runtime, 2 config, 3 block (HTTP 202 anomaly), 4 global timeout, 5 zero results.
UTF-8 console initialization on Windows via SetConsoleOutputCP(65001).
Rustls-TLS everywhere for dependency-free cross-platform builds.
tracing + tracing-subscriber with RUST_LOG honored; --verbose / --quiet flags.
163 unit + integration tests covering CLI parsing, config montage, HTTP extraction, parallel fan-out, selectors, and wiremock-backed search flows.
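
The exit-code contract listed above can be captured in a single enum. A minimal sketch: the `Outcome` type and its variant names are hypothetical, and only the numeric codes 0 to 5 and their meanings come from the list above.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Outcome {
    Success,
    RuntimeError,
    InvalidConfig,
    Blocked,
    GlobalTimeout,
    ZeroResults,
}

impl Outcome {
    /// Maps each outcome to the documented process exit code.
    fn exit_code(self) -> i32 {
        match self {
            Outcome::Success => 0,
            Outcome::RuntimeError => 1,
            Outcome::InvalidConfig => 2,
            Outcome::Blocked => 3, // HTTP 202 anomaly
            Outcome::GlobalTimeout => 4,
            Outcome::ZeroResults => 5,
        }
    }
}
```

Keeping the mapping in one exhaustive `match` means a new outcome cannot be added without the compiler asking for its exit code.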

Security

All credentials (--proxy user:pass@host) are masked in logs.
Output file creation applies Unix permissions 0o644.
