Command-line interface for the Olostep API — search, map, answer, scrape, crawl, and batch workflows from your terminal. Outputs are structured JSON (pretty-printed) so you can pipe them into jq, agents, and CI without writing a custom client first.
The same CLI is available as a standalone binary (no Python required) via npm, or from source with Python.
- Installation
- Authentication
- Quick start
- Output, stdout, and agents
- Commands
- Default output paths
- Global options
- Project structure
- Development
- Security
- References
- License
The npm package installs a platform-specific binary on postinstall (macOS arm64/x64, Linux x64, Windows x64). Node.js 16+ is required only for installation; the olostep command itself does not use Python.
npm install -g olostep-cli
Run without a global install:
npx -y olostep-cli@latest --help
If the binary failed to download, reinstall or check that a GitHub release exists for your package version and platform.
For development or when you want to run the Typer app directly:
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -U pip
pip install -e .
The console script is olostep. You can also run python main.py ....
Package metadata: see pyproject.toml (olostep-api-cli).
Set one of these (process environment or a .env file next to the working directory / binary):
| Variable | Description |
|---|---|
| `OLOSTEP_API_KEY` | API key (preferred) |
| `OLOSTEP_API_TOKEN` | Alternative token name |
Create keys in the Olostep API Keys dashboard.
Batch commands resolve the token via resolve_api_key(); map/answer/scrape/crawl use the same credentials through config/config.py. Defaults in config/config.py include API_BASE_URL (https://api.olostep.com/v1) and the batch base URL; change them there if you need another environment.
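A calling script can mirror the lookup order from the table above, for example to fail fast before invoking the CLI. This is a minimal sketch, not the CLI's own resolve_api_key(): the helper name `pick_olostep_key` is hypothetical, it does not read .env files, and treating `OLOSTEP_API_KEY` as winning when both variables are set is an assumption based on the "preferred" note.

```python
import os
from typing import Mapping, Optional


def pick_olostep_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty credential, trying OLOSTEP_API_KEY first.

    Hypothetical helper for calling scripts; it is not the CLI's internal
    resolve_api_key() and it ignores .env files.
    """
    env = os.environ if env is None else env
    for name in ("OLOSTEP_API_KEY", "OLOSTEP_API_TOKEN"):
        value = env.get(name, "").strip()
        if value:
            return value
    # Neither variable is set (or both are blank).
    return None
```

Passing a plain dict instead of `os.environ` keeps the helper easy to exercise in tests.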
export OLOSTEP_API_KEY=your_key_here
olostep --help
olostep map "https://example.com" --top-n 20
olostep scrape "https://example.com" --formats markdown,html

- `--out <path>`: Write JSON results to a file. Parent directories are created automatically.
- `--out -`: Write only the JSON result to stdout (UTF-8, indented). Use this for pipelines, subprocess capture in agents, and tools that expect machine-readable output on stdout.
- Logs (e.g. `logger.info`, progress) go to stderr, so you can redirect or ignore them while keeping clean JSON on stdout.
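The stdout/stderr split described above is what makes subprocess capture practical. The following is a rough sketch of how an agent or script might wrap the CLI from Python; `build_argv` and `run_olostep` are hypothetical names, and the sketch assumes the CLI exits with a non-zero status on failure, which this README does not state.

```python
import json
import subprocess
from typing import Any, List


def build_argv(command: str, target: str, *options: str) -> List[str]:
    """Assemble an argv list that always ends with `--out -`."""
    return ["olostep", command, target, *options, "--out", "-"]


def run_olostep(command: str, target: str, *options: str) -> Any:
    """Run the CLI and parse the JSON it prints on stdout.

    Only stdout is piped, so logs and progress written to stderr still
    reach the terminal. Raises CalledProcessError on a non-zero exit
    status (assumed failure signal) and JSONDecodeError on bad output.
    """
    completed = subprocess.run(
        build_argv(command, target, *options),
        stdout=subprocess.PIPE,
        check=True,
        encoding="utf-8",
    )
    return json.loads(completed.stdout)
```

With the CLI installed and a key exported, `run_olostep("map", "https://example.com", "--top-n", "20")` would return the parsed map result. Passing arguments as a list avoids shell quoting issues.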
Examples:
olostep map "https://example.com" --top-n 50 --out - | jq '.urls[:10]'
olostep answer "What is Olostep?" --out - | jq .result
olostep scrape "https://example.com" --out - | jq .result.markdown_content
CI-style usage:
export OLOSTEP_API_KEY="${{ secrets.OLOSTEP_API_KEY }}"
olostep scrape "https://docs.example.com" --out result.json
Run olostep <command> --help for full option text. HTTP timeout for most API-backed commands: --timeout (seconds).
Map a site to discover URLs (Olostep Maps).
| Option | Description |
|---|---|
| `--out` | Output path or - for stdout |
| `--top-n` | Max URLs to return |
| `--search-query` | Optional query to guide discovery |
| `--include-subdomain` / `--no-include-subdomain` | Include subdomains |
| `--include-url` | Repeatable URL patterns to include |
| `--exclude-url` | Repeatable URL patterns to exclude |
| `--cursor` | Pagination cursor |
| `--timeout` | HTTP timeout (s) |
olostep map "https://example.com" --top-n 100 --search-query "blog"
olostep map "https://example.com" --include-subdomain --out - | jq '.urls[:5]'
Compatibility: --limit was removed; use --top-n.
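For scripts that prefer Python over jq, the `.urls[:5]` slice used in the example above can be reproduced with the standard library. This sketch assumes only what the jq filter implies, namely a top-level `urls` array in the map output; `first_urls` is a hypothetical helper name.

```python
import json
from typing import List


def first_urls(map_json: str, count: int = 5) -> List[str]:
    """Return the first `count` entries of the top-level "urls" array.

    An absent "urls" key yields an empty list instead of an error.
    """
    data = json.loads(map_json)
    return list(data.get("urls", []))[:count]
```

The input string would typically be the stdout of `olostep map ... --out -`.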
Ask a question; the CLI polls until the answer job completes.
| Option | Description |
|---|---|
| `--out` | Output path or - |
| `--json-format` | Optional JSON shape / schema hint (string or JSON object string) |
| `--poll-interval` | Polling interval (seconds) |
| `--poll-timeout` | Max time to wait (seconds) |
| `--timeout` | HTTP timeout (s) |
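To illustrate how a polling interval and a polling timeout interact, here is a generic poll-until-done loop. It is a sketch of the general pattern only and is not taken from the CLI's source; `poll_until` and its parameters are hypothetical, and the real polling helpers in the project may behave differently.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], Optional[T]],
    poll_interval: float,
    poll_timeout: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fetch() until it returns a non-None value or time runs out."""
    deadline = clock() + poll_timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        # Stop early if the next wait would end after the deadline.
        if clock() + poll_interval > deadline:
            raise TimeoutError(f"no result after {poll_timeout} seconds")
        sleep(poll_interval)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting.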
olostep answer "Summarize this company's product" --out output/answer.json
olostep answer "Extract company facts" --json-format '{"company":"","country":""}' --out -
Compatibility: --model was removed; use --json-format.
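When the JSON shape is built programmatically, serializing it with `json.dumps` and passing arguments as a list sidesteps shell quoting. The sketch below only assembles the argument list for the `--json-format` example above; `answer_argv` is a hypothetical helper, not part of the CLI.

```python
import json
from typing import Any, Dict, List


def answer_argv(question: str, shape: Dict[str, Any]) -> List[str]:
    """Build argv for `olostep answer` with the shape serialized as JSON."""
    return [
        "olostep", "answer", question,
        # Compact separators produce the same string as the shell example.
        "--json-format", json.dumps(shape, separators=(",", ":")),
        "--out", "-",
    ]
```

The resulting list can be handed to `subprocess.run` directly, with no `shell=True`.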
Scrape one URL in one or more formats.
Formats (comma-separated): html, markdown, text, json, raw_pdf, screenshot.
| Option | Description |
|---|---|
| `--out` | Output path or - |
| `--formats` | Comma-separated formats (default: markdown) |
| `--country` | Optional country code |
| `--wait-before-scraping` | Wait before scrape (milliseconds) |
| `--payload-json` | Advanced scrape options as a JSON object string |
| `--payload-file` | Same as above, from a JSON file (mutually exclusive with --payload-json) |
| `--timeout` | HTTP timeout (s) |
olostep scrape "https://example.com/article" --formats markdown,text
olostep scrape "https://example.com" --country US --wait-before-scraping 2000
olostep scrape "https://example.com" --payload-file advanced.json --out - | jq .
Retrieve a previous scrape by ID.
olostep scrape-get "scrape_abc123" --out output/scrape_get.json
olostep scrape-get "scrape_abc123" --out - | jq .result.markdown_content
Start a crawl, poll until finished, then retrieve page contents.
Retrieve formats (comma-separated): markdown, html, json.
| Option | Description |
|---|---|
| `--out` | Output path or - |
| `--max-pages` | Maximum pages to crawl |
| `--max-depth` | Optional max depth |
| `--include-subdomain` / `--no-include-subdomain` | Subdomains |
| `--include-external` / `--no-include-external` | External domains |
| `--include-url` / `--exclude-url` | Repeatable path/URL patterns |
| `--search-query` / `--top-n` | Optional discovery filter and cap |
| `--webhook` | Optional webhook URL |
| `--crawl-timeout` | Crawl timeout (seconds) |
| `--follow-robots-txt` / `--ignore-robots-txt` | robots.txt |
| `--formats` | Retrieve formats |
| `--pages-limit` | Page size for crawl pages API |
| `--pages-search-query` | Filter when listing pages |
| `--poll-seconds` / `--poll-timeout` | Polling |
| `--timeout` | HTTP timeout (s) |
| `--dry-run` | Print API payload JSON and exit (no request) |
olostep crawl "https://docs.example.com" --max-pages 50 --formats markdown,html
olostep crawl "https://example.com" --max-pages 10 --dry-run
Submit many URLs from a CSV with columns custom_id/id and url. Polls until completion.
| Option | Description |
|---|---|
| `--out` | Output path or - |
| `--formats` | markdown, html, json (comma-separated) |
| `--country` | Optional country code |
| `--parser-id` | Optional parser for structured extraction |
| `--poll-seconds` | Poll interval |
| `--log-every` | Log every N polls |
| `--items-limit` | Batch items page size (API often suggests 10–50) |
| `--dry-run` | Print payload JSON and exit |
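The input CSV for batch-scrape can be generated with Python's `csv` module. This sketch writes a `custom_id` column and a `url` column, reading "custom_id/id" above as "either header name is accepted", which is an assumption. The `item-N` identifiers and the helper name `write_batch_csv` are hypothetical.

```python
import csv
import io
from typing import Iterable, TextIO


def write_batch_csv(urls: Iterable[str], handle: TextIO) -> int:
    """Write a header plus one custom_id,url row per URL; return the row count."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(["custom_id", "url"])
    count = 0
    for count, url in enumerate(urls, start=1):
        # Placeholder ID scheme; any unique string per row would do.
        writer.writerow([f"item-{count}", url])
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    write_batch_csv(["https://example.com/a", "https://example.com/b"], buffer)
    print(buffer.getvalue(), end="")
```

To produce a file for the CLI, open it with `open("urls.csv", "w", newline="", encoding="utf-8")` and pass the handle to `write_batch_csv`.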
olostep batch-scrape urls.csv --formats markdown,html --country US
olostep batch-scrape urls.csv --parser-id "<PARSER_ID>" --out results.json
Update metadata on an existing batch. One of --metadata-json or --metadata-file is required (JSON object).
olostep batch-update "batch_abc123" --metadata-json '{"team":"growth"}'
olostep batch-update "batch_abc123" --metadata-file meta.json --out -
If you omit --out, JSON is written under output/:
| Command | Default file |
|---|---|
| `map` | output/map.json |
| `answer` | output/answer.json |
| `scrape` | output/scrape.json |
| `scrape-get` | output/scrape_get.json |
| `crawl` | output/crawl_results.json |
| `batch-scrape` | output/batch_results.json |
| `batch-update` | output/batch_update.json |
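A script that runs a command without --out can locate the result through the table above. The mapping below copies those defaults verbatim; the helper names are hypothetical, and resolving `output/` relative to the current working directory is an assumption.

```python
import json
from pathlib import Path
from typing import Any

# Default files per command, copied from the table above.
DEFAULT_OUTPUTS = {
    "map": "output/map.json",
    "answer": "output/answer.json",
    "scrape": "output/scrape.json",
    "scrape-get": "output/scrape_get.json",
    "crawl": "output/crawl_results.json",
    "batch-scrape": "output/batch_results.json",
    "batch-update": "output/batch_update.json",
}


def default_output_path(command: str, base_dir: str = ".") -> Path:
    """Return the default output file for `command` under `base_dir`."""
    return Path(base_dir) / DEFAULT_OUTPUTS[command]


def load_default_output(command: str, base_dir: str = ".") -> Any:
    """Parse the JSON file a command wrote when --out was omitted."""
    with default_output_path(command, base_dir).open(encoding="utf-8") as handle:
        return json.load(handle)
```

An unknown command name raises `KeyError`, which surfaces typos early.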
| Option | Description |
|---|---|
| `-V`, `--version` | Print CLI version and exit |
| `-h`, `--help` | Help (Typer / Rich) |
.
├── main.py                  # Typer CLI entrypoint
├── pyproject.toml           # Python package + `olostep` script
├── olostep.spec             # PyInstaller spec for release binaries
├── npm/
│   ├── package.json         # olostep-cli on npm
│   ├── bin/olostep.js       # Node shim → native binary
│   └── scripts/postinstall.js
├── config/
│   └── config.py            # Env, defaults, base URLs
├── src/
│   ├── api_client.py
│   ├── map_api.py
│   ├── answer_api.py
│   ├── scrape_api.py
│   ├── crawl_api.py
│   ├── batch_api.py
│   └── batch_scraper.py
└── utils/
    └── utils.py             # JSON output, stdout `-`, polling helpers
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e ".[test]"
pytest
olostep --help
Release binaries are built with PyInstaller (pip install -e ".[build]"); see .github/workflows/release.yml.
- Do not commit `.env` or API keys.
- Rotate keys if they are exposed.
MIT — see LICENSE.