An unofficial Octoparse CLI built for AI agents and automation workflows.
This project is not an official Octoparse CLI, and it is not published, maintained, or endorsed by Octoparse.
It is a personal CLI project built from my own packet inspection, reverse engineering, and interface analysis of the Octoparse desktop application and related APIs. Please do not treat this repository as an official product, an official SDK, or anything covered by Octoparse support.
I was genuinely excited when Octoparse started supporting ApiKey authentication.
I am a heavy Octoparse user, and I am also deeply invested in AI agents and workflow automation. I urgently needed a way to connect Octoparse to my own agent systems so I could automate template discovery, task execution, latest-row previews, unexported data exports, data orchestration, and larger end-to-end automation flows.
That need is important to me. This project is not just about wrapping a few APIs with a CLI. It is about turning a tool I rely on heavily into a capability that can participate directly in AI-native workflows. That is why I decided to open-source this project: I hope it can also help other users who want to integrate Octoparse into agents, automation systems, or structured operational pipelines.
This is still an early-stage project.
Right now it only implements the core features I use most often. The current scope is focused on account access, tasks, templates, exports, scheduling, and some local scraping entry points. Many interfaces and features are still missing, and some existing capabilities only have an initial CLI shape rather than a fully mature implementation.
In practical terms, this repository is currently best suited for:
- advanced users who are comfortable understanding API boundaries
- developers who want to connect Octoparse to AI agents
- contributors who are willing to help iterate, extend, and stabilize the project
It should not yet be described as a complete, fully stable, official-grade CLI.
The following capabilities are currently implemented:
- basic account information lookup
- task search, show, copy, move, delete, and rename
- task start, stop, and status lookup
- template search, template view, and template version lookup
- template task creation and update
- export count, latest-row preview, unexported data export, and asynchronous export creation
- cloud scheduling and local scheduling commands
- task URL chain commands
- local scraping entry-point commands
This project already exposes some sensitive operations, including but not limited to:
- deleting tasks
- deleting task groups
- deleting export records
- deleting or overwriting parts of user configuration
Please make sure you fully understand a command before you run it, and make sure the target account, task, schedule, or configuration object is correct.
If you accidentally delete tasks, overwrite configuration, trigger exports by mistake, update schedules incorrectly, or cause any direct or indirect loss while using this project, responsibility lies with the user, not with this project.
If you plan to use this in production environments, important business accounts, shared accounts, or high-value task pipelines, you should at minimum:
- validate command behavior in a test account or low-risk task first
- manually double-check destructive or overwrite operations before execution
- read the help output and source code before running sensitive commands
- back up important tasks, configuration, schedules, and export chains in advance
- This is a personal open-source project and comes with no official warranty of any kind.
- This project has no affiliation, authorization, employment, or agency relationship with Octoparse.
- The implementation in this repository, and the interface assumptions behind it, may break when the official desktop app, services, authentication model, response shape, or risk-control behavior changes.
- Users must evaluate on their own whether use of this project complies with local laws, service terms, company policies, and internal security requirements.
- The project author is not responsible for account risk, broken tasks, lost configuration, incorrect data, workflow interruption, bans, or any other loss caused by using this project.
Authentication is based on:
- environment variable: `OCTOPARSE_API_KEY`
- request header: `x-api-key`
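As a rough illustration of how these two pieces fit together, the sketch below reads the key from an environment map and places it in the `x-api-key` header. The helper name `buildAuthHeaders` and the `Env` type are hypothetical and are not taken from this project's source; no endpoint URL is shown because none is documented here.

```typescript
// Hypothetical helper for illustration only; not part of the CLI's actual code.
type Env = Record<string, string | undefined>;

function buildAuthHeaders(env: Env): Record<string, string> {
  // Read the key from the documented environment variable.
  const apiKey = (env["OCTOPARSE_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    throw new Error("OCTOPARSE_API_KEY is not set");
  }
  // Send it in the documented request header.
  return { "x-api-key": apiKey };
}

// Example with an in-memory environment; in Node.js you would pass process.env.
const exampleHeaders = buildAuthHeaders({ OCTOPARSE_API_KEY: "example-key" });
console.log(Object.keys(exampleHeaders));
```

Failing early when the variable is missing keeps an agent from sending unauthenticated requests and then having to interpret a remote error.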
Install globally:
npm install -g @stephenyang/octoparse
Run with npx:
npx @stephenyang/octoparse account
Common commands:
octoparse account
octoparse task list
octoparse task show <taskId>
octoparse task start <taskId>
octoparse task stop <taskId>
octoparse template search "<keyword>"
octoparse template view <templateId>
octoparse export preview <taskId> --size 5
octoparse export unexported <taskId> --format json
For compatibility, `octoparse account whoami` is still available, but the preferred simplified form is:
octoparse account
The following command categories should only be executed after you are fully sure about the target object:
- `task delete`
- `task-group delete`
- `task-group set-default`
- `user-config set`
- `schedule cloud update`
- `schedule local update`
- `template-task update`
- any batch overwrite, update, or delete command
If you plan to let an agent execute these commands automatically, you should add another approval, whitelist, or policy layer on the agent side.
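One possible shape for such an agent-side layer is sketched below. It treats the command prefixes listed above as sensitive and lets them through only when the exact command line has been approved in advance. Everything apart from those prefixes is an assumption made for illustration: the function names, the `Decision` type, the approval list, and the argument shape of `task delete` are all hypothetical. The generic "batch overwrite, update, or delete" category cannot be detected by prefix matching and is not covered.

```typescript
// Command prefixes copied from the list above; all other names are hypothetical.
const SENSITIVE_PREFIXES: string[][] = [
  ["task", "delete"],
  ["task-group", "delete"],
  ["task-group", "set-default"],
  ["user-config", "set"],
  ["schedule", "cloud", "update"],
  ["schedule", "local", "update"],
  ["template-task", "update"],
];

type Decision = { allowed: true } | { allowed: false; reason: string };

// True when args begins with every element of prefix, in order.
function hasPrefix(args: string[], prefix: string[]): boolean {
  return prefix.every((part, i) => args[i] === part);
}

function isSensitive(args: string[]): boolean {
  return SENSITIVE_PREFIXES.some((prefix) => hasPrefix(args, prefix));
}

// approvedCommands holds exact command lines a human has signed off on.
function checkCommand(args: string[], approvedCommands: string[]): Decision {
  if (!isSensitive(args)) {
    return { allowed: true };
  }
  const commandLine = args.join(" ");
  if (approvedCommands.indexOf(commandLine) !== -1) {
    return { allowed: true };
  }
  return { allowed: false, reason: `"${commandLine}" needs explicit approval` };
}

// Assumed argument shape: a task id follows "task delete".
console.log(checkCommand(["task", "list"], []));
console.log(checkCommand(["task", "delete", "EXAMPLE_TASK_ID"], []));
```

Approving whole command lines rather than command names means an approval for one task cannot be reused against a different target.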
The long-term direction of this project is not limited to Octoparse interfaces.
I want it to evolve into a more agent-friendly automation command layer. That means not only exposing Octoparse capabilities, but also adding non-Octoparse features that are still important to automation workflows, such as:
- `webfetch`-based data retrieval
- Chromium-based browser scraping and interaction
- more structured output designed for agent consumption
- clearer task orchestration, parameter validation, and error normalization
- local provider capabilities designed for automation pipelines
In other words, the longer-term goal is not just "a CLI for Octoparse APIs", but an automation CLI that uses Octoparse as a core entry point without being permanently limited by Octoparse interface boundaries.
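To make "structured output" and "error normalization" more concrete, here is a minimal sketch of a result envelope an agent could branch on without parsing free-form text. The `CliResult` shape, the `run` wrapper, and the error codes are assumptions made for illustration and do not describe the current output of this CLI.

```typescript
// Hypothetical result envelope; not the CLI's actual output format.
type CliResult<T> =
  | { ok: true; data: T }
  | { ok: false; error: { code: string; message: string } };

// Convert any thrown value into the same error shape.
function normalizeError(err: unknown): CliResult<never> {
  if (err instanceof Error) {
    return { ok: false, error: { code: err.name, message: err.message } };
  }
  return { ok: false, error: { code: "UNKNOWN", message: String(err) } };
}

// Run an action and always return an envelope instead of throwing.
function run<T>(action: () => T): CliResult<T> {
  try {
    return { ok: true, data: action() };
  } catch (err) {
    return normalizeError(err);
  }
}

const succeeded = run(() => ({ rows: 5 }));
const failed = run(() => {
  throw new TypeError("size must be a positive integer");
});
console.log(JSON.stringify(succeeded));
// {"ok":true,"data":{"rows":5}}
console.log(JSON.stringify(failed));
// {"ok":false,"error":{"code":"TypeError","message":"size must be a positive integer"}}
```

A single `ok` discriminator lets an agent decide whether to continue a workflow before it looks at any command-specific fields.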
- continue filling in the most useful but still incomplete CLI commands
- improve output formatting, help text, and error messages
- keep expanding test coverage to reduce misuse risk around sensitive commands
- keep the npm package structure lean and publish-friendly
- continue fixing early-stage implementation edge cases
- expand coverage for commonly used Octoparse interfaces
- strengthen template task, schedule, export, and task URL modules
- improve structured outputs for AI agent integration
- add finer-grained safety prompts and guardrails for risky operations
- provide clearer module docs, usage examples, and best practices
- add `webfetch` data retrieval capability
- add Chromium-based browser scraping capability
- explore a more unified abstraction between local scraping providers and Octoparse task capabilities
- gradually evolve this project into a broader AI-agent automation data collection toolkit
This project comes from a real operational need, not from trying to make a CLI that only looks complete on the surface.
If you are also a heavy Octoparse user and are seriously exploring AI agents, automation orchestration, and data workflow tooling, this project should be useful. But because it is still evolving quickly, please use it with clear expectations and with appropriate caution.