Harsh Patel edited this page Jun 20, 2026 · 6 revisions

Skills

Skills can be used to add custom actions to Kodo beyond basic mouse and keyboard actions.

A skill can teach the planner how to approach a task, give actors guidance or an alternative way to complete a specific task, explain how to use other applications, or all of the above.

Skills can have actions attached to them. An action corresponds to executable Python code and gives the LLM another way of performing certain actions.

Skill Types

Documentation Skills

Documentation skills teach the LLM how to operate specific applications or complete specific tasks. They contain no executable code and serve only as extra information on how to approach certain tasks.

The python skill documents the built-in python action, which lets Kodo run code in its built-in Python Runner module.

Documentation skills include a planner_skill.md, used by the planner, and an actor_skill.md, used by both the Actor and the Autonomy Mode Actor.

Executable Skills

Executable-Static Skills

Static skills include documentation and a Python file that is run to perform a certain task. The documentation stays the same on every run and does not take any runtime factors into account.

browser-navigation and toast_notifications are both static skills: each includes documentation (on working with a browser and on sending toast notifications, respectively) and exposes a custom action.

{"action": "open_url", "url": "https://youtube.com"}
{"action": "send_toast", "title": "Done", "body": "Task complete"}

When the LLM emits a registered action name, it is passed on to the skill orchestrator, which will call the Python file associated with the action with the JSON emitted by the LLM.
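
The routing described above can be pictured as a lookup from action name to handler. The following is a minimal sketch, not Kodo's orchestrator: the real orchestrator calls the skill's Python file, and how the JSON reaches that file is not specified here. The registry ACTIONS, the function dispatch, and both handlers are hypothetical names used for illustration only.

```python
import json

# Hypothetical handlers; a real skill would open a browser or show a toast.
def open_url(payload):
    return f"opening {payload['url']}"

def send_toast(payload):
    return f"toast: {payload['title']} - {payload['body']}"

# Illustrative registry mapping registered action names to handlers.
ACTIONS = {"open_url": open_url, "send_toast": send_toast}

def dispatch(raw_json):
    """Route one JSON action emitted by the LLM to its handler."""
    payload = json.loads(raw_json)
    handler = ACTIONS.get(payload.get("action"))
    if handler is None:
        raise ValueError(f"unregistered action: {payload.get('action')!r}")
    # The handler receives the full JSON object, including the action name.
    return handler(payload)

if __name__ == "__main__":
    print(dispatch('{"action": "open_url", "url": "https://youtube.com"}'))
```

Passing the whole JSON object to the handler mirrors the description above, where the Python file is called with the JSON emitted by the LLM. Rejecting unknown action names keeps unregistered actions from reaching any skill code.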

Executable-Dynamic Skills

Dynamic skills ship without planner_skill.md or actor_skill.md. Instead, the documentation is generated at runtime by calling the Python file with the --generate flag.

Because the documentation is generated, the skill creator can tailor it to factors such as installed apps or the Windows version.

launch-windows-app is the example here. It scans the Start Menu at runtime to find every installed .lnk shortcut, then injects the real list of launchable apps into both the planner and actor prompts. The model always sees your actual installed apps rather than a hardcoded list.

{"action": "open_app", "app": "Microsoft Word"}

The --generate output must look like this:

{
  "planner": "## My Skill\nPlanner documentation here...",
  "actor": "## My Skill\nActor documentation here..."
}
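
As a rough illustration of a --generate entry point, the sketch below scans a directory for .lnk shortcuts and builds the two documentation strings in the shape shown above. It is not the launch-windows-app implementation. Several things are assumptions: that the JSON is printed to standard output, that the scan root is passed as an extra argument (a hypothetical convention, defaulting to the current directory), and the names find_shortcuts and build_docs.

```python
import json
import sys
from pathlib import Path

def find_shortcuts(root):
    """Return sorted, de-duplicated names of .lnk shortcuts under root."""
    return sorted({p.stem for p in Path(root).rglob("*.lnk")})

def build_docs(app_names):
    """Build the planner and actor documentation strings."""
    app_list = "\n".join(f"- {name}" for name in app_names) or "- (none found)"
    return {
        "planner": f"## Launch App\nApps that can be launched:\n{app_list}",
        "actor": f"## Launch App\nEmit open_app with one of:\n{app_list}",
    }

if __name__ == "__main__":
    if sys.argv[1:2] == ["--generate"]:
        # Assumption: an optional second argument names the directory to scan.
        root = sys.argv[2] if len(sys.argv) > 2 else "."
        # Assumption: the generated JSON is written to standard output.
        print(json.dumps(build_docs(find_shortcuts(root)), indent=2))
```

Keeping the scan and the document building in separate functions makes it easy to swap in another runtime factor, such as the Windows version, without touching the output shape.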

You can also use the provided AGENTS.md, which contains all the documentation needed for LLMs to create skills.

Skill Definitions

The skill.json file defines metadata about your skill. Its contents depend on the type of skill you are creating, but at a minimum it must contain the name and the description.

The name and the description are what is fed to the LLM when it decides whether to install a skill to perform your task.

Skills can be enabled or disabled using the enabled flag. Prefixing the skill's folder name with __ also disables it, since Kodo ignores such folders.
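
The rules in this section can be summarized in a few lines of Python. This is only a sketch under stated assumptions: that the enabled flag lives in skill.json, and that a missing flag means the skill is enabled. Neither is confirmed above. The function names missing_fields and is_active are hypothetical.

```python
def missing_fields(meta):
    """Return the required skill.json fields that are absent or empty."""
    return [key for key in ("name", "description") if not meta.get(key)]

def is_active(folder_name, meta):
    """Decide whether a skill should be loaded or ignored."""
    # A folder prefixed with __ is ignored regardless of the flag.
    if folder_name.startswith("__"):
        return False
    # Assumption: a missing enabled flag is treated as enabled.
    return bool(meta.get("enabled", True))

if __name__ == "__main__":
    meta = {"name": "toast_notifications", "description": "Send toasts"}
    print(missing_fields(meta), is_active("toast_notifications", meta))
```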
