igtm/skill-search

skill-search

LLM-framework-agnostic skill search library.

Parses SKILL.md files in a skills directory, builds a fast full-text search index using tantivy (Rust-based), and provides tool definitions + execution handlers that work with any LLM library (OpenAI SDK, Anthropic SDK, LangChain, LiteLLM, etc.).

Japanese README

Installation

pip install skill-search

Usage

Basic

from skill_search import SkillSearch

# Initialize with skill directories
ss = SkillSearch(skills_dirs=["./skills"])

# Get tool definitions (OpenAI function calling format)
tools = ss.get_tool_definitions()

# Get system prompt with skill listing
system_prompt = ss.get_system_prompt()

# Execute tool calls from LLM
result = ss.call_tool("search_skills", {"query": "API reference", "top_k": 3})

OpenAI SDK

import json
from openai import OpenAI
from skill_search import SkillSearch

client = OpenAI()
ss = SkillSearch(skills_dirs=["./skills"])

messages = [
    {"role": "system", "content": ss.get_system_prompt()},
    {"role": "user", "content": "How do I use the Figma API?"},
]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=ss.get_tool_definitions(),
)

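# Handle tool calls
for choice in response.choices:
    if choice.message.tool_calls:
        # The assistant message that requested the tools must come before the tool results
        messages.append(choice.message)
        for tc in choice.message.tool_calls:
            # Arguments arrive as a JSON string; call_tool() expects a dict
            result = ss.call_tool(
                tc.function.name,
                json.loads(tc.function.arguments),
            )
            messages.append({"role": "tool", "content": result, "tool_call_id": tc.id})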

Anthropic SDK

from anthropic import Anthropic
from skill_search import SkillSearch

client = Anthropic()
ss = SkillSearch(skills_dirs=["./skills"])

# Convert to Anthropic format
tools = [
    {
        "name": t["function"]["name"],
        "description": t["function"]["description"],
        "input_schema": t["function"]["parameters"],
    }
    for t in ss.get_tool_definitions()
]

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    system=ss.get_system_prompt(),
    messages=[{"role": "user", "content": "How do I search in Jira?"}],
    tools=tools,
)

# Handle tool use
for block in response.content:
    if block.type == "tool_use":
        result = ss.call_tool(block.name, block.input)
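
The loop above computes `result` but does not show how it gets back to the model. As a minimal sketch, the result can be wrapped in a `tool_result` content block, which is the shape the Anthropic Messages API expects for tool output. The helper name `build_tool_result_message` and the placeholder ID are hypothetical and not part of skill-search.

```python
from typing import Any


def build_tool_result_message(tool_use_id: str, result: str) -> dict[str, Any]:
    """Wrap a tool result string in a user message carrying a tool_result block."""
    return {
        "role": "user",
        "content": [
            {
                "type": "tool_result",
                "tool_use_id": tool_use_id,  # must match the id of the tool_use block
                "content": result,
            }
        ],
    }


# Placeholder values for illustration only, not real API output.
message = build_tool_result_message("toolu_example", "skill text")
print(message["content"][0]["type"])
```

In the loop above, this would presumably be called with `block.id` and `result`. The follow-up `client.messages.create` call would then receive the original user message, an assistant message whose content is `response.content`, and this message.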

Available Tools

| Tool | Description |
| --- | --- |
| list_skills | List all available skills |
| read_skill | Read full SKILL.md content |
| search_skills | Full-text search (BM25 via tantivy) |
| read_resource | Read supplementary resource files |

SKILL.md Format

---
name: my-skill
description: Brief description of the skill
---

# My Skill

## Usage

1. Step 1
2. Step 2

How It Works

  1. Discovery — Recursively scans for SKILL.md files and parses YAML frontmatter
  2. Indexing — Splits documents into heading-level chunks and indexes with tantivy
  3. Search — BM25 scoring returns results ranked by relevance
  4. Tool Execution — call_tool() executes LLM tool calls and returns results as strings
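
The indexing and search steps can be illustrated in pure Python. This is only a sketch of the idea and stands in for tantivy, which does the real work in the library. The tokenizer, the function names, and the parameter values `k1=1.2` and `b=0.75` are assumptions chosen for the example, and headings inside fenced code blocks are not treated specially.

```python
import math
import re
from collections import Counter


def split_by_headings(markdown: str) -> list[tuple[str, str]]:
    """Return (heading, text) chunks, one per Markdown heading."""
    chunks: list[tuple[str, str]] = []
    heading, buf = "", []
    for line in markdown.splitlines():
        if re.match(r"#{1,6}\s", line):
            # A new heading closes the previous chunk, if it has any content.
            if heading or any(s.strip() for s in buf):
                chunks.append((heading, "\n".join(buf).strip()))
            heading, buf = line.lstrip("#").strip(), []
        else:
            buf.append(line)
    if heading or any(s.strip() for s in buf):
        chunks.append((heading, "\n".join(buf).strip()))
    return chunks


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_rank(query: str, docs: list[str], k1: float = 1.2, b: float = 0.75) -> list[tuple[int, float]]:
    """Rank docs by BM25 score; returns (doc_index, score) pairs, best first."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n if n else 0.0
    # Document frequency: number of docs containing each term.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for i, terms in enumerate(tokenized):
        tf = Counter(terms)
        score = 0.0
        for q in tokenize(query):
            if q not in tf:
                continue
            idf = math.log(1 + (n - df[q] + 0.5) / (df[q] + 0.5))
            norm = tf[q] + k1 * (1 - b + b * len(terms) / avgdl)
            score += idf * tf[q] * (k1 + 1) / norm
        scores.append((i, score))
    return sorted(scores, key=lambda s: s[1], reverse=True)


doc = "# My Skill\n\nIntro text.\n\n## Usage\n\nCall the Figma API.\n\n## Notes\n\nNothing here.\n"
chunks = split_by_headings(doc)
ranking = bm25_rank("figma api", [f"{h} {t}" for h, t in chunks])
print(chunks[ranking[0][0]][0])  # Usage
```

Chunking at heading level means a query can match one section of a long SKILL.md rather than the whole file, which keeps the returned snippets short.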

Security

LLM tool calls are treated as untrusted input and guarded by defense-in-depth checks.

Path Traversal Prevention

| Layer | Location | Protection |
| --- | --- | --- |
| Discovery | discover_resources() | Symlinks resolved; paths outside skill directory excluded |
| Input validation | read_resource handler | Resource names containing .. are rejected |
| Path resolution | read_resource handler | Resolved paths verified to be within skill directory |
| Whitelist | read_resource handler | Only pre-discovered resources are accessible |
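
The handler-side layers in the table might look roughly like the following. This is a sketch under assumptions, not the library's code: the function name `resolve_resource`, its signature, and the error messages are hypothetical, and `Path.is_relative_to` requires Python 3.9 or later.

```python
import tempfile
from pathlib import Path


def resolve_resource(skill_dir: Path, name: str, allowed: set[Path]) -> Path:
    """Return the resolved path for `name`, or raise ValueError."""
    # Input validation: reject any name containing "..".
    if ".." in name:
        raise ValueError("resource name must not contain '..'")
    root = skill_dir.resolve()
    # Path resolution: follow symlinks and normalize the path.
    candidate = (root / name).resolve()
    # Boundary check: the resolved path must stay inside the skill directory.
    if not candidate.is_relative_to(root):
        raise ValueError("resource is outside the skill directory")
    # Whitelist: only paths found during discovery may be read.
    if candidate not in allowed:
        raise ValueError("resource was not pre-discovered")
    return candidate


with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "my-skill"
    skill_dir.mkdir()
    (skill_dir / "reference.md").write_text("notes")
    allowed = {(skill_dir / "reference.md").resolve()}

    ok = resolve_resource(skill_dir, "reference.md", allowed)
    try:
        resolve_resource(skill_dir, "../secret.txt", allowed)
        rejected = False
    except ValueError:
        rejected = True

print(ok.name, rejected)  # reference.md True
```

Each check is redundant with the others on purpose: an absolute name such as `/etc/passwd` passes the `..` test but fails the boundary check, and a file created after discovery passes the boundary check but fails the whitelist.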

Extension Filter

Only these file extensions are indexed as resources:

.md, .json, .yaml, .yml, .csv, .xml, .txt

Executable files (.py, .sh, .exe, etc.) and binaries are excluded.
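
A filter of this kind can be expressed as a simple allow-list check. The sketch below uses the extensions listed above; the function name `is_indexable` and the case-insensitive comparison are assumptions, not confirmed behavior of the library.

```python
from pathlib import Path

# Extensions taken from the list above.
ALLOWED_EXTENSIONS = {".md", ".json", ".yaml", ".yml", ".csv", ".xml", ".txt"}


def is_indexable(path: Path) -> bool:
    """True if the file extension is on the allow-list (compared case-insensitively)."""
    return path.suffix.lower() in ALLOWED_EXTENSIONS


names = ["guide.md", "schema.JSON", "install.sh", "tool.py", "README"]
kept = [n for n in names if is_indexable(Path(n))]
print(kept)  # ['guide.md', 'schema.JSON']
```

An allow-list fails closed: a file type nobody thought about is excluded by default, which a deny-list of known executables would not guarantee.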

Design Principles

  • Read-only — No write or execute capabilities
  • Whitelist-based — Only pre-validated resources are accessible
  • Defense-in-depth — Input validation → path resolution → directory boundary check

Development

uv run pytest tests/ -v

License

MIT
