Extract source file skeletons using tree-sitter queries.
Removes function implementations while preserving structure, signatures, and docstrings. Supports 17 programming languages with a clean, fully-typed Python API and comprehensive CLI.
Requires: tree-sitter >= 0.25
- ✅ 17 Languages - Python, JS/TS, Java, Kotlin, Go, Rust, C/C++, C#, Ruby, PHP, Swift, Lua, Scala, Groovy, Objective-C
- ✅ Smart Extraction - Functions, methods, constructors, arrow functions, getters/setters
- ✅ Preserved Elements - Signatures, class definitions, imports, docstrings, decorators
- ✅ All File Types - Process any non-binary text files (code, markdown, JSON, YAML, etc.)
- ✅ Binary Detection - Automatically skips binary files
- ✅ Ignore Patterns - Built-in + custom .gitignore support
- ✅ Fully Typed - Complete type hints throughout
- ✅ CLI & Library - Use as command-line tool or Python library
# With uv (recommended)
uv pip install loppers
# With pip
pip install loppersThe public API consists of 5 core functions:
Extract skeleton from source code by language identifier.
from loppers import extract_skeleton
code = """
def calculate(x: int, y: int) -> int:
'''Calculate sum.'''
result = x + y
return result
"""
skeleton = extract_skeleton(code, "python")
print(skeleton)Output:
def calculate(x: int, y: int) -> int:
'''Calculate sum.'''Extract skeleton from a file by auto-detecting language from extension.
from loppers import get_skeleton
skeleton = get_skeleton("src/main.py")
print(skeleton)
# With header showing file path
skeleton = get_skeleton("src/main.py", add_header=True)
# Output: "--- /path/to/src/main.py\n..."Raises:
FileNotFoundError- If file doesn't existValueError- If file language is unsupported
3. find_files(root: str | Path, *, recursive: bool = True, ignore_patterns: Sequence[str] | None = None, use_default_ignore: bool = True, respect_gitignore: bool = True) -> list[str]
Collect all non-binary text files from a root directory.
from loppers import find_files
# Find all text files in src/ recursively (default)
files = find_files("src/")
# Returns file paths relative to root:
# ['main.py', 'utils.py', 'config.yaml', 'README.md']
# Non-recursive
files = find_files("src/", recursive=False)
# Custom ignore patterns (gitignore syntax)
files = find_files(
"src/",
ignore_patterns=["*.test.py", "venv/"],
use_default_ignore=True, # Still applies built-in patterns
respect_gitignore=True, # Still respects .gitignore
)Features:
- Takes single root directory (not multiple paths)
- Returns file paths relative to root
- Automatically excludes binary files (images, archives, etc.)
- Respects
.gitignoreby default - Supports custom gitignore-style ignore patterns
- Built-in patterns exclude node_modules, .git, pycache, build artifacts, etc.
- Works with ALL non-binary text files (code, markdown, JSON, YAML, etc.)
4. get_tree(root: str | Path, *, recursive: bool = True, ignore_patterns: Sequence[str] | None = None, use_default_ignore: bool = True, respect_gitignore: bool = True, collapse_single_dirs: bool = False, show_sizes: bool = False) -> str
Display formatted directory tree from a root directory.
from loppers import get_tree
# Display tree of src/ directory recursively
tree = get_tree("src/")
print(tree)
# Non-recursive tree
tree = get_tree("src/", recursive=False)
# With custom ignore patterns
tree = get_tree("src/", ignore_patterns=["*.test.py"])
# Collapse deep single-child directories (useful for Java packages)
tree = get_tree("src/", collapse_single_dirs=True)
# Show file sizes in human-friendly format
tree = get_tree("src/", show_sizes=True)
# Combine multiple options
tree = get_tree("src/", collapse_single_dirs=True, show_sizes=True)Output (with collapse_single_dirs=True and show_sizes=True):
.
├─ main/java/com/example
│ ├─ Source.java (2.3KB)
│ └─ Util.java (1.8KB)
└─ resources/config.yaml (512B)
5. concatenate_files(root: str | Path, file_paths: Sequence[str | Path], *, extract: bool = True, ignore_not_found: bool = False) -> str
Concatenate files with optional skeleton extraction. Useful when you already have a list of file paths that you want to combine.
from loppers import concatenate_files, find_files
# Get list of files to concatenate
root = "src/"
files = find_files(root)
# Concatenate with skeleton extraction (default)
result = concatenate_files(root, files)
print(result)
# Concatenate without extraction (include original content)
result = concatenate_files(root, files, extract=False)
# Ignore files that don't exist or can't be processed
result = concatenate_files(root, files, ignore_not_found=True)Output format:
--- path/to/file.py
def calculate(x: int) -> int:
'''Calculate.'''
--- path/to/other.js
function process() {
}
Features:
- Files are concatenated with headers showing their relative paths
- Each file separated by newlines
- Automatically extracts skeletons from code files (unless
extract=False) - Falls back to original content for unsupported file types
- Can optionally ignore files that don't exist or can't be processed
Parameters:
root- Root directory (file paths are relative to this)file_paths- List of file paths (relative to root) to concatenateextract- Extract skeletons from code files (defaultTrue)ignore_not_found- Ignore files that cannot be found or processed (defaultFalse)
Raises:
FileNotFoundError- If root doesn't exist or a file is not found (whenignore_not_found=False)ValueError- If no file paths provided or no files could be processedNotADirectoryError- If root is not a directory
get_language(extension: str) -> str | None - Get language identifier from file extension.
from loppers import get_language
get_language(".py") # "python"
get_language(".js") # "javascript"
get_language(".json") # None (no extraction for data files)Loppers provides 4 subcommands for common tasks.
loppers --version
loppers --helpExtract a single file's skeleton:
# From file
loppers extract file.py
loppers extract file.py -o skeleton.py
# From stdin with explicit language
echo 'def foo(): pass' | loppers extract -l python
# Verbose output
loppers extract file.py -vOptions:
FILE- File to extract (omit for stdin)-l, --language- Language identifier (auto-detected from extension if FILE provided, required for stdin)-o, --output- Output file (default: stdout)-v, --verbose- Print status to stderr
Process root directory with automatic skeleton extraction:
# Recursive (default)
loppers concatenate src/
# Non-recursive
loppers concatenate --no-recursive src/
# Save to file
loppers concatenate src/ -o combined.txt
# Verbose with progress
loppers concatenate -v src/
# Include original files without extraction
loppers concatenate --no-extract src/
# Custom ignore patterns
loppers concatenate -I "*.test.py" -I "venv/" src/
# Disable default ignores
loppers concatenate --no-default-ignore src/
# Don't respect .gitignore
loppers concatenate --no-gitignore src/Features:
- Processes a single root directory (paths relative to root)
- Includes ALL non-binary text files (code, markdown, JSON, YAML, etc.)
- Automatically extracts skeletons for supported code files
- Includes original content for unsupported file types (graceful degradation)
- Each file prefixed with
--- filepathheader (relative path) - Verbose mode shows extraction status for each file
Options:
root- Root directory to process (required)-o, --output- Output file (default: stdout)--no-extract- Include original files without extraction-I, --ignore-pattern- Add custom ignore pattern (gitignore syntax, can be used multiple times)--no-default-ignore- Disable built-in ignore patterns--no-gitignore- Don't respect .gitignore--no-recursive- Don't recursively traverse directories-v, --verbose- Print status to stderr
Display a formatted tree of all discovered files:
# Recursive tree (default)
loppers tree src/
# Non-recursive
loppers tree --no-recursive src/
# Save tree to file
loppers tree src/ -o tree.txt
# With ignore patterns
loppers tree -I "*.test.py" src/
# Collapse deep single-child directories (useful for Java packages)
loppers tree --collapse-single-dirs src/
# Show file sizes in human-friendly format
loppers tree --show-sizes src/
# Combine multiple options
loppers tree --collapse-single-dirs --show-sizes src/Options:
root- Root directory to process (required)-o, --output- Output file (default: stdout)-I, --ignore-pattern- Add custom ignore pattern--no-default-ignore- Disable built-in ignore patterns--no-gitignore- Don't respect .gitignore--no-recursive- Non-recursive tree--collapse-single-dirs- Collapse directories with single children (e.g.,java/com/examplebecomes one line)--show-sizes- Show file sizes in human-friendly format (e.g., "1.2KB", "5.0MB")-v, --verbose- Print status to stderr
Collapse Example:
Without collapse:
.
└─ src
└─ main
└─ java
└─ com
└─ example
├─ Source.java
└─ Util.java
With --collapse-single-dirs:
.
└─ src/main/java/com/example
├─ Source.java
└─ Util.java
Print one discovered file per line (relative to root):
# List all files recursively (default)
loppers files src/
# Save list to file
loppers files src/ -o file_list.txt
# Non-recursive
loppers files --no-recursive src/
# With custom ignores
loppers files -I "*.md" src/Options:
root- Root directory to process (required)-o, --output- Output file (default: stdout)-I, --ignore-pattern- Add custom ignore pattern--no-default-ignore- Disable built-in ignore patterns--no-gitignore- Don't respect .gitignore--no-recursive- Non-recursive listing-v, --verbose- Print status to stderr
Before:
class Calculator:
def __init__(self, name: str):
"""Initialize calculator."""
self.name = name
self._setup()
def process(self, data):
"""Process data."""
result = []
for item in data:
result.append(item * 2)
return resultAfter:
class Calculator:
def __init__(self, name: str):
"""Initialize calculator."""
def process(self, data):
"""Process data."""Before:
class UserService {
constructor(baseUrl: string) {
this.baseUrl = baseUrl;
this.cache = {};
}
async getUser(id: string) {
if (this.cache[id]) return this.cache[id];
const user = await fetch(this.baseUrl + '/' + id);
return user.json();
}
}After:
class UserService {
constructor(baseUrl: string) {
}
async getUser(id: string) {
}
}Before:
public class UserService {
private String baseUrl;
public UserService(String baseUrl) {
this.baseUrl = baseUrl;
this.validate();
}
public User getUserById(String id) {
Database db = new Database();
return db.query(id);
}
private void validate() {
if (baseUrl == null) {
throw new IllegalArgumentException("BaseUrl required");
}
}
}After:
public class UserService {
private String baseUrl;
public UserService(String baseUrl) {
}
public User getUserById(String id) {
}
private void validate() {
}
}| Language | Features |
|---|---|
| Python | Functions, methods, __init__, @property, docstrings |
| JavaScript/TypeScript | Functions, arrow functions, methods, async/await |
| Java | Methods, constructors, static methods, annotations |
| Kotlin | Functions, methods, properties (getters/setters) |
| Go | Functions, methods, closures |
| Rust | Functions, methods, closures |
| C/C++ | Functions, methods, constructors |
| C# | Methods, properties (get/set), async/await |
| Ruby | Methods, singleton methods, blocks |
| PHP | Functions, methods, closures |
| Swift | Functions, methods, closures |
| Lua | Functions, local functions |
| Scala | Functions, methods, closures |
| Groovy | Functions, methods, closures |
| Objective-C | Methods, instance/class methods |
- ✅ Function/method signatures
- ✅ Parameter types and defaults
- ✅ Return types
- ✅ Class definitions
- ✅ Import statements
- ✅ Comments
- ✅ Python docstrings
- ✅ Decorators
- ✅ Access modifiers (public, private, protected)
- ❌ Function/method bodies
- ❌ Local variable assignments
- ❌ Logic and implementation details
- ❌ Nested function implementations
- Concise arrow functions (
const f = x => x * 2) - no body to remove - Python lambdas - no body to remove
- Some edge cases with getters/setters in JavaScript/TypeScript
Loppers uses tree-sitter queries to parse source code into Abstract Syntax Trees (AST) and intelligently remove function/method bodies while preserving:
- Function/method signatures
- Class and interface definitions
- Import statements
- Python docstrings
- Comments
- Decorators
- Type hints
Each language has custom tree-sitter query patterns that capture function/method body nodes, which are then removed line-by-line.
# Install with dev dependencies
uv sync# All tests
uv run pytest
# Verbose output
uv run pytest -v
# With coverage
uv run pytest --cov=loppers --cov-report=html
# Specific test
uv run pytest tests/test_loppers.py::test_python_extraction# Check and fix
uv run ruff check . --fix
# Format
uv run ruff format .
# All checks
uv run ruff check . --fix && uv run ruff format .To add support for a new language:
-
Find the tree-sitter query - Use the tree-sitter playground to develop a query that captures function bodies
-
Add to LANGUAGE_CONFIGS in
src/loppers/loppers.py:LANGUAGE_CONFIGS["mylang"] = LanguageConfig( name="mylang", body_query="(function_definition body: (block) @body)", )
-
Add file extensions to
src/loppers/extensions.py:EXTENSION_TO_LANGUAGE = { ".ml": "mylang", ".mli": "mylang", }
-
Write a test in
tests/test_loppers.py:def test_mylang_extraction(self): code = "fun hello() { print('hi') }" skeleton = extract_skeleton(code, "mylang") assert "fun hello()" in skeleton assert "print" not in skeleton
-
Run tests to verify everything works
loppers/
├── src/loppers/
│ ├── __init__.py # Public API: extract_skeleton, get_skeleton, find_files, get_tree, concatenate_files
│ ├── loppers.py # Core extraction logic with SkeletonExtractor class
│ ├── source_utils.py # Convenience API and file operations
│ ├── extensions.py # Language extension mapping
│ ├── ignore_patterns.py # Default ignore patterns
│ ├── mapping.py # Backwards compatibility re-exports
│ └── cli.py # Command-line interface (4 subcommands)
├── tests/
│ └── test_loppers.py # Unit tests (38 tests)
├── pyproject.toml # Project configuration
├── README.md # This file
└── CLAUDE.md # Development guide for Claude Code
Runtime:
tree-sitter>=0.25.0- AST parsing librarytree-sitter-language-pack>=0.10.0- Language grammarsbinaryornot>=0.4.4- Binary file detectionpathspec>=0.9.0- .gitignore pattern matching
Development:
pytest>=7.0.0- Testing frameworkpytest-cov>=4.0.0- Coverage reportingruff>=0.1.0- Linting and formatting
- tree-sitter Documentation
- tree-sitter-language-pack
- binaryornot Library
- pathspec Library
- uv Package Manager
MIT - See LICENSE file for details