AST-aware code chunking for semantic search and embeddings. Chisel parses source code into meaningful units—functions, classes, methods—preserving the context that makes code searchable.
source := []byte(`
func New(cfg Config) *Handler { ... }

func (h *Handler) ServeHTTP(w http.ResponseWriter, r *http.Request) { ... }

type Config struct {
	Timeout time.Duration
	Logger  *slog.Logger
}
`)

chunks, _ := c.Chunk(ctx, chisel.Go, "api.go", source)
for _, chunk := range chunks {
	fmt.Printf("[%s] %s (lines %d-%d)\n", chunk.Kind, chunk.Symbol, chunk.StartLine, chunk.EndLine)
}
// [function] New (lines 2-2)
// [method] Handler.ServeHTTP (lines 4-4)
// [class] Config (lines 6-9)

Every chunk carries its symbol name, kind, line range, and parent context. Methods know their receiver. Nested types know their enclosing scope.
chunk := chunks[1]
// chunk.Symbol → "Handler.ServeHTTP"
// chunk.Kind → "method"
// chunk.Context → ["Handler"]
// chunk.Content → the full method source
// chunk.StartLine → 4
// chunk.EndLine → 4

Feed chunks to an embedding model, store them in a vector database, and search code by meaning rather than by text.
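
As an illustration of how symbol names, receivers, and line ranges can be recovered from Go source with the standard library alone, here is a minimal sketch built on `go/parser` and `go/ast`. It is not chisel's implementation: the `Unit` type, the `extractUnits` and `receiverName` helpers, and the `"type"` kind label are all hypothetical names chosen for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// Unit is a hypothetical stand-in for a chunk; it is not chisel's type.
type Unit struct {
	Kind      string
	Symbol    string
	Context   []string
	StartLine int
	EndLine   int
}

// receiverName returns the receiver's type name, looking through a
// pointer star and any type parameters.
func receiverName(expr ast.Expr) string {
	switch t := expr.(type) {
	case *ast.StarExpr:
		return receiverName(t.X)
	case *ast.IndexExpr:
		return receiverName(t.X)
	case *ast.IndexListExpr:
		return receiverName(t.X)
	case *ast.Ident:
		return t.Name
	}
	return ""
}

// extractUnits parses Go source and returns one Unit per top-level
// function, method, and type declaration.
func extractUnits(filename string, src []byte) ([]Unit, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, filename, src, parser.SkipObjectResolution)
	if err != nil {
		return nil, err
	}
	line := func(p token.Pos) int { return fset.Position(p).Line }

	var units []Unit
	for _, decl := range file.Decls {
		switch d := decl.(type) {
		case *ast.FuncDecl:
			u := Unit{Kind: "function", Symbol: d.Name.Name,
				StartLine: line(d.Pos()), EndLine: line(d.End())}
			if d.Recv != nil && len(d.Recv.List) > 0 {
				// A method records its receiver as parent context.
				recv := receiverName(d.Recv.List[0].Type)
				u.Kind = "method"
				u.Symbol = recv + "." + d.Name.Name
				u.Context = []string{recv}
			}
			units = append(units, u)
		case *ast.GenDecl:
			if d.Tok != token.TYPE {
				continue
			}
			for _, spec := range d.Specs {
				ts := spec.(*ast.TypeSpec)
				units = append(units, Unit{Kind: "type", Symbol: ts.Name.Name,
					StartLine: line(ts.Pos()), EndLine: line(ts.End())})
			}
		}
	}
	return units, nil
}

func main() {
	src := []byte(`package api

func New(cfg Config) *Handler { return &Handler{cfg: cfg} }

func (h *Handler) Timeout() int { return h.cfg.Timeout }

type Handler struct {
	cfg Config
}

type Config struct {
	Timeout int
}
`)
	units, err := extractUnits("api.go", src)
	if err != nil {
		panic(err)
	}
	for _, u := range units {
		fmt.Printf("[%s] %s (lines %d-%d) context=%v\n",
			u.Kind, u.Symbol, u.StartLine, u.EndLine, u.Context)
	}
}
```

Unlike the snippet above, this sketch requires a `package` clause because it calls `parser.ParseFile` directly, and its line ranges exclude doc comments.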
go get github.com/zoobz-io/chisel

Language providers (install only what you need):
go get github.com/zoobz-io/chisel/golang # Go (stdlib, no deps)
go get github.com/zoobz-io/chisel/markdown # Markdown (no deps)
go get github.com/zoobz-io/chisel/typescript # TypeScript/JavaScript (tree-sitter)
go get github.com/zoobz-io/chisel/python # Python (tree-sitter)
go get github.com/zoobz-io/chisel/rust # Rust (tree-sitter)

Requires Go 1.24+.
package main

import (
	"context"
	"fmt"

	"github.com/zoobz-io/chisel"
	"github.com/zoobz-io/chisel/golang"
	"github.com/zoobz-io/chisel/typescript"
)

func main() {
	// Create a chunker with language providers
	c := chisel.New(
		golang.New(),
		typescript.New(),
		typescript.NewJavaScript(),
	)

	source := []byte(`
package auth
// Authenticate validates user credentials.
func Authenticate(username, password string) (*User, error) {
	// ...
}
// User represents an authenticated user.
type User struct {
	ID    string
	Email string
}
`)

	chunks, err := c.Chunk(context.Background(), chisel.Go, "auth.go", source)
	if err != nil {
		panic(err)
	}

	for _, chunk := range chunks {
		fmt.Printf("[%s] %s\n", chunk.Kind, chunk.Symbol)
		fmt.Printf(" Lines: %d-%d\n", chunk.StartLine, chunk.EndLine)
		if len(chunk.Context) > 0 {
			fmt.Printf(" Context: %v\n", chunk.Context)
		}
	}
}

Output:
[function] Authenticate
Lines: 4-6
[class] User
Lines: 8-12
| Feature | Description | Docs |
|---|---|---|
| Multi-language | Go, TypeScript, JavaScript, Python, Rust, Markdown | Providers |
| Semantic extraction | Functions, methods, classes, interfaces, types, enums | Concepts |
| Context preservation | Parent chain for nested definitions | Architecture |
| Line mapping | Precise source locations for each chunk | Types |
| Zero-dependency providers | Go and Markdown use stdlib only | Architecture |
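
To make the "parent chain" idea concrete for a non-code format, the sketch below splits Markdown at ATX headings and records each section's enclosing headings. It is a simplified illustration, not chisel's Markdown provider: the `Section` type and the `headingLevel` and `splitSections` helpers are hypothetical, only `#`-style headings are recognised, and fenced code is detected only by a leading triple backtick.

```go
package main

import (
	"fmt"
	"strings"
)

// Section is a hypothetical Markdown chunk: a heading plus the lines up to
// the next heading of any level.
type Section struct {
	Title     string
	Context   []string // enclosing heading titles, outermost first
	StartLine int
	EndLine   int
}

// headingLevel returns the ATX heading level (1-6) and title, or 0 if the
// line is not a heading.
func headingLevel(line string) (int, string) {
	trimmed := strings.TrimLeft(line, " ")
	level := 0
	for level < len(trimmed) && trimmed[level] == '#' {
		level++
	}
	if level == 0 || level > 6 {
		return 0, ""
	}
	if level < len(trimmed) && trimmed[level] != ' ' {
		return 0, "" // "#tag" is not a heading
	}
	return level, strings.TrimSpace(trimmed[level:])
}

// splitSections walks the document once, keeping a stack of open headings
// so each section can record its parent chain.
func splitSections(src string) []Section {
	type open struct {
		level int
		title string
	}
	var stack []open
	var sections []Section
	inFence := false
	lines := strings.Split(strings.TrimRight(src, "\n"), "\n")
	for i, line := range lines {
		lineNo := i + 1
		if strings.HasPrefix(strings.TrimSpace(line), "```") {
			inFence = !inFence
			continue
		}
		if inFence {
			continue // '#' inside fenced code is not a heading
		}
		level, title := headingLevel(line)
		if level == 0 {
			continue
		}
		if n := len(sections); n > 0 {
			sections[n-1].EndLine = lineNo - 1
		}
		// Pop headings at the same or a deeper level; what remains are parents.
		for len(stack) > 0 && stack[len(stack)-1].level >= level {
			stack = stack[:len(stack)-1]
		}
		ctx := make([]string, 0, len(stack))
		for _, o := range stack {
			ctx = append(ctx, o.title)
		}
		sections = append(sections, Section{Title: title, Context: ctx, StartLine: lineNo})
		stack = append(stack, open{level, title})
	}
	if n := len(sections); n > 0 {
		sections[n-1].EndLine = len(lines)
	}
	return sections
}

func main() {
	doc := "# Guide\nIntro.\n## Install\nSteps.\n### Linux\nNotes.\n## Usage\nMore.\n"
	for _, s := range splitSections(doc) {
		fmt.Printf("%s (lines %d-%d) context=%v\n", s.Title, s.StartLine, s.EndLine, s.Context)
	}
}
```
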
- Semantic boundaries: Chunks split at function/class boundaries, not arbitrary line counts
- Embedding-ready: Output designed for vector databases and semantic search
- Isolated dependencies: Tree-sitter only where needed; Go/Markdown have zero external deps
- Context-aware: Methods know their parent class; nested functions know their scope
- Consistent interface: Same `Provider` contract across all languages
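
For contrast with the first point, here is the naive baseline that AST-aware chunking replaces: splitting a file into fixed windows of lines. The `splitByLines` helper is a hypothetical function written for this example and is not part of chisel.

```go
package main

import (
	"fmt"
	"strings"
)

// splitByLines cuts src into consecutive windows of n lines, ignoring any
// syntactic structure. It returns nil when n is not positive.
func splitByLines(src string, n int) []string {
	if n <= 0 {
		return nil
	}
	lines := strings.Split(strings.TrimRight(src, "\n"), "\n")
	var chunks []string
	for start := 0; start < len(lines); start += n {
		end := start + n
		if end > len(lines) {
			end = len(lines)
		}
		chunks = append(chunks, strings.Join(lines[start:end], "\n"))
	}
	return chunks
}

func main() {
	src := `func A() {
	a := 1
	_ = a
}

func B() {
	b := 2
	_ = b
}
`
	for i, c := range splitByLines(src, 4) {
		fmt.Printf("--- chunk %d ---\n%s\n", i, c)
	}
}
```

With a window of four lines, `A` happens to fit in the first chunk, but `B` is cut: its closing brace ends up alone in the third chunk. An embedding of that fragment carries almost no meaning, which is the failure mode that splitting at declaration boundaries avoids.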
Chisel enables a pattern: parse once, search by meaning.
Your codebase becomes a corpus of semantic units. Each function, method, and type gets embedded with its full context — symbol name, parent scope, documentation. Queries match intent, not just text.
// Chunk your codebase
chunks, _ := c.Chunk(ctx, chisel.Go, path, source)

// Embed each chunk (using your embedding provider)
for _, chunk := range chunks {
	embedding := embedder.Embed(chunk.Content)
	vectorDB.Store(embedding, chunk.Symbol, chunk.Kind, path)
}

// Search by meaning
results := vectorDB.Query("authentication middleware")
// Returns: AuthMiddleware, ValidateToken, SessionHandler
// Not just files containing the word "authentication"

Symbol names and kinds become metadata. Line ranges enable source navigation. Context chains power hierarchical search.
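
The pipeline above leaves `embedder` and `vectorDB` to the reader. The following self-contained sketch shows one way the pieces could fit together, under stated assumptions: `Chunk` is a local type that mirrors the fields shown in this README rather than chisel's own type, `toyEmbed` is a hashed bag-of-words placeholder for a real embedding model, and `Index` is a hypothetical in-memory store ranked by cosine similarity.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math"
	"sort"
	"strings"
	"unicode"
)

// Chunk mirrors the fields shown in this README (local, hypothetical type).
type Chunk struct {
	Symbol, Kind, Content string
	Context               []string
	StartLine, EndLine    int
}

// embeddingText prefixes the content with kind, symbol, and parent chain so
// that the metadata also influences the embedding.
func embeddingText(c Chunk) string {
	header := c.Kind + " " + c.Symbol
	if len(c.Context) > 0 {
		header += " in " + strings.Join(c.Context, ".")
	}
	return header + "\n" + c.Content
}

// toyEmbed is a placeholder for a real model: a 64-bucket hashed word count.
func toyEmbed(text string) []float64 {
	vec := make([]float64, 64)
	words := strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	for _, w := range words {
		h := fnv.New32a()
		h.Write([]byte(w))
		vec[h.Sum32()%64]++
	}
	return vec
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// Hit is a search result: the chunk, its file, and its similarity score.
type Hit struct {
	Chunk Chunk
	Path  string
	Score float64
}

// Index is a brute-force in-memory vector store.
type Index struct {
	vecs [][]float64
	hits []Hit
}

func (ix *Index) Store(vec []float64, c Chunk, path string) {
	ix.vecs = append(ix.vecs, vec)
	ix.hits = append(ix.hits, Hit{Chunk: c, Path: path})
}

// Query returns up to k hits ordered by descending similarity.
func (ix *Index) Query(vec []float64, k int) []Hit {
	out := make([]Hit, len(ix.hits))
	copy(out, ix.hits)
	for i := range out {
		out[i].Score = cosine(vec, ix.vecs[i])
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	if k >= 0 && k < len(out) {
		out = out[:k]
	}
	return out
}

func main() {
	chunks := []Chunk{
		{Symbol: "ValidateToken", Kind: "function", Content: "check the session token and expiry", StartLine: 10, EndLine: 24},
		{Symbol: "Page.Render", Kind: "method", Context: []string{"Page"}, Content: "write the html template", StartLine: 30, EndLine: 41},
	}
	ix := &Index{}
	for _, c := range chunks {
		ix.Store(toyEmbed(embeddingText(c)), c, "web.go")
	}
	for _, h := range ix.Query(toyEmbed("session token"), 2) {
		// Line ranges let a caller jump straight to the source.
		fmt.Printf("%.2f %s %s:%d-%d\n", h.Score, h.Chunk.Symbol, h.Path, h.Chunk.StartLine, h.Chunk.EndLine)
	}
}
```

A word-count vector only matches shared vocabulary, so the search-by-meaning behaviour described above depends on substituting a real embedding model for `toyEmbed`.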
Chisel provides the chunking layer for code intelligence pipelines:
- vicky — Code search and retrieval service
- Learn
  - Overview: What chisel is and why
  - Quickstart: Get productive in minutes
  - Concepts: Core abstractions
  - Architecture: How it works internally
- Guides
  - Providers: Language-specific details
  - Testing: Testing code that uses chisel
  - Troubleshooting: Common issues
- Reference
Contributions welcome. See CONTRIBUTING.md for guidelines.
MIT — see LICENSE for details.