nyxCore-Systems/LIP

LIP — Linked Incremental Protocol

Website · Docs · Crates.io · License: MIT · Rust 1.78+

LIP is a persistent, incremental code intelligence daemon. It keeps a live, queryable graph of your entire repository and updates only the blast radius of each change — the files and symbols actually affected.


The problem

Every AI coding tool — Cursor, Copilot, Claude Code, Cody — needs to answer the same questions:

  • What breaks if I change this function signature?
  • Where is AuthService.verifyToken called from across the whole repo?
  • Which symbols depend on this external package version?
  • Is it safe to rename this interface?

All of them have built custom, proprietary, incompatible code graph layers to answer these. LIP is the open protocol for that layer — a standardized, persistent, live code intelligence graph that editors, AI agents, CI systems, and developer tools can all speak.


How it fits with LSP and SCIP

LSP — the editor protocol. In-memory, per-session, scoped to open files.
SCIP — the audit format. Compiler-precise snapshot of a whole repo, generated by CI.
LIP — the live intelligence layer. Persistent, incremental, always current.

| | LSP | SCIP | LIP |
|---|---|---|---|
| Scope | Open files only | Full repo snapshot | Full repo + deps, live |
| Freshness | Session lifetime | Stale between CI runs | Always current (<1 ms/save) |
| Change handling | Full re-parse per file | Full re-index O(N) | Blast radius O(Δ + depth) |
| Blast radius | No | Not native | First-class, bounded BFS |
| Cross-session persistence | No | Read-only snapshot | WAL journal, survives restart |
| Incremental sync | Fire-and-forget (drift bugs) | None | Acknowledged deltas + Merkle |
| Annotations on symbols | No | No | Yes, survives file changes |
| External dep indexing | None | Re-indexes every run | Federated CAS slices (once, shared) |
| AI agent integration | Limited | Read-only batch | MCP server, batch API, agent-lock |
| Type precision | Compiler-level | Compiler-level | Tier 1: syntactic · Tier 2: compiler |
| Data-flow / taint | No | Yes | No (Tier 2 roadmap) |
| Cold start | < 1 s | 30–90 min (large repo) | < 30 s shallow + background |

LIP is not a replacement for SCIP. SCIP has compiler-accurate precision for every language and is the right tool for data-flow analysis, compliance snapshots, and full CPG work. LIP imports SCIP artifacts (lip import --from-scip index.scip) and maintains freshness incrementally from that point.

LIP is not a replacement for LSP either. It ships an LSP bridge (lip lsp) so any editor sees it as a standard language server. The difference is that LIP's answers come from a persistent live graph rather than an in-memory server that restarts cold.

See LSP, SCIP & LIP for the full breakdown, including when to use all three together.


What makes LIP practical

Blast-radius indexing.
LIP tracks a reverse dependency graph per symbol. When a file is saved, it checks whether the exported API surface changed. If not — true for the vast majority of edits — zero downstream recomputation happens. If yes, only the files that import the changed symbols are re-verified. Results carry a risk level (low / medium / high), direct vs transitive separation, and per-item depth-weighted confidence.

Single-file edit, 500k-file repo:
  SCIP → re-index all 500k files      (~60 min)
  LIP  → re-verify ~10–200 files      (~200–800 ms)
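
The early cutoff and bounded BFS described above can be sketched roughly as follows. This is an illustrative model, not LIP's actual `query_graph` code: the `ReverseDeps` type, the surface-hash parameters, and the confidence weighting (halved per extra hop) are all assumptions made for the example.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// One affected file, with its BFS distance from the edited file.
#[derive(Debug, Clone, PartialEq)]
pub struct Affected {
    pub file: String,
    pub depth: usize,
    /// Hypothetical weighting: 1.0 for direct importers, halved per extra hop.
    pub confidence: f64,
}

/// Reverse dependency graph: file -> files that import it.
#[derive(Default)]
pub struct ReverseDeps {
    importers: HashMap<String, Vec<String>>,
}

impl ReverseDeps {
    pub fn add_import(&mut self, importer: &str, imported: &str) {
        self.importers
            .entry(imported.to_string())
            .or_default()
            .push(importer.to_string());
    }

    /// Early cutoff first, then a breadth-first walk bounded by `max_depth`.
    pub fn blast_radius(
        &self,
        changed: &str,
        old_surface_hash: u64,
        new_surface_hash: u64,
        max_depth: usize,
    ) -> Vec<Affected> {
        // Exported API surface unchanged: no downstream work at all.
        if old_surface_hash == new_surface_hash {
            return Vec::new();
        }
        let mut seen: HashSet<&str> = HashSet::from([changed]);
        let mut queue: VecDeque<(&str, usize)> = VecDeque::from([(changed, 0)]);
        let mut out = Vec::new();
        while let Some((file, depth)) = queue.pop_front() {
            if depth == max_depth {
                continue; // bound reached, do not expand further
            }
            for importer in self.importers.get(file).into_iter().flatten() {
                if seen.insert(importer.as_str()) {
                    let d = depth + 1;
                    out.push(Affected {
                        file: importer.clone(),
                        depth: d,
                        confidence: 0.5f64.powi(d as i32 - 1),
                    });
                    queue.push_back((importer.as_str(), d));
                }
            }
        }
        out
    }
}
```

Entries with `depth == 1` correspond to the direct importers and deeper entries to transitive ones; a risk level could be derived from the counts, but the thresholds for that are not specified here.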

Federated dependency slices.
External packages are content-addressed blobs. react@18.2.0 is indexed once — by anyone on the team — and stored in the registry keyed by its content hash. Every subsequent machine downloads it in under a second. node_modules, target/, and .pub-cache are never re-indexed locally again.
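
A minimal sketch of the "index once, keyed by content hash" idea follows. It is a toy in-memory model: `SliceRegistry` and `get_or_index` are hypothetical names, and FNV-1a is used only as a dependency-free stand-in for the cryptographic hash a real registry would need (the CLI refers to a sha256 hash).

```rust
use std::collections::HashMap;

/// FNV-1a 64-bit. A stand-in only: it is NOT collision-resistant and is used
/// here so the example needs no external crates.
pub fn content_key(bytes: &[u8]) -> String {
    let mut h: u64 = 0xcbf29ce484222325;
    for b in bytes {
        h ^= *b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    format!("{h:016x}")
}

/// Hypothetical in-memory registry: content key -> serialized slice.
#[derive(Default)]
pub struct SliceRegistry {
    slices: HashMap<String, String>,
    /// How many times the (expensive) indexer actually ran.
    pub index_runs: usize,
}

impl SliceRegistry {
    /// Returns `(key, slice)` for a package archive, running `index` only the
    /// first time this exact content is seen.
    pub fn get_or_index(
        &mut self,
        archive: &[u8],
        index: impl FnOnce(&[u8]) -> String,
    ) -> (String, String) {
        let key = content_key(archive);
        if !self.slices.contains_key(&key) {
            self.index_runs += 1;
            self.slices.insert(key.clone(), index(archive));
        }
        let slice = self.slices[&key].clone();
        (key, slice)
    }
}
```

Because the key depends only on the bytes, two machines that see the same package contents compute the same key, which is what lets a shared registry serve a slice built elsewhere.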

Progressive confidence.
Tree-sitter (< 1 ms/file) gives symbol search and go-to-definition immediately on file open. The compiler runs on the blast radius in the background and silently upgrades those results to compiler-level precision. The IDE never blocks.

Persistent annotations.
Symbols carry key/value annotations (lip:fragile, team:owner, agent:note, lip:nyx-agent-lock) that survive file changes, daemon restarts, and CI runs. Agents use these to coordinate work and leave notes that persist across sessions.

Batch API.
BatchQuery runs N queries under a single db lock acquisition: one Unix socket round-trip instead of N. Planning a 10-symbol refactor (blast radius + references + annotation checks per symbol) costs one round-trip instead of 30.

Semantic nearest-neighbour search (v1.3+).
When LIP_EMBEDDING_URL points to any OpenAI-compatible embeddings endpoint (Ollama, OpenAI, Together AI, …), LIP stores dense vectors for files and symbols. QueryNearest returns the most semantically similar files to a given file; QueryNearestByText embeds a free-text query on the fly; QueryNearestBySymbol finds symbols semantically similar to a given lip:// URI. BatchQueryNearestByText embeds N queries in a single HTTP round-trip. Embeddings are invalidated automatically on source change and are opt-in — all other LIP capabilities work without them.
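
For readers unfamiliar with nearest-neighbour search over embeddings, here is a small sketch of cosine-similarity ranking with a top-k cut. The `nearest` function and the in-memory store are illustrative assumptions; how LIP stores and scans vectors internally is not described here.

```rust
/// Cosine similarity of two equal-length vectors; 0.0 if either has zero norm.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "vectors must share a dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Brute-force top-k: score every stored vector, sort descending, truncate.
pub fn nearest<'a>(
    query: &[f32],
    store: &'a [(String, Vec<f32>)],
    top_k: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = store
        .iter()
        .map(|(uri, v)| (uri.as_str(), cosine(query, v)))
        .collect();
    scored.sort_by(|l, r| r.1.total_cmp(&l.1));
    scored.truncate(top_k);
    scored
}
```

A by-text query follows the same shape once the text has been embedded: the only difference is where the query vector comes from.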

Daemon observability (v1.3).
QueryIndexStatus and QueryFileStatus expose indexed file count, pending embedding coverage, last-update timestamp, and per-file age — enabling ckb doctor and CI health checks to query the daemon's real-time state.

Push notifications (v1.5).
IndexChanged events are broadcast to all active sessions immediately after every successful delta upsert, carrying the new indexed_files count and the list of affected_uris. Clients can invalidate their caches precisely without polling QueryIndexStatus.
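
On the client side, precise invalidation could look roughly like this. The struct below carries only the two fields named above (`indexed_files`, `affected_uris`); the `ClientCache` type and its methods are hypothetical.

```rust
use std::collections::HashMap;

/// Assumed shape of an IndexChanged event, limited to the fields named in the text.
pub struct IndexChanged {
    pub indexed_files: usize,
    pub affected_uris: Vec<String>,
}

/// Hypothetical client-side cache of per-file symbol lists.
#[derive(Default)]
pub struct ClientCache {
    pub indexed_files: usize,
    symbols_by_uri: HashMap<String, Vec<String>>,
}

impl ClientCache {
    pub fn store(&mut self, uri: &str, symbols: Vec<String>) {
        self.symbols_by_uri.insert(uri.to_string(), symbols);
    }

    pub fn get(&self, uri: &str) -> Option<&Vec<String>> {
        self.symbols_by_uri.get(uri)
    }

    /// Drops only the entries the event names and returns how many were evicted.
    pub fn apply(&mut self, event: &IndexChanged) -> usize {
        self.indexed_files = event.indexed_files;
        event
            .affected_uris
            .iter()
            .filter(|uri| self.symbols_by_uri.remove(uri.as_str()).is_some())
            .count()
    }
}
```

Entries for files the event does not mention stay warm, which is the point of pushing `affected_uris` rather than a bare "something changed" signal.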

Protocol lifecycle (v1.5).
Handshake / HandshakeResult lets clients detect daemon/client version drift at connect time — the daemon responds with its semver and a monotonic protocol_version integer. --managed mode spawns a parent-process watchdog so IDE integrations that launch the daemon as a subprocess get automatic cleanup when the editor exits.
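
A client-side drift check against the monotonic `protocol_version` might be sketched like this. Only `protocol_version` is taken from the description above; the `daemon_version` field name, the `Drift` enum, and `detect_drift` are assumptions for illustration.

```rust
use std::cmp::Ordering;

/// Assumed reply shape: the daemon's semver string plus the protocol integer.
pub struct HandshakeResult {
    pub daemon_version: String, // hypothetical field name for the semver
    pub protocol_version: u32,
}

#[derive(Debug, PartialEq)]
pub enum Drift {
    InSync,
    /// Daemon speaks an older protocol; payload is how many versions behind.
    DaemonBehind(u32),
    /// Daemon speaks a newer protocol; payload is how many versions ahead.
    DaemonAhead(u32),
}

/// Compares the client's compiled-in protocol version with the daemon's reply.
pub fn detect_drift(client_protocol: u32, reply: &HandshakeResult) -> Drift {
    match reply.protocol_version.cmp(&client_protocol) {
        Ordering::Equal => Drift::InSync,
        Ordering::Less => Drift::DaemonBehind(client_protocol - reply.protocol_version),
        Ordering::Greater => Drift::DaemonAhead(reply.protocol_version - client_protocol),
    }
}
```

An integer comparison keeps the check trivial; the semver string is then mainly useful for the error message shown to the user.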


Confidence tiers

| Tier | Score | Source | Latency | When |
|---|---|---|---|---|
| 1 | 1–50 | Tree-sitter (syntactic) | < 1 ms/file | Immediately on file open |
| 2 | 51–90 | Compiler / analyzer | 200–500 ms | After file save, blast radius only |
| 3 | 100 | Federated registry slice | Instant (cached) | On startup for external deps |

Tier 2 is implemented for Rust, TypeScript, Python, and Dart. If a language server binary is not in PATH, LIP degrades gracefully to Tier 1 for that language.
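
The score ranges in the table map onto tiers as in the sketch below. The type and function names are made up for the example, and the "replace only if the new score is higher" merge rule is an assumption about how a background upgrade might be applied, not documented behaviour.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    Syntactic, // Tier 1: tree-sitter
    Compiler,  // Tier 2: compiler / analyzer
    Registry,  // Tier 3: federated registry slice
}

/// Maps a confidence score to its tier using the ranges from the table.
/// Scores outside the listed ranges (0, 91–99, > 100) yield `None`.
pub fn tier_for(score: u8) -> Option<Tier> {
    match score {
        1..=50 => Some(Tier::Syntactic),
        51..=90 => Some(Tier::Compiler),
        100 => Some(Tier::Registry),
        _ => None,
    }
}

#[derive(Debug)]
pub struct Confidence {
    pub score: u8,
}

impl Confidence {
    /// Hypothetical merge rule: accept a valid candidate only if it raises the score.
    pub fn upgrade(&mut self, candidate: u8) -> bool {
        if tier_for(candidate).is_some() && candidate > self.score {
            self.score = candidate;
            true
        } else {
            false
        }
    }
}

/// Graceful degradation: with no language server in PATH, only Tier 1 is available.
pub fn best_available(language_server_in_path: bool) -> Tier {
    if language_server_in_path {
        Tier::Compiler
    } else {
        Tier::Syntactic
    }
}
```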


CLI

# Start the daemon (watches files, updates blast radius on save)
lip daemon --socket /tmp/lip.sock

# IDE-managed mode: daemon exits automatically when the parent process exits
lip daemon --socket /tmp/lip.sock --managed

# Index a directory (Tier 1, tree-sitter)
lip index ./src

# Query a running daemon
lip query definition    file:///src/main.rs 42 10
lip query hover         file:///src/main.rs 42 10
lip query references    "lip://local/src/auth.rs#AuthService.verifyToken"
lip query blast-radius  "lip://local/src/auth.rs#AuthService.verifyToken"
lip query symbols       verifyToken
lip query dead-symbols

# Batch: run N queries in one round-trip (reads JSON array from stdin or file)
lip query batch <<'EOF'
[
  {"type":"query_blast_radius","symbol_uri":"lip://local/src/auth.rs#AuthService"},
  {"type":"query_references",  "symbol_uri":"lip://local/src/auth.rs#AuthService","limit":50},
  {"type":"annotation_get",    "symbol_uri":"lip://local/src/auth.rs#AuthService","key":"lip:fragile"}
]
EOF

# Semantic search (requires LIP_EMBEDDING_URL)
export LIP_EMBEDDING_URL=http://localhost:11434/v1/embeddings
export LIP_EMBEDDING_MODEL=nomic-embed-text

lip query embedding-batch file:///src/auth.rs file:///src/payments.rs
lip query nearest          file:///src/auth.rs --top-k 5
lip query nearest-by-text  "authentication token validation" --top-k 10
lip query nearest-symbol   lip://local/src/auth.rs#AuthService.verifyToken --top-k 5

# Batch nearest-by-text (N queries, one HTTP round-trip)
lip query batch-nearest-by-text "token validation" "oauth flow" --top-k 5

# Batch annotation read (N URIs, one lock acquisition)
lip query batch-annotation-get --key team:owner \
  lip://local/src/auth.rs#AuthService \
  lip://local/src/payments.rs#PaymentProcessor

# Daemon health / ckb doctor integration
lip query index-status
lip query file-status file:///src/main.rs

# Protocol version negotiation
lip query handshake

# Import a SCIP index (upgrades all symbols to Tier 2 / score 90)
lip import --from-scip index.scip

# Start the LSP bridge (editors connect here — standard LSP, no plugin needed)
lip lsp --socket /tmp/lip.sock

# Start the MCP server (AI agents: Claude Code, Cursor, CKB, …)
lip mcp --socket /tmp/lip.sock

# Build and share dependency slices
lip slice --cargo                                        # index ./Cargo.toml deps
lip slice --npm                                          # index ./package.json deps
lip slice --pub                                          # index ./pubspec.yaml deps
lip slice --pip                                          # index pip-installed packages
lip slice --cargo --push --registry https://your-registry.internal

# Fetch / publish slices
lip fetch <sha256-hash> --registry https://your-registry.internal
lip push  slice.json    --registry https://your-registry.internal

# Force re-index specific files from disk (v1.6)
lip query reindex-files file:///src/auth.rs file:///src/generated/schema.rs

# Pairwise cosine similarity of two stored embeddings (v1.6)
lip query similarity file:///src/auth.rs file:///src/session.rs
lip query similarity lip://local/src/auth.rs#verifyToken lip://local/src/session.rs#validateSession

# Expand a query into related symbol names (v1.6, requires LIP_EMBEDDING_URL)
lip query query-expansion "token validation" --top-k 5

# Group files by embedding proximity (v1.6, requires LIP_EMBEDDING_URL)
lip query cluster --radius 0.85 \
  file:///src/auth.rs file:///src/session.rs \
  file:///src/payments.rs file:///src/invoices.rs

# Export raw embedding vectors for external pipelines (v1.6)
lip query export-embeddings file:///src/auth.rs file:///src/session.rs --output vectors.json

MCP tools (for AI agents)

| Tool | Description |
|---|---|
| lip_blast_radius | Risk level + direct/transitive callers for a symbol |
| lip_workspace_symbols | Semantic symbol search across the whole repo |
| lip_references | All call sites for a symbol URI |
| lip_definition | Go-to-definition at (file, line, col) |
| lip_hover | Type signature + docs at a position |
| lip_document_symbols | All symbols in a file |
| lip_dead_symbols | Symbols defined but never referenced |
| lip_annotation_get | Read a persistent symbol annotation |
| lip_annotation_set | Write a persistent symbol annotation |
| lip_annotation_workspace_list | All annotations matching a key prefix, workspace-wide |
| lip_similar_symbols | Trigram fuzzy-search across all symbol names and docs |
| lip_stale_files | Merkle sync probe: which files need re-indexing |
| lip_load_slice | Mount a pre-built dependency slice into the daemon graph |
| lip_batch_query | Multiple queries in one round-trip |
| lip_embedding_batch | Compute and cache file embeddings (requires LIP_EMBEDDING_URL) |
| lip_nearest | Top-K files most similar to a given file (cosine similarity) |
| lip_nearest_by_text | Top-K files most similar to a free-text query |
| lip_index_status | Daemon health: indexed count, embedding coverage, last update |
| lip_file_status | Per-file: indexed, has embedding, age |
| lip_reindex_files | Force re-index of specific file URIs from disk |
| lip_similarity | Pairwise cosine similarity of two stored embeddings |
| lip_query_expansion | Expand a query string into related symbol names via nearest-neighbour |
| lip_cluster | Group URIs by embedding proximity within a given radius |
| lip_export_embeddings | Return raw embedding vectors for external pipelines |
| lip_nearest_by_contrast | Contrastive search: files like A but unlike B (v1.7) |
| lip_outliers | Most semantically misplaced files in a set (v1.7) |
| lip_semantic_drift | Cosine distance scalar between two stored embeddings (v1.7) |
| lip_similarity_matrix | All pairwise cosine similarities for a list of URIs (v1.7) |
| lip_find_counterpart | Rank candidate URIs by similarity to a source file (v1.7) |
| lip_coverage | Embedding coverage by directory under a root path (v1.7) |
| lip_find_boundaries | Detect semantic boundaries within a file by chunk-embedding (v1.8) |
| lip_semantic_diff | Drift distance + direction between two content versions (v1.8) |
| lip_nearest_in_store | Nearest-neighbour search against a caller-provided embedding store (v1.8) |
| lip_novelty_score | Per-file novelty relative to the rest of the codebase (v1.8) |
| lip_extract_terminology | Domain vocabulary most central to a set of files (v1.8) |
| lip_prune_deleted | Remove index entries for files no longer on disk (v1.8) |
| lip_get_centroid | Server-side embedding centroid of a file set (v1.9) |
| lip_stale_embeddings | Files whose embedding is older than their current mtime (v1.9) |
| lip_explain_match | Why a result matched: top-scoring chunks of result_uri against a query (v2.0) |

Recommended agent workflow before modifying code:

  1. lip_workspace_symbols — find URIs for all symbols you plan to touch
  2. lip_batch_query — blast radius + references + lip:fragile + lip:nyx-agent-lock for each
  3. lip_annotation_set — set lip:nyx-agent-lock on claimed symbols
  4. Make changes
  5. lip_annotation_set — release locks, leave notes
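
The claim and release steps of this workflow can be modelled locally as below. This is an in-memory sketch of the coordination logic only: it does not call the MCP tools, and the all-or-nothing claim and the "release removes the lock and writes `agent:note`" behaviour are assumptions, since the exact lock semantics are not spelled out here.

```rust
use std::collections::HashMap;

const LOCK_KEY: &str = "lip:nyx-agent-lock";
const NOTE_KEY: &str = "agent:note";

/// Hypothetical annotation store: (symbol URI, key) -> value.
#[derive(Default)]
pub struct Annotations {
    values: HashMap<(String, String), String>,
}

impl Annotations {
    pub fn get(&self, uri: &str, key: &str) -> Option<&str> {
        self.values
            .get(&(uri.to_string(), key.to_string()))
            .map(String::as_str)
    }

    pub fn set(&mut self, uri: &str, key: &str, value: &str) {
        self.values
            .insert((uri.to_string(), key.to_string()), value.to_string());
    }

    /// Step 3: claim every symbol, or none if another agent holds any of them.
    pub fn claim(&mut self, agent: &str, uris: &[&str]) -> Result<(), String> {
        for uri in uris {
            match self.get(uri, LOCK_KEY) {
                Some(holder) if holder != agent => {
                    return Err(format!("{uri} is locked by {holder}"));
                }
                _ => {}
            }
        }
        for uri in uris {
            self.set(uri, LOCK_KEY, agent);
        }
        Ok(())
    }

    /// Step 5: release locks held by `agent` and leave a note on each symbol.
    pub fn release(&mut self, agent: &str, uris: &[&str], note: &str) {
        for uri in uris {
            if self.get(uri, LOCK_KEY) == Some(agent) {
                self.values.remove(&(uri.to_string(), LOCK_KEY.to_string()));
                self.set(uri, NOTE_KEY, note);
            }
        }
    }
}
```

Checking all locks before setting any avoids leaving a half-claimed set of symbols behind when a second agent is already working on one of them.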

See MCP Integration docs for full tool reference.


Supported languages

| Language | Tier 1 | Tier 2 |
|---|---|---|
| Rust | ✓ Functions, structs, enums, traits, impls, consts | ✓ rust-analyzer |
| TypeScript | ✓ Functions, classes, interfaces, type aliases | ✓ typescript-language-server |
| Python | ✓ Functions, classes, async functions | ✓ pyright-langserver (pylsp fallback) |
| Dart | ✓ Functions, classes, methods, constructors | ✓ dart language-server |

Symbol URIs

lip://scope/package@version/path#descriptor
       │      │        │      │     │
       │      │        │      │     └── symbol name
       │      │        │      └──────── file path within package
       │      │        └─────────────── semver
       │      └──────────────────────── package name
       └─────────────────────────────── scope: npm | cargo | pub | pip | local | scip
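
A small parser for this layout might look like the following sketch. One assumption is flagged loudly: the CLI examples use a shorter form for the local scope (`lip://local/src/auth.rs#AuthService.verifyToken`) with no `package@version` segment, so the parser treats that segment as optional whenever the first path component contains no `@`. The `SymbolUri` struct and `parse` function are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub struct SymbolUri {
    pub scope: String,
    /// `(name, version)`; `None` when the URI has no `package@version` segment.
    pub package: Option<(String, String)>,
    pub path: String,
    pub descriptor: String,
}

/// Parses `lip://scope/package@version/path#descriptor`.
/// Limitation of this sketch: package names that themselves contain `/`
/// (for example npm scoped packages) are not handled.
pub fn parse(uri: &str) -> Option<SymbolUri> {
    let rest = uri.strip_prefix("lip://")?;
    let (location, descriptor) = rest.split_once('#')?;
    let (scope, after_scope) = location.split_once('/')?;

    // Assumption: a first component containing '@' is `package@version`;
    // otherwise everything after the scope is the file path.
    let (package, path) = match after_scope.split_once('/') {
        Some((first, tail)) if first.contains('@') => {
            let (name, version) = first.rsplit_once('@')?;
            (Some((name.to_string(), version.to_string())), tail)
        }
        _ => (None, after_scope),
    };

    if scope.is_empty() || path.is_empty() || descriptor.is_empty() {
        return None;
    }
    Some(SymbolUri {
        scope: scope.to_string(),
        package,
        path: path.to_string(),
        descriptor: descriptor.to_string(),
    })
}
```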

Architecture

Editor (any LSP client)        AI agent / CKB / Cursor
  │ LSP (stdio)                  │ MCP (stdio, JSON-RPC 2.0)
  ▼                              ▼
lip lsp bridge              lip mcp server
  │                              │
  │       LIP protocol (Unix socket, length-prefixed JSON)
  └──────────────┬───────────────┘
                 ▼
          LipDaemon  ──── WAL journal (survives restart)
                 │
                 ├── FileWatcher    OS-native, per-file, re-indexes on change
                 ├── Tier2Manager   language servers (rust-analyzer, …) in background, upgrade confidence
                 ▼
          LipDatabase  ──── query_graph/db.rs
                 │  blast-radius · CPG · name_to_symbols · annotations
                 │  early-cutoff: API surface unchanged → zero downstream work
                 ▼
          Tier1Indexer  ──── tree-sitter (Rust · TypeScript · Python · Dart)

                 ▲
          lip import    SCIP artifact (nightly CI) → Tier 2 confidence
          lip slice     lockfile deps  → Tier 3 registry slices
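
The diagram mentions length-prefixed JSON over the Unix socket. A framing layer of that kind can be sketched as below; the 4-byte big-endian length prefix is an assumption chosen for the example (the authoritative framing is whatever the protocol specification defines), and `write_frame` / `read_frame` are hypothetical helper names.

```rust
use std::io::{self, Read, Write};

/// Writes one frame: a 4-byte big-endian length (assumed width and byte
/// order), followed by the payload bytes.
pub fn write_frame<W: Write>(w: &mut W, payload: &[u8]) -> io::Result<()> {
    let len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "frame too large"))?;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(payload)
}

/// Reads one frame written by `write_frame`. Fails with `UnexpectedEof`
/// if the stream ends before a full frame has arrived.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<Vec<u8>> {
    let mut len = [0u8; 4];
    r.read_exact(&mut len)?;
    let mut payload = vec![0u8; u32::from_be_bytes(len) as usize];
    r.read_exact(&mut payload)?;
    Ok(payload)
}
```

Both helpers are generic over `Read` / `Write`, so the same code works against a `std::os::unix::net::UnixStream` or an in-memory buffer in tests. A production reader would also cap the accepted length before allocating.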

Performance

Measured on the Rust reference implementation, optimised build, Apple Silicon. Fixtures are ~60–80 line source files.

Tier 1 indexer

| Language | Measured | Budget | Margin |
|---|---|---|---|
| Rust | 205 µs | < 10 ms | 49× under |
| TypeScript | 234 µs | < 10 ms | 42× under |
| Python | 279 µs | < 10 ms | 35× under |

Query graph

| Operation | Measured | Notes |
|---|---|---|
| upsert_file | 92–104 ns | O(1) HashMap insert + cache invalidation |
| file_symbols cache hit | 24 ns | Arc clone only |
| file_symbols cache miss | 26 µs | Full tree-sitter re-parse |
| blast_radius (50 files) | 5.6 µs | Warm cache |
| workspace_symbols (100 files) | 14.6 µs | Warm cache |
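
The gap between a cache hit ("Arc clone only") and a miss (full re-parse) follows from a memoisation pattern along these lines. This is a simplified sketch, not the `query_graph` implementation: `SymbolCache` is a hypothetical type, and `parse_symbols` is a trivial stand-in for the tree-sitter indexer.

```rust
use std::collections::HashMap;
use std::sync::Arc;

#[derive(Default)]
pub struct SymbolCache {
    sources: HashMap<String, String>,
    parsed: HashMap<String, Arc<Vec<String>>>,
    /// Number of real parses performed, for observing cache behaviour.
    pub parses: usize,
}

impl SymbolCache {
    /// O(1) insert plus invalidation of any cached result for this file.
    pub fn upsert_file(&mut self, uri: &str, text: &str) {
        self.sources.insert(uri.to_string(), text.to_string());
        self.parsed.remove(uri);
    }

    /// Hit: clone the `Arc` (a reference-count bump). Miss: parse, then cache.
    pub fn file_symbols(&mut self, uri: &str) -> Option<Arc<Vec<String>>> {
        if let Some(hit) = self.parsed.get(uri) {
            return Some(Arc::clone(hit));
        }
        let text = self.sources.get(uri)?;
        self.parses += 1;
        let symbols = Arc::new(parse_symbols(text));
        self.parsed.insert(uri.to_string(), Arc::clone(&symbols));
        Some(symbols)
    }
}

/// Stand-in for tree-sitter: collects identifiers that follow `fn ` at line start.
fn parse_symbols(text: &str) -> Vec<String> {
    text.lines()
        .filter_map(|line| line.trim_start().strip_prefix("fn "))
        .map(|rest| {
            rest.chars()
                .take_while(|c| c.is_alphanumeric() || *c == '_')
                .collect::<String>()
        })
        .collect()
}
```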

Wire framing

| Scenario | Time |
|---|---|
| Round-trip, 64 B | 6 µs |
| Round-trip, 64 KB | 43 µs |
| Burst 1 000 × 256 B | 1.47 ms |

Repository layout

bindings/
  rust/                 # Rust reference implementation
    src/
      schema/           # Owned types and wire schema
      query_graph/      # Incremental query database (Salsa-inspired)
      indexer/          # Tier 1 tree-sitter + Tier 2 language server clients
      daemon/           # Unix-socket daemon, WAL journal, file watcher
      bridge/           # LSP bridge (tower-lsp)
      registry/         # Dependency slice cache + registry client

tools/
  lip-cli/              # `lip` command-line tool
  lip-registry/         # Registry HTTP server + Docker image

docs/
  LIP_SPEC.mdx          # Full protocol specification
  user/
    getting-started.md
    cli-reference.md
    mcp-integration.md
    daemon.md
    registry.md
    comparisons.md      # LIP vs SCIP vs LSP — when to use each

Building

cargo build --workspace
cargo test  --workspace

Requires Rust 1.78+. No system protoc required.


Status

  • v2.0: ExplainMatch (chunk-level explanation: which lines in a result file drove the match); model provenance (FileStatus exposes the embedding model per file; IndexStatus warns when the index contains mixed-model vectors).
  • v1.9: filter glob + min_score on all NN calls, GetCentroid, QueryStaleEmbeddings.
  • v1.8: FindBoundaries, SemanticDiff, QueryNearestInStore (cross-repo federation), QueryNoveltyScore, ExtractTerminology, PruneDeleted.
  • v1.7: 6 semantic retrieval primitives.
  • v1.6: ReindexFiles, Similarity, QueryExpansion, Cluster, ExportEmbeddings.

Wire format is JSON.


License

MIT — © Lisa Welsch
