In [1]:
from pathlib import Path

from leann import LeannBuilder, LeannChat, LeannSearcher

# Absolute path of the index that the steps below build and query.
INDEX_PATH = str(Path("./").resolve() / "demo.leann")

In [7]:
# STEP 1: Build the index with embeddinggemma:latest (via Ollama)

builder = LeannBuilder(
    backend_name="hnsw",
    embedding_model="embeddinggemma:latest",
    embedding_mode="ollama",
    graph_degree=32,
    build_complexity=64,
)
builder.add_text("Leann saves 97 percent storage compared to traditional vector databases.")
builder.add_text("Gemini called they need their nano-banana model back")
builder.build_index(INDEX_PATH)
print(" Index built with embeddinggemma:latest(Ollama)")

# STEP 2: Search. The result is always a list (possibly empty), so guard before indexing.
searcher = LeannSearcher(
    INDEX_PATH,
    embedding_model="embeddinggemma:latest",
    embedding_mode="ollama",
    search_complexity=32,
)
results = searcher.search("Fantastical AI-generated creatures", top_k=1)

print("Search result:", results[0].text if results else "No match")

# STEP 3: Chat. Answer a question with gpt-oss:20b over the retrieved passage.
chat = LeannChat(
    INDEX_PATH,
    llm_config={"type": "ollama", "model": "gpt-oss:20b"},
    embedding_model="embeddinggemma:latest",
    embedding_mode="ollama",
    thinking_budget="medium",
)
response = chat.ask("How much storage does LEANN save?", top_k=1)
print("Answer:", response)
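
The "always expect a list" guard from step 2 can be sketched on its own, without a running index. In the sketch below, `Hit` is a hypothetical stand-in for the objects returned by `searcher.search()` (only the `.text` attribute is taken from the notebook; the `score` field is an assumption), and `first_text` is an illustrative helper name, not part of the leann API.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Hit:
    """Hypothetical stand-in for a search result; the notebook only uses .text."""
    text: str
    score: float = 0.0  # assumed field, for illustration only


def first_text(results: Optional[Sequence[Hit]], default: str = "No match") -> str:
    """Return the text of the top hit, or `default` if the list is empty or None."""
    return results[0].text if results else default


print(first_text([Hit("Gemini called they need their nano-banana model back")]))
print(first_text([]))
```

Treating an empty list and `None` the same way keeps the print statement from raising `IndexError` when a query matches nothing.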

Writing passages: 100%|██████████| 2/2 [00:00<00:00, 10979.85chunk/s]
Computing Ollama embeddings: 100%|██████████| 1/1 [00:00<00:00,  8.01it/s]
INFO:leann_backend_hnsw.hnsw_backend:INFO: Converting HNSW index to CSR-pruned format...


M: 64 for level: 0
Starting conversion: /Users/jetblue23/Desktop/leann/leann/demo.index -> /Users/jetblue23/Desktop/leann/leann/demo.csr.tmp
[0.00s] Reading Index HNSW header...
[0.00s]   Header read: d=768, ntotal=2
[0.00s] Reading HNSW struct vectors...
  Reading vector (dtype=<class 'numpy.float64'>, fmt='d')... Count=6, Bytes=48
[0.00s]   Read assign_probas (6)
  Reading vector (dtype=<class 'numpy.int32'>, fmt='i')... Count=7, Bytes=28
[0.08s]   Read cum_nneighbor_per_level (7)
  Reading vector (dtype=<class 'numpy.int32'>, fmt='i')... Count=2, Bytes=8
[0.13s]   Read levels (2)
[0.17s]   Probing for compact storage flag...
[0.17s]   Found compact flag: False
[0.17s]   Compact flag is False, reading original format...
[0.17s]   Probing for potential extra byte before non-compact offsets...
[0.17s]   Found and consumed an unexpected 0x00 byte.
  Reading vector (dtype=<class 'numpy.uint64'>, fmt='Q')... Count=3, Bytes=24
[0.17s]   Read offsets (3)
[0.21s]   Attempting to read neighbo

INFO:leann_backend_hnsw.hnsw_backend:✅ CSR conversion successful.
INFO:leann_backend_hnsw.hnsw_backend:INFO: Replaced original index with CSR-pruned version at '/Users/jetblue23/Desktop/leann/leann/demo.index'
INFO:leann.embedding_server_manager:Terminating server process (PID: 9145) for backend leann_backend_hnsw.hnsw_embedding_server...


    CSR Stats: |data|=2, |level_ptr|=4
[0.28s] Writing CSR HNSW graph data in FAISS-compatible order...
   Pruning embeddings: Writing NULL storage marker.
[0.32s] Conversion complete.
 Index built with embeddinggemma:latest(Ollama)
[read_HNSW - CSR NL v4] Reading metadata & CSR indices (manual offset)...
[read_HNSW NL v4] Read levels vector, size: 2
[read_HNSW NL v4] Reading Compact Storage format indices...
[read_HNSW NL v4] Read compact_level_ptr, size: 4
[read_HNSW NL v4] Read compact_node_offsets, size: 3
[read_HNSW NL v4] Read entry_point: 1, max_level: 0
[read_HNSW NL v4] Read storage fourcc: 0x6c6c756e
[read_HNSW NL v4 FIX] Detected FileIOReader. Neighbors size field offset: 242
[read_HNSW NL v4] Reading neighbors data into memory.
[read_HNSW NL v4] Read neighbors data, size: 2
[read_HNSW NL v4] Finished reading metadata and CSR indices.
INFO: Skipping external storage loading, since is_recompute is true.


INFO:leann.embedding_server_manager:Server process 9145 terminated gracefully.
INFO:leann.embedding_server_manager:Server process 9145 cleanup completed
INFO:leann.api:🔍 LeannSearcher.search() called:
INFO:leann.api:  Query: 'Fantastical AI-generated creatures'
INFO:leann.api:  Top_k: 1
INFO:leann.api:  Metadata filters: None
INFO:leann.api:  Additional kwargs: {}
INFO:leann.embedding_server_manager:Starting embedding server on port 5557...
INFO:leann.embedding_server_manager:Command: /Users/jetblue23/Desktop/leann/leann/.venv/bin/python -m leann_backend_hnsw.hnsw_embedding_server --zmq-port 5557 --model-name embeddinggemma:latest --passages-file /Users/jetblue23/Desktop/leann/leann/demo.leann.meta.json --embedding-mode ollama --distance-metric mips
INFO:leann.embedding_server_manager:Starting server process with command: /Users/jetblue23/Desktop/leann/leann/.venv/bin/python -m leann_backend_hnsw.hnsw_embedding_server --zmq-port 5557 --model-name embeddinggemma:latest --passages-file /

ZmqDistanceComputer initialized: d=768, metric=0


INFO:leann_backend_hnsw.hnsw_backend:  Search time in HNSWSearcher.search() backend: 0.20774292945861816 seconds
INFO:leann.api:  Search time in search() LEANN searcher: 0.20852208137512207 seconds
INFO:leann.api:  Backend returned: labels=1 results
INFO:leann.api:  Processing 1 passage IDs:
INFO:leann.api:   ✓ [ 1] ID: '1' Score: 0.7438 Text: Gemini called they need their nano-banana model back
INFO:leann.api:  ✓ Final enriched results: 1 passages
INFO:leann.chat:Attempting to create LLM of type='ollama' with model='gpt-oss:20b'
INFO:leann.chat:Initializing OllamaChat with model='gpt-oss:20b' and host='http://localhost:11434'
INFO:leann.api:🔍 LeannSearcher.search() called:
INFO:leann.api:  Query: 'How much storage does LEANN save?'
INFO:leann.api:  Top_k: 1
INFO:leann.api:  Metadata filters: None
INFO:leann.api:  Additional kwargs: {}
INFO:leann.embedding_server_manager:Starting embedding server on port 5558...
INFO:leann.embedding

Search result: Gemini called they need their nano-banana model back
[read_HNSW - CSR NL v4] Reading metadata & CSR indices (manual offset)...
[read_HNSW NL v4] Read levels vector, size: 2
[read_HNSW NL v4] Reading Compact Storage format indices...
[read_HNSW NL v4] Read compact_level_ptr, size: 4
[read_HNSW NL v4] Read compact_node_offsets, size: 3
[read_HNSW NL v4] Read entry_point: 1, max_level: 0
[read_HNSW NL v4] Read storage fourcc: 0x6c6c756e
[read_HNSW NL v4 FIX] Detected FileIOReader. Neighbors size field offset: 242
[read_HNSW NL v4] Reading neighbors data into memory.
[read_HNSW NL v4] Read neighbors data, size: 2
[read_HNSW NL v4] Finished reading metadata and CSR indices.
INFO: Skipping external storage loading, since is_recompute is true.


INFO:leann.embedding_server_manager:Embedding server is ready!
INFO:leann.api:  Launching server time: 0.5162389278411865 seconds
INFO:leann.embedding_server_manager:Reusing in-process server
INFO:leann.api:  Generated embedding shape: (1, 768)
INFO:leann.api:  Embedding time: 0.17192411422729492 seconds


ZmqDistanceComputer initialized: d=768, metric=0


INFO:leann_backend_hnsw.hnsw_backend:  Search time in HNSWSearcher.search() backend: 0.21129512786865234 seconds
INFO:leann.api:  Search time in search() LEANN searcher: 0.2123880386352539 seconds
INFO:leann.api:  Backend returned: labels=1 results
INFO:leann.api:  Processing 1 passage IDs:
INFO:leann.api:   ✓ [ 1] ID: '0' Score: 0.7741 Text: Leann saves 97 percent storage compared to traditional vector databases.
INFO:leann.api:  ✓ Final enriched results: 1 passages
INFO:leann.api:  Search time: 0.9064779281616211 seconds
INFO:leann.chat:Sending request to Ollama and waiting for response...
INFO:leann.api:  Ask time: 39.90999507904053 seconds


Answer: LeANN can cut storage usage by **97 %** compared with a conventional vector database.
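
The log line `Read storage fourcc: 0x6c6c756e` lines up with the earlier `Pruning embeddings: Writing NULL storage marker.` message. A small check, assuming the value is a 32-bit integer laid out in little-endian byte order (an assumption; the log does not state the byte order), shows that its four bytes spell `null`. The helper name `fourcc_to_str` is illustrative only.

```python
import struct


def fourcc_to_str(value: int) -> str:
    """Decode a 32-bit fourcc code to ASCII, assuming little-endian byte order."""
    return struct.pack("<I", value).decode("ascii")


print(fourcc_to_str(0x6C6C756E))  # null
```

This is consistent with the index storing no embeddings and recomputing them at query time, as the `is_recompute is true` log line suggests.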
