In [8]:
from leann.api import LeannBuilder, LeannSearcher, LeannChat

# 1. Build index (no embeddings stored!)
builder = LeannBuilder(backend_name="hnsw")
builder.add_text("C# is a powerful programming language")
builder.add_text("Python is a powerful programming language and it is very popular")
builder.add_text("Machine learning transforms industries")
builder.add_text("Neural networks process complex data")
builder.add_text("Leann is a great storage saving engine for RAG on your macbook")
builder.build_index("knowledge.leann")

# 2. Search with real-time embeddings
searcher = LeannSearcher("knowledge.leann")
results = searcher.search("programming languages", top_k=2, recompute_beighbor_embeddings=True)
print("LEANN Search results: ", results)

# 3. Chat with LEANN
chat = LeannChat(index_path="knowledge.leann", llm_config={"type": "ollama", "model": "llama3.2:1b"})
response = chat.ask(
    "Compare the two retrieved programming languages and say which one is more popular today. Respond in a single well-formed sentence.",
    top_k=2,
    recompute_beighbor_embeddings=True,
)
print("LEANN Chat response: ", response)

INFO: Computing embeddings for 1 texts using SentenceTransformer, model: 'facebook/contriever-msmarco'
INFO: Using cached model: facebook/contriever-msmarco
INFO: Starting embedding computation...
INFO: Generated 1 embeddings, dimension: 768


Writing passages: 100%|██████████| 5/5 [00:00<00:00, 14655.15chunk/s]


INFO: Computing embeddings for 5 texts using SentenceTransformer, model: 'facebook/contriever-msmarco'
INFO: Using cached model: facebook/contriever-msmarco
INFO: Starting embedding computation...


Batches: 100%|██████████| 1/1 [00:00<00:00, 41.25it/s]
INFO:leann_backend_hnsw.hnsw_backend:INFO: Converting HNSW index to CSR-pruned format...


INFO: Generated 5 embeddings, dimension: 768
M: 64 for level: 0
Starting conversion: knowledge.index -> knowledge.csr.tmp
[0.00s] Reading Index HNSW header...
[0.00s]   Header read: d=768, ntotal=5
[0.00s] Reading HNSW struct vectors...
  Reading vector (dtype=<class 'numpy.float64'>, fmt='d')... Count=6, Bytes=48
[0.00s]   Read assign_probas (6)
  Reading vector (dtype=<class 'numpy.int32'>, fmt='i')... Count=7, Bytes=28
[0.18s]   Read cum_nneighbor_per_level (7)
  Reading vector (dtype=<class 'numpy.int32'>, fmt='i')... Count=5, Bytes=20
[0.30s]   Read levels (5)
[0.40s]   Probing for compact storage flag...
[0.40s]   Found compact flag: False
[0.40s]   Compact flag is False, reading original format...
[0.40s]   Probing for potential extra byte before non-compact offsets...
[0.40s]   Found and consumed an unexpected 0x00 byte.
  Reading vector (dtype=<class 'numpy.uint64'>, fmt='Q')... Count=6, Bytes=48
[0.40s]   Read offsets (6)
[0.50s]   Attempting to read neighbors vector...
  Rea

INFO:leann_backend_hnsw.hnsw_backend:✅ CSR conversion successful.
INFO:leann_backend_hnsw.hnsw_backend:INFO: Replaced original index with CSR-pruned version at 'knowledge.index'
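
The log above reports a conversion of the HNSW graph to a CSR (compressed sparse row) layout. As a rough illustration of what CSR means for a neighbor graph, the sketch below flattens per-node neighbor lists into an offsets array and a single neighbors array. It is a generic example, not LEANN's on-disk format: `to_csr`, `neighbors_of`, and the fully connected 5-node graph are hypothetical. Under that assumption the sizes come out as 6 offsets and 20 neighbor entries, the same sizes the log later reports for `compact_node_offsets` and the neighbors data, though the real layout may differ.

```python
from typing import List, Tuple


def to_csr(adjacency: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Flatten per-node neighbor lists into CSR form: (offsets, neighbors)."""
    offsets = [0]
    neighbors: List[int] = []
    for nbrs in adjacency:
        neighbors.extend(nbrs)
        # offsets[i + 1] marks where node i's neighbor slice ends.
        offsets.append(len(neighbors))
    return offsets, neighbors


def neighbors_of(node: int, offsets: List[int], neighbors: List[int]) -> List[int]:
    """Return the neighbor slice for one node."""
    return neighbors[offsets[node]:offsets[node + 1]]


# Hypothetical graph: 5 nodes, each linked to the other four.
adjacency = [[j for j in range(5) if j != i] for i in range(5)]
offsets, neighbors = to_csr(adjacency)

print(len(offsets), len(neighbors))          # 6 20
print(neighbors_of(4, offsets, neighbors))   # [0, 1, 2, 3]
```

Storing one flat array plus offsets avoids per-node list overhead, which is presumably part of why a compact format is attractive for a storage-saving index.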



[read_HNSW - CSR NL v4] Reading metadata & CSR indices (manual offset)...
[read_HNSW NL v4] Read levels vector, size: 5
[read_HNSW NL v4] Reading Compact Storage format indices...
[read_HNSW NL v4] Read compact_level_ptr, size: 10
[read_HNSW NL v4] Read compact_node_offsets, size: 6
[read_HNSW NL v4] Read entry_point: 4, max_level: 0
[read_HNSW NL v4] Read storage fourcc: 0x6c6c756e
[read_HNSW NL v4 FIX] Detected FileIOReader. Neighbors size field offset: 326
[read_HNSW NL v4] Reading neighbors data into memory.
[read_HNSW NL v4] Read neighbors data, size: 20
[read_HNSW NL v4] Finished reading metadata and CSR indices.
INFO: Skipping external storage loading, since is_recompute is true.
INFO: Terminating server process (PID: 26499) for backend leann_backend_hnsw.hnsw_embedding_server...
INFO: Server process 26499 terminated.
🔍 DEBUG LeannSearcher.search() called:
  Query: 'programming languages'
  Top_k: 2
  Additional kwargs: {'recompute_beighbor_embeddings': True}
INFO: Starting emb

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)
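
The `huggingface/tokenizers` warning above names its own remedy: set `TOKENIZERS_PARALLELISM` explicitly. A minimal sketch, assuming it runs at the top of the notebook before any cell imports `leann` or loads a tokenizer, so that the value is already in place when a child process is forked:

```python
import os

# Set this before importing anything that loads a Hugging Face tokenizer.
# "false" disables tokenizer-level parallelism, which is what the library
# falls back to anyway after the fork; setting it up front silences the warning.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

print(os.environ["TOKENIZERS_PARALLELISM"])  # false
```

Choosing `"false"` is an assumption about what suits this workload; `"true"` is also accepted by the library, as the warning text indicates.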


✅ Embedding server is ready!
[leann_backend_hnsw.hnsw_embedding_server LOG]: INFO: Registering backend 'diskann'
DEBUG: Found process on port 5557: /Users/yichuan/Desktop/code/LEANN/leann/.venv/bin/python -m leann_backend_hnsw.hnsw_embedding_server --zmq-port 5557 --model-name facebook/contriever-msmarco --passages-file knowledge.leann.meta.json
[leann_backend_hnsw.hnsw_embedding_server LOG]: INFO: Registering backend 'hnsw'
[leann_backend_hnsw.hnsw_embedding_server LOG]: Starting HNSW server on port 5557 with model facebook/contriever-msmarco
DEBUG: model_matches: True, passages_matches: True, overall: True
✅ Existing server process (PID 27144) is compatible
[leann_backend_hnsw.hnsw_embedding_server LOG]: Using embedding mode: sentence-transformers
[leann_backend_hnsw.hnsw_embedding_server LOG]: Successfully imported unified embedding computation module
[leann_backend_hnsw.hnsw_embedding_server LOG]: Loaded PassageManager with 5 passages from metadata
[leann_backend_hnsw.hnsw_embeddin

INFO:leann.chat:Attempting to create LLM of type='ollama' with model='llama3.2:1b'
INFO:leann.chat:Initializing OllamaChat with model='llama3.2:1b' and host='http://localhost:11434'


    2. passage_id='1' -> SUCCESS: Python is a powerful programming language and it is very popular...
  Final enriched results: 2 passages
LEANN Search results:  [SearchResult(id='0', score=np.float32(1.444752), text='C# is a powerful programming language', metadata={}), SearchResult(id='1', score=np.float32(1.394647), text='Python is a powerful programming language and it is very popular', metadata={})]
[read_HNSW - CSR NL v4] Reading metadata & CSR indices (manual offset)...
[read_HNSW NL v4] Read levels vector, size: 5
[read_HNSW NL v4] Reading Compact Storage format indices...
[read_HNSW NL v4] Read compact_level_ptr, size: 10
[read_HNSW NL v4] Read compact_node_offsets, size: 6
[read_HNSW NL v4] Read entry_point: 4, max_level: 0
[read_HNSW NL v4] Read storage fourcc: 0x6c6c756e
[read_HNSW NL v4 FIX] Detected FileIOReader. Neighbors size field offset: 326
[read_HNSW NL v4] Reading neighbors data into memory.
[read_HNSW NL v4] Read neighbors data, size: 20
[read_HNSW NL v4] Finished

INFO:leann.chat:Sending request to Ollama: {'model': 'llama3.2:1b', 'prompt': 'Here is some retrieved context that might help answer your question:\n\nPython is a powerful programming language and it is very popular\n\nC# is a powerful programming language\n\nQuestion: Compare the two retrieved programming languages and say which one is more popular today. Respond in a single well-formed sentence.\n\nPlease provide the best answer you can based on this context and your knowledge.', 'stream': False, 'options': {}}
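
The request logged above shows how the retrieved passages and the question end up in a single prompt string. The sketch below reconstructs that template from the logged text only; `build_prompt` is a hypothetical helper, not a function from LEANN's source, and the payload keys simply mirror the ones visible in the log.

```python
from typing import Any, Dict, List


def build_prompt(passages: List[str], question: str) -> str:
    """Join passages with blank lines and wrap them in the logged template."""
    context = "\n\n".join(passages)
    return (
        "Here is some retrieved context that might help answer your question:\n\n"
        f"{context}\n\n"
        f"Question: {question}\n\n"
        "Please provide the best answer you can based on this context and your knowledge."
    )


# Passages in the order they appear in the logged prompt.
passages = [
    "Python is a powerful programming language and it is very popular",
    "C# is a powerful programming language",
]
question = (
    "Compare the two retrieved programming languages and say which one is "
    "more popular today. Respond in a single well-formed sentence."
)

# Same keys as the logged request; nothing is sent over the network here.
payload: Dict[str, Any] = {
    "model": "llama3.2:1b",
    "prompt": build_prompt(passages, question),
    "stream": False,
    "options": {},
}
print(payload["prompt"])
```

Note that the final instruction line invites the model to use its own knowledge as well as the context, which matters here because the retrieved passages alone say little about relative popularity.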


    1. passage_id='1' -> SUCCESS: Python is a powerful programming language and it is very popular...
    2. passage_id='0' -> SUCCESS: C# is a powerful programming language...
  Final enriched results: 2 passages
LEANN Chat response:  Python has gained immense popularity significantly more so than C#, becoming one of the most widely used programming languages globally today.
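
The raw `SearchResult` list printed earlier is hard to scan. A small sketch of a more readable printout follows. The `SearchResult` dataclass here is a hypothetical stand-in that only mirrors the fields visible in the printed output (`id`, `score`, `text`, `metadata`), and `format_results` is an illustrative helper; with the real library one would pass the list returned by `searcher.search(...)` instead.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class SearchResult:
    """Stand-in with the fields shown in the printed search output."""
    id: str
    score: float
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def format_results(results: List[SearchResult]) -> List[str]:
    """Render one ranked line per result."""
    return [
        f"{rank}. [{r.id}] score={r.score:.4f} {r.text}"
        for rank, r in enumerate(results, start=1)
    ]


# Values copied from the search output above.
results = [
    SearchResult(id="0", score=1.444752, text="C# is a powerful programming language"),
    SearchResult(
        id="1",
        score=1.394647,
        text="Python is a powerful programming language and it is very popular",
    ),
]

for line in format_results(results):
    print(line)
```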
