feat: Add JSON output support to query_rag.py script #35
Conversation
Walkthrough: Adds a `--json` flag to `scripts/query_rag.py` that emits structured JSON output for both the llama-index and llama-stack query paths, while keeping the existing plain-text output.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    autonumber
    participant U as User (CLI)
    participant Q as query_rag.py
    participant I as llama_index
    participant S as llama_stack
    U->>Q: run CLI (--source [index|stack], --json?, --node?, --top_k, --threshold)
    alt source == llama_index
        Q->>I: fetch node(s) or specific node
        I-->>Q: nodes[], scores, metadata
        alt --json
            Q-->>U: JSON { query, type?, node_id?, top_k?, threshold?, nodes: [...] }
        else non-JSON
            Q-->>U: textual node blocks or error text
        end
    else source == llama_stack
        Q->>S: query with cfg (vector, chunk_size=380, overlap=0, max_chunks=top_k)
        S-->>Q: chunks[], scores
        alt --json
            Q-->>U: JSON { query, top_k, threshold, nodes: [...] }
        else non-JSON
            Q-->>U: textual chunk blocks or error text
        end
    end
    opt Threshold or empty-results
        note right of Q: emits JSON error/info payload or logs/writes text
    end
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
Actionable comments posted: 3
🧹 Nitpick comments (8)
scripts/query_rag.py (8)
55-65: Include query parameters in "no nodes" error payload for parity

Consider including `top_k` and `threshold` in the error payload to keep the JSON error shape aligned with success payloads and aid debugging.

```diff
- result = {
-     "query": args.query,
-     "error": f"No nodes retrieved for query: {args.query}",
-     "nodes": []
- }
+ result = {
+     "query": args.query,
+     "top_k": args.top_k,
+     "threshold": args.threshold,
+     "error": f"No nodes retrieved for query: {args.query}",
+     "nodes": []
+ }
```
101-106: Optional: Improve non-JSON text output readability

Printing the raw `NodeWithScore` object may not be as readable as printing ID/score/text explicitly. Consider formatting similarly to the JSON fields, as sketched below.

Example tweak (outside current range for brevity): print node_id, score, and the first N chars of text for each item.
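A minimal sketch of such a formatter (the helper name `print_node_block` and the `max_chars` default are illustrative, not existing code):

```python
def print_node_block(node_id, score, text, max_chars=200):
    """Print one retrieved node as a readable text block."""
    preview = text if len(text) <= max_chars else text[:max_chars] + "..."
    print("=" * 80)
    print(f"Node ID: {node_id}\nScore: {score}\nText:\n{preview}")
```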
154-163: Include query parameters in "no chunks" error for consistency

Similar to the llama-index path, including `top_k` and `threshold` helps debugging and keeps the structure consistent.

```diff
- result = {
-     "query": args.query,
-     "error": f"No chunks retrieved for query: {args.query}",
-     "chunks": []
- }
+ result = {
+     "query": args.query,
+     "top_k": args.top_k,
+     "threshold": args.threshold,
+     "error": f"No chunks retrieved for query: {args.query}",
+     "chunks": []
+ }
```
185-191: Align JSON schema across vector stores; also fix text-mode score display

The PR description promises a consistent JSON structure. The llama-index path emits `nodes: [{id, score, text, metadata}]`, while the llama-stack path emits `chunks` with `document_id` and no `metadata`. Consider unifying to `nodes` with `id` and `metadata` (even if empty), and fix text mode to print a numeric score.

If consistency is required, apply this diff:

```diff
- result = {
-     "query": args.query,
-     "top_k": args.top_k,
-     "threshold": args.threshold,
-     "chunks": []
- }
+ result = {
+     "query": args.query,
+     "top_k": args.top_k,
+     "threshold": args.threshold,
+     "nodes": []
+ }
@@
- for _id, chunk, score in zip(md["document_ids"], md["chunks"], md["scores"]):
-     chunk_data = {
-         "document_id": _id,
-         "score": score.score if hasattr(score, 'score') else score,
-         "text": chunk
-     }
-     result["chunks"].append(chunk_data)
+ for _id, chunk, score in zip(md["document_ids"], md["chunks"], md["scores"]):
+     score_val = score.score if hasattr(score, 'score') else score
+     node_data = {
+         "id": _id,
+         "score": score_val,
+         "text": chunk,
+         "metadata": {}
+     }
+     result["nodes"].append(node_data)
@@
- if args.json:
-     print(json.dumps(result, indent=2))
- else:
-     for _id, chunk, score in zip(md["document_ids"], md["chunks"], md["scores"]):
-         print("=" * 80)
-         print(f"Node ID: {_id}\nScore: {score}\nText:\n{chunk}")
+ if args.json:
+     print(json.dumps(result, indent=2))
+ else:
+     for _id, chunk, score in zip(md["document_ids"], md["chunks"], md["scores"]):
+         score_val = score.score if hasattr(score, 'score') else score
+         print("=" * 80)
+         print(f"Document ID: {_id}\nScore: {score_val}\nText:\n{chunk}")
```

If backward compatibility with `chunks` consumers is a concern, you could keep both keys temporarily (duplicate arrays) and deprecate `chunks` later.

Also applies to: 193-199, 200-205
258-263: Return richer JSON for unknown DB auto-detection

Include the provided `db_path` and note that auto-detection was attempted; that makes failures easier to debug from logs.

```diff
- if args.json:
-     print(json.dumps({"error": error_msg}, indent=2))
+ if args.json:
+     print(json.dumps(
+         {"error": error_msg, "db_path": args.db_path, "vector_store_type": "auto"},
+         indent=2
+     ))
```
142-150: Minor: consider with-context managers for file I/O

`open(...); yaml.safe_load(...)` and `yaml.safe_dump(..., open(...))` leave file handles open until GC. Using `with open(...)` blocks is cleaner and safer, as sketched below.
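For illustration, a hedged sketch of the with-block version (the helper names are hypothetical):

```python
import yaml

def load_yaml(path):
    # The with-block closes the handle deterministically instead of at GC time.
    with open(path) as f:
        return yaml.safe_load(f)

def dump_yaml(cfg, path):
    with open(path, "w") as f:
        yaml.safe_dump(cfg, f)
```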
200-205: Nit: label should say "Document ID" (llama-stack), and print numeric score

Text mode currently uses "Node ID" and prints the score object, not the numeric value, in some llama-stack configurations.

This is included in the unification diff above; if you don't unify schemas, at least change:

- "Node ID" -> "Document ID"
- `score.score if hasattr(score, "score") else score`
239-241: Optional: centralize JSON printing to reduce duplication

A tiny helper like `def _emit(result): print(json.dumps(result, indent=2)) if args.json else ...` would DRY up many branches and make future schema changes safer.

I can factor this into `print_json_or_text(args, payload, text_fn)` to consolidate all emit paths, as sketched below. Want me to push a patch?
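A minimal sketch of that helper, assuming `text_fn` is a zero-argument callback carrying the legacy plain-text printing (the signature is the reviewer's proposal, not existing code):

```python
import json

def print_json_or_text(args, payload, text_fn):
    """Emit a JSON document when --json is set; otherwise run the legacy text printer."""
    if args.json:
        print(json.dumps(payload, indent=2))
    else:
        text_fn()
```

Callers would pass a closure, e.g. `print_json_or_text(args, result, lambda: print(node))`.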
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
scripts/query_rag.py (5 hunks)
🔇 Additional comments (6)
scripts/query_rag.py (6)
5-5: Importing json is appropriate and scoped correctly

No concerns. The import is used consistently for JSON output.
67-83: Threshold error path looks fine

The message is clear and the JSON payload includes threshold and top_score. No functional issues here.
143-147: Query config changes LGTM

`chunk_size_in_tokens`, `mode`, and `chunk_overlap_in_tokens` look reasonable defaults and keep behavior explicit (see the sketch below).
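For reference, a sketch of the query config these names imply, using the values shown in the sequence diagram above (the exact llama-stack parameter shape is an assumption):

```python
def build_query_config(top_k: int) -> dict:
    # Illustrative only: field names from the review, values from the sequence diagram.
    return {
        "mode": "vector",
        "chunk_size_in_tokens": 380,
        "chunk_overlap_in_tokens": 0,
        "max_chunks": top_k,
    }
```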
238-242: CLI: --json flag addition looks good

Flag semantics are clear and match output behavior.
246-247: Suppressing command echo in JSON mode is correct

Prevents polluting the JSON stream. Good call.
85-91: Result scaffolding for llama-index path looks good

Fields `query`, `top_k`, `threshold`, and `nodes` provide a clean envelope for results.
lpiwowar
left a comment
Good first iteration! 👍 Two suggestions that can make the code better, IMO :)
Actionable comments posted: 1
♻️ Duplicate comments (3)
scripts/query_rag.py (3)
37-51: Guard get_node() returning None and make single-node payload attribute-safe

If the node is missing, this will raise AttributeError, and even when present, id/text access can differ across llama-index versions.

Apply this diff:

```diff
- node = storage_context.docstore.get_node(args.node)
- result = {
-     "query": args.query,
-     "type": "single_node",
-     "node_id": args.node,
-     "node": {
-         "id": node.node_id,
-         "text": node.text,
-         "metadata": node.metadata if hasattr(node, 'metadata') else {}
-     }
- }
- if args.json:
-     print(json.dumps(result, indent=2))
- else:
-     print(node)
+ node = storage_context.docstore.get_node(args.node)
+ if node is None:
+     error_msg = f"Node '{args.node}' not found."
+     payload = {
+         "query": args.query,
+         "type": "single_node",
+         "node_id": args.node,
+         "node": None,
+         "error": error_msg,
+     }
+     if args.json:
+         print(json.dumps(payload, indent=2))
+     else:
+         print(error_msg)
+     exit(1)
+ result = {
+     "query": args.query,
+     "type": "single_node",
+     "node_id": args.node,
+     "node": {
+         "id": getattr(node, "node_id", getattr(node, "id_", None)),
+         "text": getattr(node, "text", None) if hasattr(node, "text") else (node.get_content() if hasattr(node, "get_content") else None),
+         "metadata": getattr(node, "metadata", {}) or {},
+     },
+ }
+ if args.json:
+     print(json.dumps(result, indent=2))
+ else:
+     print(node)
```
91-98: NodeWithScore vs BaseNode: access underlying node to avoid AttributeError

retriever.retrieve() yields NodeWithScore; id/text/metadata live on the underlying BaseNode.

```diff
- for node in nodes:
-     node_data = {
-         "id": node.node_id,
-         "score": node.score,
-         "text": node.text,
-         "metadata": node.metadata if hasattr(node, 'metadata') else {}
-     }
-     result["nodes"].append(node_data)
+ for nws in nodes:
+     base = getattr(nws, "node", nws)
+     node_data = {
+         "id": getattr(base, "node_id", getattr(base, "id_", None)),
+         "score": getattr(nws, "score", None),
+         "text": getattr(base, "text", None) if hasattr(base, "text") else (base.get_content() if hasattr(base, "get_content") else None),
+         "metadata": getattr(base, "metadata", {}) or {},
+     }
+     result["nodes"].append(node_data)
```
179-194: Threshold check can crash if md['scores'][0] is a float; normalize top_score first

You later handle both shapes when formatting; apply the same robustness here.

```diff
- threshold = args.threshold
- if threshold > 0.0 and md.get("scores") and md["scores"][0].score < threshold:
-     logging.error(
-         f"Score {md['scores'][0].score} of the top retrieved node for query '{args.query}' "
-         f"didn't cross the minimal threshold {threshold}."
-     )
-     if args.json:
-         result = {
-             "query": args.query,
-             "top_k": args.top_k,
-             "threshold": args.threshold,
-             "top_score": md["scores"][0].score,
-             "nodes": []
-         }
-         print(json.dumps(result, indent=2))
-     exit(1)
+ threshold = args.threshold
+ if threshold > 0.0 and md.get("scores"):
+     _top = md["scores"][0]
+     top_score = _top.score if hasattr(_top, "score") else _top
+     if top_score < threshold:
+         logging.error(
+             f"Score {top_score} of the top retrieved node for query '{args.query}' "
+             f"didn't cross the minimal threshold {threshold}."
+         )
+         if args.json:
+             result = {
+                 "query": args.query,
+                 "top_k": args.top_k,
+                 "threshold": args.threshold,
+                 "top_score": top_score,
+                 "nodes": [],
+             }
+             print(json.dumps(result, indent=2))
+         exit(1)
```
🧹 Nitpick comments (5)
scripts/query_rag.py (5)
5-6: Imports look good; consider initializing logging to avoid polluting JSON output

Since you're relying on logging for errors, initialize logging in main so logs go to stderr and don't interleave with JSON on stdout. Keep it terse in JSON mode.

Add this near argument parsing (no diff, outside changed lines):

```python
# After args = parser.parse_args()
logging.basicConfig(
    level=logging.ERROR if args.json else logging.INFO,
    format="%(levelname)s: %(message)s",
)
```
56-66: Return a structured JSON error for "no nodes retrieved"

The current JSON has no "error" field, which makes it harder to programmatically detect failures.

```diff
- if len(nodes) == 0:
-     logging.error(f"No nodes retrieved for query: {args.query}")
-     if args.json:
-         result = {
-             "query": args.query,
-             "top_k": args.top_k,
-             "threshold": args.threshold,
-             "nodes": []
-         }
-         print(json.dumps(result, indent=2))
+ if len(nodes) == 0:
+     error_msg = f"No nodes retrieved for query: {args.query}"
+     logging.error(error_msg)
+     if args.json:
+         result = {
+             "query": args.query,
+             "top_k": args.top_k,
+             "threshold": args.threshold,
+             "nodes": [],
+             "error": error_msg,
+         }
+         print(json.dumps(result, indent=2))
      exit(1)
```
167-176: Return a structured JSON error for “no chunks retrieved”Same as the llama-index path: include an "error" field so clients can detect failure.
- if len(md["chunks"]) == 0: - logging.error(f"No chunks retrieved for query: {args.query}") - if args.json: - result = { - "query": args.query, - "top_k": args.top_k, - "threshold": args.threshold, - "nodes": [] - } - print(json.dumps(result, indent=2)) + if len(md["chunks"]) == 0: + error_msg = f"No chunks retrieved for query: {args.query}" + logging.error(error_msg) + if args.json: + result = { + "query": args.query, + "top_k": args.top_k, + "threshold": args.threshold, + "nodes": [], + "error": error_msg, + } + print(json.dumps(result, indent=2)) exit(1)
203-211: Optional: include metadata in llama-stack results when available

If llama-stack returns metadata (e.g., md["metadatas"]), consider including it to match the schema parity with the llama-index path and the PR's examples.

If md contains a "metadatas" list aligned with document_ids, adjust the loop accordingly (illustrative snippet outside the changed range):

```python
metas = md.get("metadatas")
for idx, (_id, chunk, score) in enumerate(zip(md["document_ids"], md["chunks"], md["scores"])):
    metadata = (metas[idx] if metas and idx < len(metas) else {}) or {}
    node_data = {
        "id": _id,
        "score": score.score if hasattr(score, "score") else score,
        "text": chunk,
        "metadata": metadata,
    }
    result["nodes"].append(node_data)
```
276-281: Enrich unknown-DB JSON error with context

Include db_path to aid callers and keep error payloads informative.

```diff
- if args.json:
-     print(json.dumps({"error": error_msg}, indent=2))
+ if args.json:
+     print(json.dumps({"error": error_msg, "db_path": args.db_path}, indent=2))
  else:
      print(error_msg)
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
scripts/query_rag.py (5 hunks)
🔇 Additional comments (3)
scripts/query_rag.py (3)
212-217: LGTM: preserves plain-text output for non-JSON mode

Text output remains backward-compatible.
256-260: LGTM: CLI --json flag addition

Clear help text and consistent behavior toggle.
264-265: LGTM: suppress banner when JSON is requested

Prevents non-JSON noise on stdout.
Force-pushed: bb2d979 to 23dde0e
Actionable comments posted: 0
♻️ Duplicate comments (3)
scripts/query_rag.py (3)
36-51: Guard against missing node and make attribute access robust for single-node path

Dereferencing `node.node_id`/`node.text` will crash when `get_node()` returns None; also attribute names vary across versions (`id_`, `get_content()`). Return a structured error in JSON mode and use resilient getters.

Apply this diff:

```diff
- node = storage_context.docstore.get_node(args.node)
- result = {
-     "query": args.query,
-     "type": "single_node",
-     "node_id": args.node,
-     "node": {
-         "id": node.node_id,
-         "text": node.text,
-         "metadata": node.metadata if hasattr(node, 'metadata') else {}
-     }
- }
- if args.json:
-     print(json.dumps(result, indent=2))
- else:
-     print(node)
+ node = storage_context.docstore.get_node(args.node)
+ if node is None:
+     error_msg = f"Node '{args.node}' not found."
+     result = {
+         "query": args.query,
+         "type": "single_node",
+         "node_id": args.node,
+         "node": None,
+         "error": error_msg,
+     }
+     if args.json:
+         print(json.dumps(result, indent=2))
+     else:
+         print(error_msg)
+     exit(1)
+ # Node exists; normalize attributes across versions
+ base = node
+ result = {
+     "query": args.query,
+     "type": "single_node",
+     "node_id": args.node,
+     "node": {
+         "id": getattr(base, "node_id", getattr(base, "id_", None)),
+         "text": getattr(base, "text", None) if hasattr(base, "text") else (base.get_content() if hasattr(base, "get_content") else None),
+         "metadata": getattr(base, "metadata", {}) or {},
+     },
+ }
+ if args.json:
+     print(json.dumps(result, indent=2))
+ else:
+     print(node)
```
84-100: NodeWithScore vs BaseNode: extract BaseNode before reading id/text/metadata

`retriever.retrieve()` returns NodeWithScore items; reading `node_id`, `text`, `metadata` directly will break. Normalize via `nws.node` and use safe fallbacks.

```diff
- for node in nodes:
-     node_data = {
-         "id": node.node_id,
-         "score": node.score,
-         "text": node.text,
-         "metadata": node.metadata if hasattr(node, 'metadata') else {}
-     }
-     result["nodes"].append(node_data)
+ for nws in nodes:
+     base = getattr(nws, "node", nws)
+     node_data = {
+         "id": getattr(base, "node_id", getattr(base, "id_", None)),
+         "score": getattr(nws, "score", None),
+         "text": getattr(base, "text", None) if hasattr(base, "text") else (base.get_content() if hasattr(base, "get_content") else None),
+         "metadata": getattr(base, "metadata", {}) or {},
+     }
+     result["nodes"].append(node_data)
```
178-194: Threshold check may crash: md['scores'][0] can be a float; normalize top_score and include "error"

Accessing `.score` blindly can raise AttributeError. Normalize first, then compare and emit structured output. Also avoid logging when in JSON mode.

```diff
- threshold = args.threshold
- if threshold > 0.0 and md.get("scores") and md["scores"][0].score < threshold:
-     logging.error(
-         f"Score {md['scores'][0].score} of the top retrieved node for query '{args.query}' "
-         f"didn't cross the minimal threshold {threshold}."
-     )
-     if args.json:
-         result = {
-             "query": args.query,
-             "top_k": args.top_k,
-             "threshold": args.threshold,
-             "top_score": md["scores"][0].score,
-             "nodes": []
-         }
-         print(json.dumps(result, indent=2))
-     exit(1)
+ threshold = args.threshold
+ if threshold > 0.0 and md.get("scores"):
+     _top = md["scores"][0]
+     top_score = _top.score if hasattr(_top, "score") else _top
+     if top_score < threshold:
+         error_msg = (
+             f"Score {top_score} of the top retrieved node for query '{args.query}' "
+             f"didn't cross the minimal threshold {threshold}."
+         )
+         if args.json:
+             result = {
+                 "query": args.query,
+                 "top_k": args.top_k,
+                 "threshold": args.threshold,
+                 "top_score": top_score,
+                 "nodes": [],
+                 "error": error_msg,
+             }
+             print(json.dumps(result, indent=2))
+         else:
+             logging.error(error_msg)
+         exit(1)
```
🧹 Nitpick comments (7)
scripts/query_rag.py (7)
56-66: In JSON mode, avoid logging noise and include an error field for machine consumers

The code currently logs the error unconditionally and returns JSON without an "error" message. For clean JSON output and actionable payloads, only log in the non-JSON path and add an "error" string.

```diff
- logging.error(f"No nodes retrieved for query: {args.query}")
- if args.json:
+ if not args.json:
+     logging.error(f"No nodes retrieved for query: {args.query}")
+ if args.json:
      result = {
          "query": args.query,
          "top_k": args.top_k,
          "threshold": args.threshold,
-         "nodes": []
+         "nodes": [],
+         "error": f"No nodes retrieved for query: {args.query}",
      }
      print(json.dumps(result, indent=2))
  exit(1)
```
68-83: Threshold failure: add non-JSON logging and include "error" in JSON payload

The non-JSON path exits silently; also the JSON result lacks an "error" message. Emit a human-readable log when not using --json and include "error" in JSON for consistency.

```diff
  if args.threshold > 0.0 and nodes[0].score < args.threshold:
      error_msg = (
          f"Score {nodes[0].score} of the top retrieved node for query '{args.query}' "
          f"didn't cross the minimal threshold {args.threshold}."
      )
      if args.json:
          result = {
              "query": args.query,
              "top_k": args.top_k,
              "threshold": args.threshold,
              "top_score": nodes[0].score,
-             "nodes": []
+             "nodes": [],
+             "error": error_msg,
          }
          print(json.dumps(result, indent=2))
+     else:
+         logging.error(error_msg)
      exit(1)
```
165-177: In JSON mode, avoid logging and add "error" key for no-chunks case

Mirror the faiss/llama-index handling: don't log when --json is set and include an error message in the JSON body.

```diff
- logging.error(f"No chunks retrieved for query: {args.query}")
- if args.json:
+ if not args.json:
+     logging.error(f"No chunks retrieved for query: {args.query}")
+ if args.json:
      result = {
          "query": args.query,
          "top_k": args.top_k,
          "threshold": args.threshold,
-         "nodes": []
+         "nodes": [],
+         "error": f"No chunks retrieved for query: {args.query}",
      }
      print(json.dumps(result, indent=2))
  exit(1)
```
212-218: Normalize score for human-readable non-JSON output

For textual output, print the normalized float value instead of a potential wrapper object.

```diff
- else:
-     for _id, chunk, score in zip(md["document_ids"], md["chunks"], md["scores"]):
-         print("=" * 80)
-         print(f"Node ID: {_id}\nScore: {score}\nText:\n{chunk}")
+ else:
+     for _id, chunk, score in zip(md["document_ids"], md["chunks"], md["scores"]):
+         _score = score.score if hasattr(score, "score") else score
+         print("=" * 80)
+         print(f"Node ID: {_id}\nScore: {_score}\nText:\n{chunk}")
```
264-266: Configure logging based on --json to keep stdout strictly JSON

To avoid mixing logs with JSON output, disable logging in JSON mode and set a default config otherwise.

```diff
- if not args.json:
-     print("Command line used: " + " ".join(sys.argv))
+ # Configure logging early: suppress logs in JSON mode to keep stdout machine-parsable
+ if args.json:
+     logging.disable(logging.CRITICAL)
+ else:
+     logging.basicConfig(level=logging.INFO, stream=sys.stderr)
+     print("Command line used: " + " ".join(sys.argv))
```
276-286: Unknown DB path: include error in JSON and avoid logging when --json

Return a machine-parseable error message and log only in non-JSON mode.

```diff
- logging.error(f"Cannot recognize the DB in {args.db_path}")
- if args.json:
+ if not args.json:
+     logging.error(f"Cannot recognize the DB in {args.db_path}")
+ if args.json:
      # In JSON mode, output minimal error info and exit
      result = {
          "query": args.query,
          "top_k": args.top_k,
          "threshold": args.threshold,
-         "nodes": []
+         "nodes": [],
+         "error": f"Cannot recognize the DB in {args.db_path}",
      }
      print(json.dumps(result, indent=2))
  exit(1)
```
267-267: Unused variable

`vector_store_type = args.vector_store_type` is assigned but not used. You can drop this line.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
scripts/query_rag.py (5 hunks)
🔇 Additional comments (5)
scripts/query_rag.py (5)
100-106: LGTM: JSON vs text output preserved

Good separation between JSON output and legacy textual prints. With the threshold/no-result logging addressed above, this path will be clean.
203-211: LGTM: Robust score extraction for llama-stack path

Nice defensive handling of score as either a float or an object with `.score`.
256-260: CLI UX: --json flag addition is clear and discoverable

Help text is concise and the flag name is intuitive.
195-202: Consistency note: JSON schema looks uniform across backends

The success payload shape (query/top_k/threshold/nodes) appears consistent between llama-index and llama-stack. Once the error payloads include an "error" field everywhere (as suggested), the contract will match the PR objective.

Would you like me to generate a JSON schema (and a quick jq validation script) to pin this contract in CI? A possible starting point is sketched below.
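For illustration, a minimal contract check could be as small as the following standalone script (the key sets are assumptions based on the payloads shown in this PR, not an existing artifact):

```python
import json
import sys

REQUIRED_TOP = {"query", "top_k", "threshold", "nodes"}
REQUIRED_NODE = {"id", "score", "text", "metadata"}

def check_contract(doc: dict) -> list:
    """Return human-readable contract violations; an empty list means the payload conforms."""
    problems = [f"missing top-level key: {key}" for key in REQUIRED_TOP - doc.keys()]
    for i, node in enumerate(doc.get("nodes", [])):
        problems += [f"nodes[{i}] missing key: {key}" for key in REQUIRED_NODE - node.keys()]
    return problems

if __name__ == "__main__":
    violations = check_contract(json.load(sys.stdin))
    for v in violations:
        print(v, file=sys.stderr)
    sys.exit(1 if violations else 0)
```

In CI this could be chained as, e.g., `python scripts/query_rag.py ... --json | python check_contract.py` (the script name is hypothetical).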
123-130: Metadata parity across backends

In the llama-stack path, result nodes have an empty "metadata". The PR objective mentions metadata fields like docs_url/title/url_reachable. If llama-stack provides these in res.metadata or via a follow-up call, consider adding them so both backends return comparable metadata.

Do you have a reliable source for chunk metadata in llama-stack responses we can surface here? If yes, I can propose a patch to plumb those through.

Also applies to: 147-163
lpiwowar
left a comment
Thanks for the updates! 🙏 Just a couple of small comments :)
- Add --json flag to enable structured JSON output format
- Maintain backward compatibility with existing plain text output
- Implement consistent JSON structure for all response types:
  * Successful queries return nodes array with metadata
  * Error cases return structured error messages
  * Include query parameters (top_k, threshold) in response
- Support JSON output for both Faiss and Llama-Stack vector stores
- Add proper error handling with JSON format
tisnik
left a comment
LGTM, thank you
lpiwowar
left a comment
LGTM! 👍 Thank you 🙏
There are some prints() from llama_index that pollute the stdout. Adding an option to save only the JSON to a file might help (see the sketch after the example below), but this can be done in a follow-up PR:
```
python scripts/query_rag.py --db-path ./vector_db/os_product_docs --product-index os-docs-2024.2 -m ./embeddings_model/ -q "What is Nova in OpenStack?" --json --threshold 0.2 2> result.error
LLM is explicitly disabled. Using MockLLM.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./vector_db/os_product_docs/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./vector_db/os_product_docs/index_store.json.
{
  "query": "What is Nova in OpenStack?",
  "top_k": 1,
  "threshold": 0.2,
  "nodes": [
    {
      "id": "00ab1db5-0c2b-438e-8001-7497428a1eec",
      "score": 0.8653656840324402,
      "text": "Technical Reference Deep Dives\n******************************\n\nThe nova project is large, and there are lots of complicated parts in\nit where it helps to have an overview to understand how the internals\nof a particular part work.",
      "metadata": {
        "docs_url": "https://docs.openstack.org/nova/2024.2_docs/reference/index.html",
        "title": "Technical Reference Deep Dives",
        "url_reachable": false
      }
    }
  ]
}
```
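As a possible follow-up, a hedged sketch of such an option (the `--output` flag and `emit_json` helper are hypothetical, not part of this PR):

```python
import argparse
import json

def emit_json(result, output_path=None):
    """Write the JSON payload to a file when requested; otherwise print to stdout."""
    text = json.dumps(result, indent=2)
    if output_path:
        with open(output_path, "w") as f:
            f.write(text + "\n")
    else:
        print(text)

# Wiring sketch: add alongside the script's existing CLI options.
parser = argparse.ArgumentParser()
parser.add_argument("-o", "--output", default=None,
                    help="Write the JSON result to this file instead of stdout")
```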
Related Tickets & Documents
https://issues.redhat.com/browse/OSPRH-18801
Testing
```
uv run python scripts/query_rag.py \
    -p vector_db/custom_docs/0.1 \
    -x custom_docs-0_1 \
    -m embeddings_model \
    -k 2 \
    -q "vector database" \
    --json
```
The updated usage/help output now includes:

```
[--vector-store-type {auto,faiss,llamastack-faiss,llamastack-sqlite-vec}]
[--json]
...
--vector-store-type   vector store type to be used.
--json                Output results in JSON format
```
Plain text format:

```
Command line used: scripts/query_rag.py -p ./test_container_output -x custom_docs-0_1 -m embeddings_model -k 2 -q vector database
LLM is explicitly disabled. Using MockLLM.
Load pretrained SentenceTransformer: embeddings_model
Loading faiss.
Successfully loaded faiss.
Failed to load GPU Faiss: name 'GpuIndexIVFFlat' is not defined. Will not load constructor refs for GPU indexes. This is only an error if you're trying to use GPU Faiss.
Loading llama_index.vector_stores.faiss.base from ./test_container_output/default__vector_store.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./test_container_output/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./test_container_output/index_store.json.
Loading indices with ids: ['custom_docs-0_1']
Batches: 100%|████████████████████████████████████████| 1/1 [00:00<00:00, 7.64it/s]
================================================================================
Node ID: 2c3a6a58-48eb-4229-896c-92c31f35580b
Text: Vector Database is an efficient way how to provide information
to LLM
Score: 0.677
================================================================================
Node ID: 9c226a5d-1885-4b46-8b36-8416cc6b0dae
Text: Lightspeed projects leverage vector databases for improved AI
responses
Score: 0.560
```
JSON format:
```
LLM is explicitly disabled. Using MockLLM.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./test_container_output/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./test_container_output/index_store.json.
{
  "query": "vector database",
  "top_k": 2,
  "threshold": 0.0,
  "nodes": [
    {
      "id": "2c3a6a58-48eb-4229-896c-92c31f35580b",
      "score": 0.6774420738220215,
      "text": "Vector Database is an efficient way how to provide information to LLM",
      "metadata": {
        "docs_url": "https://www.redhat.com",
        "title": "Vector Database is an efficient way how to provide information to LLM",
        "url_reachable": true
      }
    },
    {
      "id": "9c226a5d-1885-4b46-8b36-8416cc6b0dae",
      "score": 0.559581458568573,
      "text": "Lightspeed projects leverage vector databases for improved AI responses",
      "metadata": {
        "docs_url": "https://www.redhat.com",
        "title": "Lightspeed projects leverage vector databases for improved AI responses",
        "url_reachable": true
      }
    }
  ]
}
```
Different query (RAG systems):
```
LLM is explicitly disabled. Using MockLLM.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./test_container_output/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./test_container_output/index_store.json.
{
  "query": "RAG systems",
  "top_k": 1,
  "threshold": 0.0,
  "nodes": [
    {
      "id": "2d7ef837-88b6-45a4-acf7-aa6dfab91f80",
      "score": 0.5027890801429749,
      "text": "RAG systems use embeddings to find relevant context for queries",
      "metadata": {
        "docs_url": "https://www.redhat.com",
        "title": "RAG systems use embeddings to find relevant context for queries",
        "url_reachable": true
      }
    }
  ]
}
```
Error handling - invalid path:
```
ERROR: Cannot recognize the DB in ./nonexistent
```
Test with threshold (might trigger threshold error):
```
LLM is explicitly disabled. Using MockLLM.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./test_container_output/docstore.json.
Loading llama_index.core.storage.kvstore.simple_kvstore from ./test_container_output/index_store.json.
{
  "query": "vector database",
  "top_k": 1,
  "threshold": 0.95,
  "nodes": []
}
```