DOC-790 | Importer & Retriever: Update startup parameters #809
base: DOC-761
Conversation
aMahanna
left a comment
LGTM! Minor comment about including `embedding_api_provider`.
> "openrouter_model": "mistralai/mistral-nemo" // Specify a model here
> "chat_api_provider": "openai",
> "embedding_api_provider": "openai",
> "chat_api_url": "https://openrouter.ai/api/v1",
This is fine, since setting `openai` as the provider means that we just use the `OpenAI()` client to interact with the OpenRouter URL, which is OpenAI-compatible.
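To illustrate the point above: an "OpenAI-compatible" API exposes the same REST paths under a different base URL, so the same client code can target OpenRouter, a corporate gateway, or OpenAI itself. This is only an illustrative stdlib sketch, not the service's actual client code; the helper name is made up.

```python
def chat_completions_url(base_url: str) -> str:
    """Build the chat-completions endpoint for any OpenAI-compatible base URL.

    Illustrative helper: the path "/chat/completions" is the same across
    OpenAI-compatible providers; only the base URL changes.
    """
    return base_url.rstrip("/") + "/chat/completions"

# The same code works for OpenRouter and for the official OpenAI endpoint.
assert chat_completions_url("https://openrouter.ai/api/v1") == \
    "https://openrouter.ai/api/v1/chat/completions"
assert chat_completions_url("https://api.openai.com/v1") == \
    "https://api.openai.com/v1/chat/completions"
```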
diegomendez40
left a comment
Added a couple of comments.
IMHO: it would be great if we could provide short documentation on how to create a project in genai-service.

Create a New Project
- Endpoint: `POST /v1/project`
- Validation: the name must be 1–63 characters long and can only contain letters, numbers, underscores, and hyphens.
- Request body: JSON

WDYT?
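The validation rule above (1–63 characters; letters, digits, underscores, and hyphens only) can be sketched as a regex check. This is an illustrative sketch of the stated rule, not the actual genai-service validation code.

```python
import re

# Encodes the documented rule: 1-63 chars from [A-Za-z0-9_-].
PROJECT_NAME_RE = re.compile(r"^[A-Za-z0-9_-]{1,63}$")

def is_valid_project_name(name: str) -> bool:
    """Return True if `name` satisfies the documented project-name rule."""
    return bool(PROJECT_NAME_RE.fullmatch(name))

assert is_valid_project_name("my-project_1")
assert not is_valid_project_name("")            # too short
assert not is_valid_project_name("a" * 64)      # too long
assert not is_valid_project_name("bad name")    # space not allowed
```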
@bluepal-pavan-kothapalli There is a section on how to create a new project, in …
@diegomendez40 @bluepal-pavan-kothapalli FYI changes from my two latest commits:
diegomendez40
left a comment
Thanks for your work, @nerpaula. Unfortunately, I have found a number of possible enhancements and corrections.
While I added most of them to the 3.12 folder, they do also apply to 3.13.
> - `triton_model`: Name of the LLM model to use for text processing.
>
> ### Using OpenAI (Public LLM)
> ### Using OpenAI for chat and embedding
"openai" doesn't stand for OpenAI, but rather for any OpenAI-compatible API, including essentially any large LLM provider: OpenRouter, Gemini, Anthropic, corporate LLMs, etc.
The URL can point to the relevant non-OpenAI endpoint, even if it is served via an OpenAI-compatible API.
> {{< info >}}
> By default, for OpenAI API, the service is using
> `gpt-4o-mini` and `text-embedding-3-small` models as LLM and
> embedding model respectively.
> {{< /info >}}
> ### Using OpenRouter (Gemini, Anthropic, etc.)
> ### Using OpenRouter for chat and OpenAI for embedding
It's not just OpenRouter. It's literally any OpenAI-compatible API.
> - **Instant search**: Focuses on specific entities and their relationships, ideal
> for fast queries about particular concepts.
> - **Deep search**: Analyzes the knowledge graph structure to identify themes and patterns,
> perfect for comprehensive insights and detailed summaries.
Unfortunately, these definitions are incorrect. These are the definitions for global and local. However, I had already provided the relevant definitions for instant vs. deep search, which can be used here.
> Deep Search is designed for highly detailed, accurate responses that require understanding
> what kind of information is available in different parts of the knowledge graph and
> sequentially retrieving information in an LLM-guided research process. Use whenever
> detail and accuracy are required (e.g. aggregation of highly technical details) and
> very short latency is not (i.e. caching responses for frequently asked questions,
> or use cases with agents or research use cases).
Correct.
> The request parameters are the following:
> - `query`: Your search query text.
> - `level`: The community hierarchy level to use for the search (`1` for top-level communities).
> - `level`: The community hierarchy level to use for the search (`1` for top-level communities). Defaults to `2` if not provided.
You don't need the 'level' parameter. That's for global queries.
> - `UNIFIED`: Instant search.
> - `LOCAL`: Deep search.
This isn't exactly right.
Local with no LLM planner is the typical local query.
Local with LLM planner is Deep Search.
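The mapping described above (LOCAL without the LLM planner is a standard local query; LOCAL with the planner is Deep Search; UNIFIED is Instant Search) can be summarized as a small lookup. This is a hypothetical helper written to illustrate the reviewer's point, not code from the service.

```python
def describe_query(query_type: str, use_llm_planner: bool = True) -> str:
    """Map a retrieval mode plus the LLM-planner flag to the documented
    search name. Hypothetical illustration of the review comment above."""
    if query_type == "UNIFIED":
        return "Instant Search"
    if query_type == "LOCAL":
        # LOCAL + planner = Deep Search; LOCAL alone = typical local query.
        return "Deep Search" if use_llm_planner else "Local Search"
    if query_type == "GLOBAL":
        return "Global Search"
    raise ValueError(f"unknown query type: {query_type}")

assert describe_query("LOCAL") == "Deep Search"
assert describe_query("LOCAL", use_llm_planner=False) == "Local Search"
assert describe_query("UNIFIED") == "Instant Search"
```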
> {{< /info >}}
>
> ### Using OpenRouter (Gemini, Anthropic, etc.)
> ### Using OpenRouter for chat and OpenAI for embedding
Again, this should mention any OpenAI-compatible API, not just OpenRouter.
> - **Instant search**: Focuses on specific entities and their relationships, ideal
> for fast queries about particular concepts.
> - **Deep search**: Analyzes the knowledge graph structure to identify themes and patterns,
> perfect for comprehensive insights and detailed summaries.
All changes above (on 3.12) should be replicated for 3.13.
Co-authored-by: Anthony Mahanna <43019056+aMahanna@users.noreply.github.com>
…oject creation; move and extend Projects
diegomendez40
left a comment
Thanks for your work, @nerpaula.
This version is far better and far more accurate. Still, I managed to find a couple of possible enhancements. Thanks for considering them before merging.
> "chat_api_provider": "<your-api-provider>",
> "chat_api_key": "<your-llm-provider-api-key>",
> "chat_model": "<model-name>"
I don't think this example is right.
You'd also need an embedding_api_provider:
And given the providers that we actually support: since you're already using a `chat_api_key`, you're using an OpenAI-compatible API, which means you'd also need an `embedding_api_key`.
Just to be sure I'd use all these args:
- "db_name"
- "chat_api_provider"
- "embedding_api_provider"
- "chat_model"
- "embedding_model"
- "chat_api_url"
- "embedding_api_url"
- "embedding_dim"
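Pulling the arguments listed above together, a full configuration might look like the sketch below. Every value is a placeholder chosen for illustration; the parameter names come from this review comment, not from a verified schema.

```python
# Hedged example configuration using all the arguments the reviewer lists.
# Values are placeholders; URLs and model names are illustrative only.
retriever_config = {
    "db_name": "documentation",
    "chat_api_provider": "openai",
    "embedding_api_provider": "openai",
    "chat_model": "mistralai/mistral-nemo",
    "embedding_model": "text-embedding-3-small",
    "chat_api_url": "https://openrouter.ai/api/v1",
    "embedding_api_url": "https://api.openai.com/v1",
    "embedding_dim": 1536,  # output size of text-embedding-3-small
}

# Both provider fields are present, as the comment above recommends.
assert {"chat_api_provider", "embedding_api_provider"} <= set(retriever_config)
```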
Quick question about `embedding_dim`: should we add this to all examples? It's currently missing. Optional, I suppose? How can one decide the embedding dimension?
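A partial answer to the question above: the embedding dimension is fixed by the chosen embedding model, so it is looked up from the model's published output size rather than chosen freely. The sketch below is illustrative; the values are the published dimensions of OpenAI's embedding models, and the lookup helper is made up.

```python
# Published default output dimensions for OpenAI embedding models.
KNOWN_EMBEDDING_DIMS = {
    "text-embedding-3-small": 1536,
    "text-embedding-3-large": 3072,
    "text-embedding-ada-002": 1536,
}

def embedding_dim_for(model: str) -> int:
    """Return the output dimension for a known embedding model.

    Hypothetical helper: in practice, consult the provider's model docs.
    """
    try:
        return KNOWN_EMBEDDING_DIMS[model]
    except KeyError:
        raise ValueError(f"unknown embedding model: {model}; check the provider's docs")

assert embedding_dim_for("text-embedding-3-small") == 1536
```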
> graph and get contextually relevant responses.
> The Retriever service provides intelligent search and retrieval from knowledge graphs,
> with multiple search methods optimized for different query types. The service supports
> both private (Triton Inference Server) and public (any OpenAI-compatible API) LLM
A corporate LLM isn't necessarily public. We've done projects with customers where we have been using their private LLMs via an OpenAI-compatible API.
Thus, the distinction private-Triton vs. public-OpenAI-compatible is false. OpenAI-compatible can be private.
> The Retriever service can be configured to use either the Triton Inference Server
> (for private LLM deployments) or OpenAI/OpenRouter (for public LLM deployments).
> (for private LLM deployments) or any OpenAI-compatible API (for public LLM deployments),
Again, the private-Triton vs. public-OpenAI-compatible dichotomy is false.
> - `1`: Global search.
> - `2`: Local search.
> - `GLOBAL` or `1`: Global Search (default if not specified).
> - `LOCAL` or `2`: Deep Search when used with LLM planner, or standard Local Search without the planner.
> - `LOCAL` or `2`: Deep Search when used with LLM planner, or standard Local Search without the planner.
> - `LOCAL` or `2`: Deep Search when used with LLM planner (default), or standard Local Search when `llm_planner` is explicitly set to `false`.
> - `UNIFIED` or `3`: Instant Search.
>
> - `use_llm_planner`: Whether to use LLM planner for intelligent query orchestration (optional)
> - When enabled, orchestrates retrieval using both local and global strategies (powers Deep Search)
> - When enabled, orchestrates retrieval using both local and global strategies (powers Deep Search)
> - When enabled (default), orchestrates retrieval using both local and global strategies (powers Deep Search)
> By default, for OpenAI API, the service is using
> `gpt-4o-mini` and `text-embedding-3-small` models as LLM and
> embedding model respectively.
> When using the official OpenAI API, the service defaults to `gpt-4o-mini` and
Docs say default models are gpt-4o-mini, but code uses gpt-4o. Please update docs to match actual behavior.
Refs: server.py:197, graph_builder.py:183
> By default, for OpenAI API, the service is using
> `gpt-4o-mini` and `text-embedding-3-small` models as LLM and
> embedding model respectively.
> When using the official OpenAI API, the service defaults to `gpt-4o-mini` and
Same as mentioned above: the docs give the default model as gpt-4o-mini, but the code uses gpt-4o. Update the docs to reflect the actual behavior.
Refs: server.py:197, graph_builder.py:183
> OpenRouter makes it possible to connect to a huge array of LLM API
> providers, including non-OpenAI LLMs like Gemini Flash, Anthropic Claude
> and publicly hosted open-source models.
> You can mix and match any OpenAI-compatible APIs for chat and embedding. For example,
Docs currently state that chat and embedding providers can be mixed, but the code blocks this and raises an error (server.py, lines 149–154):

    if args.chat_api_provider != args.embedding_api_provider:
        raise ValueError("Chat API provider and embedding API provider must be the same.")

Please update the docs: either remove this section or mark it as a planned feature. Suggested note:

> Mixed provider support is planned. Currently, both `chat_api_provider` and `embedding_api_provider` must match (OpenAI or Triton).
@bluepal-keerthi-datla I wanted to illustrate in this example that you can use different OpenAI-compatible services, such as OpenRouter for chat and OpenAI for embeddings, by setting both providers to "openai" and differentiating them with different URLs. I see that this can be confusing, since both are considered the same provider type. I will change the section title to reflect this and clarify that you cannot mix Triton and OpenAI-compatible APIs.
Description
Importer & Retriever: update startup parameters, including descriptions and parameters for Instant and Deep Search.
Applying the changes to all versions is no longer needed; the AI suite folder has been removed from the ArangoDB versioned folders.