Description
Is your feature request related to a problem? Please describe:
When working with large codebases or multiple files, the context window of LLMs is too limited to fit all relevant code/documentation. This makes it difficult to ask questions like “How does authentication work across these modules?” or “Where is this function used in the project?” because the model cannot see the entire project at once.
Describe the solution you'd like:
Introduce Retrieval-Augmented Generation (RAG) support so that instead of trying to fit everything into the prompt, the system can:
Index the entire codebase or documentation.
Dynamically retrieve only the most relevant chunks based on a query.
Feed those retrieved pieces into the LLM prompt.
Return grounded answers with citations (file names, line numbers, snippets). A rough sketch of this flow is shown after the list.
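Below is a minimal, illustrative sketch of the flow described above, not a proposal for the actual implementation. It assumes fixed-size line chunking and uses a trivial bag-of-words vector as a stand-in embedding so the example runs without any dependencies; a real implementation would use a proper embedding model and vector store. All names (`Chunk`, `chunk_file`, `retrieve`, `build_prompt`) are hypothetical.

```python
# Illustrative RAG sketch: index -> retrieve -> prompt with citations.
from __future__ import annotations

import math
import re
from collections import Counter
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Chunk:
    path: str
    start_line: int
    end_line: int
    text: str


def chunk_file(path: Path, lines_per_chunk: int = 40) -> list[Chunk]:
    """Step 1 (indexing): split one source file into fixed-size line ranges."""
    lines = path.read_text(errors="ignore").splitlines()
    chunks = []
    for start in range(0, len(lines), lines_per_chunk):
        end = min(start + lines_per_chunk, len(lines))
        chunks.append(Chunk(str(path), start + 1, end, "\n".join(lines[start:end])))
    return chunks


def embed(text: str) -> Counter:
    """Stand-in embedding: token counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[A-Za-z_]\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[Chunk], k: int = 5) -> list[Chunk]:
    """Step 2 (retrieval): pick the k chunks most relevant to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)[:k]


def build_prompt(query: str, hits: list[Chunk]) -> str:
    """Steps 3-4: feed only the retrieved snippets to the LLM, with file/line citations."""
    context = "\n\n".join(
        f"[{c.path}:{c.start_line}-{c.end_line}]\n{c.text}" for c in hits
    )
    return (
        "Answer using only the context below, citing sources as [file:lines].\n\n"
        f"{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    index = [c for f in Path(".").rglob("*.py") for c in chunk_file(f)]
    question = "How does authentication work across these modules?"
    print(build_prompt(question, retrieve(question, index)))
```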
Describe alternatives you've considered:
Splitting files manually and pasting them into the prompt (inefficient and error-prone).
Increasing the context window by switching models (not always possible, expensive, and still limited).
Using embeddings + search outside the repo (adds extra infrastructure burden for every user).
Additional context:
RAG would make the tool scalable for:
Large repositories with thousands of files.
Documentation-heavy projects.
Long-lived projects where the model must stay aligned with evolving code.
It also improves accuracy, since the model grounds its answers in actually retrieved code/docs rather than guessing.
Reference: https://kilocode.ai/docs/advanced-usage/large-projects