-
-
Notifications
You must be signed in to change notification settings - Fork 976
chore: update configuration for rerankers #448
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Added RERANKERS_ENABLED option to control reranking functionality. - Updated rerank_documents function to handle cases when reranking is disabled. - Enhanced documentation for environment variables related to rerankers in installation guides.
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
WalkthroughThese changes introduce a Changes
Sequence Diagram(s)sequenceDiagram
participant Agent as QnA Agent
participant Nodes as Rerank Nodes
participant Config as Config
participant Reranker as Reranker Service
Agent->>Nodes: rerank_documents(state, config)
Nodes->>Config: Check RERANKERS_ENABLED
alt Reranker Enabled
Nodes->>Reranker: Build reranker input
Nodes->>Reranker: Call rerank()
alt Rerank Success
Reranker-->>Nodes: Reranked results
Nodes->>Nodes: Sort by score (desc)
Nodes-->>Agent: Reranked documents
else Rerank Exception
Nodes->>Nodes: Log error
Nodes-->>Agent: Original documents (fallback)
end
else Reranker Disabled
Nodes->>Nodes: Log "reranking disabled"
Nodes-->>Agent: Original documents
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review by RecurseML
🔍 Review performed on b71e154..0987440
| Severity | Location | Issue | Delete |
|---|---|---|---|
| surfsense_backend/app/config/init.py:97 | Missing validation for required parameters |
✅ Files analyzed, no issues (4)
• surfsense_backend/.env.example
• surfsense_backend/app/agents/researcher/qna_agent/nodes.py
• surfsense_web/content/docs/docker-installation.mdx
• surfsense_web/content/docs/manual-installation.mdx
| ) | ||
| RERANKERS_ENABLED = os.getenv("RERANKERS_ENABLED", "FALSE").upper() == "TRUE" | ||
| if RERANKERS_ENABLED: | ||
| RERANKERS_MODEL_NAME = os.getenv("RERANKERS_MODEL_NAME") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When RERANKERS_ENABLED=TRUE, the code attempts to initialize the Reranker without validating that RERANKERS_MODEL_NAME and RERANKERS_MODEL_TYPE are set. If these environment variables are not provided, os.getenv() will return None, and the Reranker class will be instantiated with model_name=None and model_type=None at lines 99-102. Based on the rerankers library documentation (https://github.com/AnswerDotAI/rerankers), the Reranker class requires valid model_name and model_type parameters to function. Passing None values will cause a runtime error when the application starts or when reranking is attempted, likely raising a TypeError or ValueError during Reranker initialization. The application will crash at startup if RERANKERS_ENABLED=TRUE without the required model configuration values.
React with 👍 to tell me that this comment was useful, or 👎 if not (and I'll stop posting more comments like this in the future)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
surfsense_backend/.env.example (1)
59-59: Minor formatting issue in the comment.Missing a space before the parenthesis. Should be
"TRUE or FALSE (Default: FALSE)"instead of"TRUE or FALSE(Default: FALSE)".Apply this diff:
-RERANKERS_ENABLED=TRUE or FALSE(Default: FALSE) +RERANKERS_ENABLED=TRUE or FALSE (Default: FALSE)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
surfsense_backend/.env.example(3 hunks)surfsense_backend/app/agents/researcher/qna_agent/nodes.py(2 hunks)surfsense_backend/app/config/__init__.py(1 hunks)surfsense_web/content/docs/docker-installation.mdx(1 hunks)surfsense_web/content/docs/manual-installation.mdx(1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/.env.*
📄 CodeRabbit inference engine (.rules/no_env_files_in_repo.mdc)
Do not commit variant environment files like .env.* (e.g., .env.local, .env.production)
Files:
surfsense_backend/.env.example
**/.env.example
📄 CodeRabbit inference engine (.rules/no_env_files_in_repo.mdc)
Provide a .env.example file with placeholder values instead of real secrets
Files:
surfsense_backend/.env.example
🧬 Code graph analysis (1)
surfsense_backend/app/agents/researcher/qna_agent/nodes.py (1)
surfsense_backend/app/services/reranker_service.py (1)
rerank_documents(21-90)
🪛 dotenv-linter (4.0.0)
surfsense_backend/.env.example
[warning] 29-29: [ValueWithoutQuotes] This value needs to be surrounded in quotes
(ValueWithoutQuotes)
[warning] 31-31: [UnorderedKey] The GOOGLE_OAUTH_CLIENT_ID key should go before the REGISTRATION_ENABLED key
(UnorderedKey)
[warning] 32-32: [UnorderedKey] The GOOGLE_OAUTH_CLIENT_SECRET key should go before the REGISTRATION_ENABLED key
(UnorderedKey)
[warning] 59-59: [ValueWithoutQuotes] This value needs to be surrounded in quotes
(ValueWithoutQuotes)
[warning] 63-63: [ExtraBlankLine] Extra blank line detected
(ExtraBlankLine)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Python Backend Quality
🔇 Additional comments (5)
surfsense_web/content/docs/manual-installation.mdx (1)
76-78: LGTM!The documentation clearly reflects the conditional nature of the reranker configuration. The optional flag and conditional requirements are well-documented.
surfsense_web/content/docs/docker-installation.mdx (1)
91-93: LGTM!Documentation is consistent with the manual installation guide and accurately reflects the conditional reranker configuration.
surfsense_backend/.env.example (1)
66-66: Note: TTS service default changed from OpenAI to Kokoro.The default TTS service has been changed from
openai/tts-1tolocal/kokoro. This aligns with the PR summary but is unrelated to the reranker feature. Ensure this change is intentional and documented for users upgrading their configurations.surfsense_backend/app/agents/researcher/qna_agent/nodes.py (1)
45-83: LGTM! Well-structured conditional reranking logic.The implementation correctly:
- Returns early with original documents when reranking is disabled
- Prepares reranker input documents with all required fields
- Sorts reranked results by score in descending order
- Provides safe fallback to original documents on errors
The error handling ensures resilience, and the logging helps with debugging.
surfsense_backend/app/config/__init__.py (1)
95-104: The review comment is based on a misunderstanding of the code structure.RERANKERS_MODEL_NAME and RERANKERS_MODEL_TYPE are not class attributes—they are local variables scoped to the
if RERANKERS_ENABLED:block used only to instantiate the Reranker. They are never intended to persist as class attributes or be accessed elsewhere.Verification confirms:
- No code references
Config.RERANKERS_MODEL_NAMEorConfig.RERANKERS_MODEL_TYPEanywhere in the codebase- The only conditional class attribute is
reranker_instance, which is properly safeguarded by ahasattr()check inreranker_service.py(line 102) before access- No AttributeError risk exists
Likely an incorrect or invalid review comment.
chore: update configuration for rerankers
API Changes
Change Type
Testing Performed
Checklist
High-level PR Summary
This PR adds a configuration toggle to enable or disable document reranking functionality. It introduces the
RERANKERS_ENABLEDenvironment variable that controls whether the reranker service is initialized, updates thererank_documentsfunction to gracefully handle cases when reranking is disabled by returning original documents, and enhances documentation across installation guides to clarify when reranker configuration is required.⏱️ Estimated Review Time: 15-30 minutes
💡 Review Order Suggestion
surfsense_backend/app/config/__init__.pysurfsense_backend/app/agents/researcher/qna_agent/nodes.pysurfsense_backend/.env.examplesurfsense_web/content/docs/docker-installation.mdxsurfsense_web/content/docs/manual-installation.mdxSummary by CodeRabbit
New Features
Changes
Documentation