Skip to content

Conversation

@MODSetter
Copy link
Owner

@MODSetter MODSetter commented Oct 30, 2025

  • Added RERANKERS_ENABLED option to control reranking functionality.
  • Updated rerank_documents function to handle cases when reranking is disabled.
  • Enhanced documentation for environment variables related to rerankers in installation guides.

API Changes

  • This PR includes API changes

Change Type

  • Bug fix
  • New feature
  • Performance improvement
  • Refactoring
  • Documentation
  • Dependency/Build system
  • Breaking change
  • Other (specify):

Testing Performed

  • Tested locally
  • Manual/QA verification

Checklist

  • Follows project coding standards and conventions
  • Documentation updated as needed
  • Dependencies updated as needed
  • No lint/build errors or new warnings
  • All relevant tests are passing

High-level PR Summary

This PR adds a configuration toggle to enable or disable document reranking functionality. It introduces the RERANKERS_ENABLED environment variable that controls whether the reranker service is initialized, updates the rerank_documents function to gracefully handle cases when reranking is disabled by returning original documents, and enhances documentation across installation guides to clarify when reranker configuration is required.

⏱️ Estimated Review Time: 15-30 minutes

💡 Review Order Suggestion
Order File Path
1 surfsense_backend/app/config/__init__.py
2 surfsense_backend/app/agents/researcher/qna_agent/nodes.py
3 surfsense_backend/.env.example
4 surfsense_web/content/docs/docker-installation.mdx
5 surfsense_web/content/docs/manual-installation.mdx

Need help? Join our Discord

Analyze latest changes

Summary by CodeRabbit

  • New Features

    • Added optional reranking feature flag to enable/disable document reranking functionality.
  • Changes

    • Changed default text-to-speech service from OpenAI to local Kokoro backend.
    • Reranking now gracefully handles disabled state without errors.
  • Documentation

    • Updated configuration guides to reflect optional reranking setup and new TTS default.

- Added RERANKERS_ENABLED option to control reranking functionality.
- Updated rerank_documents function to handle cases when reranking is disabled.
- Enhanced documentation for environment variables related to rerankers in installation guides.
@vercel
Copy link

vercel bot commented Oct 30, 2025

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Preview Comments Updated (UTC)
surf-sense-frontend Ready Ready Preview Comment Oct 30, 2025 6:24am

@coderabbitai
Copy link

coderabbitai bot commented Oct 30, 2025

Walkthrough

These changes introduce a RERANKERS_ENABLED feature flag to make reranking optional across the system. When disabled, documents bypass reranking and return unmodified. The configuration layer, business logic, and documentation are updated to reflect this conditional behavior. The default TTS service is also changed from OpenAI to Kokoro.

Changes

Cohort / File(s) Summary
Configuration & Environment
surfsense_backend/.env.example, surfsense_backend/app/config/__init__.py
Added RERANKERS_ENABLED feature flag. Reranker instance is now conditionally initialized only when enabled; otherwise set to None. RERANKERS_MODEL_NAME and RERANKERS_MODEL_TYPE are now conditional to the enabled state. Changed default TTS_SERVICE from openai/tts-1 to local/kokoro.
Business Logic
surfsense_backend/app/agents/researcher/qna_agent/nodes.py
Updated rerank_documents() function to check if reranker is available before attempting reranking. When disabled, logs and returns original documents immediately. When enabled, performs reranking with error handling and fallback to original documents on failure.
Documentation
surfsense_web/content/docs/docker-installation.mdx, surfsense_web/content/docs/manual-installation.mdx
Updated installation guides to reflect new RERANKERS_ENABLED flag. RERANKERS_MODEL_NAME and RERANKERS_MODEL_TYPE are now documented as required only when RERANKERS_ENABLED=TRUE.

Sequence Diagram(s)

sequenceDiagram
    participant Agent as QnA Agent
    participant Nodes as Rerank Nodes
    participant Config as Config
    participant Reranker as Reranker Service
    
    Agent->>Nodes: rerank_documents(state, config)
    Nodes->>Config: Check RERANKERS_ENABLED
    
    alt Reranker Enabled
        Nodes->>Reranker: Build reranker input
        Nodes->>Reranker: Call rerank()
        
        alt Rerank Success
            Reranker-->>Nodes: Reranked results
            Nodes->>Nodes: Sort by score (desc)
            Nodes-->>Agent: Reranked documents
        else Rerank Exception
            Nodes->>Nodes: Log error
            Nodes-->>Agent: Original documents (fallback)
        end
    else Reranker Disabled
        Nodes->>Nodes: Log "reranking disabled"
        Nodes-->>Agent: Original documents
    end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Verify conditional logic is correctly threaded across configuration, instantiation, and invocation layers
  • Ensure fallback behavior in rerank_documents() properly handles both disabled state and reranker failures
  • Cross-check that documentation accurately reflects the new conditional requirements for RERANKERS_MODEL_NAME and RERANKERS_MODEL_TYPE

Possibly related PRs

Poem

🐰 A flag to toggle reranking on and off so fine,
When disabled, documents flow unranked through the pipeline,
With graceful fallback and logs to guide the way,
Kokoro whispers now instead of OpenAI's say!
✨ Optional features make systems play.

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The title "chore: update configuration for rerankers" is directly related to the main changes in this pull request. The changeset introduces a new RERANKERS_ENABLED configuration flag, makes reranker_instance conditional based on this flag, updates environment variable documentation, and modifies the rerank_documents function to handle cases when reranking is disabled. The title accurately captures that configuration updates related to rerankers are the primary focus, and it's specific enough that a teammate scanning the history would understand the changeset involves reranker configuration modifications rather than being misled or confused by vague language.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch dev

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@recurseml recurseml bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review by RecurseML

🔍 Review performed on b71e154..0987440

  Severity     Location     Issue     Delete  
High surfsense_backend/app/config/init.py:97 Missing validation for required parameters
✅ Files analyzed, no issues (4)

surfsense_backend/.env.example
surfsense_backend/app/agents/researcher/qna_agent/nodes.py
surfsense_web/content/docs/docker-installation.mdx
surfsense_web/content/docs/manual-installation.mdx

)
RERANKERS_ENABLED = os.getenv("RERANKERS_ENABLED", "FALSE").upper() == "TRUE"
if RERANKERS_ENABLED:
RERANKERS_MODEL_NAME = os.getenv("RERANKERS_MODEL_NAME")
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When RERANKERS_ENABLED=TRUE, the code attempts to initialize the Reranker without validating that RERANKERS_MODEL_NAME and RERANKERS_MODEL_TYPE are set. If these environment variables are not provided, os.getenv() will return None, and the Reranker class will be instantiated with model_name=None and model_type=None at lines 99-102. Based on the rerankers library documentation (https://github.com/AnswerDotAI/rerankers), the Reranker class requires valid model_name and model_type parameters to function. Passing None values will cause a runtime error when the application starts or when reranking is attempted, likely raising a TypeError or ValueError during Reranker initialization. The application will crash at startup if RERANKERS_ENABLED=TRUE without the required model configuration values.


React with 👍 to tell me that this comment was useful, or 👎 if not (and I'll stop posting more comments like this in the future)

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
surfsense_backend/.env.example (1)

59-59: Minor formatting issue in the comment.

Missing a space before the parenthesis. Should be "TRUE or FALSE (Default: FALSE)" instead of "TRUE or FALSE(Default: FALSE)".

Apply this diff:

-RERANKERS_ENABLED=TRUE or FALSE(Default: FALSE)
+RERANKERS_ENABLED=TRUE or FALSE (Default: FALSE)
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b71e154 and 0987440.

📒 Files selected for processing (5)
  • surfsense_backend/.env.example (3 hunks)
  • surfsense_backend/app/agents/researcher/qna_agent/nodes.py (2 hunks)
  • surfsense_backend/app/config/__init__.py (1 hunks)
  • surfsense_web/content/docs/docker-installation.mdx (1 hunks)
  • surfsense_web/content/docs/manual-installation.mdx (1 hunks)
🧰 Additional context used
📓 Path-based instructions (2)
**/.env.*

📄 CodeRabbit inference engine (.rules/no_env_files_in_repo.mdc)

Do not commit variant environment files like .env.* (e.g., .env.local, .env.production)

Files:

  • surfsense_backend/.env.example
**/.env.example

📄 CodeRabbit inference engine (.rules/no_env_files_in_repo.mdc)

Provide a .env.example file with placeholder values instead of real secrets

Files:

  • surfsense_backend/.env.example
🧬 Code graph analysis (1)
surfsense_backend/app/agents/researcher/qna_agent/nodes.py (1)
surfsense_backend/app/services/reranker_service.py (1)
  • rerank_documents (21-90)
🪛 dotenv-linter (4.0.0)
surfsense_backend/.env.example

[warning] 29-29: [ValueWithoutQuotes] This value needs to be surrounded in quotes

(ValueWithoutQuotes)


[warning] 31-31: [UnorderedKey] The GOOGLE_OAUTH_CLIENT_ID key should go before the REGISTRATION_ENABLED key

(UnorderedKey)


[warning] 32-32: [UnorderedKey] The GOOGLE_OAUTH_CLIENT_SECRET key should go before the REGISTRATION_ENABLED key

(UnorderedKey)


[warning] 59-59: [ValueWithoutQuotes] This value needs to be surrounded in quotes

(ValueWithoutQuotes)


[warning] 63-63: [ExtraBlankLine] Extra blank line detected

(ExtraBlankLine)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Python Backend Quality
🔇 Additional comments (5)
surfsense_web/content/docs/manual-installation.mdx (1)

76-78: LGTM!

The documentation clearly reflects the conditional nature of the reranker configuration. The optional flag and conditional requirements are well-documented.

surfsense_web/content/docs/docker-installation.mdx (1)

91-93: LGTM!

Documentation is consistent with the manual installation guide and accurately reflects the conditional reranker configuration.

surfsense_backend/.env.example (1)

66-66: Note: TTS service default changed from OpenAI to Kokoro.

The default TTS service has been changed from openai/tts-1 to local/kokoro. This aligns with the PR summary but is unrelated to the reranker feature. Ensure this change is intentional and documented for users upgrading their configurations.

surfsense_backend/app/agents/researcher/qna_agent/nodes.py (1)

45-83: LGTM! Well-structured conditional reranking logic.

The implementation correctly:

  • Returns early with original documents when reranking is disabled
  • Prepares reranker input documents with all required fields
  • Sorts reranked results by score in descending order
  • Provides safe fallback to original documents on errors

The error handling ensures resilience, and the logging helps with debugging.

surfsense_backend/app/config/__init__.py (1)

95-104: The review comment is based on a misunderstanding of the code structure.

RERANKERS_MODEL_NAME and RERANKERS_MODEL_TYPE are not class attributes—they are local variables scoped to the if RERANKERS_ENABLED: block used only to instantiate the Reranker. They are never intended to persist as class attributes or be accessed elsewhere.

Verification confirms:

  • No code references Config.RERANKERS_MODEL_NAME or Config.RERANKERS_MODEL_TYPE anywhere in the codebase
  • The only conditional class attribute is reranker_instance, which is properly safeguarded by a hasattr() check in reranker_service.py (line 102) before access
  • No AttributeError risk exists

Likely an incorrect or invalid review comment.

@MODSetter MODSetter merged commit b79befd into main Oct 30, 2025
8 of 9 checks passed
okapteinis pushed a commit to okapteinis/SurfSense that referenced this pull request Nov 21, 2025
chore: update configuration for rerankers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants