-
Notifications
You must be signed in to change notification settings - Fork 4
Disable Re ranker and Add LLM cost tracking #112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Get update from RAG-6 to RAG-7
get update from RAG-6 into RAG-7
Update wip-temp from RAG-7
Get update from buerokratt/RAG-Module wip to rootcodelabs/RAG-Module wip
Fixed merge conflicts
LLM connection creation changes (#108)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR disables the reranker component for performance optimization and adds comprehensive LLM cost tracking throughout the system. The changes focus on removing reranker dependencies while implementing detailed usage monitoring for all LLM operations.
- Completely disabled reranker functionality by commenting out initialization and usage code
- Added new cost tracking utilities to monitor LLM usage, tokens, and costs across components
- Integrated cost tracking into response generation and prompt refinement workflows
Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/vector_indexer/hybrid_retrieval.py | Disabled reranker initialization and usage, always using fusion scores only |
| src/utils/cost_utils.py | New utility module for LLM cost calculation and usage tracking |
| src/response_generator/response_generate.py | Added usage tracking to response generation workflow |
| src/prompt_refine_manager/prompt_refiner.py | Added usage tracking to prompt refinement process |
| src/llm_orchestration_service.py | Integrated cost tracking across orchestration workflow with detailed logging |
| pyproject.toml | Removed rerankers dependency from project requirements |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| # ) | ||
| # self.reranker = None | ||
|
|
||
| # Reranker disabled - set to None |
Copilot
AI
Oct 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] This comment is redundant since the code immediately below sets self.reranker = None and there's already a comprehensive comment block above explaining the reranker is disabled. Consider removing this line.
| # Reranker disabled - set to None |
src/utils/cost_utils.py
Outdated
| def track_lm_usage( | ||
| operation: Callable[..., Any], *args, **kwargs | ||
| ) -> tuple[Any, Dict[str, Any]]: | ||
| """ | ||
| Context manager-like function to track LM usage for any operation. | ||
| Args: | ||
| operation: The function to execute and track | ||
| *args: Positional arguments for the operation | ||
| **kwargs: Keyword arguments for the operation | ||
| Returns: | ||
| Tuple of (operation_result, usage_info_dict) | ||
| Example: | ||
| result, usage = track_lm_usage(predictor, question="What is AI?") | ||
| """ | ||
| # Get initial history length | ||
| lm = dspy.settings.lm | ||
| history_length_before = len(lm.history) if lm and hasattr(lm, "history") else 0 | ||
|
|
||
| # Execute the operation | ||
| result = operation(*args, **kwargs) | ||
|
|
||
| # Extract usage from new history entries | ||
| usage_info = get_default_usage_dict() | ||
|
|
||
| if lm and hasattr(lm, "history"): | ||
| try: | ||
| new_history = lm.history[history_length_before:] | ||
| usage_info = extract_cost_from_lm_history(new_history) | ||
| except Exception as e: | ||
| logger.warning(f"Failed to extract usage info: {str(e)}") | ||
|
|
||
| return result, usage_info | ||
|
|
||
|
|
Copilot
AI
Oct 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function track_lm_usage is defined but not used anywhere in the codebase according to the diff. Consider removing it if it's not needed, or add test coverage if it's intended for future use.
| def track_lm_usage( | |
| operation: Callable[..., Any], *args, **kwargs | |
| ) -> tuple[Any, Dict[str, Any]]: | |
| """ | |
| Context manager-like function to track LM usage for any operation. | |
| Args: | |
| operation: The function to execute and track | |
| *args: Positional arguments for the operation | |
| **kwargs: Keyword arguments for the operation | |
| Returns: | |
| Tuple of (operation_result, usage_info_dict) | |
| Example: | |
| result, usage = track_lm_usage(predictor, question="What is AI?") | |
| """ | |
| # Get initial history length | |
| lm = dspy.settings.lm | |
| history_length_before = len(lm.history) if lm and hasattr(lm, "history") else 0 | |
| # Execute the operation | |
| result = operation(*args, **kwargs) | |
| # Extract usage from new history entries | |
| usage_info = get_default_usage_dict() | |
| if lm and hasattr(lm, "history"): | |
| try: | |
| new_history = lm.history[history_length_before:] | |
| usage_info = extract_cost_from_lm_history(new_history) | |
| except Exception as e: | |
| logger.warning(f"Failed to extract usage info: {str(e)}") | |
| return result, usage_info |
src/llm_orchestration_service.py
Outdated
| """Stateless service class for handling LLM orchestration business logic.""" | ||
|
|
||
| def __init__(self) -> None: | ||
| """Initialize the stateless orchestration service.""" |
Copilot
AI
Oct 2, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nitpick] The docstring mentions 'stateless orchestration service' but the method now has logic that tracks costs in a dictionary, making it somewhat stateful during execution. Consider updating the docstring to reflect this behavior.
| """Stateless service class for handling LLM orchestration business logic.""" | |
| def __init__(self) -> None: | |
| """Initialize the stateless orchestration service.""" | |
| """ | |
| Service class for handling LLM orchestration business logic. | |
| The service does not maintain state between requests (stateless in the architectural sense), | |
| but tracks per-request state (such as costs) internally during the execution of a request. | |
| """ | |
| def __init__(self) -> None: | |
| """ | |
| Initialize the orchestration service. | |
| Note: The service does not persist state between requests, but tracks per-request | |
| information (e.g., costs) internally during request processing. | |
| """ |
src/llm_orchestration_service.py
Outdated
|
|
||
| total_costs = calculate_total_costs(costs_dict) | ||
|
|
||
| logger.info("=" * 50) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we remove this?
Get update from buerokratt/RAG-Module wip to rootcodelabs/RAG-Module wip
Get update from wip into RAG-111
Disable Re ranker and Add LLM cost tracking (buerokratt#112)
Add LLM cost tracking and update reranker flow