
Conversation

Collaborator

@nuwangeek nuwangeek commented Oct 2, 2025

Add LLM cost tracking and update reranker flow

  • Removed reranker from the main retrieval pipeline
  • Commented out encoder model loading to simplify flow
  • Added metadata collection function in cost_utils to extract LLM costs and token usage
  • Implemented total cost calculation utility in cost_utils
  • Integrated logging in LLM Orchestrator Service to track cost and token metrics
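The total cost calculation utility described above could look roughly like the following. This is an illustrative sketch, not the actual `cost_utils` code from the PR; the function name `calculate_total_costs` appears in the diff, but the exact dict shape and field names here are assumptions.

```python
# Hypothetical sketch of the total-cost aggregation added in cost_utils.
# costs_dict is assumed to map an operation name (e.g. "prompt_refine",
# "response_generate") to a usage dict with cost and token counts.
from typing import Any, Dict


def calculate_total_costs(costs_dict: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate per-operation cost/token metadata into one summary dict."""
    totals = {"cost": 0.0, "prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    for usage in costs_dict.values():
        totals["cost"] += float(usage.get("cost") or 0.0)
        totals["prompt_tokens"] += int(usage.get("prompt_tokens") or 0)
        totals["completion_tokens"] += int(usage.get("completion_tokens") or 0)
    totals["total_tokens"] = totals["prompt_tokens"] + totals["completion_tokens"]
    return totals
```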

Copilot AI left a comment

Pull Request Overview

This PR disables the reranker component for performance optimization and adds comprehensive LLM cost tracking throughout the system. The changes focus on removing reranker dependencies while implementing detailed usage monitoring for all LLM operations.

  • Completely disabled reranker functionality by commenting out initialization and usage code
  • Added new cost tracking utilities to monitor LLM usage, tokens, and costs across components
  • Integrated cost tracking into response generation and prompt refinement workflows

Reviewed Changes

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| src/vector_indexer/hybrid_retrieval.py | Disabled reranker initialization and usage; fusion scores are now always used |
| src/utils/cost_utils.py | New utility module for LLM cost calculation and usage tracking |
| src/response_generator/response_generate.py | Added usage tracking to the response generation workflow |
| src/prompt_refine_manager/prompt_refiner.py | Added usage tracking to the prompt refinement process |
| src/llm_orchestration_service.py | Integrated cost tracking across the orchestration workflow with detailed logging |
| pyproject.toml | Removed the rerankers dependency from project requirements |


```python
# )
# self.reranker = None

# Reranker disabled - set to None
```
Copilot AI Oct 2, 2025


[nitpick] This comment is redundant since the code immediately below sets self.reranker = None and there's already a comprehensive comment block above explaining the reranker is disabled. Consider removing this line.

Suggested change

```python
# Reranker disabled - set to None
```

Copilot uses AI. Check for mistakes.
Comment on lines 93 to 129
```python
def track_lm_usage(
    operation: Callable[..., Any], *args, **kwargs
) -> tuple[Any, Dict[str, Any]]:
    """
    Context manager-like function to track LM usage for any operation.

    Args:
        operation: The function to execute and track
        *args: Positional arguments for the operation
        **kwargs: Keyword arguments for the operation

    Returns:
        Tuple of (operation_result, usage_info_dict)

    Example:
        result, usage = track_lm_usage(predictor, question="What is AI?")
    """
    # Get initial history length
    lm = dspy.settings.lm
    history_length_before = len(lm.history) if lm and hasattr(lm, "history") else 0

    # Execute the operation
    result = operation(*args, **kwargs)

    # Extract usage from new history entries
    usage_info = get_default_usage_dict()

    if lm and hasattr(lm, "history"):
        try:
            new_history = lm.history[history_length_before:]
            usage_info = extract_cost_from_lm_history(new_history)
        except Exception as e:
            logger.warning(f"Failed to extract usage info: {str(e)}")

    return result, usage_info
```

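For context on the function above: `track_lm_usage` leans on two helpers, `get_default_usage_dict` and `extract_cost_from_lm_history`. The PR's actual implementations are not shown in this conversation, but they could plausibly look like the sketch below. The assumed history-entry shape (dicts with a `cost` field and a nested `usage` mapping) is an illustration, not confirmed from the diff.

```python
# Hypothetical sketches of the helpers track_lm_usage relies on; the real
# cost_utils versions may differ. History entries are assumed to be dicts
# with a "cost" key and a nested "usage" mapping of token counts, roughly
# matching what dspy's LM keeps in lm.history.
from typing import Any, Dict, List


def get_default_usage_dict() -> Dict[str, Any]:
    """Zeroed usage record, returned when no LM history is available."""
    return {"cost": 0.0, "prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}


def extract_cost_from_lm_history(history: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Sum cost and token counts over a slice of LM history entries."""
    usage_info = get_default_usage_dict()
    for entry in history:
        usage_info["cost"] += float(entry.get("cost") or 0.0)
        usage = entry.get("usage") or {}
        usage_info["prompt_tokens"] += int(usage.get("prompt_tokens") or 0)
        usage_info["completion_tokens"] += int(usage.get("completion_tokens") or 0)
    usage_info["total_tokens"] = usage_info["prompt_tokens"] + usage_info["completion_tokens"]
    return usage_info
```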

Copilot AI Oct 2, 2025


The function track_lm_usage is defined but not used anywhere in the codebase according to the diff. Consider removing it if it's not needed, or add test coverage if it's intended for future use.

Suggested change
```python
def track_lm_usage(
    operation: Callable[..., Any], *args, **kwargs
) -> tuple[Any, Dict[str, Any]]:
    """
    Context manager-like function to track LM usage for any operation.

    Args:
        operation: The function to execute and track
        *args: Positional arguments for the operation
        **kwargs: Keyword arguments for the operation

    Returns:
        Tuple of (operation_result, usage_info_dict)

    Example:
        result, usage = track_lm_usage(predictor, question="What is AI?")
    """
    # Get initial history length
    lm = dspy.settings.lm
    history_length_before = len(lm.history) if lm and hasattr(lm, "history") else 0

    # Execute the operation
    result = operation(*args, **kwargs)

    # Extract usage from new history entries
    usage_info = get_default_usage_dict()

    if lm and hasattr(lm, "history"):
        try:
            new_history = lm.history[history_length_before:]
            usage_info = extract_cost_from_lm_history(new_history)
        except Exception as e:
            logger.warning(f"Failed to extract usage info: {str(e)}")

    return result, usage_info
```

Comment on lines 26 to 29
"""Stateless service class for handling LLM orchestration business logic."""

def __init__(self) -> None:
"""Initialize the stateless orchestration service."""
Copilot AI Oct 2, 2025


[nitpick] The docstring mentions 'stateless orchestration service' but the method now has logic that tracks costs in a dictionary, making it somewhat stateful during execution. Consider updating the docstring to reflect this behavior.

Suggested change
"""Stateless service class for handling LLM orchestration business logic."""
def __init__(self) -> None:
"""Initialize the stateless orchestration service."""
"""
Service class for handling LLM orchestration business logic.
The service does not maintain state between requests (stateless in the architectural sense),
but tracks per-request state (such as costs) internally during the execution of a request.
"""
def __init__(self) -> None:
"""
Initialize the orchestration service.
Note: The service does not persist state between requests, but tracks per-request
information (e.g., costs) internally during request processing.
"""


```python
total_costs = calculate_total_costs(costs_dict)

logger.info("=" * 50)
```
Collaborator


can we remove this?
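One way to address this comment, if the banner is dropped: emit the summary as a single structured log line instead of `"=" * 50` separators. This is a hedged suggestion, not code from the PR; `total_costs` is assumed to be the dict produced by `calculate_total_costs`, and the field names are illustrative.

```python
# Sketch: replace banner-style separators with one parseable log record.
# Assumes total_costs carries "cost", "prompt_tokens", "completion_tokens".
import logging

logger = logging.getLogger("llm_orchestration_service")


def log_cost_summary(total_costs: dict) -> None:
    """Emit the cost summary as a single structured, grep-friendly line."""
    logger.info(
        "llm_cost_summary cost=%.6f prompt_tokens=%d completion_tokens=%d",
        total_costs.get("cost", 0.0),
        total_costs.get("prompt_tokens", 0),
        total_costs.get("completion_tokens", 0),
    )
```

A single key=value line is easier to filter in log aggregators than a multi-line banner block.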

nuwangeek and others added 3 commits October 3, 2025 16:27

  • Get update from buerokratt/RAG-Module wip to rootcodelabs/RAG-Module wip
  • Get update from wip into RAG-111
@Thirunayan22 Thirunayan22 merged commit 0294237 into buerokratt:wip Oct 3, 2025
6 of 7 checks passed
nuwangeek added a commit to rootcodelabs/RAG-Module that referenced this pull request Oct 3, 2025
Disable Re ranker and Add LLM cost tracking  (buerokratt#112)
