
Expose complete response metadata from chat model via .invoke/.batch/.stream #16403

Open
eyurtsev opened this issue Jan 22, 2024 · 4 comments
Labels
03 enhancement (Enhancement of existing functionality) · Ɑ: models (Related to LLMs or chat model modules)

Comments

@eyurtsev
Collaborator

Privileged issue

  • I am a LangChain maintainer, or was asked directly by a LangChain maintainer to create an issue here.

Issue Content

Impossible to access system_fingerprint from OpenAI responses.

see: #13170 (reply in thread)
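
For context, the access pattern this issue asks for would look roughly like the sketch below. This is illustrative only; `response_metadata` is an assumed attribute name for carrying the provider's raw metadata on the returned message, not a shipped API at the time of filing:

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-3.5-turbo")
msg = llm.invoke("Hello")
# Desired: provider metadata travels with the returned message, e.g.:
print(msg.response_metadata["system_fingerprint"])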

@eyurtsev
Collaborator Author

cc @baskaryan

@eyurtsev
Collaborator Author

Another discussion: #16030

@dosubot dosubot bot added Ɑ: models Related to LLMs or chat model modules 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Jan 22, 2024
@eyurtsev eyurtsev added 03 enhancement Enhancement of existing functionality and removed 🤖:bug Related to a bug, vulnerability, unexpected error with an existing feature labels Jan 22, 2024
@Keiku

Keiku commented Mar 1, 2024

@eyurtsev Is there an update here? I'm having trouble with the lack of reproducibility of the output.
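
For anyone blocked on this: OpenAI's own reproducibility recipe pairs the seed request parameter with the returned system_fingerprint. A minimal sketch against the bare openai v1 client, as a workaround until LangChain surfaces the field:

from openai import OpenAI

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    seed=42,  # request best-effort deterministic sampling
)
# If system_fingerprint differs across calls, the backend configuration
# changed and identical outputs are not guaranteed even with the same seed.
print(resp.system_fingerprint)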

@alex-ber

Inspired by #16030 (reply in thread)

import threading
from typing import Any, Dict, List

from langchain_community.callbacks.openai_info import (
    MODEL_COST_PER_1K_TOKENS,
    get_openai_token_cost_for_model,
    standardize_model_name,
)
from langchain_core.callbacks import BaseCallbackHandler
from langchain_core.outputs import LLMResult


class OpenAICallbackHandler(BaseCallbackHandler):
    """Callback Handler that tracks OpenAI info, including system_fingerprint."""

    total_tokens: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0
    successful_requests: int = 0
    total_cost: float = 0.0
    system_fingerprint: str = ""

    def __init__(self) -> None:
        super().__init__()
        # Guards the mutable counters when callbacks fire from multiple threads.
        self._lock = threading.Lock()

    def __repr__(self) -> str:
        return (
            f"Tokens Used: {self.total_tokens}\n"
            f"\tPrompt Tokens: {self.prompt_tokens}\n"
            f"\tCompletion Tokens: {self.completion_tokens}\n"
            f"Successful Requests: {self.successful_requests}\n"
            f"Total Cost (USD): ${self.total_cost}\n"
            f"System Fingerprint: {self.system_fingerprint}"
        )

    @property
    def always_verbose(self) -> bool:
        """Whether to call verbose callbacks even if verbose is False."""
        return True

    def on_llm_start(
        self, serialized: Dict[str, Any], prompts: List[str], **kwargs: Any
    ) -> None:
        """Print out the prompts."""
        pass

    def on_llm_new_token(self, token: str, **kwargs: Any) -> None:
        """Print out the token."""
        pass

    def on_llm_end(self, response: LLMResult, **kwargs: Any) -> None:
        """Collect token usage."""
        if response.llm_output is None:
            return None

        if "token_usage" not in response.llm_output:
            with self._lock:
                self.successful_requests += 1
            return None

        # compute tokens and cost for this request
        token_usage = response.llm_output["token_usage"]
        completion_tokens = token_usage.get("completion_tokens", 0)
        prompt_tokens = token_usage.get("prompt_tokens", 0)
        model_name = standardize_model_name(response.llm_output.get("model_name", ""))
        if model_name in MODEL_COST_PER_1K_TOKENS:
            completion_cost = get_openai_token_cost_for_model(
                model_name, completion_tokens, is_completion=True
            )
            prompt_cost = get_openai_token_cost_for_model(model_name, prompt_tokens)
        else:
            completion_cost = 0
            prompt_cost = 0

        # update shared state behind lock
        with self._lock:
            self.total_cost += prompt_cost + completion_cost
            self.total_tokens += token_usage.get("total_tokens", 0)
            self.prompt_tokens += prompt_tokens
            self.completion_tokens += completion_tokens
            self.system_fingerprint = response.llm_output.get("system_fingerprint", "")
            self.successful_requests += 1

    def __copy__(self) -> "OpenAICallbackHandler":
        """Return a copy of the callback handler."""
        return self

    def __deepcopy__(self, memo: Any) -> "OpenAICallbackHandler":
        """Return a deep copy of the callback handler."""
        return self
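
A usage sketch for the handler above (hypothetical wiring; assumes langchain_openai is installed and the handler is passed via the model's callbacks argument):

from langchain_openai import ChatOpenAI

handler = OpenAICallbackHandler()
llm = ChatOpenAI(model="gpt-3.5-turbo", callbacks=[handler])
llm.invoke("Hello")
print(handler)  # prints the totals plus the captured system_fingerprint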
