---
# Logging LLM Requests
## Get started by logging your LLM completion requests with HoneyHive.
This notebook is a companion to this [guide](https://docs.honeyhive.ai/quickstart/completions) <a target="_blank" href="https://colab.research.google.com/github/honeyhiveai/honeyhive-cookbook/blob/master/docs/notebooks/quickstart_completions_open_source.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>


### Introduction

Our Python SDK allows you to log individual LLM requests as well as full pipeline traces. This lets you monitor your LLM's performance and log user feedback and ground truth labels associated with each request.

For an in-depth overview of how our logging data is structured, please see our [Logging Overview](/logging-overview) page.

### Get API key

After signing up on the app, you can find your API key in the [Settings](https://app.honeyhive.ai/settings/account) page under Account.

### Install the SDK

We currently support a native Python SDK. For other languages, we encourage using HTTP request libraries to send requests.



In [1]:
!pip install honeyhive -q





### Capture relevant details on your completion requests

This method allows you to log any arbitrary LLM requests on the client-side **without proxying your requests via HoneyHive servers**. Using this method, evaluation metrics such as custom metrics and AI feedback functions will be automatically computed based on the metrics you've defined and enabled in the [Metrics](https://app.honeyhive.ai/metrics) page. Learn more about defining evaluation metrics [here](/evaluation).

Let's start by running a Hugging Face model completion request and calculating a basic metric like latency.

<Note>We're using `OpenAI`, `Anthropic` and `Hugging Face` models in this guide to simply demonstrate how to log requests with HoneyHive. You can alternatively use the SDK to log any arbitrary model completion requests across other model providers such as `Cohere`, `AI21 Labs`, or your own custom, self-hosted models.</Note>
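The latency measurement described above can be isolated from the model call itself. The sketch below is a minimal, provider-agnostic helper built on `time.perf_counter`; the names `timed_call` and `fake_completion` are hypothetical and exist only for illustration, with `fake_completion` standing in for a real model request.

```python
import time
from typing import Any, Callable, Tuple


def timed_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn and return (result, latency in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    end = time.perf_counter()
    return result, (end - start) * 1000


def fake_completion(prompt: str) -> str:
    """Hypothetical stand-in for a real model call."""
    time.sleep(0.01)
    return f"Echo: {prompt}"


generation, request_latency = timed_call(fake_completion, "Write me an email.")
print(f"latency: {request_latency:.1f} ms")
```

Wrapping the call this way keeps the timing logic in one place, so the same helper can be reused whichever model provider sits behind the function being timed.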



In [1]:
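import honeyhive
from honeyhive.sdk.utils import fill_template
from transformers import AutoModelForCausalLM, AutoTokenizer  # swap in the model class your checkpoint needs
import time

honeyhive.api_key = "HONEYHIVE_API_KEY"

hf_model_path = "dummy/model"  # placeholder: replace with a real Hugging Face model path

tokenizer = AutoTokenizer.from_pretrained(hf_model_path)
# assumes a causal (decoder-only) language model
model = AutoModelForCausalLM.from_pretrained(hf_model_path)

# turn markers for the prompt (assumed values; adjust to your model's prompt format)
HUMAN_PROMPT = "\n\nHuman:"
AI_PROMPT = "\n\nAssistant:"

USER_TEMPLATE = f"{HUMAN_PROMPT} Write me an email on {{topic}} in a {{tone}} tone.{AI_PROMPT}"
user_inputs = {
    "topic": "AI Services",
    "tone": "Friendly"
}
# fills the template, e.g. "Write me an email on AI Services in a Friendly tone."
user_message = fill_template(USER_TEMPLATE, user_inputs)

start = time.perf_counter()

# tokenize the inputs (returns PyTorch tensors, so torch must be installed)
model_inputs = tokenizer(user_message, return_tensors="pt")

# generate model completion
generated_ids = model.generate(**model_inputs)

# decode only the newly generated tokens; for causal models the output starts with the prompt tokens
prompt_length = model_inputs["input_ids"].shape[1]
decoded_output = tokenizer.batch_decode(generated_ids[:, prompt_length:], skip_special_tokens=True)[0]

end = time.perf_counter()

request_latency = (end - start) * 1000  # milliseconds
generation = decoded_output
completion_length = generated_ids.shape[1] - prompt_length
token_usage = {
    "completion_tokens": completion_length,
    "prompt_tokens": prompt_length,
    "total_tokens": completion_length + prompt_length
}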


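Token accounting is easy to get wrong with Hugging Face models, because `generate` on a decoder-only model returns the prompt tokens followed by the newly generated ones. The sketch below shows one way to build the usage dictionary under that assumption, using plain lists of token ids so it runs without a model. The helper name `count_token_usage` is hypothetical, and encoder-decoder models would need different handling since their output does not include the prompt.

```python
from typing import Dict, Sequence


def count_token_usage(prompt_ids: Sequence[int], output_ids: Sequence[int]) -> Dict[str, int]:
    """Build a usage dict, assuming output_ids = prompt tokens followed by new tokens."""
    prompt_tokens = len(prompt_ids)
    completion_tokens = len(output_ids) - prompt_tokens
    if completion_tokens < 0:
        raise ValueError("output_ids is shorter than prompt_ids")
    return {
        "completion_tokens": completion_tokens,
        "prompt_tokens": prompt_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
    }


# Toy token ids for illustration: a 3-token prompt and 2 generated tokens.
prompt_ids = [101, 7592, 2088]
output_ids = prompt_ids + [2023, 102]
token_usage = count_token_usage(prompt_ids, output_ids)
print(token_usage)
```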



### Log your completion request

Now that you've run the request, let's try logging the request and some user metadata in HoneyHive. 

<Note>Adding a `session_id` key to the `metadata` field enables session tracking across completions.</Note>
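To group several completions into one session, the same `session_id` value has to be reused in the metadata of each logged request. The sketch below shows one way to do that, assuming any unique string is acceptable as a session id (the guide does not specify a format). The helper `session_metadata` and the `step` key are hypothetical and used only for illustration.

```python
import uuid
from typing import Any, Dict, Optional


def session_metadata(session_id: str, extra: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a metadata dict that carries the shared session id."""
    metadata = dict(extra or {})
    metadata["session_id"] = session_id
    return metadata


# Generate the id once per user session, then reuse it for every completion.
session_id = str(uuid.uuid4())
first = session_metadata(session_id, {"step": "draft"})
second = session_metadata(session_id, {"step": "revision"})
print(first["session_id"] == second["session_id"])
```

Each resulting dictionary can then be passed as the `metadata` argument when logging the corresponding completion.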



In [1]:

import uuid

gpu_cost_per_ms = 0.0000006  # illustrative GPU cost per millisecond of latency
session_id = str(uuid.uuid4())  # illustrative value; use your own identifier to group related completions
response = honeyhive.generations.log(
    project="Sandbox - Email Writer",
    source="staging",
    model=hf_model_path,
    hyperparameters={
        # any additional arguments used when generating model completion
    },
    prompt_template=USER_TEMPLATE,
    inputs=user_inputs,
    generation=generation,
    metadata={
        "session_id": session_id  # Optionally specify a session id to track related completions
    },
    usage=token_usage,
    latency=request_latency,
    cost=request_latency * gpu_cost_per_ms,
    user_properties={
        "user_device": "Macbook Pro",
        "user_Id": "92739527492",
        "user_country": "United States",
        "user_subscriptiontier": "Enterprise",
        "user_tenantID": "Acme Inc."
    }
)




<br/>
<Note>Using this method, you will not be able to use our Prompt CI/CD capabilities within the platform or calculate cost and latency metrics automatically. To update prompts, you will need to manually update the prompt, model provider and hyperparameter settings in your codebase when deploying new variants to production.</Note>
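Because cost is not calculated automatically with this method, the value passed as `cost` has to be derived in your own code. The sketch below reproduces the latency-based estimate used in the logging cell, with the same illustrative rate of `0.0000006` per millisecond. The function name `estimate_gpu_cost` is hypothetical, and a flat per-millisecond rate is itself a simplifying assumption.

```python
def estimate_gpu_cost(latency_ms: float, cost_per_ms: float) -> float:
    """Estimate request cost as latency multiplied by a flat per-millisecond rate."""
    if latency_ms < 0 or cost_per_ms < 0:
        raise ValueError("latency and rate must be non-negative")
    return latency_ms * cost_per_ms


gpu_cost_per_ms = 0.0000006  # illustrative rate taken from the logging cell
cost = estimate_gpu_cost(2500.0, gpu_cost_per_ms)
print(f"{cost:.4f}")  # prints 0.0015
```

With these numbers, a request that takes 2.5 seconds is logged with an estimated cost of 0.0015 in whatever currency unit the rate is expressed in.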

### Log user feedback and ground truth labels

Now that you've logged a request in HoneyHive, let's try logging user feedback and ground truth labels associated with that completion. 

Using the `generation_id` that is returned, you can send arbitrary feedback to HoneyHive using the `feedback` endpoint.
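Feedback flags are usually derived from what the user did with the generation rather than hard-coded. The sketch below shows one way to build such a payload, assuming `provided` means feedback was given, `accepted` means the user kept the output, and `edited` means the final text differs from the generated text. These interpretations and the helper name `build_feedback` are assumptions for illustration, not definitions taken from HoneyHive.

```python
from typing import Dict


def build_feedback(generated_text: str, final_text: str, accepted: bool) -> Dict[str, bool]:
    """Derive feedback flags from the user's interaction with a generation."""
    return {
        "provided": True,
        "accepted": accepted,
        "edited": final_text != generated_text,
    }


# The user rejected the draft as-is and rewrote the greeting.
feedback_json = build_feedback(
    generated_text="Hi team, here is an update on AI Services.",
    final_text="Hello team, here is an update on AI Services.",
    accepted=False,
)
print(feedback_json)
```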



In [1]:
from honeyhive.sdk.feedback import generation_feedback
generation_feedback(
    project="Sandbox - Email Writer",
    generation_id=response.generation_id,
    ground_truth="INSERT_GROUND_TRUTH_LABEL",
    feedback_json={
        "provided": True,
        "accepted": False,
        "edited": True
    }
)





### [Optional] Proxy requests via HoneyHive

Alternatively, you can call the currently deployed prompt-model configuration within a specified project without specifying all the parameters yourself. With this method, we automatically route your requests to the prompt-model configuration that is currently active within the platform and capture some basic metrics like cost and latency.

More documentation can be found on our [saved prompt generations API page](/api-reference/generations/post_saved).



In [1]:
import honeyhive

honeyhive.api_key = "HONEYHIVE_API_KEY"
honeyhive.openai_api_key = "OPENAI_API_KEY"

response = honeyhive.generations.generate(
    project="Sandbox - Email Writer",
    source="staging",
    input={
        "topic": "Model evaluation for companies using GPT-4",
        "tone": "friendly"
    },
)

