Create an LLM usage debug plugin #1413

@waldekmastykarz

Description

Create a plugin that intercepts LLM requests and responses and writes LLM usage information to a file. This information helps to understand token usage over time and how it might lead to throttling.

  • triggered only for LLM requests (use the same detection method as in the OpenAITelemetryPlugin)
  • for each intercepted LLM response, gather the following information (see the record sketch after this list):
    • time (response.headers.date)
    • status (http status code)
    • retry_after (response.headers.retry-after)
    • policy (response.headers.policy-id, useful to understand why throttling occurred)
    • prompt_tokens (response.body.usage.prompt_tokens)
    • completion_tokens (response.body.usage.completion_tokens)
    • cached_tokens (response.body.usage.prompt_tokens_details.cached_tokens)
    • total_tokens (response.body.usage.total_tokens)
    • remaining_tokens (response.headers.x-ratelimit-remaining-tokens)
    • remaining_requests (response.headers.x-ratelimit-remaining-requests)
  • each time the plugin intercepts a response, it gathers the information and appends it to a file named devproxy-llm-usage.csv. On startup, the plugin checks if a file with that name already exists; if it does, it appends the current date and time to the name until it finds one that's unique. The plugin stores the resolved file name and reuses it while Dev Proxy is running (see the file-handling sketch below)
  • if the file doesn't exist, the plugin creates it, writing the CSV headers followed by the first line of information
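
A minimal sketch of what a single row could look like, assuming the fields above map onto a small C# record. `LlmUsageRecord`, its members, and the column order are hypothetical names for illustration, not part of the Dev Proxy plugin API; string fields are quoted because the `date` and `retry-after` headers can contain commas.

```csharp
// Hypothetical sketch only -- LlmUsageRecord is not an existing Dev Proxy type.
// One CSV row per intercepted LLM response, with the fields listed in this issue.
public sealed record LlmUsageRecord(
    string Time,               // response.headers.date
    int Status,                // HTTP status code
    string? RetryAfter,        // response.headers.retry-after
    string? Policy,            // response.headers.policy-id
    long PromptTokens,         // response.body.usage.prompt_tokens
    long CompletionTokens,     // response.body.usage.completion_tokens
    long CachedTokens,         // response.body.usage.prompt_tokens_details.cached_tokens
    long TotalTokens,          // response.body.usage.total_tokens
    string? RemainingTokens,   // response.headers.x-ratelimit-remaining-tokens
    string? RemainingRequests) // response.headers.x-ratelimit-remaining-requests
{
    public const string CsvHeader =
        "time,status,retry_after,policy,prompt_tokens,completion_tokens," +
        "cached_tokens,total_tokens,remaining_tokens,remaining_requests";

    public string ToCsvLine() =>
        string.Join(",",
            Quote(Time), Status.ToString(), Quote(RetryAfter), Quote(Policy),
            PromptTokens.ToString(), CompletionTokens.ToString(),
            CachedTokens.ToString(), TotalTokens.ToString(),
            Quote(RemainingTokens), Quote(RemainingRequests));

    // Quote string fields: the date and retry-after headers can contain commas.
    private static string Quote(string? value) =>
        value is null ? "" : $"\"{value.Replace("\"", "\"\"")}\"";
}
```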
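
And a sketch of the file handling under the same assumptions: resolve the file name once on startup, then append one row per intercepted response, writing the headers only when the file is first created. `UsageFileWriter` is likewise a hypothetical helper, not an existing Dev Proxy type.

```csharp
using System;
using System.IO;

// Hypothetical sketch only -- UsageFileWriter is not an existing Dev Proxy type.
public static class UsageFileWriter
{
    // Called once on startup; the resolved name is reused while Dev Proxy runs.
    public static string ResolveFileName(string baseName = "devproxy-llm-usage.csv")
    {
        if (!File.Exists(baseName))
        {
            return baseName;
        }

        // The base name is taken: append the current date and time
        // until the name is unique.
        string candidate;
        do
        {
            var stamp = DateTime.Now.ToString("yyyyMMdd-HHmmss-fff");
            candidate = $"{Path.GetFileNameWithoutExtension(baseName)}-{stamp}" +
                        Path.GetExtension(baseName);
        } while (File.Exists(candidate));
        return candidate;
    }

    // Called for every intercepted response; creates the file with
    // headers on first write, appends a single row afterwards.
    public static void Append(string fileName, LlmUsageRecord record)
    {
        if (!File.Exists(fileName))
        {
            File.WriteAllLines(fileName,
                new[] { LlmUsageRecord.CsvHeader, record.ToCsvLine() });
        }
        else
        {
            File.AppendAllLines(fileName, new[] { record.ToCsvLine() });
        }
    }
}
```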
