
Records actual token counts for OpenAI and Anthropic responses #6339

Open
wants to merge 2 commits into base: master

Conversation

@lonelycode (Member) commented Jun 11, 2024

User description

This response middleware extracts token usage from OpenAI and Anthropic API responses and provides the counts as context entries to be recorded in Analytics.

Description

  • Adds res_handler_llm_reporting.go
  • Middleware will:
    • Search for openai or anthropic tags in the Spec to activate
    • If present, decode responses from the API into generic holders
    • Extract token count information from the response for reporting
    • Create three context entries, LLMResponseReporterInputTokens, LLMResponseReporterOutputTokens, and LLMResponseReporterTotalTokens, which hold the data for further processing

Motivation and Context

Similarly to the other middleware: instead of providing data pre-proxy, this middleware provides actuals from the vendor. These could be used for quota work, while the prior middleware could be used for rate limiting or flagging usage patterns.

How This Has Been Tested

  • Manual testing against the Anthropic and OpenAI APIs

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Refactoring or add test (improvements in base code or adds test coverage to functionality)

Checklist

  • I ensured that the documentation is up to date
  • I explained why this PR updates go.mod in detail with reasoning why it's required
  • I would like a code coverage CI quality gate exception and have explained why

PR Type

Enhancement, Other


Description

  • Added new middleware LLMResponseReporter to handle and report token usage from OpenAI and Anthropic API responses.
  • Introduced constants and context functions to manage LLM response token counts.
  • Updated quickstart app configuration to include LLM response reporting.

Changes walkthrough 📝

Relevant files

Enhancement

  • ctx/ctx.go: Add constants for LLM response token reporting
    • Added constants for LLM response token reporting. (+4/-0)
  • gateway/api.go: Add functions to retrieve LLM response token counts
    • Added functions to get LLM response token counts from the request context. (+21/-0)
  • gateway/middleware.go: Integrate LLM response reporter middleware
    • Added case for llm_response_reporter in response processor switch. (+3/-0)
  • gateway/res_handler_llm_reporting.go: Implement LLMResponseReporter middleware for token usage reporting
    • Created new LLMResponseReporter middleware.
    • Implemented response handling for OpenAI and Anthropic APIs.
    • Extracted and set token usage information in the request context. (+119/-0)

Configuration changes

  • apps/quickstart.json: Update quickstart app configuration for LLM response reporting
    • Added tags for LLM and Anthropic.
    • Configured response processor for LLM response reporting. (+9/-2)


    Contributor

    API Changes

    --- prev.txt	2024-06-11 05:43:44.281717256 +0000
    +++ current.txt	2024-06-11 05:43:41.337705087 +0000
    @@ -6860,6 +6860,10 @@
     	GraphQLIsWebSocketUpgrade
     	OASOperation
     
    +	LLMResponseReporterInputTokens
    +	LLMResponseReporterOutputTokens
    +	LLMResponseReporterTotalTokens
    +
     	// CacheOptions holds cache options required for cache writer middleware.
     	CacheOptions
     	OASDefinition
    @@ -8790,6 +8794,26 @@
     
     func (l *LDAPStorageHandler) SetRollingWindow(keyName string, per int64, val string, pipeline bool) (int, []interface{})
     
    +type LLMResponseReporter struct {
    +	BaseTykResponseHandler
    +	// Has unexported fields.
    +}
    +
    +func (h *LLMResponseReporter) Base() *BaseTykResponseHandler
    +
    +func (h *LLMResponseReporter) Enabled() bool
    +
    +func (h *LLMResponseReporter) HandleError(rw http.ResponseWriter, req *http.Request)
    +
    +func (h *LLMResponseReporter) HandleResponse(rw http.ResponseWriter, res *http.Response, req *http.Request, ses *user.SessionState) error
    +
    +func (h *LLMResponseReporter) Init(c interface{}, spec *APISpec) error
    +
    +func (*LLMResponseReporter) Name() string
    +
    +type LLMResponseReporterOptions struct {
    +}
    +
     type LogMessageEventHandler struct {
     	Gw *Gateway `json:"-"`
     	// Has unexported fields.

    Contributor

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review [1-5]

    3

    🧪 Relevant tests

    No

    🔒 Security concerns

    No

    ⚡ Key issues to review

    Possible Bug:
    Typo in the middleware name "llm_reponse_reporter" should be "llm_response_reporter".

    Error Handling:
    The HandleResponse method in res_handler_llm_reporting.go does not handle the case where the response body cannot be unmarshalled into the expected structures (openAIResponse or anthropicResponse). This could lead to unhandled errors if the response format is not as expected.

    Performance Concern:
    The method HandleResponse reads the entire response body into memory which might not be efficient for large responses. Consider streaming or chunk processing if applicable.

    Contributor

    PR Code Suggestions ✨

Category | Suggestion | Score
    Best practice
    Use the comma-ok idiom for type assertion to prevent potential runtime panics

    The type assertion v.(int) can cause a panic if v is not actually of type int. It's safer
    to use the comma-ok idiom to assert the type safely.

    gateway/api.go [3109-3110]

    -if v := r.Context().Value(ctx.LLMResponseReporterInputTokens); v != nil {
    -    return v.(int)
    +if v, ok := r.Context().Value(ctx.LLMResponseReporterInputTokens).(int); ok {
    +    return v
     }
     
    Suggestion importance[1-10]: 10

    Why: Using the comma-ok idiom is a best practice in Go to prevent potential runtime panics, making the code more robust and reliable.

    10
    Possible bug
    Correct the typo in the middleware handler case label

    There appears to be a typo in the case label for the middleware handler. It should be
    "llm_response_reporter" instead of "llm_reponse_reporter" to maintain consistency and
    avoid potential bugs due to misspelling.

    gateway/middleware.go [1025-1026]

    -case "llm_reponse_reporter":
    +case "llm_response_reporter":
         return &LLMResponseReporter{BaseTykResponseHandler: baseHandler}
     
    Suggestion importance[1-10]: 9

    Why: Correcting the typo is crucial for the middleware handler to function correctly. Misspellings can lead to runtime errors and unexpected behavior, making this a high-priority fix.

    9
    Consistency
    Update the handler name in the JSON configuration to match the corrected name in the Go code

    The handler name "llm_reponse_reporter" in the JSON configuration should match the
    corrected handler name in the Go code. Assuming the Go code is corrected to
    "llm_response_reporter", update this JSON configuration accordingly.

    apps/quickstart.json [37]

    -"name": "llm_reponse_reporter"
    +"name": "llm_response_reporter"
     
    Suggestion importance[1-10]: 8

    Why: Ensuring consistency between the Go code and JSON configuration is important for the correct functioning of the system. This change aligns the configuration with the corrected handler name.

    8
    Enhancement
    Implement error handling or logging in the HandleError method

    The HandleError method is implemented but does nothing. It's a good practice to at least
    log the error or handle it in some meaningful way to aid in debugging and maintain
    robustness.

    gateway/res_handler_llm_reporting.go [73-74]

     func (h *LLMResponseReporter) HandleError(rw http.ResponseWriter, req *http.Request) {
    +    log.Println("Error handling response in LLMResponseReporter")
     }
     
    Suggestion importance[1-10]: 7

    Why: Adding error logging or handling in the HandleError method improves debugging and robustness. While not critical, it enhances maintainability and provides useful information during failures.

    7


    sonarcloud bot commented Jun 11, 2024

    Quality Gate failed

    Failed conditions
    0.0% Coverage on New Code (required ≥ 80%)
    C Reliability Rating on New Code (required ≥ A)

    See analysis details on SonarCloud

