Predict token use before workflow #2343
eugeneboms
started this conversation in
Ideas
Proposal
Consider a feature that estimates the cost of an HVE agent, prompt, or workflow invocation before it runs.
Justification
Listening to customer chat at the 06/23 HVE Hackathon, I heard the same concern three or four times: if I run this, how many tokens will it burn? What will it cost me?
This is a very reasonable question. Some customers have token limits on their subscriptions. Some models are expensive, and invoking complicated workflows on them can cost a fortune. We ask for a cost estimate before a car repair; why can't we do the same before an AI invocation?
Related: #1305
Challenges
Proposed Approach
Fortunately, we do not need a precise solution. People mostly care about avoiding catastrophic expenses, not predicting the exact figure. A heuristic whose estimate lands within a factor of 2 of the actual cost 95% of the time should likely be enough to meet their need.
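To make that acceptance bar concrete, here is a minimal sketch of how it could be checked against recorded runs. The function names and the sample numbers are hypothetical, and "±2x" is read here as "the estimate is between half and double the observed cost", which is my assumption about the intended meaning.

```python
def within_factor_rate(predicted, actual, factor=2.0):
    """Fraction of estimates that fall within `factor` of the observed cost.

    An estimate p counts as a hit when actual/factor <= p <= actual*factor.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(
        1 for p, a in zip(predicted, actual)
        if a / factor <= p <= a * factor
    )
    return hits / len(predicted)


def meets_target(predicted, actual, factor=2.0, target=0.95):
    """True when the hit rate reaches the target (95% by default)."""
    return within_factor_rate(predicted, actual, factor) >= target


# Illustrative numbers only, not measurements from real HVE runs.
estimated_tokens = [100, 250, 900, 50]
observed_tokens = [150, 200, 400, 60]
rate = within_factor_rate(estimated_tokens, observed_tokens)
```

With these made-up numbers, three of the four estimates are within a factor of 2, so the heuristic would not yet meet the 95% bar.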
Such a heuristic could hopefully be built as a classic ML model using a small number of relatively simple features: the agent or the prompt involved, its cyclomatic complexity, the customer’s intent, the size of the codebase and relevant data, the number of bullet points in the created memory file, the expected structure of the output, and, probably, the model’s token cost.
We can generate training data for the heuristic by running an agent that invokes different HVE agents, records the relevant features, and collects the observed costs.
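As a rough illustration of the loop described above (record features, collect observed costs, fit a model), here is a deliberately tiny sketch. It uses a single feature, codebase size, and a log-log least-squares fit instead of a full ML model. The `Run` record, its field names, and the sample values are all hypothetical; a real version would include the other features listed above.

```python
import math
from dataclasses import dataclass


@dataclass
class Run:
    """One recorded invocation (hypothetical schema)."""
    agent: str            # which agent or prompt was invoked
    codebase_kloc: float  # codebase size in thousands of lines
    tokens_used: int      # observed token cost of the run


def fit_log_linear(runs):
    """Least-squares fit of log(tokens) = intercept + slope * log(kloc)."""
    xs = [math.log(r.codebase_kloc) for r in runs]
    ys = [math.log(r.tokens_used) for r in runs]
    n = len(runs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need runs with at least two distinct codebase sizes")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict_tokens(model, codebase_kloc):
    """Estimate token cost for a codebase of the given size."""
    slope, intercept = model
    return math.exp(intercept + slope * math.log(codebase_kloc))


# Synthetic training data for illustration, not real measurements.
runs = [
    Run("example-agent", 1.0, 1_000),
    Run("example-agent", 10.0, 10_000),
    Run("example-agent", 100.0, 100_000),
]
model = fit_log_linear(runs)
estimate = predict_tokens(model, 50.0)
```

Fitting in log space matches the "within a factor of 2" goal, since a fixed error in log(tokens) corresponds to a fixed multiplicative error in tokens.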
Current State
Mostly an idea. I did test something similar for individual prompts, though, so it is solvable at least in simple cases with a well-defined response structure.