Predict token use before workflow #2343

eugeneboms · 2026-07-01T17:56:43Z

eugeneboms
Jul 1, 2026
Collaborator

Proposal

Consider a feature that estimates the cost of an HVE agent, prompt, or workflow invocation before it runs.

Justification

Listening to customer chat at the 06/23 HVE Hackathon, I heard the same concern probably 3-4 times: if I run this, how many tokens will it burn? What will it cost me?

This is a very reasonable question to ask. Some customers have token limits on their subscriptions. Some models are expensive, and invoking complicated workflows on them can cost a fortune. We ask for a cost estimate before a car repair, so why can't we do the same before an AI invocation?

Related: #1305

Challenges

Many flows depend on customer responses. Some end after just one Q&A; others can become 30-turn conversations. We can still try to provide per-turn estimates, or at least warn users when a turn is likely to be particularly expensive.
The second challenge is more interesting. Measuring the cost of an arbitrary piece of work without running it is analogous to solving the Halting Problem, which cannot be done algorithmically in the general case. Therefore, whatever we build will be a heuristic at best.
Converting token costs into dollar costs may not be trivial, as token pricing can depend on the details of the specific group subscription.
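To make the per-turn idea concrete, here is a minimal sketch of a per-turn warning combined with a token-to-dollar conversion. Everything in it is an assumption for illustration: the names `Pricing`, `TurnEstimate`, `estimate_usd`, and `turn_warning` are hypothetical, and the prices are placeholders rather than real subscription rates. Pricing is passed in as data precisely because it may differ per group subscription.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Pricing:
    """Hypothetical per-subscription prices, in dollars per one million tokens."""
    input_per_mtok: float
    output_per_mtok: float


@dataclass(frozen=True)
class TurnEstimate:
    """Predicted token use for a single conversation turn."""
    input_tokens: int
    output_tokens: int


def estimate_usd(est: TurnEstimate, pricing: Pricing) -> float:
    """Convert a token estimate into dollars under the given pricing."""
    total = (est.input_tokens * pricing.input_per_mtok
             + est.output_tokens * pricing.output_per_mtok)
    return total / 1_000_000


def turn_warning(est: TurnEstimate, pricing: Pricing,
                 budget_usd: float) -> Optional[str]:
    """Return a warning string if the turn is predicted to exceed the budget."""
    cost = estimate_usd(est, pricing)
    if cost > budget_usd:
        return (f"Warning: this turn may cost about ${cost:.2f} "
                f"(per-turn budget: ${budget_usd:.2f}).")
    return None  # predicted cost is within budget, so stay quiet
```

A caller would build a `TurnEstimate` from whatever heuristic is available and show the warning only when `turn_warning` returns a string.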

Proposed Approach

Fortunately, we do not need a precise solution. People mostly care about avoiding catastrophic expenses, not predicting the exact figure. A heuristic that lands within a factor of two of the actual cost (a ±2x error margin) 95% of the time should likely be enough to meet their need.
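The accuracy target can be checked mechanically once predictions and observed costs are available. The sketch below reads the ±2x margin as "the prediction is between half and double the actual cost", which is my interpretation, not something the proposal pins down. The function names `within_factor` and `hit_rate` are hypothetical.

```python
from typing import Iterable, Tuple


def within_factor(predicted: float, actual: float, factor: float = 2.0) -> bool:
    """True if predicted lies in [actual / factor, actual * factor]."""
    return actual / factor <= predicted <= actual * factor


def hit_rate(pairs: Iterable[Tuple[float, float]], factor: float = 2.0) -> float:
    """Fraction of (predicted, actual) pairs that fall within the factor."""
    pairs = list(pairs)
    hits = sum(within_factor(p, a, factor) for p, a in pairs)
    return hits / len(pairs)


def meets_target(pairs: Iterable[Tuple[float, float]],
                 factor: float = 2.0, required: float = 0.95) -> bool:
    """True if the heuristic hits the factor margin often enough."""
    return hit_rate(pairs, factor) >= required
```

Evaluating on held-out runs rather than on the data used to build the heuristic would give a more honest hit rate.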

Such a heuristic could hopefully be built as a classic ML model using a small number of relatively simple features: the agent or prompt involved, its cyclomatic complexity, the customer’s intent, the size of the codebase and relevant data, the number of bullet points in the created memory file, the expected structure of the output, and probably the model’s token cost.
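As a rough illustration of what the simplest such model might look like, the sketch below fits log(tokens) against log(codebase size) with ordinary least squares. This is a deliberately reduced stand-in, not the proposed model: it uses only one of the listed features, and the names `InvocationFeatures`, `fit_log_linear`, and `predict_tokens` are hypothetical. Fitting in log space is a design choice that makes errors multiplicative, which lines up with a "within a factor of N" accuracy target.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class InvocationFeatures:
    agent: str           # agent or prompt involved (not used by the fit below)
    codebase_bytes: int  # size of the codebase and relevant data
    memory_bullets: int  # bullet points in the memory file (not used below)


def fit_log_linear(
    samples: List[Tuple[InvocationFeatures, int]]
) -> Tuple[float, float]:
    """Least-squares fit of log(tokens) = a + b * log(codebase_bytes).

    Needs at least two samples with different codebase sizes.
    """
    xs = [math.log(f.codebase_bytes) for f, _ in samples]
    ys = [math.log(tokens) for _, tokens in samples]
    n = len(samples)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict_tokens(a: float, b: float, features: InvocationFeatures) -> float:
    """Predict token use for one invocation from the fitted coefficients."""
    return math.exp(a + b * math.log(features.codebase_bytes))
```

The remaining features would enter as additional regressors (categorical ones such as the agent name would need an encoding), at which point an off-the-shelf regression library is likely a better fit than hand-written least squares.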

We can generate training data for the heuristic by running an agent that invokes different HVE agents, records the relevant features, and collects the observed costs.
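A data-collection harness along these lines could be quite small. In the sketch below, `run` is a hypothetical callable standing in for "invoke an HVE agent and return the observed token count"; how that count is actually obtained is outside the sketch. The record layout and the names `record_run` and `load_runs` are also assumptions.

```python
import json
from typing import Callable, Dict, List, TextIO


def record_run(features: Dict[str, object],
               run: Callable[[], int],
               sink: TextIO) -> int:
    """Execute one run and append its features and observed cost as a JSON line."""
    observed_tokens = run()  # hypothetical: invokes the agent, returns tokens used
    record = {"features": features, "observed_tokens": observed_tokens}
    sink.write(json.dumps(record) + "\n")
    return observed_tokens


def load_runs(source: TextIO) -> List[dict]:
    """Read back all recorded runs, skipping blank lines."""
    return [json.loads(line) for line in source if line.strip()]
```

One JSON object per line keeps the log append-only, so interrupted collection sessions still leave usable training data behind.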

Current State

Mostly an idea. I did test something similar for individual prompts, though, so it is solvable at least in simple cases with a well-defined response structure.

Replies: 0 comments
