Semantic search token usage tracking #63591
e2e tests failed on
| File | Test Name |
|---|---|
| embedding-reproductions.cy.spec.js | (flaky) issue 40660 > static dashboard content shouldn't overflow its container (#40660) |
    constraints:
      nullable: false
- column:
    name: total_tokens
do we not want to store in/out separately? :)
@piranha probably no need as it's an embedding model, so tracking total tokens is enough
So the logic seems fine, but I kind of want confirmation that total is fine and we don't want separate in/out figures.
(is (= 1 (t2/count :model/SemanticSearchTokenTracking)))
(let [{:keys [request_type total_tokens]}
      (t2/select-one :model/SemanticSearchTokenTracking)]
  (is (= :index request_type))
  (is (= 13 total_tokens))))
This is by no means a request to change, but I personally find it easier to read =? stuff, like this:
(is (=? [{:request_type :index :total_tokens 13}]
(t2/select :model/SemanticSearchTokenTracking)))
* Pass type of embedding request
* Initial migration
* Add remaining opts args
* Add SemanticSearchTokenTracking module
* Connect token tracking to ai-service impl
* Add test for token tracking writes.
* Remove prompt_tokens including migration
* Add usage trimmer job
* Trimmer test
* Activate trimmer job
* test
* Record tokens for openai
* Update test
* Use total-tokens
* Comment
* linter
* Add index
* Exclude from copy
Co-authored-by: lbrdnk <lbrdnk@users.noreply.github.com>
Closes https://linear.app/metabase/issue/BOT-410/implement-token-usage-tracking-for-semantic-search
This PR adds tracking of tokens consumed by embeddings API calls initiated from semantic search. The number of tokens for every request is stored in the appdb. The table is trimmed every day, keeping a rolling 2 months of data.
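To make the rolling retention concrete, here is a minimal sketch of how a daily trim could choose which rows to keep. It is not the PR's trimmer job: it works on plain maps instead of the appdb table, and the namespace, the function names, the `:created_at` key, and the `:query` request type are assumptions made for illustration. Only `request_type`, `total_tokens`, and the `:index` value appear in the PR itself.

```clojure
(ns token-tracking-sketch
  (:import (java.time OffsetDateTime ZoneOffset)))

;; Assumed retention window; the PR description only says "rolling 2 months".
(def retention-months 2)

(defn retention-cutoff
  "Returns the oldest timestamp that should still be kept, relative to `now`."
  [^OffsetDateTime now]
  (.minusMonths now retention-months))

(defn trim
  "Drops rows whose :created_at (hypothetical column name) is before the cutoff.
  `rows` is a plain sequence of maps standing in for the tracking table."
  [rows ^OffsetDateTime now]
  (let [cutoff (retention-cutoff now)]
    (remove #(.isBefore ^OffsetDateTime (:created_at %) cutoff) rows)))

;; Usage: one row older than the window, one recent row.
(let [now  (OffsetDateTime/of 2025 9 15 0 0 0 0 ZoneOffset/UTC)
      rows [{:request_type :index :total_tokens 13 :created_at (.minusMonths now 3)}
            {:request_type :query :total_tokens 7  :created_at (.minusDays now 1)}]]
  (trim rows now))
```

A row stamped exactly at the cutoff is kept in this sketch, because `isBefore` is strict. Whether the actual job treats that boundary the same way is not stated in the PR.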
This PR
- adds :model/SemanticSearchTokenTracking,
- updates get-embedding... implementations to take opts,
- passes opts from the callers (upsert-index!, query-index) with the appropriate request :type, which is stored with the token count in the new table,