api: Reduce amount of updates done by DB metrics logic #1179
Conversation
Force-pushed from 624de1b to 637b9ef
Codecov Report

```
@@             Coverage Diff              @@
##              master      #1179     +/- ##
============================================
+ Coverage   50.39654%  50.51423%  +0.11768%
============================================
  Files             66         66
  Lines           4161       4181        +20
  Branches         736        740         +4
============================================
+ Hits            2097       2112        +15
- Misses          1816       1820         +4
- Partials         248        249         +1
```
Continue to review full report at Codecov.
First steps towards creating the flush logic on shutdown.
Will be called on clean up.
Force-pushed from a51b716 to 94d59cc
Force-pushed from 42e06f6 to c63a23e
These update some other properties in the stream, like `isActive`, etc., so we don't want to change those. Just make sure that we do flush the updates before deleting the entries.
Apparently the primary key is not really suitable for that; just changing it makes some queries much faster. This is unrelated to the change here, I'm just including it because I'm already touching the context.
Avoids a round-trip and a query in the DB.
```ts
q.append(sql`) WHERE id = ${id}`);
q.append(`)`);
if (set) {
  q.append(sql` || ${JSON.stringify(set)}`);
```
The thinking here is just to do as much as possible in one query?
Yeah exactly! This small change also cuts the number of transactions/updates in half. It was somewhat necessary, since we didn't get as much of a reduction from buffering the updates alone: I expected 30x in the best case, but it was more like ~8x instead. With this we get ~16x fewer transactions, which I think will be enough for now.
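A rough sketch of the "as much as possible in one query" idea: the optional `set` fields get merged into the same `UPDATE` that applies the counter, using JSONB concatenation (`||`), instead of issuing a second `UPDATE`. The query builder below is a toy stand-in for illustration only, not the repo's actual `sql` helper, and the column/table names are assumptions.

```typescript
// Build one parameterized UPDATE that both bumps a counter stored in the
// JSONB `data` column and (optionally) merges extra `set` fields into it.
type Query = { text: string; values: unknown[] };

function buildUpdate(
  id: string,
  addedSegments: number,
  set?: Record<string, unknown>
): Query {
  const values: unknown[] = [addedSegments];
  // Increment the buffered segment count inside the JSONB column.
  let data =
    `jsonb_set(data, '{sourceSegments}', ` +
    `to_jsonb((data->>'sourceSegments')::int + $1))`;
  if (set) {
    // `||` merges the buffered `set` fields into the same JSONB value,
    // so add + set happen in a single statement and round-trip.
    values.push(JSON.stringify(set));
    data += ` || $${values.length}`;
  }
  values.push(id);
  return {
    text: `UPDATE stream SET data = ${data} WHERE id = $${values.length}`,
    values,
  };
}
```

With a `set`, this produces one statement ending in `... || $2 WHERE id = $3`; without one, the `||` clause is simply omitted.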
This reverts commit 0d6d759.
```ts
@@ -15,7 +15,7 @@ import {
  FieldSpec,
} from "./types";

const DEFAULT_SORT = "id ASC";
```
FTR: This was not really worth it. It (luckily) broke tests, and when I went to do some further testing I noticed it's not really making queries faster (sometimes they're worse, because not all objects have an indexed `data->>'id'`). Let's stick to just the `id` column.
LGTM
What does this pull request do? Explain your changes. (required)
This is to reduce the amount of `UPDATE` queries that we do in our database. These queries are performed mostly for some "metric" functionalities that we created on top of it, mainly:

- ~100k/hour: The logic to track the `lastSeen` timestamp of both users and API keys (but especially API keys).
- ~600k/hour: The logic to track the amount of transcoded segments that have been streamed for a given Stream object.

The goal of this pull request is to fix both of those update logics by creating some "buffer" in memory and then combining the updates done on the database.

The expectation is that this will also fix a couple of issues we've been having with our databases regarding replication lag every time a VACUUM operation is run on the `streams` table. By doing 30x fewer updates on it we'll hopefully be able to tune it better and avoid disruptions in replication.
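The buffering idea described above can be sketched roughly as follows: accumulate per-stream segment counts in memory and flush the combined totals on a timer, so many observations become a single database update. The `MetricsBuffer` name and `flushFn` callback are illustrative assumptions, not the actual implementation in this PR.

```typescript
// Callback that receives the combined per-stream totals to persist.
type FlushFn = (updates: Map<string, number>) => Promise<void>;

class MetricsBuffer {
  private pending = new Map<string, number>();
  private timer: ReturnType<typeof setInterval>;

  constructor(private flushFn: FlushFn, intervalMs = 60_000) {
    // Flush the accumulated totals periodically (every 60s by default).
    this.timer = setInterval(() => void this.flush(), intervalMs);
  }

  // Called on every transcoded segment; only touches memory.
  add(streamId: string, segments: number) {
    this.pending.set(streamId, (this.pending.get(streamId) ?? 0) + segments);
  }

  // Sends one combined update per stream, then resets the buffer.
  async flush() {
    if (this.pending.size === 0) return;
    const batch = this.pending;
    this.pending = new Map();
    await this.flushFn(batch);
  }

  // Stop the timer and flush what's left (e.g. from a shutdown handler).
  async close() {
    clearInterval(this.timer);
    await this.flush();
  }
}
```

The key property is that `add` never touches the database, so the update rate is bounded by the flush interval rather than the observation rate.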
Specific updates (required)

- Make `stream-info-service` not send updates for every transcoded segment. Keep them in memory instead and only flush each record every 60 seconds.
- Make `tracking` not send a transaction on the database on every observation of the API key. Instead, keep the last seen value in memory and flush every 60s as well.
- Flush any pending updates on shutdown (`SIGTERM` handler).
- Combine `add` and `set` into a single query on `stream-info-service`.
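The shutdown flush mentioned above could look roughly like the sketch below: run the pending-buffer flush when the process is asked to terminate, bounded by a timeout so a stuck database can't block termination forever. The `flush` callback and timeout value are assumptions standing in for whatever drains the `lastSeen` / segment-count buffers in the services.

```typescript
// Run the flush, racing it against a timeout so shutdown can't hang.
// Resolves true if the flush completed, false if the timeout won.
async function gracefulShutdown(
  flush: () => Promise<void>,
  timeoutMs = 5000
): Promise<boolean> {
  const timeout = new Promise<boolean>((resolve) =>
    setTimeout(() => resolve(false), timeoutMs)
  );
  const flushed = flush().then(() => true);
  return Promise.race([flushed, timeout]);
}

// A SIGTERM handler would then look roughly like:
//   process.once("SIGTERM", () => {
//     gracefulShutdown(flushAllBuffers).then(() => process.exit(0));
//   });
```

Without this step, any metrics observed since the last timer tick would be lost whenever the service restarts.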
- Check metrics are still updated on the database with these versions running.
Does this pull request close any open issues?
Hopefully fixes https://github.com/livepeer/livepeer-infra/issues/851
Checklist: