Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

api: Reduce amount of updates done by DB metrics logic #1179

Merged
merged 16 commits into from
Aug 2, 2022

Conversation

victorges
Copy link
Member

@victorges victorges commented Jul 30, 2022

What does this pull request do? Explain your changes. (required)

This to reduce the amount of UPDATE queries that we do in our database. These queries
are performed mostly for some "metric" functionalities that we created on top of it, mainly:

  • 100k/hour: The logic to track the lastSeen timestamp of both users and API keys (but especially API keys)
  • 600k/hour: The logic to track the amount of transcoded segments that have been streamed a given Stream object.

The goal of this pull request is to fix both of those update logics by creating some "buffer" in memory
and then combining updates done on the database.

The expectation is that this will also fix a couple of issues we've been having with our databases regarding
replication lags everytime a VACCUUM operation is run on the streams table. By doing 30x less updates
on it we'll hopefully be able to tune it better and avoid disruptions in replication.

Specific updates (required)

  • Make the stream-info-service not send updates for every transcoded segment. Keep them in memory instead and only flush each record every 60 seconds instead.
  • Make tracking not send a transaction on the database on every observation of the API key. Instead, keep the last seen value in memory and flush every 60s as well.
  • Make sure to flush the metrics on process termination (so added some cleanup calls on SIGTERM handler)
  • Make add and set into a single query on stream-info-service

-

  • How did you test each of these updates (required)
    Check metrics are still updated on the database with these versions running.

Does this pull request close any open issues?
Hopefully fixes https://github.com/livepeer/livepeer-infra/issues/851

Checklist:

  • I have read the CONTRIBUTING document.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have added tests to cover my changes.

@vercel
Copy link

vercel bot commented Jul 30, 2022

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Updated
livepeer-studio ✅ Ready (Inspect) Visit Preview Sep 2, 2022 at 2:44PM (UTC)

@codecov
Copy link

codecov bot commented Aug 1, 2022

Codecov Report

Merging #1179 (55449ee) into master (7d4f42e) will increase coverage by 0.11769%.
The diff coverage is 67.74194%.

Impacted file tree graph

@@                 Coverage Diff                 @@
##              master       #1179         +/-   ##
===================================================
+ Coverage   50.39654%   50.51423%   +0.11768%     
===================================================
  Files             66          66                 
  Lines           4161        4181         +20     
  Branches         736         740          +4     
===================================================
+ Hits            2097        2112         +15     
- Misses          1816        1820          +4     
- Partials         248         249          +1     
Impacted Files Coverage Δ
packages/api/src/store/table.ts 66.89189% <0.00000%> (-1.38397%) ⬇️
packages/api/src/middleware/tracking.ts 80.95238% <80.95238%> (ø)
packages/api/src/index.ts 50.00000% <100.00000%> (+0.90908%) ⬆️
packages/api/src/middleware/auth.ts 87.61905% <100.00000%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7d4f42e...55449ee. Read the comment docs.

First steps towards creating the flush logic on shutdown.
@victorges victorges changed the title Vg/fix/update machinegun api: Reduce amount of updates done by DB metrics logic Aug 1, 2022
@victorges victorges marked this pull request as ready for review August 1, 2022 19:59
@victorges victorges requested a review from a team as a code owner August 1, 2022 19:59
These update some other properties in the stream like
isActive etc, so we don't want to change those.

Just make sure that we do flush the updates before deleting
the entries.
Apparently the primary key is not really suitable for that,
just changing that makes some queries much faster.

This is unrelated to the change here, just including it cause
I'm already touching the context.
Avoids a round-trip and a query in the DB.
q.append(sql`) WHERE id = ${id}`);
q.append(`)`);
if (set) {
q.append(sql` || ${JSON.stringify(set)}`);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thinking here is just to do as much as possible in one query?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah exactly! This small change reduces the number of transactions/updates in half as well! This was slightly necessary since we didn't get as much of a reduction only by buffering the updates. I expected 30x on the best case, but it was something like ~8x instead. With this we get ~16x less transactions which I think will be enough for now.

@@ -15,7 +15,7 @@ import {
FieldSpec,
} from "./types";

const DEFAULT_SORT = "id ASC";
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FTR: This was not really worth it. It (luckily) broke tests and when I went to do some further testing I noticed it's not really making queries faster (sometimes they're worse, cause not all objects have an indexed data->>'id'). Let's stick to just the id column.

Copy link
Member

@gioelecerati gioelecerati left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@victorges victorges merged commit 6814be1 into master Aug 2, 2022
@victorges victorges deleted the vg/fix/update-machinegun branch August 2, 2022 17:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants