Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[usage] Fix usage report overflow #10583

Merged
merged 1 commit into from
Jun 10, 2022
Merged

[usage] Fix usage report overflow #10583

merged 1 commit into from
Jun 10, 2022

Conversation

easyCZ
Copy link
Member

@easyCZ easyCZ commented Jun 10, 2022

Description

My testing dataset was not large enough, and we'd run into int64 overflows for very long running instances. Using Seconds as the baseline unit fixes that.

Related Issue(s)

Fixes #

How to test

Unit tests

Release Notes

NONE

Documentation

NONE

@easyCZ easyCZ changed the title [usage] Fix report generation [usage] Fix usage report overflow Jun 10, 2022
@easyCZ easyCZ marked this pull request as ready for review June 10, 2022 13:45
@easyCZ easyCZ requested a review from a team June 10, 2022 13:45
@github-actions github-actions bot added the team: webapp Issue belongs to the WebApp team label Jun 10, 2022
@@ -131,16 +131,16 @@ func (u *UsageReconciler) ReconcileTimeRange(ctx context.Context, from, to time.
func generateUsageReport(teams []teamWithWorkspaces, maxStopTime time.Time) ([]TeamUsage, error) {
var report []TeamUsage
for _, team := range teams {
var teamTotalRuntime time.Duration
var teamTotalRuntime float64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the overkill floating-point precision on seconds? 🤔 Rounded-up seconds would probably be precise enough, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to not introduce any rounding before we discuss it and capture so that we can then explain to customers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Credits are likely to map to hours anyway. I think rounding to the next second is extremely fine. 👍

Plus, it's a lot more readable for humans, and doesn't have any built-in micro-imprecision (remember that 0.1 + 0.2 === 0.30000000000000004 🙈)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do that in a follow-up PR.

Copy link
Contributor

@jankeromnes jankeromnes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmf, approving to unblock progress, but I really don't think we want float64 precision on our seconds.

Please switch to rounding runtimes to the next second in a follow-up PR. 🙏

Report: []TeamUsage{
{
TeamID: teamID.String(),
WorkspaceSeconds: 9.223372036854766e+11,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is completely unreadable 🙄

Comment on lines 48 to 52
start := i.StartedTime.Time()
stop := maxStopTime
if i.StoppedTime.IsSet() {
stop = i.StoppedTime.Time()
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Drive-by comment: So the maxStopTime logic makes sense (if we ask for the usage on 2nd day of month, and some instance is not stopped yet, we can count like if it was stopped right now to get the current usage up to now).

However, you also need to trim:

  • i.CreationTime to MAX(beginningOfMonth, i.CreationTime)
  • any existing i.StoppedTime to MIN(maxStopTime, i.StoppedTime)

The reason is that, if the instance has been running through May and into June, and we're currently computing the usage of June -- we don't want to double-count any runtime from May (which was presumably already counted and invoiced).

Same for running a usage report retroactively -- if an instance was started in April, and ran well into May, and you're getting the usage report of April -- you only want the usage that happened in April.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trimming already happens here, after the workspace instances are loaded and validated.

Here, it is a general purpose method which works on the data on the struct, the trimming is specific to the usage report for billing and therefore done there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aha, cool! 🎯

if instance.CreationTime.Time().Before(maximumStart) {
instance.CreationTime = db.NewVarcharTime(maximumStart)
}
if instance.StoppedTime.Time().After(minimumStop) {
instance.StoppedTime = db.NewVarcharTime(minimumStop)
}

I guess these should actually be called minimumStart and maximumStop 😅 but the logic looks correct. 👍

@roboquat roboquat merged commit e3f9ca5 into main Jun 10, 2022
@roboquat roboquat deleted the mp/usage/report-fixes branch June 10, 2022 14:13
@easyCZ
Copy link
Member Author

easyCZ commented Jun 10, 2022

Follow-up in #10591

@roboquat roboquat added deployed: webapp Meta team change is running in production deployed Change is completely running in production labels Jun 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deployed: webapp Meta team change is running in production deployed Change is completely running in production release-note-none size/L team: webapp Issue belongs to the WebApp team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants