Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(cli): log cache full when on debug level #3085

Merged
merged 9 commits into from Aug 3, 2023
9 changes: 9 additions & 0 deletions internal/cache/persistent_lru_cache.go
Expand Up @@ -37,6 +37,7 @@
storageProtection cacheprot.StorageProtection
sweep SweepSettings
timeNow func() time.Time
lastCacheWarning time.Time

description string

Expand Down Expand Up @@ -231,6 +232,13 @@
// MUST NOT go over the specified limit for the cache space to avoid
// snapshots/restores from getting affected by the cache's storage use.
if c.isCacheFullLocked() {
// Limit warnings to one per minute max.
if clock.Now().Sub(c.lastCacheWarning) > time.Minute {
c.lastCacheWarning = clock.Now()

log(ctx).Warnf("Cache is full, unable to add %v into cache.", key)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should make this Debug down the road.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be pretty noisy, even once per minute. Some workflows take several hours.

If the cache is full, are not we making the situation worse by filling up the logs ??

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Shrekster yes, we should

  • reduce the frequency
  • add counters

Copy link
Collaborator

@Shrekster Shrekster Aug 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed this partly offline. I think the write rate could be reduced, but even better I'd propose we just log once and then mute.

If we can settle on logging this once then this doesn't even need to be a debug log and that could be more useful upfront than having to turn on debug logs on a second run.

The alternate question to "how much logging is bad logging?" is maybe "what frequency is useful ?"

IMO once is ok for this becuase logging this the second time is only telling me that nothing got expired, but what debugging info do I get from it ? If cache was full once and the perf was crappy then that probably explains it.

Is the second log going to add more info ? Yes!

Does that more info help ? I don't know... doesn't seem like yet

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Shrekster PTAL at #3193

}

Check warning on line 240 in internal/cache/persistent_lru_cache.go

View check run for this annotation

Codecov / codecov/patch

internal/cache/persistent_lru_cache.go#L235-L240

Added lines #L235 - L240 were not covered by tests

return
}
}
Expand Down Expand Up @@ -464,6 +472,7 @@
metricsStruct: initMetricsStruct(mr, description),
listCache: newContentMetadataHeap(),
timeNow: timeNow,
lastCacheWarning: time.Time{},
}

if c.timeNow == nil {
Expand Down