Add viewer limit #1296

Merged

leszko merged 15 commits into main from rafal/viewer-limit on Jun 21, 2024

Conversation

leszko (Contributor) commented on Jun 19, 2024

Related to: https://www.notion.so/livepeer/Cap-50k-Fishtank-viewers-a828ef8d037a4f27b9a6769d91c50b23

Related PRs:

Comments:

  • The solution is not 100% bullet-proof (we may encounter viewers exceeding the limit); that's done on purpose, because we prefer to allow a few viewers above the limit rather than increase the TTFF. The reasons why viewers may be allowed above the limit (see the sketch after this list):
    • data about the concurrent view count is not fresh (Livepeer Data takes data from Victoria Metrics and it has some delay)
    • we cache the concurrent view count for 30s
    • we refresh the concurrent view count after allowing the user (because we don't want to make a sync call)
    • the viewer limit may not be refreshed (10 min cache)
  • We should do some refactoring and changes (but I prefer to do it in separate PRs), in particular:
    • rate limit can be removed and replaced with the viewer limit mechanism
    • parsing the Gate API response should use the standard json unmarshal mechanism

leszko marked this pull request as ready for review on June 19, 2024 at 13:03
victorges (Member) left a comment:

LGTM with one suggestion on the access control flow

Comment on lines +81 to +90
// ViewerLimitCache comes from Gate API
type ViewerLimitCache struct {
	data map[string]*ViewerLimitCacheEntry
	mux  sync.RWMutex
}

type ViewerLimitCacheEntry struct {
	ViewerLimitPerUser int32
	UserID             string
}
victorges (Member) commented:

  • Do we need to cache these separately from the gate API response?
  • Shouldn't this have some LastRefresh or similar field?

leszko (Contributor, Author) replied:

  • Do we need to cache these separately from the gate API response?

AFAIU we don't have one place to cache the whole Gate API response. We do have the cache for the rate limit (but we plan to remove it, so I think it's better to keep this one separate). Plus, it has its own logic, locking, etc. So it's fine as a separate cache.

  • Shouldn't this have some LastRefresh or similar field?

I don't believe we need it. The cache is refreshed anyway with every call to the Gate API. So the only reason to have LastRefresh would be cleanup, and TBH I don't think that's needed. I know that technically it's a memory leak, but this structure is so small that it's not an issue, especially compared to all the other places where we don't clean up memory. So, IMO, it would be over-optimization here.
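
For context, a read path against this cache might look like the hypothetical helper below (not part of the PR); reads take the RWMutex read lock since they happen on every playback request, while writes only happen on Gate API refreshes:

// Hypothetical read helper, assuming the ViewerLimitCache type shown above.
func (c *ViewerLimitCache) get(playbackID string) (*ViewerLimitCacheEntry, bool) {
	c.mux.RLock()
	defer c.mux.RUnlock()
	entry, ok := c.data[playbackID]
	return entry, ok
}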

@@ -452,6 +584,20 @@ func (g *GateClient) QueryGate(body []byte) (bool, GateConfig, error) {
		}
		refreshInterval = int32(refreshIntervalFloat64)
	}
	if ri, ok := result["viewer_limit_per_user"]; ok {
victorges (Member) commented:

Not a regression, but any idea why we parse the JSON manually here instead of using json:"field_name" tags?

leszko (Contributor, Author) replied:

No, that was one of my comments in the PR description. I'll address it while removing the rate limit logic, in this Linear ticket: https://linear.app/livepeer/issue/ENG-2120/change-rate-limite-to-viewer-limit
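
For reference, the struct-tag approach mentioned here could look roughly like the sketch below; apart from viewer_limit_per_user, the field names are assumptions, not the real Gate API schema:

import "encoding/json"

// gateAPIResponse is an assumed shape of the Gate API payload, for illustration only.
type gateAPIResponse struct {
	Allow              bool    `json:"allow"`
	RefreshInterval    float64 `json:"refresh_interval"`
	ViewerLimitPerUser int32   `json:"viewer_limit_per_user"`
	UserID             string  `json:"user_id"`
}

func parseGateResponse(body []byte) (gateAPIResponse, error) {
	var res gateAPIResponse
	err := json.Unmarshal(body, &res)
	return res, err
}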

Comment on lines 501 to 512
// cache viewer limit per user data
viewerLimitCache.mux.Lock()
if gateConfig.ViewerLimitPerUser != 0 && gateConfig.UserID != "" {
	if _, ok := viewerLimitCache.data[playbackID]; !ok {
		viewerLimitCache.data[playbackID] = &ViewerLimitCacheEntry{}
	}
	viewerLimitCache.data[playbackID].ViewerLimitPerUser = gateConfig.ViewerLimitPerUser
	viewerLimitCache.data[playbackID].UserID = gateConfig.UserID
} else {
	delete(viewerLimitCache.data, playbackID)
}
viewerLimitCache.mux.Unlock()
victorges (Member) commented:

Hmm, WDYT of changing PlaybackAccessControlEntry to store the GateConfig object instead and avoiding this separate cache? Then to get the viewer limit for a given playback ID (e.g. L378), we would instead do GetPlaybackAccessControlInfo (which could change to return the full PlaybackAccessControlEntry, not just the boolean Allow).

This would mean moving the viewer limit check to after L328, which IMO makes more sense: we only check the viewer limit after fetching it (right now, from what I could tell, the very first playback won't be checked against the limit, right? The separate cache only gets initialized after it is used). A rough sketch of the idea follows below.
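
A rough sketch of what this suggestion could look like; the fields are hypothetical and the real PlaybackAccessControlEntry differs:

// Hypothetical shape: keep the whole GateConfig in the existing access-control
// cache entry, so the viewer limit is available right after the first QueryGate
// call and the separate viewerLimitCache is no longer needed.
type PlaybackAccessControlEntry struct {
	Allow bool
	Gate  GateConfig // would carry ViewerLimitPerUser and UserID
}

GetPlaybackAccessControlInfo would then return this full entry instead of just the Allow boolean, and the viewer-limit check would run right after the entry is fetched or refreshed.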

leszko (Contributor, Author) replied:

Yeah, that could work this way. Though, I'm not sure I want to touch that flow right now.

This code is quite messy and very sensitive, because: 1) it makes a blocking call, and 2) it affects all viewers for all customers. Also, the rate limit is implemented with its own cache, so it would still be inconsistent. I'll reconsider it while removing the rate limit code.

victorges (Member) replied:

Cool. Yeah, I think this only makes sense if the rate_limit gets refactored too. Let's keep it separate.

handlers/accesscontrol/data_client.go (outdated review comment, resolved)
leszko merged commit 92aa6dd into main on Jun 21, 2024
10 checks passed
leszko deleted the rafal/viewer-limit branch on June 21, 2024 at 07:27