Performance tuning: scale_from_zero #1076
When a function is not in the scaling cache and concurrent requests arrive, we should lock on a Mutex so that we don't make too many calls to the back-end to query the current number of replicas.
We could have a "thundering herd" situation where all of the roughly 1000 concurrent requests call the back-end until the cache is updated for subsequent calls.
We could use a RWMutex in the scaler or in the cache for each unique function name rather than one for the whole cache.
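To make the per-function locking idea concrete, here is a minimal sketch in Go. It is not the gateway's actual code: the `scalingCache`, `functionEntry` and `replicaQuery` names, the signature of the back-end query, and the function name `figlet` are all assumptions made for illustration, and cache expiry is left out. Each function name gets its own `sync.RWMutex`, and the cache is re-checked after taking the write lock so that only the first request on a miss reaches the back-end.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// replicaQuery stands in for the back-end call (GetReplicas in this thread).
// Its signature is an assumption for this sketch.
type replicaQuery func(functionName string) uint64

// functionEntry holds the cached replica count for one function and its own lock.
type functionEntry struct {
	mu       sync.RWMutex
	replicas uint64
	valid    bool // false until the back-end has been queried once
}

// scalingCache is a hypothetical cache keyed by function name.
type scalingCache struct {
	mu      sync.Mutex // guards the entries map only, never held during a query
	entries map[string]*functionEntry
	query   replicaQuery
}

func newScalingCache(query replicaQuery) *scalingCache {
	return &scalingCache{entries: make(map[string]*functionEntry), query: query}
}

// entryFor returns the per-function entry, creating it on first use.
func (c *scalingCache) entryFor(name string) *functionEntry {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.entries[name]
	if !ok {
		e = &functionEntry{}
		c.entries[name] = e
	}
	return e
}

// Replicas returns the cached count, querying the back-end at most once per miss.
func (c *scalingCache) Replicas(name string) uint64 {
	e := c.entryFor(name)

	// Fast path: many readers can share the read lock on a cache hit.
	e.mu.RLock()
	if e.valid {
		r := e.replicas
		e.mu.RUnlock()
		return r
	}
	e.mu.RUnlock()

	// Slow path: take the write lock, then re-check, because another
	// request may have filled the entry while we were waiting.
	e.mu.Lock()
	defer e.mu.Unlock()
	if !e.valid {
		e.replicas = c.query(name)
		e.valid = true
	}
	return e.replicas
}

// simulateHerd fires n concurrent requests for one uncached function and
// returns how many of them reached the back-end.
func simulateHerd(n int) int64 {
	var backendCalls int64
	cache := newScalingCache(func(string) uint64 {
		atomic.AddInt64(&backendCalls, 1)
		return 1
	})

	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			cache.Replicas("figlet")
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&backendCalls)
}

func main() {
	fmt.Println("back-end calls:", simulateHerd(1000))
}
```

Because the lock is per function name, a slow query for one function does not block requests for another. A real cache would also need an expiry so that the stored replica count is refreshed; that part is deliberately omitted here.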
Do you want GetReplicas to be called only once and scaling to happen only once? At the moment the log shows "scale from 0 to 1" five times.