Performance tuning: scale_from_zero #1076

Open
alexellis opened this Issue Feb 1, 2019 · 1 comment

alexellis commented Feb 1, 2019

Expected Behaviour

When a function is not in the scaling cache and concurrent requests arrive, we should lock on a Mutex so that we don't make too many calls to the back end to query the current number of replicas.

Current Behaviour

We could have a "thundering herd" situation where all of the concurrent requests (roughly 1000, say) call the back end until the cache is updated for subsequent calls.

Possible Solution

We could use a RWMutex in the scaler or in the cache for each unique function name rather than one for the whole cache.

https://github.com/openfaas/faas/blob/master/gateway/scaling/function_scaler.go#L42

https://github.com/openfaas/faas/blob/master/gateway/scaling/function_cache.go#L49
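
To make the per-function locking idea concrete, here is a minimal sketch under stated assumptions. It is not the gateway's actual code: `ReplicaCache`, `ReplicaQuery`, `lockFor` and `countBackendCalls` are hypothetical names invented for illustration, and cache expiry is left out entirely. A cache miss takes a mutex keyed by function name and re-checks the cache before querying, so only the first caller for a given function should reach the back end.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// ReplicaQuery is a hypothetical stand-in for the call that asks the
// back end how many replicas a function currently has.
type ReplicaQuery func(functionName string) (uint64, error)

// ReplicaCache is a hypothetical cache with one mutex per function name.
// Expiry is deliberately omitted to keep the sketch short.
type ReplicaCache struct {
	mu       sync.RWMutex // guards replicas
	replicas map[string]uint64

	locksMu sync.Mutex // guards locks
	locks   map[string]*sync.Mutex

	query ReplicaQuery
}

func NewReplicaCache(query ReplicaQuery) *ReplicaCache {
	return &ReplicaCache{
		replicas: make(map[string]uint64),
		locks:    make(map[string]*sync.Mutex),
		query:    query,
	}
}

// lockFor returns the mutex for one function name, creating it on first use.
func (c *ReplicaCache) lockFor(name string) *sync.Mutex {
	c.locksMu.Lock()
	defer c.locksMu.Unlock()
	l, ok := c.locks[name]
	if !ok {
		l = &sync.Mutex{}
		c.locks[name] = l
	}
	return l
}

// lookup reads the cache under the shared read lock.
func (c *ReplicaCache) lookup(name string) (uint64, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	n, ok := c.replicas[name]
	return n, ok
}

// Get returns the cached replica count, querying the back end at most
// once per function name while concurrent callers wait on that name's lock.
func (c *ReplicaCache) Get(name string) (uint64, error) {
	if n, ok := c.lookup(name); ok {
		return n, nil // fast path: cache hit
	}

	l := c.lockFor(name)
	l.Lock()
	defer l.Unlock()

	// Re-check: another caller may have filled the cache while we waited.
	if n, ok := c.lookup(name); ok {
		return n, nil
	}

	n, err := c.query(name)
	if err != nil {
		return 0, err
	}
	c.mu.Lock()
	c.replicas[name] = n
	c.mu.Unlock()
	return n, nil
}

// countBackendCalls fires the given number of concurrent lookups for one
// function and reports how many of them reached the fake back end.
func countBackendCalls(concurrency int) int64 {
	var calls int64
	cache := NewReplicaCache(func(string) (uint64, error) {
		atomic.AddInt64(&calls, 1)
		return 1, nil
	})

	var wg sync.WaitGroup
	for i := 0; i < concurrency; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = cache.Get("go-echo")
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&calls)
}
```

Calling `countBackendCalls(1000)` from a `main` function or a test exercises the herd scenario described above against a fake back end. Because the lock is keyed by name, requests for other functions are not blocked while one function's replica count is being fetched.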

wahyuoi commented Feb 11, 2019

Hi @alexellis
I tried to reproduce this one, and the logs look like this:

func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.004336s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.004380s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.012845s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.012889s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.013766s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.010717s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 [Scale] function=go-echo 0 => 1 successful - 5.687057 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.010781s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 [Scale] function=go-echo 0 => 1 successful - 5.686727 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 Forwarded [GET] to /function/go-echo - [200] - 0.003448 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 Forwarded [GET] to /function/go-echo - [200] - 0.003748 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 Forwarded [GET] to /function/go-echo - [200] - 0.002102 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 Forwarded [GET] to /function/go-echo - [200] - 0.001983 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 Forwarded [GET] to /function/go-echo - [200] - 0.001967 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 Forwarded [GET] to /function/go-echo - [200] - 0.001708 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.014082s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.014079s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 [Scale] function=go-echo 0 => 1 successful - 5.704287 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 [Scale] function=go-echo 0 => 1 successful - 5.704139 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 Forwarded [GET] to /function/go-echo - [200] - 0.001374 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 Forwarded [GET] to /function/go-echo - [200] - 0.001679 seconds
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 GetReplicas took: 0.014460s
func_gateway.1.tv6jt4q9cmk4@pan    | 2019/02/11 11:43:21 [Scale] function=go-echo 0 => 1 successful - 5.706875 seconds

Do you want us to call GetReplicas only once and scale only once? (This log shows the scale from 0 to 1 happening 5 times.)
