KungFu job hangs at an inconsistent version when I scale down/up multiple times #297
@zrss could you also share the …
What are the flags that you passed to kungfu-run?
I suspect it should be set to …
Thanks for the reply ~
The kungfu-run params are all the same across my scale up/down cases …
@lgarithm, can I draw this conclusion: …
@lgarithm can we just set the init-version to -1, not only for the newly added kungfu-run, but also for the first-generation kungfu-run? It is hard for the cluster to distinguish whether it is the first generation.
Yes, this is correct.
We can consider this as a future improvement, but currently I can't think of how to do it in a clean way.
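One way to picture the idea, as a hypothetical sketch rather than KungFu's actual API: treat `init-version == -1` as "unknown", so every peer, including the first generation, resolves its version by asking some coordination endpoint instead of assuming 0. `queryClusterVersion` and `resolveInitVersion` below are made-up stand-ins:

```go
// Hypothetical sketch (not KungFu's actual API): initVersion == -1 means
// "unknown", so the peer discovers the current version instead of assuming 0.
package main

import (
	"errors"
	"fmt"
)

// queryClusterVersion stands in for whatever mechanism (config server,
// cluster manager API, ...) could report the current cluster version.
func queryClusterVersion() (int, error) {
	return 0, errors.New("no existing cluster") // placeholder
}

// resolveInitVersion returns the version a starting peer should join with.
func resolveInitVersion(initVersion int) int {
	if initVersion >= 0 {
		return initVersion // explicitly configured, trust it
	}
	// initVersion == -1: discover. If no cluster exists yet, this peer
	// belongs to the first generation and starts at version 0.
	if v, err := queryClusterVersion(); err == nil {
		return v
	}
	return 0
}

func main() {
	fmt.Println(resolveInitVersion(-1)) // prints 0 when no cluster exists yet
}
```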
If you can manually initialize the first generation, i.e. start the first generation (see KungFu/srcs/go/kungfu/peer/peer.go, lines 191 to 205 at commit 06d742e) in your cluster manager.
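A minimal sketch of what "initialize the first generation in the cluster manager" could look like, assuming the manager can write a shared config before launching any container; `ClusterConfig`, its field names, and `initial.json` are illustrative, not KungFu's real bootstrap protocol:

```go
// Hypothetical sketch: the cluster manager writes an explicit first-generation
// state before starting any container, so no peer has to guess it.
package main

import (
	"encoding/json"
	"log"
	"os"
)

// ClusterConfig is an illustrative stand-in for the initial cluster state.
type ClusterConfig struct {
	Version int      `json:"version"` // first generation starts at 0
	Peers   []string `json:"peers"`   // container IPs known to the cluster manager
}

func main() {
	cfg := ClusterConfig{Version: 0, Peers: []string{"10.0.0.1", "10.0.0.2"}}
	data, err := json.MarshalIndent(cfg, "", "  ")
	if err != nil {
		log.Fatal(err)
	}
	// Written before any container starts, so every first-generation peer
	// reads the same explicit initial state.
	if err := os.WriteFile("initial.json", data, 0o644); err != nil {
		log.Fatal(err)
	}
}
```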
@lgarithm thanks for the reply, I'd like to try that. To clarify our current architecture: a host file (which records only the IPs of the containers) is generated by the cluster manager, and we import it; the cluster manager updates the host file and bootstraps (or shuts down) containers when we scale the KungFu job up/down.
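For concreteness, a hypothetical reader for such a host file, assuming one container IP per line (the exact format is our assumption, not specified in the thread):

```go
// Hypothetical reader for the host file described above: one IP per line.
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"strings"
)

// readHostFile returns the container IPs listed in the file, skipping blanks.
func readHostFile(path string) ([]string, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var ips []string
	s := bufio.NewScanner(f)
	for s.Scan() {
		if line := strings.TrimSpace(s.Text()); line != "" {
			ips = append(ips, line)
		}
	}
	return ips, s.Err()
}

func main() {
	ips, err := readHostFile("hostfile")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(ips)
}
```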
What if the config.json is restored to the original one after two scaling operations?
How about adding a version field in the config.json?
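As a sketch of this proposal: if config.json carried a monotonically increasing version bumped by the cluster manager on every scale operation, a config whose content happens to match an earlier generation (e.g. after scaling 1 → 2 → 1 → 2) would still be recognizably new. The field names below are assumptions for illustration, not KungFu's actual config schema:

```go
// Sketch of the proposed version field: accept a config only if its version
// is strictly newer, which rules out replays of a previously seen state.
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

type Config struct {
	Version int      `json:"version"` // bumped by the cluster manager on every scale operation
	Peers   []string `json:"peers"`
}

// shouldApply rejects any config that is not strictly newer than the current one.
func shouldApply(current, incoming Config) bool {
	return incoming.Version > current.Version
}

func main() {
	raw := []byte(`{"version": 3, "peers": ["10.0.0.1", "10.0.0.2"]}`)
	var incoming Config
	if err := json.Unmarshal(raw, &incoming); err != nil {
		log.Fatal(err)
	}
	current := Config{Version: 2, Peers: []string{"10.0.0.1"}}
	fmt.Println(shouldApply(current, incoming)) // true: version 3 > 2
}
```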
We (the platform) should limit scale-down so that the number of instances can never be smaller than the default value; this would simplify the scenario.
Good idea, we can post a feature request to the cluster manager for adding a minimum-instances limit.
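A minimal sketch of that guard, with hypothetical names (`validateScale`, `defaultInstances`); the real check would live in the cluster manager:

```go
// Minimal sketch of the proposed scale-down guard: reject any target size
// below the job's default instance count.
package main

import (
	"errors"
	"fmt"
)

func validateScale(target, defaultInstances int) error {
	if target < defaultInstances {
		return errors.New("scale-down below the default instance count is not allowed")
	}
	return nil
}

func main() {
	fmt.Println(validateScale(1, 2)) // rejected: 1 < 2
	fmt.Println(validateScale(3, 2)) // allowed
}
```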
Scale up from 1 instance to 2 instances:
- container A log
- container B log
- A and B are running well.

Scale down from 2 instances to 1 instance:
- container A log
- container B log
- A is running well, B is closed.

Scale up to 2 instances again:
- container A log: I found the runner of A exited with `exit on error: inconsistent update detected at 7031490:/home/work/KungFu/srcs/go/kungfu/runner/handler.go:102`
- container B log
- now A and B are both hanging …
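The cycle above (1 → 2 → 1 → 2 instances) brings the cluster back to a peer list it has already been in. Below is a hypothetical illustration of why that can trip an "inconsistent update" check, assuming updates are tracked by content rather than by a monotonic version; this is our own simplification, not the actual logic in runner/handler.go:

```go
// Hypothetical simplification: if a watcher keys updates by peer-list content
// only, the 1 -> 2 -> 1 -> 2 cycle replays an already-seen state, and that
// replay is indistinguishable from a stale or conflicting update.
package main

import (
	"fmt"
	"strings"
)

type watcher struct {
	seen map[string]bool // keyed by peer-list content only
}

func (w *watcher) update(peers []string) error {
	key := strings.Join(peers, ",")
	if w.seen[key] {
		return fmt.Errorf("inconsistent update detected: state %q was already seen", key)
	}
	w.seen[key] = true
	return nil
}

func main() {
	w := &watcher{seen: map[string]bool{}}
	fmt.Println(w.update([]string{"A", "B"})) // <nil>: first generation
	fmt.Println(w.update([]string{"A"}))      // <nil>: scale down
	fmt.Println(w.update([]string{"A", "B"})) // error: replayed state
}
```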
Currently, can KungFu support my test case? Or how should I handle this case? …