On Windows 10, devspace keeps exiting and throwing an error after running fine for a few minutes. #341
Comments
So I believe this is happening for some reason during the sync process, when there is a large volume of files to sync.
@KaelBaldwin I think we need more information to reproduce this issue, as it does not currently happen on our machines. I'm not completely sure where this error originates in the source code. Is there any error in the sync.log when this happens? Is there a fatal message in the console?
@FabianKramm No fatal message or anything in the console, other than I'm suddenly disconnected from the container. The error that was in the error.log is in my original post, but there isn't anything very helpful in the sync.log other than a "Sync stopped" message. I suspect it only occurs when there is a lot of data to sync? If your repo isn't as large, it might not occur. Maybe I could create a public repo with a ton of files and try to reproduce it with that. It could very well be something specific to my machine, or even a temporary connectivity issue with the server it is syncing to. Might I suggest just making the sync process more robust? Would there be any drawbacks to having it retry and continue syncing if there is an error that causes the sync to quit? There must be some sort of panic going on somewhere that's causing devspace to quit, but it is weird that the console isn't getting any error message. I can try to debug this locally and see exactly where it panics out; I'll look into it.
Never mind, I just had it happen again with me just leaving the console running for a while (~20 minutes). Going to try debugging in GoLand.
@KaelBaldwin Thanks for all the information! I analyzed the issue a bit, and the errors you see in the errors.log actually originate from the port-forwarding functionality directly in the kubectl package and are not related to the sync functionality. Since you don't see any error message prior to the terminal exiting, I suspect two possible causes for this behavior:
I guess we need some more information from your side about the circumstances under which this keeps happening, because I'm not really sure why the terminal should exit without any error, and unfortunately we cannot reproduce it.
Sure, I'll keep looking into it, thanks.
So I noticed today while starting it up that I got kicked out again. The first time I got kicked out, I decided to exec into the container via kubectl as you mentioned, while also running devspace up in another terminal. I ran devspace up first and it connected fine. I ran kubectl exec and it failed to connect. I then checked the devspace terminal and it had indeed disconnected. So this does seem to be a connectivity issue. I wonder if the data transfer going on during the sync process is causing a timeout that makes devspace give up its connection somewhere and kick out. FYI, this is an on-premises bare-metal cluster, so connectivity to the nodes should be fine.
I have just discovered that kubectl exec, if established before devspace disconnects, will persist through the disconnect, though devspace does not.
So basically I think this comes down to whether or not you all want to try to make devspace more robust against temporarily stalled connections. If that's not feasible, feel free to close this issue and I'll just deal with the disconnects.
@KaelBaldwin Thanks for all this detailed information! Regarding your last question, we definitely want to make devspace more robust. You've got really interesting results, because we internally already call the kubectl exec function (the call is in pkg/devspace/kubectl/client.go, and the kubectl function is exec.Stream from the kubernetes project). So this is really weird behavior, and I guess, as you already assumed, it is somehow caused by a side effect of the port-forwarding/sync services. To verify this assumption, can I ask you to do another test for us? Could you run
Sure thing, I'm trying it out now |
@FabianKramm they all failed this time, including kubectl exec, which had persisted every other time. I did some monitoring after getting that result: Looks like the data transfer is spiking very high at a certain point, which is impressive! Haha. I'm thinking what's happening here is my network interface is getting 100% used by the sync and losing connections. Further looking into it, I have a rather large file in my repo at the moment that I was using for test data. I'm betting when it gets transferred it's causing the overload. |
That file might be what's been the problem all along. I'm removing it and seeing if that resolves everything.
OK, yeah, things were much more tame after removing the file. The highest it got was around 70% usage, and no disconnects.
@KaelBaldwin Thanks for this information! I'm still wondering why exactly this happens, though, because the sync opens only two kubectl exec shells to sync all files, and I don't really understand why high network usage would result in the OS dropping network connections. That somehow sounds odd, because a typical browser file down-/upload is normally also unrestricted in bandwidth usage. I would suspect that maybe the remote provider has a bandwidth limit and somehow aborts the connections if they use up too much bandwidth. I'm not sure how to test that, though.
But I guess restricting the sync to use only a certain amount of network bandwidth would probably solve your problem. So maybe we can implement a feature where you can specify an upper limit for the sync to use. |
@FabianKramm As far as the remote provider goes, this is a bare-metal Kubernetes cluster, so I'm on the same network.
I do think your idea of being able to specify an upper limit would resolve it. As to whether the OS is dropping connections or not, I'm not familiar enough with its handling to speculate on that, other than that maybe it doesn't drop the connections, but perhaps there is enough of a delay for kubectl exec to reach a timeout and disconnect.
@KaelBaldwin Okay, I'll open a new issue for that and close this one. That is also the only solution I currently see that we can implement, since we cannot find out exactly who is closing the connections and why. EDIT: Should you find out why the connections are getting closed and that there is a better solution, feel free to reopen the issue.
What happened?
I ran `devspace up` and it connected fine, but after running for a few minutes devspace exits and throws an error. I run `devspace up` again to reconnect and keep working, but it keeps happening. The error.log contains:

```
{"level":"error","msg":"Runtime error occurred: error copying from remote stream to local connection: readfrom tcp4 127.0.0.1:61572-\u003e127.0.0.1:61574: write tcp4 127.0.0.1:61572-\u003e127.0.0.1:61574: wsasend: An established connection was aborted by the software in your host machine.","time":"2018-10-31T11:11:26-05:00"}
```
What did you expect to happen instead?
I should be able to keep working in the container however long I need to.
How can we reproduce the bug? (as minimally and precisely as possible)
Follow the same steps I did in the "What happened" section.
Local Environment:
Kubernetes Cluster:
Anything else we need to know?
This might be happening during the sync operation; I'm not sure, but it seems more stable after everything is synced up. I'll update further when I'm more sure about that.
/kind bug