BlockPut context canceled #1192
Comments
|
Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
Finally, remember to use https://discuss.ipfs.io if you just need general support. |
|
Weird... this looks like the request gets cancelled very soon after starting. As far as I can tell, the cancelled context comes all the way from the HTTP request, so I suspect something is not right with ipfs-deploy, or with the ipfs-cluster-api library that it uses, which has given people problems before (ipfs-cluster/js-cluster-client#3). Is there any chance you can inspect the /add request and the JS console when adding (developer tools)? Even a tcpdump might be useful to see exactly what is getting posted, without any other info, but I suspect it's the js-cluster-client library. Does it work with a single file? |
|
Thanks for the quick response and the input! Can you shed any light on what exactly "context canceled" means? Is the connection getting reset, or is there something else to it? I couldn't find a lot of information on this except that it probably comes from net/http. I didn't really suspect the client here, assuming the error was happening between ipfs-cluster and the ipfs daemon. I'm trying to narrow it down with a bare-minimum script to repro. I'll make sure to try with ipfs-cluster-client separately too. I can't currently inspect the request since it's being made server-side (Node.js), but when I get this repro script working I'll figure something out to get the raw HTTP request. |
|
The HTTP request that we handle has a "context" associated with it. If the request dies or is interrupted, the context is "cancelled". That context is passed along as cluster does its thing (like triggering other requests). When it comes time to trigger a blockPut request, it realizes that the context it's trying to use was cancelled, so it errors and interrupts the whole process. It is basically aborting because the work it's trying to do is linked to a request that somehow stopped. We use |
|
It wasn't the client cancelling it, but an nginx reverse proxy sitting in front of ipfs-cluster. If anyone stumbles across this issue with a similar problem: I turned off proxy request buffering (proxy_request_buffering off in nginx). That way, nginx doesn't read the entire file being added into memory before sending it along to ipfs-cluster. I'm still not sure whether it was a timeout issue or buffer exhaustion, but turning that off cleared up this issue. Thanks, @hsanjuan, for the background information. That was a lot of help. |
|
Might be related to ipfs/kubo#6402. nginx is broken for anything that streams a response without having read the full request. If it works now, it might just be because of timing (so essentially getting lucky). |
mikeshultz commented Jul 15, 2020
Additional information:
ipfs/ipfs-cluster:v0.13.0)
Describe the bug:
Using the ipfs-deploy package, I'm trying to add a directory of files, which appears to call /add with all the files. This appears to fail during a call to the ipfs daemon at /api/v0/block/put but has no useful information about the error that occurred, even with debug logging.
The furthest I got investigating the source of the error was this line, which appears to be the source of the context canceled message in the ipfs-cluster code:
https://github.com/ipfs/ipfs-cluster/blob/d2a83e45f1ad5f84a3eae70b46fb9b521b90ab03/adder/util.go#L58
It doesn't seem to happen with the first call either. Here's an example of debug log output during this request:
Connectivity seems fine between ipfs-cluster and the ipfs daemon (different containers, same kube pod). The error occurs whether or not HTTP basic auth is enabled. This may be an error on my cluster configuration, but I'm having trouble narrowing down the cause of this. Any ideas?