@achingbrain (Member) commented May 7, 2020

Builds on the work in #221

I see intermittent but frequent CI errors with js-ipfs, usually after a test has finished and the nodes are being torn down. Some additional logging reveals that bitswap crashes when it cannot send a message to a remote peer because the libp2p dial fails:

```console
ipfs: [stdout] Error: stream ended before 1 bytes became available
ipfs:     at /home/travis/build/ipfs/js-ipfs/node_modules/it-reader/index.js:37:9
ipfs:     at processTicksAndRejections (internal/process/task_queues.js:97:5)
ipfs:     at async /home/travis/build/ipfs/js-ipfs/node_modules/it-length-prefixed/src/decode.js:80:20
ipfs:     at async oneChunk (/home/travis/build/ipfs/js-ipfs/node_modules/multistream-select/src/multistream.js:12:20)
ipfs:     at async Object.exports.read (/home/travis/build/ipfs/js-ipfs/node_modules/multistream-select/src/multistream.js:34:15)
ipfs:     at async module.exports (/home/travis/build/ipfs/js-ipfs/node_modules/multistream-select/src/select.js:21:19)
ipfs:     at async ClassIsWrapper.newStream [as _newStream] (/home/travis/build/ipfs/js-ipfs/node_modules/libp2p/src/upgrader.js:251:40)
ipfs:     at async ClassIsWrapper.newStream (/home/travis/build/ipfs/js-ipfs/node_modules/libp2p-interfaces/src/connection/connection.js:172:34)
ipfs:     at async Network.sendMessage (/home/travis/build/ipfs/js-ipfs/node_modules/ipfs-bitswap/src/network.js:147:34)
ipfs:     at async DecisionEngine._processTasks (/home/travis/build/ipfs/js-ipfs/node_modules/ipfs-bitswap/src/decision-engine/index.js:124:5) {
ipfs:   code: 'ERR_UNSUPPORTED_PROTOCOL',
ipfs:   buffer: BufferList { _bufs: [], length: 0 }
ipfs: }
```

This PR adds a try/catch around network send operations. At the moment it just drops the request when the send fails; I'm not sure whether we want to add a retry in there or something similar.
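
For illustration, the try/catch approach can be sketched roughly as follows. This is a minimal sketch, not the code in this PR: `FakeNetwork` and `processTasks` are hypothetical stand-ins, and only the error message and `code` are taken from the CI log above.

```javascript
// Hypothetical stand-in for the bitswap network layer (not the real ipfs-bitswap API).
class FakeNetwork {
  constructor (unreachablePeers = []) {
    this.unreachablePeers = new Set(unreachablePeers)
    this.sent = []
  }

  async sendMessage (peerId, message) {
    if (this.unreachablePeers.has(peerId)) {
      // Mimic the shape of the failure seen in the CI log
      const err = new Error('stream ended before 1 bytes became available')
      err.code = 'ERR_UNSUPPORTED_PROTOCOL'
      throw err
    }
    this.sent.push({ peerId, message })
  }
}

// Send every queued task. A failed send is logged and the task is dropped,
// so one unreachable peer no longer rejects the whole processing loop.
// No retry is attempted here.
async function processTasks (network, tasks, log = console.error) {
  const dropped = []
  for (const task of tasks) {
    try {
      await network.sendMessage(task.peerId, task.message)
    } catch (err) {
      log(`failed to send message to ${task.peerId}: ${err.code || err.message}`)
      dropped.push(task)
    }
  }
  return dropped
}
```

Returning the dropped tasks is an assumption made so a caller could add a retry later; the behaviour described above simply discards them.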

When tasks are added to an existing list of tasks for a given peer,
we need to sort the queue to ensure the order is correct, otherwise
we never process pending tasks as task lists with pending tasks need
to be moved up the queue.
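
The queue-ordering problem can be illustrated with a simplified model. This is a sketch under assumptions: `PeerTaskQueue` and its sort key (number of pending tasks) are hypothetical and far simpler than the real decision engine.

```javascript
// Hypothetical, simplified per-peer task queue; names and the sort key are
// illustrative and do not mirror the real ipfs-bitswap implementation.
class PeerTaskQueue {
  constructor () {
    this.byPeer = new Map() // peerId -> { peerId, pending: [] }
    this.order = [] // peer task lists, highest priority first
  }

  pushTasks (peerId, tasks) {
    let list = this.byPeer.get(peerId)
    if (list === undefined) {
      list = { peerId, pending: [] }
      this.byPeer.set(peerId, list)
      this.order.push(list)
    }
    list.pending.push(...tasks)
    // Re-sort even when the peer already had a task list, so a list that
    // just gained pending tasks moves back up the queue.
    this._sort()
  }

  popTask () {
    const head = this.order[0]
    if (head === undefined || head.pending.length === 0) {
      return undefined // nothing pending at the head of the queue
    }
    const task = head.pending.shift()
    this._sort()
    return task
  }

  _sort () {
    // Lists with more pending tasks come first
    this.order.sort((a, b) => b.pending.length - a.pending.length)
  }
}
```

In this model, if the `_sort()` call in `pushTasks` is removed, a drained list left at the head can make `popTask` return `undefined` even though another peer has tasks waiting.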

Fixes the build problems exposed in ipfs/js-ipfs#2992

Also upgrade aegir to a safe version.

@achingbrain requested review from @dirkmc and @hugomrdias on May 7, 2020 10:19
@achingbrain changed the title from "Fix/survive bad network requests" to "fix: survive bad network requests" on May 7, 2020

@dirkmc (Contributor) left a comment

If there is a network error, then the client should reconnect, and resend all its wants anyway, so I think it's ok to just log the error as you're doing here 👍
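
A rough sketch of that reasoning, assuming a hypothetical `WantTracker` on the client side (the names and callback signature are illustrative, not the real bitswap API): because the full wantlist is resent on every connection, a message lost to a network error is recovered on reconnect.

```javascript
// Hypothetical client-side want tracker; illustrative only.
class WantTracker {
  constructor (sendWantlist) {
    this.wants = new Set()
    this.sendWantlist = sendWantlist // async (peerId, cids) => void
  }

  want (cid) {
    this.wants.add(cid)
  }

  received (cid) {
    this.wants.delete(cid)
  }

  // Called on every connection, including reconnects after a failure:
  // everything still wanted is sent again.
  async onPeerConnected (peerId) {
    await this.sendWantlist(peerId, [...this.wants])
  }
}
```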

Could you rebase onto master now that I've merged #221?

@achingbrain (Member, Author) commented

> Could you rebase onto master

Done! Well, I've merged master into this PR.

@dirkmc merged commit 2fc7023 into master on May 7, 2020
@dirkmc deleted the fix/survive-bad-network-requests branch on May 7, 2020 17:37