libp2p crashes with ABORT_ERR #2462

Closed
christroutner opened this issue Apr 1, 2024 · 5 comments
Labels
need/triage Needs initial labeling and prioritization

Comments

@christroutner

  • Version:
    1.3.1

  • Platform:
    Linux pop-os 6.6.6-76060606-generic #202312111032170230614322.04~d28ffec SMP PREEMPT_DYNAMIC Mon D x86_64 x86_64 x86_64 GNU/Linux

  • Subsystem:

Severity:

  • Critical - System crash, application panic.

Description:

During normal operation of finding and connecting with nodes, the libp2p node will crash. This appears to be due to a race condition. Here is the error message received from the v1.3.1 (latest) version of libp2p:

status: getCRGist() Connecting to Circuit Relay /ip4/143.198.70.59/tcp/5101/p2p/12D3KooWMbU9R49aiYUeFBpxFYK6PggacoeMydaZaR2dzDpWgcA6
file:///home/trout/work/psf/code/helia-coord/node_modules/race-signal/dist/src/index.js:22
		return Promise.reject(new AbortError(opts?.errorMessage, opts?.errorCode));

AbortError: The operation was aborted
	at raceSignal (file:///home/trout/work/psf/code/helia-coord/node_modules/race-signal/dist/src/index.js:22:31)
	at YamuxStream.closeWrite (file:///home/trout/work/psf/code/helia-coord/node_modules/@libp2p/utils/dist/src/abstract-stream.js:230:19)
	at YamuxStream.close (file:///home/trout/work/psf/code/helia-coord/node_modules/@libp2p/utils/dist/src/abstract-stream.js:189:18)
	at stream.close (file:///home/trout/work/psf/code/helia-coord/node_modules/@libp2p/utils/dist/src/stream-to-ma-conn.js:13:15)
	at ConnectionImpl.close [as _close] (file:///home/trout/work/psf/code/helia-coord/node_modules/libp2p/dist/src/upgrader.js:443:30)
	at processTicksAndRejections (node:internal/process/task_queues:95:5)
	at runNextTicks (node:internal/process/task_queues:64:3)
	at listOnTimeout (node:internal/timers:540:9)
	at process.processTimers (node:internal/timers:514:7)
	at async ConnectionImpl.close (file:///home/trout/work/psf/code/helia-coord/node_modules/libp2p/dist/src/connection/index.js:121:13) {
  type: 'aborted',
  code: 'ABORT_ERR'
}

Node.js v20.11.0

This is a similar error from an older version of libp2p (v1.2.1):

file:///home/safeuser/ipfs-service-metrics/node_modules/race-signal/dist/src/index.js:22
		return Promise.reject(new AbortError(opts?.errorMessage, opts?.errorCode));
							  ^

AbortError: The operation was aborted
	at raceSignal (file:///home/safeuser/ipfs-service-metrics/node_modules/race-signal/dist/src/index.js:22:31)
	at YamuxStream.closeWrite (file:///home/safeuser/ipfs-service-metrics/node_modules/@libp2p/utils/dist/src/abstract-stream.js:230:19)
	at YamuxStream.close (file:///home/safeuser/ipfs-service-metrics/node_modules/@libp2p/utils/dist/src/abstract-stream.js:189:18)
	at file:///home/safeuser/ipfs-service-metrics/node_modules/libp2p/dist/src/connection/index.js:118:63
	at Array.map (<anonymous>)
	at ConnectionImpl.close (file:///home/safeuser/ipfs-service-metrics/node_modules/libp2p/dist/src/connection/index.js:118:44)
	at initiateConnection (file:///home/safeuser/ipfs-service-metrics/node_modules/@libp2p/webrtc/dist/src/private-to-private/initiate-connection.js:125:34)
	at runMicrotasks (<anonymous>)
	at processTicksAndRejections (node:internal/process/task_queues:96:5)
	at async WebRTCTransport.dial (file:///home/safeuser/ipfs-service-metrics/node_modules/@libp2p/webrtc/dist/src/private-to-private/transport.js:83:35)
	at async DefaultTransportManager.dial (file:///home/safeuser/ipfs-service-metrics/node_modules/libp2p/dist/src/transport-manager.js:81:20)
	at async Job.queue.add.peerId.peerId [as fn] (file:///home/safeuser/ipfs-service-metrics/node_modules/libp2p/dist/src/connection-manager/dial-queue.js:153:38)
	at async raceSignal (file:///home/safeuser/ipfs-service-metrics/node_modules/race-signal/dist/src/index.js:28:16)
	at async Job.run (file:///home/safeuser/ipfs-service-metrics/node_modules/@libp2p/utils/dist/src/queue/job.js:56:28) {
  type: 'aborted',
  code: 'ABORT_ERR'
}
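
Both traces appear to end in a rejected promise that nothing catches, which Node.js treats as fatal. As a stopgap while the root cause is unknown, a process-level handler along these lines could keep the app alive. This is a minimal sketch, not a fix and not part of libp2p: `isAbortError` is a hypothetical helper, and matching on `code === 'ABORT_ERR'` is based only on the fields printed in the traces above.

```javascript
// Sketch only: masks the crash, does not fix the underlying race.
// isAbortError is an illustrative name, not a libp2p or race-signal API.

// True for errors shaped like the ones printed in the traces above.
function isAbortError (err) {
  return err != null && (err.code === 'ABORT_ERR' || err.name === 'AbortError')
}

process.on('unhandledRejection', (reason) => {
  if (isAbortError(reason)) {
    // Log and keep running instead of letting Node.js terminate the process.
    console.warn('Ignoring unhandled abort:', reason.message)
    return
  }
  // Any other unhandled rejection is still treated as fatal.
  console.error(reason)
  process.exit(1)
})
```

Swallowing the error this way hides real aborts too, so it is only useful for keeping a node up while debugging.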

Steps to reproduce the error:

This error can be reproduced by cloning the deps-04-24 branch of the helia-coord library. Install dependencies, then run this JavaScript file with Node.js. After a period of time, the error and crash will occur.

@christroutner christroutner added the need/triage Needs initial labeling and prioritization label Apr 1, 2024
@christroutner
Author

Sometimes the libp2p app runs for over an hour without an issue, and other times this Issue occurs within a few seconds after startup. If someone is trying to reproduce this error, restart the app after 5 minutes if the error does not appear. It's a race condition, so it's not easy to reproduce. It appears to involve the connection between two nodes.
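
The restart-every-5-minutes advice above could be automated with a small wrapper like the one below. It is a rough sketch under assumptions: `runUntilCrash` is a hypothetical helper, the default limits are arbitrary, and the script path in the usage comment is a placeholder rather than a file from the repository.

```javascript
const { spawnSync } = require('node:child_process')

// Run a command over and over, killing it after timeoutMs, until it exits
// with a non-zero status on its own. Returns null if no run crashed.
function runUntilCrash (command, args, { timeoutMs = 5 * 60 * 1000, maxAttempts = 10 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const result = spawnSync(command, args, { timeout: timeoutMs, stdio: 'inherit' })

    if (result.error != null) {
      // No crash inside the time window: the child was killed, so restart it.
      if (result.error.code === 'ETIMEDOUT') continue
      // Anything else (for example, command not found) is a setup problem.
      throw result.error
    }

    if (result.status !== 0) {
      return { attempt, status: result.status, signal: result.signal }
    }
  }
  return null
}

// Example (placeholder path, not run here):
// const crash = runUntilCrash('node', ['path/to/script.js'])
// console.log(crash ?? 'no crash observed')
```

Because the child's output is inherited, the stack trace from a crashing run still shows up in the terminal.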

@christroutner
Author

In an attempt to debug the root cause, I reverted back to js-libp2p v1.2.1, from the latest v1.3.1. I started to see the same error as above. I realized what had changed is that I was using node.js v20 when I was previously using node.js v16.

I've been doing some testing with js-libp2p v1.3.1 and node.js v16. So far I have not seen the error in this Issue.

@christroutner
Author

I'm closing this Issue as I think it's tied to some combination of switching node versions, node_modules, and package-lock.json.

I've successfully gotten the error to go away on node.js v16 on Ubuntu 22. And I've gotten it to run on node.js v20 on Ubuntu 20.

@luizzvinicius

Guys, I'm having the same problem. I'm currently using Node.js 20.11, but on Windows. I tried going back to v16, but the problem persists.
To be more specific, I'm working with IPFS (Helia) and OrbitDB. This happens when the terminal reloads, in other words, when the application creates more than one connection (I believe this is the cause of the problem).

@christroutner
Author

If you haven't tried it yet, delete your node_modules folder and the package-lock.json file. Then reinstall dependencies with npm install. That seemed to make a difference for me. It's not a silver bullet, but it was definitely one of the factors.
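
Since the report above is from Windows, the clean reinstall could be scripted in Node.js so the same steps work on every platform. This is a minimal sketch: `removeInstallArtifacts` and `cleanInstall` are hypothetical helper names, and the project directory is whatever you pass in.

```javascript
const fs = require('node:fs')
const path = require('node:path')
const { execSync } = require('node:child_process')

// Delete node_modules and package-lock.json if they exist.
function removeInstallArtifacts (projectDir) {
  for (const name of ['node_modules', 'package-lock.json']) {
    // force: true makes a missing file or folder a no-op rather than an error.
    fs.rmSync(path.join(projectDir, name), { recursive: true, force: true })
  }
}

// Remove the old install, then resolve and install dependencies from scratch.
function cleanInstall (projectDir) {
  removeInstallArtifacts(projectDir)
  execSync('npm install', { cwd: projectDir, stdio: 'inherit' })
}

// Example (not run here):
// cleanInstall(process.cwd())
```

Deleting package-lock.json lets npm pick fresh versions of transitive dependencies, so the resulting tree may differ from the one that was crashing.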
