Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client: Invalid remainder Sync Error #1416

Closed
holgerd77 opened this issue Aug 23, 2021 · 7 comments · Fixed by #1484
Closed

Client: Invalid remainder Sync Error #1416

holgerd77 opened this issue Aug 23, 2021 · 7 comments · Fixed by #1484

Comments

@holgerd77
Copy link
Member

holgerd77 commented Aug 23, 2021

We currently have a lot of the following errors occuring directly at the beginning of sync:

ERROR [08-23|10:12:39] Error: invalid remainder
    at Object.decode (/ethereumjs-monorepo/node_modules/rlp/src/index.ts:75:11)
    at Peer._handleMessage (/ethereumjs-monorepo/packages/devp2p/src/rlpx/peer.ts:430:25)
    at Peer._handleBody (/ethereumjs-monorepo/packages/devp2p/src/rlpx/peer.ts:511:20)
    at Peer._onSocketData (/ethereumjs-monorepo/packages/devp2p/src/rlpx/peer.ts:540:18)
    at Socket.emit (events.js:315:20)
    at Socket.EventEmitter.emit (domain.js:467:12)
    at addChunk (internal/streams/readable.js:309:12)
    at readableAddChunk (internal/streams/readable.js:284:9)
    at Socket.Readable.push (internal/streams/readable.js:223:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:188:23)

Tested this with npm run client:start --network=rinkeby --minPeers=1 (minPeers was for other reasons not relevant here) and it directly occurs along the beginning of the sync process, often several times (side note: so this is just confirmed for Rinkeby yet, I didn't test on other networks). The sync process than normally continues.

I first suspected the changes from the tip of the chain PR #1132 to be responsible for this, but I tested by checking out a previous commit and the error is still there, so this is therefore not the case.

Won't dig deeper here for now but just opening as a tracking issue. If you've got new findings around this please add here so that we can keep track.

@acolytec3
Copy link
Contributor

I've seen it too and I wonder if it isn't related to #1399 somehow. I've seen it on mainnet as well so will research some today.

@holgerd77
Copy link
Member Author

@acolytec3 oh yeah, I guess this would be a very fitting candidate for this kind of bug behavior. So it might be good that we give this another more close review and/or writing some more tests.

(as some general note: we might have gotten a bit loose on reviews lately (including myself) and waving things through a bit too light heartedly. Eventually we should generally lift up here again, otherwise we'll reintroduce hard-to-track bugs in our libraries and especially the client).

As some first step it should be easy to test if things work "normal" again if one is manually downgrading to RLPx v4 in devp2p and therefore deactivating the snappy compression.

@holgerd77
Copy link
Member Author

Did another round of tests here and couldn't reproduce the error with current mainnet - argh master - state, lol. 😛 (so, to clarify: I tested rinkeby and goerli). The error didn't show up on 3-4 test runs (which is unusual, when I initially tested it showed up every single sync run).

Will nevertheless keep open for now, we might want to observe a bit more.

I have a new connection issue which is likely (but not for sure) not connected to the issue here. Will therefore open a new issue on this.

@acolytec3
Copy link
Contributor

Likewise, I ran several tests on different networks yesterday and was only able to reproduce it once. I had additional logging to see if we were just not correctly applying snappy decompression prior to the rlp.decode operation that produces this error and that didn't appear to be it on the one instance I was able to actually encounter this scenario.

@holgerd77
Copy link
Member Author

Short update: this is still present in the client on master with 6d80656 (Sept 1) being the latest commit and run the client against mainnet.

@holgerd77
Copy link
Member Author

(so this is not solved by the snappy compression ping/pong fix from #1442 )

@holgerd77
Copy link
Member Author

So this is still not solved.

Some deepened analysis, here a reposted error stack trace for reference:

INFO [09-21|10:23:31] Server listening: rlpx - enode://e70f94b7bf0b4d475091be0e0e71bdeb34b07ea24968d80686c22628e27eb21d30eb7364d47879b3b7f2532d31fc647ae00b613593a449cbcf3a82c66fb14ff5@[::]:30303
ERROR [09-21|10:23:35] Error: invalid remainder
    at Object.decode (/ethereumjs-monorepo/node_modules/rlp/src/index.ts:75:11)
    at Peer._handleMessage (/ethereumjs-monorepo/packages/devp2p/src/rlpx/peer.ts:455:25)
    at Peer._handleBody (/ethereumjs-monorepo/packages/devp2p/src/rlpx/peer.ts:542:28)
    at Peer._onSocketData (/ethereumjs-monorepo/packages/devp2p/src/rlpx/peer.ts:571:18)
    at Socket.emit (events.js:315:20)
    at Socket.EventEmitter.emit (domain.js:467:12)
    at addChunk (internal/streams/readable.js:309:12)
    at readableAddChunk (internal/streams/readable.js:284:9)
    at Socket.Readable.push (internal/streams/readable.js:223:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:188:23)

This finally throws in the RLP decode function with invalid remainder:

export function decode(input: Buffer, stream?: boolean): Buffer {
  if (!input || (<any>input).length === 0) {
    return Buffer.from([])
  }

  const inputBuffer = toBuffer(input)
  const decoded = _decode(inputBuffer)

  if (stream) {
    return decoded
  }
  if (decoded.remainder.length !== 0) {
    throw new Error('invalid remainder')
  }

  return decoded.data
}

When running a mainnet client with full devp2p debugging with DEBUG=devp2p:* npm run client:start one is getting the following output on such an error:

grafik

I also added console.log messages in node_modules/rlp/dist/index.js:

if (decoded.remainder.length !== 0) {
  console.log(input)
  console.log(inputBuffer)
  console.log(decoded.remainder)
  throw new Error('invalid remainder');
}

So this might occur in the case of two conflicting DISCONNECT messages? 🤔 Not sure if this will hold, but will do some continued investigation on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants