already have block - causing bandwidth to be wasted - vicious circle? #1120

Closed
rebroad opened this Issue Apr 17, 2012 · 12 comments

rebroad (Contributor) commented Apr 17, 2012

It occurs to me that this happens because a node asks several other nodes for the same block when the block does not appear within a certain time. Both peers respond with the requested blocks, but one sends them first and usually stays ahead. This isn't a major issue, but it does waste bandwidth, and since limited bandwidth is probably what caused the slow response in the first place, I expect this will continue to get worse as a vicious circle. My proposed solution probably requires a BIP: allow a node to ask another node to stop sending blocks (i.e. the node sending the blocks it already has). It could do this after one duplicate block, or perhaps after several.

Sample case:

04/17/12 22:08:49 askfor block 000000000000096fd80c 0
04/17/12 22:08:50 sending getdata: block 000000000000096fd80c
04/17/12 22:14:12 askfor block 000000000000096fd80c 1334700530000000
04/17/12 22:14:12 sending getdata: block 000000000000096fd80c
04/17/12 22:16:39 received block 000000000000096fd80c
04/17/12 22:16:40 SetBestChain: new best=000000000000096fd80c height=176035 work=29615794588444859740
04/17/12 22:20:07 received block 00000000000009856606
04/17/12 22:20:07 ERROR: ProcessBlock() : already have block 176020 00000000000009856606
04/17/12 22:20:56 received block 00000000000008862df7
04/17/12 22:20:56 ERROR: ProcessBlock() : already have block 176021 00000000000008862df7
04/17/12 22:21:36 received block 000000000000035a84ac
04/17/12 22:21:36 ERROR: ProcessBlock() : already have block 176022 000000000000035a84ac
04/17/12 22:21:59 received block 0000000000000a881c26
04/17/12 22:21:59 ERROR: ProcessBlock() : already have block 176023 0000000000000a881c26
04/17/12 22:22:18 received block 00000000000004ed81e6
04/17/12 22:22:18 ERROR: ProcessBlock() : already have block 176024 00000000000004ed81e6
04/17/12 22:22:52 received block 000000000000068bbe48
04/17/12 22:22:52 ERROR: ProcessBlock() : already have block 176025 000000000000068bbe48
04/17/12 22:23:12 received block 00000000000003425f71
04/17/12 22:23:12 ERROR: ProcessBlock() : already have block 176026 00000000000003425f71
04/17/12 22:23:39 received block 00000000000000f89962
04/17/12 22:23:39 ERROR: ProcessBlock() : already have block 176027 00000000000000f89962
04/17/12 22:23:48 received block 000000000000083ec278
04/17/12 22:23:48 ERROR: ProcessBlock() : already have block 176028 000000000000083ec278
04/17/12 22:24:06 received block 000000000000088984c6
04/17/12 22:24:06 ERROR: ProcessBlock() : already have block 176029 000000000000088984c6
04/17/12 22:24:29 received block 00000000000002857e8b
04/17/12 22:24:29 ERROR: ProcessBlock() : already have block 176030 00000000000002857e8b
04/17/12 22:24:45 received block 000000000000075834ed
04/17/12 22:24:45 ERROR: ProcessBlock() : already have block 176031 000000000000075834ed
04/17/12 22:25:10 received block 000000000000033b7b3e
04/17/12 22:25:10 ERROR: ProcessBlock() : already have block 176032 000000000000033b7b3e
04/17/12 22:25:56 received block 00000000000001f32a95
04/17/12 22:25:56 ERROR: ProcessBlock() : already have block 176033 00000000000001f32a95
04/17/12 22:26:20 received block 00000000000004dffd6d
04/17/12 22:26:20 ERROR: ProcessBlock() : already have block 176034 00000000000004dffd6d
04/17/12 22:26:20 received block 000000000000096fd80c
04/17/12 22:26:20 ERROR: ProcessBlock() : already have block 176035 000000000000096fd80c

Much of this output has been cut, but if I grep my debug.log for "already have", I can see that blocks 174591 to 176061 (1470 blocks and counting) were each being sent by two nodes.

sipa (Member) commented Apr 17, 2012

Is this unmodified source code?

rebroad (Contributor) commented Apr 17, 2012

@sipa, yes, stock 0.6.0.6-beta, running on Windows 7, using Tor.

rebroad (Contributor) commented Apr 28, 2012

My proposal for solving this would be to introduce a "stop sending me blocks" command, or an "I already had those blocks" command, which would cause the sending node to stop. For backwards compatibility, old nodes would ignore these commands, so nodes would first give them a chance to stop; if a peer continued to send duplicate blocks (perhaps six "already have"s in a row), it would simply be disconnected.

rebroad (Contributor) commented Apr 28, 2012

I can also confirm that this problem exists in a recent version of Litecoin.

rebroad (Contributor) commented Apr 28, 2012

Is it possible for ProcessMessages() to determine which block is being sent before it has finished downloading? Perhaps it could get advance notification of the hash rather than wait for the block to download completely before realising it has wasted time and bandwidth receiving it?

sipa (Member) commented Apr 28, 2012

That would require some structural changes to the code, and I'm not sure it's necessary. I'd much rather know why your node requests blocks twice.

rebroad (Contributor) commented Apr 28, 2012

Isn't it standard behaviour to request the same block more than once if the nodes previously asked do not respond within a certain time period?

rebroad (Contributor) commented Apr 29, 2012

I've added some debugging code to work out what's going on, and I've noticed in ProcessMessages() that blocks seem to get stuck downloading. vRecv.size() sometimes never reaches nMessageSize, i.e. every time ProcessMessages() is called, vRecv.size() is unchanged from last time for the same node. I'm pretty sure my added debugging code isn't the cause of this.

I'll upload this code as soon as I obtain sufficient intelligence to navigate git...

rebroad (Contributor) commented May 7, 2012

One solution to this problem is to include a checksum/hash of the block in the message header, so the node can reject a block it already has without downloading it. Other than putting up with receiving blocks that the node already has, but which took so long to arrive that it ended up requesting them elsewhere, there is only one other solution I can think of.

That other solution is to make ProcessBlock() threaded, as it may be ProcessBlock causing nodes to take so long to respond to messages that peers time out and ask elsewhere.

rebroad (Contributor) commented May 12, 2013

I'd like to understand exactly why pull request #1382 was closed. The issue was that nodes were being disconnected when they were simply doing as they were told - but what is the alternative? How do we avoid receiving duplicate blocks? The only way I can see is to NEVER request the same block from another node unless the nodes previously asked have disconnected. Surely there are circumstances in which it could take a very long time for those nodes to disconnect, during which only orphan blocks could be requested and the blockchain would appear stuck. Or am I missing something?

sipa (Member) commented May 12, 2013

@rebroad If a node is slow in responding, you need to switch to another peer for downloading. If a node is really slow, you can choose to disconnect it. If a node doesn't respond at all, it will time out after one minute anyway. But without logic to at least prevent unintentional double requests, disconnecting a peer simply because it did what you asked seems pretty self-defeating.

laanwj (Member) commented Oct 26, 2013

Closing, as it "works as intended". Pull requests to speed up block downloading are on the way and could use testing; see #2964.

laanwj closed this Oct 26, 2013

suprnurd pushed a commit to chaincoin-legacy/chaincoin that referenced this issue Dec 5, 2017

refactoring PS, cleanup, more log output in debug mode (#1121)
* refactoring PS, cleanup, more log output in debug mode (part of #1120)

* fix dsq rate limit calculations on new dsq