Skip to content
This repository

Clean up code dependencies #2154

Merged
merged 12 commits into from 10 months ago

9 participants

Eric Lombrozo Gavin Andresen Pieter Wuille BitcoinPullTester P. Kaufmann Luke-Jr Jeff Garzik Wladimir J. van der Laan robbak
Eric Lombrozo

I realize this introduction is a bit long, so if you don't feel like reading skip to the bullet points below under "strategy".

The codebase as it exists right now has a number of unnecessary dependencies which makes code modularization much more difficult. In particular, the satoshi client was built to handle all bitcoin-related tasks, but its value as a "reference implementation" lies primarily in doing verification and acting as a relay agent as these are the most essential tasks that participating nodes must perform to keeping the network in operation. Things like wallets, historical databases, mining, and notification agents could be written as entirely separate third-party applications without risk to the network's fundamental integrity.

The basic architecture of a bitcoin node is as follows:

At the core there exist fundamental bitcoin message structures, along with the code necessary for serialization/deserialization. These structures belong in their own source files with minimal dependencies so they can be reused for applications that needn't perform verification and relay - for instance, filtering and notification agents. Unfortunately, these core structures currently reside for the most part in main.h/main.cpp, one of the central problems this pull request attempts to fix.

On top of these core structures sits a network component that manages sockets, does peer discovery, and handles queueing and dispatching of messages. This component is clearly dependent on the core message structures but does not depend on the specific logic used to verify blocks and transactions nor to identify misbehaving peers nor sign transactions nor maintain a block chain database.

Then we have a scripting engine, signature verification component, and a signing component. Historical database applications do not need signature verification/signing functionality at all. Filtering messages and sending alerts generally does not even require a scripting engine and does fine with basic pattern matching.

The most critical high-level operations needed by a verification/relay node such as the satoshi client are transaction verification; block chain and memory pool management; and detection/management of misbehaving peers. These things are currently primarily implemented in main.h/main.cpp. These are indeed the main operations of the satoshi client - but the core low-level structures should not depend at all on this logic.

Then there's the UI, but let's leave that aside for a moment.

Finally there's init.h/init.cpp, which sets up the particular environment in which the satoshi client runs.

This branch takes the following strategy:

  • Remove source file dependencies on main.h and init.h by only including necessary headers wherever possible.
  • If source files depend on definitions in main or init, either move the dependent portions into main/init or move the depended-upon portions into separate files.
  • If the dependent source files use global variables or functions that clearly belong in either main or init, copy the value over to a class member or a variable with file scope in the dependent source or expose a registration function to set a callback.
  • If moving a core class out of main is impossible because its methods depend on variables or functions defined in main, isolate the methods that depend on main and either move them to another class that does belong in main or convert them into regular functions in main.

It is important that all modifications made in this branch are easy to review and to test. This branch does not encourage rewriting things from scratch - only moving them and rearranging them in easily identifiable chunks. Furthermore, the focus of the branch is not so much on coding style and style consistency - but on isolation of functional units and elimination of unnecessary dependencies.

Gavin Andresen

Why namespace db_cpp ? Seems to me the database copy of pchMessageStart could be a static member of the DbEnv.

Eric Lombrozo

I'd rather avoid adding more dependencies in CAddrDB for something that isn't really used elsewhere.

Pieter Wuille
Collaborator

@gavinandresen That doesn't make sense. This is for peers.dat, which doesn't use BDB at all, and I suppose CDbEnv will be gone as soon as we kick out BDB-based wallets.

Eric Lombrozo

I'm still not clear on exactly where in the code it's best to:
1) set the magic bytes for CAddrDB
2) Set the call handlers for net.cpp

I tried running and got this error: http://pastebin.com/tJU9Gsi2
But tried restarting and it initialized correctly, and now it seems to be running fine.

BitcoinPullTester

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/03014e2ff26d4ab6195f8b0bfc29c712dec509de for binaries and test log.

Gavin Andresen

RE: namespace db_cpp:

Fine, make it static in CAddrDB, I forgot that peers.dat is not a BDB file (and pass it into CAddrDB constructors? or is CAddrDB a singleton class? )

Eric Lombrozo

@gavinandresen Noted and done.

Eric Lombrozo

cp: writing out/bitcoind.exe': No space left on device
cp: writing
out/test_bitcoin.exe': No space left on device

WTF?!?!?!

Shall we all chip in and get BlueMatt some more hard disk space?

Eric Lombrozo

I am going to keep pushing commits as an extra backup despite BitcoinPullTester being out of disk space.

P. Kaufmann

I think you are doing good work, but I'm sure you will get faster ACKs or merges, if you try to keep pulls smaller. Perhaps @sipa or other core devs can comment.

Gavin Andresen

Our bottleneck is code review and testing.

If you want your pulls to get merged, then you need to help fix that bottleneck-- either recruit reviewers/testers to review/test your own code, or volunteer to help review/test other people's pulls.

Or only refactor code that has good unit tests, so we can be more confident that nothing breaks on any of the three platforms we support.

I am generally against refactoring just to improve the code, unless the refactoring comes with some significant benefit or unit tests.

Gavin Andresen

Oh, also: you might want to help test or update https://github.com/libcoin/libcoin which is a fully refactored version of the core code, that, last I checked, nobody used because nobody trusts it (because so many changes were made without thorough testing).

Eric Lombrozo

The changes I'm making will be well-documented and possible to rigorously review. And if more unit tests are needed, I'd be glad to help write some up.

I'm making incremental changes, hopefully each change easy to understand and track. I haven't changed any actual logic in the code nor implementation details - just moved code around - and in large chunks, not small pieces.

As most of the significant changes thus far are things like moving methods from one class to another or turning them into regular functions, it should be possible to do a fairly comprehensive assessment by doing pattern matches on the codebase and making sure each usage was appropriately modified. i.e. tx.CheckTransaction() becomes CheckTransaction(tx); or tx.GetOutputFor(txin, inputs) becomes inputs.GetOutputFor(txin); etc...

Eric Lombrozo

Also, I'm willing to do comprehensive testing on earlier commits with the hope of getting them merged without having to merge everything at once.

src/net.cpp
... ...
@@ -74,6 +73,28 @@ struct LocalServiceInfo {
74 73
 
75 74
 static CSemaphore *semOutbound = NULL;
76 75
 
  76
+//
  77
+// Handlers that need to be registered
  78
+//
  79
+ProcessMessagesHandler fnProcessMessages = NULL;
  80
+SendMessagesHandler fnSendMessages = NULL;
  81
+StartShutdownHandler fnStartShutdown = NULL;
1
Pieter Wuille Collaborator
sipa added a note

Can these be static?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
BitcoinPullTester

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/4af9a6c7e0d1246852195b47c200a974dff47ab1 for binaries and test log.

Pieter Wuille
Collaborator

Just for the record: I've been discussing these changes extensively with @CodeShark the past few days, and I think they are very valuable. They should make the code easier to understand and reuse.

Getting 0.8 out now certainly has priority over refactorings, but as these are almost entirely just code-movement changes, merging them shouldn't be too hard.

src/main.cpp
((18 lines not shown))
1286 1321
         return 0;
1287 1322
 
1288 1323
     int64 nResult = 0;
1289  
-    for (unsigned int i = 0; i < vin.size(); i++)
1290  
-        nResult += GetOutputFor(vin[i], inputs).nValue;
  1324
+    for (unsigned int i = 0; i < tx.vin.size(); i++)
  1325
+        nResult += this->GetOutputFor(tx.vin[i]).nValue;
2
Pieter Wuille Collaborator
sipa added a note

Is this "this->" necessary?

nah, it's mostly a stylistic thing, I guess. I can get rid of it if you don't like it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
BitcoinPullTester

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/aea31289a59166195d2d7270eacb0905b14d476a for binaries and test log.

BitcoinPullTester

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/366fb8aa64a581c22064dae33a3d08769375b9f3 for binaries and test log.

Eric Lombrozo

The core commits will be reorganized into four stages:

  • 1) move core class methods that should remain class methods implemented in main.cpp out of main.cpp and into core.cpp. (move)
  • 2) move core class method implementations in main.h that should not be part of core classes out of the class declarations in main.h. (move)
  • 3) turn methods that should not be part of the core classes into regular functions in main.h and main.cpp, get rid of the method prototypes in the classes, and add function declarations in main.h where necessary. (pattern replacement, add function declarations to main.h)
  • 4) move the core class declarations to core.h (move)
BitcoinPullTester

Automatic sanity-testing: FAILED MERGE, see http://jenkins.bluematt.me/pull-tester/366fb8aa64a581c22064dae33a3d08769375b9f3 for test log.

This pull does not merge cleanly onto current master

Luke-Jr

Rebase needed

P. Kaufmann

I'm sure @CodeShark intents to rework this pull into smaller more logical pieces after 0.8 is final.

Pieter Wuille sipa commented on the diff
src/wallet.h
((5 lines not shown))
20 21
 
21 22
 class CAccountingEntry;
22 23
 class CWalletTx;
23 24
 class CReserveKey;
24 25
 class COutput;
  26
+class CWalletDB;
1
Pieter Wuille Collaborator
sipa added a note

Is this needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
src/main.h
((51 lines not shown))
1299  
-        return (nBits == 0);
1300  
-    }
1301  
-
1302  
-    uint256 GetHash() const
1303  
-    {
1304  
-        return Hash(BEGIN(nVersion), END(nNonce));
1305  
-    }
1306  
-
1307  
-    int64 GetBlockTime() const
1308  
-    {
1309  
-        return (int64)nTime;
1310  
-    }
1311  
-
1312  
-    void UpdateTime(const CBlockIndex* pindexPrev);
1313  
-};
  575
+void UpdateTime(CBlockHeader& block, const CBlockIndex* pindexPrev);
1314 576
 
1315 577
 class CBlock : public CBlockHeader
2
Pieter Wuille Collaborator
sipa added a note

Any reason why CBlock doesn't move to core.h? Or just a "first the easy parts"?

Eric Lombrozo
CodeShark added a note

I want to get what I had already done rebased correctly before moving CBlock - but yes, definitely that comes immediately after.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Jeff Garzik
Collaborator

OK, very much like where this code is going. ACK 90% of it, modulo inline comments made earlier.

One specific criticism with this rebase: the rebase problems were all fixed in a final, appended commit. That does not work, because it breaks bisection properties. Each commit needs to produce a tree that is buildable and passes tests. Otherwise, "git bisect" will not work, and attempting to find a problematic commit in past history becomes more difficult.

I know that makes rebasing more difficult for changes like this, sorry :(

We cannot have: broken commit, broken commit, broken commit, commit that fixes everything.

Pieter Wuille
Collaborator
sipa commented

I'm very much in favor of the code changes here (and they look move-only to, apart from the registration functions). It's only a first step, but it's a very needed one IMHO.

I agree with the comment about the last commit fixing everything, but apart from that, I'd like to see this merged soon. Since now seems the ideal time for this, I don't mind holding up other pullreqs for a short time, so this can get reviewed and finalized.

Jeff Garzik
Collaborator

Agreed, RE holding other pullreqs, to avoid the endless rebase pain on @CodeShark 's part.

Eric Lombrozo

I'll try to get this done today.

Eric Lombrozo

All the commits build with make -f makefile.unix now.

Still left to do:

  • Move CBlock to core
  • Remove main.h include in net.cpp.
  • Unit tests
Eric Lombrozo

The unit tests have run successfully on all the commits in linux. Need a few more eyes to review for correctness and some help with a few more tests on other platforms. As for the other two things I had listed as TODOs - moving CBlock to core and removing the main.h include from net - I'd rather merge what we have now and do these things later.

Jeff Garzik
Collaborator

OK, looks pretty good.

ACK everything except the indirect function pointer stuff. Definite NAK on the function pointers. Let's fix that up, and we can get this merged.

Eric Lombrozo

What do you suggest in place of the function pointers?

Pieter Wuille
Collaborator
sipa commented

@jgarzik Not sure you see the reason for those indirect pointers. They are there to break the dependency of net on main, and seen as such it seems perhaps weird, if the only user of net is main for now.

However, over time, net should turn into a class "CNetworkNode" or something, which exposes a way to listen for events. I suppose a boost::signal could be used right now instead, which more clearly shows its intention.

And please, the overhead of a pointer indirection is few orders of magnitude lower than even just allocating the buffer in which a message being processed is read...

Eric Lombrozo

If there's any reasonable critique of the function pointers it's about style, not performance. A more general-purpose messaging system would be very nice - but in absence of a clear design, at least the function pointers avoid any dependencies on outside libraries.

Gavin Andresen
Owner

I'd also prefer boost::signals2 over registering function pointers. We're already using them elsewhere in the code, and it is exactly the type of decoupling they are designed for.

Wladimir J. van der Laan
Owner
laanwj commented
Eric Lombrozo

Very well, I agree it's a cleaner solution.

Eric Lombrozo

I'll add signals later today...

Jeff Garzik jgarzik referenced this pull request
Merged

autotools: Gui split #2700

Pieter Wuille
Collaborator
sipa commented

Just so this isn't forgotten: @TheUni just noticed this doesn't update makefiles (yet)

robbak
robbak commented
Eric Lombrozo

@sipa I have deliberately avoided putting anything in core.cpp to avoid makefile issues for this merge. Eventually, it will probably make sense to move some of the code in core.h into core.cpp.

added some commits
Eric Lombrozo Get rid of db dependencies on main 336fe97
Eric Lombrozo Moved PushGetBlocks to main.cpp to eliminate dependence of net.cpp on…
… CBlockLocator.
8926263
Eric Lombrozo Moved unrelated-to-network calls in StartNode and StopNode into init.cpp 4751d07
Eric Lombrozo Removed net.cpp's dependency on init.h.
Added explicit include of main.h in init.cpp, changed include of init.h to include of main.h in net.cpp.

Added function registration for net.cpp in init.cpp's network initialization.

Removed protocol.cpp's dependency on main.h.

TODO: Remove main.h include in net.cpp.
663224c
Eric Lombrozo

When I had worked on this originally, net.cpp was calling StartShutdown directly. Happily, this has since been removed.

I had overlooked the fact that ProcessMessages was still being called directly from net.cpp. I'll fix the appropriate commits once everyone agrees with the messaging approach.

Pieter Wuille
Collaborator
sipa commented

@CodeShark You'll still at least need to add core.h to bitcoin-qt.pro, and while you're on it, I don't see any harm in adding core.cpp to the other makefiles too - that'll make it easier to move stuff there in the future.

added some commits
Eric Lombrozo Created core.h/core.cpp, added to makefiles. Started moving core stru…
…ctures from main to core beginning with COutPoint.
effc277
Eric Lombrozo Moved CInPoint to core. Removed GetMinFee from CTransaction and made …
…it a regular function in main.
788536f
Eric Lombrozo Removed AcceptToMemoryPool method from CTransaction. This method belo…
…ngs to the mempool instance.

Removed AreInputsStandard from CTransaction, made it a regular function in main.
Moved CTransaction::GetOutputFor to CCoinsViewCache.

Moved GetLegacySigOpCount and GetP2SHSigOpCount out of CTransaction into regular functions in main.

Moved GetValueIn and HaveInputs from CTransaction into CCoinsViewCache.

Moved AllowFree, ClientCheckInputs, CheckInputs, UpdateCoins, and CheckTransaction out of CTransaction and into main.

Moved IsStandard and IsFinal out of CTransaction and put them in main as IsStandardTx and IsFinalTx. Moved GetValueOut out of CTransaction into main. Moved CTxIn, CTxOut, and CTransaction into core.

Added minimum fee parameter to CTxOut::IsDust() temporarily until CTransaction is moved to core.h so that CTxOut needn't know about CTransaction.
05df3fc
Eric Lombrozo Moved CCoins, CTxOutCompressor, CTxInUndo, and CTxUndo to core. 65e7bbe
Eric Lombrozo Removed script.cpp's dependence on main.h 48343a0
Eric Lombrozo Moved UpdateTime out of CBlockHeader and moved CBlockHeader into core. aabdf9e
Eric Lombrozo Using boost::signals2 to message main from net.cpp. 501da25
Eric Lombrozo

@sipa done
@TheUni core.h/core.cpp will have to be considered in what you're doing

Eric Lombrozo

Might be a good idea to create a file with settings constants. Perhaps separate files for network rule configuration and for node resource utilization configuration. But please, let's get this thing merged first!

Eric Lombrozo

Someone please give me a good reason why any access to fImporting, fReindex, and nTransactionsUpdated is necessary in net.cpp and I'll work in some mechanism to support it. As it stands, it looks completely unnecessary.

Eric Lombrozo CodeShark closed this
Eric Lombrozo CodeShark reopened this
Eric Lombrozo

Sorry, hit the close button by accident.

Eric Lombrozo

Alright - code freeze until merge.

Pieter Wuille sipa commented on the diff
src/db.cpp
@@ -486,6 +488,7 @@ void CDBEnv::Flush(bool fShutdown)
486 488
 // CAddrDB
487 489
 //
488 490
 
  491
+unsigned char CAddrDB::pchMessageStart[4] = { 0x00, 0x00, 0x00, 0x00 };
1
Pieter Wuille Collaborator
sipa added a note

This duplication is quite ugly. Mike's refactor that moves these chain-dependent constants to a separate class is nicer, but you may want to check that it doesn't introduce new dependencies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Pieter Wuille
Collaborator
sipa commented

ACK. Code changes look good to me, and I've tested syncing/mining/receiving/sending on testnet. I have a few inline comments left, but those can be dealt with in follow-up commits, so we don't need to stall the merging process too long.

Wladimir J. van der Laan laanwj commented on the diff
src/core.cpp
... ...
@@ -0,0 +1,7 @@
  1
+// Copyright (c) 2009-2010 Satoshi Nakamoto
  2
+// Copyright (c) 2009-2013 The Bitcoin developers
  3
+// Distributed under the MIT/X11 software license, see the accompanying
  4
+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
  5
+
  6
+#include "core.h"
2
Wladimir J. van der Laan Owner
laanwj added a note

Why have this implementation file that only includes a header?

Pieter Wuille Collaborator
sipa added a note

More things are intended to be moved to core (in particular, CBlock), in a second step. As those will include implementation code too, already create a file for them, so e.g. #2748 picks it up already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Wladimir J. van der Laan
Owner
laanwj commented

I've tested this commit on testnet, no problems found.

Jeff Garzik jgarzik merged commit f59530c into from
Jeff Garzik jgarzik closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Showing 12 unique commits by 1 author.

Jun 05, 2013
Eric Lombrozo Get rid of db dependencies on main 336fe97
Eric Lombrozo Moved PushGetBlocks to main.cpp to eliminate dependence of net.cpp on…
… CBlockLocator.
8926263
Eric Lombrozo Moved unrelated-to-network calls in StartNode and StopNode into init.cpp 4751d07
Eric Lombrozo Removed net.cpp's dependency on init.h.
Added explicit include of main.h in init.cpp, changed include of init.h to include of main.h in net.cpp.

Added function registration for net.cpp in init.cpp's network initialization.

Removed protocol.cpp's dependency on main.h.

TODO: Remove main.h include in net.cpp.
663224c
Eric Lombrozo Created core.h/core.cpp, added to makefiles. Started moving core stru…
…ctures from main to core beginning with COutPoint.
effc277
Eric Lombrozo Moved CInPoint to core. Removed GetMinFee from CTransaction and made …
…it a regular function in main.
788536f
Eric Lombrozo Removed AcceptToMemoryPool method from CTransaction. This method belo…
…ngs to the mempool instance.

Removed AreInputsStandard from CTransaction, made it a regular function in main.
Moved CTransaction::GetOutputFor to CCoinsViewCache.

Moved GetLegacySigOpCount and GetP2SHSigOpCount out of CTransaction into regular functions in main.

Moved GetValueIn and HaveInputs from CTransaction into CCoinsViewCache.

Moved AllowFree, ClientCheckInputs, CheckInputs, UpdateCoins, and CheckTransaction out of CTransaction and into main.

Moved IsStandard and IsFinal out of CTransaction and put them in main as IsStandardTx and IsFinalTx. Moved GetValueOut out of CTransaction into main. Moved CTxIn, CTxOut, and CTransaction into core.

Added minimum fee parameter to CTxOut::IsDust() temporarily until CTransaction is moved to core.h so that CTxOut needn't know about CTransaction.
05df3fc
Eric Lombrozo Moved CCoins, CTxOutCompressor, CTxInUndo, and CTxUndo to core. 65e7bbe
Eric Lombrozo Removed script.cpp's dependence on main.h 48343a0
Eric Lombrozo Moved UpdateTime out of CBlockHeader and moved CBlockHeader into core. aabdf9e
Eric Lombrozo Using boost::signals2 to message main from net.cpp. 501da25
Jun 06, 2013
Eric Lombrozo Removed the main.h include from net.cpp. 6e68524
Something went wrong with that request. Please try again.