URGENT : Dogecoin fork #2 #250
Submitted this to reddit and wanted to sticky it, but now it's down... Pasting here: Yes, you're right. The network has forked around block 104701. From my first investigation, it looks like the versions below and above 1.5 disagree on the correct blockchain. THIS IS NOT A 51% ATTACK! We are actively looking into why this happened. I can't say much yet (I'm at work), but the safest option right now is updating your client to 1.5.1. Pool owners: I know about that issue with 1.5+, but there is currently no other way, as we can't patch 1.4 that fast. We have to investigate why that fork happened. A debug.log of a 1.4 node could help. Try to reduce your transactions to a minimum while this is being worked on. Check your block count in the bottom right corner of your wallet to see which side of the fork you are on. As of writing this, my client shows ~104773 blocks. This seems to be the longest chain at the moment, and therefore the "correct" one. |
I think it's okay to assume that 1.5.1 is on the correct fork since it's the longest one. The fork occurred at block 104679: Block hash on 1.4.1: 5a01ea5380f14ec1571523e36b2f3e91747749be9ed216607fc49038a55d15b2 As a quick fix I added this code to the 1.4.1 checkpoints.cpp:
I have yet to verify that this actually fixes 1.4.1; it's redownloading the blockchain as we speak. The main reason 1.5+ is not usable for many pools is the instability that comes with it. I have yet to find an exact cause, but here are some of the issues I ran into using 1.5+:
|
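For anyone unfamiliar with what a checkpoint does: it pins a known-good block hash to a height, so the node refuses any competing block at that height. Below is a minimal Python sketch of that idea only, not the actual checkpoints.cpp change. The function and variable names are made up for illustration, and the pinned hash is a placeholder because the hash of block 104679 on the 1.5.x chain is not quoted in this thread.

```python
# Toy model of checkpoint enforcement. All names are illustrative assumptions,
# and PLACEHOLDER_HASH stands in for a hash that is not given in the thread.
PLACEHOLDER_HASH = "<hash of block 104679 on the 1.5.x chain>"

# height -> block hash the node insists on at that height
CHECKPOINTS = {104679: PLACEHOLDER_HASH}

# Hash reported above for the block that 1.4.1 has at the fork height
HASH_ON_1_4_1 = "5a01ea5380f14ec1571523e36b2f3e91747749be9ed216607fc49038a55d15b2"


def passes_checkpoint(height, block_hash, checkpoints=CHECKPOINTS):
    """Return True unless a checkpoint exists at this height and the hash differs."""
    expected = checkpoints.get(height)
    return expected is None or expected == block_hash


print(passes_checkpoint(104679, HASH_ON_1_4_1))     # block from the 1.4.1 side
print(passes_checkpoint(104679, PLACEHOLDER_HASH))  # block matching the checkpoint
```

With a checkpoint like this in place, a 1.4.1 node would reject its own version of block 104679, but it still has to be able to process the block from the other chain.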
There could be several reasons why it forked. 1.4 still used the Luckycoin fork. |
This transaction looks to be the culprit: it was accepted in 1.5.x but I don't see it in 1.4.x. It is a pretty massive transaction! @GlennMR it would be good to know if adding that checkpoint works, or what 1.4.1 complains about with this very large transaction. |
@billym2k seems like 1.4.1 returns this error: "EXCEPTION: 11DbException Db::put: Cannot allocate memory dogecoin in ProcessMessages() " It's still syncing the blockchain but I suspect it will just stop at that block and throw the same error again. Also worth noting that there's no reason for it to run out of memory; it throws this on a server with 32 GB of RAM. Seems like the older client has a fixed amount of memory to work with. |
Interesting... My earlier post was wrong, I do see dcea23a1395d1e89788a008b1d379c237eeffb448503d12af46ac3b646e991f5 actually in a later block on the 1.4.x chain http://dogechain.info/tx/dcea23a1395d1e89788a008b1d379c237eeffb448503d12af46ac3b646e991f5 |
I do however think these massive transactions are what's causing all the problems, and that dogecoind is not able to allocate the memory to process this transaction. Are you aware of a physical limit set by the dogecoin daemon? |
At the moment I suspect the block size is too big for 1.4.1. Block 104679 on the 1.5.x chain is 976615 bytes, right near a presumed limit. Block details as listed: 976615 bytes, 397146492.314 DOGE, 59 |
Is there any way of raising this limit? |
Seems like this is just like the bitcoin fork in March -- "0.7 and older nodes use BDB for storing the blockchain databases. It seems this database has a limit on the size of the modification it can make atomically to the database. With the larger blocks of the past days, it seems to have triggered the limit. The result is that 0.7 (by default, it can be tweaked manually) will not accept "too large" blocks (we don't yet know what exactly causes it, but it is very likely caused by many transactions in the block). Specifically, block [...] However, 0.8 (which uses a different database system) has no such limit, and happily accepts the block. As the majority of the hash power was on 0.8, the longest chain ended up using this block, which is not accepted by older nodes." |
Interesting. Is there a way of combining 1.4.1 with 0.8? I'm sorry to keep hammering on the usage of 1.4.1 but we've given 1.5+ a fair shot and it just fails miserably on large scale operations. |
No problem, I'm trying to find the exact info. For bitcoin, the options were either upgrade, or set DB_CONFIG to 120,000 locks. Apparently there's a webpage with this data, but I haven't found it yet. |
Yeah, @GlennMR give that a shot, for your 1.4.1 test you are doing, make a file called DB_CONFIG in the dogecoin data directory, containing the lines
See if it can sync without giving that 11DbException |
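The exact lines suggested here did not survive in the thread. As a hedged sketch only: Berkeley DB reads directives such as `set_lk_max_locks` and `set_lk_max_objects` from a DB_CONFIG file in its environment directory, and 120,000 is the figure mentioned above for the Bitcoin case. The helper name is hypothetical, and a temporary directory stands in for the real data directory.

```python
import os
import tempfile


def write_db_config(datadir, max_locks=120000):
    """Write a Berkeley DB DB_CONFIG file that raises the lock limits.

    Returns the path of the file. The default value is the figure quoted in
    this thread for Bitcoin, not a tested value for Dogecoin 1.4.1.
    """
    path = os.path.join(datadir, "DB_CONFIG")
    with open(path, "w") as f:
        f.write(f"set_lk_max_locks {max_locks}\n")
        f.write(f"set_lk_max_objects {max_locks}\n")
    return path


# Demo against a throwaway directory instead of a real ~/.dogecoin
demo_dir = tempfile.mkdtemp()
config_path = write_db_config(demo_dir)
with open(config_path) as f:
    config_text = f.read()
print(config_text)
```

Berkeley DB reads DB_CONFIG when the environment is opened, so the daemon would need to be restarted after the file is created.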
Suspected it might be to do with standard transactions. It seems they were using 1.5. The large transactions just skimmed below the 100000 byte limit. 1.5 has a MAX_STANDARD_SIZE to protect against large inputs. http://bkchain.org/doge/tx/467e99ac399d53ff8377b8b970eaf684e6d416d192950bc016c8a62717adfe48 [largest one] |
@billym2k I created a file in ~/.dogecoin/DB_CONFIG with the following contents:
lk_max_locks is a bit higher but shouldn't be a problem? Re-sync is currently at block 52.000 |
@GlennMR did you delete the block files and do a reindex, or what? |
@GlennMR |
Hm @simondlr , from my skimming 1.4.1 and 1.5 both accepted that transaction, just in different blocks: http://dogechain.info/tx/467e99ac399d53ff8377b8b970eaf684e6d416d192950bc016c8a62717adfe48 Block 104697 vs Block 104679 That would lead me to conclude that the transactions are valid, just that the block size is problematic. |
Anyhoo, the correct fork is the 1.5.x fork. @xnljfr it looks like dogehouse has switched to the 1.5.x fork, is this accurate? Are you running on 1.5.x now? Many pools, but not all, are citing issues with 1.5.x, causing many to downgrade. I'm personally out of my depth as to why some pools are running 1.5.x without issue and some pools are having problems - according to @ummjackson 1.5.x is pretty much a carbon copy of Litecoin with Dogecoin's magic numbers. If some pool owners who are successfully running on 1.5.x could pipe in, that would be helpful. |
I am just chipping in right now to let you know that the p2pools have had no issues to date. I don't know if this helps in any way, but I run a local node and have been in touch with other node operators who have not experienced any issues. We are however using the latest version of dogecoind. |
@billym2k we switched to 1.5 for mining a long time ago, but we are still stuck on 1.5 issues with payouts, so we were running 1.4 for payouts. I panicked, shut everything down to investigate, and reported this first. |
I'm now rebuilding the dogechain.info database along with some new hardware. It'll be ready in an hour or so. That way at least the site is back on the right fork. |
@GlennMR will you be running dogechain on 1.5 from now on? |
@xnljfr the block explorer yes. The pool? Hell no, I had my time with 1.5 and it wasn't nice. |
I think as a community we need to figure out the pool issues with 1.5. Unfortunately, and I apologize, I'm far out of my depth in that area. There are clearly issues that some folks are running into, but no one has been able to identify a cause. |
Correct, I'm not saying 1.5 is a bad version by definition. It just doesn't scale well with pools or in other words: services that constantly bash the RPC. 1.4 is having no problems at this. |
Is Litecoin having the same issues if 1.5 was a direct fork of it? |
@GlennMR so the max locks setting worked out for you? I tried it on 3 different dev servers running 1.4.1 and none of them managed to get past the fork! |
On my pool, the specific issue with 1.5 and 1.5.1 was that the first payment in a batch of payments would go out with no issues. Any successive payment in the batch would return a transaction number but that transaction would never make it to the blockchain. We're running the latest MPOS from git. |
So it was fast initially but then slowed down? |
Yep. Calls to getbalance currently taking 4 seconds... with this getting longer with each call to it... now taking 20 seconds. |
So basically when people started making transactions that's when it got borked? @ummjackson I really think some logging in those loops would be a good idea and some kind of profiling / thread dump to see what the daemon is actually doing to waste CPU cycles |
Trying to isolate when it happened, but we’ve not had many sends in the past couple of hours, just lots of incoming payments. |
@GlennMR Implementing logging is a little easier said than done, but looking into it. @moolah-ch Please share your .conf file - need to confirm you're using rpcthreads flag etc. |
Currently on my phone, rpcthreads is set to 128 though. |
We have a possible fix that we'll merge into the pool branch in the next 24 hours for testing. Apart from lagginess, how are we looking on the 500 error problem? |
@moolah-ch - are you setting dbcache high? Important as this governs LevelDB memory usage. @GlennMR how are we on the 500 stalls? |
Seems to be running ok. I did disable auto payouts, re-enabled them... |
@ummjackson Initial impressions are good. Just sent out three transactions with hundreds of outputs using sendmany. Seems to be running good, transactions are confirmed. I was thinking, aren't these 500 issues and lag issues directly related? Seems like the PHP json api has a timeout built in. So I'd assume it to timeout because the daemon didn't respond in time, but payments still go through since the daemon got the RPC call. UPDATE: Running stable for about an hour now |
@GlennMR That's really good news - let me know if it stays stable. Pronoob still seems to be experiencing lag, but I've spoken to several Bitcoin devs and this is a known limitation of the daemon when you're throwing 1000s of RPC calls at it. A combination of using "sendmany" (or pulling directly from the coinbase in P2Pools case) and smart queueing before you hit the daemon are the optimal answer. Just to confirm @GlennMR - this is better than you were seeing on the master branch? Cheers, |
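To make the "sendmany plus queueing" suggestion concrete, here is a rough Python sketch that collapses a queue of pending payouts into a single `sendmany` JSON-RPC request body instead of issuing one `sendtoaddress` call per payout. The helper name and the addresses are hypothetical placeholders, and nothing is actually sent to a daemon.

```python
import json
from collections import defaultdict
from decimal import Decimal


def build_sendmany_request(payouts, from_account="", request_id=1):
    """Collapse (address, amount) payouts into one sendmany JSON-RPC body.

    Amounts owed to the same address are summed so each address appears once.
    """
    totals = defaultdict(Decimal)
    for address, amount in payouts:
        totals[address] += Decimal(str(amount))
    # JSON has no decimal type, so the summed amounts are converted to floats
    amounts = {address: float(total) for address, total in totals.items()}
    return {
        "jsonrpc": "1.0",
        "id": request_id,
        "method": "sendmany",
        "params": [from_account, amounts],
    }


# Placeholder addresses, not real Dogecoin addresses
queued = [("DAddrOne", "10.5"), ("DAddrTwo", "3"), ("DAddrOne", "1.5")]
request = build_sendmany_request(queued)
print(json.dumps(request))
```

Three queued payouts become one RPC call with two outputs, which is the kind of reduction in daemon load being described.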
@ummjackson Definitely, I feel semi-secure leaving the payout system on this branch. It's been running for an hour now, people are doing manual payouts, auto payouts, ... Daemon responds fast and transactions are getting confirmed. Obviously we should look at this in the long run, I'll keep you posted. |
@GlennMR Thanks for the feedback, it's promising. Looking forward to hearing how it goes after a bit longer :) |
@GlennMR Did you do a fresh pull with the latest commit or are you running on a build from yesterday? We made a small change to the sorting for transaction resends earlier today. |
@ummjackson The build from yesterday |
@GlennMR Perfect! Stay on that, we're reverting today's experimental commit. Let me know how yesterday's build goes. :) |
@ummjackson The cleaner build is slowing to a crawl after about 30 minutes and is reaching the "becoming unusable" stage. rpcthreads=128 |
@GlennMR RPC has a timeout value of 30 seconds by default, which you can change by setting rpctimeout in your dogecoin.conf file, and your web server might have a maximum execution time for scripts as well. |
@ummjackson Pool has been running stable on the latest one. That is using sendmany instead of sendtoaddress. However the fact that @moolah-ch is still having issues means there's still something up. @asadhaider Thanks, I was aware of the 30 second timeout. Seems reasonable to think that the timeout is what's causing the double payouts. |
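If the timeout really is behind the double payouts, the failure mode is that the daemon accepts the call, the client gives up waiting, and the payout script retries. One possible guard, sketched in Python under stated assumptions: tag each batch with an ID in the `comment` argument of `sendmany` and check `listtransactions` for that tag before sending again. `FakeDaemon` and `pay_once` are hypothetical names, and the fake daemon only imitates the "payment recorded, response lost" case; it is not how MPOS or any pool software works today.

```python
class FakeDaemon:
    """Stand-in for the wallet RPC: records the payment, then times out."""

    def __init__(self):
        self.transactions = []

    def sendmany(self, account, amounts, minconf, comment):
        # The payment is accepted by the wallet...
        self.transactions.append({"comment": comment, "amounts": amounts})
        # ...but the caller never sees the response.
        raise TimeoutError("no response within the RPC timeout")

    def listtransactions(self):
        return list(self.transactions)


def pay_once(daemon, payout_id, amounts):
    """Send a batch unless a transaction tagged with payout_id already exists."""
    if any(tx.get("comment") == payout_id for tx in daemon.listtransactions()):
        return "already-sent"
    try:
        daemon.sendmany("", amounts, 1, payout_id)
    except TimeoutError:
        # Outcome unknown: do not retry blindly, re-check on the next pass
        return "unknown"
    return "sent"


daemon = FakeDaemon()
first = pay_once(daemon, "batch-42", {"DAddrOne": 12.0})
second = pay_once(daemon, "batch-42", {"DAddrOne": 12.0})
print(first, second, len(daemon.transactions))
```

The second attempt sees the tagged transaction and skips the send, so the batch goes out once even though the first call appeared to fail.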
We use getbalance, getreceivedbyaddress, getnewaddress, listtransactions and sendfrom for the most part... |
@moolah-ch that's quite normal, I run the same on my online wallet service. I'll be switching my online wallet service to 1.5.2 to see the effects. |
As far as I can tell, the slowdown occurs (and remains) when one of the following occurs.
Not sure which it is, but this is what is appearing in the log in terms of slowdown times. Edit: I'm aware that neither of these should cause slowdown. |
As I was saying in #264 I think the "Unknown transaction" messages are a symptom but not the problem itself. Those warnings appear in any transaction created by a p2pool node when it creates a block. The output is fine (it's a no-op), but those transactions typically have hundreds of outputs which CWalletTx::GetAmounts has to scan versus the 2 or so found in more typical transactions. |
Decided to just post this here rather than opening a new issue, but I can open a new issue if that would be better. I've not been able to leave this slow wallet stuff alone and I have a solution that can make some of these RPC commands run perhaps more than 100 times faster. But before I make any changes it would be really helpful to know how pool / wallet operators are using commands like getbalance, sendfrom, sendmany, etc. These commands accept an account, but the account stuff doesn't work the way I expected. For example I assumed sendfrom/sendmany would send coins from unspent outputs belonging to the account that is specified, but that's not what happens. It basically makes an accounting notation on the transaction that debits the account, but spends whatever outputs it feels like. Maybe this is useful if all of your transactions are generated through RPC and you're careful to credit everything correctly, but if you're like me and have made transactions through the GUI then none of these balances have any bearing on reality. In my case listaccounts shows all of my "accounts" with positive values and the default account ("") with a giant negative number. Since one of my accounts has twice as many coins as I have in my wallet, sendfrom/sendmany would happily attempt to create a transaction greater than the number of coins I possess. Later on in CWallet::CreateTransaction it would fail when it figured out I actually don't have those coins. So I'm curious, is anyone actually using the accounts stuff or are we mostly just wanting to know the total balance of the wallet and not worry about this account stuff when creating a transaction with sendfrom/sendmany? If so we can modify those commands so the account is ignored. Alternatively we change these functions so these balances that are displayed are based on unspent transactions. 
Or lastly, the option that has the least potential to cause unwanted side-effects, but would require pool operators to modify their code to take advantage of it, is to create another set of commands that don't have an account parameter. Something like fastsendfrom, fastsendmany, etc. Is that something that would be easy to change in MPOS, etc? What say ye? |
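The account behaviour described above can be illustrated with a toy model. This is a Python sketch of the described behaviour only, not the wallet's actual code: account balances are a bookkeeping ledger, while spending draws on whatever unspent outputs the wallet holds. `ToyWallet` and its numbers are made up for illustration.

```python
class ToyWallet:
    """Toy model: accounts are a ledger; spending uses any unspent outputs."""

    def __init__(self, unspent, ledger):
        self.unspent = list(unspent)  # real coins: amounts of unspent outputs
        self.ledger = dict(ledger)    # bookkeeping only: account -> balance

    def sendfrom(self, account, amount):
        # First check is against the ledger, which may not match reality
        if self.ledger.get(account, 0) < amount:
            raise ValueError("account ledger balance too low")
        # Only when building the transaction does the real total matter
        if sum(self.unspent) < amount:
            raise ValueError("wallet does not hold enough coins")
        spent = 0
        while spent < amount:
            spent += self.unspent.pop(0)  # spends outputs regardless of account
        if spent > amount:
            self.unspent.append(spent - amount)  # change output
        self.ledger[account] -= amount  # the "accounting notation"


# The wallet holds 100 coins, yet "pool" shows 200 and "" is hugely negative
wallet = ToyWallet(unspent=[50, 50], ledger={"": -300, "pool": 200, "savings": 200})
try:
    wallet.sendfrom("pool", 150)
    outcome = "sent"
except ValueError as err:
    outcome = str(err)
print(outcome)

wallet.sendfrom("pool", 60)
print(wallet.unspent, wallet.ledger["pool"])
```

The 150-coin send passes the ledger check and only fails once the real coins are counted, which mirrors the late failure in CWallet::CreateTransaction mentioned above.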
Disregard last post. Working on a solution that requires no changes to existing pool / online wallet software. |
I sent 16,327.02637274, which does not show up in the blockchain. I am running wallet v1.5.2. |
@dylanl Please create a new issue, as this does not match the topic and the thread is misleading. |
Hello everyone :)
Reddit is constantly going down, so let's start a discussion here about what we're going to do.
So far what we know is:
So far what we don't know is: