Support of CryptoNight v8 ReverseWaltz based on CryptoNight V8 #234

EDDragonWolf · 2019-02-26T18:42:40Z

Added support for the following tweaks of the CryptoNight hashing algorithms:

CryptoNight Waltz - equal to CryptoNight, but with 3/4 of its iterations (variant=0, modifier=1)
CryptoNight v7 Waltz - equal to CryptoNight v7, but with 3/4 of its iterations (variant=1, modifier=1)
CryptoNight v8 Waltz - equal to CryptoNight v8, but with 3/4 of its iterations (variant=2, modifier=1)
CryptoNight v8 Reverse Waltz - equal to CryptoNight v8, but with 3/4 of its iterations and with a reversed shuffle operation (variant=2, modifier=2)
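
As a rough illustration of how the (variant, modifier) pairs above could map to loop parameters, here is a minimal sketch. The enum names and helper functions are hypothetical (only the numeric values 0, 1 and 2 come from the list), and the base count of 2^19 main-loop iterations for unmodified CryptoNight is an assumption, not something stated in this PR.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names; only the numeric values come from the list above. */
enum { MODIFIER_NONE = 0, MODIFIER_WALTZ = 1, MODIFIER_REVERSE_WALTZ = 2 };

/* Assumed base count: 2^19 main-loop iterations for unmodified CryptoNight. */
#define CN_BASE_ITERATIONS ((size_t)1 << 19)

/* Both Waltz modifiers run 3/4 of the base iteration count. */
static size_t cn_iterations(int modifier)
{
    assert(modifier >= MODIFIER_NONE && modifier <= MODIFIER_REVERSE_WALTZ);
    return modifier == MODIFIER_NONE ? CN_BASE_ITERATIONS
                                     : CN_BASE_ITERATIONS / 4 * 3;
}

/* Only Reverse Waltz, which is defined on variant 2, reverses the shuffle. */
static int cn_reverse_shuffle(int variant, int modifier)
{
    return variant == 2 && modifier == MODIFIER_REVERSE_WALTZ;
}
```

Under these assumptions, `cn_iterations(MODIFIER_WALTZ)` yields 393216 instead of 524288, and `cn_reverse_shuffle` is true only for variant=2, modifier=2.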

We plan to use CryptoNight v8 Reverse Waltz as our new PoW algorithm.

Note: Hard Fork 12 is scheduled for block 299200 (~2019-03-07T05:00:00+00).

Closes #208
Closes #223
Closes #224


EDDragonWolf · 2019-02-26T19:04:26Z

@jagerman, I was not satisfied with the test I added in the previous comment, so I added a performance test here.
Our results:

Note:
CNv8-origin - the current implementation of cn_slow_hash from master, without any new code.
CNv8 ReverseWaltz #1 - the implementation in this PR.
CNv8 ReverseWaltz #2 - an implementation with a separate version of the store operation for each variant, without additional reads. Something like:

#define VARIANT2_SHUFFLE_ADD_SSE2(base_ptr, offset) \
  do if (variant >= 2) \
  { \
    __m128i chunk1 = _mm_load_si128((__m128i *)((base_ptr) + ((offset) ^ 0x10))); \
    __m128i chunk2 = _mm_load_si128((__m128i *)((base_ptr) + ((offset) ^ 0x20))); \
    __m128i chunk3 = _mm_load_si128((__m128i *)((base_ptr) + ((offset) ^ 0x30))); \
    if (modifier & CN_MODIFIER_REVERSE) { \
      _mm_store_si128((__m128i *)((base_ptr) + ((offset) ^ 0x10)), _mm_add_epi64(chunk1, _b1)); \
      _mm_store_si128((__m128i *)((base_ptr) + ((offset) ^ 0x20)), _mm_add_epi64(chunk3, _b)); \
      _mm_store_si128((__m128i *)((base_ptr) + ((offset) ^ 0x30)), _mm_add_epi64(chunk2, _a)); \
    } else { \
      _mm_store_si128((__m128i *)((base_ptr) + ((offset) ^ 0x10)), _mm_add_epi64(chunk3, _b1)); \
      _mm_store_si128((__m128i *)((base_ptr) + ((offset) ^ 0x20)), _mm_add_epi64(chunk1, _b)); \
      _mm_store_si128((__m128i *)((base_ptr) + ((offset) ^ 0x30)), _mm_add_epi64(chunk2, _a)); \
    } \
  } while (0)

There is some difference between CNv8 and CNv8-origin, and between the Waltz variants, but to me it looks more like measurement error than a performance reduction.
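
To make the difference between the two store orders easier to see, here is a scalar sketch of the same data movement without SSE2 intrinsics. The type `chunk_t` and the function `shuffle_add` are illustrative names, not part of the PR; the index `idx ^ 1`, `idx ^ 2`, `idx ^ 3` stands in for `offset ^ 0x10`, `^ 0x20`, `^ 0x30` within one 64-byte group of four 16-byte chunks.

```c
#include <assert.h>
#include <stdint.h>

/* One 16-byte chunk viewed as two 64-bit lanes, as _mm_add_epi64 treats it. */
typedef struct { uint64_t lane[2]; } chunk_t;

/* Lane-wise 64-bit addition with wraparound. */
static chunk_t add64(chunk_t x, chunk_t y)
{
    chunk_t r = { { x.lane[0] + y.lane[0], x.lane[1] + y.lane[1] } };
    return r;
}

/*
 * line: the four chunks of one 64-byte group.
 * idx:  which of the four chunks was just accessed (0..3); it is left untouched.
 * reverse == 0 follows the original CNv8 order, reverse != 0 the Reverse Waltz order.
 */
static void shuffle_add(chunk_t line[4], unsigned idx,
                        chunk_t a, chunk_t b, chunk_t b1, int reverse)
{
    assert(idx < 4);
    const chunk_t c1 = line[idx ^ 1];
    const chunk_t c2 = line[idx ^ 2];
    const chunk_t c3 = line[idx ^ 3];
    line[idx ^ 1] = add64(reverse ? c1 : c3, b1); /* forward: c3, reverse: c1 */
    line[idx ^ 2] = add64(reverse ? c3 : c1, b);  /* forward: c1, reverse: c3 */
    line[idx ^ 3] = add64(c2, a);                 /* same in both orders */
}
```

The only difference between the two modes is which of the first and third chunks receives `_b1` and which receives `_b`; the third store is identical.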

jagerman · 2019-02-26T19:44:00Z

@EDDragonWolf - there is another gain to be had here: move the conditional out of the hashing loop by "loop hoisting" it, so that inside the loop it becomes a compile-time constant. This macro would work, I think:

/* REVERSE_STEP must be a literal 0 or 1 so the ternaries fold at compile time. */
/* chunk1 and chunk3 swap their source offsets depending on REVERSE_STEP. */
#define VARIANT2_SHUFFLE_ADD_SSE2_REVERSE(base_ptr, offset, REVERSE_STEP) \
  do if (variant >= 2) \
  { \
    __m128i chunk1 = _mm_load_si128((__m128i *)((base_ptr) + ((offset) ^ ((REVERSE_STEP) ? 0x10 : 0x30)))); \
    __m128i chunk2 = _mm_load_si128((__m128i *)((base_ptr) + ((offset) ^ 0x20))); \
    __m128i chunk3 = _mm_load_si128((__m128i *)((base_ptr) + ((offset) ^ ((REVERSE_STEP) ? 0x30 : 0x10)))); \
    _mm_store_si128((__m128i *)((base_ptr) + ((offset) ^ 0x10)), _mm_add_epi64(chunk1, _b1)); \
    _mm_store_si128((__m128i *)((base_ptr) + ((offset) ^ 0x20)), _mm_add_epi64(chunk3, _b)); \
    _mm_store_si128((__m128i *)((base_ptr) + ((offset) ^ 0x30)), _mm_add_epi64(chunk2, _a)); \
  } while (0)

You'll also need to pass this REVERSE_STEP constant through the post_aes macro as a macro argument, then add a separate loop (with a compile-time constant!) for the reverse version:

    if(useAes)
    {

        if(modifier & CN_MODIFIER_REVERSE) {
            for(i = 0; i < iters; i++)
            {
                pre_aes();
                _c = _mm_aesenc_si128(_c, _a);
                post_aes(1);
            }
        }
        else {
            for(i = 0; i < iters; i++)
            {
                pre_aes();
                _c = _mm_aesenc_si128(_c, _a);
                post_aes(0);
            }
        }
    }

It's quite possible that the compiler is already doing this optimization, but given the size of the code in the loop it may well not be (or may only do so at -O3).
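
The idea can be shown in isolation with a toy loop body. This is a sketch only: `MIX_STEP`, `mix_branch_inside` and `mix_hoisted` are hypothetical names and the arithmetic is a stand-in for the real hashing step. Compilers usually call this transformation loop unswitching (GCC's `-funswitch-loops`); writing it out by hand with a literal macro argument does not depend on the optimizer choosing to do it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy loop body. REVERSE is always a literal 0 or 1 at the call site,
 * so the ternary is folded away in each expanded copy. */
#define MIX_STEP(state, i, REVERSE) \
  do { \
    (state) = (REVERSE) ? ((state) ^ (uint64_t)(i)) * 31u \
                        : ((state) * 31u) ^ (uint64_t)(i); \
  } while (0)

/* Reference version: the runtime flag is tested on every iteration. */
static uint64_t mix_branch_inside(uint64_t state, size_t iters, int reverse)
{
    for (size_t i = 0; i < iters; i++) {
        if (reverse)
            MIX_STEP(state, i, 1);
        else
            MIX_STEP(state, i, 0);
    }
    return state;
}

/* Hoisted version: the flag is tested once, and each loop body
 * is specialized for a constant REVERSE value. */
static uint64_t mix_hoisted(uint64_t state, size_t iters, int reverse)
{
    if (reverse) {
        for (size_t i = 0; i < iters; i++)
            MIX_STEP(state, i, 1);
    } else {
        for (size_t i = 0; i < iters; i++)
            MIX_STEP(state, i, 0);
    }
    return state;
}
```

Both functions compute the same result for any input; the hoisted form trades a duplicated loop body for a branch-free inner loop, which is the same trade-off as the two `post_aes(1)` / `post_aes(0)` loops above.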


EDDragonWolf · 2019-02-27T12:20:01Z

Thanks, @jagerman. I had not thought about optimizing it this way. I added changes based on your suggestion, with minor fixes.

EDDragonWolf · 2019-02-27T18:11:34Z

We have also thought about using -O3 as the default compilation option. However, we need to check that it has no side effects. I created an issue for it (#235). @jagerman, if you are interested and have time, please investigate it and add your suggestions there. Thanks.

jagerman · 2019-02-27T23:48:11Z

I don't like -O3 unless it shows tangible benefits -- and very often it doesn't. Because it tends to make object code much larger, a lot of code bases end up slower rather than faster under -O3.

EDDragonWolf · 2019-02-28T10:13:41Z

@jagerman, please add your comment to issue #235. Any opinion is very important to us.

SChernykh · 2019-03-04T18:34:52Z

Any test pool available?

EDDragonWolf · 2019-03-05T22:50:39Z

@SChernykh
Sorry for the delay. The HF on testnet was badly delayed, and it forked only now.
Testnet mining pool: http://3.83.140.241/
If you need some testnet wallet addresses:
FAaegMUw5YV9GcwNGwJsyLdc1jkVRnNWcX3zEd5e1Nmci8HmGQGt3J3NUjeWi19WQi9t52mAwxHCXUSkcufmmU7CMVpjACG
FB4ZejF4V3w8qhRgxQVENyKc8aCmgP4whaUQhZuw7zwnb5rdgEKFq1G5gbGnhUCBXKPHF3bYLDqZD5e7JG7i2Wf3LwNmXDu
F8WjfGHDBqkhtSy674bz1tjaBooPnFEgvF92ooYbrCCNCagzbT8SxogS2PiW3LKuEMhGrE6V2YJP3CgLeENd53JZLExetb6

Added support of CryptoNight v8 Reverse Waltz (named cryptonight_v8_reversewaltz here): equal to CryptoNight v8, but with 3/4 of its iterations and with a reversed shuffle operation. We plan to use CryptoNight v8 Reverse Waltz as the new PoW algorithm for Graft (graft-project/GraftNetwork#234).

Rebased version of fireice-uk#2261. Added support of CryptoNight v8 Reverse Waltz (named cryptonight_v8_reversewaltz here): equal to CryptoNight v8, but with 3/4 of its iterations and with a reversed shuffle operation. We plan to use CryptoNight v8 Reverse Waltz as the new PoW algorithm for Graft (graft-project/GraftNetwork#234).


EDDragonWolf added 3 commits January 28, 2019 02:26

Added support of CryptoNight Waltz based on CryptoNight MoneroV8

a38cac4

Added support od reverse chunk operation and CryptoNight ReverseWaltz tweak

d6c329e

Added performance tests for CNv8, CNv8 Waltz, CNv8 ReverseWaltz

244a540

EDDragonWolf requested review from mbg033 and bitkis February 26, 2019 18:42

This was referenced Feb 26, 2019

Support of CryptoNight v8 Waltz and CryptoNight v8 ReverseWaltz MoneroOcean/node-cryptonight-hashing#25

Merged

Support of CryptoNight v8 ReverseWaltz fireice-uk/xmr-stak#2261

Closed

Optimized cn reverse implementation with compiler-time constant for post_aes macro

8cb99f0

EDDragonWolf mentioned this pull request Mar 1, 2019

Support of CryptoNight v8 ReverseWaltz xmrig/xmrig#963

Closed

Scheduled HF12 and changed the release version

dfd9b6d

bitkis approved these changes Mar 4, 2019


EDDragonWolf merged commit fec3ac9 into master Mar 4, 2019

EDDragonWolf deleted the feature/cryptonight_waltz_support branch March 4, 2019 12:55

psychocrypt mentioned this pull request Mar 7, 2019

Support of CryptoNight v8 ReverseWaltz fireice-uk/xmr-stak#2282

Merged
