Base58 en-/de-coding #1828

Closed
wants to merge 5 commits into
from

7 participants

@roques

The base58 codec uses full length bignum operations for each digit and strchr(pszBase58, *p) to find the value of each base58 digit. I've cleaned up the implementation by:

  • using bignum operations with word-sized operands
  • using a pre-computed decoding table
  • adding comments
  • getting rid of useless copy operations

As with my other recent pull-requests I think I've made the source easier to understand and the speedup is just a side benefit.

However, I could not restrain myself and added a last commit which is a much faster version, converting 5 or 10 digits at a time (depending on sizeof(BN_ULONG)). Esulting in an overall speedup of over 10 times.

If your prime interest is in simple code, please pull all but the last commit.
If you don't mind a bit longer, but much faster code, pull the whole branch.

@gavinandresen
Bitcoin member

My primary interest is neither simple code nor faster code. It is secure, backwards-compatible code.

Given the level of scrutiny the existing bitcoin codebase has had so far, I am very reluctant to commit changes to core code unless the benefits of the changes clearly outweigh the risks that the changes will have unintended side effects or open up new security holes.

That's not a "no" for these changes, but I personally don't think the benefits outweigh the cost of the hour or so it would take me to thoroughly review these changes and convince myself they're not dangerous. If you can get some other coders I trust to spend the time reviewing, then great...

@BitcoinPullTester

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/80173de16776c468b187348f74cd16245c3e0416 for binaries and test log.

@laanwj
Bitcoin member

I'm divided on this one. Some changes do make the code shorter and easier to read / more maintainable, others are somewhat more doubtful shortcuts made for speed.

But for example reducing the borderline-crazy loop in DecodeBase58 to:

while ((v = pszRBase58[*p]) >= 0)
 {
    bn *= 58;
    bn += v;
    p++;
 }

is a pretty nice feat.

Is there a usecase for fast base58?

BTW: thanks for adding tests, we'll surely pull that one

@roques

Thanks for the reviews! Based on them I've pushed some improvements.

  • added a comment explaining base58-encoding and how leading 0-bytes are encoded to preserve them. (btw. should it be base-58 or base58 in comments?)
  • corrected Hungarian notation of pszRBase58 to rgi8RBase58 and declared it const (Is there a style guide? I'm unsure when it's OK not to use Hungarian notation)
  • synchronized rgi8RBase58's understanding of isspace() with Posix (0xa0 does not count as space, whereas "\f" formfeed does)
  • fixed the base59 typo
@roques
@BitcoinPullTester

Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/8a4bff050503b75c35ad3acd08d78be18c6b3da6 for binaries and test log.

@luke-jr
Bitcoin member

Needs rebase.

roques added some commits Sep 11, 2012
@roques roques EncodeBase58: don't convert little- to big-endian just to undo it 8ed9c48
@roques roques CBigNum: use BN_foo_word() where possible 1d34bb3
@roques roques add test of DecodeBase58 skipping whitespace 749e898
@roques roques reimplement De-/En-codeBase58()
* Make use of faster word-sized bignum operators.
* Use table-lookup instead of linear search to
  find the value of a base58-digit.
* Explain base-58 encoding
7cb83a0
@roques roques optimize De-/En-codeBase58()
Convert chunks of digits at a time.
Bigger, but still readable code,
Additional speedup ~4.5
eb7000f
@roques

Rebased on master as of a few minutes ago.

@sipa
Bitcoin member

I do appreciate the readability of the code, and to a lesser extent the possibility for optimization. But it seems you're trying to bypass CBigNum wherever possible, directly calling OpenSSL routines. I agree CBigNum is only a thin wrapper, though, so perhaps we should see it as "C++ interface for OpenSSL's bn" and not "abstract C++ bignum type".

@jgarzik
Bitcoin member

agree w/ @sipa

@BitcoinPullTester

Automatic sanity-testing: FAILED BUILD/TEST, see http://jenkins.bluematt.me/pull-tester/eb7000f6e5be6cbc9b238bdd781ecb25496152ca for binaries and test log.

This could happen for one of several reasons:
1. It chanages paths in makefile.linux-mingw or otherwise changes build scripts in a way that made them incompatible with the automated testing scripts
2. It does not build on either Linux i386 or Win32 (via MinGW cross compile)
3. The test suite fails on either Linux i386 or Win32
4. The block test-cases failed (lookup the first bNN identifier which failed in https://github.com/TheBlueMatt/test-scripts/blob/master/FullBlockTestGenerator.java)

@gavinandresen
Bitcoin member

Closing this as "risks outweigh benefits"

@luke-jr
Bitcoin member

Might be good to dig the tests out of this at least.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment