Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VIP: 'Bigmath' functions #1086

Open
charles-cooper opened this issue Nov 16, 2018 · 11 comments
Open

VIP: 'Bigmath' functions #1086

charles-cooper opened this issue Nov 16, 2018 · 11 comments
Labels
VIP: Deferred VIP is not scheduled to move forward at this time

Comments

@charles-cooper
Copy link
Member

Simple Summary

Add 'bigmath' functions which make it easier for contract authors to do more precise arithmetic.

Abstract

Add functions for dealing more precisely with overflow and underflow, especially where the implementations are not obvious. Specifically, a bigmul function which returns the result of multiplying two 256-bit uints as a pair of 256-bit uints (emulating a 512-bit uint), and a bigdiv function which returns the underflow portion of dividing the high word of a 512-bit uint into a 256-bit uint.

Notes

  • I'm open to a bigadd and bigsub, but these are much more obvious for a contract author to implement.
  • I'm torn as to whether bigdiv should be 'minimal' and only return the underflow portion as stated, or whether it should perform the full division, taking a pair of uint256 and returning a pair of uint256.
  • Perhaps divmod and/or bigmod should be added to the mix.

Motivation

Precise arithmetic is hard. For instance, the naive method to calculate multiplication by a fraction is subject to overflow attacks (consider 10 * 254/255 using 8-bit uints returns 0, instead of 9). Contract authors need a way to keep track of over and underflow without necessarily reaching for a full-fledged bignum implementation. Furthermore, efficient implementations (which take O(1) operations) are not obvious (the naive method being long multiplication or division). The two proposed functions provide a safe, efficient way to keep track of over and underflow, and in theory also provide enough machinery that higher-level abstractions like floating point arithmetic or bignum implementations could be built on top of them.

As an example, multiplying by a rational number could be implemented using these functions as follows:

# Sample implementation
def mul_frac(a: uint256, numerator: uint256, denominator: uint256) -> uint256:
  (r1, r0) = bigmul(a,numerator)
  assert r1 / denominator == 0 # overflow protection
  return bigdiv(r1, denominator) + r0 / denominator # Could be one bit of rounding error here

Specification

Implement the following interface, either as builtin functions or as part of a 'standard library'.

# Returns the result of multiplying two uint256s as an unsigned 512-bit integer, represented as a pair of uint256s.
# The high word should be the first value of the returned pair, and the low word should be the second value of the returned pair.
def bigmul(a: uint256, b: uint256) -> (uint256, uint256):

# Returns the result of ((a << 256) / b) % (2**256). If 512-arithmetic were represented using pairs of uint256, this would be the low word of the quotient of dividing the high word of a by b.
def bigdiv(a: uint256, b: uint256) -> uint256:

The following C code can be taken as a suggestion of how to implement this efficiently.

#include <stdio.h>
#include <stdint.h>

uint16_t bigmul1(uint8_t a, uint8_t b) {
  uint16_t a1 = a;
  return a1 * b;
}

// Return the result of multiplying two uint8_t as uint16_t.
uint8_t mulmod(uint8_t a, uint8_t b, uint8_t k) {
  uint16_t a1 = a;
  uint16_t c = a1*b;
  return (uint8_t)(c % k);
}

// Same thing but without depending on native uint16 multiplication
// Credits: https://medium.com/wicketh/mathemagic-full-multiply-27650fec525d
// Interestingly, the author claims that using this function, and checking the high-word for overflow supposedly takes less gas than a standard safeMul
uint16_t bigmul2(uint8_t a, uint8_t b) {
  uint8_t r0 = a * b; // i.e. mulmod(a,b, 256)
  uint8_t r1 = mulmod(a,b, UINT8_MAX);
  r1 = (r1 - r0) - (r1 < r0 ? 1 : 0); // chinese remainder

  uint16_t r = r1; // r1 is the high word, r0 is the low word.
  return (r << 8) + r0;
}

// Calculate ((a << 8) / b) mod 256, using native words
uint8_t bigdiv1(uint8_t a, uint8_t b) {
  uint16_t a1 = a;
  return (a << 8) / b;
}

// Same thing but without depending on native uint16_t division
uint8_t bigdiv2(uint8_t a, uint8_t b) {
  uint8_t m = UINT8_MAX % b;
  uint8_t r = UINT8_MAX / b;
  return r * a + (m + 1) * a / b;
}

// Run some sample cases, test bigdiv2 against bigdiv1
int main() {
  printf("%d\n", bigmul1(100,5));
  //printf("%d\n", mulmod(100,5, 7));
  printf("%d\n", bigmul2(100,5));
  printf("%d\n", bigdiv1(1,251));
  printf("%d\n", bigdiv2(1,251));
  printf("%d\n", bigdiv1(1,255));
  printf("%d\n", bigdiv2(1,255));
  printf("%d\n", bigdiv1(1,1));
  printf("%d\n", bigdiv2(1,1));
  printf("%d\n", bigdiv1(1,2));
  printf("%d\n", bigdiv2(1,2));
  printf("%d\n", bigdiv1(1,3));
  printf("%d\n", bigdiv2(1,3));
  for (int i = 0; i < 256; i++) {
    for (int j = 1; j < 256; j++) {
      int r1 = bigdiv1(i,j);
      int r2 = bigdiv2(i,j);
      if (r1 != r2) {
          printf("Bad pair (%d,%d) -> %d vs %d \n", i,j, r1,r2);
      }
    }
  }
  printf("end\n");
}

Backwards Compatibility

Existing code may have collisions with the function names.

Copyright

Copyright and related rights waived via CC0

@fubuloubu
Copy link
Member

fubuloubu commented Nov 17, 2018

2**512 is astronomically big. It is much harder to make safety guarantees once you go down this path of multiple-register arthimitic. Do you have a use case you can point to that requires doing math that large?

P.S. we have a decimals type.

@charles-cooper
Copy link
Member Author

Sure, any time you want to do accounting precisely. Consider that ERC20 tokens typically have 18 decimal places (roughly 60 bits). Once you start wanting to trade at different prices, make partial fills, do ring trades, the issue of bit truncation starts to become important.

Keep in mind, 2**256 is really big but we still check for overflow because not accounting for overflow results in bugs, and is an attack vector. This is just a generalization for more complex operations.

@fubuloubu
Copy link
Member

fubuloubu commented Nov 17, 2018

Every arthimetic operation performed in Vyper is done by first validating it cannot overflow/underflow (reverting if it does), similar to the SafeMath library in common use on Solidity contracts. So that is not a concern.

You are indeed correct that most ERC20 tokens use 18 decimal places as their default, mostly by convention. That leaves over 60 decimal places for the whole number portion, which is much larger than any reasonable project I've encountered. These numbers are large enough to be very, very precise for most reasonable use cases, but if you have a specific example for me to examine, I would like to see it.

These data types are much larger than even what IEEE 64-bit floats can account for, and are much more precise because they are tracked as hard integers.

@fulldecent
Copy link
Contributor

Here is a use case we can study:

https://github.com/compound-finance/compound-money-market

You can deposit funds, maybe a few Ether, and you earn interest PER BLOCK. This might be closer to quantization errors than other projects.


I believe adding functions to the standard library is always a huge deal. Instead, there should be a "contribs" package with all kinds of stuff like this. No need to justify adding something to contribs, just put it in. Then after the market clearly shows this is useful it gets added to the standard library.

@jacqueswww jacqueswww added VIP: Discussion Used to denote VIPs and more complex issues that are waiting discussion in a meeting invalid labels Nov 27, 2018
@jacqueswww
Copy link
Contributor

As discussed in call, we will leave this ticket open and marked as 'invalid' until we have a library interface we can use. See #885 #483 and #584 for details.

@nrryuya
Copy link
Contributor

nrryuya commented Jan 5, 2019

One use-case of big integer arithmetic is RSA Accumulator.
LayerXcom/verified-vyper-contracts#70

@vbuterin
Copy link
Contributor

IMO if Vyper is going to go down this path, you should consider instead just adding bigmath functions that use bytes[n] variables and the MODEXP precompile. So possibly:

  • bigadd(x, y): does a loop that walks through x and y 32 bytes at a time, using ADD and carry
  • bigsub(x, y): similar to bigadd but does subtraction; returns an error if the result is less than zero
  • bigmul(x, y): uses MODEXP to compute x * y = ((x+y)^2 - (x-y)^2) / 4
  • bigmodadd(x, y, m): like bigadd but with a modulus
  • bigmodmul(x, y, m): like bigmul but with a modulus
  • bigmod(x, y): uses MODEXP(x, 1, y)
  • bigdiv(x, y): does x//y

You'd probably want to implement this via a library, similar to the in-Vyper RLP decoding feature.

BTW if you want an efficient way to get the high order bits of 512-bit multiplication, one possible EVM-specific trick is something like (x*y) // 2**256 = mulmod(x, y, 2**256 - 1) - (x*y) (but I haven't verified this, there may be edge cases where it doesn't work so you might need a couple extra clauses).

@charles-cooper
Copy link
Member Author

@vbuterin Thanks for the feedback! I was unaware of the MODEXP precompile and it would probably be a better fit for RSA accumulation. In fact, exposing MODEXP as a library function could be a good thing to expose generally although we would need to think about the API to expose for such arbitrary-width integer types.

BTW, the edge case you are looking for in the 512-bit multiplication is the Chinese remainder expression in the C code I added above - I tried to write it in a way which 'looks' like EVM code.

...
uint8_t r0 = a * b; // i.e. mulmod(a,b, 256)
uint8_t r1 = mulmod(a,b, UINT8_MAX);
r1 = (r1 - r0) - (r1 < r0 ? 1 : 0);
...

@charles-cooper
Copy link
Member Author

Another use case for bigdiv - calculating 1/x.

@thibauld
Copy link

thibauld commented Jul 9, 2019

Bigmath would be very useful for projects implementing bonding curves (linear and non-linear). In bonding curves, if the buy slope is very small, the token supply and the reserve can quickly become very large and the calculus required often involve to power 2 or power 3 these numbers. We are currently having the issue while implementing the continuous organization reference smart-contract.

@jacqueswww
Copy link
Contributor

jacqueswww commented Jul 9, 2019

@thibaild checkout this file https://github.com/andytudhope/Recollections/blob/master/vyper/math/math.vy for a potential power function in vyper.

@charles-cooper charles-cooper added VIP: Deferred VIP is not scheduled to move forward at this time and removed invalid VIP: Discussion Used to denote VIPs and more complex issues that are waiting discussion in a meeting labels Apr 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
VIP: Deferred VIP is not scheduled to move forward at this time
Projects
None yet
Development

No branches or pull requests

7 participants