Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a Decimal object #17648

Open
asmeurer opened this issue Sep 24, 2019 · 27 comments
Open

Add a Decimal object #17648

asmeurer opened this issue Sep 24, 2019 · 27 comments
Labels

Comments

@asmeurer
Copy link
Member

It would be useful to have a Decimal object that works similar to Float but works with base-10 arithmetic. It can use the standard library decimal library as a backend, to do all the hard work, similar to how Float uses mpmath. You can already use decimal if you only want to do basic arithmetic, but there's no easy way to combine that with symbolic expressions without converting the decimal numbers to another representation (either Float or Rational).

A problem would be how to handle evaluation outside of basic arithmetic. I think we can use the Float evaluation, making sure to return enough digits. I'm curious if @fredrik-johansson has any thoughts on this.

@oscarbenjamin
Copy link
Contributor

As long as the decimal type has the same kind of rounding problems that Float has I don't think many of our users would think it very significant. What I do think a lot of our users want is a kind of number that is input as (and displays as) decimal while having exact arithmetic in calculations. I constantly see users in issues and on StackOverflow using e.g. 0.5 or 9.81 without realising the effect that is has on solve/polys/etc compared to using S.Half or Rational('9.81').

@asmeurer
Copy link
Member Author

Decimal still rounds, but it's in a more expected way. I think it's intuitive to people that decimal results are correct but rounded to the n-th digit, whereas for floats they sort of work like this but not quite, because the rounding happens in base 2, and most decimal numbers aren't exactly representable. Decimal numbers are also used in real life applications like finance where you do want rounding, but things need to be done in base 10 for full correctness.

Outside of basic arithmetic, you have to round or else stay unevaluated, as the decimal expansions are infinite in general. So in that sense the kind of decimal you describe is identical to a Rational, except for the printing. So you could get what you want by using a custom printer for Rational, or a shallow subclass if you want to mix and match.

The other problem is that even with basic arithmetic, without rounding (i.e., rational numbers) the numbers grow in size with each operation. A rounded decimal, just like a rounded float, always takes up a fixed amount of memory for a given number of digits.

@oscarbenjamin
Copy link
Contributor

The purpose of the decimal module is to provide precise decimal rounding in e.g. certain financial calculations but we can't support that in SymPy since users don't have that level of control over the elementary arithmetic that takes place (e.g. in Add).

What I see as the potential usage for SymPy is that a user can input a number in decimal say as

from decimal import Decimal as D
x = D('9.81')

and then pass that number exactly to SymPy exactly with no rounding. The benefits of exactness are lost here though if SymPy then treats the number like a Float and allows rounding in subsequent calculations.

Elementary arithmetic in decimal floating point can be exact provided you don't divide by anything other than 2 or 5 (or any multiples of). SymPy can prevent the inexactness by doing e.g. 0.1/30->0.01/3 so that inexact divisors are not kept unevaluated.

@sylee957
Copy link
Member

sylee957 commented Sep 25, 2019

I think that computation between machine precision floating point numbers should follow a completely different algebraic construct, than the symbolic rational or real numbers, because the additions between IEEE floating numbers may not always follow the associativity law.

I think that it would not be that much difficult to combine different algebras together in SymPy.
For example, things can always follow the mpmath convention if mpf + mpf or python decimals if decimal + decimal, but type casting may only need to be done if mpf + decimal or such.

I see there are a lot of different number types in polys.domains, so some recipe for decimal can go there.

@smichr
Copy link
Member

smichr commented Sep 28, 2019

pass that number exactly to SymPy exactly with no rounding

This is what D = lambda s: S(s, rational=True); D('9.81') does. If the issue is that you always want to see the number rounded to 2 decimal places, then perhaps the new number class could be a subclass of Rational which retains the rounding position and displays the number to that position, e.g. 2*D('9.81')would give19.62`.

@oscarbenjamin
Copy link
Contributor

perhaps the new number class could be a subclass of Rational

Yes, it could be. I think if SymPy was going to do anything special for Decimal then it should sympify decimals into something like that.

It wouldn't need to be a subclass of Rational to be exact though. Large/small numbers can be more efficiently represented as a*10**b rather than in dense Rational format. The point is just no rounding so the internals of SymPy can cope with it.

@smichr
Copy link
Member

smichr commented Sep 29, 2019

just no rounding so the internals of SymPy can cope with it

I can understand no rounding in all but the division case: what do you have in mind for D('0.1')/3?

@asmeurer
Copy link
Member Author

So I think there's two separate things here.

  1. Rational printed as a decimal expansion, including printing an overbar for repeated decimals. This is strictly a printing issue. The object itself would still be just Rational and behave as Rational does. So for instance sqrt(1/2) would print as (0.5)**0.5 or something similar (but not as 1.414...).

  2. A Decimal class, which is a base-10 floating point equivalent of Float. It would have a fixed (but arbitrarily large) precision, and automatically numerically evaluate in certain contexts. The arithmetic would only be exact in cases where it happens to be exact, and in general things round. It would wrap the decimal.Decimal object, and use mpmath for functions that aren't implemented by decimal. It would basically act exactly like Float, except certain rounding behaviors would be more expected because the printed representation would correspond exactly to the internal representation.

Both are potentially useful. The first would actually be pretty easy to implement. To be clear, my original intention for this issue was for idea 2.

@oscarbenjamin
Copy link
Contributor

what do you have in mind for D('0.1')/3

@smichr It should remain unevaluated. Reduction would be possible in some cases like D('0.1') / 300 can evaluate to D('0.001') / 3. We can cancel factors of 2 and 5 from the divisor but nothing else.

@asmeurer I realise that you were proposing 2) but I don't think any of our users wants that.

If we didn't have Float and were considering whether to use decimal or binary floating point in the first place then I would argue for decimal floating point not because of the rounding behaviour during operations but just because it makes it possible to represent numbers like 0.1 exactly so e.g. no rounding is needed in S('0.1'). Now that we have Float I don't see any of our users asking for a second floating point type.

However I do see a lot of our users on stackoverflow, github issues etc having problems with Float. The problem they have is not that they want decimal rounding but that they actually don't want to use Float at all: they just want to use decimal literals in their code like 9.81. SymPy turns that into a Float and then can't cope with inexact arithmetic in many situations so we end up having issues with polys, integrals, solve, and so on. A very common first response to a user having trouble in these situations is that they should use Rational rather than Float. So the problem is: using decimal literals forces calculations into floating point for users who don't actually want to use floating point.

In some cases we have a kludge to "solve" the problem which is that we implicitly convert inputs to Rational. This happens e.g. with solve or with the geometry module. There are open issues coming from users who do want Float about how slow the geometry module is as a result of using nsimplify on its inputs and I think there must be cases where e.g. rational=True in solve messes up genuine floating point calculations.

This is a major problem for our users and it would be great if there were some way to fix it. One potential possibility for the future that I see is if Python itself gains Decimal literals written as something like 0.1D which could sympify to something that respects the exact decimal value. Then users can use decimals in their expressions and see decimal in the outputs and polys, integrals, geometry, solve etc can all work seamlessly with exact arithmetic.

Obviously that hinges in Python itself being changed but I think a lot of the core Python people like the idea of Decimal literals. A few years ago I said I would write a PEP about it but I didn't end up getting round to it (this discussion is making me think I should really go back and do that). One of the things I most struggled with when I did briefly try to write the PEP was that most of my arguments for Decimal literals made more sense as arguments for using decimal floating point as the default floating point in the language which would be a very different proposal.

Although you are proposing 2) I am raising this suggestion because sympify(D('0.1')) can only do one of these things so these ideas are mutually exclusive.

@jksuom
Copy link
Member

jksuom commented Sep 29, 2019

It seems to me that a new Decimal class would not be needed if we could convert Python floats to reasonably sized Rationals for solve. So we would like to have 9.81 converted to 981/100 instead of

>>> Rational(9.81)
5522539043063071/562949953421312

Some time ago, I suggested using continued fractions (#12088 (comment)) for the conversion. The idea is that the coefficients of a continued fraction are typically small, mostly one digit numbers. If there is a big coefficient, then that is the sign of rounding and the fraction can be truncated at that point with little (or no) error.

For example, with 9.81 we get

In [1]: list(continued_fraction_iterator(Rational(9.81)))
Out[1]: [9, 1, 4, 3, 1, 3, 1, 201053554792, 1, 1, 6, 2]

In [2]: it = continued_fraction_convergents(continued_fraction_iterator(Rational(9.81)))

In [3]: print(list(it))
[9, 10, 49/5, 157/16, 206/21, 775/79, 981/100, 197233537251727/20105355479279, 197233537252708/20105355479379, 394467074504435/40210710958658, 2564035984279318/261369621231327, 5522539043063071/562949953421312]

The last reasonably sized convergent 981/100 is what we would expect (and the remaining ones are results of rounding from base 10 to base 2)

@oscarbenjamin
Copy link
Contributor

Having a better conversion/approximation from Float to Rational in nsimplify would be good if it's faster or more accurate. I don't think though that it solves the problem most users have unless it's used everywhere so that a float in an expression is equivalent to a Rational of the same value as the float's literal e.g.:

0.1 * x  ->  Rational('0.1') * x

That means that sympify(0.1) would need to be changed. I expect that doing that everywhere would be problematic because:

  1. It would slow down many things (we already have complaints here and on stackoverflow that geometry is slow because of this).
  2. It would lead to numerical inaccuracy for users who do know what they are doing and really do want to use floats.
  3. It opens a new can of worms around how == and __hash__ should treat floats that sympify to the same Rational e.g. {0.1, 0.10000000000000001, Rational(0.1)}.

@meganly
Copy link
Contributor

meganly commented Nov 20, 2019

Rational printed as a decimal expansion, including printing an overbar for repeated decimals. This is strictly a printing issue. The object itself would still be just Rational and behave as Rational does. So for instance sqrt(1/2) would print as (0.5)**0.5 or something similar (but not as > 1.414...).

How would one go about trying to implement this?

@asmeurer
Copy link
Member Author

I think just subclass whatever printer you care about (str/pretty/latex), and redefine _print_Rational. There's probably already code somewhere in SymPy that does the Rational -> repeated decimal logic.

We could also add it as an option to the existing printers if that's something people want.

@oscarbenjamin
Copy link
Contributor

How would one go about trying to implement this?

I'm not sure exactly but the obvious way would be to subclass rational and change the printing code for the new class. So you have

class DecimalRational(Rational):  # Probably a better name can be found...
    pass

Then you go through the printing code and change in places where you find print_Rational you maybe add a print_DecimalRational to change the printing:

$ git grep print_Rational
sympy/polys/numberfields.py:    def _print_Rational(self, expr):
sympy/polys/numberfields.py:        return "mpi('%s')" % super(PythonCodePrinter, self)._print_Rational(expr)
sympy/polys/numberfields.py:        return "mpi('%s')" % super(PythonCodePrinter, self)._print_Rational(expr)
sympy/printing/ccode.py:    def _print_Rational(self, expr):
sympy/printing/fcode.py:    def _print_Rational(self, expr):
sympy/printing/glsl.py:    def _print_Rational(self, expr):
sympy/printing/jscode.py:    def _print_Rational(self, expr):
sympy/printing/latex.py:    def _print_Rational(self, expr):
sympy/printing/latex.py:    def _print_RationalField(self, expr):
sympy/printing/maple.py:    def _print_Rational(self, expr):
sympy/printing/mathml.py:    def _print_Rational(self, e):
sympy/printing/mathml.py:    def _print_Rational(self, e):
sympy/printing/pretty/pretty.py:    _print_Rationals = _print_Atom
sympy/printing/pretty/pretty.py:    def _print_Rational(self, expr):
sympy/printing/pretty/pretty.py:    def _print_RationalField(self, expr):
sympy/printing/printer.py:        |-- p._print_Rational(expr)
sympy/printing/printer.py:    if ``._print_Rational`` method exists in the printer, then it is called,
sympy/printing/pycode.py:    def _print_Rational(self, expr):
sympy/printing/pycode.py:        return self._print_Rational(expr)
sympy/printing/pycode.py:    def _print_Rational(self, e):
sympy/printing/pycode.py:        return self._print_Rational(e)
sympy/printing/rcode.py:    def _print_Rational(self, expr):
sympy/printing/repr.py:    def _print_RationalConstant(self, expr):
sympy/printing/repr.py:    def _print_Rational(self, expr):
sympy/printing/rust.py:    def _print_Rational(self, expr):
sympy/printing/str.py:    def _print_Rationals(self, expr):
sympy/printing/str.py:    def _print_Rational(self, expr):
sympy/printing/theanocode.py:    def _print_Rational(self, expr, **kwargs):

Probably pretty.py, str.py and latex.py are the main ones to change.

@asmeurer
Copy link
Member Author

Most of those printers wouldn't matter for this change. It doesn't need to be part of the code printers. The only printers that matter for human readable output are str, pretty, latex, mathml, and repr (and probably repr wouldn't change here either).

@oscarbenjamin
Copy link
Contributor

Most of those printers wouldn't matter for this change.

Agreed but it should be checked that the new type at least prints like Rational does in the other printers. I'm not sure if the printers have to match the exact type name so that a subclass can not inherit the printing methods.

@asmeurer
Copy link
Member Author

There's probably already code somewhere in SymPy that does the Rational -> repeated decimal logic.

I did some grepping and didn't find anything. If there is something there it is well hidden. We have code in the parser to go the other way, but I didn't see anything to go from Rational -> decimal that handles repeated decimals. Here are some suggestions on an algorithm https://softwareengineering.stackexchange.com/questions/192070/what-is-a-efficient-way-to-find-repeating-decimal.

If you don't care about the printed decimal being exactly correct, you can just use the _print_Float method to print the approximate form.

@meganly
Copy link
Contributor

meganly commented Feb 5, 2020

As a user, I want to have decimal literals in my code so that I can see decimals printed in the output while using exact arithmetic. I started a branch creating a DecimalRational subclass. The difficulty is that many SymPy functions are hard-coded to use Rational so most of the things I did with DecimalRationals were getting changed back to Rational again.

@oscarbenjamin
Copy link
Contributor

Maybe it would be better to have an option for the printers.

@meganly
Copy link
Contributor

meganly commented Feb 5, 2020

I thought about subclassing the printers but I want to mix and match decimal literals and fractions.

@asmeurer
Copy link
Member Author

asmeurer commented Feb 6, 2020

Maybe it would work better if Decimal just subclassed Number instead of Rational?

@meganly
Copy link
Contributor

meganly commented Feb 6, 2020

But would it inherit enough structure so that arithmetic and simplification methods do the right thing? I've made good progress on my branch. It wasn't that bad to change a couple places in the code from Rational to self.__class__ and be careful with S.Half.

@asmeurer
Copy link
Member Author

asmeurer commented Feb 6, 2020

Yes, thinking about it, it might not work so well because Rational is hard-coded in places like Add and Mul to auto-simplify.

For the problem of a function that returns a new expression using Rational again, I'm not sure if there's a simple solution. I think that issue would occur with any kind of class that extends another class. There isn't a clear way to maintain the same class as the original. It isn't even always obvious when that should even happen in some cases.

@meganly
Copy link
Contributor

meganly commented Feb 6, 2020

Yes, thinking about it, it might not work so well because Rational is hard-coded in places like Add and Mul to auto-simplify.

Sorry what might not work well? I would want arithmetic with DecimalRationals to simplify except in cases like sqrt(DecimalRational(1,2)) which would print as sqrt(0.5).

For the problem of a function that returns a new expression using Rational again, I'm not sure if there's a simple solution. I think that issue would occur with any kind of class that extends another class. There isn't a clear way to maintain the same class as the original. It isn't even always obvious when that should even happen in some cases.

True. It isn't obvious what should happen in some cases. For instance I think cancel(DecimalRational('1.5')*x) should be (3*x)/2, but expand(DecimalRational('1.5')*x) should print as 1.5x. I programmed DecimalRationals to behave how I wanted them to behave, which is similar to how Floats and Rationals behave (e.g. DecimalRational * Rational = DecimalRational).

@smichr
Copy link
Member

smichr commented Feb 19, 2020

code somewhere in SymPy that does the Rational -> repeated decimal

I don't recall there being any, either, and did some experimenting with an basimal routine here. The docstring:

    """print fraction n/d in basimal form (default is decimal) using parentheses to surround the repetend and an underscore followed by the base-10 value of the base. If ``b`` > 16 then ``pretty`` defaults to False and [b, [i], [r]] is returned where
    ``b`` is the base, ``i`` represents the digits of the integer
    portion of the result and ``r`` represents the digits of the
    repetend.

    Examples
    ========

    >>> basimal(11, 3)
    '3.(6)_10'
    >>> basimal(121, 9, 16)
    'D.(71C)_16'
    >>> basimal(121, 9, 16, pretty=False)
    [16, [13], [7, 1, 12]]
CLICK ME TO SEE THE CODE

def floyd(f, x0):
    tortoise = f(x0)
    hare = f(tortoise)
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(f(hare))
    mu = 0
    tortoise = x0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1
    lam = 1
    hare = f(tortoise)
    while tortoise != hare:
        hare = f(hare)
        lam += 1
    return (lam, mu)


def basimal(n, d, b=10, pretty=True):
    """print fraction n/d in basimal form (default is decimal) using parentheses to surround the repetend and
    an underscore followed by the base-10 value of the base. If ``b`` > 16 then ``pretty`` defaults to False and
    [b, [i], [r]] is returned where ``b`` is the base, ``i`` represents the digits of the integer
    portion of the result and ``r`` represents the digits of the  repetend.

    Examples
    ========

    >>> basimal(11, 3)
    '3.(6)_10'
    >>> basimal(121, 9, 16)
    'D.(71C)_16'
    >>> basimal(121, 9, 16, pretty=False)
    [16, [13], [7, 1, 12]]

    Note: the remainder while dividing by ``d`` can be no larger
    than ``d - 1`` so that is the longest that the repetend can
    be for a given ``d``. When ``d`` has no prime factors other
    than those in the base, there will be no repetend. The length
    of the repetend is not affected by ``n``; it is determined
    solely by the value of ``d`` once multiples of the prime
    factors of the base have been discarded. e.g. in base 10,
    the repetend for 1/(4*5*7) will be the same length as for 1/7 (though the digits will be in a different order).

    >>> basimal(1, 4*5*7)
    '0.00(714285)_10'
    >>> basimal(1, 7)
    '0.(142857)_10'

    To compute only the length of the repetend, one may do the
    following:

    >>> def repetend_length(d, b):
    ...     from symyp import n_order, igcd
    ...     while 1:  # remove factors in common between b and d
    ...         g = igcd(b, d)
    ...         if g == 1: break
    ...         d //= g
    ...     return n_order(b, d)
    ...
    >>> repetend_length(4*5*7,10)
    6
    """
    from sympy.ntheory.factor_ import digits
    n = as_int(n)
    d = as_int(d)
    b = as_int(b)
    assert n >= 0 and d >= 1 and b >= 2
    q, r = divmod(n, d)
    q = digits(q, b)
    if pretty and b <= 16:
        key = '0123456789ABCDEF'
        q = ''.join([key[i] for i in q[1:]])
        decimal = [q, '.']
    else:
        pretty = False
        decimal = [q.pop(0), q]
    period, first_repeat = floyd(lambda r: b * r % d, r)
    for i in range(first_repeat + period):
        q, r = divmod(b * r, d)
        decimal.append(q if not pretty else key[q])
    if pretty:
        return '{}({})_{}'.format(
            ''.join(decimal[:2 + first_repeat]),
            ''.join(decimal[2 + first_repeat:]),
            str(b))
    else:
        decimal[2 + first_repeat:] = [decimal[2 + first_repeat:]]
        return decimal

@asmeurer
Copy link
Member Author

That seems worth including somewhere regardless of the other pull request (one note though, I would use [] for the repeating decimals since sympify already parses that notation).

@smichr
Copy link
Member

smichr commented Feb 20, 2020

seems worth including

It should probably have a precision keyword to avoid printing long results for large prime denominators.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

6 participants