### Decimal Objects

Python docs: https://docs.python.org/3/library/decimal.html#context-objects

Decimal standard: http://speleotrove.com/decimal/damodel.html

Let's start by importing the `Decimal` class from the `decimal` module.

In [1]:
import decimal
from decimal import Decimal

Now we can create a `Decimal` object:

In [2]:
d = Decimal(100)

In [3]:
d

Decimal('100')

We can use integer literals to create `Decimal` objects.

We can even use `float` objects, but with a major caveat:

In [4]:
Decimal(0.1)

Decimal('0.1000000000000000055511151231257827021181583404541015625')

As you can see, the problem is that `0.1` is a float, and a binary float cannot represent `0.1` exactly.

When we use a float to create a `Decimal` object, we are therefore using that inexact `float` to start off with.

So, we need to tell `Decimal` that we want the **exact** number `0.1` - not the float.

We do so by passing a string, which `Decimal` parses to create an exact representation:

In [5]:
d = Decimal('0.1')

In [6]:
d

Decimal('0.1')

So now we have an exact representation of `0.1`, and, in fact:

In [7]:
Decimal('0.1') + Decimal('0.1') + Decimal('0.1') == Decimal('0.3')

True

Whereas we have seen before that this is not the case with `floats`:

In [8]:
0.1 + 0.1 + 0.1 == 0.3

False
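Going back to float inputs for a moment: if the value already exists as a float, one option is to go through its string form. This is only a minimal sketch, and the variable names are made up for illustration:

```python
from decimal import Decimal

x = 0.1  # an existing float

from_float = Decimal(x)        # exact value of the binary float
from_string = Decimal(str(x))  # goes through the float's string form, '0.1'

print(from_float == Decimal('0.1'))   # False
print(from_string == Decimal('0.1'))  # True
```

Whether the string form is the value you actually meant depends on where the float came from, so creating the `Decimal` from the original text is usually the safer route.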

The normal arithmetic operators work just fine with `Decimal` objects:

In [9]:
Decimal('0.1') * Decimal('0.3')

Decimal('0.03')

Division works too, although we can still lose precision when the result has no finite decimal representation. For example:

In [10]:
Decimal(1) / Decimal(8)

Decimal('0.125')

That works fine, but the next result does not have a finite decimal representation:

In [11]:
Decimal(1) / Decimal(3)

Decimal('0.3333333333333333333333333333')

So, how many digits do we have in the approximate representation?

It depends on your context. We'll come back to contexts in detail, but for now we can see the global arithmetic context:

In [12]:
decimal.getcontext()

Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[Inexact, FloatOperation, Rounded], traps=[InvalidOperation, DivisionByZero, Overflow])

We can see that our default precision is `28`, and the default rounding method is `ROUND_HALF_EVEN`, which is Banker's rounding.
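To get a feel for how the precision setting drives the number of digits, here is a small sketch that changes the precision temporarily with `decimal.localcontext()`. The precision of `6` is an arbitrary value chosen for illustration:

```python
import decimal
from decimal import Decimal

with decimal.localcontext() as ctx:
    ctx.prec = 6  # applies only inside this block
    inside = Decimal(1) / Decimal(3)

outside = Decimal(1) / Decimal(3)  # uses the surrounding context again

print(inside)   # 0.333333
print(outside)
```

Using a local context keeps the change from leaking into the rest of the program.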

We can use the `round()` function, and Python will use the `Decimal` object's special rounding method (called `quantize`) internally:

In [13]:
round(Decimal('1.135'), 2)

Decimal('1.14')

In [14]:
round(Decimal('1.145'), 2)

Decimal('1.14')
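We can also call `quantize` ourselves and pass a rounding mode explicitly. The sketch below compares two of the rounding modes exported by the `decimal` module; the names `value` and `cents` are just for illustration:

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_HALF_UP

value = Decimal('1.145')
cents = Decimal('0.01')  # the exponent of this value sets the target precision

half_even = value.quantize(cents, rounding=ROUND_HALF_EVEN)
half_up = value.quantize(cents, rounding=ROUND_HALF_UP)

print(half_even)  # 1.14
print(half_up)    # 1.15
```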

We discussed significant digits in the lecture. Let's see this:

In [15]:
d1 = Decimal('1.20')

In [16]:
d1

Decimal('1.20')

As you can see, the trailing `0` is preserved - it is significant. `1.20` holds more information than `1.2`: `1.2` could be a rounded `1.2x` (with `x` usually less than 5, depending on the rounding mechanism), whereas `1.20` is more precise - and `Decimal` objects take that into account.

Now, we can multiply two decimal objects:

In [17]:
d2 = Decimal('2.00')

In [18]:
d1 * d2

Decimal('2.4000')

As you can see, the multiplication result now has `5` significant digits.
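One way to look at this is through `as_tuple()`, which exposes the digits and the exponent of a `Decimal`. The following sketch shows that the exponents add up in the product, and that `normalize()` can be used to strip trailing zeros when they are not wanted:

```python
from decimal import Decimal

d1 = Decimal('1.20')
d2 = Decimal('2.00')
product = d1 * d2

print(d1.as_tuple())                # DecimalTuple(sign=0, digits=(1, 2, 0), exponent=-2)
print(product.as_tuple().exponent)  # -4
print(d1 == Decimal('1.2'))         # True: equal in value, different representation
print(product.normalize())          # 2.4
```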

Significant figures affect calculations - not the storage of `Decimal` objects created with literals:

In [19]:
d1 = Decimal('1.123456789012345678901234567890')  # 31 significant digits

In [20]:
d1

Decimal('1.123456789012345678901234567890')

As you can see the precision is retained - even though our context has a precision of `28`.

But watch what happens if we perform a calculation on the number:

In [21]:
+d1

Decimal('1.123456789012345678901234568')

As you can see, the result was limited to the context precision: unary `+` is an arithmetic operation, so the context is applied.
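If we want the context precision applied at creation time instead, a context's `create_decimal` method does that. This is a sketch using an explicit `Context(prec=28)` rather than the global context, so that it does not depend on global state:

```python
from decimal import Context, Decimal

literal = '1.123456789012345678901234567890'
ctx = Context(prec=28)  # explicit context, independent of the global one

stored = Decimal(literal)              # constructor: all 31 digits kept
rounded = ctx.create_decimal(literal)  # rounded to ctx.prec on creation

print(len(stored.as_tuple().digits))  # 31
print(rounded)                        # 1.123456789012345678901234568
```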

Many of the standard arithmetic operators are supported by `Decimal` objects:

In [22]:
Decimal(10) // Decimal(3)

Decimal('3')

In [23]:
Decimal(10) % Decimal(3)

Decimal('1')

Even the power operator:

In [24]:
Decimal('0.1') ** Decimal('5')

Decimal('0.00001')
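One caveat with `//` and `%`: for negative operands, `Decimal` does not behave like `int`. The integer division truncates toward zero, and the remainder takes the sign of the dividend. A quick sketch of the difference:

```python
from decimal import Decimal

int_div, int_mod = -10 // 3, -10 % 3
dec_div = Decimal(-10) // Decimal(3)
dec_mod = Decimal(-10) % Decimal(3)

print(int_div, int_mod)  # -4 2
print(dec_div, dec_mod)  # -3 -1
```

In both cases `divisor * quotient + remainder` still gives back `-10`.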

Functions like `abs`, `min`, `max`, and `sum` work too:

In [25]:
min(Decimal('0.1'), Decimal('0.2'), 0.3)

Decimal('0.1')

In [26]:
sum([Decimal('0.1'), Decimal('0.1'), Decimal('0.1')])

Decimal('0.3')
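Note that the `min` example above mixed a `float` in with the `Decimal` objects. Comparisons like that are allowed, and so is arithmetic with integers, but arithmetic between a `Decimal` and a `float` is rejected. A short sketch, assuming the default context where the `FloatOperation` signal is not trapped:

```python
from decimal import Decimal

d = Decimal('0.1')

print(d + 1)    # ints mix freely: 1.1
print(d < 0.3)  # comparisons with floats are allowed: True

error = None
try:
    d + 0.1     # arithmetic with a float is rejected
except TypeError as exc:
    error = exc

print(type(error).__name__)  # TypeError
```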

We *can* use the standard math functions - but they are designed to work with floats, and so our `Decimal` objects will be converted to floats when those functions are called - maybe not what we intended.

In [27]:
import math

In [28]:
d1 = Decimal('2.0')

In [29]:
result = math.sqrt(d1)

In [30]:
type(result), result

(float, 1.4142135623730951)

Instead, `Decimal` objects do implement some specialized math functions:

In [31]:
d1.sqrt()

Decimal('1.414213562373095048801688724')

Again, note that the number of digits in the result is determined by our global context precision.
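`sqrt` is not the only one: `Decimal` objects also provide methods such as `exp` and `log10`. The sketch below runs a few of them inside a local context with an arbitrary precision of `10`, to show that these methods follow the context precision as well:

```python
import decimal
from decimal import Decimal

with decimal.localcontext() as ctx:
    ctx.prec = 10
    root = Decimal(2).sqrt()
    log = Decimal(100).log10()
    e = Decimal(1).exp()

print(root)  # 1.414213562
print(log)   # 2
print(e)     # 2.718281828
```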

Let's process some data using both `floats` and `decimals`.

Recall the FOREX data we used a while back:

In [32]:
f_name = 'DEXUSEU.csv'

In [33]:
with open(f_name) as f:
    for _ in range(5):
        print(next(f).strip())

DATE,DEXUSEU
2015-04-03,1.0990
2015-04-06,1.1008
2015-04-07,1.0850
2015-04-08,1.0818


Let's load this up using the `csv` module:

In [34]:
import csv

with open(f_name) as f:
    reader = csv.reader(f)
    for _ in range(5):
        print(next(reader))

['DATE', 'DEXUSEU']
['2015-04-03', '1.0990']
['2015-04-06', '1.1008']
['2015-04-07', '1.0850']
['2015-04-08', '1.0818']


We'll want to convert the first field to a `datetime`, and the second field to either a `float` or a `Decimal`.

Let's write a function to do this:

In [35]:
from datetime import datetime


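def load_data(f_name, dt_format, use_decimal=False):
    """Return a list of (datetime, value) tuples read from a CSV file."""
    # exchange rates are parsed as Decimal or float, depending on use_decimal
    convert = Decimal if use_decimal else float
    with open(f_name, newline='') as f:  # newline='' is what the csv docs recommend
        reader = csv.reader(f)
        next(reader)  # skip header row

        data = [
            (datetime.strptime(row[0], dt_format), convert(row[1]))
            for row in reader
            if row[1] != '.'  # skip rows whose value is just '.'
        ]
    return data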

In [36]:
dt_format = '%Y-%m-%d'

In [37]:
datetime.strptime('2010-01-31', dt_format)

datetime.datetime(2010, 1, 31, 0, 0)

Now let's load our data with `floats` and `Decimals`:

In [38]:
data_float = load_data(f_name, dt_format)
data_dec = load_data(f_name, dt_format, use_decimal=True)

In [39]:
data_float[0]

(datetime.datetime(2015, 4, 3, 0, 0), 1.099)

In [40]:
data_dec[0]

(datetime.datetime(2015, 4, 3, 0, 0), Decimal('1.0990'))
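If you do not have the CSV file at hand, the same parsing logic can be tried on in-memory text. This is only a sketch: `parse_rows` is a hypothetical variant of `load_data` that accepts any iterable of lines, and the `2015-04-09` row is made up to exercise the `'.'` filter:

```python
import csv
import io
from datetime import datetime
from decimal import Decimal

# First rows of the file as printed earlier, plus one made-up row with '.'
# to show the filter; the 2015-04-09 row is NOT from the real data.
sample = io.StringIO(
    "DATE,DEXUSEU\n"
    "2015-04-03,1.0990\n"
    "2015-04-06,1.1008\n"
    "2015-04-09,.\n"
)


def parse_rows(lines, dt_format, use_decimal=False):
    """Hypothetical variant of load_data that reads from any iterable of lines."""
    reader = csv.reader(lines)
    next(reader)  # skip header row
    convert = Decimal if use_decimal else float
    return [
        (datetime.strptime(row[0], dt_format), convert(row[1]))
        for row in reader
        if row[1] != '.'
    ]


rows = parse_rows(sample, '%Y-%m-%d', use_decimal=True)
print(rows)
```

The row containing `'.'` is dropped, and the trailing zero of `1.0990` survives in the `Decimal` value.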

Now let's sum up all the values in the data and time the performance difference (we'll run the calculations a few times so we can see the time differences more clearly):

In [41]:
from time import perf_counter

In [42]:
start = perf_counter()
for _ in range(10_000):
    result = sum(row[1] for row in data_float)
end = perf_counter()
print(end - start, result)

0.6894855999998981 1411.6124


In [43]:
start = perf_counter()
for _ in range(10_000):
    result = sum(row[1] for row in data_dec)
end = perf_counter()
print(end - start, result)

2.1389683000015793 1411.6124


As we can see, there is a performance impact when using `Decimal` objects: the `Decimal` version took roughly three times as long in this run. On the other hand, the precision of each value in the data was maintained during the computation.
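The same kind of comparison can also be written with the standard library's `timeit` module. The sketch below uses made-up sample values rather than the real data set, and the timings will differ from machine to machine:

```python
from decimal import Decimal
from timeit import timeit

# Made-up sample values, not the real data set.
raw = ['1.0990', '1.1008', '1.0850', '1.0818'] * 250
floats = [float(s) for s in raw]
decimals = [Decimal(s) for s in raw]

# timeit calls each function `number` times and returns the total seconds
t_float = timeit(lambda: sum(floats), number=1_000)
t_dec = timeit(lambda: sum(decimals), number=1_000)

print(f'float:   {t_float:.4f}s  total={sum(floats)!r}')
print(f'Decimal: {t_dec:.4f}s  total={sum(decimals)!r}')
```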

There is also a storage impact - more memory is needed to store a `Decimal` object than a `float`.

We can use `sys.getsizeof` to see this.

In [44]:
from sys import getsizeof

In [45]:
getsizeof(0.1)

24

In [46]:
getsizeof(Decimal('0.1'))

104

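So we can see how much memory the numeric values in our `data_float` and `data_dec` objects use (this counts only the values themselves, not the tuples or the list holding them):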

In [47]:
sum(getsizeof(el[1]) for el in data_float)

30024

In [48]:
sum(getsizeof(el[1]) for el in data_dec)

130104
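To reproduce this comparison without the data file, here is a small sketch on made-up sample values. The exact byte counts depend on the Python implementation and platform, so no specific numbers are assumed:

```python
from decimal import Decimal
from sys import getsizeof

raw = ['1.0990', '1.1008', '1.0850', '1.0818']  # made-up sample values
floats = [float(s) for s in raw]
decimals = [Decimal(s) for s in raw]

float_bytes = sum(getsizeof(x) for x in floats)
decimal_bytes = sum(getsizeof(x) for x in decimals)

# getsizeof counts only each object itself, not the list holding it
print(float_bytes, decimal_bytes, getsizeof(floats))
```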