## Python Rounding

https://realpython.com/python-rounding/

In [22]:
round(2.5), round(1.5)

(2, 2)

## How Much Impact Can Rounding Have?

Suppose you have an incredibly lucky day and find $100 on the ground. Rather than spending all your money at once, you decide to play it smart and invest your money by buying some shares of different stocks.

The value of a stock depends on supply and demand. The more people there are who want to buy a stock, the more value that stock has, and vice versa. In high volume stock markets, the value of a particular stock can fluctuate on a second-by-second basis.

Let’s run a little experiment. We’ll pretend the overall value of the stocks you purchased fluctuates by some small random number each second, say between $0.05 and -$0.05. This fluctuation may not necessarily be a nice value with only two decimal places. For example, the overall value may increase by $0.031286 one second and decrease the next second by $0.028476.

You don’t want to keep track of your value to the fifth or sixth decimal place, so you decide to chop everything off after the third decimal place. In rounding jargon, this is called truncating the number to the third decimal place. There’s some error to be expected here, but by keeping three decimal places, this error couldn’t be substantial. Right?

To run our experiment using Python, let’s start by writing a truncate() function that truncates a number to three decimal places:

In [2]:
def truncate(n):
    return int(n * 1000) / 1000

In [8]:
print(truncate(100+0.04624658))

100.046


The truncate() function works by first shifting the decimal point in the number n three places to the right by multiplying n by 1000. The integer part of this new number is taken with int(). Finally, the decimal point is shifted three places back to the left by dividing the result by 1000.

Next, let’s define the initial parameters of the simulation. You’ll need two variables: one to keep track of the actual value of your stocks after the simulation is complete and one for the value of your stocks after you’ve been truncating to three decimal places at each step.

Start by initializing these variables to 100:

In [6]:
actual_value, truncated_value = 100, 100

Now let’s run the simulation for 1,000,000 seconds (approximately 11.5 days). For each second, generate a random value between -0.05 and 0.05 with the uniform() function in the random module, and then update actual_value and truncated_value:

In [7]:
import random
random.seed(100)

for _ in range(1000000):
    randn = random.uniform(-0.05, 0.05)
    actual_value = actual_value + randn
    truncated_value = truncate(truncated_value + randn)
print(actual_value, truncated_value)

96.45273913513529 0.239


Ignoring for the moment that round() doesn’t behave quite as you expect, let’s try re-running the simulation. We’ll use round() this time to round to three decimal places at each step, and seed() the simulation again to get the same results as before:

In [9]:
random.seed(100)
actual_value, rounded_value = 100, 100

for _ in range(1000000):
    randn = random.uniform(-0.05, 0.05)
    actual_value = actual_value + randn
    rounded_value = round(rounded_value + randn, 3)
print(actual_value, rounded_value)

96.45273913513529 96.258


Shocking as it may seem, this exact error caused quite a stir in the early 1980s when the system designed for recording the value of the Vancouver Stock Exchange truncated the overall index value to three decimal places instead of rounding. Rounding errors have swayed elections and even resulted in the loss of life.

How you round numbers is important, and as a responsible developer and software designer, you need to know what the common issues are and how to deal with them. Let’s dive in and investigate what the different rounding methods are and how you can implement each one in pure Python.

## A Menagerie of Methods

There are a plethora of rounding strategies, each with advantages and disadvantages. In this section, you’ll learn about some of the most common techniques, and how they can influence your data.


In [10]:
import math

# 1. Truncation: drop every digit after the requested decimal place
def truncate(n, decimals=0):
    multiplier = 10 ** decimals
    return int(n * multiplier) / multiplier

# 2. Rounding up: math.ceil() returns the smallest integer >= its argument
math.ceil(1.2)  # 2

def round_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.ceil(n * multiplier) / multiplier

# 3. Rounding down: math.floor() returns the largest integer <= its argument
math.floor(1.2)  # 1

def round_down(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier) / multiplier

## Interlude: Rounding Bias

You’ve now seen three rounding methods: truncate(), round_up(), and round_down(). All three of these techniques are rather crude when it comes to preserving a reasonable amount of precision for a given number.

There is one important difference between truncate() on the one hand and round_up() and round_down() on the other, and it highlights an important aspect of rounding: symmetry around zero.

Recall that round_up() isn’t symmetric around zero. In mathematical terms, a function f(x) is symmetric around zero if, for any value of x, f(x) + f(-x) = 0. For example, round_up(1.5) returns 2, but round_up(-1.5) returns -1. The round_down() function isn’t symmetric around 0, either.

On the other hand, the truncate() function is symmetric around zero. This is because, after shifting the decimal point to the right, truncate() chops off the remaining digits. When the initial value is positive, this amounts to rounding the number down. Negative numbers are rounded up. So, truncate(1.5) returns 1, and truncate(-1.5) returns -1.
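To check the symmetry claim for yourself, here is a small sketch that tests f(x) + f(-x) == 0 at a single point. The three rounding functions are repeated so the snippet runs on its own, and is_symmetric_at() is a hypothetical helper introduced only for this illustration. Passing at one value does not prove symmetry everywhere, but failing at one value is enough to disprove it.

```python
import math

def truncate(n, decimals=0):
    multiplier = 10 ** decimals
    return int(n * multiplier) / multiplier

def round_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.ceil(n * multiplier) / multiplier

def round_down(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier) / multiplier

def is_symmetric_at(f, x):
    """Hypothetical helper: True if f(x) + f(-x) == 0 for this one value of x."""
    return f(x) + f(-x) == 0

# Check each function at x = 1.5
symmetry = {f.__name__: is_symmetric_at(f, 1.5)
            for f in (truncate, round_up, round_down)}
print(symmetry)
# {'truncate': True, 'round_up': False, 'round_down': False}
```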

The concept of symmetry introduces the notion of rounding bias, which describes how rounding affects numeric data in a dataset.

The “rounding up” strategy has a round towards positive infinity bias, because the value is always rounded up in the direction of positive infinity. Likewise, the “rounding down” strategy has a round towards negative infinity bias.

The “truncation” strategy exhibits a round towards negative infinity bias on positive values and a round towards positive infinity bias on negative values. Rounding functions with this behavior are said, in general, to have a round towards zero bias.

Let’s see how this works in practice. Consider the following list of floats:

In [11]:
data = [1.25, -2.67, 0.43, -1.79, 4.32, -8.19]
import statistics

statistics.mean(data)

-1.1083333333333332

Now apply each of round_up(), round_down(), and truncate() in a list comprehension to round each number in data to one decimal place and calculate the new mean:

In [12]:
ru_data = [round_up(n, 1) for n in data]
rd_data = [round_down(n, 1) for n in data]
tr_data = [truncate(n, 1) for n in data]
statistics.mean(ru_data), statistics.mean(rd_data), statistics.mean(tr_data)

(-1.0333333333333332, -1.1333333333333333, -1.0833333333333333)

After every number in data is rounded up, the new mean is about -1.033, which is greater than the actual mean of about -1.108. Rounding down shifts the mean downwards to about -1.133. The mean of the truncated values is about -1.083 and is the closest to the actual mean.

This example does not imply that you should always truncate when you need to round individual values while preserving a mean value as closely as possible. The data list contains an equal number of positive and negative values. The truncate() function would behave just like round_up() on a list of all positive values, and just like round_down() on a list of all negative values.

What this example does illustrate is the effect rounding bias has on values computed from data that has been rounded. You will need to keep these effects in mind when drawing conclusions from data that has been rounded.

Typically, when rounding, you are interested in rounding to the nearest number with some specified precision, instead of just rounding everything up or down.

For example, if someone asks you to round the numbers 1.23 and 1.28 to one decimal place, you would probably respond quickly with 1.2 and 1.3. The truncate(), round_up(), and round_down() functions don’t do anything like this.

What about the number 1.25? You probably immediately think to round this to 1.3, but in reality, 1.25 is equidistant from 1.2 and 1.3. In a sense, 1.2 and 1.3 are both the nearest numbers to 1.25 with single decimal place precision. The number 1.25 is called a tie with respect to 1.2 and 1.3. In cases like this, you must assign a tiebreaker.

The way that most people are taught to break ties is by rounding to the greater of the two possible numbers.

## Rounding Half Up

The “rounding half up” strategy rounds every number to the nearest number with the specified precision, and breaks ties by rounding up. Here are some examples:

In [7]:
def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n*multiplier + 0.5) / multiplier
-1.225 * 100

-122.50000000000001
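
As a quick sketch of how round_half_up() behaves, the calls below cover an ordinary value on each side of a tie, two exact ties, and the awkward value -1.225. The function is repeated so the snippet runs on its own. The last call is the interesting one: because -1.225 * 100 evaluates to slightly less than -122.5, the result is -1.23 rather than the -1.22 you might expect from a tie that rounds up.

```python
import math

def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier + 0.5) / multiplier

print(round_half_up(1.23, 1))   # 1.2  (below the halfway point)
print(round_half_up(1.28, 1))   # 1.3  (above the halfway point)
print(round_half_up(1.25, 1))   # 1.3  (tie, rounded up)
print(round_half_up(-1.5))      # -1.0 (tie, rounded up towards positive infinity)

# -1.225 * 100 is -122.50000000000001, so this is no longer a tie
tricky = round_half_up(-1.225, 2)
print(tricky)                   # -1.23
```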

In [14]:
0.1 + 0.1 + 0.1

0.30000000000000004

Seeing this for the first time can be pretty shocking, but this is a classic example of floating-point representation error. It has nothing to do with Python. The error has to do with how machines store floating-point numbers in memory.

Most modern computers store floating-point numbers as binary fractions with 53 bits of precision. Only numbers that have a finite binary representation that fits in 53 bits are stored exactly. Not every number has a finite binary representation.

For example, the decimal number 0.1 has a finite decimal representation, but an infinite binary representation. Just like the fraction 1/3 can only be represented in decimal as the infinitely repeating decimal 0.333..., the fraction 1/10 can only be expressed in binary as the infinitely repeating binary fraction 0.0001100110011....

A value with an infinite binary representation is rounded to an approximate value to be stored in memory. The method that most machines use to round is determined according to the IEEE-754 standard, which specifies rounding to the nearest representable binary fraction.
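If you want to see the approximation that is actually stored, one option is to convert the float to an exact fraction or an exact decimal. This is only an illustrative sketch using the standard library's fractions and decimal modules; the variable names are made up for the example.

```python
from decimal import Decimal
from fractions import Fraction

# The exact ratio stored for the literal 0.1: an integer over a power of two
exact_ratio = Fraction(0.1)
print(exact_ratio)                          # 3602879701896397/36028797018963968
print(exact_ratio.denominator == 2 ** 55)   # True

# The same stored value written out as an exact decimal expansion
exact_decimal = Decimal(0.1)
print(exact_decimal)

# 0.5 has a finite binary representation, so it is stored exactly
print(Fraction(0.5) == Fraction(1, 2))      # True

# The tiny errors are why this comparison fails
print(0.1 + 0.1 + 0.1 == 0.3)               # False
```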

The Python docs have a section called Floating Point Arithmetic: Issues and Limitations which has this to say about the number 0.1:

In [15]:
0.1

0.1

The fact that Python says that -1.225 * 100 is -122.50000000000001 is an artifact of floating-point representation error. You might be asking yourself, “Okay, but is there a way to fix this?” A better question to ask yourself is “Do I need to fix this?”

Floating-point numbers do not have exact precision, and therefore should not be used in situations where precision is paramount. For applications where the exact precision is necessary, you can use the Decimal class from Python’s decimal module. You’ll learn more about the Decimal class below.

If you have determined that Python’s standard float class is sufficient for your application, some occasional errors in round_half_up() due to floating-point representation error shouldn’t be a concern.

Now that you’ve gotten a taste of how machines round numbers in memory, let’s continue our discussion on rounding strategies by looking at another way to break a tie.

## Rounding Half Down

The “rounding half down” strategy rounds to the nearest number with the desired precision, just like the “rounding half up” method, except that it breaks ties by rounding to the lesser of the two numbers. Here are some examples:

In [16]:
def round_half_down(n, decimals=0):
    multiplier = 10 ** decimals
    return math.ceil(n*multiplier - 0.5) / multiplier
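
A few sample calls may make the tiebreaker easier to see. This sketch repeats round_half_down() so it runs on its own, and it sticks to values such as 1.5 and 2.25 that floats store exactly, so the ties really are ties.

```python
import math

def round_half_down(n, decimals=0):
    multiplier = 10 ** decimals
    return math.ceil(n * multiplier - 0.5) / multiplier

print(round_half_down(1.5))      # 1.0  (tie, rounded to the lesser value)
print(round_half_down(-1.5))     # -2.0 (tie, rounded to the lesser value)
print(round_half_down(2.25, 1))  # 2.2  (tie at one decimal place)
print(round_half_down(1.28, 1))  # 1.3  (not a tie, rounds to nearest)
```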


Both round_half_up() and round_half_down() have no bias in general. However, rounding data with lots of ties does introduce a bias. For an extreme example, consider the following list of numbers:

In [17]:
data = [-2.15, 1.45, 4.35, -12.75]
statistics.mean(data)

-2.275

In [18]:
rhu_data = [round_half_up(n, 1) for n in data]
rhd_data = [round_half_down(n, 1) for n in data]
statistics.mean(rhu_data), statistics.mean(rhd_data)

(-2.2249999999999996, -2.325)

Every number in data is a tie with respect to rounding to one decimal place. The round_half_up() function introduces a round towards positive infinity bias, and round_half_down() introduces a round towards negative infinity bias.

The remaining rounding strategies we’ll discuss all attempt to mitigate these biases in different ways.

## Rounding Half Away From Zero

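If you examine round_half_up() and round_half_down() closely, you’ll notice that neither of these functions is symmetric around zero. For example, round_half_up(1.5) returns 2.0 but round_half_up(-1.5) returns -1.0, and round_half_down(1.5) returns 1.0 but round_half_down(-1.5) returns -2.0.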

To implement the “rounding half away from zero” strategy on a number n, you start as usual by shifting the decimal point to the right a given number of places. Then you look at the digit d immediately to the right of the decimal place in this new number. At this point, there are four cases to consider:

- If n is positive and d >= 5, round up
- If n is positive and d < 5, round down
- If n is negative and d >= 5, round down
- If n is negative and d < 5, round up

After rounding according to one of the above four rules, you then shift the decimal place back to the left.

Given a number n and a value for decimals, you could implement this in Python by using round_half_up() and round_half_down():

In [None]:
if n >= 0:
    rounded = round_half_up(n, decimals)
else:
    rounded = round_half_down(n, decimals)

That’s easy enough, but there’s actually a simpler way!

If you first take the absolute value of n using Python’s built-in abs() function, you can just use round_half_up() to round the number. Then all you need to do is give the rounded number the same sign as n. One way to do this is using the math.copysign() function.

math.copysign() takes two numbers a and b and returns a with the sign of b:

In [3]:
import math
math.copysign(1, -2)

-1.0

Notice that math.copysign() returns a float, even though both of its arguments were integers.

Using abs(), round_half_up() and math.copysign(), you can implement the “rounding half away from zero” strategy in just two lines of Python:

In [4]:
def round_half_away_from_zero(n, decimals=0):
    rounded_abs = round_half_up(abs(n), decimals)
    return math.copysign(rounded_abs, n)


In round_half_away_from_zero(), the absolute value of n is rounded to decimals decimal places using round_half_up() and this result is assigned to the variable rounded_abs. Then the original sign of n is applied to rounded_abs using math.copysign(), and this final value with the correct sign is returned by the function.
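Here is a short usage sketch for round_half_away_from_zero(). Both helper functions are repeated so the snippet is self-contained, and the inputs are chosen to be exactly representable as floats so that floating-point representation error doesn't get in the way.

```python
import math

def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    return math.floor(n * multiplier + 0.5) / multiplier

def round_half_away_from_zero(n, decimals=0):
    rounded_abs = round_half_up(abs(n), decimals)
    return math.copysign(rounded_abs, n)

print(round_half_away_from_zero(2.5))       # 3.0
print(round_half_away_from_zero(-2.5))      # -3.0
print(round_half_away_from_zero(1.25, 1))   # 1.3
print(round_half_away_from_zero(-1.25, 1))  # -1.3

# Symmetric around zero at a tie: f(x) + f(-x) == 0
print(round_half_away_from_zero(2.5) + round_half_away_from_zero(-2.5))  # 0.0
```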

The round_half_away_from_zero() function rounds numbers the way most people tend to round numbers in everyday life. Besides being the most familiar rounding function you’ve seen so far, round_half_away_from_zero() also eliminates rounding bias well in datasets that have an equal number of positive and negative ties.

Let’s check how well round_half_away_from_zero() mitigates rounding bias in the example from the previous section:

In [5]:
import statistics
data = [-2.15, 1.45, 4.35, -12.75] 
statistics.mean(data)

-2.275

In [8]:
rhaz_data = [round_half_away_from_zero(n, 1) for n in data]
statistics.mean(rhaz_data)

-2.2750000000000004

However, round_half_away_from_zero() will exhibit a rounding bias when you round every number in datasets with only positive ties, only negative ties, or more ties of one sign than the other. Bias is only mitigated well if there are a similar number of positive and negative ties in the dataset.

How do you handle situations where the number of positive and negative ties are drastically different? The answer to this question brings us full circle to the function that deceived us at the beginning of this article: Python’s built-in round() function.

## Rounding Half To Even

One way to mitigate rounding bias when rounding values in a dataset is to round ties to the nearest even number at the desired precision. For example, when rounding to whole numbers, 1.5 and 2.5 both round to 2, while 3.5 rounds to 4.

The “rounding half to even” strategy is the strategy used by Python’s built-in round() function and is the default rounding rule in the IEEE-754 standard. This strategy works under the assumption that the probabilities of a tie in a dataset being rounded down or rounded up are equal. In practice, this is usually the case.

Now you know why round(2.5) returns 2. It’s not a mistake. It is a conscious design decision based on solid recommendations.

To prove to yourself that round() really does round to even, try it on a few different values:
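For instance, you could try a handful of ties that floats store exactly. This is just a sketch; the list of values is arbitrary.

```python
# Every value here ends in exactly .5, so each one is a genuine tie
ties = [0.5, 1.5, 2.5, 3.5, -1.5, -2.5]

rounded_ties = [round(x) for x in ties]
print(rounded_ties)  # [0, 2, 2, 4, -2, -2]

# Each result is an even integer
print(all(r % 2 == 0 for r in rounded_ties))  # True
```

Each tie lands on the neighboring even integer, so roughly half of the ties move up and half move down.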

The round() function is nearly free from bias, but it isn’t perfect. For example, rounding bias can still be introduced if the majority of the ties in your dataset round up to even instead of rounding down. Strategies that mitigate bias even better than “rounding half to even” do exist, but they are somewhat obscure and only necessary in extreme circumstances.

Finally, round() suffers from the same hiccups that you saw in round_half_up() thanks to floating-point representation error:

In [10]:
# expected 2.68
round(2.675, 2)

2.67

You shouldn’t be concerned with these occasional errors if floating-point precision is sufficient for your application.
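If you are curious why round(2.675, 2) comes out as 2.67, you can inspect the value that is actually stored for the literal 2.675. The sketch below converts the float to a Decimal, which preserves the stored value digit for digit. The stored number is slightly less than 2.675, so from round()'s point of view it isn't a tie at all.

```python
from decimal import Decimal

# Exact value of the float that the literal 2.675 produces
stored = Decimal(2.675)
print(stored)

# The stored value sits just below the true decimal 2.675
print(stored < Decimal("2.675"))  # True

# So rounding to the nearest hundredth correctly gives 2.67
print(round(2.675, 2))            # 2.67
```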

When precision is paramount, you should use Python’s Decimal class.

## The Decimal Class

Python’s decimal module is one of those “batteries-included” features of the language that you might not be aware of if you’re new to Python. The guiding principle of the decimal module can be found in the documentation:

Decimal “is based on a floating-point model which was designed with people in mind, and necessarily has a paramount guiding principle – computers must provide an arithmetic that works in the same way as the arithmetic that people learn at school.” – excerpt from the decimal arithmetic specification.

The benefits of the decimal module include:

- Exact decimal representation: 0.1 is actually 0.1, and 0.1 + 0.1 + 0.1 - 0.3 returns 0, as you’d expect.
- Preservation of significant digits: When you add 1.20 and 2.50, the result is 3.70 with the trailing zero maintained to indicate significance.
- User-alterable precision: The default precision of the decimal module is twenty-eight digits, but this value can be altered by the user to match the problem at hand.
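
The first two benefits are easy to check with a short sketch. It assumes the decimal module's default context (in particular, the default precision), and the variable names are made up for the example.

```python
from decimal import Decimal

# With floats, the difference is a tiny non-zero value
float_diff = 0.1 + 0.1 + 0.1 - 0.3
print(float_diff == 0)    # False

# With Decimal values built from strings, the arithmetic is exact
exact_diff = Decimal("0.1") + Decimal("0.1") + Decimal("0.1") - Decimal("0.3")
print(exact_diff == 0)    # True

# Trailing zeros are kept to indicate significance
total = Decimal("1.20") + Decimal("2.50")
print(total)              # 3.70
```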


In [11]:
import decimal
decimal.getcontext()

Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999, Emax=999999, capitals=1, clamp=0, flags=[], traps=[InvalidOperation, DivisionByZero, Overflow])

decimal.getcontext() returns a Context object representing the default context of the decimal module. The context includes the default precision and the default rounding strategy, among other things.

As you can see in the example above, the default rounding strategy for the decimal module is ROUND_HALF_EVEN. This aligns with the built-in round() function and should be the preferred rounding strategy for most purposes.

Let’s declare a number using the decimal module’s Decimal class. To do so, create a new Decimal instance by passing a string containing the desired value:

In [12]:
from decimal import Decimal
Decimal("0.1")

Decimal('0.1')

It is possible to create a Decimal instance from a floating-point number, but doing so introduces floating-point representation error right off the bat. For example, check out what happens when you create a Decimal instance from the floating-point number 0.1:

In [13]:
Decimal(0.1)

Decimal('0.1000000000000000055511151231257827021181583404541015625')

In [14]:
Decimal('0.1') + Decimal('0.1') + Decimal('0.1')

Decimal('0.3')

Rounding a Decimal is done with the .quantize() method:

In [15]:
Decimal("1.65").quantize(Decimal("1.0"))

Decimal('1.6')

Okay, that probably looks a little funky, so let’s break that down. The Decimal("1.0") argument in .quantize() determines the number of decimal places to round the number. Since 1.0 has one decimal place, the number 1.65 rounds to a single decimal place. The default rounding strategy is “rounding half to even,” so the result is 1.6.
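To see both ideas at work, the following sketch quantizes a few ties to one decimal place and then reuses the same number with different templates. Only the exponent of the argument passed to .quantize() matters, so Decimal("1.0") and Decimal("0.1") would behave the same way here.

```python
from decimal import Decimal

one_place = Decimal("1.0")

# Ties go to the neighbor whose last digit is even
ties = {text: Decimal(text).quantize(one_place)
        for text in ("1.65", "1.75", "1.85")}
for text, result in ties.items():
    print(text, "->", result)
# 1.65 -> 1.6
# 1.75 -> 1.8
# 1.85 -> 1.8

# The template controls how many decimal places are kept
print(Decimal("1.65").quantize(Decimal("1")))      # 2
print(Decimal("1.65").quantize(Decimal("0.001")))  # 1.650
```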

Recall that the round() function, which also uses the “rounding half to even” strategy, failed to round 2.675 to two decimal places correctly. Instead of 2.68, round(2.675, 2) returns 2.67. Thanks to the decimal module’s exact decimal representation, you won’t have this issue with the Decimal class:

In [16]:
Decimal("2.675").quantize(Decimal("1.00"))

Decimal('2.68')

Another benefit of the decimal module is that rounding after performing arithmetic is taken care of automatically, and significant digits are preserved. To see this in action, let’s change the default precision from twenty-eight digits to two, and then add the numbers 1.23 and 2.32:

In [17]:
decimal.getcontext().prec = 2
Decimal("1.23") + Decimal("2.32")

Decimal('3.6')

To change the precision, you call decimal.getcontext() and set the .prec attribute. If setting the attribute on a function call looks odd to you, you can do this because .getcontext() returns a special Context object that represents the current internal context containing the default parameters used by the decimal module.

The exact value of 1.23 plus 2.32 is 3.55. Since the precision is now two digits, and the rounding strategy is set to the default of “rounding half to even,” the value 3.55 is automatically rounded to 3.6.

To change the default rounding strategy, you can set the decimal.getcontext().rounding property to any one of several flags defined in the decimal module, such as decimal.ROUND_HALF_UP or decimal.ROUND_DOWN. Each flag selects a different rounding strategy.
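As a rough sketch of how some of these flags behave, the loop below rounds the ties 2.5 and -2.5 to whole numbers under seven of the rounding flags. It uses decimal.localcontext() rather than decimal.getcontext() so that the change to the rounding mode is temporary and doesn't leak into the rest of your session; that choice is made for this example only, and the results dictionary is a made-up name.

```python
import decimal
from decimal import Decimal

flags = [
    decimal.ROUND_CEILING,    # towards positive infinity
    decimal.ROUND_FLOOR,      # towards negative infinity
    decimal.ROUND_DOWN,       # towards zero (truncation)
    decimal.ROUND_UP,         # away from zero
    decimal.ROUND_HALF_UP,    # ties away from zero
    decimal.ROUND_HALF_DOWN,  # ties towards zero
    decimal.ROUND_HALF_EVEN,  # ties to the even neighbor
]

results = {}
for flag in flags:
    # The context change only applies inside the with block
    with decimal.localcontext() as ctx:
        ctx.rounding = flag
        results[flag] = (
            Decimal("2.5").quantize(Decimal("1")),
            Decimal("-2.5").quantize(Decimal("1")),
        )

for flag, (positive, negative) in results.items():
    print(f"{flag:16} {positive!s:>3} {negative!s:>3}")
```

Note that decimal.ROUND_HALF_UP breaks ties away from zero, so it corresponds to round_half_away_from_zero() rather than to the round_half_up() function defined earlier.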

In [18]:
import numpy as np
np.random.seed(444)
data = np.random.randn(3, 4)
data

array([[ 0.35743992,  0.3775384 ,  1.38233789,  1.17554883],
       [-0.9392757 , -1.14315015, -0.54243951, -0.54870808],
       [ 0.20851975,  0.21268956,  1.26802054, -0.80730293]])

In [19]:
np.ceil(data)

array([[ 1.,  1.,  2.,  2.],
       [-0., -1., -0., -0.],
       [ 1.,  1.,  2., -0.]])

Actually, the IEEE-754 standard requires the implementation of both a positive and negative zero. What possible use is there for something like this? Wikipedia knows the answer:

Informally, one may use the notation “−0” for a negative value that was rounded to zero. This notation may be useful when a negative sign is significant; for example, when tabulating Celsius temperatures, where a negative sign means below freezing.

You might have noticed that a lot of the rounding strategies we discussed earlier are missing here. For the vast majority of situations, the around() function is all you need. If you need to implement another strategy, such as round_half_up(), you can do so with a simple modification:

In [20]:
def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    # Replace math.floor with np.floor
    return np.floor(n*multiplier + 0.5) / multiplier
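
To compare the two approaches side by side, this sketch rounds an array of exact ties with np.around() and with the NumPy version of round_half_up(). It assumes NumPy is installed; the function is repeated so the snippet runs on its own.

```python
import numpy as np

def round_half_up(n, decimals=0):
    multiplier = 10 ** decimals
    return np.floor(n * multiplier + 0.5) / multiplier

ties = np.array([0.5, 1.5, 2.5, 3.5])

# np.around() breaks ties by rounding to the even neighbor
half_even = np.around(ties)
print(half_even)  # [0. 2. 2. 4.]

# The modified function works element-wise and always rounds ties up
half_up = round_half_up(ties)
print(half_up)    # [1. 2. 3. 4.]
```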


## Applications and Best Practices

## Store More and Round Late

If you have the space available, you should store the data at full precision. If storage is an issue, a good rule of thumb is to store at least two or three more decimal places of precision than you need for your calculation.

Finally, when you compute the daily average temperature, you should calculate it to the full precision available and round the final answer.
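Here is a small sketch of why rounding late matters. The four temperature readings are invented for illustration. Rounding each reading to one decimal place before averaging throws away information that the full-precision average keeps.

```python
from decimal import Decimal

# Hypothetical sensor readings, made up for this example
readings = [Decimal("20.14"), Decimal("20.13"), Decimal("20.14"), Decimal("20.12")]

# Round early: each reading is rounded to one decimal place before averaging
rounded_early = [r.quantize(Decimal("0.1")) for r in readings]
early_mean = sum(rounded_early) / len(rounded_early)

# Round late: average at full precision, then round the final answer
late_mean = (sum(readings) / len(readings)).quantize(Decimal("0.01"))

print(early_mean)  # 20.1
print(late_mean)   # 20.13
```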

## Obey Local Currency Regulations

When you order a cup of coffee for 2.40 at the coffee shop, the merchant typically adds a required tax. The amount of that tax depends a lot on where you are geographically, but for the sake of argument, let’s say it’s 6%. The tax to be added comes out to 0.144. Should you round this up to 0.15 or down to 0.14? The answer probably depends on the regulations set forth by the local government!
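The arithmetic for the coffee example can be sketched with the Decimal class. Which rounding mode applies is a legal question rather than a technical one, so the two modes shown here are only the two possibilities mentioned above, not a recommendation.

```python
from decimal import Decimal, ROUND_DOWN, ROUND_UP

price = Decimal("2.40")
tax_rate = Decimal("0.06")   # the 6% rate assumed in the text
cent = Decimal("0.01")

raw_tax = price * tax_rate
print(raw_tax)               # 0.1440

# Round the tax to whole cents in each direction
tax_up = raw_tax.quantize(cent, rounding=ROUND_UP)
tax_down = raw_tax.quantize(cent, rounding=ROUND_DOWN)
print(tax_up)                # 0.15
print(tax_down)              # 0.14
```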

Situations like this can also arise when you are converting one currency to another. In 1999, the European Commission on Economical and Financial Affairs codified the use of the “rounding half away from zero” strategy when converting currencies to the Euro, but other currencies may have adopted different regulations.

Another scenario, “Swedish rounding”, occurs when the minimum unit of currency at the accounting level in a country is smaller than the lowest unit of physical currency. For example, if a cup of coffee costs $2.54 after tax, but there are no 1-cent coins in circulation, what do you do? The buyer won’t have the exact amount, and the merchant can’t make exact change.

How situations like this are handled is typically determined by a country’s government. You can find a list of rounding methods used by various countries on Wikipedia.
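As a sketch only, here is one way to round a price to the nearest multiple of a coin increment. Both the 0.05 increment and the choice to break ties upward are assumptions made for this example, and round_to_increment() is a hypothetical helper; the actual rule depends on the country.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_to_increment(amount, increment=Decimal("0.05")):
    """Hypothetical helper: round amount to the nearest multiple of increment."""
    # Count how many increments fit, rounded to a whole number of steps
    steps = (amount / increment).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return steps * increment

print(round_to_increment(Decimal("2.54")))  # 2.55
print(round_to_increment(Decimal("2.52")))  # 2.50
```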

If you are designing software for calculating currencies, you should always check the local laws and regulations in your users’ locations.


## When In Doubt, Round Ties To Even

When you are rounding numbers in large datasets that are used in complex computations, the primary concern is limiting the growth of the error due to rounding.

Of all the methods we’ve discussed in this article, the “rounding half to even” strategy minimizes rounding bias the best. Fortunately, Python, NumPy, and Pandas all default to this strategy, so by using the built-in rounding functions you’re already well protected!
