# [Chapter 3. Numbers, Dates, and Times](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html)

Performing mathematical calculations with integers and floating-point numbers is easy in Python.  
However, if you need to perform calculations with fractions, arrays, or dates and times, a bit more work is required.  
The focus of this chapter is on such topics.

## [Rounding Numerical Values](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_rounding_numerical_values)

### Problem

You want to round a floating-point number to a fixed number of decimal places.

### Solution

For simple rounding, use the built-in `round(value, ndigits)` function.

In [363]:
print(round(1.23, 1))
print(round(1.27, 1))
print(round(-1.23, 1))
print(round(1.25361, 3))

1.2
1.3
-1.2
1.254


When a value is exactly halfway between two choices, `round()` rounds to the nearest even digit.  
That is, values such as 1.5 or 2.5 both get rounded to 2.  
The number of digits given to `round()` can be negative, in which case rounding takes place for tens, hundreds, thousands, and so on.

In [364]:
a = 1627731
print(round(a, -1))
print(round(a, -2))
print(round(a, -3))

1627730
1627700
1628000
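
The halfway rule is easy to check directly. The following is a small sketch rather than part of the original recipe. It also shows a value that looks like a tie but is not one, because the float actually stored is slightly below the number written in the source:

```python
# Exact ties are rounded to the nearest even integer.
ties = [0.5, 1.5, 2.5, 3.5]
rounded = [round(v) for v in ties]
print(rounded)

# 2.675 cannot be stored exactly as a float; the stored value is slightly
# below 2.675, so this is not a true tie and the result rounds down.
almost_tie = round(2.675, 2)
print(almost_tie)
```

If you need ties to behave in a specific way for values written in decimal, the `decimal` module gives you explicit control over the rounding mode.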


### Discussion

Don't confuse rounding with formatting a value for output.  
If your goal is simply to output a numerical value with a certain number of decimal places, you don't usually need to use `round()`.  
Instead, just specify the desired precision when formatting, like this:

In [365]:
x = 1.23456
print(format(x, '0.2f'))
print(format(x, '0.3f'))
print('Value is {:0.3f}'.format(x))

1.23
1.235
Value is 1.235


Also, resist the urge to round floating-point numbers to "fix" perceived accuracy problems.  
Don't do this:

In [366]:
a = 2.1
b = 4.2
c = a + b
print(c)
# "Fix" result?:
c = round(c, 2)
print(c)

6.300000000000001
6.3


For most applications involving floating point, it’s simply not necessary (or recommended) to do this.  
Although there are small errors introduced into calculations, the behavior of those errors is well understood and tolerated.  
If avoiding such errors is important (e.g., in financial applications, perhaps), consider the use of the `decimal` module, which is discussed in the next recipe.
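
If the real goal is to compare a computed float against an expected value, a tolerance-based comparison is usually a better fit than rounding. A minimal sketch, assuming Python 3.5 or later for `math.isclose()`:

```python
import math

c = 2.1 + 4.2

# Exact equality fails because of the tiny representation error.
exact_match = (c == 6.3)

# math.isclose() compares within a relative tolerance (1e-09 by default).
close_match = math.isclose(c, 6.3)

print(exact_match, close_match)
```

The stored value of `c` is left untouched, so no information is thrown away before later calculations.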

## [Performing Accurate Decimal Calculations](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_performing_accurate_decimal_calculations)

### Problem

You need to perform accurate calculations with decimal numbers, and you don't want the small errors that naturally occur with floats.

### Solution

One issue with floating-point numbers is that they can't accurately represent all base-10 decimals.  
Moreover, even simple mathematical calculations introduce small errors.

In [367]:
a = 4.2
b = 2.1
print(a + b)
print((a + b) == 6.3)

6.300000000000001
False


These errors are a "feature" of the underlying CPU and the IEEE 754 arithmetic performed by its floating-point unit.  
Since Python's float data type stores data using the native representation, there's nothing you can do to avoid such errors if you write your code using `float` instances.

If you want more accuracy (and are willing to give up some performance), you can use the `decimal` module:

In [368]:
from decimal import Decimal

a = Decimal('4.2')
b = Decimal('2.1')
print(a + b)
print((a + b) == Decimal('6.3'))

6.3
True


At first glance, it might look a little weird (i.e., specifying numbers as strings).  
However, `Decimal` objects work in every way that you would expect them to (like supporting all of the usual math operations, and so on).  
If you print them or use them in string formatting functions, they look like normal numbers.
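
The reason for using strings becomes clearer if you build a `Decimal` from a float instead. The sketch below is an illustration added here, not part of the original recipe. The float has already lost exactness before `Decimal` ever sees it:

```python
from decimal import Decimal

from_string = Decimal('4.2')   # exactly 4.2
from_float = Decimal(4.2)      # the exact binary value stored for the float 4.2

print(from_string)
print(from_float)

same = (from_string == from_float)
print(same)
```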

A major feature of `decimal` is that it allows you to control different aspects of calculations, including number of digits and rounding.  
To do this, you create a local context and change its settings.

In [369]:
from decimal import localcontext

a = Decimal('1.3')
b = Decimal('1.7')
print(a / b)

0.7647058823529411764705882353


In [370]:
with localcontext() as ctx:
    ctx.prec = 3
    print(a / b)

0.765


In [371]:
with localcontext() as ctx:
    ctx.prec = 50
    print(a / b)

0.76470588235294117647058823529411764705882352941176
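
The examples above change only the precision. As a sketch of the rounding side, the following uses the rounding-mode constants from `decimal` together with `quantize()`. The variable names and the sample price are made up for illustration:

```python
from decimal import Decimal, localcontext
from decimal import ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP

a = Decimal('1.3')
b = Decimal('1.7')

# Truncate instead of rounding to nearest, but only inside this block.
with localcontext() as ctx:
    ctx.prec = 3
    ctx.rounding = ROUND_DOWN
    truncated = a / b
print(truncated)

# quantize() rounds to a fixed exponent, here two decimal places.
price = Decimal('2.665')
cent = Decimal('0.01')
half_up = price.quantize(cent, rounding=ROUND_HALF_UP)
half_even = price.quantize(cent, rounding=ROUND_HALF_EVEN)
print(half_up, half_even)
```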


### Discussion

The decimal module implements [IBM’s "General Decimal Arithmetic Specification."](http://speleotrove.com/decimal/) Needless to say, there are a huge number of configuration options that are beyond the scope of this book.  
Newcomers to Python might be inclined to use the decimal module to work around perceived accuracy problems with the float data type.  
However, it’s really important to understand your application domain.  
If you’re working with science or engineering problems, computer graphics, or most things of a scientific nature, it’s simply more common to use the normal floating-point type.  
For one, very few things in the real world are measured to the 17 digits of accuracy that floats provide.  
Thus, tiny errors introduced in calculations just don’t matter.  
Second, native floats are significantly faster -- something that’s important if you’re performing a large number of calculations.  
That said, you can’t ignore the errors completely.  
Mathematicians have spent a lot of time studying various algorithms, and some handle errors better than others.  
You also have to be a little careful with effects due to things such as [subtractive cancellation](https://ece.uwaterloo.ca/~dwharder/NumericalAnalysis/02Numerics/Weaknesses/#howto) and adding large and small numbers together.

In [372]:
nums = [1.23e+18, 1, -1.23e+18]
# Watch, ladies and gentlemen, as the 1 magically disappears:
sum(nums)

0.0

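On Python 3.12 and later, the built-in `sum()` uses a more accurate summation for floats, so the result above may differ. In any version, this example can be addressed by using the more accurate implementation in `math.fsum()`: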

In [373]:
import math
math.fsum(nums)

1.0

However, for other algorithms, you really need to study the algorithm and understand its error propagation properties.  
All of this said, the main use of the decimal module is in programs involving things such as finance.  
In such programs, it is extremely annoying to have small errors creep into the calculation.  
Thus, decimal provides a way to avoid that.  
It is also common to encounter `Decimal` objects when Python interfaces with databases -- again, especially when accessing financial data.
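
To make the finance case concrete, here is a small sketch that adds ten payments of 0.10. The amounts are invented for illustration. It compares plain float accumulation, `math.fsum()`, and `Decimal`:

```python
import math
from decimal import Decimal

# Accumulate ten payments of 0.10 using ordinary float addition.
float_total = 0.0
for _ in range(10):
    float_total += 0.1

fsum_total = math.fsum([0.1] * 10)
decimal_total = sum([Decimal('0.10')] * 10)

print(float_total)
print(fsum_total)
print(decimal_total)
```

The float loop drifts away from 1.0, `math.fsum()` recovers the correctly rounded result, and the `Decimal` total is exact and keeps its two decimal places.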

## [Formatting Numbers for Output](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_formatting_numbers_for_output)

### Problem

You need to format a number for output, controlling the number of digits, alignment, inclusion of a thousands separator, and other details.

### Solution

To format a single number for output, use the built-in `format()` function.

In [374]:
x = 1234.56789
# Two decimal places of accuracy:
format(x, '0.2f')

'1234.57'

In [375]:
# Right justified in 10 characters, one-digit accuracy:
format(x, '>10.1f')

'    1234.6'

In [376]:
# Left justified:
format(x, '<10.1f')

'1234.6    '

In [377]:
# Centered:
format(x, '^10.1f')

'  1234.6  '

In [378]:
# Inclusion of thousands separator:
format(x, ',')

'1,234.56789'

In [379]:
format(x, '0,.1f')

'1,234.6'

If you want to use exponential notation, change the `f` to an `E` or `e`, depending on the case you want used for the exponential specifier.

In [380]:
format(x, 'e')

'1.234568e+03'

In [381]:
format(x, '0.2E')

'1.23E+03'

The general form of the width and precision in both cases is `'[<>^]?width[,]?(.digits)?'` where width and digits are integers and `?` signifies optional parts.  
The same format codes are also used in the `.format()` method of strings.

In [382]:
'The value is {:0,.2f}'.format(x)

'The value is 1,234.57'
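
The same format codes also work inside formatted string literals. A brief sketch, assuming Python 3.6 or later; the `rows` data is made up for illustration:

```python
x = 1234.56789

# The text after the colon is the same format specification used by format().
line = f'The value is {x:0,.2f}'
print(line)

# Alignment and width combine to produce simple fixed-width columns.
rows = [('alpha', 3.14159), ('beta', 1234.5)]
table = [f'{name:<8}{value:>12,.2f}' for name, value in rows]
for row in table:
    print(row)
```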

### Discussion

Formatting numbers for output is usually straightforward.  
The technique shown works for both floating-point numbers and `Decimal` numbers in the `decimal` module.  
When the number of digits is restricted, values are rounded according to the same rules as the `round()` function.

In [383]:
x

1234.56789

In [384]:
format(x, '0.1f')

'1234.6'

In [385]:
format(-x, '0.1f')

'-1234.6'
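
As a quick check of both claims, the sketch below formats a `Decimal` and two floats that are true ties. The values 0.125 and 0.375 were chosen here because they are exactly representable in binary:

```python
from decimal import Decimal

d = Decimal('1234.56789')
formatted = format(d, ',.2f')
print(formatted)

# Both values sit exactly halfway, so the result goes to the even digit.
tie_down = format(0.125, '0.2f')
tie_up = format(0.375, '0.2f')
print(tie_down, tie_up)
```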

Formatting of values with a thousands separator is not locale aware.  
If you need to take that into account, you might investigate functions in the `locale` module.  
You can also swap separator characters using the `translate()` method of strings.

In [386]:
swap_separators = { ord('.'):',', ord(','):'.'}
format(x, ',').translate(swap_separators)

'1.234,56789'

In a lot of Python code, numbers are formatted using the `%` operator.

In [387]:
'%0.2f' % x

'1234.57'

In [388]:
'%10.1f' % x

'    1234.6'

In [389]:
'%-10.1f' % x

'1234.6    '

This formatting is still acceptable, but less powerful than the more modern `format()` method.  
For example, some features (e.g., adding thousands separators) aren’t supported when using the `%` operator to format numbers.

## [Working with Binary, Octal, and Hexadecimal Integers](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_working_with_binary_octal_and_hexadecimal_integers)

### Problem

You need to convert or output integers represented by binary, octal, or hexadecimal digits.

### Solution

To convert an integer into a binary, octal, or hexadecimal text string, use the `bin()`, `oct()`, or `hex()` functions, respectively.

In [390]:
x = 1234
bin(x)

'0b10011010010'

In [391]:
oct(x)

'0o2322'

In [392]:
hex(x)

'0x4d2'

Alternatively, you can use the `format()` function if you don't want the `0b`, `0o`, or `0x` prefixes to appear.

In [393]:
format(x, 'b')

'10011010010'

In [394]:
format(x, 'o')

'2322'

In [395]:
format(x, 'x')

'4d2'

Integers are signed, so if you are working with negative numbers, the output will also include a sign.

In [396]:
x = -1234
format(x, 'b')

'-10011010010'

In [397]:
format(x, 'x')

'-4d2'

If you need to produce an unsigned value instead, add `2**n` to the number, where `n` is the desired bit length.  
For example, to show a 32-bit value, use the following:

In [398]:
x = -1234
format(2**32 + x, 'b')

'11111111111111111111101100101110'

In [399]:
format(2**32 + x, 'x')

'fffffb2e'
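
An equivalent way to get the same result is to mask the value to the desired width. The helper `to_unsigned()` below is a hypothetical name used only for this sketch:

```python
def to_unsigned(value, bits):
    """Reduce value modulo 2**bits (hypothetical helper for illustration)."""
    return value & ((1 << bits) - 1)

x = -1234
hex32 = format(to_unsigned(x, 32), 'x')
bin8 = format(to_unsigned(-1, 8), '08b')

# A width with a leading zero pads non-negative values to a fixed length.
padded = format(1234, '016b')

print(hex32)
print(bin8)
print(padded)
```

Unlike adding `2**32`, masking also leaves values that are already non-negative unchanged.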

To convert integer strings in different bases, simply use the `int()` function with an appropriate base.

In [400]:
int('4d2', 16)

1234

In [401]:
int('10011010010', 2)

1234
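
If the strings carry their `0x`, `0o`, or `0b` prefix, passing a base of 0 tells `int()` to infer the base from the prefix. A short sketch with made-up input strings:

```python
texts = ['0x4d2', '0o2322', '0b10011010010', '1234']

# Base 0 means "work out the base from the literal's prefix".
values = [int(text, 0) for text in texts]
print(values)
```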

### Discussion

For the most part, working with binary, octal, and hexadecimal integers is straightforward.  
Just remember that these conversions only pertain to the conversion of integers to and from a textual representation.  
Under the covers, there’s just one integer type.  
Finally, there is one caution for programmers who use octal.  
The Python syntax for specifying octal values is slightly different from that of many other languages.  
For example, if you write an octal literal with only a leading zero, such as `0755`, you’ll get a syntax error.

Make sure you prefix the octal value with `0o`, as in `0o755`.
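
The difference can be sketched as follows. The value `755` is just an example chosen here (a typical file permission mode), and `compile()` is used only so the invalid literal can be demonstrated without stopping the script:

```python
# A C-style literal with a bare leading zero is rejected by Python 3.
try:
    compile('mode = 0755', '<example>', 'exec')
    old_style_ok = True
except SyntaxError:
    old_style_ok = False
print(old_style_ok)

# The 0o prefix is the accepted spelling.
mode = 0o755
print(mode)
print(oct(mode))
```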

## [Packing and Unpacking Large Integers from Bytes](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_packing_and_unpacking_large_integers_from_bytes)

### Problem

You have a byte string and you need to unpack it into an integer value.  
Alternatively, you need to convert a large integer into a byte string.

### Solution

Suppose your program needs to work with a 16-element byte string that holds a 128-bit integer value.

In [402]:
data = b'\x00\x124V\x00x\x90\xab\x00\xcd\xef\x01\x00#\x004'

To interpret the bytes as an integer, use `int.from_bytes()`, and specify the byte ordering like this:

In [403]:
len(data)

16

In [404]:
int.from_bytes(data, 'little')

69120565665751139577663547927094891008

In [405]:
int.from_bytes(data, 'big')

94522842520747284487117727783387188

To convert a large integer value back into a byte string, use the `int.to_bytes()` method, specifying the number of bytes and the byte order.

In [406]:
x = 94522842520747284487117727783387188
x.to_bytes(16, 'big')

b'\x00\x124V\x00x\x90\xab\x00\xcd\xef\x01\x00#\x004'

In [407]:
x.to_bytes(16, 'little')

b'4\x00#\x00\x01\xef\xcd\x00\xab\x90x\x00V4\x12\x00'

### Discussion

Converting large integer values to and from byte strings is not a common operation.  
However, it sometimes arises in certain application domains, such as cryptography or networking.  
For instance, IPv6 network addresses are represented as 128-bit integers.  
If you are writing code that needs to pull such values out of a data record, you might face this problem.  
As an alternative to this recipe, you might be inclined to unpack values using the `struct` module, as described in ["Reading and Writing Binary Arrays of Structures"](http://chimera.labs.oreilly.com/books/1230000000393/ch06.html#_problem_104).  
This works, but the size of integers that can be unpacked with `struct` is limited.  
Thus, you would need to unpack multiple values and combine them to create the final value.

In [408]:
data

b'\x00\x124V\x00x\x90\xab\x00\xcd\xef\x01\x00#\x004'

In [409]:
import struct

hi, lo = struct.unpack('>QQ', data)
print(hi)
print(lo)

5124093560524971
57965157801984052


In [410]:
(hi << 64) + lo

94522842520747284487117727783387188

The specification of the byte order (`little` or `big`) just indicates whether the bytes that make up the integer value are listed from the least to most significant or the other way around.  
This is easy to view using a carefully crafted hexadecimal value:

In [411]:
x = 0x01020304
x.to_bytes(4, 'big')

b'\x01\x02\x03\x04'

In [412]:
x.to_bytes(4, 'little')

b'\x04\x03\x02\x01'

If you try to pack an integer into a byte string, but it won't fit, you'll get an error.  
You can use the `int.bit_length()` method to determine how many bits are required to store a value if needed:

In [413]:
x = 523 ** 23
x

335381300113661875107536852714019056160355655333978849017944067

In [414]:
x.bit_length()

208

In [415]:
nbytes, rem = divmod(x.bit_length(), 8)
print(nbytes)
print(rem)

26
0


In [416]:
if rem:
    nbytes += 1

In [417]:
x.to_bytes(nbytes, 'little')

b'\x03X\xf1\x82iT\x96\xac\xc7c\x16\xf3\xb9\xcf\x18\xee\xec\x91\xd1\x98\xa2\xc8\xd9R\xb5\xd0'
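
The byte-count calculation can be written in one step by rounding the bit length up to a whole number of bytes. The following sketch also shows the error raised when the buffer is too small:

```python
x = 523 ** 23

# Round the bit length up to a whole number of bytes.
nbytes = (x.bit_length() + 7) // 8
packed = x.to_bytes(nbytes, 'little')
restored = int.from_bytes(packed, 'little')
print(nbytes, restored == x)

# One byte fewer is not enough, so to_bytes() raises OverflowError.
try:
    x.to_bytes(nbytes - 1, 'little')
    too_small_ok = True
except OverflowError:
    too_small_ok = False
print(too_small_ok)
```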

## [Performing Complex-Valued Math](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_performing_complex_valued_math)

### Problem

Your code for interacting with the latest web authentication scheme has encountered a singularity and your only solution is to go around it in the complex plane.  
Or maybe you just need to perform some calculations using complex numbers.

### Solution

Complex numbers can be specified using the `complex(real, imag)` function or by using floating-point numbers with a `j` suffix.

In [418]:
a = complex(2, 4)
a

(2+4j)

In [419]:
b = 3 - 5j

The real, imaginary, and conjugate values are easy to obtain, as shown here:

In [420]:
a.real

2.0

In [421]:
a.imag

4.0

In [422]:
a.conjugate()

(2-4j)

Also, all of the usual mathematical operators work:

In [423]:
a + b

(5-1j)

In [424]:
a * b

(26+2j)

In [425]:
a / b

(-0.4117647058823529+0.6470588235294118j)

In [426]:
abs(a)

4.47213595499958

To perform additional complex-valued functions such as sines, cosines, or square roots, use the `cmath` module:

In [427]:
import cmath
cmath.sin(a)

(24.83130584894638-11.356612711218174j)

In [428]:
cmath.cos(a)

(-11.36423470640106-24.814651485634187j)

In [429]:
cmath.exp(a)

(-4.829809383269385-5.5920560936409816j)
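
The `cmath` module also converts between rectangular and polar form, which is often what you want when working in the complex plane. A minimal sketch added for illustration:

```python
import cmath
import math

a = complex(2, 4)

# polar() returns the modulus and the phase angle in radians.
r, phi = cmath.polar(a)
print(r, phi)

# rect() goes the other way, back to rectangular form.
back = cmath.rect(r, phi)
print(back)
```

Because of floating-point rounding, `back` matches `a` only to within a small tolerance.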

### Discussion

Most of Python's math-related modules are aware of complex values.  
For example, if you use `numpy`, it is straightforward to make arrays of complex values and perform operations on them:

In [430]:
import numpy as np
a = np.array([2+3j, 4+5j, 6-7j, 8+9j])
a

array([ 2.+3.j,  4.+5.j,  6.-7.j,  8.+9.j])

In [431]:
a + 2

array([  4.+3.j,   6.+5.j,   8.-7.j,  10.+9.j])

In [432]:
np.sin(a)

array([    9.15449915  -4.16890696j,   -56.16227422 -48.50245524j,
        -153.20827755-526.47684926j,  4008.42651446-589.49948373j])

Python's standard mathematical functions do not produce complex values by default, so it is unlikely that such a value would accidentally show up in your code.

If you want complex numbers to be produced as a result, you have to explicitly use `cmath` or declare the use of a complex type in libraries that know about them.

In [433]:
import cmath
cmath.sqrt(-1)

1j
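
For contrast, the sketch below shows the real-valued `math.sqrt()` refusing the same input:

```python
import cmath
import math

# The real-valued function rejects a negative argument.
try:
    real_root = math.sqrt(-1)
except ValueError as exc:
    real_root = None
    print('math.sqrt failed:', exc)

# The complex-valued function returns the imaginary unit.
complex_root = cmath.sqrt(-1)
print(complex_root)
```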

## [Working with Infinity and NaNs](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_working_with_infinity_and_nans)

### Problem

You need to create or test for the floating-point values of infinity, negative infinity, or NaN (not a number).

### Solution

Python has no special syntax to represent these floating-point values, but they can be created using `float()`.

In [434]:
a = float('inf')
a

inf

In [435]:
b = float('-inf')
b

-inf

In [436]:
c = float('nan')
c

nan

To test for the presence of these values, use the `math.isinf()` and `math.isnan()` functions.

In [437]:
math.isinf(a)

True

In [438]:
math.isnan(c)

True

In [439]:
math.isnan(b)

False
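
On Python 3.5 and later, the `math` module also provides `math.inf` and `math.nan` as constants, and `math.isfinite()` tests for "neither infinite nor NaN" in one call. A short sketch:

```python
import math

pos_inf = math.inf
neg_inf = -math.inf
not_a_number = math.nan

checks = (
    math.isinf(neg_inf),         # True
    math.isnan(not_a_number),    # True
    math.isfinite(pos_inf),      # False
    math.isfinite(1.5),          # True
)
print(checks)
```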

### Discussion

For more detailed information about these special floating-point values, you should refer to the [IEEE 754 specification](https://en.wikipedia.org/wiki/IEEE_754).  
However, there are a few tricky details to be aware of, especially related to comparisons and operators.  
Infinite values will propagate in calculations in a mathematical manner.

In [440]:
a = float('inf')
a

inf

In [441]:
a + 45

inf

In [442]:
a * 10

inf

In [443]:
10 / a

0.0

However, certain operations are undefined and will result in an NaN.

In [444]:
a = float('inf')
a

inf

In [445]:
a / a

nan

In [446]:
b = float('-inf')
b

-inf

In [447]:
a + b 

nan

NaN values propagate through all operations without raising an exception.

In [448]:
c = float('nan')
c

nan

In [449]:
c + 23

nan

In [450]:
c / 2

nan

In [451]:
c * 2

nan

In [452]:
math.sqrt(c)

nan

In [453]:
c / b

nan

A subtle feature of NaN values is that they never compare as equal.

In [454]:
c = float('nan')
d = float('nan')
c == d

False

In [455]:
c is d

False

Because of this, the only safe way to test for a NaN value is to use `math.isnan()`, as shown in this recipe.  
Sometimes programmers want to change Python’s behavior to raise exceptions when operations result in an infinite or NaN result.  
The `fpectl` module could be used to adjust this behavior, but it was not enabled in a standard Python build, was platform-dependent, was really only intended for expert-level programmers, and was removed in Python 3.7.  
See the [online Python documentation](https://docs.python.org/3/library/fpectl.html) for further details.
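
In practice, a common task is cleaning special values out of a list before further processing. The sketch below uses a made-up list of readings:

```python
import math

readings = [1.0, float('nan'), 2.5, float('inf'), -3.0]

# A NaN is not equal to anything, including itself.
nan = float('nan')
self_equal = (nan == nan)
print(self_equal)

# Keep only ordinary numbers, and count the NaNs separately.
finite = [v for v in readings if math.isfinite(v)]
nan_count = sum(1 for v in readings if math.isnan(v))
print(finite, nan_count)
```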

## [Calculating with Fractions](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_calculating_with_fractions)

### Problem

You have entered a time machine and suddenly find yourself working on elementary-level homework problems involving fractions.  
Or perhaps you’re writing code to make calculations involving measurements made in your wood shop.

### Solution

Use the `fractions` module.

In [456]:
from fractions import Fraction

a = Fraction(5, 4)
b = Fraction(7, 16)
print(a + b)
print(a * b)

27/16
35/64


In [457]:
# Getting the numerator and denominator:
c = a * b
print(c)
print(c.numerator)
print(c.denominator)

35/64
35
64


In [458]:
# Converting to a float:
float(c)

0.546875

In [459]:
# Limiting the denominator of a value:
print(c.limit_denominator(8))

4/7


In [460]:
# Converting a float to a fraction:
x = 3.75
y = Fraction(*x.as_integer_ratio())
y

Fraction(15, 4)

### Discussion

Calculating with fractions doesn’t arise often in most programs, but there are situations where it might make sense to use them.  
For example, allowing a program to accept units of measurement in fractions and performing calculations with them in that form might alleviate the need for a user to manually make conversions to decimals or floats.
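
As a sketch of that idea, `Fraction` accepts strings directly, so measurements can be read as typed. The measurements below are invented for illustration:

```python
from fractions import Fraction

# Measurements entered as text, in inches.
board = Fraction('5/8')
shim = Fraction('3/16')
total = board + shim
print(total)

# Decimal text is also accepted and converted exactly.
from_decimal_text = Fraction('3.75')
print(from_decimal_text)

# A float carries binary error; limit_denominator() recovers the intent.
approx = Fraction(0.1).limit_denominator(10)
print(approx)
```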

## [Calculating with Large Numerical Arrays](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#calculatingwithlargenumerical)

### Problem

You need to perform calculations on large numerical datasets, such as arrays or grids.

### Solution

For any heavy computation involving arrays, use the [NumPy library](http://www.numpy.org/).  
The major feature of `NumPy` is that it gives Python an array object that is much more efficient and better suited for mathematical calculation than a standard Python list.  
Here is a short example illustrating important behavioral differences between lists and `NumPy` arrays:

In [461]:
# Python lists:
x = [1, 2, 3, 4]
y = [5, 6, 7, 8]
x * 2

[1, 2, 3, 4, 1, 2, 3, 4]

In [462]:
x + y

[1, 2, 3, 4, 5, 6, 7, 8]

Now let's transform those lists into NumPy arrays:

In [463]:
import numpy as np
ax = np.array([1, 2, 3, 4])
ay = np.array([5, 6, 7, 8])
ax * 2

array([2, 4, 6, 8])

In [464]:
ax + 10

array([11, 12, 13, 14])

In [465]:
ax + ay

array([ 6,  8, 10, 12])

In [466]:
ax * ay

array([ 5, 12, 21, 32])

As you can see, basic mathematical operations involving arrays behave differently.  
Specifically, [scalar operations](https://en.wikipedia.org/wiki/Scalar_%28mathematics%29) (e.g., `ax * 2` or `ax + 10`) apply the operation on an element-by-element basis.  
In addition, performing math operations when both operands are arrays applies the operation to all elements and produces a new array.  
The fact that math operations apply to all of the elements simultaneously makes it very easy and fast to compute functions across an entire array.  
For example, if you want to compute the value of a polynomial:

In [467]:
def f(x):
    return 3*x**2 - 2*x + 7

f(ax)

array([ 8, 15, 28, 47])

NumPy provides a collection of "universal functions" that also allow for array operations.  
These are replacements for similar functions normally found in the `math` module.

In [468]:
np.sqrt(ax)

array([ 1.        ,  1.41421356,  1.73205081,  2.        ])

In [469]:
np.cos(ax)

array([ 0.54030231, -0.41614684, -0.9899925 , -0.65364362])

Using universal functions can be hundreds of times faster than looping over the array elements one at a time and performing calculations using functions in the `math` module.  
Thus, you should prefer their use whenever possible.  
Under the covers, `NumPy` arrays are allocated in the same manner as in C or Fortran.  
Namely, they are large, contiguous memory regions consisting of a homogeneous data type.  
Because of this, it’s possible to make arrays much larger than anything you would normally put into a Python list.  
For example, if you want to make a two-dimensional grid of 10,000 by 10,000 floats, it’s not an issue:

In [470]:
grid = np.zeros(shape=(10000,10000), dtype=float)
grid

array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]])

All of the usual operations still apply to all of the elements simultaneously:

In [471]:
grid += 10
grid

array([[ 10.,  10.,  10., ...,  10.,  10.,  10.],
       [ 10.,  10.,  10., ...,  10.,  10.,  10.],
       [ 10.,  10.,  10., ...,  10.,  10.,  10.],
       ..., 
       [ 10.,  10.,  10., ...,  10.,  10.,  10.],
       [ 10.,  10.,  10., ...,  10.,  10.,  10.],
       [ 10.,  10.,  10., ...,  10.,  10.,  10.]])

In [472]:
np.sin(grid)

array([[-0.54402111, -0.54402111, -0.54402111, ..., -0.54402111,
        -0.54402111, -0.54402111],
       [-0.54402111, -0.54402111, -0.54402111, ..., -0.54402111,
        -0.54402111, -0.54402111],
       [-0.54402111, -0.54402111, -0.54402111, ..., -0.54402111,
        -0.54402111, -0.54402111],
       ..., 
       [-0.54402111, -0.54402111, -0.54402111, ..., -0.54402111,
        -0.54402111, -0.54402111],
       [-0.54402111, -0.54402111, -0.54402111, ..., -0.54402111,
        -0.54402111, -0.54402111],
       [-0.54402111, -0.54402111, -0.54402111, ..., -0.54402111,
        -0.54402111, -0.54402111]])

One extremely notable aspect of `NumPy` is the manner in which it extends Python’s list indexing functionality -- especially with multidimensional arrays.  
To illustrate, make a simple two-dimensional array and try some experiments:

In [473]:
a = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
a

array([[ 1,  2,  3,  4],
       [ 5,  6,  7,  8],
       [ 9, 10, 11, 12]])

In [474]:
# Select row 1:
a[1]

array([5, 6, 7, 8])

In [475]:
# Select column 1:
a[:,1]

array([ 2,  6, 10])

In [476]:
# Select a subregion and change it:
print(a[1:3, 1:3])
a[1:3, 1:3] += 10
a

[[ 6  7]
 [10 11]]


array([[ 1,  2,  3,  4],
       [ 5, 16, 17,  8],
       [ 9, 20, 21, 12]])

In [477]:
# Broadcast a row vector across an operation on all rows:
print(a)
print(a + [100, 101, 102, 103])

[[ 1  2  3  4]
 [ 5 16 17  8]
 [ 9 20 21 12]]
[[101 103 105 107]
 [105 117 119 111]
 [109 121 123 115]]


In [478]:
# Conditional assignment on an array:
np.where(a < 10, a, 10)

array([[ 1,  2,  3,  4],
       [ 5, 10, 10,  8],
       [ 9, 10, 10, 10]])
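
Note that `np.where()` returns a new array and leaves `a` unchanged. If you want to modify values in place, boolean-mask assignment is one option. A sketch, assuming NumPy is installed and using the same array values as above:

```python
import numpy as np

a = np.array([[1, 2, 3, 4], [5, 16, 17, 8], [9, 20, 21, 12]])

# np.where() builds a new array; the original keeps its values.
clipped = np.where(a < 10, a, 10)
largest_in_a = a.max()
print(largest_in_a)

# Boolean-mask assignment changes the selected elements in place.
b = a.copy()
b[b > 10] = 10
same = np.array_equal(b, clipped)
print(same)
```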

### Discussion

NumPy is the foundation for a huge number of science and engineering libraries in Python.  
It is also one of the largest and most complicated modules in widespread use.  
That said, it’s still possible to accomplish useful things with NumPy by starting with simple examples and playing around.  
One note about usage is that it is relatively common to use the statement `import numpy as np`, as shown in the solution.  
This simply shortens the name to something that’s more convenient to type over and over again in your program.  
For more information, visit http://www.numpy.org.

## [Performing Matrix and Linear Algebra Calculations](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_performing_matrix_and_linear_algebra_calculations)

### Problem

You need to perform matrix and linear algebra operations, such as matrix multiplication, finding determinants, solving linear equations, and so on.

### Solution

The [NumPy library](http://www.numpy.org/) has a `matrix` object that can be used for this purpose.  
Matrices are somewhat similar to the array objects described in ["Calculating with Large Numerical Arrays"](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#calculatingwithlargenumerical), but follow linear algebra rules for computation.  
Here is an example that illustrates a few essential features:

In [479]:
import numpy as np

m = np.matrix([[1,-2,3],[0,4,5],[7,8,-9]])
m

matrix([[ 1, -2,  3],
        [ 0,  4,  5],
        [ 7,  8, -9]])

In [480]:
# Return matrix transpose:
m.T

matrix([[ 1,  0,  7],
        [-2,  4,  8],
        [ 3,  5, -9]])

In [481]:
# Return matrix inverse:
m.I

matrix([[ 0.33043478, -0.02608696,  0.09565217],
        [-0.15217391,  0.13043478,  0.02173913],
        [ 0.12173913,  0.09565217, -0.0173913 ]])

In [482]:
# Create a vector and multiply it by our matrix:
v = np.matrix([[2],[3],[4]])
print(v)
print(m * v)

[[2]
 [3]
 [4]]
[[ 8]
 [32]
 [ 2]]


More operations can be found in the `numpy.linalg` subpackage.

In [483]:
import numpy.linalg

# Determinant:
print(m)
print(numpy.linalg.det(m))

[[ 1 -2  3]
 [ 0  4  5]
 [ 7  8 -9]]
-230.0


In [484]:
# Eigenvalues:
numpy.linalg.eigvals(m)

array([-13.11474312,   2.75956154,   6.35518158])

In [485]:
# Solve for x in mx = v
x = numpy.linalg.solve(m, v)
x

matrix([[ 0.96521739],
        [ 0.17391304],
        [ 0.46086957]])

In [486]:
m * x

matrix([[ 2.],
        [ 3.],
        [ 4.]])

In [487]:
v

matrix([[2],
        [3],
        [4]])

### Discussion

Linear algebra is obviously a huge topic that’s far beyond the scope of this cookbook.  
However, if you need to manipulate matrices and vectors, NumPy is a good starting point.  
Visit http://www.numpy.org for more detailed information.
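
NumPy's own documentation no longer recommends the `matrix` class, even for linear algebra. The same calculations can be done with ordinary arrays and the `@` matrix-multiplication operator. The following is a sketch under that assumption (Python 3.5 or later), reusing the values from the solution:

```python
import numpy as np

m = np.array([[1, -2, 3], [0, 4, 5], [7, 8, -9]])
v = np.array([2, 3, 4])

# Matrix-vector product using the @ operator.
product = m @ v
print(product)

# Inverse, determinant, and a linear solve from numpy.linalg.
inverse = np.linalg.inv(m)
det = np.linalg.det(m)
x = np.linalg.solve(m, v)
print(det)
print(x)
```

With plain arrays, `*` remains element-wise multiplication, so `@` is the operator to use for matrix products.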

## [Picking Things at Random](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_picking_things_at_random)

### Problem

You want to pick random items out of a sequence or generate random numbers.

### Solution

The `random` module has various functions for random numbers and picking random items.  
For example, to pick a random item out of a sequence, use `random.choice()`:

In [488]:
import random

values = [1, 2, 3, 4, 5, 6]
print(random.choice(values))
print(random.choice(values))
print(random.choice(values))
print(random.choice(values))
print(random.choice(values))

2
5
3
1
6


To take a sampling of `n` items where selected items are removed from further consideration (sampling without replacement), use `random.sample()` instead.

In [489]:
print(random.sample(values, 2))
print(random.sample(values, 2))
print(random.sample(values, 3))
print(random.sample(values, 3))
print(random.sample(values, 4))

[2, 5]
[5, 1]
[6, 4, 1]
[5, 1, 2]
[6, 5, 3, 1]


If you simply want to shuffle items in a sequence in place, use `random.shuffle()`:

In [490]:
random.shuffle(values)
values

[6, 5, 3, 4, 1, 2]

In [491]:
random.shuffle(values)
values

[5, 3, 4, 6, 2, 1]

In [492]:
random.shuffle(values)
values

[2, 1, 6, 4, 5, 3]

To produce random integers, use `random.randint()`:

In [493]:
print(random.randint(0,10))
print(random.randint(0,10))
print(random.randint(0,10))
print(random.randint(0,10))
print(random.randint(0,10))

2
2
3
0
0


To produce uniform floating-point values in the range 0 to 1, use `random.random()`:

In [494]:
print(random.random())
print(random.random())
print(random.random())
print(random.random())
print(random.random())

0.1569886094818821
0.5744546715177569
0.0568203746559216
0.2938195363283753
0.9494094382950605


To get `n` random bits expressed as an integer, use `random.getrandbits()`:

In [495]:
print(random.getrandbits(10))
print(random.getrandbits(100))
print(random.getrandbits(200))
print(random.getrandbits(300))

586
663765668044339865214286291613
555824540709666369916210481933578878134958166764171590355443
573344503027149437841363898830194791026097435434483882345561286752010387264863293361059520


### Discussion

The `random` module computes random numbers using the [Mersenne Twister algorithm](https://en.wikipedia.org/wiki/Mersenne_Twister).  
This is a deterministic algorithm, but you can alter the initial seed by using the `random.seed()` function.
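
Because the algorithm is deterministic, reusing a seed reproduces the same sequence, which is handy for tests and simulations. A minimal sketch; the seed values are arbitrary:

```python
import random

# The same seed produces the same sequence of values.
random.seed(12345)
first = [random.randint(0, 10) for _ in range(5)]
random.seed(12345)
second = [random.randint(0, 10) for _ in range(5)]
print(first == second)

# A separate Random instance has its own state, independent of the
# module-level functions.
rng = random.Random(42)
sample = rng.sample(range(100), 3)
print(sample)
```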

In addition to the functionality shown, the `random` module includes functions for uniform, Gaussian, and other [probability distributions](https://en.wikipedia.org/wiki/Probability_distribution).  
For example, `random.uniform()` computes uniformly distributed numbers, and `random.gauss()` computes normally distributed numbers.  
Consult the documentation for information on other supported distributions.  
Functions in `random` should not be used in programs related to cryptography.  
If you need such functionality, consider using functions in the `ssl` module instead.  
For example, `ssl.RAND_bytes()` can be used to generate a cryptographically secure sequence of random bytes.
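
On Python 3.6 and later, the standard library's `secrets` module is another option designed for this purpose. The sketch below is an alternative to the `ssl` approach, not something the recipe itself uses:

```python
import secrets

# 16 cryptographically strong random bytes, and the same as hex text.
token = secrets.token_bytes(16)
hex_token = secrets.token_hex(16)
print(len(token), len(hex_token))

# Secure counterparts of random.choice() and a bounded random integer.
pick = secrets.choice(['red', 'green', 'blue'])
number = secrets.randbelow(10)
print(pick, number)
```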

## [Converting Days to Seconds, and Other Basic Time Conversions](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_converting_days_to_seconds_and_other_basic_time_conversions)

### Problem

You have code that needs to perform simple time conversions, like days to seconds, hours to minutes, and so on.

### Solution

To perform conversions and arithmetic involving different units of time, use the `datetime` module.  
For example, to represent an interval of time, create a `timedelta` instance, like this:

In [496]:
from datetime import timedelta

a = timedelta(days=2, hours=6)
a

datetime.timedelta(2, 21600)

In [497]:
b = timedelta(hours=4.5)
b

datetime.timedelta(0, 16200)

In [498]:
c = a + b
c

datetime.timedelta(2, 37800)

In [499]:
c.days

2

In [500]:
c.seconds

37800

In [501]:
c.seconds / 3600

10.5

In [502]:
c.total_seconds() / 3600

58.5
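
The conversions named in the problem can also be written directly. A sketch that relies on `timedelta` supporting division by another `timedelta`:

```python
from datetime import timedelta

# Days to seconds.
days_in_seconds = timedelta(days=2).total_seconds()
print(days_in_seconds)

# Hours to minutes: divide by the unit you want to count in.
hours_in_minutes = timedelta(hours=3) / timedelta(minutes=1)
print(hours_in_minutes)

# Whole hours plus whatever is left over.
span = timedelta(days=2, hours=6, minutes=30)
whole_hours, remainder = divmod(span, timedelta(hours=1))
print(whole_hours, remainder)
```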

If you need to represent specific dates and times, create `datetime` instances and use the standard mathematical operations to manipulate them.

In [503]:
from datetime import datetime

a = datetime(2017, 11, 9)
a

datetime.datetime(2017, 11, 9, 0, 0)

In [504]:
a + timedelta(days=10)

datetime.datetime(2017, 11, 19, 0, 0)

In [505]:
b = datetime(2017, 12, 25)
b

datetime.datetime(2017, 12, 25, 0, 0)

In [506]:
d = b - a
d

datetime.timedelta(46)

In [507]:
d.days

46

In [508]:
now = datetime.today()
now

datetime.datetime(2017, 11, 12, 3, 30, 31, 544057)

In [509]:
now + timedelta(minutes=10)

datetime.datetime(2017, 11, 12, 3, 40, 31, 544057)

When making calculations, it should be noted that `datetime` is aware of leap years.

In [510]:
a = datetime(2017, 3, 1)
b = datetime(2017, 2, 28)
print(a - b)
(a - b).days

1 day, 0:00:00


1

In [511]:
c = datetime(2016, 3, 1)
d = datetime(2016, 2, 28)
print((c - d).days)
c - d

2


datetime.timedelta(2)
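
The same subtraction can be used to check the length of February for any year. This is only a sketch; `days_in_february()` is a hypothetical helper introduced for illustration.

```python
from datetime import date

def days_in_february(year):
    # Distance from February 1 to March 1 of the same year.
    return (date(year, 3, 1) - date(year, 2, 1)).days

for year in (1900, 2000, 2016, 2017):
    print(year, days_in_february(year))
```

Century years are a useful check: 1900 is not a leap year, but 2000 is.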

### Discussion

For most basic date and time manipulation problems, the `datetime` module will suffice.  
If you need to perform more complex date manipulations, such as dealing with time zones, fuzzy time ranges, calculating the dates of holidays, and so forth, look at the `dateutil` module.  
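For example, many similar time calculations can be performed with `relativedelta` from the `dateutil.relativedelta` module.  
One notable feature is that it fills in some gaps pertaining to the handling of months (and their differing numbers of days), which `timedelta` cannot express because it has no notion of a month. In the cells below, `a` is still `datetime(2017, 3, 1)` from the leap-year example.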

In [512]:
from dateutil.relativedelta import relativedelta

a + relativedelta(months=+1)

datetime.datetime(2017, 4, 1, 0, 0)

In [513]:
a + relativedelta(months=+4)

datetime.datetime(2017, 7, 1, 0, 0)

In [514]:
# Time between two dates:
b = datetime(2017, 12, 25)
d = b - a
d

datetime.timedelta(299)

In [515]:
d = relativedelta(b, a)
d

relativedelta(months=+9, days=+24)

In [516]:
d.months

9

In [517]:
d.days

24
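
To see why months need special handling, here is a rough standard-library sketch of month arithmetic. `add_months()` is a hypothetical helper, not a `dateutil` or `datetime` function; it assumes that a day that doesn't exist in the target month should be clamped to that month's last day.

```python
import calendar
from datetime import datetime

def add_months(dt, months):
    """Hypothetical helper: shift dt by whole months, clamping the day."""
    # Count months from year 0 so that year rollover is just a divmod.
    month_index = dt.year * 12 + (dt.month - 1) + months
    year, month0 = divmod(month_index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return dt.replace(year=year, month=month, day=min(dt.day, last_day))

print(add_months(datetime(2017, 3, 1), 1))    # 2017-04-01 00:00:00
print(add_months(datetime(2017, 1, 31), 1))   # 2017-02-28 00:00:00
print(add_months(datetime(2017, 11, 30), 3))  # 2018-02-28 00:00:00
```

A fixed `timedelta(days=30)` or `timedelta(days=31)` can't produce these results in general, which is the gap that month-aware arithmetic fills.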

## [Determining Last Friday’s Date](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_determining_last_friday_8217_s_date)

### Problem

You want a general solution for finding a date for the last occurrence of a day of the week.  
Let's use Friday as an example.

### Solution

Python's `datetime` module has utility functions and classes to help perform calculations like this.  
A decent, generic solution to this problem looks like this:

In [518]:
from datetime import datetime, timedelta

weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

def get_previous_byday(dayname, start_date=None):
    if start_date is None:
        start_date = datetime.today()
    day_num = start_date.weekday()
    day_num_target = weekdays.index(dayname)
    days_ago = (7 + day_num - day_num_target) % 7
    if days_ago == 0:
        days_ago = 7
    target_date = start_date - timedelta(days=days_ago)
    return target_date
    
get_previous_byday('Friday')

datetime.datetime(2017, 11, 10, 3, 30, 31, 609794)

We can try out a couple more days:

In [519]:
get_previous_byday('Tuesday')

datetime.datetime(2017, 11, 7, 3, 30, 31, 615522)

In [520]:
get_previous_byday('Saturday')

datetime.datetime(2017, 11, 11, 3, 30, 31, 620437)

The optional `start_date` can be supplied using another `datetime` instance.

In [521]:
get_previous_byday('Sunday', datetime(2012, 12, 21))

datetime.datetime(2012, 12, 16, 0, 0)

### Discussion

This recipe works by mapping the start date and the target date to their numeric position in the week (with `Monday` as day 0).  
Modular arithmetic is then used to figure out how many days ago the target date last occurred.  
From there, the desired date is calculated from the start date by subtracting an appropriate `timedelta` instance.  
If you’re performing a lot of date calculations like this, you may be better off installing the `python-dateutil` package instead.  
For example, here is the same calculation performed using `relativedelta` from `dateutil`:

In [522]:
from datetime import datetime
from dateutil.relativedelta import relativedelta
from dateutil.rrule import FR  # only the Friday constant is needed here

d = datetime.now()
d

datetime.datetime(2017, 11, 12, 3, 30, 31, 631841)

In [523]:
print(d)

2017-11-12 03:30:31.631841


In [524]:
# Next Friday:
print(d + relativedelta(weekday=FR))

2017-11-17 03:30:31.631841


In [525]:
# Last Friday:
print(d + relativedelta(weekday=FR(-1)))

2017-11-10 03:30:31.631841
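
The modular arithmetic described above can be mirrored to look forward instead of backward. The sketch below uses only the standard library; `get_next_byday()` is a hypothetical counterpart to `get_previous_byday()`, and it assumes that "next" means strictly after the start date.

```python
from datetime import datetime, timedelta

weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
            'Friday', 'Saturday', 'Sunday']

def get_next_byday(dayname, start_date=None):
    """Hypothetical mirror of get_previous_byday(): a strictly later date."""
    if start_date is None:
        start_date = datetime.today()
    # Days forward from the start weekday to the target weekday, modulo 7.
    days_ahead = (weekdays.index(dayname) - start_date.weekday()) % 7
    if days_ahead == 0:
        days_ahead = 7          # same weekday: jump a full week ahead
    return start_date + timedelta(days=days_ahead)

start = datetime(2017, 11, 12)          # a Sunday, so weekday() == 6
print((7 + 6 - 4) % 7)                  # 2 days back to the previous Friday
print(get_next_byday('Friday', start))  # 2017-11-17 00:00:00
print(get_next_byday('Sunday', start))  # 2017-11-19 00:00:00
```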


## [Finding the Date Range for the Current Month](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_finding_the_date_range_for_the_current_month)

### Problem

You have some code that needs to loop over each date in the current month, and you want an efficient way to calculate that date range.

### Solution

Looping over the dates doesn’t require building a list of all the dates ahead of time.  
You can just calculate the starting and stopping date in the range, then use `datetime.timedelta` objects to increment the date as you go.  
Here’s a function that takes any `date` or `datetime` object (defaulting to today) and returns a tuple containing the first date of the month and the starting date of the next month:

In [526]:
from datetime import datetime, date, timedelta
import calendar

def get_month_range(start_date=None):
    if start_date is None:
        start_date = date.today()
    # Snap to the first day of the month so that any input date works:
    start_date = start_date.replace(day=1)
    _, days_in_month = calendar.monthrange(start_date.year, start_date.month)
    end_date = start_date + timedelta(days=days_in_month)
    return (start_date, end_date)

With this function, it's pretty simple to loop over the date range:

In [527]:
a_day = timedelta(days=1)
first_day, last_day = get_month_range()
while first_day < last_day:
    print(first_day)
    first_day += a_day

2017-11-01
2017-11-02
2017-11-03
2017-11-04
2017-11-05
2017-11-06
2017-11-07
2017-11-08
2017-11-09
2017-11-10
2017-11-11
2017-11-12
2017-11-13
2017-11-14
2017-11-15
2017-11-16
2017-11-17
2017-11-18
2017-11-19
2017-11-20
2017-11-21
2017-11-22
2017-11-23
2017-11-24
2017-11-25
2017-11-26
2017-11-27
2017-11-28
2017-11-29
2017-11-30


### Discussion

This recipe works by first calculating a date corresponding to the first day of the month.  
A quick way to do this is to use the `replace()` method of a `date` or `datetime` object to simply set the `day` attribute to 1.  
One nice thing about the `replace()` method is that it creates the same kind of object that you started with.  
Thus, if the input was a `date` instance, the result is a `date`.  
Likewise, if the input was a `datetime` instance, you get a `datetime` instance.  
After that, the `calendar.monthrange()` function is used to find out how many days are in the month in question.  
Any time you need to get basic information about calendars, the `calendar` module can be useful.  
`monthrange()` is one such function; it returns a tuple containing the weekday of the first day of the month along with the number of days in the month.

Once the number of days in the month is known, the ending date is calculated by adding an appropriate `timedelta` to the starting date.  
It’s subtle, but an important aspect of this recipe is that the ending date is not to be included in the range (it is actually the first day of the next month).  
This mirrors the behavior of Python’s slices and range operations, which also never include the end point.  
To loop over the date range, standard math and comparison operators are used.  
For example, `timedelta` instances can be used to increment the date.  
The `<` operator is used to check whether a date comes before the ending date.

Ideally, it would be nice to create a function that works like the built-in `range()` function, but for dates.  
Fortunately, this is extremely easy to implement using a generator:

In [528]:
def date_range(start, stop, step):
    while start < stop:
        yield start
        start += step

Here is an example of it in use:

In [529]:
for d in date_range(datetime(2017, 11, 1),
                    datetime(2017, 12, 1), timedelta(hours=6)):
    print(d)

2017-11-01 00:00:00
2017-11-01 06:00:00
2017-11-01 12:00:00
2017-11-01 18:00:00
2017-11-02 00:00:00
2017-11-02 06:00:00
2017-11-02 12:00:00
2017-11-02 18:00:00
2017-11-03 00:00:00
2017-11-03 06:00:00
2017-11-03 12:00:00
2017-11-03 18:00:00
2017-11-04 00:00:00
2017-11-04 06:00:00
2017-11-04 12:00:00
2017-11-04 18:00:00
2017-11-05 00:00:00
2017-11-05 06:00:00
2017-11-05 12:00:00
2017-11-05 18:00:00
2017-11-06 00:00:00
2017-11-06 06:00:00
2017-11-06 12:00:00
2017-11-06 18:00:00
2017-11-07 00:00:00
2017-11-07 06:00:00
2017-11-07 12:00:00
2017-11-07 18:00:00
2017-11-08 00:00:00
2017-11-08 06:00:00
2017-11-08 12:00:00
2017-11-08 18:00:00
2017-11-09 00:00:00
2017-11-09 06:00:00
2017-11-09 12:00:00
2017-11-09 18:00:00
2017-11-10 00:00:00
2017-11-10 06:00:00
2017-11-10 12:00:00
2017-11-10 18:00:00
2017-11-11 00:00:00
2017-11-11 06:00:00
2017-11-11 12:00:00
2017-11-11 18:00:00
2017-11-12 00:00:00
2017-11-12 06:00:00
2017-11-12 12:00:00
2017-11-12 18:00:00
2017-11-13 00:00:00
2017-11-13 06:00:00
...


Again, a major part of the ease of implementation is that dates and times can be manipulated using standard math and comparison operators.
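
The two pieces of this recipe combine naturally. The sketch below repeats both functions so that it runs on its own, and makes two assumptions that go beyond the text: this copy of `get_month_range()` snaps any input to the first of the month, and `date_range()` gains a default `step` of one day.

```python
import calendar
from datetime import date, timedelta

def get_month_range(start_date=None):
    if start_date is None:
        start_date = date.today()
    start_date = start_date.replace(day=1)   # assumption: always start on day 1
    _, days_in_month = calendar.monthrange(start_date.year, start_date.month)
    return start_date, start_date + timedelta(days=days_in_month)

def date_range(start, stop, step=timedelta(days=1)):
    # Half-open range: `stop` itself is never produced.
    while start < stop:
        yield start
        start += step

first_day, next_month = get_month_range(date(2016, 2, 15))
days = list(date_range(first_day, next_month))
print(len(days), days[0], days[-1])   # 29 2016-02-01 2016-02-29
```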

## [Converting Strings into Datetimes](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_converting_strings_into_datetimes)

### Problem

Your application receives temporal data in string format, but you want to convert those strings into `datetime` objects in order to perform nonstring operations on them.

### Solution

Python's standard `datetime` module can do this.

In [530]:
from datetime import datetime

text = '2017-11-1'
y = datetime.strptime(text, '%Y-%m-%d')
z = datetime.now()
diff = z - y
diff

datetime.timedelta(11, 12631, 697282)

### Discussion

The `datetime.strptime()` method supports a host of formatting codes, like `%Y` for the four-digit year and `%m` for the two-digit month.  
It’s also worth noting that these formatting placeholders also work in reverse, in case you need to represent a `datetime` object in string output and make it look nice.  
For example, let’s say you have some code that generates a `datetime` object, but you need to format a nice, human-readable date to put in the header of an auto-generated letter or report:

In [531]:
z

datetime.datetime(2017, 11, 12, 3, 30, 31, 697282)

In [532]:
nice_z = datetime.strftime(z, '%A %B %d, %Y')
nice_z

'Sunday November 12, 2017'

It’s worth noting that the performance of `strptime()` is often much worse than you might expect, because it’s written in pure Python and has to deal with all sorts of system locale settings.  
If you are parsing a lot of dates in your code and you know the precise format, you will probably get much better performance by cooking up a custom solution instead.  
For example, if you knew that the dates were of the form "YYYY-MM-DD," you could write a function like this:

In [533]:
from datetime import datetime

def parse_ymd(s):
    year_s, mon_s, day_s = s.split('-')
    return datetime(int(year_s), int(mon_s), int(day_s))

When tested, this function runs over seven times faster than `datetime.strptime()`.  
This is probably something to consider if you're processing large amounts of data involving dates.
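
If you want to measure the difference on your own machine, a rough sketch using `timeit` might look like this. The iteration count `n` is an arbitrary choice, and the actual ratio will vary with the Python version and platform.

```python
import timeit
from datetime import datetime

def parse_ymd(s):
    year_s, mon_s, day_s = s.split('-')
    return datetime(int(year_s), int(mon_s), int(day_s))

text = '2017-11-1'

# Both approaches should agree before their speed is compared.
same_result = parse_ymd(text) == datetime.strptime(text, '%Y-%m-%d')
print(same_result)

n = 10000   # arbitrary number of repetitions
t_strptime = timeit.timeit(lambda: datetime.strptime(text, '%Y-%m-%d'), number=n)
t_custom = timeit.timeit(lambda: parse_ymd(text), number=n)
print('strptime: {:.4f}s  parse_ymd: {:.4f}s'.format(t_strptime, t_custom))
```

Keep in mind that `parse_ymd()` does no validation beyond what the `datetime` constructor enforces, so it only makes sense when the input format is known and trusted.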

## [Manipulating Dates Involving Time Zones](http://chimera.labs.oreilly.com/books/1230000000393/ch03.html#_manipulating_dates_involving_time_zones)

### Problem

You had a conference call scheduled for November 12, 2017, at 9:30 a.m. in Chicago.  
At what local time did your friend in Bangalore, India, have to show up to attend?

### Solution

For almost any problem involving time zones, you should use the [pytz module](https://pypi.python.org/pypi/pytz).  
This package provides the [Olson time zone database](https://en.wikipedia.org/wiki/Tz_database), which is the de facto standard for time zone information found in many languages and operating systems.  
A major use of `pytz` is in localizing simple dates created with the `datetime` library.  
For example, here is how you would represent a date in Chicago time:

In [534]:
from datetime import datetime
from pytz import timezone
import pytz

d = datetime(2017, 11, 12, 9, 30, 0)
print(d)

2017-11-12 09:30:00


In [535]:
# Localize the date for Chicago:
central = timezone('US/Central')
loc_d = central.localize(d)
print(loc_d)

2017-11-12 09:30:00-06:00


Once the date has been localized, it can be converted to other time zones.  
To find the same time in Bangalore, you would do this:

In [536]:
# Convert to Bangalore time:
bang_d = loc_d.astimezone(timezone('Asia/Kolkata'))
print(bang_d)

2017-11-12 21:00:00+05:30


If you are going to perform arithmetic with localized dates, you need to be particularly aware of daylight saving transitions and other details.  
For example, in 2013, U.S. daylight saving time started on March 10 at 2:00 a.m. local time (at which point, time skipped ahead one hour).  
If you’re performing naive arithmetic, you’ll get it wrong:

In [537]:
d = datetime(2013, 3, 10, 1, 45)
loc_d = central.localize(d)
print(loc_d)

2013-03-10 01:45:00-06:00


In [538]:
later = loc_d + timedelta(minutes=30)
print(later)  # Time change is ignored:

2013-03-10 02:15:00-06:00


The answer is wrong because it doesn't account for the one-hour skip in the local time.  
To fix this, use the `normalize()` method of the time zone.

In [539]:
from datetime import timedelta

later = central.normalize(loc_d + timedelta(minutes=30))
print(later)

2013-03-10 03:15:00-05:00


### Discussion

To keep your head from completely exploding, a common strategy for localized date handling is to convert all dates to UTC time and to use that for all internal storage and manipulation.

In [540]:
print(loc_d)

2013-03-10 01:45:00-06:00


In [541]:
utc_d = loc_d.astimezone(pytz.utc)
print(utc_d)

2013-03-10 07:45:00+00:00


Once in UTC, you don’t have to worry about issues related to daylight saving time and other matters.  
Thus, you can simply perform normal date arithmetic as before.  
Should you want to output the date in localized time, just convert it to the appropriate time zone afterward.

In [542]:
later_utc = utc_d + timedelta(minutes=30)
print(later_utc.astimezone(central))

2013-03-10 03:15:00-05:00
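
The "do the arithmetic in UTC" idea can also be sketched with nothing but the standard library. Note the assumption: `datetime.timezone` objects are fixed offsets that know nothing about daylight saving rules, so the `CST` and `CDT` offsets below are picked by hand for this one example, which is exactly the bookkeeping that `pytz` does for you.

```python
from datetime import datetime, timedelta, timezone

# Hand-picked fixed offsets (assumption: no DST rules are modeled here).
CST = timezone(timedelta(hours=-6), 'CST')   # Chicago before the 2013 switch
CDT = timezone(timedelta(hours=-5), 'CDT')   # Chicago after the 2013 switch

start = datetime(2013, 3, 10, 1, 45, tzinfo=CST)

# Convert to UTC, do the arithmetic there, then convert back for display.
utc_start = start.astimezone(timezone.utc)
later_utc = utc_start + timedelta(minutes=30)

print(utc_start)                  # 2013-03-10 07:45:00+00:00
print(later_utc)                  # 2013-03-10 08:15:00+00:00
print(later_utc.astimezone(CDT))  # 2013-03-10 03:15:00-05:00
```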


One issue in working with time zones is simply figuring out what time zone names to use.  
For example, in this recipe, how was it known that "Asia/Kolkata" was the correct time zone name for India?  
To find out, you can consult the `pytz.country_timezones` dictionary using the ISO 3166 country code as a key.

In [543]:
pytz.country_timezones['IN']

['Asia/Kolkata']