# Core Python

The examples in this notebook cover issues commonly encountered by someone new to Python and interested in Data Science.

## Numeric Equality

Numeric values that represent the same number compare equal, even if their numeric types differ (for example int and float).

In general, objects of different non-numeric types do not compare equal.

In [5]:
a = 2.0**3
print(f'type(a): {type(a)}, a = {a}')

b = 2**3
print(f'type(b): {type(b)}, b = {b}')

print(f'type(a) == type(b): {type(a) == type(b)}')
print(f'a == b: {a==b}')

type(a): <class 'float'>, a = 8.0
type(b): <class 'int'>, b = 8
type(a) == type(b): False
a == b: True
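
As a complementary sketch, the cell below contrasts numeric types, which compare by value, with non-numeric types, which do not compare equal across types. The variable names are illustrative only.

```python
from decimal import Decimal
from fractions import Fraction

# numeric types compare by value, regardless of the concrete type
int_vs_float = (8 == 8.0)
fraction_vs_float = (Fraction(1, 2) == 0.5)
decimal_vs_int = (Decimal(8) == 8)
print(int_vs_float, fraction_vs_float, decimal_vs_int)

# bool is a subclass of int, so True compares equal to 1
bool_vs_int = (True == 1)
print(bool_vs_int)

# a string and a number do not compare equal, even if they look alike
str_vs_int = ('8' == 8)
print(str_vs_int)

# a list and a tuple with the same items do not compare equal either
list_vs_tuple = ([1, 2] == (1, 2))
print(list_vs_tuple)
```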


## Floating Point Comparisons

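Floating point numbers have limited precision.  Epsilon is the difference between 1.0 and the next larger representable float, so results are only accurate to a relative error on the order of epsilon.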

In [6]:
import sys
eps = sys.float_info.epsilon
eps

2.220446049250313e-16
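
To make the meaning of epsilon concrete, here is a small sketch. It assumes the usual IEEE 754 double precision floats, which is what Python uses on common platforms.

```python
import sys

eps = sys.float_info.epsilon

# eps is the gap between 1.0 and the next larger representable float
one_plus_eps_is_larger = (1.0 + eps > 1.0)
print(one_plus_eps_is_larger)      # True

# half of eps is too small to register when added to 1.0
half_eps_is_lost = (1.0 + eps / 2 == 1.0)
print(half_eps_is_lost)            # True

# the gap scales with magnitude: near 1e16 adjacent floats are 2.0 apart
one_is_lost_at_1e16 = (1e16 + 1.0 == 1e16)
print(one_is_lost_at_1e16)         # True
```

This is why epsilon describes relative precision, not absolute precision.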

In [7]:
1.1 + 2.2

3.3000000000000003

In [8]:
3.3

3.3

In [9]:
1.1 + 2.2 == 3.3

False

In [10]:
# mathematical operations may increase the imprecision
print(abs(1.1 + 2.2 - 3.3) <= eps)
print(abs(1.1 + 2.2 - 3.3) <= 2*eps)

False
True


In [11]:
# isclose is useful for determining equality with floating point numbers
import numpy as np
x = 1.1 + 2.2
y = 3.3
print(np.isclose(x, y))

True


In [12]:
# relative tolerance: is x/y close to 1.0?
def rel_tol(x, y, tol):
    """similar to np.isclose(x, y, atol=0, rtol=tol)"""
    # multiply instead of dividing so that y == 0 and negative values are handled
    return abs(x - y) <= tol * abs(y)

In [13]:
# absolute tolerance: is x-y close to 0.0?
def abs_tol(x, y, tol):
    """similar to np.isclose(x, y, atol=tol, rtol=0)"""
    return abs(x - y) <= tol

In [14]:
# exponentiation increases imprecision
x = (1.1+2.2)**10
y = 3.3**10

In [15]:
# usually relative tolerance is most useful
print(f'{x/y}', '\n')
tols = [1e-14, 1e-15]
for tol in tols:
    print(f'Relative Tolerance: {tol}')
    print(f'x == y {rel_tol(x, y, tol)}')   
    print(f'x == y {np.isclose(x, y, atol=0, rtol=tol)}', '\n')

1.0000000000000013 

Relative Tolerance: 1e-14
x == y True
x == y True 

Relative Tolerance: 1e-15
x == y False
x == y False 



In [16]:
# however, when comparing with zero, absolute tolerance is necessary (a relative tolerance scaled by zero is zero)
print(f'{x-y}', '\n')
tols = [1e-9, 1e-10]
for tol in tols:
    print(f'Absolute Tolerance: {tol}')
    print(f'x == y {abs_tol(x, y, tol)}')   
    print(f'x == y {np.isclose(x, y, atol=tol, rtol=0)}', '\n')

2.0372681319713593e-10 

Absolute Tolerance: 1e-09
x == y True
x == y True 

Absolute Tolerance: 1e-10
x == y False
x == y False 



### math.isclose() vs numpy.isclose()

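When both atol and rtol are nonzero, math.isclose() and numpy.isclose() can produce different results!  Their default tolerances also differ, and math.isclose() names its parameters rel_tol and abs_tol.

math.isclose() tests abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol), which is symmetric in a and b.  numpy.isclose() tests abs(a - b) <= atol + rtol * abs(b), which adds the two tolerances and scales rtol by b only.

Suggestion: set one of atol or rtol to zero, or use math.isclose().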

See for example: https://apassionatechie.wordpress.com/2018/01/09/isclose-function-in-numpy-is-different-from-math/
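
The difference can be sketched without importing NumPy. The helper numpy_style_isclose below is hypothetical: it only mirrors the formula given in the NumPy documentation for scalars, and the numbers are chosen purely for illustration.

```python
import math

def numpy_style_isclose(a, b, rtol, atol):
    """Hypothetical helper mirroring NumPy's documented scalar test:
    abs(a - b) <= atol + rtol * abs(b)"""
    return abs(a - b) <= atol + rtol * abs(b)

a, b = 100.0, 101.0
rtol, atol = 0.005, 0.6

# math.isclose takes the larger of the two tolerances
math_result = math.isclose(a, b, rel_tol=rtol, abs_tol=atol)
# the numpy-style test adds the two tolerances together
numpy_style_result = numpy_style_isclose(a, b, rtol=rtol, atol=atol)
print(f'math.isclose:        {math_result}')
print(f'numpy-style formula: {numpy_style_result}')

# the numpy-style test is also asymmetric, because rtol scales only abs(b)
forward = numpy_style_isclose(10.0, 11.0, rtol=0.095, atol=0.0)
backward = numpy_style_isclose(11.0, 10.0, rtol=0.095, atol=0.0)
print(f'isclose(10, 11): {forward}, isclose(11, 10): {backward}')
```

With these inputs the two tests disagree, and swapping the arguments changes the numpy-style answer.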

### Money Calculation Note

Floating point numbers should not be used for financial applications.  Use Decimal (from the decimal module) instead.

Although the relative difference between using float and decimal is usually very small, the absolute difference could be a penny or more.  Financial calculations must be exact or your code may not be considered acceptable to an accountant. 
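
A minimal sketch of the difference, using only the standard library. The amounts and the rounding rule (ROUND_HALF_UP) are illustrative assumptions; real financial code should follow whatever rounding rule the application requires.

```python
from decimal import Decimal, ROUND_HALF_UP

# binary floats cannot represent most decimal fractions exactly
float_total = 0.10 + 0.20
print(float_total == 0.30)                 # False

# Decimal values built from strings are exact
decimal_total = Decimal('0.10') + Decimal('0.20')
print(decimal_total == Decimal('0.30'))    # True

# rounding to cents with an explicit rounding rule
price = Decimal('2.675')
cents = price.quantize(Decimal('0.01'), rounding=ROUND_HALF_UP)
print(cents)                               # 2.68

# the float nearest to 2.675 is slightly below it, so it rounds down
float_cents = round(2.675, 2)
print(float_cents)                         # 2.67
```

Note that the Decimal values are constructed from strings. Constructing them from floats would carry the float's representation error into the Decimal.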

## Python's Copy Semantics

The following is a descriptive overview.  The examples will make this clearer.

Before understanding how objects are copied, it is necessary to understand the difference between 'is' and '==', as well as the difference between mutable and immutable objects.

### 'is' vs '=='

**a is b**  
if two variables refer to the same object in memory, then 'a is b' returns True.

**a == b**  
if the values (aka contents) of the objects referred to by two variables are the same, then 'a == b' returns True.

If 'a is b' is True, then 'a == b' is almost always True as well (NaN, which is not equal to itself, is a rare exception).  
If 'a is b', then a and b are said to be aliases of one another.
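
A short sketch of the distinction, using lists so that the integer and string caching discussed later does not get in the way. The last lines show one edge case, NaN, where an object is identical to itself but does not compare equal to itself.

```python
a = [1, 2, 3]
b = [1, 2, 3]   # equal contents, but a separate list object
c = a           # c is an alias for a

print(f'a == b: {a == b}, a is b: {a is b}')
print(f'a == c: {a == c}, a is c: {a is c}')

# edge case: NaN is defined to be unequal to everything, including itself
nan = float('nan')
print(f'nan is nan: {nan is nan}, nan == nan: {nan == nan}')
```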

### Mutable vs Immutable

An immutable object is one whose contents cannot be changed.  Examples include:
1. numbers (int, float)
2. strings
3. tuples
4. namedtuples

A mutable object is one whose contents can be changed.  Examples include:
1. list
2. dictionary
3. set 

If a variable refers to an object in memory, and that object is immutable, it is not possible to change the contents of the memory referred to.

Understanding whether or not two variables refer to the same object in memory, and whether or not the object in memory is mutable or contains references to mutable objects, is key to understanding Python's copy semantics.
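
As a small illustration, the sketch below tries to modify a string and a tuple, and then shows that a tuple can still contain a reference to a mutable object. The variable names are illustrative.

```python
s = 'Hello'
try:
    s[0] = 'J'          # strings do not support item assignment
except TypeError as err:
    print(f'TypeError: {err}')

t = (1, [2, 3])
try:
    t[0] = 99           # neither do tuples
except TypeError as err:
    print(f'TypeError: {err}')

# the tuple itself is immutable, but the list it refers to is not
t[1].append(4)
print(t)                # (1, [2, 3, 4])
```

The tuple still holds the same two references as before. Only the contents of the list it refers to have changed.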

In [17]:
# a refers to the immutable object in memory that represents the integer 3
a = 3
print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')

# adding 1 to a cannot change the integer object itself
# instead a is rebound to (now refers to) a different object, the one that represents 4
a = a + 1
print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')

a = "3", hex(id(a)) = 0x55c4ed3ea3c0
a = "4", hex(id(a)) = 0x55c4ed3ea3e0


In [18]:
# a refers to the immutable object in memory that represents the string "Hello"
a = "Hello"
print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')

a = a + " World"
print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')

a = "Hello", hex(id(a)) = 0x7f171f923960
a = "Hello World", hex(id(a)) = 0x7f173b5f35f0


In [19]:
# a refers to a mutable object
a = [1, 2, 3]
print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')

# its contents can be changed, a refers to the same object as before
a[1] = 9
print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')

a = "[1, 2, 3]", hex(id(a)) = 0x7f1735277ac8
a = "[1, 9, 3]", hex(id(a)) = 0x7f1735277ac8


### Aside: Memory Optimization May Cause 'is' to be True Unexpectedly

In order to save memory, CPython reuses the objects for small integer values (-5 through 256), as well as certain strings.  This is an implementation detail and should not be relied upon.

This could create confusion when trying to understand how 'is' works.

In [20]:
# memory optimization uses the same memory location for the integer 4 for both a and b
a = 1+3
b = 2+2

# memory location for a and b is the same!
print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')
print(f'b = "{b}", hex(id(b)) = {hex(id(b))}')

print(f'a is b: {a is b}')
print(f'a == b: {a == b}')

a = "4", hex(id(a)) = 0x55c4ed3ea3e0
b = "4", hex(id(b)) = 0x55c4ed3ea3e0
a is b: True
a == b: True


In [21]:
# however, for larger values there is no guarantee that the interpreter reuses the same immutable integer object
a = 987654321
b = 987654321

# here the memory locations for a and b are different!
print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')
print(f'b = "{b}", hex(id(b)) = {hex(id(b))}')

print(f'a is b: {a is b}')
print(f'a == b: {a == b}')

a = "987654321", hex(id(a)) = 0x7f171fdf0d90
b = "987654321", hex(id(b)) = 0x7f171fdf0e30
a is b: False
a == b: True


#### Conclusion: 'is' vs '=='

Never use 'is' to test whether two values are equal; use '==' instead.  Reserve 'is' for identity checks.
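
One place where an identity test is the usual idiom is a check against None, which is a singleton. The function describe below is a hypothetical helper written only for this sketch.

```python
def describe(value=None):
    """Hypothetical helper: distinguishes 'no value given' from falsy values."""
    # None is a singleton, so an identity test is the idiomatic check
    if value is None:
        return 'no value'
    return f'value = {value!r}'

print(describe())       # no value
print(describe(0))      # value = 0
print(describe([]))     # value = []
```

Testing with 'is None' keeps 0 and the empty list from being mistaken for a missing value.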

In [22]:
a = 987654321

# make b refer to the same object as a
b = a

print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')
print(f'b = "{b}", hex(id(b)) = {hex(id(b))}')
print(f'a is b: {a is b}')
print(f'a == b: {a == b}')

a = "987654321", hex(id(a)) = 0x7f171fdf0db0
b = "987654321", hex(id(b)) = 0x7f171fdf0db0
a is b: True
a == b: True


In [24]:
# adding 1 to a will create a new integer object in memory and bind a to that new object
# b continues to refer to the same integer object as before
a = a + 1

print(f'a = "{a}", hex(id(a)) = {hex(id(a))}')
print(f'b = "{b}", hex(id(b)) = {hex(id(b))}')

a = "987654322", hex(id(a)) = 0x7f171fdf0eb0
b = "987654321", hex(id(b)) = 0x7f171fdf0db0


## Awesome Python Visual Aid

The easiest way to understand the copy semantics of mutable objects is visually.

There is an excellent visual tool for this: http://www.pythontutor.com/visualize.html#mode=display

For learning, this visualization is better than anything I can describe in text, and is better than using a debugger.

1. Cut and paste the code from the cells below as directed.  
2. Click on the "visualize execution" button.  
3. Step through it line by line by clicking on "forward".

### Modifying an Object "in-place" vs not "in-place"

To modify an object "in-place" means to change the contents of the object.  This can only be done for mutable objects.

Usually an "in-place" operation returns None, whereas a non "in-place" operation returns a new object and leaves the original unchanged.

In [25]:
# copy this entire cell to http://www.pythontutor.com/visualize.html#mode=display
# click: 'visualize execution' and step through code with 'forward'

# lists are mutable, their contents can be modified in-place
print('in-place operation on list')
a = [2, 1, 3]
print(f'a = {a}')

# apply in-place method
print(f'x = a.sort()')
x = a.sort()
print(f'x = {x}')
print(f'a = {a}')
print()

# apply non in-place method
print('non in-place operation on list')
a = [2, 1, 3]
print(f'a = {a}')
print(f'x = sorted(a)')
x = sorted(a)
print(f'x = {x}')
print(f'a = {a}')

in-place operation on list
a = [2, 1, 3]
x = a.sort()
x = None
a = [1, 2, 3]

non in-place operation on list
a = [2, 1, 3]
x = sorted(a)
x = [1, 2, 3]
a = [2, 1, 3]
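
The same distinction shows up with operators. As a sketch, for lists the augmented assignment += extends the existing object, while + builds a new list. The last lines show a common pitfall: assigning the result of an in-place method.

```python
a = [1, 2, 3]
alias = a

# += on a list extends the existing object in-place, so the alias sees the change
a += [4]
same_after_iadd = a is alias
print(alias, same_after_iadd)        # [1, 2, 3, 4] True

# a + [5] builds a new list and rebinds a; the alias keeps the old object
a = a + [5]
same_after_add = a is alias
print(alias, a, same_after_add)      # [1, 2, 3, 4] [1, 2, 3, 4, 5] False

# pitfall: sort() returns None, so this assignment discards the list
numbers = [2, 1, 3]
numbers = numbers.sort()
print(numbers)                       # None
```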


In [26]:
# copy this entire cell to http://www.pythontutor.com/visualize.html#mode=display
# click: 'visualize execution' and step through code with 'forward'

a = [1, 2, 3]
b = a
print(a)
print(b)
print(hex(id(a)))
print(hex(id(b)))
print(a is b)

# modify the contents of the list object in-place
a[1] = 99
print(hex(id(a)))
print(hex(id(b)))
print(a is b)

# a and b refer to the same list, so the change made through a is also visible through b
print(b)

# in Python, the convention (which most packages follow) is for an in-place method to return None
# a notable exception is pop(), which both modifies the list and returns the value it removed
z = a.append(-99)
print(z)
z = a.pop()
print(z)
print(a)
print(b)
print(a is b)

[1, 2, 3]
[1, 2, 3]
0x7f171f7c4608
0x7f171f7c4608
True
0x7f171f7c4608
0x7f171f7c4608
True
[1, 99, 3]
None
-99
[1, 99, 3]
[1, 99, 3]
True


In [27]:
# copy this entire cell to http://www.pythontutor.com/visualize.html#mode=display
# click: 'visualize execution' and step through code with 'forward'

# [:] is a shallow copy
a = [1, 2, 3]
b = a[:]

# although the contents of a and b are equal, a and b refer to different objects in memory
print(a)
print(b)
print(a is b)
print()

# so modifying a has no effect on b
a[1] = 100
print(a)
print(b)
print(a is b)
print()

# alternative ways to create shallow copies of a list
a = [1, 2, 3]
b = a.copy()

# modifying a has no effect on b
a[1] = 100
print(a)
print(b)
print(a is b)
print()

from copy import copy
a = [1, 2, 3]
b = copy(a)

# modifying a has no effect on b
a[1] = 100
print(a)
print(b)
print(a is b)

[1, 2, 3]
[1, 2, 3]
False

[1, 100, 3]
[1, 2, 3]
False

[1, 100, 3]
[1, 2, 3]
False

[1, 100, 3]
[1, 2, 3]
False


In [28]:
# copy this entire cell to http://www.pythontutor.com/visualize.html#mode=display
# click: 'visualize execution' and step through code with 'forward'

# if the contents are themselves references to mutable objects,
# the situation is more complex
c = [[1, 2], [3, 4]]

# d is a shallow copy of c
d = c[:]

# their values are the same
print(f'c = {c}')
print(f'd = {d}')

# but it is not the same list
print(f'c is d: {c is d}')
print()

# as this was a shallow copy, c[0] is d[0]
print(c[0] is d[0])
print(c[1] is d[1])

# so modifying c[0] in-place also changes what is seen through d
c[0][1] = 999
print(c)
print(d)

c = [[1, 2], [3, 4]]
d = [[1, 2], [3, 4]]
c is d: False

True
True
[[1, 999], [3, 4]]
[[1, 999], [3, 4]]


In [29]:
# copy this entire cell to http://www.pythontutor.com/visualize.html#mode=display
# click: 'visualize execution' and step through code with 'forward'

# same as previous cell but with deepcopy

c = [[1, 2], [3, 4]]

# d is a deep copy of c
from copy import deepcopy
d = deepcopy(c)

# their values are the same
print(f'c = {c}')
print(f'd = {d}')

# but it is not the same list
print(f'c is d: {c is d}')
print()

# as this was a deep copy, c[0] is not d[0]
print(c[0] is d[0])
print(c[1] is d[1])

# so modifying c[0][1], for example, will not change the value of d[0][1]
c[0][1] = 999
print(c)
print(d)

c = [[1, 2], [3, 4]]
d = [[1, 2], [3, 4]]
c is d: False

False
False
[[1, 999], [3, 4]]
[[1, 2], [3, 4]]


In [30]:
# copy this entire cell to http://www.pythontutor.com/visualize.html#mode=display
# click: 'visualize execution' and step through code with 'forward'

# similar to above examples, but with sets instead
s = set([1, 2, 3])
t = s

# in-place modification of s, changes t
s.add(4)
print(t)

# use shallow copy
t = s.copy()
s.add(5)
print(s)
print(t)

{1, 2, 3, 4}
{1, 2, 3, 4, 5}
{1, 2, 3, 4}
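
Dictionaries follow the same rules. The sketch below compares an alias, a shallow copy, and a deep copy of a dictionary that contains a list. The dictionary contents are made up for illustration.

```python
from copy import deepcopy

config = {'name': 'run1', 'layers': [64, 32]}

alias = config              # same dict object
shallow = config.copy()     # new dict, but the values are shared
deep = deepcopy(config)     # new dict and a new nested list

config['name'] = 'run2'         # rebinding a key affects only config and its alias
config['layers'].append(16)     # in-place change of the shared nested list

print(alias)      # {'name': 'run2', 'layers': [64, 32, 16]}
print(shallow)    # {'name': 'run1', 'layers': [64, 32, 16]}
print(deep)       # {'name': 'run1', 'layers': [64, 32]}
```

The shallow copy keeps its own 'name' entry but shares the nested list, so only the deep copy is fully independent.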


In [31]:
# copy this entire cell to http://www.pythontutor.com/visualize.html#mode=display
# click: 'visualize execution' and step through code with 'forward'

# difference between append and extend
my_list = [1,2,3]
my_list.append((4,5))
ret = my_list.extend((6,7,8,9))
print(ret)
print(my_list)

# Note: a list can be used as a stack with append and pop
ret = my_list.pop()
print(f'{ret} {my_list}')

None
[1, 2, 3, (4, 5), 6, 7, 8, 9]
9 [1, 2, 3, (4, 5), 6, 7, 8]
