###### Copyright &copy; Anand B Pillai, Anvetsu Technologies Pvt. Ltd (2015)

# Data Structures & Types

## 1. Mutable data structures as default arguments

### 1.1. Show me the Code!

In [17]:
def add_name(names=[]):
    """ Add a name to a list """
    
    name=raw_input("Enter name: ").strip()
    names.append(name)
    return names

In [18]:
print add_name()

Enter name: Anand
['Anand']


In [19]:
print add_name()

Enter name: Appu
['Anand', 'Appu']


In [20]:
print add_name()

Enter name: Dhruv
['Anand', 'Appu', 'Dhruv']


## Gotcha !!!

#### This is because,

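   1. Default argument values are evaluated just once, when the __def__ statement is executed, and not every time the function is called.
   2. This is because Python functions are first-class objects: the default list is stored on the function object itself, so every call that omits _names_ appends to the same list.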
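
A minimal sketch of the two points above. The function `append_item` is a hypothetical stand-in for `add_name` that takes no keyboard input, and the code is written to run on both Python 2.7 and Python 3. It reads the stored default back through the function's `__defaults__` attribute to show that every call shares one list.

```python
def append_item(item, bucket=[]):
    """Append item to bucket; the default list is created once, at def time."""
    bucket.append(item)
    return bucket

first = append_item("a")
second = append_item("b")

# Both calls returned the very same list object ...
same_object = first is second
# ... and that object is the default stored on the function itself.
stored_default = append_item.__defaults__[0]

print(same_object)       # True
print(stored_default)    # ['a', 'b']
```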

### 1.2. Show me the Fix !

#### 1.2.1. Use a place-holder value instead of modifying the default value directly.

In [1]:
def add_name(names=None):
    
    if names is None:
        names = []
     
    name=raw_input("Enter name: ").strip()
    names.append(name)
    return names   

In [2]:
print add_name()

Enter name: Anand
['Anand']


In [3]:
print add_name()

Enter name: Raju
['Raju']


In [4]:
names = ['Appu','Dhruv']
print add_name(names)

Enter name: Mehek
['Appu', 'Dhruv', 'Mehek']


#### 1.2.2. Use a sentinel object.

In [7]:
sentinel = object()
def add_name(names=sentinel):
    
    if names is sentinel:
        names = []
     
    name=raw_input("Enter name: ").strip()
    names.append(name)
    return names   

In [8]:
names = ['Appu','Dhruv']
print add_name(names)

Enter name: Shreya
['Appu', 'Dhruv', 'Shreya']
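
A sentinel earns its extra line mainly when `None` is itself a value the caller may legitimately pass. The sketch below assumes a hypothetical function `describe` and an illustrative module-level name `_MISSING`; neither is part of the notebook. The checks use `is`, since identity with one specific object is exactly what is being tested.

```python
_MISSING = object()   # unique object; no caller can pass it by accident

def describe(value=_MISSING):
    """Tell apart 'no argument given' from an explicit None."""
    if value is _MISSING:
        return "no argument"
    if value is None:
        return "explicit None"
    return "value: %r" % (value,)

print(describe())        # no argument
print(describe(None))    # explicit None
print(describe(42))      # value: 42
```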


#### 1.2.3. Use an inner function, whose definition (and hence its default argument) is re-evaluated on every call to the outer function.

In [11]:
def add_name():
    def inner(names=[]):     
        name=raw_input("Enter name: ").strip()
        names.append(name)
        return names
    return inner()

In [12]:
add_name()

Enter name: Anand


['Anand']

In [13]:
add_name()

Enter name: Appu


['Appu']

In [14]:
add_name()

Enter name: Dhruv


['Dhruv']

### 1.3. Valid uses of this behavior

#### 1.3.1. A caching memoizer pattern

In [49]:
def fibonacci(n, memo={}):
    """ Return n'th fibonacci number """
    
    # Uses an inline caching dictionary 
    # as a memoizing data structure
    if n in memo:
        print '*** memoized data ***'
        return memo[n]
    
    a, b, c = 0, 1, 1
    
    for i in range(n-1):
        c = a + b
        a, b = b, c
        
    memo[n] = c
    return c

In [50]:
fibonacci(1)

1

In [51]:
fibonacci(2); fibonacci(2)

*** memoized data ***


1

In [52]:
fibonacci(3); fibonacci(3)

*** memoized data ***


2
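
The same shared-default trick can also cache intermediate results when the function is written recursively. This is only a sketch: `fib` and `_cache` are hypothetical names, the cache is pre-seeded with the first two values, and very large `n` would hit Python's recursion limit.

```python
def fib(n, _cache={0: 0, 1: 1}):
    """Return the n'th Fibonacci number, caching every intermediate result."""
    if n not in _cache:
        # Recursive calls omit _cache, so they all share the default dict.
        _cache[n] = fib(n - 1) + fib(n - 2)
    return _cache[n]

print(fib(10))                     # 55
print(len(fib.__defaults__[0]))    # 11 entries: fib(0) .. fib(10)
```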

#### 1.4. More Reading

   1. https://stackoverflow.com/questions/1132941/least-astonishment-in-python-the-mutable-default-argument
   1. http://effbot.org/zone/default-values.htm

## 2. Mutable Argument Modification / Name Binding

### 2.1. Show me the Code!

In [81]:
def f(x, l):
    """ A function taking a list as argument """
        
    # This is a silly function, really.
    
    if len(l)<5:          # L1
        l = g(x)          # L2 
                          # L3
    l.append(x)           # L4
        
def g(x):
    """ A function """
    
    return [x*x]*5

In [82]:
nums = range(5)
f(10, nums)
print nums

[0, 1, 2, 3, 4, 10]


In [83]:
nums=range(3)
f(10, nums)
print nums        # 'nums' remains the same. Not surprised ? Good, Surprised - Not so good :)

[0, 1, 2]


## Gotcha !!!

#### This is because,

   1. The assignment in line #2 rebinds the local name __l__ to a __new__ list received from g(...). It doesn't 
    replace or modify the original list that _nums_ refers to, so the _append_ in line #4 goes to the new list.
   2. This is because in Python names merely _refer_ to objects. Assigning to a name inside a function makes that name refer to a different object; it never changes the object the caller's name refers to.
   3. In order to modify a mutable object in place, you need to call methods on it that modify it. In the case of a list, these are _append_, _extend_, _remove_, _pop_ etc.
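
The difference between rebinding a name and mutating an object can be sketched as follows. The functions `rebind` and `mutate` are hypothetical. Slice assignment is shown as one in-place alternative; it is not the fix this notebook uses.

```python
def rebind(items):
    """Rebinds the local name only; the caller's list is untouched."""
    items = [99] * 5
    items.append(10)

def mutate(items):
    """Replaces the contents of the caller's list in place."""
    items[:] = [99] * 5
    items.append(10)

nums = [0, 1, 2]
rebind(nums)
after_rebind = list(nums)

mutate(nums)
after_mutate = list(nums)

print(after_rebind)    # [0, 1, 2]
print(after_mutate)    # [99, 99, 99, 99, 99, 10]
```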

### 2.2. Show me the Fix !

In [84]:
def f(x, l):
    """ A function taking a list as argument """
    
    if len(l)<5:           # L1
        l = g(x)           # L2 
                           # L3
        
    l.append(x)            # L4
    
    # Return it
    return l
        
def g(x):
    """ A function """
    
    return ([x]*5)

In [85]:
nums=range(3)
nums = f(10, nums)
print nums  

[10, 10, 10, 10, 10, 10]


## 3. Immutable Variable Comparison

### 3.1. Show me the Code!

In [140]:
def greet(greeting, default_value="Hi"):
    """ Greet someone with a greeting """
    
    if greeting is not default_value:
        greeting = default_value + ", " + greeting
    
    print greeting

In [127]:
# Test 1
greet("Hi")

Hi


In [112]:
# Test 2
greet("Good Morning")

Hi, Good Morning


In [115]:
# Test 3
greet("Good Morning", "Hello there")              

Hello there, Good Morning


In [133]:
# Test 4
greet("Hello there", "Hello there")      # Fine

Hello there


In [141]:
# Test 5
greet("Hi, how do you do!", "Hi, how do you do!")

Hi, how do you do!


In [142]:
# Test 6
greeting="Hello there"
greet(greeting, default_value="Hello there")         # Hmmm, not what you expected ?

Hello there, Hello there


## Gotcha !!!

#### This is because,

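   1. You used __is not__, the identity comparison operator, instead of __!=__, the equality comparison operator. Identity asks whether two names refer to the very same object, not whether the objects have equal values.
   1. However, the code still works as expected in __Test 4__ and __Test 5__ above because CPython shares equal string literals that are compiled together. Since
    both arguments are passed as literal strings in the same statement and their value is the same, only one string object is created for both arguments, 
    so the _is_ comparison happens to work. This is an implementation detail, not something the language guarantees.
   1. In __Test 6__, the first argument is bound to the name _greeting_ in one statement and the literal string is passed in another. In an interactive session each statement is compiled 
    separately, so two distinct (but equal) string objects are created and the _is_ comparison fails.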
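
A small sketch of the difference between identity and equality, assuming CPython. The variable names and the helper `is_unset` are illustrative only. The second string is assembled at run time, so it is equal to the literal without being the same object.

```python
literal = "Hello there"
built = " ".join(["Hello", "there"])   # equal text, assembled at run time

equal = (literal == built)
identical = (literal is built)

print(equal)        # True
print(identical)    # False on CPython: two distinct string objects

def is_unset(value):
    """Identity is the right test for singletons such as None."""
    return value is None
```

In short, reserve `is` for singletons like `None` and use `==` or `!=` whenever values are being compared.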
  

### 3.2. Show me the Fix !

In [143]:
def greet(greeting, default_value="Hi"):
    """ Greet someone with a greeting """
    
    # Simple: Use == or != whenever you compare values
    if greeting != default_value:
        greeting = default_value + ", " + greeting
    
    print greeting

In [144]:
# Test 4
greet("Hello there", "Hello there")  

Hello there


In [145]:
# Test 6
greeting="Hello there"
greet(greeting, default_value="Hello there")

Hello there


## 4. Integer vs Float Division

#### In Python 2, dividing two integers with __/__ always produces an integer result, dropping the fractional part. Moreover, it __floors__ the result (rounds towards negative infinity), which can be a little confusing for negative numbers.

### 4.1. Show me the Code!

In [146]:
5/2  # Not 2.5, but 2, i.e. the fractional part is dropped

2

In [148]:
-5/2 # Prints -3, not -2, i.e. the answer is floored towards negative infinity

-3

### 4.2. Notes

This is well-known behaviour of Python 2. It is not exactly a Gotcha, but new programmers are often caught off-guard when
they encounter this behaviour for the first time. It does take a while to get used to it.
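
One way to get comfortable with flooring is to look at the quotient and remainder together. The sketch below uses the explicit floor division operator `//`, which behaves the same on Python 2 and Python 3; `floor_div_parts` is a hypothetical helper.

```python
def floor_div_parts(a, b):
    """Return (quotient, remainder) using floor division."""
    q = a // b
    r = a % b
    # The remainder takes the sign of the divisor, so this always holds.
    assert q * b + r == a
    return q, r

print(floor_div_parts(5, 2))     # (2, 1)
print(floor_div_parts(-5, 2))    # (-3, 1)
print(divmod(-5, 2))             # (-3, 1)
```

Flooring is what keeps the remainder non-negative for a positive divisor: -5 is -3 * 2 + 1.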

### 4.3. Workarounds

#### 4.3.1 Workaround #1 - Specifically use float division

In [156]:
# Just remember to convert one of the numbers to float, typically multiplying by 1.0.
# This is what I do.

x=5
y=1.0*x/2
print y

2.5


In [157]:
x=5
y=x/2.0
print y

2.5


#### 4.3.2 Workaround #2 - Backported from future

In [152]:
from __future__ import division

print "True division =>", 5 / 2
print "Floor division =>", 5 // 2

True division => 2.5
Floor division => 2


For Python 2.x, import __division__ from the future, i.e. enable a behaviour that is the default in Python 3.x. With it,
__/__ performs true division, while __//__ (which is available even without the import) performs floor division.

#### 4.3.3 Workaround #3 - Use decimal module

In [189]:
import decimal

x=decimal.Decimal(5)
y=decimal.Decimal(2)
z=x/y
print z

2.5


__NOTE__ - The above is overkill for such a simple example. __Decimal__ types are more useful when you need exact decimal arithmetic instead of binary floating point. We will see another example below.

#### 4.4. More Reading

   1. http://python-history.blogspot.in/2010/08/why-pythons-integer-division-floors.html
   1. https://stackoverflow.com/questions/183853/in-python-what-is-the-difference-between-and-when-used-for-division

## 5. Floating Point Precision & Round-Off

#### Floating point numbers are stored as binary fractions, so most decimal fractions can only be approximated by the nearest representable value. In Python, this can sometimes cause unexpected results. These are not a bug in the language or your code, but simply a consequence of the way programming languages represent floating point numbers and display them.

### 5.1. Precision

#### 5.1.1. Show me the Code!

In [162]:
x=0.1
y=0.2
z=x+y

print z # All good

0.3


In [164]:
# However,
z

0.30000000000000004

#### 5.1.2. Notes

##### What is happening here ? 

When you print the variable z, print uses _str_, which in Python 2 rounds the number to 12 significant digits. However when you
inspect the number without printing it, the interpreter uses _repr_, which shows enough digits to reveal the value that is actually stored.
Technically the difference between 0.3 and the stored value is called a __Representation Error__ .
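
Because of this, comparing floats with `==` is fragile. A minimal sketch that runs on Python 2.7 and Python 3; the helper `nearly_equal` and its tolerance of `1e-9` are assumptions chosen for illustration, not a recommendation for every use case.

```python
z = 0.1 + 0.2

print("%.17g" % z)    # 0.30000000000000004
print(z == 0.3)       # False

def nearly_equal(a, b, tol=1e-9):
    """Compare two floats with an absolute tolerance (tol is an assumed value)."""
    return abs(a - b) <= tol

print(nearly_equal(z, 0.3))    # True
```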

### 5.2 Round-off

#### 5.2.1. Show me the Code!

In [177]:
x = 0.325
print round(x, 2) # Good

0.33


In [178]:
x = 0.365
print round(x, 2) # What the ...!!!

0.36


#### 5.2.2. Notes

##### What is happening here ? 

Since the decimal fraction 0.365 is exactly half-way between 0.36 and 0.37, you would expect it to round up. However, it cannot be
represented exactly as a binary fraction, and the nearest representable value may be slightly closer to 0.36 than to 0.37. But how do you find the exact value of a float in Python ?

In [183]:
x=0.365
x  # Doesn't help!

0.365

In [205]:
# Solution - Use decimal module

import decimal
decimal.Decimal(0.365)

Decimal('0.3649999999999999911182158029987476766109466552734375')

In [204]:
decimal.Decimal(0.325) # Now you understand why 0.325 nicely rounds to 0.33 

Decimal('0.325000000000000011102230246251565404236316680908203125')

As you can see, 0.365 is internally represented by 0.3649999999999999911182158029987476766109466552734375 which is closer to 0.36 when rounded off to 2 decimal places. Which is why round(0.365, 2) produces 0.36.
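
The same check can be wrapped in a small helper that reports whether the stored float lies below or above the decimal value you typed. This is a sketch: `stored_vs_written` is a hypothetical name, and it relies on `Decimal` accepting a float, which needs Python 2.7 or later.

```python
from decimal import Decimal

def stored_vs_written(text):
    """Compare the float parsed from text with the exact decimal text."""
    stored = Decimal(float(text))    # exact value of the binary float
    written = Decimal(text)          # exact value of the decimal literal
    if stored < written:
        return "below"
    if stored > written:
        return "above"
    return "exact"

print(stored_vs_written("0.365"))    # below, so round(x, 2) gives 0.36
print(stored_vs_written("0.325"))    # above, so round(x, 2) gives 0.33
print(stored_vs_written("0.5"))      # exact
```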

#### 5.2.3. Workarounds

##### 5.2.3.1 Use ceil for rounding up

In [199]:
import math

x=0.365

# Multiply and divide by the power of 10 equal to the precision required
# Note: ceil always rounds up, so 0.361 would also become 0.37
math.ceil(pow(10,2)*x)/pow(10,2)

0.37

##### 5.2.3.2 Use decimal module

In [201]:
from decimal import Decimal, ROUND_UP

# ROUND_UP rounds away from zero whenever any digits are discarded
x=Decimal(0.365).quantize(Decimal('0.01'), rounding=ROUND_UP)
print x

0.37
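
If what you actually want is conventional "round half up" rather than "always round up", one option is to build the Decimal from a string and use `ROUND_HALF_UP`. This is a sketch with a hypothetical helper `round_half_up`, not the approach used in the cell above.

```python
from decimal import Decimal, ROUND_HALF_UP, ROUND_UP

def round_half_up(text, places="0.01"):
    """Round a decimal string half-up to the precision of `places`."""
    return Decimal(text).quantize(Decimal(places), rounding=ROUND_HALF_UP)

print(round_half_up("0.365"))    # 0.37
print(round_half_up("0.361"))    # 0.36

# ROUND_UP, by contrast, rounds every non-zero remainder away from zero.
always_up = Decimal("0.361").quantize(Decimal("0.01"), rounding=ROUND_UP)
print(always_up)                 # 0.37
```

Constructing the Decimal from the string "0.365" avoids inheriting the float's representation error in the first place.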


#### 5.3. More Reading

   1. https://docs.python.org/2/tutorial/floatingpoint.html
   1. https://stackoverflow.com/questions/4518641/how-to-round-off-a-floating-number-in-python

## 6. Modifying Mutables inside Immutables

#### When you have mutables (lists, dictionaries) as elements inside immutables (tuples here) you can have some unexpected results when trying to modify the former.

### 6.1. Show me the Code!

In [236]:
def make_shipment(container, items, index=0):
    """ Modify objects to be shipped in 'container' by adding
    objects from 'items' into it at index 'index' """
    
    # container is a tuple containing lists
    container[index] += items

In [237]:
# Real-life example - container of items to be exported
container = (['apples','mangoes','oranges'], ['silk','cotton','wool'])
make_shipment(container, ['papayas'])

TypeError: 'tuple' object does not support item assignment

#### However, 

In [239]:
print container # But container is modified as well!

(['apples', 'mangoes', 'oranges', 'papayas'], ['silk', 'cotton', 'wool'])


## Gotcha !!!

#### This is because,
 
   1. For mutable types in Python,

          >>> x += [y]

      is not exactly the same as,

          >>> x = x + [y]

   2. In the first one, the list that __x__ refers to is extended in place and the result is then assigned back to __x__. In the second case, a new object is created and assigned to __x__ .
   3. Hence when,

          container[index] += items

      is performed, the referenced list is first changed in place, and only then does the assignment back into the tuple fail with the _TypeError_. By the time the exception 
      occurs, the list has already been modified.
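
The two steps hidden inside the augmented assignment can be sketched as follows. The variable names are illustrative, and the code runs on both Python 2.7 and Python 3.

```python
container = (["apples"], ["silk"])
fruits = container[0]

# container[0] += [...] is roughly two steps:
#   1. result = container[0].__iadd__([...])   -> extends the list in place
#   2. container[0] = result                   -> fails, tuples are immutable
try:
    container[0] += ["papayas"]
    raised = False
except TypeError:
    raised = True

print(raised)    # True
print(fruits)    # ['apples', 'papayas']

# By contrast, old + [2] builds a new list and leaves the old one alone.
old = [1]
new = old + [2]
print(old)       # [1]
print(new)       # [1, 2]
```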

### 6.2. Show me the Fix!

In [241]:
def make_shipment(container, items, index=0):
    """ Modify objects to be shipped in 'container' by adding
    objects from 'items' into it at index 'index' """
    
    # container is a tuple containing lists
    # Use .extend(...)
    container[index].extend(items)

In [243]:
# Real-life example - container of items to be exported
container = (['apples','mangoes','oranges'], ['silk','cotton','wool'])
make_shipment(container, ['papayas'])
print container

(['apples', 'mangoes', 'oranges', 'papayas'], ['silk', 'cotton', 'wool'])


In [245]:
def make_shipment(container, items, index=0):
    """ Modify objects to be shipped in 'container' by adding
    objects from 'items' into it at index 'index' """
    
    # container is a tuple containing lists
    # Or retrieve the item at index to a variable
    item = container[index]
    # Then add to it.
    item += items

In [246]:
# Real-life example - container of items to be exported
container = (['apples','mangoes','oranges'], ['silk','cotton','wool'])
make_shipment(container, ['papayas'])
print container

(['apples', 'mangoes', 'oranges', 'papayas'], ['silk', 'cotton', 'wool'])


#### 6.3. More Reading

   1. http://web.archive.org/web/20031203024741/http://zephyrfalcon.org/labs/python_pitfalls.html

## 7. Boolean Type Fallacy

#### Python 2 doesn't respect its own boolean types. The two boolean values __True__ and __False__ are ordinary built-in names rather than keywords, so they can be quite flexible if you choose them to be. A developer can (often accidentally) overwrite them, causing all kinds of problems and, in this case, a bit of fun :)

### 7.1. Show me the Fun!

#### This show is named __"The Blind Truthness of Falsehood"__

In [230]:

print True
print False

x='blind'
True=x

## Fun
print 'Love is',True
print 'Hate is',not x

# Now watch the fun!

# Python allows you to overwrite its default boolean types.
False=True # Yes you can do this in #Python.

print 'Love is',x # What do you expect to get printed ?
print 'Love is',True,'as well as',False


print 'Hate is',False # What do you expect to get printed ?
print 'Hate is',False,'as well as',False

print 

# REAL-LIFE, NEAR-DEATH EXAMPLE 

# Point-blank situation
no_bullet_in_gun = False

if no_bullet_in_gun:
    print "GO AHEAD, SHOOT ME IN THE HEAD !" # Goes ahead... your life ends here.
    True='dead'
else:
    print "NO PLEASE... I BEG YOU TO SPARE ME...!"
    True='alive'
    
print 'I am',True

True
False
Love is blind
Hate is False
Love is blind
Love is blind as well as blind
Hate is blind
Hate is blind as well as blind

GO AHEAD, SHOOT ME IN THE HEAD !
I am dead


In [229]:
# Reset our world to sanity

True, False=bool(1), bool(0)

no_bullet_in_gun=False

if no_bullet_in_gun:
    print "GO AHEAD, SHOOT ME IN THE HEAD !" 
    x='dead'
else:
    print "NO PLEASE... I BEG YOU TO SPARE ME...!"  # Spares you, you live to write more code in Python, but
    x='alive'                                       # hopefully not like the one above.
    
print 'I am',x

NO PLEASE... I BEG YOU TO SPARE ME...!
I am alive


### 7.2. Show me the Fix !

#### You are kidding right ?

Word of advice - Don't overwrite your boolean types even though Python 2 allows it. It is harmful to health.
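
For what it is worth, Python 3 closes this loophole by making `True` and `False` keywords. The sketch below checks which behaviour the running interpreter has without actually rebinding anything; `can_rebind_true` is a hypothetical helper.

```python
import sys

def can_rebind_true():
    """Report whether this interpreter accepts an assignment to True."""
    try:
        # compile() only parses the statement; it is never executed here.
        compile("True = 'blind'", "<demo>", "exec")
    except SyntaxError:
        return False
    return True

print(sys.version_info[0])
print(can_rebind_true())    # True on Python 2, False on Python 3
```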
