#Tuples and immutable versus mutable objects

Tuples are similar to lists but are immutable.  That is, once they are created, they cannot be changed.  Tuples are created using parentheses instead of brackets:

In [1]:
>>> t = (1, 2, 3)
>>> t[1] = 0

TypeError: 'tuple' object does not support item assignment

Like lists, tuples can contain any object, including other tuples and lists:

In [2]:
>>> t = (0., 1, 'two', [3, 4], (5,6) )

The advantage of tuples is that they are generally more lightweight than lists (somewhat faster to create and smaller in memory), and Python often uses them behind the scenes to pass data and function arguments efficiently.  In fact, one can write a comma-separated list without any enclosing characters and Python will, by default, interpret it as a tuple:

In [3]:
>>> 1, 2, 3

(1, 2, 3)

In [4]:
>>> "hello", 5., [1, 2, 3]

('hello', 5.0, [1, 2, 3])
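A few corner cases follow from the comma rule.  The short sketch below is only an illustration; the variable names are made up for this example, and print is written with parentheses so that it runs under both Python 2 and Python 3.  It shows that a one-element tuple needs a trailing comma, and that a tuple's immutability does not extend to mutable objects stored inside it.

```python
# A one-element tuple needs a trailing comma; parentheses alone only group.
single = (5,)
not_a_tuple = (5)

# An empty tuple is written with empty parentheses.
empty = ()

# A tuple is immutable, but a mutable object stored inside it can still change.
t = (0., 1, 'two', [3, 4], (5, 6))
t[3].append(5)          # modifies the list held by the tuple, not the tuple itself

print(type(single).__name__)       # tuple
print(type(not_a_tuple).__name__)  # int
print(t[3])                        # [3, 4, 5]
```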

Tuples aren't the only immutable objects in Python.  Strings are also immutable:

In [5]:
>>> s = "There are 5 cars."
>>> s[10] = "6"

TypeError: 'str' object does not support item assignment

Because the string itself cannot be changed, we instead build a new string from slices of the old one and point s to the result:

In [6]:
>>> s = s[:10] + "6" + s[11:]
>>> s

'There are 6 cars.'

Floats, integers, and complex numbers are also immutable; however, this is not obvious to the programmer.  For these types, immutable means that producing a new numeric value never modifies the memory used for an existing value.  The result is a separate object in memory, and the variable name is simply pointed at it.
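One way to see this for numbers is to keep a second name for the original value and then update the first name.  This is a minimal sketch with illustrative names x and y, not something from the original text:

```python
x = 1000
y = x            # y is bound to the same integer object as x
x += 1           # builds a new integer object and rebinds x to it

print(x)         # 1001
print(y)         # 1000 (the original object was never modified)
print(x is y)    # False
```

If integers were mutable, y would have changed along with x.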

#Assignment and name binding

Python treats variable assignment slightly differently than what you might expect from other programming languages, in which variables must be declared beforehand so that a corresponding spot in memory is available to manipulate.  Consider the assignment:

>>> a = 1

In other programming languages, this statement might be read as "put the value 1 in the spot in memory corresponding to the variable a."  In Python, however, this statement says something quite different: "create a spot in memory for an integer value 1, and then point the variable a to it."  This behavior is called name binding in Python.  It means that most variables act like little roadmaps to spots in memory, rather than designating specific spots themselves.

Consider the following:

In [7]:
>>> a = [1, 2, 3]
>>> b = a
>>> a[1] = 0
>>> a

[1, 0, 3]

In [8]:
>>> b

[1, 0, 3]

In the second line, Python bound the variable b to the same spot in memory as the variable a.  Notice that it did not copy the contents of a, and thus any modifications to a subsequently affect b also.  This can sometimes be a convenience and speed execution of a program.

If an explicit copy of an object is needed, one can use the copy module:

In [9]:
>>> import copy
>>> a = [1, 2, 3]
>>> b = copy.copy(a)
>>> a[1] = 0
>>> a

[1, 0, 3]

In [10]:
>>> b

[1, 2, 3]

Here, the copy.copy function makes a new location in memory and copies the contents of a to it, and then b is pointed to it.  Since a and b now point to separate locations in memory, modifications to one do not affect the other.  

Actually, the copy.copy function only copies the outermost structure of a list (a so-called shallow copy).  If a list contains another list, or objects with deeper levels of variables, the copy.deepcopy function must be used to make a full copy:

In [11]:
>>> import copy
>>> a = [1, 2, [3, 4]]
>>> b = copy.copy(a)
>>> c = copy.deepcopy(a)
>>> a[2][1] = 5
>>> a

[1, 2, [3, 5]]

In [13]:
>>> b

[1, 2, [3, 5]]

In [14]:
>>> c

[1, 2, [3, 4]]
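The difference between the two kinds of copy can be checked directly by asking whether the inner lists are the same object.  The sketch below reuses the names a, b, and c from the example above; the names flat, d, and e are additions for illustration.  It also shows that, for a flat list, a full slice or the list function gives a shallow copy without the copy module.

```python
import copy

a = [1, 2, [3, 4]]
b = copy.copy(a)        # new outer list, same inner list
c = copy.deepcopy(a)    # new outer list and new inner list

print(b is a)           # False
print(b[2] is a[2])     # True
print(c[2] is a[2])     # False

# For a flat list, a full slice or list() also makes a shallow copy.
flat = [1, 2, 3]
d = flat[:]
e = list(flat)
print(d == flat and d is not flat)   # True
print(e == flat and e is not flat)   # True
```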

The copy module should be used with great caution, which is why it is a separate module and not part of the standard command set.  The vast majority of Python programs do not need it if one programs in a Pythonic style, that is, if one uses Python idioms and ways of doing things.  If you find yourself using the copy module frequently, chances are that your code could be rewritten to read and operate much more cleanly.

The following example may now puzzle you:

In [15]:
>>> a = 1
>>> b = a
>>> a = 2
>>> a

2

In [16]:
>>> b

1

Why did b not also change?  The reason has to do with immutable objects.  Recall that numeric values are immutable, meaning they cannot be changed once in memory.  In the second line, b points to the location in memory where the value "1" was created in the first line.  In the third line, a new value "2" is created in memory and a is pointed to it; the old value "1" is not modified at all because it is immutable.  As a result, a and b then point to different parts of memory.  In the previous example using a list, the list was actually modified in memory because it is mutable.

Similarly, consider the following example:

In [17]:
>>> a = 1
>>> b = a
>>> a = []
>>> a.append(1)
>>> a

[1]

In [18]:
>>> b

1

Here in the third line, a is assigned to point at a new empty list that is created in memory.

The general rules of thumb for assignments in Python are the following:

    •Assignment using the equals sign ("=") means point the variable name on the left hand side to the location in memory on the right hand side. 

    •If the right hand side is a variable, point the left hand side to the same location in memory that the right hand side points to.  If the right hand side is a new object or value, create a new spot in memory for it and point the left hand side to it.

    •Modifications to a mutable object will affect the corresponding location in memory and hence any variable pointing to it.  Immutable objects cannot be modified and usually involve the creation of new spots in memory.
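The third rule can be seen by applying the same operation, +=, to a mutable list and to an immutable tuple.  This is a small sketch with illustrative names: for a list, += modifies the existing object in place, whereas for a tuple it builds a new object and rebinds the name.

```python
# Mutable case: += on a list changes the existing object in place.
a = [1, 2]
b = a
a += [3]
print(b)         # [1, 2, 3]
print(a is b)    # True

# Immutable case: += on a tuple builds a new tuple and rebinds the name.
t = (1, 2)
u = t
t += (3,)
print(u)         # (1, 2)
print(t)         # (1, 2, 3)
print(t is u)    # False
```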

It is possible to determine if two variable names in Python are pointing to the same value or object in memory using the is operator:

In [19]:
>>> a = [1, 2, 3]
>>> b = a
>>> a is b

True

In [20]:
>>> b = [1, 2, 3]
>>> a is b

False

In the next-to-last line, a new spot in memory is created for a new list and b is assigned to it.  This spot is distinct from the area in memory to which a points, and thus the is operator returns False when a and b are compared, even though their data are identical.


One might wonder if Python is memory-intensive given the frequency with which it must create new spots in memory for new objects and values.  Fortunately, Python handles memory management quite transparently and intelligently.  In particular, it uses a technique called garbage collection.  In the standard implementation this is done mainly by reference counting: for every spot in memory that Python creates for a value or object, it keeps track of how many variable names (and other references) are pointing at it.  When nothing points to a given spot any longer, Python automatically deletes the value or object in memory, freeing its memory for later use.  Consider this example:


In [21]:
>>> a = [1, 2, 3, 4]  #a points to list 1
>>> b = [2, 3, 4, 5]  #b points to list 2
>>> c = a             #c points to list 1
>>> a = b             #a points to list 2
>>> c = b[1]          #c points to 3; list 1 deleted in memory

In the last line, there are no longer any variables that point to the first list and so Python automatically deletes it from memory.  One can explicitly delete a variable using the del statement:

In [22]:
>>> a = [1, 2, 3, 4]
>>> del a

This will delete the variable name a.  In general, however, it does not delete the object to which a points unless a is the only variable pointing to it and Python's garbage-collecting routines kick in.  Consider:

In [23]:
>>> a = [1, 2, 3, 4]
>>> b = a
>>> del a
>>> b

[1, 2, 3, 4]
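The bookkeeping described above can be observed with sys.getrefcount.  Treat the following as a sketch that assumes the standard CPython interpreter: the absolute counts include temporary references made by the call itself, so only the differences between counts are meaningful.

```python
import sys

a = [1, 2, 3, 4]
count_one_name = sys.getrefcount(a)

b = a                                  # a second name for the same list
count_two_names = sys.getrefcount(a)

del b                                  # removes the name b, not the list
count_after_del = sys.getrefcount(a)

print(count_two_names - count_one_name)   # 1 on CPython
print(count_after_del - count_one_name)   # 0 on CPython
print(a)                                  # [1, 2, 3, 4]
```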

#Multiple assignment

Lists and tuples enable multiple items to be assigned at the same time.  Consider the following example using lists:

In [24]:
>>> [a, b, c] = [1, 5, 9]
>>> a

1

In [25]:
>>> b

5

In [26]:
>>> c

9

In this example, Python assigned variables by lining up elements in the lists on each side.  The lists must be the same length, or a ValueError will be raised.
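The sketch below illustrates what happens when the two sides do not line up, and shows that nested structures can be unpacked when their shapes match.  The names failed, x, y, and z are illustrative only.

```python
failed = False
try:
    a, b, c = [1, 5]            # three names, only two values
except ValueError as err:
    failed = True
    print("unpacking failed: %s" % err)

# Nested structures can be unpacked as long as the shapes line up.
(x, y), z = (1, 2), 3
print(x + y + z)                # 6
```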

Tuples are more efficient for this purpose and are usually used instead of lists for multiple assignments:

In [27]:
>>> (a, b, c) = (5, "hello", [1, 2])
>>> a

5

In [28]:
>>> b

'hello'

In [29]:
>>> c

[1, 2]

However, since Python will interpret any non-enclosed list of values separated by commas as a tuple, it is more common to see the following, equivalent statement:

In [30]:
>>> a, b, c = 5, "hello", [1, 2]

Here, each side of the equals sign is interpreted as a tuple and the assignment proceeds as before.  This notation is particularly helpful for functions that return multiple values, which we will discuss in greater detail later.  A function that returns two values can have its result assigned to two names at once.

Technically, such a function returns one thing: a tuple containing two values.  However, the multiple assignment notation allows us to treat it as two sequential values.  Alternatively, one could assign the result to a single name, for example returned.  In this case, returned would be a tuple containing two values.
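As a concrete illustration, here is a minimal sketch using a hypothetical function, min_and_max, which is not part of the original text or of any library.  It returns two values separated by a comma, which Python packs into a single tuple.

```python
def min_and_max(values):
    """Hypothetical helper: return the smallest and largest item of a list."""
    return min(values), max(values)   # the comma builds a tuple

# Multiple assignment unpacks the returned tuple into two names.
lo, hi = min_and_max([4, 1, 9, 7])
print(lo)          # 1
print(hi)          # 9

# Without unpacking, the single returned object is the tuple itself.
returned = min_and_max([4, 1, 9, 7])
print(returned)    # (1, 9)
```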

Because of multiple assignment, list comprehensions can also iterate over multiple values:

In [35]:
>>> l = [(1,2), (3,4), (5,6)]
>>> [a+b for (a,b) in l]

[3, 7, 11]

In this example, each item in l is assigned, in sequence, to the tuple (a,b).  Since l contains tuples, this amounts to assigning a and b to individual tuple members.  We could have done this equivalently in the following, less elegant way:

In [36]:
>>> [t[0] + t[1] for t in l]

[3, 7, 11]

Here, t is assigned to the tuple and we access its elements using bracket indexing.  A final alternative would have been:

In [37]:
>>> [sum(t) for t in l]

[3, 7, 11]

A common use of multiple assignment is to swap variable values:

In [39]:
>>> a = 1
>>> b = 5
>>> a, b = b, a
>>> a

5

In [40]:
>>> b

1

#String functions and manipulation

Python's string processing functions make it enormously powerful and easy to use for processing string and text data, particularly when combined with the utility of lists.  Every string in Python (like every other variable) is an object.  String functions are member functions of these objects, accessed using dot notation.  

Keep in mind two very important points with these functions: (1) strings are immutable, so functions that modify strings actually return new strings that are modified versions of the originals; and (2) all string functions are case sensitive so that 'this' is recognized as a different string than 'This'.
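Both points can be checked in a few lines.  This is a small sketch with illustrative names s and t:

```python
s = "This"
t = s.lower()            # returns a new string; s is untouched

print(s)                 # This
print(t)                 # this
print(s == t)            # False, because comparison is case sensitive
print(s.lower() == t)    # True, after normalizing the case

# To keep the modified version, rebind the name to the returned string.
s = s.upper()
print(s)                 # THIS
```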

Strings can be sliced just like lists.  This makes it easy to extract substrings:

In [41]:
>>> s = "This is a string"
>>> s[:4]

'This'

In [43]:
>>> "This is a string"[-6:]

'string'

Strings can also be split apart into lists.  The split function will automatically split strings wherever it finds whitespace (e.g., a space or a line break):

In [44]:
>>> "This is a string.\nHello.".split()

['This', 'is', 'a', 'string.', 'Hello.']

Alternatively, one can split a string wherever a particular substring is encountered:

In [45]:
>>> "This is a string.".split('is')

['Th', ' ', ' a string.']

The opposite of the split function is the join function, which takes a list of strings and joins them together with a common separation string.  This function is actually called as a member function of the separation string, not of the list to be joined:

In [46]:
>>> l = ['This', 'is', 'a', 'string.', 'Hello.']
>>> " ".join(l)

'This is a string. Hello.'

In [47]:
>>> ", ".join(["blue", "red", "orange"])

'blue, red, orange'

The join function can be used with a zero-length string:

In [48]:
>>> "".join(["house", "boat"])

'houseboat'
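Used together, split and join give a compact way to clean up irregular whitespace.  The sketch below is one possible idiom, with made-up sample strings:

```python
messy = "  This   is\ta \n string.  "

words = messy.split()        # no argument: split on any run of whitespace
clean = " ".join(words)      # rebuild with single spaces

print(words)                 # ['This', 'is', 'a', 'string.']
print(clean)                 # This is a string.

# Splitting on an explicit separator and joining with the same separator
# reproduces the original string.
csv_line = "blue,red,orange"
print(",".join(csv_line.split(",")) == csv_line)   # True
```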

To remove extra beginning and ending whitespace, use the strip function:

In [49]:
>>> "    string   ".strip()

'string'

In [50]:
>>> "string\n\n  ".strip()

'string'

The replace function will make a new string in which all specified substrings have been replaced:

In [53]:
>>> "We code in Python.  We like it.".replace("We", "You")

'You code in Python.  You like it.'

It is possible to test if a substring is present in a string and to get the index of the first character in the string where the substring starts:

In [54]:
>>> s = "This is a string."
>>> "is" in s

True

In [55]:
>>> s.index("is")

2

In [56]:
>>> s.index("not")

ValueError: substring not found
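Since index raises a ValueError when the substring is absent, code that cannot be sure the substring exists has a few options.  The sketch below shows three of them, including the find function, which returns -1 instead of raising an error; the variable position is illustrative.

```python
s = "This is a string."

# index raises ValueError when the substring is missing.
try:
    position = s.index("not")
except ValueError:
    position = None
print(position)          # None

# find reports a missing substring by returning -1 instead.
print(s.find("not"))     # -1
print(s.find("is"))      # 2

# Testing with "in" first avoids the exception entirely.
if "string" in s:
    print(s.index("string"))   # 10
```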

Sometimes you need to left- or right-justify strings within a certain field width, padding them with extra spaces as necessary.  There are two functions for doing that:

In [58]:
>>> s = "apple".ljust(10) + "orange".rjust(10) + "\n"  \
...     + "grape".ljust(10) + "pear".rjust(10)
>>> print s

apple         orange
grape           pear


There are a number of functions for manipulating capitalization:

In [59]:
>>> s = "this is a String."
>>> s.lower()

'this is a string.'

In [60]:
>>> s.upper()

'THIS IS A STRING.'

In [61]:
>>> s.capitalize()

'This is a string.'

In [62]:
>>> s.title()

'This Is A String.'

Finally, there are a number of very helpful utilities for testing strings.  One can determine if a string starts or ends with specified substrings:

In [63]:
>>> s = "this is a string."
>>> s.startswith("th")

True

In [64]:
>>> s.startswith("T")

False

In [65]:
>>> s.endswith(".")

True

You can also test the kind of contents in a string.  To see if it contains only alphabetical characters:

In [66]:
>>> "string".isalpha()

True

In [67]:
>>> "string.".isalpha()

False

Similarly, you can test for all numerical characters:

In [69]:
>>> "12834".isdigit()

True

In [70]:
>>> "50 cars".isdigit()

False
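These tests become more useful when combined with split.  As a sketch, the following pulls the whole numbers out of a sentence; the sample sentence is made up, and strip is called with an argument so that it removes punctuation rather than whitespace.

```python
s = "There are 5 cars and 12 bikes."

# strip(".,") removes punctuation from the ends of each word before testing.
words = [w.strip(".,") for w in s.split()]
numbers = [int(w) for w in words if w.isdigit()]

print(numbers)        # [5, 12]
print(sum(numbers))   # 17
```

Note that isdigit only recognizes unsigned whole numbers; strings such as "-3" or "2.5" would be skipped by this approach.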

#Dictionaries

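Dictionaries are another type in Python that, like lists, are collections of objects.  Unlike lists, dictionaries have no ordering by position (since Python 3.7 they do remember the order in which items were inserted, but items are still looked up by key, not by index).  Instead, they associate keys with values, much like a lookup table or a simple database.  To create a dictionary, we use braces.  The following example creates a dictionary with three items: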

In [71]:
>>> d = {"city":"Santa Barbara", "state":"CA", "zip":"93106"}

Here, each element of a dictionary consists of two parts that are entered in key:value syntax.  The keys are like labels that will return the associated value.  Values can be obtained by using bracket notation:

In [72]:
>>> d["city"]

'Santa Barbara'

In [73]:
>>> d["zip"]

'93106'

In [74]:
>>> d["street"]

KeyError: 'street'

Notice that a nonexistent key will raise a KeyError.

Dictionary keys do not have to be strings.  They can be any immutable (more precisely, hashable) object in Python, such as integers, floats, strings, or tuples whose members are themselves immutable.  Dictionaries can contain a mixture of these.  Values are not restricted at all; they can be any object in Python: numbers, lists, modules, functions, anything.

In [75]:
>>> d = {"one" : 80.0,  2 : [0, 1, 1],  3 : (-20,-30),  (4, 5) : 60}
>>> d[(4,5)]

60

In [76]:
>>> d[2]

[0, 1, 1]
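The restriction on keys is easy to test.  In this sketch, the dictionary grid and its contents are invented for illustration: a tuple works as a key, a list is rejected with a TypeError, and converting the list to a tuple makes it usable.

```python
# Tuples of immutable items work as keys, e.g. grid coordinates.
grid = {(0, 0): "origin", (1, 0): "east"}
print(grid[(1, 0)])      # east

# A list is mutable, so using one as a key raises TypeError.
rejected = False
try:
    grid[[0, 1]] = "north"
except TypeError as err:
    rejected = True
    print("rejected key: %s" % err)

# Converting the list to a tuple gives a usable key.
grid[tuple([0, 1])] = "north"
print(len(grid))         # 3
```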

The following example creates an empty dictionary:

In [77]:
>>> d = {}

Items can be added to dictionaries using assignment and a new key.  If the key already exists, its value is replaced:

In [78]:
>>> d = {"city":"Santa Barbara", "state":"CA"}
>>> d["city"] = "Goleta"
>>> d["street"] = "Calle Real"
>>> d

{'city': 'Goleta', 'state': 'CA', 'street': 'Calle Real'}

To delete an element from a dictionary, use the del statement:

In [79]:
>>> del d["street"]

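There are two ways to test if a key is in a dictionary.  The in operator is preferred; the has_key function exists only in Python 2 and was removed in Python 3: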

In [80]:
>>> d = {"city":"Santa Barbara", "state":"CA"}
>>> "city" in d

True

In [81]:
>>> d.has_key("zip")

False

The size of a dictionary is given by the len function:

In [83]:
>>> len(d)

2

To remove all elements from a dictionary, use its clear function:

In [84]:
>>> d = {"city":"Santa Barbara", "state":"CA"}
>>> d.clear()
>>> d

{}

One can obtain lists of all keys and values (in no particular order):

In [85]:
>>> d = {"city":"Santa Barbara", "state":"CA"}
>>> d.keys()

['city', 'state']

In [86]:
>>> d.values()

['Santa Barbara', 'CA']

Alternatively, one can get a list of (key,value) tuples for the entire dictionary:

In [87]:
>>> d.items()

[('city', 'Santa Barbara'), ('state', 'CA')]

Similarly, it is possible to create a dictionary from a list of two-tuples:

In [88]:
>>> l = [("street", "Calle Real"), ("school", "UCSB")]
>>> dict(l)

{'school': 'UCSB', 'street': 'Calle Real'}

Finally, dictionaries provide a method to return a default value if a given key is not present:

In [89]:
>>> d = {"city":"Santa Barbara", "state":"CA"}
>>> d.get("city", "Goleta")

'Santa Barbara'

In [90]:
>>> d.get("zip", 93106)

93106
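A common use of get is to update a value that may not exist yet.  The sketch below counts words in a made-up sentence; it uses a for loop, which is covered later in this tutorial.

```python
text = "the cat and the hat and the bat"

counts = {}
for word in text.split():
    # get returns 0 the first time a word is seen, so no KeyError occurs
    counts[word] = counts.get(word, 0) + 1

print(counts["the"])             # 3
print(counts["and"])             # 2
print(counts.get("dog", 0))      # 0
```

Note that get does not add the missing key to the dictionary; it only supplies the default for that one lookup.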

#If statements

if statements allow conditional execution.  Here is an example:

In [91]:
>>> x = 2
>>> if x > 3:
...   print "greater than three"
... elif x > 0:
...   print "greater than zero"
... else:
...   print "less than or equal to zero"

greater than zero


Notice that the first test begins with if, the second with elif (meaning 'else if'), and the third with else.  Each of these is followed by a colon, and the corresponding commands to execute come after the colon and are indented.  For if statements, both elif and else are optional.

A very important concept in Python is that spacing and indentations carry syntactical meaning.  That is, they dictate how to execute statements.  Colons occur whenever there is a set of sub-commands after an if statement, loop, or function definition.  All of the commands that are meant to be grouped together after the colon must be indented by the same amount.  Python does not specify how much to indent, but only requires that the commands be indented in the same way.  Consider:

In [92]:
>>> if 1 < 3:
...     print "line one"
...       print "line two"

IndentationError: unexpected indent (<ipython-input-92-5a6aa3917094>, line 3)

In [93]:
>>> if 1 < 3:
...       print "line one"
...       print "line two"

line one
line two


It is typical to indent four spaces after each colon.  Ultimately Python's use of syntactical whitespace helps make its programs look cleaner and more standardized.

Any expression or function returning a Boolean True or False value can be used in an if statement.  The number 0 is also interpreted as False, while any other number is considered True.  Empty lists, tuples, strings, and dictionaries are interpreted as False, whereas non-empty ones are True.

In [94]:
>>> d = {}
>>> if d:
...     print "Dictionary is not empty."
... else:
...     print "Dictionary is empty."

Dictionary is empty.
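The bool function applies the same test that an if statement uses, so it offers a quick way to check how a given value will be interpreted.  This is a small sketch over an arbitrary list of sample values:

```python
values = [0, 1, -2.5, "", "text", [], [0], (), {}, {"a": 1}, None]

for v in values:
    # bool() applies the same test that an if statement uses
    print("%r -> %s" % (v, bool(v)))

truthy = [v for v in values if v]
print(truthy)    # [1, -2.5, 'text', [0], {'a': 1}]
```

Note that [0] counts as True: it is a non-empty list, even though the item it holds is itself False.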


Single if statements (without elif or else constructs) that execute a single command can be written in one line without indentation:

In [95]:
>>> if 5 < 10: print "Five is less than ten."

Five is less than ten.


Finally, if statements can be nested using indentation:

In [96]:
>>> s = "chocolate chip"
>>> if "mint" in s:
...     print "We do not sell mint."
... elif "chocolate" in s:
...     if "ripple" in s:
...         print "We are all out of chocolate ripple."
...     elif "chip" in s:
...         print "Chocolate chip is our most popular."

Chocolate chip is our most popular.


#For loops

Like other programming languages, Python provides a mechanism for looping over consecutive values.  Unlike many languages, however, Python's loops do not intrinsically iterate over integers, but rather over elements in sequences, like lists and tuples.  The general construct is:

>>> for element in sequence:
...     commands

Notice that anything falling within the loop is indented beneath the first line, similar to if statements. Here are some examples that iterate over tuples and lists:

In [99]:
>>> for i in [3, "hello", 9.5]:
...   print i

3
hello
9.5


In [100]:
>>> for i in (2.3, [8, 9, 10], {"city":"Santa Barbara"}):
...   print i

2.3
[8, 9, 10]
{'city': 'Santa Barbara'}


Notice that the items in the iterable do not need to be the same type.  In each case, the variable i is given the value of the current list or tuple element, and the loop proceeds over these in sequence.  One does not have to use the variable i; any variable name will do, but if an existing variable is used, its value will be overwritten by the loop.

It is very easy to loop over a part of a list using slicing:

In [101]:
>>> l = [4, 6, 7, 8, 10]
>>> for i in l[2:]:
...   print i

7
8
10


Iteration over a dictionary proceeds over its keys, not its values.  Keep in mind, though, that dictionaries will not return these in any particular order.  In general, it is better to iterate explicitly over keys or values using the dictionary functions that return lists of these:

In [102]:
>>> d = {"city":"Santa Barbara", "state":"CA"}
>>> for key in d:
...   print key

city
state


In [103]:
>>> for key in d.keys():
...   print key

city
state


In [104]:
>>> for val in d.values():
...   print val

Santa Barbara
CA


Using Python's multiple assignment capabilities, it is possible to iterate over more than one value at a time:

In [105]:
>>> l = [(1, 2), (3, 4), (5, 6)]
>>> for (a, b) in l:
...   print a + b

3
7
11


In this example, Python cycles through the list and makes the assignment (a,b) = element for each element in the list.  Since the list contains two-tuples, it effectively assigns a to the first member of the tuple and b to the second.

Multiple assignment makes it easy to cycle over both keys and values in dictionaries at the same time:

In [106]:
>>> d = {"city":"Santa Barbara", "state":"CA"}
>>> d.items()

[('city', 'Santa Barbara'), ('state', 'CA')]

In [107]:
>>> for (key, val) in d.items():
...   print "The key is %s and the value is %s" % (key, val)

The key is city and the value is Santa Barbara
The key is state and the value is CA
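When a repeatable order matters, one option is to loop over the sorted keys.  The following is a sketch of that idea, using the built-in sorted function and an illustrative dictionary:

```python
d = {"state": "CA", "city": "Santa Barbara", "zip": "93106"}

# sorted(d) returns a sorted list of the keys, giving a repeatable order.
for key in sorted(d):
    print("%s: %s" % (key, d[key]))

# Sorting the (key, value) pairs works the same way with multiple assignment.
pairs = [(key, val) for (key, val) in sorted(d.items())]
print(pairs[0])    # ('city', 'Santa Barbara')
```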


It is possible to iterate over sequences of numbers using the range function:

In [108]:
>>> for i in range(4):
...   print i

0
1
2
3


In other programming languages, one might use the following idiom to iterate through items in a list:

In [109]:
>>> l = [8, 10, 12]
>>> for i in range(len(l)):
...   print l[i]

8
10
12


In Python, however, the following is more natural and efficient, and thus generally preferred:

In [110]:
>>> l = [8, 10, 12]
>>> for i in l:
...   print i

8
10
12


Notice that the loop could have been written on a single line since there is a single command within the loop, although this is not usually preferred because the loop is less clear upon inspection:

In [111]:
>>> for i in l: print i

8
10
12


If one desires to have the index of the loop in addition to the iterated element, the enumerate command is helpful:

In [112]:
>>> l = [8, 10, 12]
>>> for (ind, val) in enumerate(l):
...   print "The %ith element in the list is %d" % (ind, val)

The 0th element in the list is 8
The 1th element in the list is 10
The 2th element in the list is 12


Notice that enumerate returns indices that always begin at 0, whether or not the loop actually iterates over a slice of a list:

In [113]:
>>> l = [4, 6, 7, 8, 10]
>>> for (ind, val) in enumerate(l[2:]):
...   print "The %ith element in the list is %d" % (ind, val)

The 0th element in the list is 7
The 1th element in the list is 8
The 2th element in the list is 10
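If the indices should refer to positions in the original list rather than in the slice, the offset has to be restored.  The sketch below shows two ways to do that; the name offset is illustrative, and the second form relies on the optional start argument of enumerate.

```python
l = [4, 6, 7, 8, 10]

# Option 1: add the slice offset back by hand.
offset = 2
for (ind, val) in enumerate(l[offset:]):
    print("l[%d] is %d" % (ind + offset, val))

# Option 2: pass a start value as enumerate's second argument.
indexed = list(enumerate(l[offset:], offset))
print(indexed)    # [(2, 7), (3, 8), (4, 10)]
```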


It is also possible to iterate over two lists simultaneously using the zip function:

In [114]:
>>> l1 = [1, 2, 3]
>>> l2 = [0, 6, 8]
>>> for (a, b) in zip(l1, l2):
...    print a, b, a+b

1 0 1
2 6 8
3 8 11


The zip function can be used outside of for loops.  It simply takes two or more lists and groups them together, making tuples of corresponding list elements:

In [115]:
>>> zip([1, 2, 3], [4, 5, 6])

[(1, 4), (2, 5), (3, 6)]

In [116]:
>>> zip([1, 2, 3], [4, 5, 6], [7, 8, 9])

[(1, 4, 7), (2, 5, 8), (3, 6, 9)]

This behavior, combined with multiple assignment, is how zip allows simultaneous iteration over multiple lists at once.
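Two further properties of zip are worth knowing; the lists in this sketch are made up for illustration.  Because zip produces two-tuples, its result can be passed straight to dict, and when the lists differ in length, zip stops at the shortest one.

```python
keys = ["city", "state", "zip"]
values = ["Santa Barbara", "CA", "93106"]

# zip pairs the items; dict turns the pairs into key:value entries.
d = dict(zip(keys, values))
print(d["state"])                 # CA

# With lists of unequal length, zip stops at the shortest one.
# list() is used so the result prints the same way on Python 2 and 3.
short = list(zip([1, 2, 3], ["a", "b"]))
print(short)                      # [(1, 'a'), (2, 'b')]
```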

Like if statements, loops can be nested:

In [117]:
>>> for i in range(3):
...   for j in range(0,i):
...     print (i, j)

(1, 0)
(2, 0)
(2, 1)


It is possible to skip forward to the next loop iteration immediately, without executing subsequent commands in the same indentation block, using the continue statement.  The following produces the same output as the previous example using continue, but is ultimately less efficient because more loop cycles need to be traversed:

In [118]:
>>> for i in range(3):
...   for j in range(3):
...     if i <= j: continue
...     print (i, j)

(1, 0)
(2, 0)
(2, 1)


One can also terminate the innermost loop using the break statement.  Again, the following produces the same result but is almost as efficient as the first example because the inner loop terminates as soon as the break statement is encountered:

In [119]:
>>> for i in range(3):
...   for j in range(3):
...     if i <= j: break
...     print (i, j)

(1, 0)
(2, 0)
(2, 1)
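The efficiency differences among the three versions can be made concrete by counting how many times the body of the inner loop is entered.  This sketch replaces the print commands with counters (the counter names are illustrative):

```python
# First version: the inner range already excludes unwanted values.
steps_range = 0
for i in range(3):
    for j in range(0, i):
        steps_range += 1

# continue version: every (i, j) pair is visited, unwanted ones are skipped.
steps_continue = 0
for i in range(3):
    for j in range(3):
        steps_continue += 1
        if i <= j: continue

# break version: the inner loop stops at the first unwanted pair.
steps_break = 0
for i in range(3):
    for j in range(3):
        steps_break += 1
        if i <= j: break

print(steps_range)      # 3
print(steps_continue)   # 9
print(steps_break)      # 6
```

The break version enters the inner loop once more per outer iteration than the first version, because it has to reach the unwanted pair before it can stop.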


#While loops

Unlike for loops, while loops do not iterate over a sequence of elements but rather continue so long as some test condition is met.  Their syntax follows indentation rules similar to the cases we have seen before.  The initial statement takes the form:

>>> while condition:
...     commands

The following example computes the first several values in the Fibonacci sequence:

In [121]:
>>> k1, k2 = 1, 1
>>> while k1 < 20:
...   k1, k2 = k2, k1 + k2
...   print k1

1
2
3
5
8
13
21


Sometimes it is desired to stop the while loop somewhere in the middle of the commands that follow it.  For this purpose, the break statement can be used with an infinite loop.  The previous example printed 21 because the test k1 < 20 is made before k1 is updated.  Suppose we instead want to print only the Fibonacci numbers less than or equal to 20:

In [122]:
>>> k1, k2 = 1, 1
>>> while True:
...   k1, k2 = k2, k1 + k2
...   if k1 > 20: break
...   print k1

1
2
3
5
8
13


Here the infinite while loop is created with the while True statement.  Keep in mind that, if multiple loops are nested, the break statement will stop only the innermost loop.
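An infinite loop with break is not the only way to respect the limit.  As an alternative sketch, the loop condition itself can test the next value before it is stored; the names limit and fib are illustrative.

```python
limit = 20
fib = [1, 1]

# Test the next value before storing it, so nothing above the limit is kept.
while fib[-1] + fib[-2] <= limit:
    fib.append(fib[-1] + fib[-2])

print(fib)    # [1, 1, 2, 3, 5, 8, 13]
```

Collecting the values in a list, rather than printing them inside the loop, also leaves them available for later use.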