Definition:
<font color='blue'>Iterable</font>: An object capable of returning its members one at a time. Examples of iterables include all <font color='blue'>sequence</font> types (such as list, str, and tuple) and some non-sequence types like dict, all of which are discussed next.
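
As a minimal sketch of what "returning its members one at a time" means, the built-in `iter()` and `next()` functions can be called by hand. The variable names are illustrative only, and the code uses single-argument `print(...)` calls so that it should behave the same under Python 2 and Python 3.

```python
# Any iterable can hand out its members one at a time through an iterator.
letters = iter('abc')          # a str is an iterable sequence
first = next(letters)          # first member
second = next(letters)         # second member
print(first)
print(second)

# A for loop performs the same iter()/next() steps behind the scenes.
collected = []
for key in {'x': 1}:           # iterating over a dict yields its keys
    collected.append(key)
print(collected)
```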

# Basic data structures

<ol>
  <li>List.</li>
  <li>Tuple.</li>
  <li>Dictionary.</li>
  <li>Set.</li>
</ol>

### <font color='red'>String</font>*

An <font color='blue'>immutable</font> sequence of characters.

In [None]:
text0 = ''  # empty string
text1 = 'x' # a single character
text2 = 'The circumference of the circle is 2 pi R '     # must be contained in ONE line
text3 = "The circumference of the circle is 2 pi R "     # must be contained in ONE line
text4 = '''The circumference of the circle is 2 pi R ''' # can be split among several lines
text5 = """ a string with special character " and ' inside """
text6 = " a string with escaped special character \", \' inside " # Note the use of the escape character \

What does it mean to be immutable?

In [None]:
print text4
print id(text4)

In [None]:
text4 = 'changed text'
print id(text4)
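
The reassignment above binds the name to a brand-new object, which is why the ID changes. A short sketch of the other half of immutability, assuming illustrative names (`text`, `capitalized`, `outcome`): trying to change a character in place fails, so a modified string has to be built as a new object.

```python
text = 'circle'
outcome = 'no error'
try:
    text[0] = 'C'              # strings do not support item assignment
except TypeError:
    outcome = 'TypeError'
print(outcome)

# Building a modified copy creates a new object; the original is untouched.
capitalized = 'C' + text[1:]
print(text)
print(capitalized)
```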

### Things to keep in mind

When you read from a file (or STDIN), you always get text, that is, strings. This is important to remember when you are reading numerical data and intend to use it as numbers.

In [None]:
one = 1
two = '2'
print one
print two
print one + two  # raises TypeError: an int and a str cannot be added

The + and * operators are overloaded and thus can be used to create new strings.
One can also use the comparison operators >, >=, <, <=, ==, and != to compare strings.

In [None]:
one = 1
two = '2'
print str(one) + two
print '#'*10
print 'a' < 'b'
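
Going in the other direction, text that represents a number can be converted with the built-in `int()` and `float()` functions before doing arithmetic. This is a small sketch; `pi_text` is a made-up value standing in for something read from a file.

```python
two = '2'
pi_text = '3.14'               # hypothetical value, as if read from a file

as_int = int(two)              # text -> integer
as_float = float(pi_text)      # text -> floating-point number
total = 1 + as_int             # now a numeric addition
print(total)
print(as_float * 2)
```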

Strings are sequences. Therefore, indexing can be used to extract individual characters from a string. Extracting a substring by giving a range of indices is known as <font color='blue'>slicing</font>.

In [None]:
for c in 'hello':
    print c,

In [None]:
print text3

# Python uses zero-based indexing. Square brackets are used for indexing:
print text3[0]          

# Slicing: Given string s, s[start:end] is a substring that starts at index 'start' and ends at index 'end-1'
print text3[25:30]
print text3[-7:-1]
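
A few more slicing forms, sketched on a shorter string so the results are easy to check by hand. The names (`s`, `head`, `tail`, `last`, `every_other`) are illustrative only.

```python
s = 'The circumference'

head = s[:3]             # from the beginning up to, but not including, index 3
tail = s[4:]             # from index 4 to the end
last = s[-1]             # negative indices count from the end
every_other = s[0:6:2]   # an optional third value is the step

print(head)
print(tail)
print(last)
print(every_other)
```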

In [None]:
# Type "text3." and press <TAB> to browse the available string methods

### <font color='red'>List</font>

A list is an ordered sequence of elements. Lists are written as a comma-separated list of elements within square brackets and may contain elements of mixed types. Lists are <font color='blue'>mutable</font>.

Initialization

In [None]:
empty_list = []

In [None]:
some_primes = [2, 3, 5, 7, 11, 13]
shoplist = ['apple', 'mango', 'carrot', 'banana']

In [None]:
# Access elements
print shoplist[1]
print shoplist[2:4]

Proof that lists are mutable.

In [None]:
print 'I have', len(shoplist), 'items to purchase. The list ID is ',id(shoplist)

In [None]:
# This is how one iterates over a list
for item in shoplist:        
    print item,

In [None]:
print 'I also have to buy rice.'
shoplist.append('rice')
print 'My shopping list is now ', shoplist,'The list ID is ',id(shoplist)

print 'I will sort my list now'
shoplist.sort() # Note that the sort is done in place, ON the list itself
print 'Sorted shopping list is', shoplist,'The list ID is still ',id(shoplist)
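
One consequence of mutability, sketched with a separate illustrative list (`fruits`) so the shopping list is left alone: two names can refer to the same list object, and an in-place change made through one name is visible through the other. The built-in `sorted()` is shown as the non-mutating counterpart of `list.sort()`.

```python
fruits = ['mango', 'apple', 'carrot']
alias = fruits                 # a second name for the SAME list object
alias.append('rice')           # the change is visible through both names
same_object = alias is fruits

copy = list(fruits)            # an independent copy
copy.sort()                    # sorts the copy in place
ordered = sorted(fruits)       # sorted() returns a new list; fruits is unchanged

print(fruits)
print(ordered)
```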

In [None]:
# What happens here?
shoplist.append('apple')

List methods

In [None]:
# Type "shoplist." and press <TAB> to browse the available list methods

In [None]:
# Mixed types
P = ['Wednesday', 'April', 5, 2017, ('a','b','c')]
print P[0:4]
print P[4]

In [None]:
# Multi-dimensional list
A = [[1, 3], [2, 4], [1, 9], [4, 16]]
print A[0]
print A[2][0]

Sometimes we need to loop over a list and retrieve each element together with its corresponding index.

The <font color='blue'>enumerate</font> function gives us an iterable where each element is a tuple that contains the index of the item and the original item value.

In [None]:
presidents = ["Washington", "Adams", "Jefferson", "Madison", "Monroe", "Adams", "Jackson"]
for num, name in enumerate(presidents, start=1): # optional start=1 argument
    print("President {}: {}".format(num, name))
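
To see the tuples that `enumerate` produces, the iterable can be turned into a list. This is a small sketch on a shortened list; `pairs` and `pairs_from_one` are illustrative names.

```python
presidents = ["Washington", "Adams", "Jefferson"]

pairs = list(enumerate(presidents))              # indices start at 0 by default
pairs_from_one = list(enumerate(presidents, 1))  # the second argument is the start value

print(pairs)
print(pairs_from_one)
```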

You can loop over multiple lists at the same time.

The zip function takes multiple lists and returns an iterable that provides a tuple of the corresponding elements of each list as we loop over it.

In [None]:
colors = ["red", "green", "blue", "purple"]
ratios = [0.2, 0.3, 0.1, 0.4]
for color, ratio in zip(colors, ratios):
    print("{}% {}".format(ratio * 100, color))
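
Two details of `zip` that are easy to miss, sketched with a deliberately shortened `ratios` list: pairing stops at the shortest input, and the resulting pairs can be passed to `dict()`. Wrapping the result in `list()` keeps the example the same under Python 2 (where `zip` already returns a list) and Python 3 (where it returns a lazy iterator).

```python
colors = ["red", "green", "blue", "purple"]
ratios = [0.2, 0.3, 0.1]               # deliberately one element short

pairs = list(zip(colors, ratios))      # zip stops at the shortest input
lookup = dict(zip(colors, ratios))     # pairs can also be used to build a dictionary

print(pairs)
print(len(pairs))
```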

Exercise

Transform the gettysburg_address string into a list whose elements are just the words in the text (no punctuation or special characters). Then print the list and the number of words (i.e. the size of the list). 

In [None]:
gettysburg_address = """
Four score and seven years ago our fathers brought forth on this continent, 
a new nation, conceived in Liberty, and dedicated to the proposition that 
all men are created equal.

Now we are engaged in a great civil war, testing whether that nation, or 
any nation so conceived and so dedicated, can long endure. We are met on
a great battle-field of that war. We have come to dedicate a portion of
that field, as a final resting place for those who here gave their lives
that that nation might live. It is altogether fitting and proper that we
should do this.

But, in a larger sense, we can not dedicate -- we can not consecrate -- we
can not hallow -- this ground. The brave men, living and dead, who struggled
here, have consecrated it, far above our poor power to add or detract.  The
world will little note, nor long remember what we say here, but it can never
forget what they did here. It is for us the living, rather, to be dedicated
here to the unfinished work which they who fought here have thus far so nobly
advanced. It is rather for us to be here dedicated to the great task remaining
before us -- that from these honored dead we take increased devotion to that
cause for which they gave the last full measure of devotion -- that we here
highly resolve that these dead shall not have died in vain -- that this
nation, under God, shall have a new birth of freedom -- and that government
of the people, by the people, for the people, shall not perish from the earth.
"""

In [None]:
# Hint: type "gettysburg_address." and press <TAB> to browse string methods that may help

### <font color='red'>Tuple</font>

A tuple is an ordered sequence of elements. Tuples are written as a comma-separated list of elements, optionally within parentheses, and may contain elements of mixed types. Tuples are <font color='blue'>immutable</font>.

Initialization

In [None]:
empty_tuple = ()

In [None]:
some_primes = 2, 3, 5, 7, 11, 13
solar_system = ('mercury', 'venus', 'earth', 'mars', 'jupiter', 'saturn', 'uranus', 'neptune')

In [None]:
# Access elements
solar_system[4]

In [None]:
print 'Number of planets in the solar system is', len(solar_system)

Tuples are simple objects. Two methods only:

In [None]:
print solar_system.count('earth')  # number of occurrences of a value
print solar_system.index('earth')  # index of the first occurrence of a value

# Very little overhead -> faster than lists

In [None]:
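# (*)
# Note: this does not append 'pluto' to the tuple. It builds a new 2-element
# tuple whose first element is the whole solar_system tuple.
old_solar_system = solar_system, 'pluto'
print old_solar_system

solar_system_list = ['mercury', 'venus', 'earth', 'mars', 'jupiter', 'saturn', 'uranus', 'neptune']

# The tuple itself cannot be changed, but a mutable object stored inside it can.
a_mutable_tuple = (solar_system_list, 'pluto')
a_mutable_tuple[0][2] = 'EARTH'
print a_mutable_tuple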
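
For contrast, a brief sketch of what immutability rules out and what it still allows, using an illustrative three-planet tuple (`planets`): assigning to a position fails, while concatenation builds a new flat tuple.

```python
planets = ('mercury', 'venus', 'earth')

try:
    planets[2] = 'EARTH'             # tuples do not support item assignment
    result = 'assigned'
except TypeError:
    result = 'TypeError'

extended = planets + ('mars',)       # concatenation builds a new, flat tuple
nested = (planets, 'mars')           # this instead nests the old tuple inside a new one

print(result)
print(extended)
print(nested)
```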

In [None]:
# Assigning multiple values
(x, y, z) = ('a','b','c')
print x, y, z

In [None]:
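# Unpacking data
data  = (1,2,3)
first, second, third = data  # each name receives the element at the matching position
print first, second, third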

Tuples are sequences:

In [None]:
solar_system[0:3]

In [None]:
k = 0
while k < len(solar_system):
    print solar_system[k]
    k += 1

### <font color='red'>Dictionary</font>

A dictionary is an associative data structure of variable length. Unlike lists and tuples, the elements of a dictionary are unordered and are accessed by an associated key rather than by position (since Python 3.7, dictionaries do preserve insertion order, but access is still by key). Dictionaries are <font color='blue'>mutable</font>.

Initialization

In [None]:
empty_dict_too = {}

In [None]:
daily_temps = {'mon': 70.2, 'tue': 67.2, 'wed': 71.8, 'thur': 73.2, 'fri': 75.6}

Note that only hashable objects (in practice, immutable ones such as numbers, strings, and tuples of immutable elements) can be used as dictionary keys, whereas the values may be either immutable or mutable objects.

In [None]:
# Access elements
# Location at given key stores desired element in the dictionary. Square brackets are used for accessing elements:
print daily_temps['mon']

In [None]:
print daily_temps.keys()

In [None]:
print daily_temps.values()

In [None]:
print daily_temps.items()

In [None]:
print 'sun' in daily_temps  # preferred over has_key('sun'), which is deprecated and was removed in Python 3

The specific location where a value is stored is determined by a particular method of converting key values into index values called <font color='red'>hashing</font>. Thus, key values must be hashable. Among the built-in types, the immutable ones (numbers, strings, and tuples whose elements are all hashable) are hashable, while the mutable containers (lists, dictionaries, and sets) are not.

In [None]:
temps = {('April',3,2017): 70.2, ('April',4,2017): 67.2, ('April',5,2017): 71.8}
temps[('April',3,2017)]
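
Whether an object can serve as a key can be probed with the built-in `hash()` function. The helper `is_hashable` below is a hypothetical convenience written for this sketch, not a standard function.

```python
def is_hashable(obj):
    """Return True if hash(obj) succeeds, False if it raises TypeError."""
    try:
        hash(obj)
        return True
    except TypeError:
        return False

checks = {
    'str': is_hashable('April'),
    'tuple of immutables': is_hashable(('April', 3, 2017)),
    'list': is_hashable(['April', 3, 2017]),
    'tuple holding a list': is_hashable((['April'], 3)),
}
for label in sorted(checks):
    print("{}: {}".format(label, checks[label]))
```

Note the last case: a tuple is immutable, yet it is not hashable if it contains a list.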

Looping over a dictionary:

In [None]:
# To loop over all the keys of a dictionary (here: temps, defined above)
for k in temps:
    print k

In [None]:
for k in daily_temps:
    print k

In [None]:
# To loop over every key and value of a dictionary (here: temps, defined above)
for k, v in temps.items():
    print k, v

In [None]:
for k, v in daily_temps.items():
    print k, v
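
Because the iteration order of a dictionary should not be relied upon here, a common pattern is to loop over the sorted keys. The sketch below uses a smaller illustrative dictionary (`sample_temps`) and also shows `dict.get`, which returns a default instead of raising `KeyError` for a missing key.

```python
sample_temps = {'mon': 70.2, 'tue': 67.2, 'wed': 71.8}

for day in sorted(sample_temps):             # keys in alphabetical order
    print("{}: {}".format(day, sample_temps[day]))

sunday = sample_temps.get('sun')             # None, because 'sun' is not a key
sunday_default = sample_temps.get('sun', 0.0)
warmest = max(sample_temps, key=sample_temps.get)   # key with the largest value
print(warmest)
```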

Exercise:
Given the scrabble score dictionary, write a program to compute the value of any word. E.g. 'Python' = 14

In [None]:
score = {"a": 1, "c": 3, "b": 3, "e": 1, "d": 2, "g": 2,
         "f": 4, "i": 1, "h": 4, "k": 5, "j": 8, "m": 3,
         "l": 1, "o": 1, "n": 1, "q": 10, "p": 3, "s": 1,
         "r": 1, "u": 1, "t": 1, "w": 4, "v": 4, "y": 4,
         "x": 8, "z": 10}

In [None]:
word = 'Python'
# Write your program here

### <font color='red'>Set</font>

In [None]:
a = set([1, 2, 3, 4])
b = set([3, 4, 5, 6])

In [None]:
a | b # Union or a.union(b)

In [None]:
a & b # Intersection  or a.intersection(b)

In [None]:
a <= b # Subset or a.issubset(b); a < b tests for a proper subset

In [None]:
a - b # Difference or a.difference(b)
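
The same operations collected in one sketch, with the results printed as sorted lists so they are easy to compare. The variable names are illustrative; the last line shows that a set keeps only one copy of each element.

```python
a = set([1, 2, 3, 4])
b = set([3, 4, 5, 6])

union = a | b                   # elements in a or b
common = a & b                  # elements in both a and b
only_in_a = a - b               # elements in a but not in b
is_subset = set([3, 4]) <= a    # every element of the left set is in a
deduplicated = set([1, 1, 2, 2, 3])

print(sorted(union))
print(sorted(common))
print(sorted(only_in_a))
print(is_subset)
print(sorted(deduplicated))
```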

Exercise:
Use set() to compute the number of unique words in the "Gettysburg address". How many times does each word occur?

In [None]:
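# Assumption: words is the list of lowercase, punctuation-free words from the
# earlier exercise; it is rebuilt here so that this cell runs on its own.
letters_only = ''.join(c if c.isalpha() else ' ' for c in gettysburg_address.lower())
words = letters_only.split()

unique_words = set(words)
print len(unique_words)
count = {}
for uw in unique_words:
    count[uw] = words.count(uw)  # count whole words in the list, not substrings of the text

count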