** Introduction **

You're already here in this IPython notebook, so well done. This browser-based interactive environment is a useful tool for exploratory, trial-and-error code writing. And since we're in the business of learning new things, that's inevitably the kind of code writing we'll be doing at first.

Each of the "cells" can be used to execute Python code by hitting Shift-Enter.

This particular cell isn't Python code, though. It's just "Markdown", which means we can type whatever we want to jot down notes and explanations.

"Code" cells, on the other hand, get evaluated in a Python process and the result is sent back here:

In [3]:
2 + 3

5

This iterative and interactive code execution is quite helpful, as is the Markdown capability, so these notebooks will be a prevalent medium for teaching, learning, and completing homework.

** Basic Data Types **
  * Boolean


In [4]:
2 == 2

True

In [5]:
2 == 3

False

In [6]:
2 != 3

True

  * Strings

In [7]:
poe = "Once upon a midnight dreary"
len(poe)

27

In [8]:
poe.lower()

'once upon a midnight dreary'

In [9]:
poe + ", As I pondered weak and weary"

'Once upon a midnight dreary, As I pondered weak and weary'

In [10]:
poe += ", As I pondered weak and weary"

In [11]:
poe

'Once upon a midnight dreary, As I pondered weak and weary'

In [12]:
poe.split(" ")

['Once',
 'upon',
 'a',
 'midnight',
 'dreary,',
 'As',
 'I',
 'pondered',
 'weak',
 'and',
 'weary']

In [13]:
poe

'Once upon a midnight dreary, As I pondered weak and weary'

In [14]:
"midnight" in poe

True

  * Ints

In [15]:
2**4

16

Watch out for integer division in Python 2: dividing one int by another discards the fractional part.

In [16]:
7/2

3

  * Floats

In [17]:
7 / float(2)

3.5
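
Converting with float() is one way around the integer-division caveat above. A minimal sketch of two other options follows; the variable names are only illustrative. Note that in Python 3 the plain `/` operator always performs true division, so the surprise shown in cell 16 is specific to Python 2.

```python
# "//" is floor division and behaves the same in Python 2 and Python 3
floor_quotient = 7 // 2

# Making either operand a float forces true division in Python 2 as well
float_quotient = 7 / 2.0

# divmod returns the floor quotient and the remainder together
quotient, remainder = divmod(7, 2)
```

Using `//` whenever you really do want the truncated result keeps the intent explicit in either version of the language.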

** Basic Data Structures ** 
  * Tuples - collections of a fixed size, immutable

In [18]:
someTuple = ("dog",5,True)


Access elements by their position (Python is 0-based)

In [19]:
someTuple[0]

'dog'

In [20]:
someTuple[2]

True
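
To make "immutable" concrete, here is a small sketch (the names `some_tuple` and `new_tuple` are illustrative) showing that assigning to a tuple position raises a TypeError, and that the usual workaround is to build a new tuple.

```python
some_tuple = ("dog", 5, True)

# Item assignment is not allowed on tuples
try:
    some_tuple[0] = "cat"
    assignment_worked = True
except TypeError:
    assignment_worked = False

# Build a new tuple from the pieces instead of modifying the old one
new_tuple = ("cat",) + some_tuple[1:]
```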

  * Lists - collections of arbitrary size, easily modified

In [21]:
someList = [1,2, "lizard", True]

In [22]:
someList[1] += 5

In [23]:
someList[2] += "_5"

In [24]:
someList

[1, 7, 'lizard_5', True]

In [25]:
someList.append("hello")

In [26]:
someList

[1, 7, 'lizard_5', True, 'hello']

Notice that the append method modifies the contents of the list *in place* and doesn't return anything. Start to take notice of when operations return values and when they have *in place* modification. 
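
A short sketch of that distinction, using a throwaway list called `numbers` (an illustrative name): the list methods append and sort change the list and return None, while the built-in sorted leaves the list alone and returns a new one.

```python
numbers = [3, 1, 2]

# append modifies the list in place and returns None
append_result = numbers.append(0)

# sorted returns a new list and leaves the original order untouched
sorted_copy = sorted(numbers)
order_after_sorted = list(numbers)

# sort modifies the list in place and returns None
sort_result = numbers.sort()
```

A common slip is writing `numbers = numbers.sort()`, which replaces the list with None.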

  * Dictionaries - good for storing "lookup associations"

In [27]:
someDict = {"Fred":26, "Amy":31,"Oscar":41}

In [28]:
someDict["Fred"]

26

In [29]:
someDict.keys()


['Amy', 'Oscar', 'Fred']

In [30]:
someDict.values()

[31, 41, 26]

In [31]:
someDict["Not a real key"]

KeyError: 'Not a real key'
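
If a missing key is a normal situation rather than an error, there are a few ways to avoid the KeyError. The sketch below rebuilds the same dictionary under the illustrative name `some_dict` and shows the `in` test, the get method with and without a default, and catching the exception.

```python
some_dict = {"Fred": 26, "Amy": 31, "Oscar": 41}

# Membership test on the dictionary itself checks the keys
has_key = "Not a real key" in some_dict

# get returns None (or a default you supply) instead of raising
age_or_none = some_dict.get("Not a real key")
age_or_default = some_dict.get("Not a real key", -1)

# Or attempt the lookup and handle the failure
try:
    age = some_dict["Not a real key"]
except KeyError:
    age = None
```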

** Exercise ** - Implement a dictionary called *inventory* that keeps track of several products and their corresponding price.

In [32]:
inventory = {'prod1':13.24, 'prod2':74.32, 'prod3':89.71, 'prod4':0.60}

** Conditional Statements **
  * If


In [33]:
if "Fred" in someDict.keys():
    print someDict["Fred"]

26


  * Else 

In [34]:
myKey = "Fred"
if myKey in someDict.keys():
    print someDict[myKey]
else:
    print "Key not found"

26


  * If-Elif-Else

In [35]:
myKey = "fred"
if myKey in someDict.keys():
    print someDict[myKey]
elif myKey.capitalize() in someDict.keys():
    print someDict[myKey.capitalize()]
else:
    print "Really, I cannot find that key."
        

26


** Looping over collections **
  * For Loops 

In [36]:
for word in poe.split(" "):
    print "Here's a word: " + word

Here's a word: Once
Here's a word: upon
Here's a word: a
Here's a word: midnight
Here's a word: dreary,
Here's a word: As
Here's a word: I
Here's a word: pondered
Here's a word: weak
Here's a word: and
Here's a word: weary


In [37]:
for index,word in enumerate(poe.split(" ")):
    print "Word " + str(index) +": " + word

Word 0: Once
Word 1: upon
Word 2: a
Word 3: midnight
Word 4: dreary,
Word 5: As
Word 6: I
Word 7: pondered
Word 8: weak
Word 9: and
Word 10: weary


** Exercise ** - Loop through the numbers 0 to 20 and print only the even numbers. [Hint: use the function range() and the mod operator % ]

In [40]:
even_numbers = (num for num in range(21) if num%2 == 0)
for num in even_numbers:
    print num

0
2
4
6
8
10
12
14
16
18
20


  * List Comprehensions - a compact for-loop

In [41]:
[word for word in poe.split(" ")]

['Once',
 'upon',
 'a',
 'midnight',
 'dreary,',
 'As',
 'I',
 'pondered',
 'weak',
 'and',
 'weary']

In [42]:
[word for word in poe.split(" ") if "a" in word]

['a', 'dreary,', 'weak', 'and', 'weary']
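
To see why a list comprehension counts as a compact for-loop, the sketch below writes the filter from cell 42 both ways. The names `loop_result` and `comprehension_result` are illustrative.

```python
poe = "Once upon a midnight dreary, As I pondered weak and weary"

# Explicit loop: start with an empty list and append matching words
loop_result = []
for word in poe.split(" "):
    if "a" in word:
        loop_result.append(word)

# Equivalent comprehension: same iteration and filter in one expression
comprehension_result = [word for word in poe.split(" ") if "a" in word]
```

Note that "As" is not included, because the test for "a" is case-sensitive.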

** Exercise ** - FizzBuzz

Loop through the numbers 1 to 35. If a number is a multiple of 3, print "Fizz". If the number is a multiple of 5, print "Buzz". If a number is a multiple of 3 and 5, print "FizzBuzz". For all other numbers, just print out the number. 

In [43]:
def multiples(mod, myList):
    return (num for num in myList if num%mod == 0)

In [44]:
for num in range(1,36):
    mystring = ""
    if num%3 == 0:
        mystring = "Fizz"
    if num%5 == 0:
        mystring += "Buzz"
    
    if mystring == "":
        print str(num)
    else:
        print mystring

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz
16
17
Fizz
19
Buzz
Fizz
22
23
Fizz
Buzz
26
Fizz
28
29
FizzBuzz
31
32
Fizz
34
Buzz


** Functions **

Once we start having a complex series of operations to perform, we can and **should** encapsulate those computations into *functions* to provide modularity and reusability.

Defining functions requires the **def** keyword, a function name, parentheses holding any (optional) arguments, and a colon. The content of the function must be properly indented and will typically use the **return** keyword.

In [46]:
def say_hello():
    return "hello"

In [47]:
print say_hello()

hello


Functions typically take arguments as inputs and do some transformation or computation with those inputs.

In [48]:
def combineTwoStrings(a,b):
    return a+"_"+b

In [49]:
combineTwoStrings("Hello","World")

'Hello_World'

** Exercise ** Cash Register

Write a function **RingUp** that takes in a list of product names, as well as your *inventory* dictionary, and returns the total price of the items.

In [81]:
def RingUp(items,inventory):
    return '$' + '{0:.2f}'.format(round(sum([inventory[x] for x in items if x in inventory.keys()]),2))

In [82]:
my_items = ['prod1','prod2','prod3','prod1', 'prod4', 'prod5']
sum([inventory[x] for x in my_items if x in inventory.keys()])

191.10999999999999

In [84]:
my_items2 = ['prod4']

In [85]:
print RingUp(my_items2,inventory)

$0.60


** Exercise **
Write a function **GenerousRingUp** that doesn't charge for items that cost less than a dollar.

In [86]:
def GenerousRingUp(items,inventory):
    return '$' + '{0:.2f}'.format(round(sum([inventory[x] for x in items 
                                             if x in inventory.keys() 
                                             if inventory[x] >= 1.0]),2))

In [89]:
GenerousRingUp(my_items, inventory)

'$190.51'

** Exercise ** - Reimplement FizzBuzz as a function that takes an argument for N

In [97]:
def FizzBuss(N):
    for num in range(1,N+1):
        mystring = ""
        if num%3 == 0:
            mystring = "Fizz"
        if num%5 == 0:
            mystring += "Buzz"
    
        if mystring == "":
            print str(num)
        else:
            print mystring

In [98]:
FizzBuss(15)

1
2
Fizz
4
Buzz
Fizz
7
8
Fizz
Buzz
11
Fizz
13
14
FizzBuzz


** Reading In Files **

In [100]:
with open("../../DAT-DC-10/data/airlines.csv", 'rU') as inFile:
    f = inFile.read()

** Exercise **

- Read in the airlines file
- split into rows (use the token \n to split)
- split each row on the comma 
- separate the header row from the data rows


In [108]:
with open("../../DAT-DC-10/data/airlines.csv", 'rU') as inFile:
    f = inFile.read()
    air_rows=f.split('\n')
    new_rows = [row.split(',') for row in air_rows]
    header = new_rows[0]
    data = new_rows[1:]

In [111]:
print header
print data

['airline', 'avail_seat_km_per_week', 'incidents_85_99', 'fatal_accidents_85_99', 'fatalities_85_99', 'incidents_00_14', 'fatal_accidents_00_14', 'fatalities_00_14']
[['Aer Lingus', '320906734', '2', '0', '0', '0', '0', '0'], ['Aeroflot*', '1197672318', '76', '14', '128', '6', '1', '88'], ['Aerolineas Argentinas', '385803648', '6', '0', '0', '1', '0', '0'], ['Aeromexico*', '596871813', '3', '1', '64', '5', '0', '0'], ['Air Canada', '1865253802', '2', '0', '0', '2', '0', '0'], ['Air France', '3004002661', '14', '4', '79', '6', '2', '337'], ['Air India*', '869253552', '2', '1', '329', '4', '1', '158'], ['Air New Zealand*', '710174817', '3', '0', '0', '5', '1', '7'], ['Alaska Airlines*', '965346773', '5', '0', '0', '5', '1', '88'], ['Alitalia', '698012498', '7', '2', '50', '4', '0', '0'], ['All Nippon Airways', '1841234177', '3', '1', '1', '7', '0', '0'], ['American*', '5228357340', '21', '5', '101', '17', '3', '416'], ['Austrian Airlines', '358239823', '1', '0', '0', '1', '0', '0'], ['Av

We can use other utilities that are particularly good at parsing CSV files.

In [113]:
import csv
with open('../../DAT-DC-10/data/airlines.csv', mode='rU') as f:
    file_nested_list = [row for row in csv.reader(f)]

In [114]:
file_nested_list

[['airline',
  'avail_seat_km_per_week',
  'incidents_85_99',
  'fatal_accidents_85_99',
  'fatalities_85_99',
  'incidents_00_14',
  'fatal_accidents_00_14',
  'fatalities_00_14'],
 ['Aer Lingus', '320906734', '2', '0', '0', '0', '0', '0'],
 ['Aeroflot*', '1197672318', '76', '14', '128', '6', '1', '88'],
 ['Aerolineas Argentinas', '385803648', '6', '0', '0', '1', '0', '0'],
 ['Aeromexico*', '596871813', '3', '1', '64', '5', '0', '0'],
 ['Air Canada', '1865253802', '2', '0', '0', '2', '0', '0'],
 ['Air France', '3004002661', '14', '4', '79', '6', '2', '337'],
 ['Air India*', '869253552', '2', '1', '329', '4', '1', '158'],
 ['Air New Zealand*', '710174817', '3', '0', '0', '5', '1', '7'],
 ['Alaska Airlines*', '965346773', '5', '0', '0', '5', '1', '88'],
 ['Alitalia', '698012498', '7', '2', '50', '4', '0', '0'],
 ['All Nippon Airways', '1841234177', '3', '1', '1', '7', '0', '0'],
 ['American*', '5228357340', '21', '5', '101', '17', '3', '416'],
 ['Austrian Airlines', '358239823', '1', '0
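
One reason to prefer the csv module over splitting on commas by hand is quoted fields. The sketch below uses a made-up two-line sample (the text and the "Air Example, Inc." name are invented for illustration, not taken from the airlines file) and is written for Python 3, where io.StringIO stands in for an open file.

```python
import csv
import io

# Hypothetical sample: the second row has a comma inside a quoted field
sample_text = 'airline,note\n"Air Example, Inc.",ok\n'

# Naive approach: split lines, then split each line on every comma
naive_rows = [line.split(",") for line in sample_text.strip().split("\n")]

# csv.reader understands the quoting and keeps the field together
csv_rows = list(csv.reader(io.StringIO(sample_text)))
```

The naive version breaks the quoted name into two pieces and leaves the quote characters in place, while csv.reader returns it as a single clean field.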

** Data Cleaning Exercises ** 
- With the airlines data, create a list of airline names that end in a *
- Create a dictionary where the airline name (without a star) is the key, the value is a 1 or a 0 to indicate whether the name had a star
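
One possible approach to these two exercises is sketched below. To keep it self-contained it uses four airline names copied from the output above rather than reading the file, and the names `airline_names`, `starred`, and `star_flags` are illustrative.

```python
# A few names taken from the airlines output shown earlier
airline_names = ["Aer Lingus", "Aeroflot*", "Air Canada", "Air India*"]

# Names that end in a star
starred = [name for name in airline_names if name.endswith("*")]

# Map each cleaned name to 1 if it had a star, otherwise 0
star_flags = {}
for name in airline_names:
    had_star = 1 if name.endswith("*") else 0
    star_flags[name.rstrip("*")] = had_star
```

With the real data, `airline_names` would be the first column of each data row.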

** Exercise ** With the data set you brought in
- Repeat this process (if applicable)
- Read in the data, separate the rows
- Further parse the rows, if needed
- Come up with something to calculate about the data


In [115]:
import csv
with open('../../project/data/Net_generation_for_all_sectors.csv', mode='rU') as f:
    file_nested_list = [row for row in csv.reader(f)]

In [119]:
header = file_nested_list[4]

In [122]:
data = [row for row in file_nested_list[5:] if row[3] != ""]

Check whether the last month's column holds a numeric value before adding it to the total, then average:

In [137]:
total = 0.0
for row in data:
    if (row[-1] != '--' and row[-1] != 'NM' and row[-1] != '-- '):
        total += float(row[-1])
print total/len(data)

189.835685484
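
Note that the cell above divides by len(data), which also counts the rows that were skipped. If the intent is the mean over rows that actually have a value, a variant along these lines might be closer. This is only a sketch: the function name `mean_of_last_column`, the `missing` parameter, and the sample rows are all hypothetical.

```python
def mean_of_last_column(rows, missing=("--", "NM")):
    """Average the last cell of each row, ignoring placeholder values."""
    values = []
    for row in rows:
        cell = row[-1].strip()  # strip handles variants such as '-- '
        if cell not in missing:
            values.append(float(cell))
    if not values:
        return None  # nothing to average
    return sum(values) / len(values)

# Made-up rows for illustration only
sample_rows = [["a", "10.0"], ["b", "--"], ["c", "NM"], ["d", "20.0"], ["e", "-- "]]
average = mean_of_last_column(sample_rows)
```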


*** Optional Material ***

** Higher-Order functions ** 

A higher-order function is any function that either (i) takes a function as an input or (ii) produces a function as an output. 

A common one is *map*, which gives us another way to iterate over a collection and perform some operation on each element. The first argument to map is the function we wish to apply to every element. The second argument is the collection.

In [138]:
map(len, poe.split(" "))

[4, 4, 1, 8, 7, 2, 1, 8, 4, 3, 5]

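Another common one is *reduce*, which has the same positional arguments. For reduce, the function you pass in must take in two elements and produce a single element in return. reduce applies it from left to right, carrying the running result along (if the operation is associative and has an identity element, this forms a *monoid*, if you really want to get nerdy here).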

In [139]:
reduce(combineTwoStrings, poe.split(" "))

'Once_upon_a_midnight_dreary,_As_I_pondered_weak_and_weary'
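
The two cells above rely on Python 2 behavior. As a hedged sketch of how the same calls look under Python 3: map returns a lazy iterator there, so it is wrapped in list(), and reduce has to be imported from functools (an import that also works in Python 2.6 and later).

```python
from functools import reduce

poe = "Once upon a midnight dreary, As I pondered weak and weary"

def combineTwoStrings(a, b):
    return a + "_" + b

# list() forces the iterator that map returns in Python 3
word_lengths = list(map(len, poe.split(" ")))

# reduce folds from the left: f(f(f(w0, w1), w2), w3) and so on
joined = reduce(combineTwoStrings, poe.split(" "))
```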

** Lambdas aka Anonymous Functions aka Function Literals **

You may encounter lambda expressions, and they seem more complicated than they are. Once you get the hang of them, they're super useful, especially in combination with higher-order functions.

In [140]:
map(lambda s: s.capitalize(),poe.split(" "))

['Once',
 'Upon',
 'A',
 'Midnight',
 'Dreary,',
 'As',
 'I',
 'Pondered',
 'Weak',
 'And',
 'Weary']
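
A lambda is just a function written as an expression, without a name. The sketch below puts a lambda next to the equivalent def and uses one as the key argument of the built-in sorted. The names `capitalize_lambda`, `capitalize_def`, and `words` are illustrative.

```python
# These two definitions behave the same way
capitalize_lambda = lambda s: s.capitalize()

def capitalize_def(s):
    return s.capitalize()

words = "once upon a midnight".split(" ")

# A lambda passed directly as the sort key: order words by length
by_length = sorted(words, key=lambda w: len(w))
```

Because sorted is stable, "once" stays ahead of "upon" even though both have four letters.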

** Classes and Methods **

We don't really need to write code in the Object Oriented style, but many of the libraries we'll use (such as scikit-learn) implement their functionality through such an interface, so we should be familiar with the premise. 

In [141]:
class Person:
    def __init__(self,name,age):
        self.name=name
        self.age=age
        
    def age_in_dog_years(self):
        return self.age / 7

In [142]:
person1 = Person("Alice", 22)
person2 = Person("Jeff",42)

In [143]:
person1.name
person1.age

22

In [144]:
person2.age_in_dog_years()

6
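
To give a feel for the object-oriented interfaces mentioned above, here is a toy class that loosely imitates the fit-then-predict pattern used by scikit-learn estimators. It is a hypothetical illustration, not part of any library: `MeanPredictor` simply remembers the mean of the values it was fitted on.

```python
class MeanPredictor(object):
    """Toy model: always predicts the mean of the training values."""

    def __init__(self):
        self.mean_ = None  # set by fit

    def fit(self, values):
        # float() avoids integer division under Python 2
        self.mean_ = sum(values) / float(len(values))
        return self  # returning self allows call chaining

    def predict(self, n):
        # Return the stored mean n times
        return [self.mean_] * n

model = MeanPredictor().fit([2, 4, 6])
predictions = model.predict(3)
```

The point is the shape of the interaction: create an object, call a method that stores state on it, then call another method that uses that state.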