---

_You are currently looking at **version 1.1** of this notebook. To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-data-analysis/resources/0dhYG) course resource._

---

# The Python Programming Language: Functions

<br> 
`add_numbers` is a function that takes two numbers and adds them together.

In [1]:
def add_numbers(x, y):
    return x + y

add_numbers(1, 2)

3

<br> 
`add_numbers` updated to take an optional 3rd parameter. Using `print` allows printing of multiple expressions within a single cell.

In [15]:
def add_numbers(x, y, z=None):  # z is None unless the caller supplies it;
                                # optional parameters go at the end
    if z is None:
        return x + y
    else:
        return x + y + z

print(add_numbers(1, 2))
print(add_numbers(1, 2, 3))

3
6


<br> 
`add_numbers` updated to take an optional flag parameter.

In [7]:
def add_numbers(x, y, z=None, flag=False):
    if flag:
        print('Flag is true!')
    if z is None:
        return x + y
    else:
        return x + y + z
    
print(add_numbers(1, 2, flag=True))

Flag is true!
3


<br> 
Assign function `add_numbers` to variable `a`.

In [24]:
def add_numbers(x,y):
    return x+y

a = add_numbers
a(1,2)


3
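Because a function is an ordinary object, it can also be passed to another function as an argument. The sketch below is illustrative only; `multiply_numbers` and `apply_operation` are hypothetical helpers that do not appear in the course notebook.

```python
def add_numbers(x, y):
    return x + y

def multiply_numbers(x, y):
    # hypothetical helper, introduced only for this illustration
    return x * y

def apply_operation(operation, x, y):
    # `operation` is any callable that accepts two arguments
    return operation(x, y)

# Pass each function object (no parentheses) to apply_operation
results = [apply_operation(f, 3, 4) for f in (add_numbers, multiply_numbers)]
print(results)  # [7, 12]
```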

<br> 
# The Python Programming Language: Types and Sequences

<br> 
Use `type` to return the object's type.

In [2]:
type('This is a string')

str

In [8]:
type(None) #NoneType

NoneType

In [3]:
type(1)

int

In [27]:
type(1.0)

float

In [30]:
type(add_numbers)

function

<br> 
Tuples are an immutable data structure (cannot be altered).

In [6]:
x = (1, 'a', 2, 'b')
type(x)

tuple

<br> 
Lists are a mutable data structure.

In [17]:
x = [1, 'a', 2, 'b']
type(x)

list

<br> 
Use `append` to append an object to a list.

In [15]:
x.append([3,4]) # in place; the list [3, 4] is added as a single element
print(x)

[1, 'a', 2, 'b', [3, 4]]


<br> 
Use `extend` to concatenate another list onto the end of a list.

In [16]:
x.extend([5,6]) #inplace
print(x)

[1, 'a', 2, 'b', [3, 4], 5, 6]
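The difference between `append` and `extend` is easiest to see side by side. This is a minimal sketch on two fresh lists (`nested` and `flat` are names chosen for this illustration).

```python
nested = [1, 2]
nested.append([3, 4])   # adds the list itself as ONE new element

flat = [1, 2]
flat.extend([3, 4])     # adds each item of the list separately

print(nested, len(nested))  # [1, 2, [3, 4]] 3
print(flat, len(flat))      # [1, 2, 3, 4] 4
```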


<br> 
This is an example of how to loop through each item in the list.

In [18]:
for item in x:
    print(item)

1
a
2
b


<br> 
Or using the indexing operator:

In [7]:
i=0
while i < len(x): # from 0 to length-1, inclusive. run (length) times
    print(x[i])
    i = i + 1

1
a
2
b


<br> 
Use `+` to concatenate lists.

In [8]:
[1,2] + [3,4]

[1, 2, 3, 4]

<br> 
Use `*` to repeat lists.

In [9]:
[1]*3

[1, 1, 1]

<br> 
Use the `in` operator to check if something is inside a list.

In [22]:
1 in [1, 2, 3]

True

<br> 
Now let's look at strings. Use bracket notation to slice a string.

In [10]:
x = 'This is a string'
print(x[0]) #first character
print(x[0:1]) #first character, but we have explicitly set the end character
print(x[0:2]) #first two characters


T
T
Th


<br> 
This will return the last element of the string.

In [11]:
x[-1]

'g'

<br> 
This will return the slice starting from the 4th element from the end and stopping before the 2nd element from the end.

In [12]:
x[-4:-2] #length = diff. Same as JS

'ri'

<br> 
This is a slice from the beginning of the string and stopping before the 3rd element.

In [13]:
x[:3]

'Thi'

<br> 
And this is a slice starting from the 4th element of the string and going all the way to the end.

In [None]:
x[3:]
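A short sketch of how the two open-ended slices fit together: splitting at any index `n` and rejoining gives back the original string. The variable names here are chosen for this illustration.

```python
s = 'This is a string'
n = 3

head, tail = s[:n], s[n:]     # everything before index n, everything from n on
print(repr(head))             # 'Thi'
print(repr(tail))             # 's is a string'
print(head + tail == s)       # True

print(s[::-1])                # a step of -1 walks the string backwards
print(repr(s[100:]))          # '' : out-of-range slices are empty, not errors
```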

In [14]:
firstname = 'Christopher'
lastname = 'Brooks'

print(firstname + ' ' + lastname)
print(firstname*3)
print('Chris' in firstname)


Christopher Brooks
ChristopherChristopherChristopher
True


`split` returns a list of all the words in a string, or a list split on a specific character.


In [15]:
firstname = 'Christopher Arthur Hansen Brooks'.split(' ')[0] # [0] selects the first element of the list
lastname = 'Christopher Arthur Hansen Brooks'.split()[-1] # [-1] selects the last element of the list
print(firstname)
print(lastname)

Christopher
Brooks


<br> 
Make sure you convert objects to strings before concatenating.

In [22]:
try:
    'Chris' + 2
except TypeError:
    print("Cannot concatenate a string with a number")
else:
    print("You get away.")

Cannot concatenate a string with a number


In [16]:
'Chris' + str(2)

'Chris2'

<br> 
Dictionaries associate keys with values.

In [32]:
x = {'Christopher Brooks': 'brooksch@umich.edu', 'Bill Gates': 'billg@microsoft.com'}
x['Christopher Brooks'] # Retrieve a value by using the indexing operator
 

'brooksch@umich.edu'

In [33]:
x['Kevyn Collins-Thompson'] = None # add a new key whose value is None
print(type(x['Kevyn Collins-Thompson'] )) 

<class 'NoneType'>


<br> 
Iterate over all of the keys:

In [33]:
for key in x: # iterating over a dictionary yields its keys
    print(x[key])

brooksch@umich.edu
billg@microsoft.com
None


<br> 
Iterate over all of the values:

In [39]:
for value in x.values(): #values
    print(value)
    
    

brooksch@umich.edu
billg@microsoft.com
None


<br> 
Iterate over all of the items in the dictionary:

In [27]:
print(x.items()) # returns a dict_items view of (key, value) tuples
for key, value in x.items(): #key, value
    print(key)
    print(value)

dict_items([('Christopher Brooks', 'brooksch@umich.edu'), ('Bill Gates', 'billg@microsoft.com')])
Christopher Brooks
brooksch@umich.edu
Bill Gates
billg@microsoft.com
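Two related dictionary operations, sketched under the assumption of the same two-entry dictionary (renamed `contacts` here): `in` tests for a key, and `get` looks up a key without raising `KeyError` when it is missing.

```python
contacts = {'Christopher Brooks': 'brooksch@umich.edu',
            'Bill Gates': 'billg@microsoft.com'}

print('Bill Gates' in contacts)                            # True: `in` checks keys
print(contacts.get('Kevyn Collins-Thompson'))              # None instead of a KeyError
print(contacts.get('Kevyn Collins-Thompson', 'unknown'))   # explicit fallback value

# A dictionary comprehension over items() swaps keys and values
by_email = {email: name for name, email in contacts.items()}
print(by_email['billg@microsoft.com'])                     # Bill Gates
```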


<br> 
<strong>Destructuring: </strong>You can unpack a sequence into different variables:

In [34]:
#unpacks sequence i.e. tuple, into multiple variables
x = ('Christopher', 'Brooks', 'brooksch@umich.edu')
fname, lname, email = x
fname, lname, email 

('Christopher', 'Brooks', 'brooksch@umich.edu')

<br> 
Make sure the number of values you are unpacking matches the number of variables being assigned.

In [35]:
try:
    x = ('Christopher', 'Brooks', 'brooksch@umich.edu', 'Ann Arbor')
    fname, lname, email = x
except ValueError:
    print('ValueError: too many values to unpack')


ValueError: too many values to unpack
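When the number of values is not known in advance, a starred name can collect the leftovers. This is a small sketch; `record` and `rest` are names chosen for this illustration.

```python
record = ('Christopher', 'Brooks', 'brooksch@umich.edu', 'Ann Arbor')

# The starred variable receives all remaining values as a list
fname, lname, *rest = record
print(fname, lname)   # Christopher Brooks
print(rest)           # ['brooksch@umich.edu', 'Ann Arbor']
```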


<br> 
# The Python Programming Language: More on Strings

Python's built-in `str.format()` method provides convenient string formatting.

In [None]:
sales_record = {
'price': 3.24,
'num_items': 4,
'person': 'Chris'}

sales_statement = '{} bought {} item(s) at a price of {} each for a total of {}'

print(sales_statement.format(sales_record['person'],
                             sales_record['num_items'],
                             sales_record['price'],
                             sales_record['num_items']*sales_record['price']))
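`str.format()` also accepts named placeholders and format specifications. The sketch below reuses the same record and assumes two decimal places are wanted for money; `total` and `statement` are names introduced here.

```python
sales_record = {'price': 3.24, 'num_items': 4, 'person': 'Chris'}
total = sales_record['num_items'] * sales_record['price']

# {name} picks a keyword argument; :.2f formats a float with two decimals
statement = ('{person} bought {num_items} item(s) at a price of {price:.2f} '
             'each for a total of {total:.2f}').format(total=total, **sales_record)
print(statement)
# Chris bought 4 item(s) at a price of 3.24 each for a total of 12.96
```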


<br> 
# Reading and Writing CSV files

<br> 
Let's import our datafile mpg.csv, which contains fuel economy data for 234 cars.

* mpg : miles per gallon
* class : car classification
* cty : city mpg
* cyl : # of cylinders
* displ : engine displacement in liters
* drv : f = front-wheel drive, r = rear wheel drive, 4 = 4wd
* fl : fuel (e = ethanol E85, d = diesel, r = regular, p = premium, c = CNG)
* hwy : highway mpg
* manufacturer : automobile manufacturer
* model : model of car
* trans : type of transmission
* year : model year

In [2]:
import csv #csv reader

In [3]:
#IPython magic: set display precision
%precision 2

'%.2f'

In [4]:
%cat 'mpg.csv'

"","manufacturer","model","displ","year","cyl","trans","drv","cty","hwy","fl","class"
"1","audi","a4",1.8,1999,4,"auto(l5)","f",18,29,"p","compact"
"2","audi","a4",1.8,1999,4,"manual(m5)","f",21,29,"p","compact"
"3","audi","a4",2,2008,4,"manual(m6)","f",20,31,"p","compact"
"4","audi","a4",2,2008,4,"auto(av)","f",21,30,"p","compact"
"5","audi","a4",2.8,1999,6,"auto(l5)","f",16,26,"p","compact"
"6","audi","a4",2.8,1999,6,"manual(m5)","f",18,26,"p","compact"
"7","audi","a4",3.1,2008,6,"auto(av)","f",18,27,"p","compact"
"8","audi","a4 quattro",1.8,1999,4,"manual(m5)","4",18,26,"p","compact"
"9","audi","a4 quattro",1.8,1999,4,"auto(l5)","4",16,25,"p","compact"
"10","audi","a4 quattro",2,2008,4,"manual(m6)","4",20,28,"p","compact"
"11","audi","a4 quattro",2,2008,4,"auto(s6)","4",19,27,"p","compact"
"12","audi","a4 quattro",2.8,1999,6,"auto(l5)","4",15,25,"p","compact"
"13","audi","a4 quattro",2.8,1999,6,"manual(m5)","4",17,25,"p","compact"
"14","audi","a4 quattro",3.1,2008,6,"a

`csv.DictReader(f)` reads each row of a .csv file as a dictionary keyed by the column names.

In [17]:
# Close csv object after reading.
csvfile = open('mpg.csv')
mpg=list(csv.DictReader(csvfile))
csvfile.close()
mpg

[{'': '1',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '1.8',
  'year': '1999',
  'cyl': '4',
  'trans': 'auto(l5)',
  'drv': 'f',
  'cty': '18',
  'hwy': '29',
  'fl': 'p',
  'class': 'compact'},
 {'': '2',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '1.8',
  'year': '1999',
  'cyl': '4',
  'trans': 'manual(m5)',
  'drv': 'f',
  'cty': '21',
  'hwy': '29',
  'fl': 'p',
  'class': 'compact'},
 {'': '3',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '2',
  'year': '2008',
  'cyl': '4',
  'trans': 'manual(m6)',
  'drv': 'f',
  'cty': '20',
  'hwy': '31',
  'fl': 'p',
  'class': 'compact'},
 {'': '4',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '2',
  'year': '2008',
  'cyl': '4',
  'trans': 'auto(av)',
  'drv': 'f',
  'cty': '21',
  'hwy': '30',
  'fl': 'p',
  'class': 'compact'},
 {'': '5',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '2.8',
  'year': '1999',
  'cyl': '6',
  'trans': 'auto(l5)',
  'drv': 'f',
  'cty': '16',
  'hwy': '26',
 

Use `with open('file.csv') as opened_file:` to close the file automatically.

In [9]:
with open('mpg.csv') as csvfile:
    mpg = list(csv.DictReader(csvfile)) # wrap in a list
    # the file is closed automatically when the with block ends
mpg

[{'': '1',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '1.8',
  'year': '1999',
  'cyl': '4',
  'trans': 'auto(l5)',
  'drv': 'f',
  'cty': '18',
  'hwy': '29',
  'fl': 'p',
  'class': 'compact'},
 {'': '2',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '1.8',
  'year': '1999',
  'cyl': '4',
  'trans': 'manual(m5)',
  'drv': 'f',
  'cty': '21',
  'hwy': '29',
  'fl': 'p',
  'class': 'compact'},
 {'': '3',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '2',
  'year': '2008',
  'cyl': '4',
  'trans': 'manual(m6)',
  'drv': 'f',
  'cty': '20',
  'hwy': '31',
  'fl': 'p',
  'class': 'compact'},
 {'': '4',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '2',
  'year': '2008',
  'cyl': '4',
  'trans': 'auto(av)',
  'drv': 'f',
  'cty': '21',
  'hwy': '30',
  'fl': 'p',
  'class': 'compact'},
 {'': '5',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '2.8',
  'year': '1999',
  'cyl': '6',
  'trans': 'auto(l5)',
  'drv': 'f',
  'cty': '16',
  'hwy': '26',
 

In [10]:
mpg[:3] # The first three dictionaries in our list.

[{'': '1',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '1.8',
  'year': '1999',
  'cyl': '4',
  'trans': 'auto(l5)',
  'drv': 'f',
  'cty': '18',
  'hwy': '29',
  'fl': 'p',
  'class': 'compact'},
 {'': '2',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '1.8',
  'year': '1999',
  'cyl': '4',
  'trans': 'manual(m5)',
  'drv': 'f',
  'cty': '21',
  'hwy': '29',
  'fl': 'p',
  'class': 'compact'},
 {'': '3',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '2',
  'year': '2008',
  'cyl': '4',
  'trans': 'manual(m6)',
  'drv': 'f',
  'cty': '20',
  'hwy': '31',
  'fl': 'p',
  'class': 'compact'}]

<br> 
`csv.DictReader` has read in each row of our csv file as a dictionary. `len` shows that our list is made up of 234 dictionaries.

In [18]:
len(mpg)

234
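The same behaviour can be tried without the data file. The sketch below feeds `csv.DictReader` a small in-memory sample (two rows and a reduced set of columns, assumed here for brevity) through `io.StringIO`, and shows that every value comes back as a string.

```python
import csv
import io

# Two-row sample in the layout of mpg.csv, with fewer columns (an assumption for brevity)
sample = io.StringIO(
    '"","manufacturer","model","cty","hwy"\n'
    '"1","audi","a4",18,29\n'
    '"2","audi","a4",21,29\n'
)

rows = list(csv.DictReader(sample))   # the header row supplies the keys
print(len(rows))                      # 2
print(list(rows[0].keys()))           # ['', 'manufacturer', 'model', 'cty', 'hwy']
print(repr(rows[0]['cty']))           # '18' : a string, even though the file had no quotes

cty_values = [int(row['cty']) for row in rows]   # convert before doing arithmetic
print(sum(cty_values) / len(cty_values))         # 19.5
```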

<br> 
`keys` gives us the column names of our csv.

In [58]:
mpg[0].keys() #select first set as a sample

odict_keys(['', 'manufacturer', 'model', 'displ', 'year', 'cyl', 'trans', 'drv', 'cty', 'hwy', 'fl', 'class'])

<br> 
This is how to find the average cty fuel economy across all cars. All values in the dictionaries are strings, so we need to convert to float.

In [60]:
sum(float(d['cty']) for d in mpg) / len(mpg) # d is every dictionary in the list
#see explanation below

16.86

In [63]:
# The generator expression above is equivalent to this explicit loop
thesum = 0
for d in mpg:
    thesum = thesum + float(d['cty'])  # convert each string value to float before adding
print(thesum / len(mpg))

16.858974358974358


Similarly this is how to find the average hwy fuel economy across all cars.

In [None]:
sum(float(d['hwy']) for d in mpg) / len(mpg)

<br> 
Use `set` to return the unique values for the number of cylinders the cars in our dataset have.

In [15]:
cylinders = set(d['cyl'] for d in mpg)
cylinders

{'4', '5', '6', '8'}

<br> 
Here's a more complex example where we are grouping the cars by number of cylinders, and finding the average cty mpg for each group.

In [31]:
# For each cylinder count c in the cylinder set, pair it with the average city mpg:
# the sum of cty over the cars with that cylinder count,
# divided by the number of such cars.
# Finally, sort the (cylinder, average) pairs by cylinder count.

sorted([(c, sum(float(d['cty']) for d in mpg if d['cyl'] == c) /
            sum(1 for d in mpg if d['cyl'] == c)) for c in cylinders],
       key=lambda x: x[0])


[('4', 21.01), ('5', 20.50), ('6', 16.22), ('8', 12.57)]

Same as below:

In [16]:
CtyMpgByCyl = []

for c in cylinders: # iterate over all the cylinder levels
    summpg = 0
    cyltypecount = 0
    for d in mpg: # iterate over all dictionaries
        if d['cyl'] == c: # if the cylinder level type matches,
            summpg += float(d['cty']) # add the cty mpg
            cyltypecount += 1 # increment the count
    CtyMpgByCyl.append((c, summpg / cyltypecount)) # append the tuple ('cylinder', 'avg mpg')

CtyMpgByCyl.sort(key=lambda x: x[0]) # key = lambda function: sort by first element
CtyMpgByCyl

[('4', 21.01), ('5', 20.50), ('6', 16.22), ('8', 12.57)]
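Both versions above scan the whole list once per cylinder count. One alternative, sketched here on a hypothetical five-row stand-in for `mpg` (the values are made up for illustration), collects the groups in a single pass with `collections.defaultdict`.

```python
from collections import defaultdict

# Hypothetical miniature stand-in for the mpg list; values are strings, as csv returns them
mpg_sample = [
    {'cyl': '4', 'cty': '18'},
    {'cyl': '4', 'cty': '22'},
    {'cyl': '6', 'cty': '16'},
    {'cyl': '6', 'cty': '18'},
    {'cyl': '8', 'cty': '12'},
]

cty_by_cyl = defaultdict(list)           # a missing key starts as an empty list
for d in mpg_sample:
    cty_by_cyl[d['cyl']].append(float(d['cty']))

# Average each group and sort the (cylinder, average) pairs by cylinder count
averages = sorted((cyl, sum(v) / len(v)) for cyl, v in cty_by_cyl.items())
print(averages)  # [('4', 20.0), ('6', 17.0), ('8', 12.0)]
```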

<br> 
Use `set` to return the unique values for the class types in our dataset.

In [None]:
vehicleclass = set(d['class'] for d in mpg) # what are the class types
vehicleclass

<br> 
And here's an example of how to find the average hwy mpg for each class of vehicle in our dataset.

In [None]:
HwyMpgByClass = []

for t in vehicleclass: # iterate over all the vehicle classes
    summpg = 0
    vclasscount = 0
    for d in mpg: # iterate over all dictionaries
        if d['class'] == t: # if the vehicle class matches,
            summpg += float(d['hwy']) # add the hwy mpg
            vclasscount += 1 # increment the count
    HwyMpgByClass.append((t, summpg / vclasscount)) # append the tuple ('class', 'avg mpg')

HwyMpgByClass.sort(key=lambda x: x[1])
HwyMpgByClass

<br> 
# The Python Programming Language: Dates and Times

In [32]:
import datetime as dt
import time as tm

`time.time()` returns the current time in seconds since the epoch (January 1st, 1970).

In [69]:
#seconds past since 1970
#input for dt.datetime.fromtimestamp()
tm.time() 

1547439570.28

`time.gmtime()` returns the current UTC time as a `struct_time` object whose components are accessible as attributes.

In [37]:
tm.gmtime()

time.struct_time(tm_year=2020, tm_mon=5, tm_mday=19, tm_hour=0, tm_min=40, tm_sec=2, tm_wday=1, tm_yday=140, tm_isdst=0)

`dt.datetime.today()` returns the current local date and time as a datetime object.

In [49]:
#return datetime(YYYY, M, D, hour24, min, sec, microsecond)
dtnow = dt.datetime.today() 

dtnow

datetime.datetime(2020, 5, 18, 18, 48, 15, 62676)

`dt.datetime.fromtimestamp(timestamp)` converts a timestamp to a datetime object.

In [50]:
#return datetime(YYYY, M, D, hour24, min, sec, microsecond)
dtnow = dt.datetime.fromtimestamp(tm.time()) # takes seconds since the epoch, e.g. tm.time()

dtnow

datetime.datetime(2020, 5, 18, 18, 48, 22, 389708)

Handy datetime attributes:

In [41]:
dtnow.year, dtnow.month, dtnow.day, dtnow.hour, dtnow.minute, dtnow.second # get year, month, day, etc. from a datetime

(2020, 5, 18, 18, 43, 24)

<br> 
`timedelta` is a duration expressing the difference between two dates.

In [46]:
# timedelta normalizes its arguments to days, seconds and microseconds
delta = dt.timedelta(hours=1000) # create a timedelta of 1000 hours (41 days and 16 hours)
delta

datetime.timedelta(days=41, seconds=57600)

<br> 
`date.today` returns the current local date.

In [51]:
today = dt.date.today() # return date(YYYY, M, D)
today

datetime.date(2020, 5, 18)

In [80]:
#Use delta to modify date

today - delta # the date one delta earlier; only the whole days of delta are subtracted

datetime.date(2020, 4, 7)

In [52]:
today > today-delta # compare dates

True
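Date arithmetic is easier to check with a fixed date than with `today()`. The sketch below assumes the date 2020-05-18 (the one shown in the outputs above) and a delta of exactly 100 days; `start` and `earlier` are names introduced here.

```python
import datetime as dt

start = dt.date(2020, 5, 18)        # fixed date so the result is reproducible
delta = dt.timedelta(days=100)

earlier = start - delta             # date - timedelta gives a date
print(earlier)                      # 2020-02-08
print((start - earlier).days)       # 100 : date - date gives a timedelta

# timedelta normalizes other units into days and seconds
long_delta = dt.timedelta(hours=1000)
print(long_delta.days, long_delta.seconds)   # 41 57600
```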

<br> 
# The Python Programming Language: Objects and map()

<br> 
An example of a class in python:

In [53]:
class Person:  # class names are capitalized by convention
    department = 'School of Information'  # a class variable
    def __init__(self, name="Unknown", location="Unknown"):  # defaults used if no argument is given
        self.name = name
        self.location = location
    def set_name(self, new_name):  # a method
        self.name = new_name
    def set_location(self, new_location):
        self.location = new_location

In [89]:
person = Person()
print('By Default: ', person.name, person.location)
person.set_name('Christopher Brooks')
person.set_location('Ann Arbor, MI, USA')
print('{} lives in {} and works in the department {}'.format(person.name, person.location, person.department))

By Default:  Unknown Unknown
Christopher Brooks lives in Ann Arbor, MI, USA and works in the department School of Information
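A class variable is shared by every instance until an instance assigns its own attribute with the same name. This is a minimal sketch with a reduced `Person` class; the names 'Alice' and 'Bob' and the department 'Statistics' are hypothetical.

```python
class Person:
    department = 'School of Information'   # class variable, shared by all instances

    def __init__(self, name='Unknown'):
        self.name = name                    # instance variable, unique to each object

alice = Person('Alice')
bob = Person('Bob')

bob.department = 'Statistics'   # creates an instance attribute on bob only

print(alice.department)    # School of Information
print(bob.department)      # Statistics
print(Person.department)   # School of Information (the class variable is unchanged)
```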


`map(function, iterable1, iterable2, ...)` applies the function element-wise across the iterables.

In [77]:
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]
cheapest = map(min, store1, store2) # returns a lazy map object
[item for item in cheapest] # iterate over the map to get the results

[9.00, 11.00, 12.34, 2.01]

Note that this is not the same as calling `min` on the two lists directly:

In [94]:
# min() compares the two lists lexicographically and returns the smaller list as a whole (store2 here)
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]
min(store1, store2)

[9.00, 11.10, 12.34, 2.01]
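To make the distinction concrete, the sketch below compares three calls on the same two lists: `map` with two iterables, the equivalent `zip` comprehension, and `min` applied to the lists themselves. The result names are chosen for this illustration.

```python
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]

via_map = list(map(min, store1, store2))                # min(a, b) for each pair
via_zip = [min(a, b) for a, b in zip(store1, store2)]   # the same, written with zip
whole_list_min = min(store1, store2)                    # compares the lists as wholes

print(via_map)                    # [9.0, 11.0, 12.34, 2.01]
print(via_map == via_zip)         # True
print(whole_list_min is store2)   # True: 9.00 < 10.00 decides at the first element
```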

Example:<br> Here is a list of faculty teaching this MOOC. Can you write a function and apply it using map() to get a list of all faculty titles and last names (e.g. ['Dr. Brooks', 'Dr. Collins-Thompson', …]) ?

In [89]:
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson', 'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

#List comprehension
[person.split()[0] + ' ' + person.split()[-1] for person in people]

#OR map lambda function
list(map(lambda x: x.split()[0] + ' ' + x.split()[-1], people))


['Dr. Brooks', 'Dr. Collins-Thompson', 'Dr. Vydiswaran', 'Dr. Romero']

<br> 
# The Python Programming Language: Lambda and List Comprehensions

A lambda function is an anonymous function defined in a single expression.

In [None]:
my_function = lambda a, b, c : a + b

In [None]:
my_function(1, 2, 3)
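Lambdas are most useful as short throwaway arguments, for example as the `key` of `sorted`. The sketch below reuses the faculty list from the earlier example and sorts it by last name; `by_last_name` is a name introduced here.

```python
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson',
          'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

# The lambda extracts the sort key (the last word) from each full name
by_last_name = sorted(people, key=lambda full_name: full_name.split()[-1])
print(by_last_name)
# ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson',
#  'Dr. Daniel Romero', 'Dr. VG Vinod Vydiswaran']
```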

Example:<br>Convert this function into a lambda

<br> 
Let's iterate from 0 to 999 and return the even numbers.

In [95]:
my_list = []
for number in range(0, 1000): #from 0 to 999 inclusive
    if number % 2 == 0:
        my_list.append(number)
my_list

[0,
 2,
 4,
 6,
 8,
 10,
 12,
 14,
 16,
 18,
 20,
 22,
 24,
 26,
 28,
 30,
 32,
 34,
 36,
 38,
 40,
 42,
 44,
 46,
 48,
 50,
 52,
 54,
 56,
 58,
 60,
 62,
 64,
 66,
 68,
 70,
 72,
 74,
 76,
 78,
 80,
 82,
 84,
 86,
 88,
 90,
 92,
 94,
 96,
 98,
 100,
 102,
 104,
 106,
 108,
 110,
 112,
 114,
 116,
 118,
 120,
 122,
 124,
 126,
 128,
 130,
 132,
 134,
 136,
 138,
 140,
 142,
 144,
 146,
 148,
 150,
 152,
 154,
 156,
 158,
 160,
 162,
 164,
 166,
 168,
 170,
 172,
 174,
 176,
 178,
 180,
 182,
 184,
 186,
 188,
 190,
 192,
 194,
 196,
 198,
 200,
 202,
 204,
 206,
 208,
 210,
 212,
 214,
 216,
 218,
 220,
 222,
 224,
 226,
 228,
 230,
 232,
 234,
 236,
 238,
 240,
 242,
 244,
 246,
 248,
 250,
 252,
 254,
 256,
 258,
 260,
 262,
 264,
 266,
 268,
 270,
 272,
 274,
 276,
 278,
 280,
 282,
 284,
 286,
 288,
 290,
 292,
 294,
 296,
 298,
 300,
 302,
 304,
 306,
 308,
 310,
 312,
 314,
 316,
 318,
 320,
 322,
 324,
 326,
 328,
 330,
 332,
 334,
 336,
 338,
 340,
 342,
 344,
 346,
 348,
 350,

<br> 
Now the same thing but with list comprehension.

In [None]:
my_list = [number for number in range(0,1000) if number % 2 == 0]
my_list

Example: list comprehension can be nested.

In [102]:
def times_tables():
    lst = []
    for i in range(3):
        for j in range (5):
            lst.append(i*j)
    return lst
times_tables()

[0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 2, 4, 6, 8]

In [108]:
# the for clauses nest left to right: i is the outer loop, j is the inner loop
times_tables = [i*j for i in range(3) for j in range(5)]
times_tables

[0, 0, 0, 0, 0, 0, 1, 2, 3, 4, 0, 2, 4, 6, 8]
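Moving the inner `for` into its own pair of brackets produces a list of rows instead of one flat list. This is a sketch of the two forms on the same ranges; `flat` and `table` are names chosen for this illustration.

```python
# Two for clauses in one comprehension: a single flat list
flat = [i * j for i in range(3) for j in range(5)]

# A comprehension inside a comprehension: one inner list per value of i
table = [[i * j for j in range(5)] for i in range(3)]

print(table)   # [[0, 0, 0, 0, 0], [0, 1, 2, 3, 4], [0, 2, 4, 6, 8]]

# Flattening the table row by row gives back the flat list
print([value for row in table for value in row] == flat)   # True
```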

Example:<br> Imagine you work at an internet service provider and the user ids are all two letters followed by two numbers (e.g. aa49). Write an initialization line as a single list comprehension which creates a list of all possible user ids. Assume the letters are all lower case.

In [110]:
lowercase = 'abcdefghijklmnopqrstuvwxyz'
digits = '0123456789'

[i + j + m + n
 for i in lowercase
 for j in lowercase
 for m in digits
 for n in digits]

['aa00',
 'aa01',
 'aa02',
 'aa03',
 'aa04',
 'aa05',
 'aa06',
 'aa07',
 'aa08',
 'aa09',
 'aa10',
 'aa11',
 'aa12',
 'aa13',
 'aa14',
 'aa15',
 'aa16',
 'aa17',
 'aa18',
 'aa19',
 'aa20',
 'aa21',
 'aa22',
 'aa23',
 'aa24',
 'aa25',
 'aa26',
 'aa27',
 'aa28',
 'aa29',
 'aa30',
 'aa31',
 'aa32',
 'aa33',
 'aa34',
 'aa35',
 'aa36',
 'aa37',
 'aa38',
 'aa39',
 'aa40',
 'aa41',
 'aa42',
 'aa43',
 'aa44',
 'aa45',
 'aa46',
 'aa47',
 'aa48',
 'aa49',
 'aa50',
 'aa51',
 'aa52',
 'aa53',
 'aa54',
 'aa55',
 'aa56',
 'aa57',
 'aa58',
 'aa59',
 'aa60',
 'aa61',
 'aa62',
 'aa63',
 'aa64',
 'aa65',
 'aa66',
 'aa67',
 'aa68',
 'aa69',
 'aa70',
 'aa71',
 'aa72',
 'aa73',
 'aa74',
 'aa75',
 'aa76',
 'aa77',
 'aa78',
 'aa79',
 'aa80',
 'aa81',
 'aa82',
 'aa83',
 'aa84',
 'aa85',
 'aa86',
 'aa87',
 'aa88',
 'aa89',
 'aa90',
 'aa91',
 'aa92',
 'aa93',
 'aa94',
 'aa95',
 'aa96',
 'aa97',
 'aa98',
 'aa99',
 'ab00',
 'ab01',
 'ab02',
 'ab03',
 'ab04',
 'ab05',
 'ab06',
 'ab07',
 'ab08',
 'ab09',
 'ab10',
 

<br> 
# The Python Programming Language: Numerical Python (NumPy)

**See Numpy_Introduction.ipynb**

In [109]:
import numpy as np

<br> 
## The np.array() Function

Create a list and convert it to a numpy array

In [110]:
mylist = [1, 2, 3]
x = np.array(mylist)
x

array([1, 2, 3])

<br> 
Or just pass in a list directly

In [111]:
y = np.array([4, 5, 6])
y

array([4, 5, 6])

<br> 
Pass in a list of lists to create a multidimensional array.

In [112]:
m = np.array([[7, 8, 9], [10, 11, 12]])
m

array([[ 7,  8,  9],
       [10, 11, 12]])

<br> 
Use the `shape` attribute to find the dimensions of the array. (rows, columns)

In [113]:
m.shape

(2, 3)

<br> 
`arange` returns evenly spaced values within a given interval.

In [117]:
n = np.arange(0, 11, 2) # start at 0, count up by 2, stop before 11
n

array([ 0,  2,  4,  6,  8, 10])

<br> 
`reshape` returns an array with the same data with a new shape.

In [123]:
n = n.reshape(2, 3) # reshape array to be 2 row x 3 columns
n

array([[ 0,  2,  4],
       [ 6,  8, 10]])
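`reshape` can infer one dimension when it is given as `-1`, and it raises an error when the requested shape does not match the number of elements. A minimal sketch on the same six-element array (the names `a`, `b` and `reshape_failed` are introduced here):

```python
import numpy as np

n = np.arange(0, 11, 2)        # 6 elements: 0, 2, 4, 6, 8, 10

a = n.reshape(2, -1)           # -1 lets NumPy work out the column count
b = n.reshape(-1, 2)           # ... or the row count
print(a.shape, b.shape)        # (2, 3) (3, 2)

try:
    n.reshape(4, 2)            # 8 slots cannot hold 6 elements
    reshape_failed = False
except ValueError:
    reshape_failed = True
print(reshape_failed)          # True
```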

<br> 
`linspace` returns evenly spaced numbers over a specified interval.

In [140]:
# return 9 evenly spaced values from 0 to 4
o = np.linspace(0, 4, 9) # the endpoint 4 is included by default
print(o)
o = np.linspace(0, 4, 9, endpoint = False, dtype = 'f2') # exclude 4; 'f2' is float16
print(o)

[0.  0.5 1.  1.5 2.  2.5 3.  3.5 4. ]
[0.     0.4443 0.8887 1.333  1.777  2.223  2.666  3.111  3.555 ]


<br> 
`resize` changes the shape and size of array in-place.

In [144]:
o.resize(3, 3)
o

array([[0.    , 0.4443, 0.8887],
       [1.333 , 1.777 , 2.223 ],
       [2.666 , 3.111 , 3.555 ]], dtype=float16)

<br> 
`ones` returns a new array of given shape and type, filled with ones.

In [145]:
np.ones((3, 2))

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

<br> 
`zeros` returns a new array of given shape and type, filled with zeros.

In [146]:
np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

<br> 
`eye` returns a 2-D array with ones on the diagonal and zeros elsewhere.

In [147]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

<br> 
`diag` extracts a diagonal or constructs a diagonal array.

In [151]:
np.diag(y)#1D array --> Diagonal array


array([[4, 0, 0],
       [0, 5, 0],
       [0, 0, 6]])

In [150]:
np.diag(np.diag(y)) #Diagonal array --> 1D array

array([4, 5, 6])

<br> 
Create an array from a repeated list (see also `np.tile`).

In [167]:
# For a list, * repeats the list; it does NOT multiply element-wise
[1, 2, 3] * 3

[1, 2, 3, 1, 2, 3, 1, 2, 3]

In [165]:
#Array supports element-wise calculation
np.array([1, 2, 3] * 3) * 3

array([3, 6, 9, 3, 6, 9, 3, 6, 9])

<br> 
Repeat elements of an array using `repeat`.

In [159]:
np.repeat([1, 2, 3], 3)

array([1, 1, 1, 2, 2, 2, 3, 3, 3])

<br> 
#### Combining Arrays

In [155]:
p = np.ones([2, 3], int)
p

array([[1, 1, 1],
       [1, 1, 1]])

<br> 
Use `vstack` to stack arrays in sequence vertically (row wise).

In [160]:
np.vstack([p, 2*p])

array([[1, 1, 1],
       [1, 1, 1],
       [2, 2, 2],
       [2, 2, 2]])

<br> 
Use `hstack` to stack arrays in sequence horizontally (column wise).

In [161]:
np.hstack([p, 2*p])

array([[1, 1, 1, 2, 2, 2],
       [1, 1, 1, 2, 2, 2]])
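For 2-D arrays like `p`, `vstack` and `hstack` correspond to `np.concatenate` along axis 0 and axis 1 respectively. The sketch below checks that equivalence; `rows` and `cols` are names chosen for this illustration.

```python
import numpy as np

p = np.ones([2, 3], int)

rows = np.concatenate([p, 2 * p], axis=0)   # stack along rows, like vstack
cols = np.concatenate([p, 2 * p], axis=1)   # stack along columns, like hstack

print(rows.shape, cols.shape)                           # (4, 3) (2, 6)
print(np.array_equal(rows, np.vstack([p, 2 * p])))      # True
print(np.array_equal(cols, np.hstack([p, 2 * p])))      # True
```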

<br> 
## Operations

Use `+`, `-`, `*`, `/` and `**` to perform element wise addition, subtraction, multiplication, division and power.

In [162]:
print(x + y) # elementwise addition     [1 2 3] + [4 5 6] = [5  7  9]
print(x - y) # elementwise subtraction  [1 2 3] - [4 5 6] = [-3 -3 -3]

[5 7 9]
[-3 -3 -3]


In [163]:
print(x * y) # elementwise multiplication  [1 2 3] * [4 5 6] = [4  10  18]
print(x / y) # elementwise divison         [1 2 3] / [4 5 6] = [0.25  0.4  0.5]

[ 4 10 18]
[0.25 0.4  0.5 ]


In [164]:
print(x**2) # elementwise power  [1 2 3] ^2 =  [1 4 9]

[1 4 9]


<br> 
**Dot Product:**  

$ \begin{bmatrix}x_1 & x_2 & x_3\end{bmatrix}
\cdot
\begin{bmatrix}y_1 \\ y_2 \\ y_3\end{bmatrix}
= x_1 y_1 + x_2 y_2 + x_3 y_3$

In [168]:
x.dot(y) # dot product  1*4 + 2*5 + 3*6

32
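The dot product can be checked against the formula by hand, and the `@` operator gives the same result. The sketch below also multiplies the 2x3 array `m` from earlier by `x`, which yields one dot product per row; `manual` and `per_row` are names introduced here.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

manual = sum(a * b for a, b in zip(x, y))   # x1*y1 + x2*y2 + x3*y3
print(manual, x.dot(y), x @ y)              # 32 32 32

m = np.array([[7, 8, 9], [10, 11, 12]])
per_row = m @ x                             # each row of m dotted with x
print(per_row)                              # [50 68]
```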

In [169]:
z = np.array([y, y**2])
print(len(z)) # number of rows of array

2


<br> 
Let's look at transposing arrays. Transposing permutes the dimensions of the array.

In [170]:
z = np.array([y, y**2])
z

array([[ 4,  5,  6],
       [16, 25, 36]])

<br> 
The shape of array `z` is `(2,3)` before transposing.

In [171]:
z.shape

(2, 3)

<br> 
Use `.T` to get the transpose.

In [172]:
z.T

array([[ 4, 16],
       [ 5, 25],
       [ 6, 36]])

<br> 
The number of rows has swapped with the number of columns.

In [173]:
z.T.shape

(3, 2)

<br> 
Use `.dtype` to see the data type of the elements in the array.

In [174]:
z.dtype

dtype('int64')

<br> 
Use `.astype` to cast to a specific type.

In [177]:
z = z.astype('f')
z.dtype

dtype('float32')

<br> 
## Math Functions

Numpy has many built in math functions that can be performed on arrays.

In [179]:
import numpy as np
a = np.array([-4, -2, 1, 3, 5])

In [180]:
a.sum()

3

In [181]:
a.max()

5

In [182]:
a.min()

-4

In [183]:
a.mean()

0.6

In [184]:
a.std()

3.2619012860600183

<br> 
`argmax` and `argmin` return the index of the maximum and minimum values in the array.

In [185]:
a.argmax()

4

In [186]:
a.argmin()

0
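On a multidimensional array these functions accept an `axis` argument, and `argmax` without an axis returns an index into the flattened array. A small sketch using the 2x3 array `m` from earlier; `position` is a name introduced here.

```python
import numpy as np

m = np.array([[7, 8, 9], [10, 11, 12]])

print(m.sum())         # 57 : all elements
print(m.sum(axis=0))   # [17 19 21] : one sum per column
print(m.sum(axis=1))   # [24 33] : one sum per row

flat_index = m.argmax()                              # index into the flattened array
position = np.unravel_index(flat_index, m.shape)     # convert back to (row, column)
print(flat_index)      # 5
```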

Arrays also work with Python's built-in functions such as `min`.

In [188]:
min(a)

-4

<br> 
## Indexing / Slicing

In [189]:
s = np.arange(13)**2
s

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121, 144])

<br> 
Use bracket notation to get the value at a specific index. Remember that indexing starts at 0.

In [190]:
s[0], s[4], s[-1]

(0, 16, 144)

<br> 
Use `:` to indicate a range. `array[start:stop]`


Leaving `start` or `stop` empty will default to the beginning/end of the array.

In [191]:
s[1:5]

array([ 1,  4,  9, 16])

<br> 
Use negatives to count from the back.

In [192]:
# count to the right end by default
s[-4:]

array([ 81, 100, 121, 144])

<br> 
A second `:` can be used to indicate step-size. `array[start:stop:stepsize]`

Here we are starting at the 5th element from the end, and counting backwards by 2 until the beginning of the array is reached.

In [193]:
s[-5::-2]

array([64, 36, 16,  4,  0])

<br> 
Let's look at a multidimensional array.

In [220]:
r = np.arange(36)
r.resize((6, 6))
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

<br> 
Use bracket notation to slice: `array[row, column]`

In [196]:
r[2, 2]

14

<br> 
And use : to select a range of rows or columns

In [197]:
r[3, 3:6]

array([21, 22, 23])

<br> 
Here we are selecting all the rows up to (and not including) row 2, and all the columns up to (and not including) the last column.

In [198]:
r[:2, :-1]

array([[ 0,  1,  2,  3,  4],
       [ 6,  7,  8,  9, 10]])

<br> 
This is a slice of the last row, and only every other element.

In [199]:
r[-1, ::2]

array([30, 32, 34])

<br> 
We can also perform conditional indexing. `np.where` returns the row and column indices of the values that satisfy a condition, here the values greater than 30.

In [201]:
np.where(r > 30)

(array([5, 5, 5, 5, 5]), array([1, 2, 3, 4, 5]))

In [207]:
# the comparison returns a boolean mask (wrapped in a list here)
[r > 20]

[array([[False, False, False, False, False, False],
        [False, False, False, False, False, False],
        [False, False, False, False, False, False],
        [False, False, False,  True,  True,  True],
        [ True,  True,  True,  True,  True,  True],
        [ True,  True,  True,  True,  True,  True]])]

In [221]:
# boolean indexing returns the matching values as a flattened 1-D array (a copy, not a view)
r[r > 20]

array([21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35])

<br> 
Here we are assigning all values in the array that are greater than 30 to the value of 30.

In [215]:
# assign a new value to every element selected by the mask (modifies r in place)
r[r > 30] = 30
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])
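Two related idioms, sketched on a fresh 6x6 array: the three-argument form of `np.where` builds a new array without touching the original, and masks can be combined with `&` or `|` as long as each comparison is parenthesized. The names `capped` and `band` are introduced here.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

# np.where(condition, value_if_true, value_if_false) returns a new array
capped = np.where(r > 30, 30, r)
print(capped.max(), r.max())      # 30 35 : r itself is unchanged

# Combine conditions with & (and) or | (or); the parentheses are required
band = r[(r > 10) & (r < 15)]
print(band)                       # [11 12 13 14]
```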

<br> 
## Copying Data

Be careful with copying and modifying arrays in NumPy!


`r2` is a slice of `r`

In [222]:
r2 = r[:3,:3]
r2

array([[ 0,  1,  2],
       [ 6,  7,  8],
       [12, 13, 14]])

<br> 
Set this slice's values to zero ([:] selects the entire array)

In [None]:
r2[:] = 0
r2

<br> 
`r` has also been changed!

In [None]:
r

<br> 
To avoid this, use `r.copy()` to create a copy that will not affect the original array

In [223]:
r_copy = r.copy()
r_copy

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

<br> 
Now when r_copy is modified, r will not be changed.

In [None]:
r_copy[:] = 10
print(r_copy, '\n')
print(r)
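Whether two arrays share data can be checked directly with `np.shares_memory`. The sketch below, on a fresh 6x6 array, contrasts a slice (a view) with an explicit copy; `view` and `independent` are names chosen for this illustration.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

view = r[:3, :3]                  # basic slicing returns a view of r
independent = r[:3, :3].copy()    # copy() allocates separate data

print(np.shares_memory(r, view))          # True
print(np.shares_memory(r, independent))   # False

view[:] = 0                        # writes through to r
print(r[0, 1], independent[0, 1])  # 0 1 : r changed, the copy did not
```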

<br> 
### Iterating Over Arrays

Let's create a new 4 by 3 array of random numbers 0-9.

In [224]:
test = np.random.randint(0, 10, (4,3))
test

array([[2, 4, 9],
       [2, 1, 6],
       [2, 4, 3],
       [4, 2, 1]])

<br> 
Iterate by row:

In [225]:
for row in test:
    print(row)

[2 4 9]
[2 1 6]
[2 4 3]
[4 2 1]


<br> 
Iterate by index:

In [226]:
for i in range(len(test)):
    print(test[i])

[2 4 9]
[2 1 6]
[2 4 3]
[4 2 1]


<br> 
Iterate by row and index:

In [227]:
for i, row in enumerate(test):
    print('row', i, 'is', row)

row 0 is [2 4 9]
row 1 is [2 1 6]
row 2 is [2 4 3]
row 3 is [4 2 1]


<br> 
Use `zip` to iterate over multiple iterables.

In [229]:
test2 = test**2
test2

array([[ 4, 16, 81],
       [ 4,  1, 36],
       [ 4, 16,  9],
       [16,  4,  1]])

In [230]:
for i, j in zip(test, test2):
    print(i,'+',j,'=',i+j)

[2 4 9] + [ 4 16 81] = [ 6 20 90]
[2 1 6] + [ 4  1 36] = [ 6  2 42]
[2 4 3] + [ 4 16  9] = [ 6 20 12]
[4 2 1] + [16  4  1] = [20  6  2]
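When the position of every element is needed rather than every row, `np.ndenumerate` yields `(index, value)` pairs. The sketch below uses a small fixed array instead of random numbers so the result is reproducible, and also shows that a per-row loop can often be replaced by an `axis` argument.

```python
import numpy as np

test = np.array([[2, 4, 9],
                 [2, 1, 6]])      # fixed values (an assumption, not the random array above)

# ndenumerate yields ((row, column), value) for every element
cells = [(i, j, int(value)) for (i, j), value in np.ndenumerate(test)]
print(cells[:3])                  # [(0, 0, 2), (0, 1, 4), (0, 2, 9)]

# A loop over rows and the vectorized equivalent
row_sums = [int(row.sum()) for row in test]
vectorized = test.sum(axis=1)
print(row_sums, vectorized)       # [15, 9] [15  9]
```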
