---

_You are currently looking at **version 1.1** of this notebook. To download notebooks and datafiles, as well as get help on Jupyter notebooks in the Coursera platform, visit the [Jupyter Notebook FAQ](https://www.coursera.org/learn/python-data-analysis/resources/0dhYG) course resource._

---

# The Python Programming Language: Functions

<br>
`add_numbers` is a function that takes two numbers and adds them together.

In [1]:
def add_numbers(x, y):
    return x + y

add_numbers(1, 2)

3

<br>
`add_numbers` updated to take an optional 3rd parameter. Using `print` allows printing of multiple expressions within a single cell.

In [2]:
def add_numbers(x, y, z=None):
    if z is None:
        return x + y
    else:
        return x + y + z

print(add_numbers(1, 2))
print(add_numbers(1, 2, 3))

3
6


<br>
`add_numbers` updated to take an optional flag parameter.

In [None]:
def add_numbers(x, y, z=None, flag=False):
    if flag:
        print('Flag is true!')
    if z is None:
        return x + y
    else:
        return x + y + z
    
print(add_numbers(1, 2, flag=True))

<br>
Assign function `add_numbers` to variable `a`.

In [None]:
def add_numbers(x,y):
    return x+y

a = add_numbers
a(1,2)
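
<br>
Because a function is an ordinary object, it can also be passed to another function. The sketch below illustrates this; `apply_twice` is a hypothetical helper introduced only for this example.

```python
def add_numbers(x, y):
    return x + y

def apply_twice(func, x, y):
    # Hypothetical helper: call func on (x, y), then on (result, y)
    return func(func(x, y), y)

a = add_numbers                  # a and add_numbers name the same function object
result = apply_twice(a, 1, 2)    # (1 + 2) + 2
print(a is add_numbers)          # True
print(result)                    # 5
```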

<br>
# The Python Programming Language: Types and Sequences

<br>
Use `type` to return the object's type.

In [None]:
type('This is a string')

In [None]:
type(None)

In [None]:
type(1)

In [None]:
type(1.0)

In [None]:
type(add_numbers)

<br>
Tuples are an immutable data structure (cannot be altered).

In [None]:
x = (1, 'a', 2, 'b')
type(x)

<br>
Lists are a mutable data structure.

In [None]:
x = [1, 'a', 2, 'b']
type(x)

<br>
Use `append` to append an object to a list.

In [None]:
x.append(3.3)
print(x)

<br>
This is an example of how to loop through each item in the list.

In [None]:
for item in x:
    print(item)

<br>
Or using the indexing operator:

In [None]:
i = 0
while i < len(x):
    print(x[i])
    i = i + 1
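
<br>
When both the index and the item are needed, `enumerate` avoids the manual counter. A small sketch, assuming the list built in the cells above:

```python
x = [1, 'a', 2, 'b', 3.3]

# enumerate yields (index, item) pairs, so no counter variable is needed
pairs = []
for i, item in enumerate(x):
    pairs.append((i, item))
    print(i, item)
```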

<br>
Use `+` to concatenate lists.

In [3]:
[1,2] + [3,4]

[1, 2, 3, 4]

<br>
Use `*` to repeat lists.

In [None]:
[1]*3

<br>
Use the `in` operator to check if something is inside a list.

In [None]:
1 in [1, 2, 3]

<br>
Now let's look at strings. Use bracket notation to slice a string.

In [None]:
x = 'This is a string'
print(x[0]) #first character
print(x[0:1]) #first character, but we have explicitly set the end character
print(x[0:2]) #first two characters


<br>
This will return the last element of the string.

In [None]:
x[-1]

<br>
This will return the slice starting from the 4th element from the end and stopping before the 2nd element from the end.

In [None]:
x[-4:-2]

<br>
This is a slice from the beginning of the string and stopping before the 3rd element.

In [None]:
x[:3]

<br>
And this is a slice starting from the 4th element of the string and going all the way to the end.

In [None]:
x[3:]
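
<br>
The slices above can be checked side by side. This sketch repeats them on the same string and adds one property that may help intuition: a slice and its complement always reassemble the original.

```python
x = 'This is a string'

print(x[-1])     # 'g'  (last character)
print(x[-4:-2])  # 'ri' (4th from the end, stopping before the 2nd from the end)
print(x[:3])     # 'Thi'
print(x[3:])     # 's is a string'

# For any split point k, the two halves rebuild the string
rebuilt_ok = all(x[:k] + x[k:] == x for k in range(len(x) + 1))
print(rebuilt_ok)  # True
```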

In [None]:
firstname = 'Christopher'
lastname = 'Brooks'

print(firstname + ' ' + lastname)
print(firstname*3)
print('Chris' in firstname)


<br>
`split` returns a list of all the words in a string, or a list split on a specific character.

In [4]:
firstname = 'Christopher Arthur Hansen Brooks'.split(' ')[0] # [0] selects the first element of the list
lastname = 'Christopher Arthur Hansen Brooks'.split(' ')[-1] # [-1] selects the last element of the list
print(firstname)
print(lastname)

Christopher
Brooks


<br>
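Make sure you convert objects to strings before concatenating. The first cell below raises a `TypeError` because a string and an integer cannot be added; wrapping the number in `str` fixes it.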

In [None]:
'Chris' + 2

In [None]:
'Chris' + str(2)

<br>
Dictionaries associate keys with values.

In [2]:
x = {'Christopher Brooks': 'brooksch@umich.edu', 'Bill Gates': 'billg@microsoft.com'}
x['Christopher Brooks'] # Retrieve a value by using the indexing operator


'brooksch@umich.edu'

In [3]:
x['Kevyn Collins-Thompson'] = None
x['Kevyn Collins-Thompson']

<br>
Iterate over all of the keys, using each key to look up its value:

In [4]:
for name in x:
    print(x[name])

brooksch@umich.edu
billg@microsoft.com
None


<br>
Iterate over all of the values:

In [6]:
for email in x.values():
    print(email)

brooksch@umich.edu
billg@microsoft.com
None


In [11]:
x.keys()

dict_keys(['Christopher Brooks', 'Bill Gates', 'Kevyn Collins-Thompson'])

In [5]:
x.values()

dict_values(['brooksch@umich.edu', 'billg@microsoft.com', None])

In [6]:
x.items()

dict_items([('Christopher Brooks', 'brooksch@umich.edu'), ('Bill Gates', 'billg@microsoft.com'), ('Kevyn Collins-Thompson', None)])

<br>
Iterate over all of the items in the dictionary:

In [None]:
for name, email in x.items():
    print(name)
    print(email)
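
<br>
Indexing a key that is not in the dictionary raises a `KeyError`. A short sketch of two ways to guard against that; the key `'Ada Lovelace'` is a made-up example and not part of the notebook's data.

```python
x = {'Christopher Brooks': 'brooksch@umich.edu', 'Bill Gates': 'billg@microsoft.com'}

# get returns a default instead of raising KeyError for a missing key
missing = x.get('Ada Lovelace', 'no email on file')
print(missing)

# `in` tests keys, not values
has_key = 'Bill Gates' in x
has_value_as_key = 'billg@microsoft.com' in x
print(has_key)            # True
print(has_value_as_key)   # False
```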

<br>
You can unpack a sequence into different variables:

In [None]:
x = ('Christopher', 'Brooks', 'brooksch@umich.edu')
fname, lname, email = x

In [None]:
fname

In [None]:
lname

<br>
Make sure the number of values you are unpacking matches the number of variables being assigned.

In [9]:
x = ('Christopher', 'Brooks', 'brooksch@umich.edu', 'Ann Arbor')
fname, lname, email,name2 = x
x

('Christopher', 'Brooks', 'brooksch@umich.edu', 'Ann Arbor')
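
<br>
To illustrate the mismatch case, this sketch unpacks four values into three names, catches the resulting `ValueError`, and then uses a starred name to collect the leftovers.

```python
x = ('Christopher', 'Brooks', 'brooksch@umich.edu', 'Ann Arbor')

try:
    fname, lname, email = x          # 4 values, only 3 names
except ValueError as err:
    message = str(err)
    print('ValueError:', message)

# A starred name gathers whatever is left over into a list
fname, lname, *rest = x
print(rest)   # ['brooksch@umich.edu', 'Ann Arbor']
```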

<br>
# The Python Programming Language: More on Strings

In [None]:
print('Chris' + 2)

In [None]:
print('Chris' + str(2))

<br>
Python has a built-in method for convenient string formatting.

In [None]:
sales_record = {
'price': 3.24,
'num_items': 4,
'person': 'Chris'}

sales_statement = '{} bought {} item(s) at a price of {} each for a total of {}'

print(sales_statement.format(sales_record['person'],
                             sales_record['num_items'],
                             sales_record['price'],
                             sales_record['num_items']*sales_record['price']))
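
<br>
`format` also accepts named fields and format specifiers. This variant of the cell above is a sketch: the `total` keyword and the `:.2f` rounding are choices made for this example.

```python
sales_record = {'price': 3.24, 'num_items': 4, 'person': 'Chris'}

total = sales_record['num_items'] * sales_record['price']

# Named fields are filled from keyword arguments; :.2f shows two decimals
template = ('{person} bought {num_items} item(s) at a price of '
            '{price:.2f} each for a total of {total:.2f}')
statement = template.format(total=total, **sales_record)
print(statement)
```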


<br>
# Reading and Writing CSV files

<br>
Let's import our datafile mpg.csv, which contains fuel economy data for 234 cars.

* mpg : miles per gallon
* class : car classification
* cty : city mpg
* cyl : # of cylinders
* displ : engine displacement in liters
* drv : f = front-wheel drive, r = rear wheel drive, 4 = 4wd
* fl : fuel (e = ethanol E85, d = diesel, r = regular, p = premium, c = CNG)
* hwy : highway mpg
* manufacturer : automobile manufacturer
* model : model of car
* trans : type of transmission
* year : model year

In [1]:
import csv

%precision 2

with open('mpg.csv') as csvfile:
    mpg = list(csv.DictReader(csvfile))
    
mpg[:3] # The first three dictionaries in our list.

[OrderedDict([('', '1'),
              ('manufacturer', 'audi'),
              ('model', 'a4'),
              ('displ', '1.8'),
              ('year', '1999'),
              ('cyl', '4'),
              ('trans', 'auto(l5)'),
              ('drv', 'f'),
              ('cty', '18'),
              ('hwy', '29'),
              ('fl', 'p'),
              ('class', 'compact')]),
 OrderedDict([('', '2'),
              ('manufacturer', 'audi'),
              ('model', 'a4'),
              ('displ', '1.8'),
              ('year', '1999'),
              ('cyl', '4'),
              ('trans', 'manual(m5)'),
              ('drv', 'f'),
              ('cty', '21'),
              ('hwy', '29'),
              ('fl', 'p'),
              ('class', 'compact')]),
 OrderedDict([('', '3'),
              ('manufacturer', 'audi'),
              ('model', 'a4'),
              ('displ', '2'),
              ('year', '2008'),
              ('cyl', '4'),
              ('trans', 'manual(m6)'),
              ('drv',

<br>
`csv.DictReader` has read in each row of our CSV file as a dictionary, so `mpg` is a list of 234 dictionaries (`len(mpg)` confirms this). `values` returns the entries of the first row.

In [2]:
mpg[0].values()

odict_values(['1', 'audi', 'a4', '1.8', '1999', '4', 'auto(l5)', 'f', '18', '29', 'p', 'compact'])

<br>
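Displaying the first dictionary shows the column names (its keys) paired with that row's values; `mpg[0].keys()` would give the column names alone.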

In [3]:
mpg[0]

OrderedDict([('', '1'),
             ('manufacturer', 'audi'),
             ('model', 'a4'),
             ('displ', '1.8'),
             ('year', '1999'),
             ('cyl', '4'),
             ('trans', 'auto(l5)'),
             ('drv', 'f'),
             ('cty', '18'),
             ('hwy', '29'),
             ('fl', 'p'),
             ('class', 'compact')])

<br>
This is how to find the average cty fuel economy across all cars. All values in the dictionaries are strings, so we need to convert to float.

In [4]:
sum(float(d['cty']) for d in mpg) / len(mpg)

16.86

<br>
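`mpg` is a list of dictionaries, so a column cannot be read as an attribute of the list; the attempt below raises an `AttributeError`.

In [7]:
mpg.d['cty']

AttributeError: 'list' object has no attribute 'd'

<br>
Similarly, this is how to find the average hwy fuel economy across all cars.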

In [2]:
sum(float(d['hwy']) for d in mpg) / len(mpg)

23.44
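
<br>
The two averages follow the same pattern, so they can share a helper. The sketch below is self-contained: `column_mean` is a hypothetical helper, and the three-row sample read from `io.StringIO` stands in for `mpg.csv`.

```python
import csv
import io

# Hypothetical three-row sample standing in for mpg.csv
sample = io.StringIO(
    'manufacturer,cty,hwy\n'
    'audi,18,29\n'
    'audi,21,29\n'
    'audi,20,31\n'
)
rows = list(csv.DictReader(sample))

def column_mean(records, column):
    # Values arrive as strings, so convert each one before summing
    return sum(float(r[column]) for r in records) / len(records)

cty_mean = column_mean(rows, 'cty')
hwy_mean = column_mean(rows, 'hwy')
print(round(cty_mean, 2), round(hwy_mean, 2))
```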

<br>
Use `set` to return the unique values for the number of cylinders the cars in our dataset have.

In [3]:
cylinders = set(d['cyl'] for d in mpg)
cylinders

{'4', '5', '6', '8'}

<br>
Here's a more complex example where we are grouping the cars by number of cylinders, and finding the average cty mpg for each group.

In [4]:
CtyMpgByCyl = []

for c in cylinders: # iterate over all the cylinder levels
    summpg = 0
    cyltypecount = 0
    for d in mpg: # iterate over all dictionaries
        if d['cyl'] == c: # if the cylinder level type matches,
            summpg += float(d['cty']) # add the cty mpg
            cyltypecount += 1 # increment the count
    CtyMpgByCyl.append((c, summpg / cyltypecount)) # append the tuple ('cylinder', 'avg mpg')

CtyMpgByCyl.sort(key=lambda x: x[0])
CtyMpgByCyl

[('4', 21.01), ('5', 20.50), ('6', 16.22), ('8', 12.57)]
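
<br>
The nested loop above scans the whole list once per cylinder level. One alternative, sketched here with `collections.defaultdict`, groups the values in a single pass; the `records` list is made-up data in the same shape as the rows of `mpg`.

```python
from collections import defaultdict

# Hypothetical records shaped like the dictionaries in mpg
records = [
    {'cyl': '4', 'cty': '18'},
    {'cyl': '4', 'cty': '22'},
    {'cyl': '6', 'cty': '16'},
    {'cyl': '8', 'cty': '12'},
    {'cyl': '8', 'cty': '14'},
]

groups = defaultdict(list)
for d in records:
    groups[d['cyl']].append(float(d['cty']))   # collect cty values per cylinder count

avg_by_cyl = sorted((c, sum(v) / len(v)) for c, v in groups.items())
print(avg_by_cyl)   # [('4', 20.0), ('6', 16.0), ('8', 13.0)]
```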

<br>
Use `set` to return the unique values for the class types in our dataset.

In [5]:
vehicleclass = set(d['class'] for d in mpg) # what are the class types
vehicleclass

{'2seater', 'compact', 'midsize', 'minivan', 'pickup', 'subcompact', 'suv'}

<br>
And here's an example of how to find the average hwy mpg for each class of vehicle in our dataset.

In [6]:
HwyMpgByClass = []

for t in vehicleclass: # iterate over all the vehicle classes
    summpg = 0
    vclasscount = 0
    for d in mpg: # iterate over all dictionaries
        if d['class'] == t: # if the vehicle class matches,
            summpg += float(d['hwy']) # add the hwy mpg
            vclasscount += 1 # increment the count
    HwyMpgByClass.append((t, summpg / vclasscount)) # append the tuple ('class', 'avg mpg')

HwyMpgByClass.sort(key=lambda x: x[1])
HwyMpgByClass

[('pickup', 16.88),
 ('suv', 18.13),
 ('minivan', 22.36),
 ('2seater', 24.80),
 ('midsize', 27.29),
 ('subcompact', 28.14),
 ('compact', 28.30)]

<br>
# The Python Programming Language: Dates and Times

In [7]:
import datetime as dt
import time as tm

<br>
`time` returns the current time in seconds since the Epoch (January 1st, 1970).

In [8]:
tm.time()

1595538432.44

<br>
Convert the timestamp to datetime.

In [9]:
dtnow = dt.datetime.fromtimestamp(tm.time())
dtnow

datetime.datetime(2020, 7, 23, 21, 7, 40, 15128)

<br>
Handy datetime attributes:

In [10]:
dtnow.year, dtnow.month, dtnow.day, dtnow.hour, dtnow.minute, dtnow.second # get year, month, day, etc. from a datetime

(2020, 7, 23, 21, 7, 40)

In [16]:
dtnow.hour

21

<br>
`timedelta` is a duration expressing the difference between two dates.

In [11]:
delta = dt.timedelta(days = 100) # create a timedelta of 100 days
delta

datetime.timedelta(100)

<br>
`date.today` returns the current local date.

In [13]:
today = dt.date.today()

In [14]:
today

datetime.date(2020, 7, 23)

In [17]:
today - delta # the date 100 days ago

datetime.date(2020, 4, 14)

In [18]:
today > today-delta # compare dates

True
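
<br>
Subtracting one date from another goes the other way and yields a `timedelta`. The sketch uses fixed dates (the two shown in the outputs above) so that the result is reproducible.

```python
import datetime as dt

start = dt.date(2020, 4, 14)
end = dt.date(2020, 7, 23)

gap = end - start            # subtracting two dates gives a timedelta
print(gap.days)              # 100
print(start + gap == end)    # True
print(end.isoformat())       # 2020-07-23
```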

<br>
# The Python Programming Language: Objects and map()

<br>
An example of a class in Python:

In [28]:
class Person:
    department = 'School of Information' #a class variable

    def set_name(self, new_name): #a method
        self.name = new_name
    def set_location(self, new_location):
        self.location = new_location

In [29]:
person = Person()
person.set_name('Christopher Brooks')
person.set_location('Ann Arbor, MI, USA')
print('{} lives in {} and works in the department {}'.format(person.name, person.location, person.department))

Christopher Brooks lives in Ann Arbor, MI, USA and works in the department School of Information


<br>
Here's an example of mapping the `min` function between two lists.

In [32]:
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]
cheapest = map(min, store1, store2)
cheapest

<map at 0x7fddecc77320>

<br>
Now let's iterate through the map object to see the values.

In [33]:
for it in cheapest:
    print(it)

9.0
11.0
12.34
2.01
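
<br>
A `map` object is a lazy iterator, so it can be consumed only once. This sketch shows the effect by converting the same map object to a list twice.

```python
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]

cheapest = map(min, store1, store2)
first_pass = list(cheapest)    # consumes the iterator
second_pass = list(cheapest)   # nothing is left to yield

print(first_pass)    # [9.0, 11.0, 12.34, 2.01]
print(second_pass)   # []
```

If the values are needed more than once, keeping the list from the first pass is the simplest option.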


<br>
# The Python Programming Language: Lambda and List Comprehensions

<br>
Here's an example of lambda that takes in three parameters and adds the first two.

In [None]:
my_function = lambda a, b, c : a + b

In [None]:
my_function(1, 2, 3)

In [2]:
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson', 'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']
myFunction = lambda person: person.split()[0] + ' ' + person.split()[-1]  # title plus last name
print(people[0])
print(myFunction(people[0]))


Dr. Christopher Brooks
Dr. Brooks


<br>
Let's iterate from 0 to 999 and return the even numbers.

In [None]:
my_list = []
for number in range(0, 1000):
    if number % 2 == 0:
        my_list.append(number)
my_list

<br>
Now the same thing but with list comprehension.

In [None]:
my_list = [number for number in range(0,1000) if number % 2 == 0]
my_list
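
<br>
A comprehension may contain several `for` clauses; they read left to right, with the outermost loop first. The sketch below compares a two-clause comprehension with the equivalent nested loops. The names `evens`, `pairs` and `loop_pairs` are chosen for this example.

```python
# Filter: the trailing `if` plays the role of the `if` inside the loop
evens = [number for number in range(0, 1000) if number % 2 == 0]

# Nested: `for` clauses read left to right, outermost loop first
pairs = [(i, j) for i in range(2) for j in range(3)]

loop_pairs = []
for i in range(2):
    for j in range(3):
        loop_pairs.append((i, j))

print(len(evens))            # 500
print(pairs == loop_pairs)   # True
```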

In [3]:
def times_tables():
    lst = []
    for i in range(10):
        for j in range (10):
            lst.append(i*j)
    return lst

times_tables()

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
 0, 2, 4, 6, 8, 10, 12, 14, 16, 18,
 0, 3, 6, 9, 12, 15, 18, 21, 24, 27,
 0, 4, 8, 12, 16, 20, 24, 28, 32, 36,
 0, 5, 10, 15, 20, 25, 30, 35, 40, 45,
 0, 6, 12, 18, 24, 30, 36, 42, 48, 54,
 0, 7, 14, 21, 28, 35, 42, 49, 56, 63,
 0, 8, 16, 24, 32, 40, 48, 56, 64, 72,
 0, 9, 18, 27, 36, 45, 54, 63, 72, 81]

In [1]:
[i*j for j in range(10) for i in range(10)]

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
 0, 2, 4, 6, 8, 10, 12, 14, 16, 18,
 0, 3, 6, 9, 12, 15, 18, 21, 24, 27,
 0, 4, 8, 12, 16, 20, 24, 28, 32, 36,
 0, 5, 10, 15, 20, 25, 30, 35, 40, 45,
 0, 6, 12, 18, 24, 30, 36, 42, 48, 54,
 0, 7, 14, 21, 28, 35, 42, 49, 56, 63,
 0, 8, 16, 24, 32, 40, 48, 56, 64, 72,
 0, 9, 18, 27, 36, 45, 54, 63, 72, 81]

In [2]:
lowercase = 'abcdefghijklmnopqrstuvwxyz'
digits = '0123456789'

[let1 + let2 for let1 in lowercase for let2 in lowercase]

['aa',
 'ab',
 'ac',
 'ad',
 'ae',
 ...]

In [1]:
lowercase = 'abcdefghijklmnopqrstuvwxyz'
digits = '0123456789'

for let1 in lowercase:
    for let2 in lowercase:
        for num1 in digits:
            print(let1 + let2 + num1)

aa0
aa1
aa2
aa3
aa4
...

In [2]:
lowercase = 'abcdefghijklmnopqrstuvwxyz'
digits = '0123456789'

[let1 + let2 + num1 + num2 for let1 in lowercase for let2 in lowercase for num1 in digits for num2 in digits]

['aa00',
 'aa01',
 'aa02',
 'aa03',
 'aa04',
 ...]

<br>
# The Python Programming Language: Numerical Python (NumPy)

In [1]:
import numpy as np

<br>
## Creating Arrays

Create a list and convert it to a NumPy array.

In [28]:
mylist = [1, 2, 3]
x = np.array(mylist)
x

array([1, 2, 3])

<br>
Or just pass in a list directly

In [29]:
y = np.array([4, 5, 6])
y

array([4, 5, 6])

<br>
Pass in a list of lists to create a multidimensional array.

In [30]:
m = np.array([[7, 8, 9], [10, 11, 12]])
m

array([[ 7,  8,  9],
       [10, 11, 12]])

<br>
Use the `shape` attribute to find the dimensions of the array. (rows, columns)

In [31]:
m.shape

(2, 3)

<br>
`arange` returns evenly spaced values within a given interval.

In [32]:
n = np.arange(0, 30, 2) # start at 0 count up by 2, stop before 30
n

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])

<br>
`reshape` returns an array with the same data with a new shape.

In [33]:
n = n.reshape(3, 5) # reshape array to be 3x5
n

array([[ 0,  2,  4,  6,  8],
       [10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28]])
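
<br>
`reshape` needs the new shape to hold exactly the same number of elements. As a sketch of that rule: passing `-1` for one dimension asks NumPy to infer it, and an incompatible shape raises a `ValueError`.

```python
import numpy as np

n = np.arange(0, 30, 2)      # 15 values
a = n.reshape(3, -1)         # -1 asks NumPy to infer the remaining dimension
print(a.shape)               # (3, 5)

try:
    n.reshape(4, 4)          # 16 slots cannot hold 15 values
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print('ValueError:', err)
```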

<br>
`linspace` returns evenly spaced numbers over a specified interval.

In [None]:
o = np.linspace(0, 4, 9) # return 9 evenly spaced values from 0 to 4
o

<br>
`resize` changes the shape and size of array in-place.

In [None]:
o.resize(3, 3)
o

<br>
`ones` returns a new array of given shape and type, filled with ones.

In [None]:
np.ones((3, 2))

<br>
`zeros` returns a new array of given shape and type, filled with zeros.

In [None]:
np.zeros((2, 3))

<br>
`eye` returns a 2-D array with ones on the diagonal and zeros elsewhere.

In [2]:
np.eye(3)

array([[ 1.,  0.,  0.],
       [ 0.,  1.,  0.],
       [ 0.,  0.,  1.]])

<br>
`diag` extracts a diagonal or constructs a diagonal array.

In [None]:
np.diag(y)

<br>
Create an array using a repeating list (or see `np.tile`).

In [None]:
np.array([1, 2, 3] * 3)

<br>
Repeat elements of an array using `repeat`.

In [None]:
np.repeat([1, 2, 3], 3)

<br>
## Combining Arrays

In [None]:
p = np.ones([2, 3], int)
p

<br>
Use `vstack` to stack arrays in sequence vertically (row wise).

In [None]:
np.vstack([p, 2*p])

<br>
Use `hstack` to stack arrays in sequence horizontally (column wise).

In [None]:
np.hstack([p, 2*p])
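
<br>
The resulting shapes make the difference between the two concrete. This sketch also checks that, for 2-D inputs, the results match `np.concatenate` along axis 0 and axis 1.

```python
import numpy as np

p = np.ones([2, 3], int)

v = np.vstack([p, 2 * p])   # rows are appended: 2 + 2 rows, 3 columns
h = np.hstack([p, 2 * p])   # columns are appended: 2 rows, 3 + 3 columns
print(v.shape)              # (4, 3)
print(h.shape)              # (2, 6)

same_v = np.array_equal(v, np.concatenate([p, 2 * p], axis=0))
same_h = np.array_equal(h, np.concatenate([p, 2 * p], axis=1))
print(same_v, same_h)       # True True
```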

<br>
## Operations

Use `+`, `-`, `*`, `/` and `**` to perform element wise addition, subtraction, multiplication, division and power.

In [None]:
print(x + y) # elementwise addition     [1 2 3] + [4 5 6] = [5  7  9]
print(x - y) # elementwise subtraction  [1 2 3] - [4 5 6] = [-3 -3 -3]

In [None]:
print(x * y) # elementwise multiplication  [1 2 3] * [4 5 6] = [4  10  18]
print(x / y) # elementwise division        [1 2 3] / [4 5 6] = [0.25  0.4  0.5]

In [None]:
print(x**2) # elementwise power  [1 2 3] ^2 =  [1 4 9]

<br>
**Dot Product:**  

$\begin{bmatrix}x_1 & x_2 & x_3\end{bmatrix}
\cdot
\begin{bmatrix}y_1 \\ y_2 \\ y_3\end{bmatrix}
= x_1 y_1 + x_2 y_2 + x_3 y_3$

In [None]:
x.dot(y) # dot product  1*4 + 2*5 + 3*6
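
<br>
The formula can be checked by hand. This sketch computes the same sum of products three ways, assuming the `x` and `y` arrays defined earlier in the notebook.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

manual = sum(xi * yi for xi, yi in zip(x, y))   # 1*4 + 2*5 + 3*6
print(manual)            # 32
print(x.dot(y))          # 32
print((x * y).sum())     # 32
```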

In [None]:
z = np.array([y, y**2])
print(len(z)) # number of rows of array

<br>
Let's look at transposing arrays. Transposing permutes the dimensions of the array.

In [None]:
z = np.array([y, y**2])
z

<br>
The shape of array `z` is `(2,3)` before transposing.

In [None]:
z.shape

<br>
Use `.T` to get the transpose.

In [None]:
z.T

<br>
The number of rows has swapped with the number of columns.

In [None]:
z.T.shape

<br>
Use `.dtype` to see the data type of the elements in the array.

In [None]:
z.dtype

<br>
Use `.astype` to cast to a specific type.

In [None]:
z = z.astype('f')
z.dtype

<br>
## Math Functions

NumPy has many built-in math functions that can be performed on arrays.

In [None]:
a = np.array([-4, -2, 1, 3, 5])

In [None]:
a.sum()

In [None]:
a.max()

In [None]:
a.min()

In [None]:
a.mean()

In [None]:
a.std()

<br>
`argmax` and `argmin` return the index of the maximum and minimum values in the array.

In [None]:
a.argmax()

In [None]:
a.argmin()

<br>
## Indexing / Slicing

In [3]:
s = np.arange(13)**2
s

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121, 144])

<br>
Use bracket notation to get the value at a specific index. Remember that indexing starts at 0.

In [4]:
s[0], s[4], s[-1]

(0, 16, 144)

<br>
Use `:` to indicate a range. `array[start:stop]`


Leaving `start` or `stop` empty will default to the beginning/end of the array.

In [5]:
s[1:5]

array([ 1,  4,  9, 16])

<br>
Use negatives to count from the back.

In [6]:
s[-4:]

array([ 81, 100, 121, 144])

<br>
A second `:` can be used to indicate step-size. `array[start:stop:stepsize]`

Here we are starting at the 5th element from the end, and counting backwards by 2 until the beginning of the array is reached.

In [7]:
s[-5::-2]

array([64, 36, 16,  4,  0])

<br>
Let's look at a multidimensional array.

In [8]:
r = np.arange(36)
r.resize((6, 6))
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

<br>
Use bracket notation to slice: `array[row, column]`

In [None]:
r[2, 2]

<br>
And use `:` to select a range of rows or columns.

In [9]:
r[3, 3:6]

array([21, 22, 23])

<br>
Here we are selecting all the rows up to (and not including) row 2, and all the columns up to (and not including) the last column.

In [15]:
r[:2, :-1]

array([[ 0,  1,  2,  3,  4],
       [ 6,  7,  8,  9, 10]])

<br>
This is a slice of the last row, and only every other element.

In [16]:
r[-1, ::2]

array([30, 32, 34])

<br>
We can also perform conditional indexing. Here we are selecting values from the array that are greater than 30. (Also see `np.where`)

In [17]:
r[r > 30]

array([31, 32, 33, 34, 35])

<br>
Here we are assigning all values in the array that are greater than 30 to the value of 30.

In [18]:
r[r > 30] = 30
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])
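
<br>
The assignment above modifies `r` in place. When the original should be kept, `np.where` (mentioned earlier) builds a new array instead. A sketch, with `capped` as a name chosen for this example:

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

capped = np.where(r > 30, 30, r)   # new array; r itself is left unchanged
print(capped[-1])                  # [30 30 30 30 30 30]
print(r[-1])                       # [30 31 32 33 34 35]

same_as_minimum = np.array_equal(capped, np.minimum(r, 30))
print(same_as_minimum)             # True
```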

<br>
## Copying Data

Be careful with copying and modifying arrays in NumPy!


`r2` is a slice of `r`

In [None]:
r2 = r[:3,:3]
r2

<br>
Set this slice's values to zero ([:] selects the entire array)

In [None]:
r2[:] = 0
r2

<br>
`r` has also been changed!

In [None]:
r

<br>
To avoid this, use `r.copy` to create a copy that will not affect the original array

In [None]:
r_copy = r.copy()
r_copy

<br>
Now when r_copy is modified, r will not be changed.

In [None]:
r_copy[:] = 10
print(r_copy, '\n')
print(r)
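
<br>
Whether two arrays share data can be checked directly with `np.shares_memory`. This sketch contrasts a slice (a view) with an explicit copy; `view` and `independent` are names chosen for this example.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

view = r[:3, :3]                  # basic slicing returns a view
independent = r[:3, :3].copy()    # copy owns its own data

print(np.shares_memory(r, view))          # True
print(np.shares_memory(r, independent))   # False

independent[:] = -1
after_copy_write = int(r[0, 0])   # 0, the original is untouched
view[:] = -1
after_view_write = int(r[0, 0])   # -1, writing through the view changes r
print(after_copy_write, after_view_write)
```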

In [None]:
r[::7]

In [22]:
['a', 'b', 'c'] + [1, 2, 3]

['a', 'b', 'c', 1, 2, 3]

<br>
## Iterating Over Arrays

In [21]:
type(lambda x: x+1) 


function

Let's create a new 4 by 3 array of random numbers 0-9.

In [23]:
test = np.random.randint(0, 10, (4,3))
test

array([[7, 0, 7],
       [0, 6, 1],
       [1, 7, 2],
       [4, 3, 7]])

In [37]:
r=np.arange(0,36,1)
r

array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33,
       34, 35])

In [42]:
r=r.reshape(6,6)
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

In [51]:
r.reshape(36)[::7]

array([ 0,  7, 14, 21, 28, 35])

<br>
Iterate by row:

In [27]:
for row in test:
    print(row)

[7 0 7]
[0 6 1]
[1 7 2]
[4 3 7]


<br>
Iterate by index:

In [None]:
for i in range(len(test)):
    print(test[i])

<br>
Iterate by row and index:

In [None]:
for i, row in enumerate(test):
    print('row', i, 'is', row)

<br>
Use `zip` to iterate over multiple iterables.

In [None]:
test2 = test**2
test2

In [None]:
for i, j in zip(test, test2):
    print(i,'+',j,'=',i+j)
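
<br>
The loop above adds the arrays one row pair at a time. The same result comes from a single elementwise expression, as this sketch checks; `test` is given fixed values here (those shown in the earlier output) so that the run is reproducible.

```python
import numpy as np

test = np.array([[7, 0, 7],
                 [0, 6, 1],
                 [1, 7, 2],
                 [4, 3, 7]])   # fixed values instead of random ones
test2 = test**2

row_sums = [i + j for i, j in zip(test, test2)]   # one row pair at a time
matches = np.array_equal(np.array(row_sums), test + test2)
print(matches)   # True
```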