In this notebook, we provide a very basic overview of the Python programming language. If you have a background in programming, this overview should be enough to help you succeed in the rest of this course and the following courses in this specialization. If you don't have programming experience or feel that the content is overly challenging, we encourage you to stop and take a look at [How to Get Started With Python?](https://www.programiz.com/python-programming/first-program).

***

A common surprise for some programmers coming from a Java or C background is that Python is a dynamically typed language, similar to languages like JavaScript. This means that when you declare a variable, you can assign it to be an integer on one line, and a string on the next line.

Since there is no compilation step, nothing checks types for you before the code runs. You need to either check for the presence of functionality before you use it, or try to use the functionality and catch any errors that occur. Dynamic typing is particularly nice when Python is used interactively, as it lets you quickly set and modify variable contents without having to worry about a declared type for the variable.
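
As a minimal sketch of the two strategies described above, the cell below rebinds one name to values of different types, then shows a "check first" helper and a "try and catch" helper. The function names `describe_length` and `safe_upper` are hypothetical and are not used elsewhere in this notebook.

```python
# A name can be rebound to values of different types.
value = 42
print(type(value).__name__)   # int
value = 'forty-two'
print(type(value).__name__)   # str


def describe_length(obj):
    """Check for the functionality before using it (hypothetical helper)."""
    if hasattr(obj, '__len__'):
        return len(obj)
    return None


def safe_upper(obj):
    """Try the operation and catch the error if it fails (hypothetical helper)."""
    try:
        return obj.upper()
    except AttributeError:
        return None


print(describe_length([1, 2, 3]))  # 3
print(describe_length(10))         # None
print(safe_upper('abc'))           # ABC
print(safe_upper(10))              # None
```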

# The Python Programming Language: Functions

<br>
`add_numbers` is a function that takes two numbers and adds them together.

In [2]:
def add_numbers(x, y):
    return x + y

In [3]:
add_numbers(1, 2)

3

We can run this cell by pressing Shift+Enter or by clicking the play icon in the toolbar. The output of the statement is printed immediately. If you were using Python in non-interactive mode, nothing would be printed; because we are using it interactively, we see the value right away. Underneath, the browser sends your Python code to a machine in the cloud, which executes the code in a Python 3 interpreter and sends the results back.

<br>
`add_numbers` updated to take an optional 3rd parameter. Using `print` allows printing of multiple expressions within a single cell.

In [52]:
def add_numbers(x,y,z=None):
    if z is None:
        return x+y
    else:
        return x+y+z

print(add_numbers(1, 2))
print(add_numbers(1, 2, 3))

3
6


<br>
`add_numbers` updated to take an optional flag parameter.

In [4]:
def add_numbers(x, y, z=None, flag=False):
    if flag:
        print('Flag is true!')
    if z is None:
        return x + y
    else:
        return x + y + z
    


In [8]:
add_numbers(1, 2, flag=True)

Flag is true!


3
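
To illustrate how the optional parameters behave, here is a small sketch that redefines the same function and calls it with the optional arguments skipped, passed by name, and passed by name in a different order. It assumes nothing beyond the definition shown above.

```python
def add_numbers(x, y, z=None, flag=False):
    if flag:
        print('Flag is true!')
    if z is None:
        return x + y
    return x + y + z


# Optional parameters can be left out entirely ...
print(add_numbers(1, 2))                  # 3
# ... or passed by name, skipping the ones you do not need ...
print(add_numbers(1, 2, z=3))             # 6
# ... and keyword arguments may appear in any order.
print(add_numbers(1, 2, flag=True, z=3))  # prints 'Flag is true!' and then 6
```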

<br>
Assign function `add_numbers` to variable `a`.

In [9]:
def add_numbers(x,y):
    return x+y

a = add_numbers
a(1,2)

3
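
Because a function is an ordinary object, it can also be passed to another function. The sketch below shows this; `apply_to_pair` is a hypothetical helper introduced only for illustration.

```python
def add_numbers(x, y):
    return x + y


def apply_to_pair(func, x, y):
    """Call whatever function is passed in (hypothetical helper)."""
    return func(x, y)


a = add_numbers
print(a is add_numbers)          # True: both names refer to the same function object
print(apply_to_pair(a, 1, 2))    # 3
print(apply_to_pair(max, 1, 2))  # 2
```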

# The Python Programming Language: Types and Sequences

<br>
Use `type` to return the object's type.

In [10]:
type('This is a string')

str

In [11]:
type(None)

NoneType

In [12]:
type(1)

int

In [13]:
type(1.0)

float

In [14]:
type(add_numbers)

function

Tuples are an immutable data structure (cannot be altered).

In [15]:
x = (1, 'a', 2, 'b')
type(x)

tuple

Lists are a mutable data structure.

In [16]:
x = [1, 'a', 2, 'b']
type(x)

list
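
The practical difference between the two structures can be sketched as follows: assigning to an element of a list works, while the same assignment on a tuple raises a `TypeError`. The variable names are chosen for illustration only.

```python
items_list = [1, 'a', 2, 'b']
items_tuple = (1, 'a', 2, 'b')

# Lists are mutable: element assignment changes the list in place.
items_list[0] = 99
print(items_list)  # [99, 'a', 2, 'b']

# Tuples are immutable: the same assignment raises a TypeError.
try:
    items_tuple[0] = 99
    tuple_was_modified = True
except TypeError as err:
    tuple_was_modified = False
    print('Cannot modify a tuple:', err)

print(items_tuple)  # (1, 'a', 2, 'b')
```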

<br>
Use `append` to append an object to a list.

In [17]:
x.append(3.3)
print(x)

[1, 'a', 2, 'b', 3.3]


<br>
This is an example of how to loop through each item in the list.

In [18]:
for item in x:
    print(item)

1
a
2
b
3.3


<br>
Or using the indexing operator:

In [20]:
i=0
while i < len(x):
    print(i)
    print(x[i])
    i = i + 1

0
1
1
a
2
2
3
b
4
3.3


<br>
Use `+` to concatenate lists.

In [21]:
[1,2] + [3,4]

[1, 2, 3, 4]

<br>
Use `*` to repeat lists.

In [22]:
[1,2,3]*3

[1, 2, 3, 1, 2, 3, 1, 2, 3]

<br>
Use the `in` operator to check if something is inside a list.

In [23]:
1 in [1, 2, 3]

True

<br>
Now let's look at strings. Use bracket notation to slice a string.

In [24]:
x = 'This is a string'
print(x[0]) #first character
print(x[0:1]) #first character, but we have explicitly set the end character
print(x[0:2]) #first two characters


T
T
Th


<br>
This will return the last element of the string.

In [25]:
x[-1]

'g'

<br>
This will return the slice starting from the 4th element from the end and stopping before the 2nd element from the end.

In [26]:
x[-4:-2]

'ri'

<br>
This is a slice from the beginning of the string, stopping before the 4th element (index 3).

In [27]:
x[:3]

'Thi'

<br>
And this is a slice starting from the 4th element of the string and going all the way to the end.

In [28]:
x[3:]

's is a string'

In [29]:
firstname = 'Yoshua'
lastname = 'Bengio'


print(firstname + ' ' + lastname)
print(firstname*3)
print('Yosh' in firstname)


Yoshua Bengio
YoshuaYoshuaYoshua
True


<br>
`split` returns a list of all the words in a string, or a list split on a specific character.

In [30]:
firstname = 'Yoshua Samuel Abraham Bengio'.split(' ')[0] # [0] selects the first element of the list
lastname = 'Yoshua Samuel Abraham Bengio'.split(' ')[-1] # [-1] selects the last element of the list
print(firstname)
print(lastname)

Yoshua
Bengio
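
A short sketch of the two ways to call `split`: with no argument it splits on any run of whitespace, and with an argument it splits on exactly that separator. The `record` and `spaced` strings are made-up examples.

```python
full_name = 'Yoshua Samuel Abraham Bengio'
print(full_name.split())      # ['Yoshua', 'Samuel', 'Abraham', 'Bengio']

record = 'audi,a4,1999'       # made-up comma-separated example
print(record.split(','))      # ['audi', 'a4', '1999']

# With an explicit separator, repeated separators produce empty strings.
spaced = 'a  b'               # two spaces between the letters
print(spaced.split(' '))      # ['a', '', 'b']
print(spaced.split())         # ['a', 'b']
```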


<br>
Make sure you convert objects to strings before concatenating.

In [31]:
'Yosh' + 2

TypeError: must be str, not int

In [32]:
'Yosh' + str(2)

'Yosh2'

<br>
Dictionaries associate keys with values.

In [33]:
x = {'Yoshua Bengio': 'Yoshua.Bengio@umontreal.ca', 'Bill Gates': 'billg@microsoft.com'}



In [34]:
x['Yoshua Bengio'] # Retrieve a value by using the indexing operator

'Yoshua.Bengio@umontreal.ca'

In [36]:
x['Avi Bernstein'] = None
print(x['Avi Bernstein'])

None


<br>
Iterate over all of the keys:

In [37]:
for name in x:
    print(x[name])

Yoshua.Bengio@umontreal.ca
billg@microsoft.com
None


<br>
Iterate over all of the values:

In [38]:
for email in x.values():
    print(email)

Yoshua.Bengio@umontreal.ca
billg@microsoft.com
None


<br>
Iterate over all of the key-value pairs in the dictionary:

In [39]:
for name, email in x.items():
    print(name)
    print(email)

Yoshua Bengio
Yoshua.Bengio@umontreal.ca
Bill Gates
billg@microsoft.com
Avi Bernstein
None


<br>
You can unpack a sequence into different variables:

In [40]:
x = ('Yoshua', 'Bengio', 'Yoshua.Bengio@umontreal.ca')
fname, lname, email = x

In [41]:
fname

'Yoshua'

In [42]:
lname

'Bengio'

<br>
Make sure the number of values you are unpacking matches the number of variables being assigned.

In [43]:
x = ('Yoshua', 'Bengio', 'Yoshua.Bengio@umontreal.ca', 'Avi Bernstein')
fname, lname, email = x

ValueError: too many values to unpack (expected 3)
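
If the sequence may hold more values than you have names for, one option is a starred target, which collects the remaining values into a list. This is a sketch of standard Python unpacking syntax; the name `rest` is arbitrary.

```python
x = ('Yoshua', 'Bengio', 'Yoshua.Bengio@umontreal.ca', 'Avi Bernstein')

# The starred name absorbs whatever is left over, so the counts no longer have to match.
fname, lname, *rest = x
print(fname)  # Yoshua
print(lname)  # Bengio
print(rest)   # ['Yoshua.Bengio@umontreal.ca', 'Avi Bernstein']

# Alternatively, provide exactly one name per value.
fname, lname, email, colleague = x
print(colleague)  # Avi Bernstein
```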

# The Python Programming Language: More on Strings

In [44]:
print('Yosh' + 2)

TypeError: must be str, not int

In [45]:
print('Yosh' + str(2))

Yosh2


Python has a built-in method for convenient string formatting.

In [46]:
sales_record = {
'price': 3.24,
'num_items': 4,
'person': 'Yosh'}

sales_statement = '{} bought {} item(s) at a price of {} each for a total of {}'

print(sales_statement.format(sales_record['person'],
                             sales_record['num_items'],
                             sales_record['price'],
                             sales_record['num_items']*sales_record['price']))


Yosh bought 4 item(s) at a price of 3.24 each for a total of 12.96


In [47]:
print("how are you {}?".format('Petra'))

how are you Petra?
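
The same `format` method also accepts named placeholders and format specifications. The sketch below reuses the sales record from above; the placeholder name `total` and the `.2f` precision are choices made for this illustration.

```python
sales_record = {'price': 3.24, 'num_items': 4, 'person': 'Yosh'}
total = sales_record['num_items'] * sales_record['price']

# Named placeholders are filled from keyword arguments;
# ':.2f' formats a number with two digits after the decimal point.
statement = ('{person} bought {num_items} item(s) at a price of '
             '{price:.2f} each for a total of {total:.2f}')
line = statement.format(total=total, **sales_record)
print(line)
# Yosh bought 4 item(s) at a price of 3.24 each for a total of 12.96
```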


# Reading and Writing CSV files

<br>
Let's import our datafile mpg.csv, which contains fuel economy data for 234 cars.

* mpg : miles per gallon
* class : car classification
* cty : city mpg
* cyl : # of cylinders
* displ : engine displacement in liters
* drv : f = front-wheel drive, r = rear-wheel drive, 4 = 4wd
* fl : fuel (e = ethanol E85, d = diesel, r = regular, p = premium, c = CNG)
* hwy : highway mpg
* manufacturer : automobile manufacturer
* model : model of car
* trans : type of transmission
* year : model year

In [48]:
import csv

%precision 2

with open('../data/ds4/mpg.csv') as csvfile:
    mpg = list(csv.DictReader(csvfile))
    
mpg[:3] # The first three dictionaries in our list.

[OrderedDict([('', '1'),
              ('manufacturer', 'audi'),
              ('model', 'a4'),
              ('displ', '1.8'),
              ('year', '1999'),
              ('cyl', '4'),
              ('trans', 'auto(l5)'),
              ('drv', 'f'),
              ('cty', '18'),
              ('hwy', '29'),
              ('fl', 'p'),
              ('class', 'compact')]),
 OrderedDict([('', '2'),
              ('manufacturer', 'audi'),
              ('model', 'a4'),
              ('displ', '1.8'),
              ('year', '1999'),
              ('cyl', '4'),
              ('trans', 'manual(m5)'),
              ('drv', 'f'),
              ('cty', '21'),
              ('hwy', '29'),
              ('fl', 'p'),
              ('class', 'compact')]),
 OrderedDict([('', '3'),
              ('manufacturer', 'audi'),
              ('model', 'a4'),
              ('displ', '2'),
              ('year', '2008'),
              ('cyl', '4'),
              ('trans', 'manual(m6)'),
              ('drv',

<br>
`csv.DictReader` has read in each row of our csv file as a dictionary. `len` shows that our list consists of 234 dictionaries.

In [49]:
len(mpg)

234

<br>
`keys` gives us the column names of our csv.

In [50]:
mpg[0].keys()

odict_keys(['', 'manufacturer', 'model', 'displ', 'year', 'cyl', 'trans', 'drv', 'cty', 'hwy', 'fl', 'class'])

MPG is miles per gallon. The following conversions relate it to metric units:

| Unit | Metric equivalent |
| --- | --- |
| 1 gallon | 3.79 liters (L) |
| 1 mile | 1.61 kilometers (km) |

This is how to find the average city (cty) fuel economy across all cars. All values in the dictionaries are strings, so we need to convert to float. 

In [7]:
sum(float(d['cty']) for d in mpg) / len(mpg)

16.86

Similarly, this is how to find the average highway (hwy) fuel economy across all cars.

In [51]:
type(float(d['hwy']) for d in mpg)

generator

In [99]:
sum(float(d['hwy']) for d in mpg) / len(mpg)

23.44
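
To try the same pattern without access to `mpg.csv`, the sketch below reads a small in-memory CSV with `csv.DictReader`. The three rows are invented for illustration and are not taken from the dataset.

```python
import csv
import io

# Invented sample rows; only the column names mirror mpg.csv.
sample = io.StringIO(
    'manufacturer,model,cty,hwy\n'
    'make_a,model_x,18,29\n'
    'make_a,model_y,21,29\n'
    'make_b,model_z,13,17\n'
)

rows = list(csv.DictReader(sample))

# Every value comes back as a string, so convert before doing arithmetic.
print(type(rows[0]['cty']).__name__)  # str

avg_cty = sum(float(d['cty']) for d in rows) / len(rows)
avg_hwy = sum(float(d['hwy']) for d in rows) / len(rows)
print(round(avg_cty, 2))  # 17.33
print(round(avg_hwy, 2))  # 25.0
```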

<br>
Use `set` to return the unique values for the number of cylinders the cars in our dataset have.

In [55]:
cylinders = set([d['cyl'] for d in mpg])
cylinders

{'4', '5', '6', '8'}

<br>
Here's a more complex example where we group the cars by number of cylinders and find the average cty mpg for each group.

In [56]:
# city miles per gallon by cylinder
CtyMpgByCyl = []

for c in cylinders: # iterate over all the cylinder levels
    summpg = 0
    cyltypecount = 0
    for d in mpg: # iterate over all dictionaries
        if d['cyl'] == c: # if the cylinder level type matches,
            summpg += float(d['cty']) # add the cty mpg
            cyltypecount += 1 # increment the count
    CtyMpgByCyl.append((c, summpg / cyltypecount)) # append the tuple ('cylinder', 'avg mpg')

CtyMpgByCyl.sort(key=lambda x: x[0])
CtyMpgByCyl

[('4', 21.01), ('5', 20.50), ('6', 16.22), ('8', 12.57)]
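
The nested loop above scans the whole list once per cylinder level. As an alternative sketch (not the approach used in this notebook), the grouping can be done in a single pass with `collections.defaultdict`. The rows below are invented so that the cell runs on its own.

```python
from collections import defaultdict

# Invented sample rows with the same keys as the dictionaries in mpg.
rows = [
    {'cyl': '4', 'cty': '18'},
    {'cyl': '4', 'cty': '22'},
    {'cyl': '6', 'cty': '16'},
    {'cyl': '8', 'cty': '12'},
    {'cyl': '6', 'cty': '18'},
]

totals = defaultdict(float)  # running sum of cty mpg per cylinder level
counts = defaultdict(int)    # number of cars per cylinder level

for d in rows:               # one pass over the data
    totals[d['cyl']] += float(d['cty'])
    counts[d['cyl']] += 1

avg_by_cyl = sorted((c, totals[c] / counts[c]) for c in totals)
print(avg_by_cyl)  # [('4', 20.0), ('6', 17.0), ('8', 12.0)]
```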

<br>
Use `set` to return the unique values for the class types in our dataset.

In [57]:
vehicleclass = set(d['class'] for d in mpg) # what are the class types
vehicleclass

{'2seater', 'compact', 'midsize', 'minivan', 'pickup', 'subcompact', 'suv'}

<br>
And here's an example of how to find the average hwy mpg for each class of vehicle in our dataset.

In [58]:
HwyMpgByClass = []

for t in vehicleclass: # iterate over all the vehicle classes
    summpg = 0
    vclasscount = 0
    for d in mpg: # iterate over all dictionaries
        if d['class'] == t: # if the vehicle class matches,
            summpg += float(d['hwy']) # add the hwy mpg
            vclasscount += 1 # increment the count
    HwyMpgByClass.append((t, summpg / vclasscount)) # append the tuple ('class', 'avg mpg')

HwyMpgByClass.sort(key=lambda x: x[1])
HwyMpgByClass

[('pickup', 16.88),
 ('suv', 18.13),
 ('minivan', 22.36),
 ('2seater', 24.80),
 ('midsize', 27.29),
 ('subcompact', 28.14),
 ('compact', 28.30)]

# The Python Programming Language: Dates and Times

In [59]:
import datetime as dt
import time as tm

<br>
`time` returns the current time in seconds since the Epoch (January 1st, 1970).

In [60]:
tm.time()

1583757393.13

<br>
Convert the timestamp to datetime.

In [61]:
dtnow = dt.datetime.fromtimestamp(tm.time())
dtnow

datetime.datetime(2020, 3, 9, 12, 37, 2, 8369)

<br>
Handy datetime attributes:

In [62]:
dtnow.year, dtnow.month, dtnow.day, dtnow.hour, dtnow.minute, dtnow.second # get year, month, day, etc. from a datetime

(2020, 3, 9, 12, 37, 2)

<br>
`timedelta` is a duration expressing the difference between two dates.

In [63]:
delta = dt.timedelta(days = 100) # create a timedelta of 100 days
delta

datetime.timedelta(100)

<br>
`date.today` returns the current local date.

In [64]:
today = dt.date.today()

In [65]:
today - delta # the date 100 days ago

datetime.date(2019, 11, 30)

In [111]:
today > today-delta # compare dates

True
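
Since `date.today()` changes from day to day, the sketch below repeats the arithmetic with a fixed date (the one shown in the outputs above) so that the results are reproducible. It also shows that subtracting two dates gives back a `timedelta`.

```python
import datetime as dt

start = dt.date(2020, 3, 9)       # fixed date instead of dt.date.today()
delta = dt.timedelta(days=100)

earlier = start - delta           # date minus timedelta gives a date
print(earlier.isoformat())        # 2019-11-30

gap = start - earlier             # date minus date gives a timedelta
print(gap.days)                   # 100
print(start > earlier)            # True
```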

# The Python Programming Language: Objects and map()

<br>
An example of a class in Python:

In [22]:
class Person:
    department = 'School of Computer Science' #a class variable

    def set_name(self, new_name): #a method
        self.name = new_name
    def set_location(self, new_location):
        self.location = new_location

To define a method, you write it just as you would a function. The one change is that, to have access to the instance on which the method is invoked, you must include `self` as the first parameter in the method signature.

Similarly, if you want to refer to instance variables set on the object, you prefix them with `self` followed by a full stop, as in `self.name`.

In this definition of a person, for instance, we have written two methods, `set_name` and `set_location`. Both set instance variables, called `name` and `location` respectively.

In [114]:
person = Person()
person.set_name('Yoshua Bengio')
person.set_location('Montreal, Québec, Canada')
print('{} lives in {} and works in the department {}'.format(person.name, person.location, person.department))

Yoshua Bengio lives in Montreal, Québec, Canada and works in the department School of Computer Science
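
To make the distinction between class variables and instance variables concrete, here is a small sketch with two instances. The names 'Ada' and 'Grace' and the replacement department string are invented for illustration.

```python
class Person:
    department = 'School of Computer Science'  # class variable, shared by all instances

    def set_name(self, new_name):
        self.name = new_name                   # instance variable, one per object


first = Person()
second = Person()
first.set_name('Ada')      # invented names
second.set_name('Grace')

# Each instance keeps its own name ...
print(first.name, second.name)  # Ada Grace

# ... while the department lives on the class, not on the instances.
print('department' in vars(first))  # False

# Rebinding the class variable is therefore visible through every instance.
Person.department = 'School of Data Science'  # invented value
print(first.department)   # School of Data Science
print(second.department)  # School of Data Science
```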


Here's an example of mapping the `min` function between two lists.

In [1]:
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]
cheapest = map(min, store1, store2)
cheapest



<map at 0x7feb3facfb38>

In [2]:
people = ['Prof. Siegfried Handschuh', 'Prof. Simon Mayer', 'Prof. Barbara Weber']

def split_title_and_name(person):
    title = person.split(' ')[0]      # first word, e.g. 'Prof.'
    lastname = person.split(' ')[-1]  # last word, e.g. 'Handschuh'
    return '{} {}'.format(title, lastname)

list(map(split_title_and_name, people))

['Prof. Handschuh', 'Prof. Mayer', 'Prof. Weber']

Python's `map()` function applies a function (in this case the `min` function) to all the elements of the specified iterables and returns a map object. A map object is an iterator, so we can iterate over its elements. We can also convert a map object to a sequence such as a list or tuple.

<br>
Now let's iterate through the map object to see the values.

In [24]:
for item in cheapest:
    print(item)

9.0
11.0
12.34
2.01
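
One consequence of a map object being an iterator is that it can only be consumed once. The sketch below builds the map from both store lists and converts it to a list twice to show this.

```python
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]

# min is called with one element from each list at a time.
cheapest = map(min, store1, store2)

first_pass = list(cheapest)   # consumes the iterator
second_pass = list(cheapest)  # nothing is left to iterate over

print(first_pass)   # [9.0, 11.0, 12.34, 2.01]
print(second_pass)  # []
```

If the values are needed more than once, storing `list(map(...))` in a variable avoids this surprise.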


# The Python Programming Language: Lambda and List Comprehensions

<br>
Here's an example of lambda that takes in three parameters and adds the first two.

In [4]:
my_function = lambda a, b, c : a + b

In [5]:
my_function(1, 2, 3)

3
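
A lambda is most useful as a short, throwaway function passed to another call, for example as a sort key. The sketch below sorts three `(class, avg mpg)` pairs by their second element and also shows the equivalent `def` form of the lambda above. The name `add_first_two` is introduced here for illustration.

```python
# Three (class, avg hwy mpg) pairs in arbitrary order.
pairs = [('suv', 18.13), ('compact', 28.30), ('pickup', 16.88)]

# The lambda tells sorted() to compare the second element of each tuple.
by_mpg = sorted(pairs, key=lambda pair: pair[1])
print(by_mpg)  # [('pickup', 16.88), ('suv', 18.13), ('compact', 28.3)]

# A lambda is an anonymous function; this def behaves the same as
# `lambda a, b, c: a + b`.
def add_first_two(a, b, c):
    return a + b

my_function = lambda a, b, c: a + b
print(my_function(1, 2, 3) == add_first_two(1, 2, 3))  # True
```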

## List Comprehension

<a id='mylist1'></a>
Let's iterate from 0 to 999 and return the even numbers.

In [20]:
my_list = []
for number in range(0, 1000):
    if number % 2 == 0:
        my_list.append(number)
print(my_list)

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420,

List comprehensions are Python’s way of implementing the notation for sets as used in mathematics.

$$\{ n : n \in [0 \dots 999] \quad \text{if} \; n\bmod 2 = 0 \}$$


Now the same [as in the above cell](#mylist1), but with a list comprehension.

In [6]:
my_list = [number for number in range(0,1000) if number % 2 == 0]
print(my_list)

[0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72, 74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106, 108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136, 138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196, 198, 200, 202, 204, 206, 208, 210, 212, 214, 216, 218, 220, 222, 224, 226, 228, 230, 232, 234, 236, 238, 240, 242, 244, 246, 248, 250, 252, 254, 256, 258, 260, 262, 264, 266, 268, 270, 272, 274, 276, 278, 280, 282, 284, 286, 288, 290, 292, 294, 296, 298, 300, 302, 304, 306, 308, 310, 312, 314, 316, 318, 320, 322, 324, 326, 328, 330, 332, 334, 336, 338, 340, 342, 344, 346, 348, 350, 352, 354, 356, 358, 360, 362, 364, 366, 368, 370, 372, 374, 376, 378, 380, 382, 384, 386, 388, 390, 392, 394, 396, 398, 400, 402, 404, 406, 408, 410, 412, 414, 416, 418, 420,

$$\{ x^2 : x \in [1,2,3,4,5] \; \text{if} \; x>2 \}$$

In [26]:
squares = [x**2 for x in [1, 2, 3, 4, 5] if x > 2]
print(squares)

[9, 16, 25]
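
As a sketch of how a comprehension maps onto an ordinary loop, the cell below builds the same list both ways and then shows a comprehension with two `for` clauses, which behaves like two nested loops. The variable names are arbitrary.

```python
# Loop version: the for clause, then the if clause, then the expression.
squares_loop = []
for x in [1, 2, 3, 4, 5]:
    if x > 2:
        squares_loop.append(x**2)

# Comprehension version: expression first, then the same for and if clauses.
squares = [x**2 for x in [1, 2, 3, 4, 5] if x > 2]
print(squares == squares_loop)  # True

# Two for clauses read left to right like nested loops.
pairs = [(i, j) for i in range(3) for j in range(i)]
print(pairs)  # [(1, 0), (2, 0), (2, 1)]
```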


# Numerical Python (NumPy)

In [65]:
import numpy as np

## Creating Arrays

Create a list and convert it to a NumPy array.

In [66]:
mylist = [1, 2, 3]
x = np.array(mylist)
x

array([1, 2, 3])

<br>
Or just pass in a list directly

In [67]:
y = np.array([4, 5, 6])
y

array([4, 5, 6])

<br>
Pass in a list of lists to create a multidimensional array.

In [68]:
m = np.array([[7, 8, 9], [10, 11, 12]])
m

array([[ 7,  8,  9],
       [10, 11, 12]])

<br>
Use the `shape` attribute to find the dimensions of the array. (rows, columns)

In [69]:
m.shape

(2, 3)

<br>
`arange` returns evenly spaced values within a given interval.

In [70]:
n = np.arange(0, 30, 2) # start at 0 count up by 2, stop before 30
n

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])

<br>
`reshape` returns an array with the same data with a new shape.

In [71]:
n = n.reshape(3, 5) # reshape array to be 3x5
n

array([[ 0,  2,  4,  6,  8],
       [10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28]])

<br>
`linspace` returns evenly spaced numbers over a specified interval.

In [72]:
o = np.linspace(0, 4, 9) # return 9 evenly spaced values from 0 to 4
o

array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. ])
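
The difference between the two functions can be sketched as follows: `arange` takes a step size and excludes the stop value, whereas `linspace` takes the number of points and includes the endpoint by default. The variable names are chosen for this illustration.

```python
import numpy as np

# arange: you choose the step; the stop value (30) is not included.
stepped = np.arange(0, 30, 2)
print(len(stepped), stepped[-1])  # 15 28

# linspace: you choose how many points; the endpoint (4) is included,
# so the spacing is (4 - 0) / (9 - 1) = 0.5.
spaced = np.linspace(0, 4, 9)
print(len(spaced), spaced[-1])    # 9 4.0
print(spaced[1] - spaced[0])      # 0.5
```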

<br>
`resize` changes the shape and size of an array in place.

In [73]:
o.resize(3, 3)
o

array([[0. , 0.5, 1. ],
       [1.5, 2. , 2.5],
       [3. , 3.5, 4. ]])

<br>
`ones` returns a new array of given shape and type, filled with ones.

In [74]:
np.ones((3, 2))

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

<br>
`zeros` returns a new array of given shape and type, filled with zeros.

In [75]:
np.zeros((2, 3))

array([[0., 0., 0.],
       [0., 0., 0.]])

<br>
`eye` returns a 2-D array with ones on the diagonal and zeros elsewhere.

In [76]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

<br>
`diag` extracts a diagonal or constructs a diagonal array.

In [77]:
print(y)
np.diag(y)

[4 5 6]


array([[4, 0, 0],
       [0, 5, 0],
       [0, 0, 6]])

<br>
Create an array using a repeating list (or see `np.tile`).

In [78]:
np.array([1, 2, 3] * 3)

array([1, 2, 3, 1, 2, 3, 1, 2, 3])

<br>
Repeat elements of an array using `repeat`.

In [79]:
np.repeat([1, 2, 3], 3)

array([1, 1, 1, 2, 2, 2, 3, 3, 3])

### Combining Arrays

In [80]:
p = np.ones([2, 3], int)
p

array([[1, 1, 1],
       [1, 1, 1]])

Use `vstack` to stack arrays in sequence vertically (row wise).

In [81]:
np.vstack([p, 2*p])

array([[1, 1, 1],
       [1, 1, 1],
       [2, 2, 2],
       [2, 2, 2]])

Use `hstack` to stack arrays in sequence horizontally (column wise).

In [82]:
np.hstack([p, 2*p])

array([[1, 1, 1, 2, 2, 2],
       [1, 1, 1, 2, 2, 2]])

## Operations

Use `+`, `-`, `*`, `/` and `**` to perform element wise addition, subtraction, multiplication, division and power.

In [83]:
print(x + y) # elementwise addition     [1 2 3] + [4 5 6] = [5  7  9]
print(x - y) # elementwise subtraction  [1 2 3] - [4 5 6] = [-3 -3 -3]

[5 7 9]
[-3 -3 -3]


In [84]:
print(x * y) # elementwise multiplication  [1 2 3] * [4 5 6] = [4  10  18]
print(x / y) # elementwise division        [1 2 3] / [4 5 6] = [0.25  0.4  0.5]

[ 4 10 18]
[0.25 0.4  0.5 ]


In [85]:
print(x**2) # elementwise power  [1 2 3] ^2 =  [1 4 9]

[1 4 9]


**Dot Product:**  

$$\begin{bmatrix}x_1 & x_2 & x_3\end{bmatrix}
\cdot
\begin{bmatrix}y_1 \\ y_2 \\ y_3\end{bmatrix}
= x_1 y_1 + x_2 y_2 + x_3 y_3$$

In [86]:
print(x)
print(y)
x.dot(y) # dot product  1*4 + 2*5 + 3*6

[1 2 3]
[4 5 6]


32
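
To connect the formula with the code, the sketch below computes the same dot product three ways: term by term in plain Python, as an elementwise product followed by a sum, and with the `@` operator. All three should agree with `x.dot(y)`.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])

# Term by term, exactly as in the formula: x1*y1 + x2*y2 + x3*y3.
manual = sum(a * b for a, b in zip(x, y))

# Elementwise product, then sum over the result.
via_sum = (x * y).sum()

# The @ operator performs the same operation for 1-D arrays.
via_operator = x @ y

print(manual, via_sum, via_operator)  # 32 32 32
```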

In [87]:
z = np.array([y, y**2])
print(len(z)) # number of rows of array

2


Let's look at transposing arrays. Transposing permutes the dimensions of the array.

In [88]:
z = np.array([y, y**2])
z

array([[ 4,  5,  6],
       [16, 25, 36]])

The shape of array `z` is `(2,3)` before transposing.

In [89]:
z.shape

(2, 3)

Use `.T` to get the transpose.

In [90]:
z.T

array([[ 4, 16],
       [ 5, 25],
       [ 6, 36]])

The number of rows has swapped with the number of columns.

In [91]:
z.T.shape

(3, 2)

Use `.dtype` to see the data type of the elements in the array.

In [92]:
z.dtype

dtype('int64')

Use `.astype` to cast to a specific type.

In [93]:
z = z.astype('f')
z.dtype

dtype('float32')

## Math Functions

NumPy has many built-in math functions that can be performed on arrays.

In [95]:
a = np.array([-4, -2, 1, 3, 5])

In [96]:
a.sum()

3

In [97]:
a.max()

5

In [98]:
a.min()

-4

In [99]:
a.mean()

0.6

In [100]:
a.std()

3.2619012860600183

`argmax` and `argmin` return the index of the maximum and minimum values in the array.

In [101]:
a.argmax()

4

In [102]:
a.argmin()

0

## Indexing / Slicing

In [103]:
s = np.arange(13)**2
s

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121, 144])

<br>
Use bracket notation to get the value at a specific index. Remember that indexing starts at 0.

In [104]:
s[0], s[4]

(0, 16)

In [106]:
s[-1]

144

<br>
Use `:` to indicate a range. `array[start:stop]`


Leaving `start` or `stop` empty will default to the beginning/end of the array.

In [108]:
s[1:5]

array([ 1,  4,  9, 16])

<br>
Use negatives to count from the back.

In [109]:
s[-4:]

array([ 81, 100, 121, 144])

<br>
A second `:` can be used to indicate step-size. `array[start:stop:stepsize]`

Here we are starting at the 5th element from the end and counting backwards by 2 until the beginning of the array is reached.

In [110]:
s[-5::-2]

array([64, 36, 16,  4,  0])

<br>
Let's look at a multidimensional array.

In [111]:
r = np.arange(36)
r.resize((6, 6))
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

<br>
Use bracket notation to index: `array[row, column]`

In [112]:
r[2, 2]

14

<br>
And use `:` to select a range of rows or columns.

In [113]:
r[3, 3:6]

array([21, 22, 23])

<br>
Here we are selecting all the rows up to (and not including) row 2, and all the columns up to (and not including) the last column.

In [114]:
r[:2, :-1]

array([[ 0,  1,  2,  3,  4],
       [ 6,  7,  8,  9, 10]])

<br>
This is a slice of the last row, and only every other element.

In [115]:
r[-1, ::2]

array([30, 32, 34])

<br>
We can also perform conditional indexing. Here we are selecting values from the array that are greater than 30. (Also see `np.where`)

In [116]:
r[r > 30]

array([31, 32, 33, 34, 35])

<br>
Here we are assigning all values in the array that are greater than 30 to the value of 30.

In [117]:
r[r > 30] = 30
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])
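
Conditional indexing works because the comparison itself produces a boolean array, which is then used as a mask. The sketch below makes the mask explicit and uses `np.where` to build a capped copy without modifying the original array. The names `mask` and `capped` are chosen for illustration.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

# The comparison returns a boolean array of the same shape as r.
mask = r > 30
print(mask.sum())  # 5 values are greater than 30

# np.where(condition, a, b) picks from a where the condition holds, otherwise from b.
capped = np.where(mask, 30, r)
print(capped[-1])  # [30 30 30 30 30 30]

# Unlike r[r > 30] = 30, this leaves r itself untouched.
print(r[-1])       # [30 31 32 33 34 35]
```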

## Copying Data

Be careful with copying and modifying arrays in NumPy!


`r2` is a slice of `r`

In [118]:
r2 = r[:3,:3]
r2

array([[ 0,  1,  2],
       [ 6,  7,  8],
       [12, 13, 14]])

<br>
Set this slice's values to zero ([:] selects the entire array)

In [119]:
r2[:] = 0
r2

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

<br>
`r` has also been changed!

In [120]:
r

array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])

<br>
To avoid this, use `r.copy` to create a copy that will not affect the original array

In [121]:
r_copy = r.copy()
r_copy

array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])

<br>
Now when r_copy is modified, r will not be changed.

In [122]:
r_copy[:] = 10
print(r_copy, '\n')
print(r)

[[10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]] 

[[ 0  0  0  3  4  5]
 [ 0  0  0  9 10 11]
 [ 0  0  0 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 30 30 30 30 30]]
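
One way to check whether two arrays share data is `np.shares_memory`. The sketch below contrasts a slice, which is a view onto the original data, with an explicit copy. The names `view` and `independent` are chosen for illustration.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

view = r[:3, :3]                # a slice is a view onto r's data
independent = r[:3, :3].copy()  # a copy owns its own data

print(np.shares_memory(r, view))         # True
print(np.shares_memory(r, independent))  # False

# Writing through the view changes r; the copy keeps the old values.
view[:] = 0
print(r[1, 1])            # 0 (was 7)
print(independent[1, 1])  # 7
```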


## Iterating Over Arrays

Let's create a new 4 by 3 array of random numbers 0-9.

In [123]:
test = np.random.randint(0, 10, (4,3))
test

array([[2, 5, 8],
       [4, 9, 0],
       [4, 0, 5],
       [8, 1, 4]])

<br>
Iterate by row:

In [124]:
for row in test:
    print(row)

[2 5 8]
[4 9 0]
[4 0 5]
[8 1 4]


<br>
Iterate by index:

In [125]:
for i in range(len(test)):
    print(test[i])

[2 5 8]
[4 9 0]
[4 0 5]
[8 1 4]


<br>
Iterate by row and index:

In [126]:
for i, row in enumerate(test):
    print('row', i, 'is', row)

row 0 is [2 5 8]
row 1 is [4 9 0]
row 2 is [4 0 5]
row 3 is [8 1 4]


Use `zip` to iterate over multiple iterables.

In [127]:
test2 = test**2
test2

array([[ 4, 25, 64],
       [16, 81,  0],
       [16,  0, 25],
       [64,  1, 16]])

In [128]:
for i, j in zip(test, test2):
    print(i,'+',j,'=',i+j)

[2 5 8] + [ 4 25 64] = [ 6 30 72]
[4 9 0] + [16 81  0] = [20 90  0]
[4 0 5] + [16  0 25] = [20  0 30]
[8 1 4] + [64  1 16] = [72  2 20]


In [98]:
['a', 'b', 'c'] + [1, 2, 3]

['a', 'b', 'c', 1, 2, 3]

In [129]:
r = np.array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

In [130]:
r[2:4,2:4]

array([[14, 15],
       [20, 21]])