# Python: Functions

In [1]:
x = 1
y = 2
x + y

3

add_numbers is a function that takes two numbers and adds them together.

In [2]:
def add_numbers(x, y):
    return x + y

add_numbers(1, 2)

3

add_numbers updated to take an optional 3rd parameter. Using print allows display of multiple expressions within the cell.

In [3]:
def add_numbers(x, y, z = None):
    if z is None:
        return x + y
    else:
        return x + y + z

print(add_numbers(1, 5, 10))
print(add_numbers(1, 5))

16
6


add_numbers updated to take an optional flag parameter.

In [4]:
def add_numbers(x, y, z = None, flag = False):
    if flag:
        print('Flag is true!')
    if z is None:
        return x + y
    else:
        return x + y + z

print(add_numbers(1, 5, flag = True))

Flag is true!
6
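
As a rough sketch of how default and keyword arguments interact, the function can be restated with `is None` checks and called with its optional parameters in a different order. The names `total_a` and `total_b` are illustrative only.

```python
def add_numbers(x, y, z=None, flag=False):
    """Add two or three numbers; optionally announce that the flag is set."""
    if flag:
        print('Flag is true!')
    if z is None:  # identity check is the idiomatic test for None
        return x + y
    return x + y + z

# Keyword arguments can be given in any order after the positional ones.
total_a = add_numbers(1, 5, flag=True, z=10)
# Every parameter can also be passed by name.
total_b = add_numbers(y=5, x=1)
print(total_a, total_b)
```

Using `z is None` rather than a truthiness test means a legitimate value such as `z=0` is still added.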


# Python: Types & Sequences

Use type to return an object's type.

In [5]:
type('This is a string')

str

In [6]:
type(None)

NoneType

In [7]:
type(1)

int

In [8]:
type(1.0)

float

In [9]:
type(add_numbers)

function

Tuples are an immutable data structure and cannot be altered.

In [10]:
x = (1, 'a', 2, 'b')
type(x)

tuple
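
To make the immutability concrete, here is a small sketch showing what happens when we try to assign to a tuple element. The names `point`, `message` and `longer` are made up for this example.

```python
point = (1, 'a', 2, 'b')

try:
    point[0] = 99  # tuples do not support item assignment
except TypeError as err:
    message = str(err)
    print('TypeError:', message)

# Building a new tuple is allowed; the original is left untouched.
longer = point + (3.3,)
print(longer)
```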

Lists are a mutable data structure and can be altered.

In [11]:
x = [1, 'a', 2, 'b']
type(x)

list

Use the append method to add an object to a list.

In [12]:
x.append(3.3)
print(x)

[1, 'a', 2, 'b', 3.3]


We can loop through each item in a list using a for loop.

In [13]:
for item in x:
    print(item)

1
a
2
b
3.3


We can also use the indexing operator.

In [14]:
i = 0
while i < len(x):
    print(x[i])
    i += 1

1
a
2
b
3.3


Use the + operator to concatenate lists.

In [15]:
[1, 2] + [3, 4]

[1, 2, 3, 4]

Use the * operator to repeat lists.

In [16]:
[1] * 3

[1, 1, 1]

Use the in operator to check if something is inside a list.

In [18]:
1 in [1, 2, 3]

True

Strings are sequences of characters, much like lists, except that they are immutable. As a result, we can use bracket notation to index or slice a string.

In [19]:
x = 'This is a string'
print(x[0]) # The first character
print(x[0:1]) # first character, but we have explicitly set the end character
print(x[0:2]) # first two characters

T
T
Th


We can also use negative indices to get the end of the string.

In [20]:
x[-1]

'g'

This will return the slice starting from the 4th element from the end and stopping before the 2nd element from the end.

In [21]:
x[-4:-2]

'ri'

This is a slice from the beginning of the string, stopping before the 4th element (index 3).

In [22]:
x[:3]

'Thi'

This slice starts at the 4th element and goes till the end of the string.

In [24]:
x[3:]

's is a string'
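
Slices also accept a third value, the step. The following sketch, using the illustrative names `every_other`, `reversed_text` and `last_word`, shows a few common patterns on the same string.

```python
text = 'This is a string'

every_other = text[::2]     # every second character, starting at index 0
reversed_text = text[::-1]  # a negative step walks the string backwards
last_word = text[-6:]       # the final six characters

print(every_other)
print(reversed_text)
print(last_word)
```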

In [25]:
firstName = 'Narottam'
lastName = 'Medhora'

print(firstName + ' ' + lastName)
print(firstName * 3)
print('Naro' in firstName)

Narottam Medhora
NarottamNarottamNarottam
True


The split method returns a list of all the words in a string, or a list split on a specific character.

In [26]:
firstName = 'Narottam Medhora'.split(' ')[0]
lastName = 'Narottam Medhora'.split(' ')[-1]

print(firstName)
print(lastName)

Narottam
Medhora
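
One detail worth knowing is that `split(' ')` and `split()` behave differently when the text has repeated spaces. A short sketch, with a made-up `messy` string:

```python
messy = 'Narottam   Medhora'  # three spaces between the names

by_space = messy.split(' ')    # splits on every single space
by_whitespace = messy.split()  # splits on runs of whitespace

print(by_space)
print(by_whitespace)

# join is the inverse operation: it glues a list of strings together.
rejoined = ' '.join(by_whitespace)
print(rejoined)
```

Calling `split()` with no argument is usually the safer choice for free-form text, since it never produces empty strings.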


We must convert objects to strings before concatenating.

In [27]:
'Chris' + 2

TypeError: can only concatenate str (not "int") to str

In [28]:
'Chris' + str(2)

'Chris2'

Dictionaries associate keys with values.

In [29]:
x = {'Narottam Medhora': 'narottammedhora@outlook.com', 'Bill Gates': 'billg@microsoft.com'}
x['Narottam Medhora'] # Retrieve a value by using the index operator

'narottammedhora@outlook.com'

In [31]:
x['Richa Naidu'] = None
x['Richa Naidu']

We can iterate over all the keys.

In [36]:
for name in x:
    print(x[name])

narottammedhora@outlook.com
billg@microsoft.com
None


We can iterate over all the values.

In [34]:
for email in x.values():
    print(email)

narottammedhora@outlook.com
billg@microsoft.com
None


We can iterate over all the items in the dictionary.

In [35]:
for name, email in x.items():
    print(name, email)

Narottam Medhora narottammedhora@outlook.com
Bill Gates billg@microsoft.com
Richa Naidu None
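
Since a key can map to None, it helps to distinguish "key is missing" from "key is present with no value". A minimal sketch, using a hypothetical `contacts` dictionary:

```python
contacts = {'Bill Gates': 'billg@microsoft.com', 'Richa Naidu': None}

# `in` tests for a key, regardless of the value stored under it.
has_richa = 'Richa Naidu' in contacts
has_chris = 'Chris' in contacts

# get returns a fallback instead of raising KeyError for a missing key.
chris_email = contacts.get('Chris', 'unknown')
richa_email = contacts.get('Richa Naidu', 'unknown')  # key exists, so None is returned

print(has_richa, has_chris, chris_email, richa_email)
```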


We can also unpack a sequence into different variables.

In [37]:
x = ('Narottam', 'Medhora', 'narottammedhora@gmail.com')
fname, lname, email = x

In [38]:
fname

'Narottam'

In [39]:
lname

'Medhora'

We have to make sure the number of values unpacked matches the number of variables.

In [41]:
x = ('Narottam', 'Medhora', 'narottammedhora@gmail.com', 'Chicago')
fname, lname, email = x

ValueError: too many values to unpack (expected 3)
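
When the sequence is longer than the list of variables, one possible workaround is starred unpacking or a slice. The sketch below uses the illustrative names `record` and `rest`.

```python
record = ('Narottam', 'Medhora', 'narottammedhora@gmail.com', 'Chicago')

# The starred name collects whatever is left over into a list.
fname, lname, *rest = record
print(fname, lname, rest)

# Slicing is another option when only the leading values are needed.
fname, lname, email = record[:3]
print(email)
```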

# Python: More on Strings

In [42]:
print('Chris' + 2)

TypeError: can only concatenate str (not "int") to str

In [43]:
print('Chris' + str(2))

Chris2


Python also has a built-in method for convenient string formatting.

In [45]:
salesRecord = {
    'price': 3.24,
    'num_items': 4,
    'person': 'Chris'
}

salesStatement = '{} bought {} item(s) at a price of {} each for a total of {}'

print(salesStatement.format(salesRecord['person'],
                            salesRecord['num_items'],
                            salesRecord['price'],
                            salesRecord['price'] * salesRecord['num_items']))

Chris bought 4 item(s) at a price of 3.24 each for a total of 12.96
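
The same statement can also be written with an f-string (Python 3.6+), which additionally lets us control how the floats are displayed. This is a sketch; `sales_record`, `total` and `statement` are names chosen for the example.

```python
sales_record = {'price': 3.24, 'num_items': 4, 'person': 'Chris'}

total = sales_record['price'] * sales_record['num_items']

# :.2f rounds the displayed value to two decimal places.
statement = (
    f"{sales_record['person']} bought {sales_record['num_items']} item(s) "
    f"at a price of {sales_record['price']:.2f} each for a total of {total:.2f}"
)
print(statement)
```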


# Reading and Writing CSV files

Let's import our datafile mpg.csv, which contains fuel economy data for 234 cars.

* mpg : miles per gallon
* class : car classification
* cty : city mpg
* cyl : # of cylinders
* displ : engine displacement in liters
* drv : f = front-wheel drive, r = rear-wheel drive, 4 = 4wd
* fl : fuel (e = ethanol E85, d = diesel, r = regular, p = premium, c = CNG)
* hwy : highway mpg
* manufacturer : automobile manufacturer
* model : model of car
* trans : type of transmission
* year : model year

In [53]:
import csv

# Set floating point display precision to 2 decimal places (IPython magic)
%precision 2

with open('mpg.csv') as csvfile:
    mpg = list(csv.DictReader(csvfile))

mpg[:3]

[{'': '1',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '1.8',
  'year': '1999',
  'cyl': '4',
  'trans': 'auto(l5)',
  'drv': 'f',
  'cty': '18',
  'hwy': '29',
  'fl': 'p',
  'class': 'compact'},
 {'': '2',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '1.8',
  'year': '1999',
  'cyl': '4',
  'trans': 'manual(m5)',
  'drv': 'f',
  'cty': '21',
  'hwy': '29',
  'fl': 'p',
  'class': 'compact'},
 {'': '3',
  'manufacturer': 'audi',
  'model': 'a4',
  'displ': '2',
  'year': '2008',
  'cyl': '4',
  'trans': 'manual(m6)',
  'drv': 'f',
  'cty': '20',
  'hwy': '31',
  'fl': 'p',
  'class': 'compact'}]

csv.DictReader reads in each row of our csv file as a dictionary. len shows that our list contains 234 dictionaries.

In [54]:
len(mpg)

234

keys gives us the column names of our csv file.

In [55]:
mpg[0].keys()

dict_keys(['', 'manufacturer', 'model', 'displ', 'year', 'cyl', 'trans', 'drv', 'cty', 'hwy', 'fl', 'class'])

We can find out the average city fuel economy across all cars. All values in the dictionaries are strings, so we need to convert them to float.

In [58]:
sum(float(d['cty']) for d in mpg) / len(mpg)

16.86

Similarly, this is how to find the average highway fuel economy.

In [60]:
sum(float(d['hwy']) for d in mpg) / len(mpg)

23.44

We can use set to return the unique values for the number of cylinders the cars in our dataset have.

In [61]:
cylinders = set(d['cyl'] for d in mpg)
cylinders

{'4', '5', '6', '8'}

Here's a more complex example where we are grouping the cars by number of cylinders, and finding the average cty mpg for each group.

In [62]:
ctyMpgByCyl = []

for c in cylinders: # iterate over all the cylinder levels
    summpg = 0
    cylTypeCount = 0

    for d in mpg: # iterate over all the dictionaries
        if d['cyl'] == c: #if the cylinder level type matches
            summpg += float(d['cty']) # add the cty mpg
            cylTypeCount += 1 # increment the count for that type of cylinder
    ctyMpgByCyl.append((c, summpg/cylTypeCount)) # append the tuple ('cylinder', 'avg mpg')

ctyMpgByCyl.sort(key = lambda x: x[0])
ctyMpgByCyl

[('4', 21.01), ('5', 20.50), ('6', 16.22), ('8', 12.57)]

Use set to return the unique values for the class types in the dataset.

In [64]:
vehicleClass = set(d['class'] for d in mpg) # create a set of vehicle classes
vehicleClass

{'2seater', 'compact', 'midsize', 'minivan', 'pickup', 'subcompact', 'suv'}

We can now find the average hwy mpg for each class of vehicle in the dataset.

In [65]:
hwyMpgByClass = []

for vehicleType in vehicleClass: # iterate over all the vehicle classes
    summpg = 0
    vehicleClassCount = 0

    for d in mpg: # iterate over all the dictionaries
        if d['class'] == vehicleType: # if the vehicle class matches
            summpg += float(d['hwy']) # add the hwy mpg
            vehicleClassCount += 1 # increment the count
    hwyMpgByClass.append((vehicleType, summpg / vehicleClassCount)) # append the tuple ('class', 'avg mpg') 

hwyMpgByClass.sort(key = lambda x: x[1])
hwyMpgByClass


[('pickup', 16.88),
 ('suv', 18.13),
 ('minivan', 22.36),
 ('2seater', 24.80),
 ('midsize', 27.29),
 ('subcompact', 28.14),
 ('compact', 28.30)]
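
The nested loops above scan the whole dataset once per group. An alternative sketch that groups in a single pass with `collections.defaultdict` is shown below. The `rows` list is hypothetical sample data shaped like the output of csv.DictReader, not the real mpg.csv.

```python
from collections import defaultdict

# Hypothetical rows shaped like csv.DictReader output (all values are strings).
rows = [
    {'class': 'compact', 'hwy': '29'},
    {'class': 'compact', 'hwy': '31'},
    {'class': 'suv', 'hwy': '18'},
    {'class': 'suv', 'hwy': '20'},
    {'class': 'pickup', 'hwy': '17'},
]

totals = defaultdict(float)  # running sum of hwy mpg per class
counts = defaultdict(int)    # number of cars seen per class

for row in rows:  # a single pass over the data
    totals[row['class']] += float(row['hwy'])
    counts[row['class']] += 1

# Build (class, average) tuples and sort them by the average.
averages = sorted(
    ((cls, totals[cls] / counts[cls]) for cls in totals),
    key=lambda pair: pair[1],
)
print(averages)
```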

# Dates and Times

In [1]:
import datetime as dt
import time as tm

The time module's time function returns the current time in seconds since the Epoch (January 1, 1970).

In [2]:
tm.time()

1594220425.514366

We can convert the timestamp to datetime.

In [4]:
dtnow = dt.datetime.fromtimestamp(tm.time())
dtnow

datetime.datetime(2020, 7, 8, 10, 1, 12, 959667)

Some handy datetime attributes:

In [6]:
dtnow.year, dtnow.month, dtnow.day, dtnow.hour, dtnow.minute, dtnow.second # get year, month, day etc from datetime

(2020, 7, 8, 10, 1, 12)

timedelta is a duration expressing the difference between two dates.

In [8]:
delta = dt.timedelta(days = 100)
delta

datetime.timedelta(days=100)

In [9]:
today = dt.date.today()

In [10]:
today - delta # the date 100 days ago

datetime.date(2020, 3, 30)

In [11]:
today > today - delta # compare dates

True
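
Going the other way, subtracting one date from another gives a timedelta. The sketch below uses two fixed dates (chosen to match the outputs above) so the result is reproducible; `start`, `end`, `gap` and `label` are illustrative names.

```python
import datetime as dt

start = dt.date(2020, 3, 30)
end = dt.date(2020, 7, 8)

gap = end - start  # subtracting two dates yields a timedelta
print(gap.days)

# strftime renders a date as text; %Y-%m-%d is the ISO-style layout.
label = end.strftime('%Y-%m-%d')
print(label)
```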

# Objects and map()

An example of a class in Python:

In [34]:
class Person:
    department = 'School Of Information' # a class variable

    def set_name(self, new_name): # a method
        self.name = new_name
    def set_location(self, new_location):
        self.location = new_location

In [41]:
p = Person()
p.set_name('Narottam Medhora')
p.set_location('Chicago, IL, USA')
print('{} lives in {} and works in the department {}'.format(p.name, p.location, p.department))

Narottam Medhora lives in Chicago, IL, USA and works in the department School Of Information
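
A common variation, sketched here under the assumption that every person has a name and a location from the start, is to set the attributes in `__init__` instead of separate setter methods. The `describe` method is a hypothetical helper added for illustration.

```python
class Person:
    department = 'School Of Information'  # class variable shared by all instances

    def __init__(self, name, location):
        # Instance attributes are set once, when the object is created.
        self.name = name
        self.location = location

    def describe(self):  # hypothetical helper, not part of the original class
        return '{} lives in {} and works in the department {}'.format(
            self.name, self.location, self.department)

p = Person('Narottam Medhora', 'Chicago, IL, USA')
summary = p.describe()
print(summary)
```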


An example of using map:

In [42]:
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]

cheapest = map(min, store1, store2) # map(function, iterable1, iterable2)
cheapest # creates a map object

<map at 0x7ff3d52e65b0>

In [43]:
for item in cheapest:
    print(item)

9.0
11.0
12.34
2.01
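
Note that a map object is a lazy iterator: values are computed on demand, and once consumed they are gone. A brief sketch, with `first_pass` and `second_pass` as illustrative names:

```python
store1 = [10.00, 11.00, 12.34, 2.34]
store2 = [9.00, 11.10, 12.34, 2.01]

cheapest = map(min, store1, store2)

first_pass = list(cheapest)   # consumes the iterator
second_pass = list(cheapest)  # nothing left to yield

print(first_pass)
print(second_pass)
```

If the results are needed more than once, converting to a list up front avoids this surprise.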


In [61]:
# Write a function and apply it using map() to get a list of all faculty titles and last names


people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson', 'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

def split_title_and_name(person):
    return person.split(' ')[0] + ' ' + person.split(' ')[-1]

list(map(split_title_and_name, people))




['Dr. Brooks', 'Dr. Collins-Thompson', 'Dr. Vydiswaran', 'Dr. Romero']

# Lambda & list comprehension

Here's an example of lambda that takes in three parameters and adds the first two.

In [109]:
my_func = lambda a, b, c: a + b
my_func(1, 2, 3)

3

In [63]:
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson', 'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

def split_title_and_name(person):
    return person.split()[0] + ' ' + person.split()[-1]

# option 1
for person in people:
    name = lambda person: person.split(' ')[0] + ' ' +  person.split(' ')[-1]
    print(name(person))

# option 2
list(map(lambda person: person.split(' ')[0] + ' ' +  person.split(' ')[-1], people))

Dr. Brooks
Dr. Collins-Thompson
Dr. Vydiswaran
Dr. Romero


['Dr. Brooks', 'Dr. Collins-Thompson', 'Dr. Vydiswaran', 'Dr. Romero']

In [1]:
people = ['Dr. Christopher Brooks', 'Dr. Kevyn Collins-Thompson', 'Dr. VG Vinod Vydiswaran', 'Dr. Daniel Romero']

def split_title_and_name(person):
    return person.split()[0] + ' ' + person.split()[-1]

# option 1: a lambda is an anonymous function, so to use it directly we call it with the (lambda expression)(argument) syntax
for person in people:
    print(split_title_and_name(person) == (lambda x: x.split(' ')[0] + ' ' + x.split(' ')[-1])(person))

# option 2
list(map(split_title_and_name, people)) == list(map(lambda x: x.split(' ')[0] + ' ' +  x.split(' ')[-1], people))

True
True
True
True


True

In [78]:
my_list = []
for number in range(0, 1000):
    if number % 2 == 0:
        my_list.append(number)

my_list = [number for number in range(0, 1000) if number % 2 == 0]

In [81]:
def times_tables():
    lst = []
    for i in range(10):
        for j in range(10):
            lst.append(i * j)
    return lst

times_tables() == [i * j for i in range(10) for j in range(10)]

True

Here's a harder question which brings a few things together.

Many organizations have user ids which are constrained in some way. Imagine you work at an internet service provider and the user ids are all two letters followed by two numbers (e.g. aa49). Your task at such an organization might be to hold a record on the billing activity for each possible user.

Write an initialization line as a single list comprehension which creates a list of all possible user ids. Assume the letters are all lower case.

In [2]:
lowercase = 'abcdefghijklmnopqrstuvwxyz'
digits = '0123456789'

answer = [l1 + l2 + n1 + n2 for l1 in lowercase for l2 in lowercase for n1 in digits for n2 in digits]
answer[:50]

['aa00',
 'aa01',
 'aa02',
 'aa03',
 'aa04',
 'aa05',
 'aa06',
 'aa07',
 'aa08',
 'aa09',
 'aa10',
 'aa11',
 'aa12',
 'aa13',
 'aa14',
 'aa15',
 'aa16',
 'aa17',
 'aa18',
 'aa19',
 'aa20',
 'aa21',
 'aa22',
 'aa23',
 'aa24',
 'aa25',
 'aa26',
 'aa27',
 'aa28',
 'aa29',
 'aa30',
 'aa31',
 'aa32',
 'aa33',
 'aa34',
 'aa35',
 'aa36',
 'aa37',
 'aa38',
 'aa39',
 'aa40',
 'aa41',
 'aa42',
 'aa43',
 'aa44',
 'aa45',
 'aa46',
 'aa47',
 'aa48',
 'aa49']
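
For comparison, the same list can probably be built more compactly with `itertools.product`, which also makes it easy to check the total count. This is an alternative sketch rather than the exercise's expected answer; `user_ids` is an illustrative name.

```python
from itertools import product
from string import ascii_lowercase, digits

# product yields every combination of (letter, letter, digit, digit).
user_ids = [''.join(parts)
            for parts in product(ascii_lowercase, ascii_lowercase, digits, digits)]

print(len(user_ids))  # 26 * 26 * 10 * 10 combinations
print(user_ids[:3], user_ids[-1])
```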

# NumPy

In [4]:
import numpy as np

# Creating Arrays

Create a list and convert it to a NumPy array.

In [116]:
myList = [1, 2, 3]
x = np.array(myList)
x

array([1, 2, 3])

We can also pass in a list directly.

In [118]:
y = np.array([4, 5, 6])
y

array([4, 5, 6])

We can pass in a list of lists to create a multidimensional array.

In [119]:
m = np.array([[7, 8, 9], [10, 11, 12]])
m

array([[ 7,  8,  9],
       [10, 11, 12]])

We can use the shape attribute to find the dimensions of the array (rows, columns).

In [121]:
m.shape # returns a tuple of integers

(2, 3)

arange returns evenly spaced values within a given interval.

In [123]:
n = np.arange(0, 30, 2) # start at 0, count up by 2, stop before 30
n

array([ 0,  2,  4,  6,  8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28])

reshape returns an array with the same data with a new shape.

In [124]:
n = n.reshape(3, 5) # reshape array to be 3 x 5
n

array([[ 0,  2,  4,  6,  8],
       [10, 12, 14, 16, 18],
       [20, 22, 24, 26, 28]])
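
reshape only works when the new shape holds exactly the same number of elements. The sketch below shows the `-1` placeholder, which lets NumPy infer one dimension, and the error raised for an incompatible shape. `auto_cols` and `reshape_error` are illustrative names.

```python
import numpy as np

n = np.arange(0, 30, 2)  # 15 values

auto_cols = n.reshape(3, -1)  # -1 asks NumPy to infer the missing dimension
print(auto_cols.shape)

try:
    n.reshape(4, 4)  # 15 values cannot fill a 4 x 4 array
except ValueError as err:
    reshape_error = str(err)
    print('ValueError:', reshape_error)
```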

linspace returns evenly spaced numbers over a specified interval.

In [6]:
o = np.linspace(0, 4, 9, dtype = float) # return 9 evenly spaced values between 0 and 4 (both endpoints included)
o

array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. ])

resize changes the shape and size of the array in-place.

In [127]:
o.resize(3, 3)
o

array([[0. , 0.5, 1. ],
       [1.5, 2. , 2.5],
       [3. , 3.5, 4. ]])

ones returns a new array of a given shape and type, filled with ones.

In [128]:
np.ones((3, 2)) # numpy.ones(shape, dtype = float). shape can be int or a tuple of ints

array([[1., 1.],
       [1., 1.],
       [1., 1.]])

zeros returns a new array of given shape and type, filled with zeros.

In [130]:
np.zeros((5, 5))

array([[0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0.]])

In [132]:
np.zeros((5,5), dtype = int)

array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])

eye returns a 2-D array with ones on the diagonal and zeros elsewhere.

In [136]:
np.eye(6)

array([[1., 0., 0., 0., 0., 0.],
       [0., 1., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0.],
       [0., 0., 0., 1., 0., 0.],
       [0., 0., 0., 0., 1., 0.],
       [0., 0., 0., 0., 0., 1.]])

diag extracts a diagonal or constructs a diagonal array.

In [138]:
np.diag(np.eye(6)) # extract the diagonal as a 1-D array

array([1., 1., 1., 1., 1., 1.])

In [139]:
np.diag(y) # create a diagonal array

array([[4, 0, 0],
       [0, 5, 0],
       [0, 0, 6]])

Create an array from a repeated list (or see np.tile).

In [140]:
np.array([1, 2, 3] * 3)

array([1, 2, 3, 1, 2, 3, 1, 2, 3])

In [7]:
a = np.array([1, 2, 3])
np.tile(a, (3, 5)) # repeat a 3 times along the rows and 5 times along the columns, giving shape (3, 15)

array([[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3],
       [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3],
       [1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3]])

Repeat elements of an array using repeat.

In [142]:
np.repeat(a, 3) # note the difference between repeat and list multiplication

array([1, 1, 1, 2, 2, 2, 3, 3, 3])

### Combining arrays

Use vstack to stack arrays in sequence vertically (row wise).

In [144]:
p = np.ones([2, 3], dtype = int)
p

array([[1, 1, 1],
       [1, 1, 1]])

In [145]:
np.vstack([p, 2 * p])

array([[1, 1, 1],
       [1, 1, 1],
       [2, 2, 2],
       [2, 2, 2]])

Use hstack to stack arrays in sequence horizontally (column wise).

In [146]:
np.hstack([p, 2 * p])

array([[1, 1, 1, 2, 2, 2],
       [1, 1, 1, 2, 2, 2]])
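
For 2-D arrays like these, vstack and hstack can be thought of as shortcuts for `np.concatenate` along axis 0 and axis 1. A minimal sketch, with `rows_joined` and `cols_joined` as illustrative names:

```python
import numpy as np

p = np.ones((2, 3), dtype=int)

rows_joined = np.concatenate([p, 2 * p], axis=0)  # same result as vstack here
cols_joined = np.concatenate([p, 2 * p], axis=1)  # same result as hstack here

print(rows_joined.shape, cols_joined.shape)
```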

### Operations

Use +, -, *, / and ** to perform element-wise addition, subtraction, multiplication, division and power.

In [147]:
print(x + y) # element wise addition [1 2 3] + [4 5 6] = [5 7 9]
print(x - y) # element wise subtraction [1 2 3] - [4 5 6] = [-3 -3 -3]

[5 7 9]
[-3 -3 -3]


In [148]:
print(x * y)
print(x / y)

[ 4 10 18]
[0.25 0.4  0.5 ]


In [149]:
print(x**2) # element wise power [1 2 3] ** 2 = [1 4 9]

[1 4 9]


#### Dot product



In [151]:
x.dot(y) # dot product 1*4 + 2*5 + 3*6 

32
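
The `@` operator (Python 3.5+) is an equivalent way to write the dot product, and it extends to matrix-vector products. A short sketch reusing the values of x and y from above; `inner` and `projected` are illustrative names.

```python
import numpy as np

x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
z = np.array([y, y**2])  # shape (2, 3)

inner = x @ y      # same as x.dot(y)
projected = z @ x  # each row of z dotted with x
print(inner, projected)
```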

In [153]:
z = np.array([y, y**2])
print(z)
print(len(z)) # number of rows in the array

[[ 4  5  6]
 [16 25 36]]
2


We can also transpose arrays. Transposing permutes the dimensions of the array.

In [154]:
z.shape

(2, 3)

The shape of the array is (2, 3) before transposing. Use .T to get the transpose.

In [156]:
z.T

array([[ 4, 16],
       [ 5, 25],
       [ 6, 36]])

The number of rows has swapped with the number of columns.

In [157]:
z.T.shape

(3, 2)

We can use dtype to see the data type of the elements in the array.

In [158]:
z.dtype

dtype('int64')

We can use astype to cast to a specific data type.

In [159]:
z = z.astype(float)
z.dtype

dtype('float64')

## Math functions

In [160]:
b = np.array([-4, -2, 1, 3, 5])

In [162]:
b.sum()

3

In [163]:
b.max()

5

In [164]:
b.min()

-4

In [166]:
b.mean()

0.6

In [167]:
b.std()

3.2619012860600183

argmax and argmin return the index of the maximum and minimum values in the array.

In [168]:
b.argmax()

4

In [169]:
b.argmin()

0
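
On multidimensional arrays, these functions also take an `axis` argument. The sketch below uses a small made-up array, `grid`, to show the difference between reducing over rows and over columns.

```python
import numpy as np

grid = np.array([[-4, -2, 1],
                 [3, 5, 0]])

col_sums = grid.sum(axis=0)       # collapse the rows: one sum per column
row_max = grid.max(axis=1)        # collapse the columns: one max per row
row_argmax = grid.argmax(axis=1)  # column index of each row's maximum

print(col_sums, row_max, row_argmax)
```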

## Indexing/Slicing

In [13]:
s = np.arange(13)**2
s

array([  0,   1,   4,   9,  16,  25,  36,  49,  64,  81, 100, 121, 144])

We can use bracket notation to get the value at a specific index. Keep in mind that the index starts at 0.

In [171]:
s[0], s[4], s[-1]

(0, 16, 144)

Use : to indicate a range, e.g. `array[start:stop]`. Leaving start or stop empty will default to the beginning or end of the array.

In [172]:
s[1:5]

array([ 1,  4,  9, 16])

We can use negative indices to count from the end of the array.

In [173]:
s[-4:]

array([ 81, 100, 121, 144])

A second `:` can be used to indicate step-size. `array[start:stop:stepsize]`

Here, we are starting at the 5th element from the end, and counting backwards by 2 until the beginning of the array is reached.

In [174]:
s[-5::-2]

array([64, 36, 16,  4,  0])

Let's look at a multidimensional array.

In [14]:
r = np.arange(36)
r.resize((6, 6))
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

In [20]:
r[:, ::7]

array([[ 0],
       [ 6],
       [12],
       [18],
       [24],
       [30]])

We can use bracket notation to index into the array: `array[row, column]`

In [177]:
r[2, 2]

14

We can also use `:` to select a range of rows or columns.

In [178]:
r[3, 3:6]

array([21, 22, 23])

Here, we are selecting all the rows up to (and not including) row 2, and all the columns up to (and not including) the last column.

In [179]:
r[:2, :-1]

array([[ 0,  1,  2,  3,  4],
       [ 6,  7,  8,  9, 10]])

This is a slice of the last row, and only every other element.

In [181]:
r[-1, ::2]

array([30, 32, 34])

We can also perform conditional indexing. Here we are selecting values from the array that are greater than 30. (Also, see np.where).

In [193]:
r[r > 30]

array([31, 32, 33, 34, 35])

In [187]:
a = np.arange(10)
a

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [188]:
np.where(a < 5, a, 10 * a)

array([ 0,  1,  2,  3,  4, 50, 60, 70, 80, 90])
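
Conditions can be combined with `&` (and) and `|` (or). The following sketch also shows `np.where` called with a single argument, which returns the positions of the matches instead of values. `mask`, `selected` and `positions` are illustrative names.

```python
import numpy as np

a = np.arange(10)

mask = (a > 2) & (a % 2 == 0)  # parentheses are required around each condition
selected = a[mask]
positions = np.where(mask)[0]  # with one argument, where returns the matching indices

print(selected, positions)
```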

Here, we are assigning all the values in the array that are greater than 30 to the value of 30.

In [183]:
r[r > 30] = 30
r

array([[ 0,  1,  2,  3,  4,  5],
       [ 6,  7,  8,  9, 10, 11],
       [12, 13, 14, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 30, 30, 30, 30, 30]])

## Copying Data

We need to be careful with copying and modifying arrays in NumPy.

In this example, r2 is a slice of r.

In [194]:
r2 = r[:3, :3]
r2

array([[ 0,  1,  2],
       [ 6,  7,  8],
       [12, 13, 14]])

Set this slice's values to zero (`[:]` selects the entire array)

In [195]:
r2[:] = 0
r2

array([[0, 0, 0],
       [0, 0, 0],
       [0, 0, 0]])

But this also changes the values in r.

In [196]:
r

array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

To avoid this, we can use r.copy to create a copy that will not affect the original array.

In [198]:
r_copy = r.copy()
r_copy

array([[ 0,  0,  0,  3,  4,  5],
       [ 0,  0,  0,  9, 10, 11],
       [ 0,  0,  0, 15, 16, 17],
       [18, 19, 20, 21, 22, 23],
       [24, 25, 26, 27, 28, 29],
       [30, 31, 32, 33, 34, 35]])

Now when r_copy is modified, r will not be changed.

In [199]:
r_copy[:] = 10
print(r_copy, '\n')
print(r)

[[10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]
 [10 10 10 10 10 10]] 

[[ 0  0  0  3  4  5]
 [ 0  0  0  9 10 11]
 [ 0  0  0 15 16 17]
 [18 19 20 21 22 23]
 [24 25 26 27 28 29]
 [30 31 32 33 34 35]]
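
One way to check whether two arrays are backed by the same data is `np.shares_memory`. The sketch below rebuilds a fresh 6 x 6 array for the demonstration; `view` and `independent` are illustrative names.

```python
import numpy as np

r = np.arange(36).reshape(6, 6)

view = r[:3, :3]                # basic slicing returns a view
independent = r[:3, :3].copy()  # copy allocates new memory

print(np.shares_memory(r, view))         # the view uses r's data
print(np.shares_memory(r, independent))  # the copy does not

view[0, 0] = -1         # writes through to r
independent[1, 1] = -1  # leaves r untouched
print(r[0, 0], r[1, 1])
```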


## Iterating over arrays

Let's create a new 4 x 3 array of random integers between 0 and 9 (the upper bound of 10 passed to randint is exclusive).

In [201]:
test = np.random.randint(0, 10, (4, 3))
test

array([[8, 4, 4],
       [6, 9, 3],
       [7, 9, 9],
       [0, 0, 5]])

Iterate by row:

In [202]:
for row in test:
    print(row)

[8 4 4]
[6 9 3]
[7 9 9]
[0 0 5]


Iterate by index:

In [204]:
for i in range(len(test)):
    print(test[i])

[8 4 4]
[6 9 3]
[7 9 9]
[0 0 5]


We can also iterate by row and index using enumerate.

In [205]:
for i, row in enumerate(test):
    print('row', i, 'is', row)

row 0 is [8 4 4]
row 1 is [6 9 3]
row 2 is [7 9 9]
row 3 is [0 0 5]


We can use zip to iterate over multiple iterables.

In [206]:
test2 = test**2
test2

array([[64, 16, 16],
       [36, 81,  9],
       [49, 81, 81],
       [ 0,  0, 25]])

In [207]:
for i, j in zip(test, test2):
    print(i, '+', j, '=', i + j)

[8 4 4] + [64 16 16] = [72 20 20]
[6 9 3] + [36 81  9] = [42 90 12]
[7 9 9] + [49 81 81] = [56 90 90]
[0 0 5] + [ 0  0 25] = [ 0  0 30]
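
Explicit loops are handy for printing, but for arithmetic the element-wise operators usually give the same result without a Python loop. A sketch comparing the two, with the random values above hard-coded so the example is reproducible:

```python
import numpy as np

test = np.array([[8, 4, 4],
                 [6, 9, 3],
                 [7, 9, 9],
                 [0, 0, 5]])
test2 = test**2

looped = np.array([i + j for i, j in zip(test, test2)])  # row-by-row addition
vectorized = test + test2  # one element-wise operation, no Python loop

print(np.array_equal(looped, vectorized))
```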
