# Python Introduction:
---

## Types

Let's evaluate some simple expressions.

In [3]:
#clear
3*2

6

In [4]:
#clear
5+3*2

11

You can use `type()` to find the *type* of an expression.

In [5]:
#clear
type(5)

int

Now add decimal points.

In [6]:
#clear
5+3.5*2

12.0

In [7]:
#clear
type(5+3.0*2)

float

Strings are written with single (``'``) or double (`"`) quotes.

In [8]:
'hello "hello"'

"hello 'hello'"

"hello 'hello'"

Multiplication and addition work on strings, too.

In [9]:
#clear
3 * 'hello ' + "cs357"

'hello hello hello cs357'

Lists are written in brackets (`[]`) with commas (`,`).

In [9]:
#clear
[5, 3.5, 7]

[5, 3.5, 7]

In [10]:
#clear 
type([5,3,7])

list

List entries don't have to have the same type.

In [11]:
#clear
["hi there", 15, [1,2,3]]

['hi there', 15, [1, 2, 3]]

"Multiplication" and "addition" work on lists, too.

In [12]:
#clear
[1,2,3] * 4 + [5, 5, 5]

[1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 5, 5, 5]

Hmmmmmm. Was that what you expected? How can we "fix" this?

In [14]:
# clear
import numpy as np

np.array([1,2,3]) * 4 + np.array([5,5,5])

array([ 9, 13, 17])

We will introduce numpy later in this notebook.

## Names and Values

Define and reference a variable:

In [None]:
#clear
a = 3*2 + 5

In [None]:
#clear
a = "interesting"*3

No type declaration needed!

(But values still have types--let's check.)

In [None]:
#clear
type(a)

Everything in Python is an object. Python variables are like *pointers*.

(if that word makes sense)

In [None]:
a = [1,2,3]

In [None]:
b = a

In [None]:
#clear
print(a)
print(b)

In [None]:
#clear
b.append(4)

In [None]:
#clear
b

In [None]:
#clear
a

You can see this pointer with `id()`.

In [None]:
#clear
print(id(a), id(b))

The `is` operator tests for object sameness.

In [None]:
#clear
a is b

This is a **stronger** condition than being equal!

In [None]:
a = [1,2,3]
b = [1,2,3]
print("IS   ", a is b)
print("EQUAL", a == b)

What do you think the following prints?

In [None]:
a = [1,2,3]
b = a
a = a + [4]
print(b)
print(a)

In [None]:
a is b

Why is that?
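
One way to see the answer, as a small sketch: `a + [4]` builds a brand-new list and rebinds `a` to it, whereas `+=` extends the existing list in place, so every name bound to that list sees the change. The names `c` and `d` below are only illustrative.

```python
# `+` creates a new list object; the old one (still bound to b) is untouched.
a = [1, 2, 3]
b = a
a = a + [4]
print(a is b)   # False
print(b)        # [1, 2, 3]

# `+=` extends the existing list in place, so both names see the change.
c = [1, 2, 3]
d = c
c += [4]
print(c is d)   # True
print(d)        # [1, 2, 3, 4]
```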

-----
How could this lead to bugs?

----------
* To help manage this risk, Python provides **immutable** types.

* Immutable types cannot be changed in-place, only by creating a new object.

* A `tuple` is an immutable `list`.

In [None]:
a = [1,2,3]
type(a)

In [None]:
a[2] = 0
print(a)

In [None]:
#clear
a = (1,2,3)
type(a)

Let's try to change that tuple.

In [None]:
# clear
a[2] = 0

*Bonus question:* How do you spell a single-element tuple?

In [None]:
#clear
a = (3,)

type(a)

Strings are also an immutable type.

In [None]:
myName = 'Mariena'

Note that `myName` is spelled incorrectly. We would like to replace the letter "e" with the letter "a".

In [None]:
myName[4]

In [None]:
myName[4]='a'
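
Item assignment on a string raises a `TypeError`, so the fix has to build a new string. A minimal sketch of two possible ways follows; the names `fixed_by_slicing` and `fixed_by_replace` are made up for this illustration.

```python
myName = 'Mariena'

# Option 1: slice around index 4 and concatenate into a new string.
fixed_by_slicing = myName[:4] + 'a' + myName[5:]

# Option 2: str.replace returns a new string; the count of 1 limits it
# to the first 'e' only.
fixed_by_replace = myName.replace('e', 'a', 1)

print(fixed_by_slicing)   # Mariana
print(fixed_by_replace)   # Mariana
print(myName)             # Mariena (the original object is unchanged)
```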

## Memory management

In [None]:
a = "apple"
b = "apple"
print(id(a),id(b))
print (a is b)
print (a == b)

<img src="figures/PointToString.png" width=200 />

Note that "a" and "b" are bound to the same object "apple", and therefore they have the same id. For optimization reasons, Python does not store duplicates of "simple" strings. But if the strings are a little more complicated...

In [None]:
a = "Hello, how are you?"
b = "Hello, how are you?"
print(id(a),id(b))
print (a is b)
print (a == b)

This is called interning. Python interns (to some extent) shorter string literals (such as "apple"), which are created at compile time. In general, however, a string literal may create a new string object each time (as in "Hello, how are you?"). Interning is runtime dependent and is always a trade-off between memory use and the cost of checking whether an identical string already exists.
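
If you want to experiment with interning directly, the standard library exposes `sys.intern`. The sketch below builds two equal strings at runtime (so they are not compile-time literals) and then interns them. Whether the two objects are distinct before interning depends on the implementation, so no result is claimed for that line.

```python
import sys

# Build two equal strings at runtime rather than writing them as literals.
a = "".join(["Hello, ", "how are you?"])
b = "".join(["Hello, ", "how are you?"])

print(a == b)   # True: same contents
print(a is b)   # implementation dependent

# sys.intern returns the canonical (interned) copy of a string.
a = sys.intern(a)
b = sys.intern(b)
print(a is b)   # True: both names now refer to the interned object
```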

In general, integers, floats, lists, tuples, etc, will be stored at different locations, and therefore have different ids.

In [None]:
a = 5000
b = 5000
print(id(a),id(b))
print (a is b)
print (a == b)

In [None]:
a = (1,[2,3],4)
b = (1,[2,3],4)
print(id(a),id(b))
print (a is b)
print (a == b)

Also for optimization reasons, Python (more precisely, the CPython implementation) will <strong>not</strong> duplicate integers between -5 and 256. Instead, it keeps an array of integer objects for all integers in that range, so when you create an int between -5 and 256 you actually just get back a reference to the existing object.

In [None]:
a = 256
b = 256
print(id(a),id(b))
print (a is b)
print (a == b)

In [None]:
a = 257
b = 257
print(id(a),id(b))
print (a is b)
print (a == b)

http://foobarnbaz.com/2012/07/08/understanding-python-variables/

## Objects and Naming

Understanding objects and naming in Python can be difficult.  In this demo we'll follow a post [Is Python call-by-value or call-by-reference? Neither.](https://jeffknupp.com/blog/2012/11/13/is-python-callbyvalue-or-callbyreference-neither/).

First, let's make some objects.  *Everything in Python is an object*.

In [None]:
# A string is an object. Assigning it to a variable binds that name to the string object.
fruit = 'apple'

# Appending fruit to a list stores a reference to the object currently bound to fruit
lunch = []
lunch.append(fruit)

dinner = lunch
dinner.append('fish')

fruit = 'pear'

meals = [fruit, lunch, dinner]
print(meals)



Let's check the object ids for both lists

In [None]:
print(id(lunch))
print(id(dinner))

Notice what happens when we append to the list that is bound to both `lunch` and `dinner`:

In [None]:
dinner.append('pasta')
print(lunch, dinner)

In [None]:
lunch.append('carrots')
print(lunch,dinner)

## Mutable and Immutable

We've looked at mutable and immutable.  Tuples are an example of immutable objects.

In [None]:
fruits = ['apple', 'banana', 'orange']
veggies = ['carrot', 'broccoli']

food_tuple = (fruits, veggies)


fruits.append('plum')

print(fruits)

print(food_tuple)


## Indexing and Slicing

The `range` function lets us build a list of numbers.

In [None]:
list(range(1,10))

In [None]:
list(range(10, 20,1))

Notice anything funny?

Python uses this convention everywhere.

In [None]:
a = list(range(10, 20))
type(a)

Let's talk about indexing.

Indexing in Python starts at 0.

In [None]:
a[0]

And goes from there.

In [None]:
a[1]

In [None]:
a[2]

What do negative numbers do?

In [None]:
a[-1]

In [None]:
a[-2]

You can get a sub-list by *slicing*.

In [None]:
print(a)
print(a[3:7])

Start and end are optional.

In [None]:
a[3:]

In [None]:
a[:3]

Again, notice how the end entry is not included:

In [None]:
print(a[:3])
print(a[3])

Slicing works on any sequence type! (`list`, `tuple`, `str`, `numpy` array)

In [None]:
a = "CS357"
a[-3:]

In [None]:
#clear
a = [0,1,2,3,4,5,6,7,8,9]
a[1::2][::-1]
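
The last cell uses the optional third slice field, the *step*, in the form `a[start:stop:step]`. As a small sketch (the variable names are only for illustration), a negative step walks the sequence backwards, so the two chained slices can also be written as one:

```python
a = list(range(10))

every_other = a[::2]            # start at index 0, take every 2nd entry
odd_entries = a[1::2]           # start at index 1, take every 2nd entry
reversed_odds = a[1::2][::-1]   # the same entries, reversed
one_slice = a[-1::-2]           # start at the last entry, step backwards by 2

print(every_other)    # [0, 2, 4, 6, 8]
print(odd_entries)    # [1, 3, 5, 7, 9]
print(reversed_odds)  # [9, 7, 5, 3, 1]
print(one_slice)      # [9, 7, 5, 3, 1]
```

Note that `a[-1::-2]` matches `a[1::2][::-1]` here only because the list has an even number of entries.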

## Control Flow

`for` loops in Python always iterate over something list-like:

In [None]:
for i in range(3,10):

    print(i)
    
    


**Note** that Python does block-structuring by leading spaces.

Also note the trailing "`:`".

---
`if`/`else` are as you would expect them to be:

In [None]:
for i in range(10):
    if i % 3 == 0:
        print("{0} is divisible by 3".format(i))
    else:
        print("{0} is not divisible by 3".format(i))

In [None]:
print("My name is %s" % 'Luke')
print("My name is {}".format('Luke'))
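
Besides `%` formatting and `str.format`, Python 3.6 and later also offer formatted string literals (f-strings), which evaluate expressions written inside the braces. A short sketch, with made-up variable names:

```python
name = 'Luke'
age = 15

# Expressions inside {} are evaluated and converted to text.
msg = f"My name is {name} and next year I will be {age + 1}"
print(msg)

# A format specification follows the colon, as with str.format.
rounded = f"{3.14159:.2f}"
print(rounded)   # 3.14
```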

`while` loops exist too:

In [None]:
i = 0
while True:
    i += 1
    if i**3 + i**2 + i + 1 == 3616:
        break

print("SOLUTION:", i)

----
Building lists by hand can be a little long. For example, build a list of the squares of integers below 50 divisible by 7:

In [None]:
mylist = []

for i in range(50):

    if i % 7 == 0:

        mylist.append(i**2)

In [None]:
mylist

Python has a more compact construct for this, called a *list comprehension*:

In [None]:
mylist = [i**2 for i in range(50) if i % 7 == 0]
print(mylist)

## Enumerate

Suppose you have a single list and want both the index and the entry:

In [2]:
fruits = ['apples', 'bananas', 'oranges', 'grapes']
for i, f in enumerate(fruits):
    print("%d: the fruit is %s" % (i,f))

0: the fruit is apples
1: the fruit is bananas
2: the fruit is oranges
3: the fruit is grapes


## Zip (another helpful function)

In [1]:
ids = ['a', 'b', 'c', 'd']
fruits = ['apples', 'bananas', 'oranges', 'grapes']
for c in zip(ids, fruits):
    print(c)

('a', 'apples')
('b', 'bananas')
('c', 'oranges')
('d', 'grapes')
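
Each item produced by `zip` is a tuple, so it can be unpacked directly in the loop header or fed to `dict` to build a lookup table. A brief sketch (the names `fruit_by_id` and `short` are only illustrative):

```python
ids = ['a', 'b', 'c', 'd']
fruits = ['apples', 'bananas', 'oranges', 'grapes']

# Unpack each pair into two names.
for key, fruit in zip(ids, fruits):
    print(key, fruit)

# Build a dictionary from the pairs.
fruit_by_id = dict(zip(ids, fruits))
print(fruit_by_id['c'])   # oranges

# zip stops as soon as the shortest input is exhausted.
short = list(zip(ids, ['x', 'y']))
print(short)              # [('a', 'x'), ('b', 'y')]
```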


## Dictionaries

Dictionaries have *key*:*value* pairs of any Python object:

In [None]:
#mydict = {key: value}
mydict = {'Luke': 15,'Mariana' : 22}
print(mydict["Mariana"])

In [None]:
string = "Batman"
mydict = {key:ord(key) for key in string}

In [None]:
print(mydict)
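
A few more everyday dictionary operations, as a small sketch. The dictionary `ages` reuses the values from the earlier cell and is otherwise just an example.

```python
ages = {'Luke': 15, 'Mariana': 22}

# Assigning to a new key adds an entry.
ages['Tim'] = 30

# `in` tests for a key; .get returns a fallback instead of raising KeyError.
print('Tim' in ages)           # True
print(ages.get('Julia', 0))    # 0

# .items() yields (key, value) pairs, in insertion order on Python 3.7+.
for name, age in ages.items():
    print(name, age)
```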

## Function definitions


Functions help extract out common code blocks.

Let's define a function `print_greeting()`.

In [None]:
def print_greeting():
    print("Hi there, how are you?")
    print("Long time no see.")

And call it:

In [None]:
print_greeting()

That's a bit impersonal.

In [None]:
def print_greeting(name):

    print("Hi there, {0}, how are you?".format(name))

    print("Long time no see.")

In [None]:
print_greeting("Andreas")

In [None]:
print_greeting()

But we might not know their name. So we can set a default value for parameters.

(And we just changed the interface of `print_greeting`!)

In [None]:
def print_greeting(name="my friend"):

    print("Hi there, {0}, how are you?".format(name))

    print("Long time no see.")

In [None]:
print_greeting("Tim")

Note that the order of the arguments matters when they are passed by position, but not when they are passed by keyword:

In [None]:
def printinfo( name , age ):
    print("Name: ", name)
    print("Age: ", age)

printinfo(40,"Mariana")


In [None]:
printinfo( age=8, name="Julia" )

However, the parameters "age" and "name" are both required above. What if we want to accept a variable number of extra arguments (without setting default values)?

In [None]:
def printinfo( firstvar , *othervar ):
    print("First parameter:", firstvar)
    print("List of other parameters:")
    for var in othervar:
        print(var)

# Now you can call printinfo function
printinfo(10,20,30,50,60)
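
The same idea extends to keyword arguments: a parameter prefixed with `**` collects any extra keyword arguments into a dictionary. The function `describe_call` and its arguments below are hypothetical, chosen only to show how the values are collected.

```python
def describe_call(first, *others, **options):
    # `others` is a tuple of the extra positional arguments,
    # `options` is a dict of the extra keyword arguments.
    return first, others, options

first, others, options = describe_call(10, 20, 30, unit="cm", verbose=True)
print(first)     # 10
print(others)    # (20, 30)
print(options)   # {'unit': 'cm', 'verbose': True}
```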


## Packing and Unpacking

In [3]:
def give_me_fruits():
    return ['apple', 'banana', 'orange', 'plum']

In [4]:
*myfruits, _ = give_me_fruits()
print(myfruits)

['apple', 'banana', 'orange']


Above we used `_` to denote a return that we want to just ignore (and not bind to a name).

In [5]:
def some_fruits(fruit, *morefruits):
    print('The best fruit is %s' % fruit)
    for f in morefruits:
        print('...not %s' % f)

In [6]:
some_fruits('apple', 'banana', 'orange', 'plum')

The best fruit is apple
...not banana
...not orange
...not plum


A function can also return more than one value, and the results are returned as a tuple:

In [None]:
def average_total(a,b):
    totalsum = a + b
    average = totalsum/2
    diff = a - b
    return average,totalsum, diff

#We can unpack values returned from a function
ave,tot,diff = average_total(3,5)
print(ave, tot, diff)

#We can unpack values using the star mark (*) for variable length
ave, *othervar, diff = average_total(3,5)
print(ave, diff)

ave, *_ = average_total(3,5)
print(ave)

## Remember mutable and immutable types...

Function parameters work like variables. So what does this do?

In [None]:
def my_func(my_list):
    my_list.append(5)
    print("List printed inside the function: ",my_list)
    
numberlist = [1,2,3]
print("List before function call: ",numberlist)
my_func(numberlist)
print("List after function call: ",numberlist)

This can be very surprising! The function receives a reference to the object that was passed in, so appending inside the function changes that same object.

Define a better function `my_func_2`:

In [None]:
def my_func_2(my_list):
    
    my_list = my_list + [5]
    print("List printed inside the function: ",my_list)

    return my_list


numberlist = [1,2,3]
print("List before function call: ",numberlist)
new_list = my_func_2(numberlist)
#inside the function my_list = [1,2,3,5]
print("List after function call: ",numberlist)
print("Modified list after function call: ",new_list)

Note that the parameter `my_list` is local to the function.

In [None]:
def change_fruits(fruit):
    fruit='apple'
    print("I'm changing the fruit to %s" % fruit)

In [None]:
myfruit = 'banana'
change_fruits(myfruit)
print("The fruit is %s " % myfruit)

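**What happened?!** Remember that the parameter `fruit` of `change_fruits` is a local name bound to the object that was passed in:
  * If the function modifies a mutable object in place (like `append` on a list), the caller sees the change.
  * If the function assigns a new value to the name (as with `fruit='apple'`), only the local name is rebound and the caller's object is untouched. An immutable object (like a string!) can only be "changed" this way.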
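
A side-by-side sketch may help here. The functions `rebind` and `mutate` and the list `basket` are hypothetical names used only for this illustration.

```python
def rebind(items):
    # Assignment binds the local name to a new object;
    # the caller's list is not affected.
    items = ['apple']

def mutate(items):
    # append modifies the object that the caller also refers to.
    items.append('apple')

basket = ['banana']

rebind(basket)
after_rebind = list(basket)
print(after_rebind)   # ['banana']

mutate(basket)
after_mutate = list(basket)
print(after_mutate)   # ['banana', 'apple']
```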

## Try to answer before running the code snippet

In [1]:
def add_minor(person):
    person.append('math')

def switch_majors(person):
    person = ['physics']
    person.append('economics')
    return(person)


In [None]:
# What are the values of Tim and John?
John = ['computer_science']
Tim = John
print(Tim,John)

In [None]:
# What are the values of Tim and John?
add_minor(Tim)
print(Tim,John)

In [None]:
# What are the values of Tim and John?
switch_majors(John)
print(Tim,John)

# Numpy Introduction:
---

## A Difference in Speed

Let's import the `numpy` module.

In [2]:
import numpy as np

In [None]:
n = 5  # CHANGE ME
a1 = list(range(n)) # python list
a2 = np.arange(n)   # numpy array

if n <= 10:
    print(a1)
    print(a2)

In [None]:
%timeit [i**2 for i in a1]

In [None]:
%timeit a2**2

Numpy Arrays: much less flexible, but:

* much faster
* less memory

## Creating a numpy array

* Casting from a list

In [None]:
a = np.array([1,2,3,5])
print(a)
print(a.dtype)

In [None]:
b = np.array([1.0,2.0,3.0])
print(b)
print(b.dtype)

But also notice that:

In [None]:
c = np.array([1,2,3])
print(c)
print(c.dtype)

d = np.array([1,2.,3])
print(d)
print(d.dtype)

* `linspace`
  * `np.linspace(start, stop, num=50, ...)`
  * `num` is the number of sample points

In [None]:
np.linspace(-1, 1, 9)

* `zeros`

In [None]:
np.zeros((10,10), np.float64)

2D arrays can be created with `zeros`, with `reshape`, or from a nested list.
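
A minimal sketch of those three routes, each producing a $2 \times 3$ array (the names `Z`, `R` and `L` are only illustrative):

```python
import numpy as np

# From zeros: pass the shape as a tuple.
Z = np.zeros((2, 3))

# From a 1D array: reshape it into 2 rows and 3 columns.
R = np.arange(6).reshape(2, 3)

# From a nested list: each inner list becomes one row.
L = np.array([[0, 1, 2], [3, 4, 5]])

print(Z.shape, R.shape, L.shape)   # (2, 3) (2, 3) (2, 3)
print(np.array_equal(R, L))        # True
```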

## Operations on arrays

These propagate to all elements:

In [None]:
a = np.array([1.2, 3, 4])
b = np.array([0.5, 0, 1])

Addition, multiplication, power ... are all elementwise:

In [None]:
a+b

In [None]:
a*b

In [None]:
a**b

## Important Attributes

Numpy arrays have two (most) important attributes:

In [None]:
A = np.random.rand(5, 4, 3)
A.shape

The `.shape` attribute contains the dimensions of the array as a tuple. So the tuple `(5,4,3)` means that we're dealing with a three-dimensional array of size $5 \times 4 \times 3$.

(`numpy.random.rand` just generates an array of random numbers of the given shape.)

In [None]:
A.dtype

Other `dtype`s include `np.complex64`, `np.int32`, ...
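
The `dtype` can be requested when the array is created, or changed afterwards with `astype`, which returns a converted copy. A short sketch with made-up variable names:

```python
import numpy as np

# Request a dtype at creation time.
x = np.array([1, 2, 3], dtype=np.int32)
print(x.dtype)   # int32

# astype returns a new array with the requested dtype.
y = x.astype(np.float64)
print(y.dtype)   # float64

# Converting floats to integers drops the fractional part.
z = np.array([1.7, -2.2]).astype(np.int32)
print(z)         # [ 1 -2]
```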

## Try to answer before running the code snippet

In [None]:
a = np.array([2,2,3])
b = np.array([1,3,2])

# What is the result of the following computation?
c = (a+b)*a**b

In [None]:
print(c)

## 1D arrays

In [None]:
a = np.random.rand(5)
a.shape

In [None]:
a = np.array([2,3,5])
print(a)
print(a.shape)

## 2D arrays

In [None]:
a = np.array([[2],[3],[5]])
print(a)
print(a.shape)

a = np.array([[2,3,5]])
print(a)
print(a.shape)

We can change 1D numpy arrays into 2D numpy arrays using the function `reshape`

In [None]:
a = np.array([2,3,5]).reshape(3,1)
print(a)
print(a.shape)

In [None]:
a = np.array([2,3,5]).reshape(1,3)
print(a)
print(a.shape)

In [None]:
print(np.arange(1,10))
B = np.arange(1,10).reshape(3,3)
print(B)

## Transpose

In [None]:
print(B)

In [None]:
print(B.transpose())
print(B)

In [None]:
print(B.swapaxes(0,1))
print(B)

In [None]:
print(B.T)
print(B)

In [None]:
C = np.transpose(B)
print(C)

What happens when we try to take the transpose of a 1D array?

In [None]:
a = np.array([2,3,5])
print(a.T)

But it works with 2D arrays

In [None]:
a = np.array([2,3,5]).reshape(3,1)
print(a)
print(a.T)
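
Transposing a 1D array returns an array of the same shape, so to get a column vector the array needs a second axis first. One possible sketch uses `np.newaxis` or `reshape(-1, 1)`, where `-1` asks numpy to infer that dimension. The names `col`, `col2` and `row` are only illustrative.

```python
import numpy as np

a = np.array([2, 3, 5])
print(a.T.shape)           # (3,): the transpose of a 1D array is still 1D

col = a[:, np.newaxis]     # add a trailing axis -> column vector
col2 = a.reshape(-1, 1)    # same result; -1 lets numpy infer the length
row = a[np.newaxis, :]     # add a leading axis -> row vector

print(col.shape)           # (3, 1)
print(row.shape)           # (1, 3)
print(col.T.shape)         # (1, 3): transposing now swaps the two axes
```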

## Inner and outer products

Matrix multiplication is `np.dot(A, B)` for two 2D arrays.

In [None]:
A = np.random.rand(3, 2)
B = np.random.rand(2, 4)
C = np.dot(A,B)
print(C.shape)

b = np.array([5,6])

d = np.dot(A,b)
print(d.shape)

In [None]:
A = np.array([[1,3],[2,4]])
B = np.array([[2,1],[3,2]])
print(np.dot(A,B))
print(A@B)

In [None]:
a = np.array([1,2,3])
b = np.array([5,6,7])
#Inner Product
print(np.dot(a,b))
print(np.inner(a,b))

In [None]:
#Outer Product C[i,j] = a[i]*b[j]
C = np.outer(a,b)
print(np.shape(C))
print(C)

## Try to answer before running the code snippet

In [3]:
#clear
A = np.array([[2,1],[1,3]])
B = np.array([[1,3],[2,2]])
c = np.array([[1,2]])
d = np.array([[1],[2]])
c = np.array([1,2])

# What is the output of A@B?

In [4]:
print(np.dot(A,B))
print(A@B)

[[4 8]
 [7 9]]
[[4 8]
 [7 9]]


In [5]:
print(np.dot(A,d))
print(A@c)

[[4]
 [7]]
[4 7]


# numpy: Indexing

In [None]:
A = np.array([[1, 4, 9], [2, 8, 18]])
print(A)

In [None]:
A[1,2]

What's the result of this?

In [None]:
A[:,1]

And this?

In [None]:
A[1:,:1]

One more:

In [None]:
A[:,[0,2]]

## Try to answer before running the code snippet

In [6]:
myList = [1,2,3,4,5,6.5]
print(type(myList[0]))

<class 'int'>


In [9]:
vec1 = np.array(myList)
# What is the type of vec1[0]?

In [10]:
print(type(vec1[0]))

<class 'numpy.float64'>


In [None]:
#clear
A = np.array([[1, 4, 9, 3], [2, 8, 3, 18], [4, 5, 1, 2], [6, 4, 6, 3]])
print(A)

In [None]:
a = np.random.rand(3,4,2)
a.shape

In [None]:
a[...,1].shape

---

Indexing into numpy arrays usually results in a so-called *view*.

In [None]:
a = np.zeros((4,4))

Let's call `b` the top-left $2\times 2$ submatrix.

In [None]:
b = a[:2,:2]

What happens if we change `b`?

In [None]:
b[1,0] = 5

In [None]:
a

To decouple `b` from `a`, use `.copy()`.

In [None]:
b = b.copy()
b[1,1] = 7
print(b)
print(a)
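
To check whether two arrays share data, `np.shares_memory` can be used. The sketch below (variable names are illustrative) also shows why the text says *usually*: indexing with a list of integers returns a copy rather than a view.

```python
import numpy as np

a = np.zeros((4, 4))

view = a[:2, :2]            # basic slicing: a view into a
copied = a[:2, :2].copy()   # explicit copy: independent data
fancy = a[[0, 1]]           # integer-list indexing: also a copy

print(np.shares_memory(a, view))     # True
print(np.shares_memory(a, copied))   # False
print(np.shares_memory(a, fancy))    # False

# Writing through the view changes a, but not the copies.
view[0, 0] = 1.0
print(a[0, 0], copied[0, 0], fancy[0, 0])   # 1.0 0.0 0.0
```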

## Try to answer before running the code snippet

In [None]:
#clear
A = np.array([[1, 4, 9, 3], [2, 8, 3, 18], [4, 5, 1, 2], [6, 4, 6, 3]])
b = A[:,2]
print(A)
d = b
print(np.shape(d))
d[2] = 2
print(A)

---

You can also index with boolean arrays:

In [None]:
a = np.random.rand(4,4)

In [None]:
a

In [None]:
a_big = a>0.5
a_big

In [None]:
a[a_big]

Boolean indexing also works on each axis individually:

In [None]:
a_row_sel = [True, True, False, True]

In [None]:
a[a_row_sel,:]

---

And with index arrays:

In [None]:
a

In [None]:
x,y = np.nonzero(a > 0.5)

In [None]:
x

In [None]:
y

In [None]:
a[(x,y)]

# numpy: Broadcasting

In [None]:
import numpy as np

In [None]:
a = np.arange(9).reshape(3,3)
print(a.shape)
print(a)

In [None]:
b = np.arange(4, 4+9).reshape(3, 3)
print(b.shape)
print(b)

In [None]:
a+b

So this is easy and one-to-one.


---

What if the shapes do not match?

In [None]:
a = np.arange(9).reshape(3, 3)
print(a.shape)
print(a)

In [None]:
b = np.arange(3)
print(b.shape)
print(b)

What will this do?

In [None]:
a+b

It has *broadcast* along the last axis!

---

Can we broadcast along the *first* axis?

In [None]:
a

In [None]:
c = b.reshape(3, 1)
c

In [None]:
print(a.shape)
print(c.shape)

In [None]:
a+c

Rules:

* Shapes are matched axis-by-axis from last to first.
* A length-1 axis can be *broadcast* if necessary.
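
These rules can be checked by looking at result shapes. In the sketch below (the names `row`, `col` and `bad` are only illustrative), a mismatch that cannot be fixed by stretching a length-1 axis raises a `ValueError`.

```python
import numpy as np

a = np.arange(9).reshape(3, 3)
row = np.arange(3)                 # shape (3,)
col = np.arange(3).reshape(3, 1)   # shape (3, 1)
bad = np.arange(2)                 # shape (2,)

print((a + row).shape)     # (3, 3): (3,) is matched against the last axis
print((a + col).shape)     # (3, 3): the length-1 axis is stretched
print((row + col).shape)   # (3, 3): both operands are stretched

try:
    a + bad                # last axes are 3 and 2: neither is 1
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)     # True
```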

## Try to answer before running the code snippet

In [None]:
#clear
A = np.array([[2,4],[1,3]])
B = np.array([[1,3],[2,2]])
c = np.array([1,2])
print(np.shape(c))

In [None]:
print(B + np.dot(A,c))

In [None]:
#clear
# broadcasting along both axes:
a = np.arange(1,5)
b = np.arange(3,7)

d = b*a.reshape((4,1))
print(d)
print(np.shape(d))

# note this is the same as the outer product
print(np.outer(a,b))


## Vectorized computations

In [5]:
def func1(u, v, w):
    n = len(u)
    for i in range(n):
        w[i] = u[i] + 15.2 * v[i]
    
def func2(u, v, w):
    w = u + 15.2 * v


In [6]:
n = 10**7
u = np.random.rand(n)
v = np.random.rand(n)
w = np.zeros(n)


%timeit func1(u,v,w)
%timeit func2(u,v,w)

3.83 s ± 135 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
32.5 ms ± 395 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
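
One caveat ties back to the earlier discussion of names and objects: `func1` writes into the caller's array, while `func2` as written only rebinds its local name `w`, so the caller's `w` is left unchanged. A possible in-place variant is sketched below. The name `func2_inplace` is made up here, and no timing is claimed for it.

```python
import numpy as np

def func2(u, v, w):
    # As in the cell above: this rebinds the local name only.
    w = u + 15.2 * v

def func2_inplace(u, v, w):
    # Slice assignment copies the result into the caller's array.
    w[:] = u + 15.2 * v

n = 1000
u = np.random.rand(n)
v = np.random.rand(n)
w = np.zeros(n)

func2(u, v, w)
unchanged = not w.any()   # True if w still contains only zeros
print(unchanged)

func2_inplace(u, v, w)
print(np.allclose(w, u + 15.2 * v))
```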
