Chapters 3 and 4: Programming in Python
=========

**Note:** This chapter is very important, especially if you have no previous coding experience. I suggest familiarizing yourself with writing code as soon as possible, since this will make your life a lot easier when it comes to homework and labs (and even exams). I will include some practice exercises at the end of the notebook for you to work on in your own time.

**Note 2:** I will try to take a different approach to Python than the textbook does, since just repeating the textbook material won't be of much help.

<h2> Table of Contents </h2>

* **<a href="#basics">Python Basics</a>**
    * <a href="#calc">Simple Calculator</a>
    * <a href="#bnum">Beyond numbers</a>
    * <a href="#data">Basic Data types</a>
    * <a href="#basic_exe">Python Basics Exercises</a>
* **<a href="#advanced">Advanced Programming</a>** 
    * <a href="#pacs">Packages and built-in functions</a>
    * <a href="#arrays">Arrays</a>
    * <a href="#ranges">Ranges</a>
    * <a href="#advanced_exe">Advanced Programming Exercises</a>

<a id="basics"></a>
<h2> Python Basics </h2>

**Python comments**

These are parts of your code that the Python interpreter ignores. You can add them to explain what you are doing.

For example:

In [1]:
# This is a one line comment

<a id="calc"></a>
**Simple calculator**

In [2]:
#In its simplest form, you can think of Python as a fancy calculator
#Much like any calculator, you can type in an expression in a cell and press shift+Enter to run the cell
#e.g.
1+2

3

In [3]:
#Note: the notebook displays the value of the last expression in the cell
1+2 #You can't see me
2+2 #I am the last expression so you can see me

4

In [4]:
# Operations in Python
# Addition
1+2 

3

In [5]:
# Subtraction
2-1

1

In [6]:
# Multiplication
2*3

6

In [7]:
# Division
1/2

0.5

In [8]:
# Exponentiation
2**3 # base**exponent, 2 raised to the 3rd power

8

In [9]:
# Remainder
5%2 # The % sign gives the remainder of the Euclidean division of 5 by 2

1

In [10]:
# Order of operations: These are some rules that dictate which operation happens first
# This is no different from what you learned in high school algebra
1 + 2**2 / 2 * 5 # First, exponentiation takes place, then division and multiplication and finally addition and subtraction

11.0

In [11]:
# If we want to give priority to some subexpressions, we use ()
(1+2**2)/(2*5) #Everything inside parentheses will be evaluated first and then we follow the rules of the order of operations

0.5
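
To make the order of operations concrete, here is a small sketch that evaluates the expression from cell 10 one step at a time. The names `step1` through `step4` are only illustrative.

```python
# Evaluate 1 + 2**2 / 2 * 5 one step at a time, in the order Python uses.
step1 = 2**2        # exponentiation comes first: 4
step2 = step1 / 2   # division and multiplication go left to right: 2.0
step3 = step2 * 5   # 10.0
step4 = 1 + step3   # addition comes last: 11.0

print(step4)
print(step4 == 1 + 2**2 / 2 * 5)  # the step-by-step result matches the one-line expression
```

Note that the result is a float (`11.0`) because the expression contains a division.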

<a id="bnum"></a>
**Beyond numbers**

* **_Variables_**

In [12]:
#A variable is a way to "save" a particular value under a name and then use that name just as we would use the value itself
#try to keep your variable names simple and as indicative of the variable content as possible
number1 = 1 #assign the value 1 to the variable number1
number2= 2
result = number1+number2
result

3

<a id="data"></a>
* **_Basic Data types_**

|       Type       |                     Example                    | Description                                                                                                                                                                                                                                                                                                               |
|:----------------:|:----------------------------------------------:|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Integer (`int`)  | `age = 19` | Integers are whole numbers, i.e. numbers without a decimal point. Remember that `5.0` is NOT an integer, but `5` is. |
| Float (`float`)  | `macs_per_student = 1/3` OR `one_half = 0.5` | Floats are numbers with a decimal point. You can use them in operations just like integers. As we saw above, an operation between integers (such as division) can give you a float. |
| String (`str`)  | `student_name = 'Tom'` OR `student_name="Tom"` | Strings denote regular text. There are two ways to write a string: with single quotes `'` or with double quotes `"`, as in the example. Beware that, in general, you cannot use the operations we used on integers on strings (although there are some exceptions we will discuss later). |
| Boolean (`bool`) | `1 > 2` OR `True` | Booleans are the result of a logical expression. You can think of them as the answer to a true-or-false question. A boolean is either `True` or `False`, whether written directly or produced by an expression that evaluates to one of those. |
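
If you are ever unsure which data type a value has, the built-in function `type()` will tell you. The sketch below reuses the example values from the table; `is_adult` is a hypothetical variable added for illustration.

```python
age = 19                 # int
one_half = 0.5           # float
student_name = "Tom"     # str
is_adult = age >= 18     # bool (hypothetical variable, the result of a comparison)

print(type(age))
print(type(one_half))
print(type(student_name))
print(type(is_adult))

# Division with / gives a float even when both numbers are integers
print(type(4 / 2))
```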

In [13]:
# Strings 
name = "Vasilis" # same as name='Vasilis'

#Concatenating two strings into one using the + operator
greeting = "Hello " + name + "! It is nice to meet you."
print(greeting) #Prints text or even any variable

#This ONLY works if you concatenate two strings. If you attempt to concatenate a string with a number this won't work
# You would have to turn the number into a string and then concatenate the two
age = 19
print("I am "+str(age)+" years old.")

Hello Vasilis! It is nice to meet you.
I am 19 years old.
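
As an alternative to converting with `str()` and concatenating, Python also has formatted string literals (f-strings). This is only a sketch of the alternative. The rest of this notebook keeps using `str()`.

```python
age = 19

# Concatenation: the number must be converted with str() first
concatenated = "I am " + str(age) + " years old."

# f-string: anything inside {} is converted to text automatically
formatted = f"I am {age} years old."

print(concatenated)
print(formatted)
print(concatenated == formatted)  # both approaches build the same string
```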


In [14]:
# Booleans
# Booleans usually arise when we use what we call comparison operators
# The most important ones are ==, >, <, >= and <=
test1 = 1>=1 #greater than or equal
print(test1)
test2 =  2==2 #equal
print(test2)
test3 = 3<2 # smaller than
print(test3)
#You can also assign a boolean to be True or False
true = True
false = False
print(true==false)

True
True
False
False


In [15]:
# Bonus Booleans
#You can also compare strings using operators like ==, < or > just as with numbers (alphabetical order)
print("cat" < "dog") # True
print("Cat"<"cat") # True: capital letters come before lowercase ones
print("kitty"<"kite") # False

True
True
False
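
The reason capital letters come first is that Python compares strings character by character using each character's numeric code, which the built-in `ord()` reveals. The sketch below is illustrative. Lowercasing both strings with `.lower()` is one possible way to get a case-insensitive comparison.

```python
# Every character has a numeric code; capitals have smaller codes than lowercase letters
code_upper = ord("C")
code_lower = ord("c")
print(code_upper, code_lower)

# "kitty" vs "kite": the first three letters match, so the 4th letters decide ("t" vs "e")
print(ord("t") > ord("e"))

# Because of the capital D, "Dog" sorts before "cat"
case_sensitive = "Dog" < "cat"
# Lowercasing both sides first gives the usual alphabetical answer
case_insensitive = "Dog".lower() < "cat".lower()
print(case_sensitive, case_insensitive)
```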


<a id="basic_exe"></a>
**Python Basics Exercises**

1) My brother is X years old where X is selected at random from a list of ages. Given X, calculate the year in which he will turn 100.

In [27]:
import numpy as np # Numpy is a very important package that we will be using in the class. Ignore that statement for now
age = np.random.choice(np.arange(1,20)) #randomly selects an age between 1 and 19 (ignore for now)
year =  ? #Replace the ? with your answer
print("My brother is "+str(age)+" years old.")
print("He will turn 100 in "+str(year))

My brother is 7 years old.
He will turn 100 in 2110


2) In the context of Euclidean division, we have the dividend, the divisor, the quotient and the remainder (look them up if you don't remember what they are). If I give you the dividend and the divisor, can you give me the quotient and the remainder of the Euclidean division? Print your results. Your output should look like this:

Divisor: X

Dividend: Y

Quotient: Z

Remainder: A

*Hint 1:* You may want to take a look at what the function `int()` does when you call it with a float number, e.g. `int(1/3)`.

*Hint 2:* Remember that the function `str()` is very useful when trying to print integers.

In [40]:
# Randomly select a dividend and a divisor
divident = np.random.choice(np.arange(1, 100)) 
divisor = np.random.choice(np.arange(1, 6))

3) Find if a number is odd or even. Here, we will have to use an if statement. I encourage you to look up how the if statement is used. Replace the `<condition>` with a boolean expression (e.g. `1==2` or `1>2`).

In [46]:
number = np.random.choice(np.arange(1,1000))
if <condition>:
    print(str(number) + " is odd.")
else:
    print(str(number) + " is even.")

941 is odd.


4) Given the expressions, add the necessary parentheses to get the desired results. When you are done, all the print statements should print True.

In [47]:
#example
print(1+2/3==1)
#Should be changed to
print((1+2)/3==1)

False
True


In [62]:
print(1+2/3+4-5*6+7+8+9/10 == -10)
print(2**3+4-(-1)**3*24-4/4+2 == 12)

False
False


In [21]:
#SOLUTIONS
#1
year = 2017 - age + 100

#2
print("Divisor: "+str(divisor))
print("Dividend: "+str(divident))
quotient = int(divident/divisor) #int() drops the decimal part of the division
remainder = int(divident%divisor)
print("Quotient: "+str(quotient))
print("Remainder: "+str(remainder))

#3
if number%2==1:
    print(str(number) + " is odd.")
else:
    print(str(number) + " is even.")

#4
print((1+2)/3+4-5*(6+7+8+9)/10 == -10)
print(2**3+(4-(-1)**3*24-4)/(4+2) == 12)

<a id="advanced"></a>
<h2> Advanced Programming</h2>

<a id="pacs"></a>
**Packages and built-in functions**

_Built-in Functions_

A `function` in programming is a procedure that takes in some input (usually referred to as an argument) and returns a result from processing that input. In Python and other programming languages, there are two kinds of functions: those that already exist and are preloaded in the language (built-in functions) and those that users write themselves (user-defined functions). We will take a look at some useful built-in functions, and later on in the course we will learn to write our own.
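
To illustrate the difference, here is a minimal sketch. `abs` is a real built-in function, while `double` is a hypothetical user-defined function written only for this example. You do not need to understand the `def` syntax yet.

```python
# A user-defined function (hypothetical example): takes one argument and returns twice its value
def double(x):
    return 2 * x

builtin_result = abs(-3)   # built-in: available without defining or importing anything
user_result = double(4)    # user-defined: available only after the def above has run

print(builtin_result, user_result)
```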

_Packages_

From now on we will be working quite a bit with what we will refer to as packages. The `datascience` and `numpy` packages are just two examples of packages that you will be using extensively in this class. You can think of them as collections of `functions`: pieces of code that, given some input, automatically perform some operation. At the beginning of every notebook you will typically find a cell with code that looks like this

`import numpy as np`

`from datascience import *`

`import matplotlib.pyplot as plt`

`import math`

Although you don't really need to know how these statements work, it is good to understand why they are there. We talked about built-in functions and briefly mentioned user-defined functions. However, there are also functions that other people wrote and that are made available through packages such as numpy, math and others. In order to use those, we have to refer to them not just by their name but also by their package. So you can use a package function with `<package_name>.<function_name>` after you do `import <package_name>` or, alternatively, you can directly import that function from the package and refer to it by its name. For example, I can do `from <package_name> import <function_name>` and then call the function directly with `<function_name>`.

In [77]:
#Built-in functions
#These are examples of some built-in functions that you could find yourself using in the class

#abs
abs1 = abs(-6) #take the absolute value of a number (float or integer)
abs2 = abs(1.329)
abs1, abs2

(6, 1.329)

In [83]:
#round
#The round function is used to round float numbers. It takes up to two arguments (input variables).
#The first one is the number we want to round and the second (optional) one is the number of decimals we want to keep
# e.g. round(<some float>, 3) rounds the float to 3 decimal places
round_to_integer = round(1.987) # with no second argument, the result is an integer
round_to_first_decimal = round(1.8633, 1)
round_to_third_decimal = round(1.9873737363, 3)
round_to_integer, round_to_first_decimal, round_to_third_decimal

(2, 1.9, 1.987)
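
Two details of `round` are easy to miss, so here is a short sketch of them. First, a number exactly halfway between two integers is rounded to the nearest even integer. Second, `round(x)` returns an integer while `round(x, 0)` returns a float. The variable names are only illustrative.

```python
# Halfway cases go to the nearest EVEN integer
half_cases = (round(0.5), round(1.5), round(2.5))
print(half_cases)

# With no second argument the result is an int; with 0 decimals it is a float
no_decimals_arg = round(1.987)
zero_decimals = round(1.987, 0)
print(no_decimals_arg, type(no_decimals_arg))
print(zero_decimals, type(zero_decimals))
```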

In [96]:
#int
#The function int() only keeps the integer part of a float number. It should not be confused with the round function!
# Note the difference
print(int(1.65))
print(round(1.65))
#int does not perform any rounding. It just keeps the integer part.
#more examples
print(int(1))
print(int(1.00000001))
print(int(1.99999999))

#Another use of int() is for converting a number stored as a string into an integer
five = '5' # five+1 will cause an error
five = int(five)
five+1

1
2
1
1
1


6
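
"Keeping the integer part" means that `int()` cuts towards zero, which matters for negative numbers. The sketch below compares it with `math.floor`, which always rounds down. `math.floor` is a standard-library function that is not otherwise used in this notebook.

```python
import math

# For positive numbers, int() and math.floor() agree
positive = (int(1.65), math.floor(1.65))
# For negative numbers they differ: int() drops the decimals, floor() rounds down
negative = (int(-1.65), math.floor(-1.65))

print(positive)
print(negative)
```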

In [89]:
#str
# As we have already seen, str() converts everything to a string. In our case, it is useful mostly when you are trying 
#to incorporate some numeric result to a string for printing.
month = "January"
day = 16
year = 2017
print("The date is "+month+" "+str(day)+", "+str(year))

The date is January 16, 2017


In [92]:
# max and min
max_of_2 = max(1,2)
max_of_3 = max(1,2,3)
min_of_2 = min(1,2)
min_of_3 = min(1,2,3)
#Note: We also use min and max with 
max_of_2, max_of_3, min_of_2, min_of_3

(2, 3, 1, 1)

As you can imagine this is not an exhaustive list of all the built-in functions in Python but these are good for now. We will learn about others later.

In [97]:
#importing functions from packages
#Let's use some functions from the math package
import math

math.log(1)

0.0

In [98]:
#An equivalent way of writing the above would be 
from math import log

log(1)

0.0
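
A third form, `import <package_name> as <alias>`, gives the package a shorter name. This is what `import numpy as np` does. The sketch below applies the same idea to the `math` package; the alias `m` is an arbitrary choice for illustration.

```python
import math as m           # "m" is now a short name for the math package
from math import sqrt      # sqrt can now be called directly

via_alias = m.sqrt(16)     # <alias>.<function_name>
direct = sqrt(16)          # <function_name> on its own

print(via_alias, direct)
print(m.log(1))            # same function as math.log above, reached through the alias
```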

<a id="arrays"></a>
**Arrays**

Although there are other types of sequences, like lists and tuples, the only sequence we will be using in this class is the numpy array. We refer to it as the numpy array because it is the sequence type provided by the `numpy` package. In this class we create arrays with the `make_array` function, which comes from the `datascience` package (`from datascience import *`), and assign the result to a variable. For example, `my_array = make_array(1,2,3)`. Now let's explore the properties of arrays.

In [99]:
from datascience import * #ignore this line

In [110]:
my_array = make_array(1,2,3) #This is how I make an array with elements 1,2 and 3. They don't have to be integers.
names = make_array('Tom', 'Tonny', 'Trevor') #In fact an array could contain strings as well

In [134]:
#The reason we use arrays over other types of sequences is that they make it really easy to perform operations on them
#P.S. You can also think (and use) numpy arrays like vectors in linear algebra.
array1 = make_array(1,2,3)
array2 = make_array(3,2,1)
twos = make_array(2,2,2)

#In fact we can add arrays! (element-wise addition)
print(array1+array2)

# Or subtract them (element-wise subtraction)
print(array1-array2)

# Or even raise one array to the power of another!
print(array1**twos) # Note that each element is raised to the second power

# Compare all elements in an array with a number - returns an array of booleans
print(array1==1) #Asks the question to each element: Are you equal to 1
print(array1<3) # Asks the question to each element: Are you less than 3

#We can also do operations between arrays and numbers for example:
result1 = array1+1 #Adds 1 to every element in the array
result2 = array1*2 #Multiplies every element in the array by 2
print(result1)
print(result2)

[4 4 4]
[-2  0  2]
[1 4 9]
[ True False False]
[ True  True False]
[2 3 4]
[2 4 6]
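
For contrast, here is a sketch of what happens with plain Python lists, one of the other sequence types mentioned above. With lists, `+` glues the two sequences together instead of adding element by element, so element-wise addition has to be written out by hand. The names are illustrative.

```python
list1 = [1, 2, 3]
list2 = [3, 2, 1]

# With lists, + concatenates instead of adding element by element
concatenated = list1 + list2
print(concatenated)

# Element-wise addition has to be spelled out pair by pair
elementwise = [x + y for x, y in zip(list1, list2)]
print(elementwise)
```

With arrays, `array1+array2` does the element-wise version directly, which is why they are more convenient for numerical work.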


In [146]:
#Array indexing - A way of selecting an element from an array
a = make_array(4,2,1,6,3)
#How do I extract the first element from array a?
first_element = a.item(0)
second_element = a.item(1)
print(a)
print("First element is: "+str(first_element))
print("Second element is: "+str(second_element))
#The reason we use .item(0) for the first element and not .item(1) is simply a Python convention.
#We say that Python is a 0-indexed language. Other languages have their first element in position 1.

[4 2 1 6 3]
First element is: 4
Second element is: 2
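
You will also see square brackets used for indexing in Python code. The sketch below uses a plain list so that it runs without any packages. Numpy arrays accept the same bracket syntax, but this notebook sticks to `.item()`.

```python
values = [4, 2, 1, 6, 3]

first = values[0]        # position 0 is the first element
second = values[1]
last = values[-1]        # negative positions count from the end
count = len(values)      # number of elements, so the last valid position is count - 1

print(first, second, last, count)
```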


In [None]:
# Operations on numpy arrays
# Numpy contains many functions which we can use on arrays. Here are some of them along with some examples.
import numpy as np # instead of doing numpy.<function_name> we want to do np.<function_name> to save time

In [128]:
#np.sum - Sums all elements in an array.
ar = make_array(1,2,3,4)
total = np.sum(ar) # 1+2+3+4
total

10

In [129]:
#np.prod - Takes the product of all elements in an array.
ar = make_array(1,2,3,4)
total = np.prod(ar) # 1*2*3*4
total

24

In [130]:
#np.mean - Takes the mean (average) of all the elements in an array
scores = make_array(97, 85, 73, 90, 88) #Midterm scores corresponding to a section
section_average = np.mean(scores)
section_average

86.599999999999994

In [131]:
#np.diff - Returns an array of the differences between back to back elements in an array
violent_crimes = make_array(4, 8, 11, 5, 7, 3, 1, 0, 10, 13, 6, 9) # number of violent crimes per month in the city of Berkeley
monthly_change = np.diff(violent_crimes) # make_array(8-4, 11-8, 5-11, ...)
monthly_change #monthly_change is the change in the number of crimes in consecutive months

array([ 4,  3, -6,  2, -4, -2, -1, 10,  3, -7,  3])
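
To see what `np.diff` computes, here is a hand-written equivalent on a plain list, using only the first four months of the data above. This is a sketch for understanding, not a replacement for `np.diff`.

```python
crimes = [4, 8, 11, 5]   # first four months from the example above

# Pair each element with the one after it, then subtract earlier from later
# crimes[1:] is the same list without its first element
changes = [later - earlier for earlier, later in zip(crimes, crimes[1:])]

print(changes)
print(len(crimes), len(changes))  # the result is one element shorter than the input
```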

In [132]:
#np.count_nonzero - To be used with an array of booleans as input. Counts the number of True in the array
dummy_array = make_array(1,4,2,5,3,4,5,8,2,6)
#find how many values in this array are less than 5
less_than_five = dummy_array<5
n_less_than_5 = np.count_nonzero(less_than_five)
print("Number of items in the array that are less than 5 is: " +str(n_less_than_5))

Number of items in the array that are less than 5 is: 6
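
Counting the non-zero values counts the `True` values because Python treats `True` as 1 and `False` as 0 in arithmetic. A short sketch with a plain list of booleans (the names are illustrative):

```python
flags = [True, False, True, True]

as_numbers = (int(True), int(False))   # True behaves like 1, False like 0
true_count = sum(flags)                # adding the booleans therefore counts the True values

print(as_numbers)
print(true_count)
```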


For more numpy functions check out the course textbook <a href="https://www.inferentialthinking.com/chapters/04/4/arrays.html">here</a>.

<a id="ranges"></a>
**Ranges**

_Textbook definition:_ A range is an array of numbers in increasing or decreasing order, each separated by a regular interval. 

To construct a range, we use numpy's `np.arange` function.

The structure of the expression is `np.arange(<start>, <end>, <step>)`. (When the step is 1 we can omit it and use `np.arange(<start>, <end>)`.)

Example: `np.arange(1, 11)` (same as `np.arange(1, 11, 1)`) gives an array of the numbers 1 through 10 in order
(1,2,3,4,5,6,7,8,9,10)

start: The first element of the range.

end: Where the range stops. **Important:** The end value itself is **NOT** included in the range.

step: By how much each element differs from the previous one (a negative step gives a decreasing range).

**Note:** You can also omit the start and step and provide only the end, e.g. `np.arange(5)`, which assumes start=0, step=1 and end=5.
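
Python's built-in `range` follows the same start/end/step rules for whole numbers, so it is a handy way to check your understanding without numpy. The sketch below wraps each `range` in `list()` to show its elements. The variable names are illustrative.

```python
one_to_ten = list(range(1, 11))      # start=1, end=11 (excluded), step=1
first_five = list(range(5))          # only the end given: start=0, step=1
countdown = list(range(10, 0, -2))   # a negative step gives a decreasing range

print(one_to_ten)
print(first_five)
print(countdown)
```

In every case the end value (11, 5 and 0 respectively) is left out of the result.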

In [149]:
#make an array of all the numbers from 0 to 100
zero_to_100 = np.arange(101) # or zero_to_100 = np.arange(0,101) or zero_to_100 = np.arange(0,101, 1)
print(zero_to_100)
print('--------------')
#make an array with all the multiples of 2 between 0 and 100 inclusive
even_0_to_100 = np.arange(0, 101, 2)
print(even_0_to_100)

[  0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35
  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53
  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71
  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89
  90  91  92  93  94  95  96  97  98  99 100]
--------------
[  0   2   4   6   8  10  12  14  16  18  20  22  24  26  28  30  32  34
  36  38  40  42  44  46  48  50  52  54  56  58  60  62  64  66  68  70
  72  74  76  78  80  82  84  86  88  90  92  94  96  98 100]


<a id="advanced_exe"></a>
**Advanced Programming Exercises**

1) Without using the built-in function max, find the maximum of 3 numbers a, b, c. Your output should be the same as the output of max(a,b,c).

**Hint:** You could do that by using if statements, but an easier way would be to look at the built-in functions for arrays.

In [153]:
a = int(input("a"))
b = int(input("b"))
c = int(input("c"))

a4
b2
c0
Maximum is: 4


2) According to the Oakland Police Department End of Year Report (found <a href="http://www2.oaklandnet.com/oakca1/groups/police/documents/webcontent/oak062295.pdf">here</a>) the historical total number of crimes per year in Oakland is the following.

| 2012   | 2013   | 2014   | 2015   | 2016   |
|--------|--------|--------|--------|--------|
| 33,685 | 33,965 | 31,612 | 31,470 | 29,919 |

Find how many years saw more crime than 2012 and how many saw less crime.

In [155]:
total_crimes_by_year =  make_array(33685, 33965, 31612, 31470, 29919)

In [None]:
#SOLUTIONS
#1
ar = make_array(a,b,c)
ar_sorted = np.sort(ar)
print("Maximum is: "+str(ar_sorted.item(2)))

#2
total_change = total_crimes_by_year-total_crimes_by_year.item(0)
print("Increase compared to 2012: "+str(np.count_nonzero(total_change>0)))
print("Decrease compared to 2012: "+str(np.count_nonzero(total_change<0)))