<sup>This notebook is adapted from the Lab01 notebook in https://github.com/data-8/data8assets and licensed for reuse under [Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)](http://creativecommons.org/licenses/by-nc/4.0/).</sup>

# Exercise 1: Expressions <img src="kaip_logo_header.png" align="right">

In this exercise, you'll learn how to:

1. navigate Jupyter notebooks (like this one);
2. write and evaluate some basic *expressions* in Python, the computer language of the course;
3. use the main building blocks of programs, like functions and classes.

# 1. Jupyter notebooks
This webpage is called a Jupyter notebook. A notebook is a place to write programs and view their results.

## 1.1. Text cells
In a notebook, each rectangle containing text or code is called a *cell*.

Text cells (like this one) can be edited by double-clicking on them. They're written in a simple format called [Markdown](http://daringfireball.net/projects/markdown/syntax) to add formatting and section headings.  You don't need to learn Markdown, but you might want to.

After you edit a text cell, click the "run cell" button at the top that looks like ▶| to confirm any changes. (Try not to delete the instructions of the lab.)

**Question 1.1.1.** This paragraph is in its own text cell.  Try editing it so that this sentence is the last sentence in the paragraph, and then click the "run cell" ▶| button.  This sentence, for example, should be deleted.  So should this one.

## 1.2. Code cells
Other cells contain code in the Python 3 language. Running a code cell will execute all of the code it contains.

To run the code in a code cell, first click on that cell to activate it.  It'll be highlighted with a little green or blue rectangle.  Next, either press ▶| or hold down the `shift` key and press `return` or `enter`.

Try running this cell:

In [2]:
print("Hello, World!")

Hello, World!


And this one:

In [3]:
print("\N{WAVING HAND SIGN}, \N{EARTH GLOBE ASIA-AUSTRALIA}!")

👋, 🌏!


The fundamental building block of Python code is an expression. Cells can contain multiple lines with multiple expressions. When you run a cell, the lines of code are executed in the order in which they appear. Every `print` expression prints a line. Run the next cell and notice the order of the output.

In [4]:
print("First this line,")
print("and then this one.")

First this line,
and then this one.


**Question 1.2.1.** Change the cell above so that it prints out:

    First this line,
    then the whole 🌏,
    and then this one.

*Hint:* If you're stuck on the Earth symbol for more than a few minutes, try talking to a neighbor or a TA.  That's a good idea for any lab problem.

In [5]:
print("First this line,")
print("then the whole \N{EARTH GLOBE ASIA-AUSTRALIA},")
print("and then this one.")

First this line,
then the whole 🌏,
and then this one.


## 1.3. Writing Jupyter notebooks
You can use Jupyter notebooks for your own projects or documents.  When you make your own notebook, you'll need to create your own cells for text and code.

To add a cell, click the + button in the menu bar.  It'll start out as a text cell.  You can change it to a code cell by clicking inside it so it's highlighted, clicking the drop-down box next to the restart (⟳) button in the menu bar, and choosing "Code".

**Question 1.3.1.** Add a code cell below this one.  Write code in it that prints out:
   
    A whole new cell! ♪🌏♪

(That musical note symbol is like the Earth symbol.  Its long-form name is `\N{EIGHTH NOTE}`.)

Run your cell to verify that it works.

In [6]:
print("A whole new cell! \N{EIGHTH NOTE}\N{EARTH GLOBE ASIA-AUSTRALIA}\N{EIGHTH NOTE}")

A whole new cell! ♪🌏♪


## 1.4. Errors
Python is a language, and like natural human languages, it has rules.  It differs from natural language in two important ways:
1. The rules are *simple*.  You can learn most of them in a few weeks and gain reasonable proficiency with the language in a semester.
2. The rules are *rigid*.  If you're proficient in a natural language, you can understand a non-proficient speaker, glossing over small mistakes.  A computer running Python code is not smart enough to do that.

Whenever you write code, you'll make mistakes.  When you run a code cell that has errors, Python will sometimes produce error messages to tell you what you did wrong.

Errors are okay; even experienced programmers make many errors.  When you make an error, you just have to find the source of the problem, fix it, and move on.

We have made an error in the next cell.  Run it and see what happens.

In [8]:
print("This line is missing something.")

This line is missing something.


You should see something like this (minus our annotations):

<img src="error.jpg"/>

The last line of the error output attempts to tell you what went wrong.  The *syntax* of a language is its structure, and this `SyntaxError` tells you that you have created an illegal structure.  "`EOF`" means "end of file," so the message is saying Python expected you to write something more (in this case, a right parenthesis) before finishing the cell.

There's a lot of terminology in programming languages, but you don't need to know it all in order to program effectively. If you see a cryptic message like this, you can often get by without deciphering it.  (Of course, if you're frustrated, ask a neighbor or the lecturer for help.)

Try to fix the code above so that you can run the cell and see the intended message instead of an error.

## 1.5. Shortcuts and tips!

Pressing TAB can kick off tab-completion. For example, press TAB after `math.` below and see what happens:

In [9]:
import math
            

# 2. Numbers

Quantitative information arises everywhere in data science. In addition to representing commands to print out lines, expressions can represent numbers and methods of combining numbers. The expression `3.2500` evaluates to the number 3.25. (Run the cell and see.)

In [11]:
3.2500

3.25

Notice that we didn't have to `print`. When you run a notebook cell, if the last line has a value, then Jupyter helpfully prints out that value for you. However, it won't print out prior lines automatically.

In [16]:
print(2)
print(3)
4

2
3


4

Above, you should see that 4 is the value of the last expression, 2 is printed, but 3 is lost forever because it was neither printed nor last.

You don't want to print everything all the time anyway.  But if you feel sorry for 3, change the cell above to print it.

## 2.1. Arithmetic
The line in the next cell subtracts.  Its value is what you'd expect.  Run it.

In [17]:
3.25 - 1.5

1.75

Many basic arithmetic operations are built in to Python.  The textbook section on [Expressions](http://www.inferentialthinking.com/chapters/03/1/expressions.html) describes all the arithmetic operators used in the course.  The common operator that differs from typical math notation is `**`, which raises one number to the power of the other. So, `2**3` stands for $2^3$ and evaluates to 8. 
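
As a quick illustration of a few of these operators, here is a minimal sketch. The names `power`, `quotient`, and `remainder` are ours, chosen only for this example.

```python
# ** raises the left number to the power of the right number.
power = 2 ** 3          # 8

# / always produces a decimal (floating-point) result,
# even when the division comes out even.
quotient = 8 / 4        # 2.0

# % gives the remainder after division.
remainder = 7 % 3       # 1

print(power, quotient, remainder)
```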

The order of operations is what you learned in elementary school, and Python also has parentheses.  For example, compare the outputs of the cells below. Use parentheses for a happy new year!

In [18]:
1+6*5-6*3**2*2**3/4*7

-725.0

In [19]:
1+(6*5-(6*3))**2*((2**3)/4*7)

2017.0

In standard math notation, the first expression is

$$1 + 6 \times 5 - 6 \times 3^2 \times \frac{2^3}{4} \times 7,$$

while the second expression is

$$1 + (6 \times 5 - (6 \times 3))^2 \times (\frac{(2^3)}{4} \times 7).$$

**Question 2.1.1.** Write a Python expression in this next cell that's equal to $5 \times (3 \frac{10}{11}) - 50 \frac{1}{3} + 2^{.5 \times 22} - \frac{7}{33}$.  That's five times three and ten elevenths, minus fifty and a third, plus two to the power of half 22, minus 7 33rds.  By "$3 \frac{10}{11}$" we mean $3+\frac{10}{11}$, not $3 \times \frac{10}{11}$.

Replace the ellipses (`...`) with your expression.  Try to use parentheses only when necessary.

*Hint:* The correct output should start with a familiar number.

In [20]:
5*(3+10/11)-(50+1/3)+(2**(0.5*22))-(7/33)

2017.0

# 3. Names
In natural language, we have terminology that lets us quickly reference very complicated concepts.  We don't say, "That's a large mammal with brown fur and sharp teeth!"  Instead, we just say, "Bear!"

Similarly, an effective strategy for writing code is to define names for data as we compute it, like a lawyer would define terms for complex ideas at the start of a legal document to simplify the rest of the writing.

In Python, we do this with *assignment statements*. An assignment statement has a name on the left side of an `=` sign and an expression to be evaluated on the right.

In [22]:
ten = 3 * 2 + 4

When you run that cell, Python first evaluates the first line.  It computes the value of the expression `3 * 2 + 4`, which is the number 10.  Then it gives that value the name `ten`.  At that point, the code in the cell is done running.

After you run that cell, the value 10 is bound to the name `ten`:

In [23]:
ten

10

The statement `ten = 3 * 2 + 4` is not asserting that `ten` is already equal to `3 * 2 + 4`, as we might expect by analogy with math notation.  Rather, that line of code changes what `ten` means; it now refers to the value 10, whereas before it meant nothing at all.

If the designers of Python had been ruthlessly pedantic, they might have made us write

    define the name ten to hereafter have the value of 3 * 2 + 4 

instead.  You will probably appreciate the brevity of "`=`"!  But keep in mind that this is the real meaning.
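
To see why "define" is a better reading of `=` than "equals", consider this small sketch. The name `count` is hypothetical and used only here. The second assignment would make no sense as a math equation, but it is perfectly good Python.

```python
# The right side is evaluated first, giving 10, and then named count.
count = 3 * 2 + 4
print(count)

# The name can appear on both sides: Python evaluates count + 1
# using the old value (10), then rebinds count to the result (11).
count = count + 1
print(count)
```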

**Question 3.1.** Try writing code that uses a name (like `eleven`) that hasn't been assigned to anything.  You'll see an error!

In [24]:
eleven

NameError: name 'eleven' is not defined

A common pattern in Jupyter notebooks is to assign a value to a name and then immediately evaluate the name in the last line in the cell so that the value is displayed as output.

In [25]:
close_to_pi = 355/113
close_to_pi

3.1415929203539825

Another common pattern is that a series of lines in a single cell will build up a complex computation in stages, naming the intermediate results.

In [26]:
bimonthly_salary = 840
monthly_salary = 2 * bimonthly_salary
number_of_months_in_a_year = 12
yearly_salary = number_of_months_in_a_year * monthly_salary
yearly_salary

20160

Names in Python can have letters (upper- and lower-case letters are both okay and count as different letters), underscores, and numbers.  The first character can't be a number (otherwise a name might look like a number).  And names can't contain spaces, since spaces are used to separate pieces of code from each other.
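
If you're ever unsure whether a name follows these rules, one option is to ask Python. The sketch below uses the built-in string method `isidentifier`; the example names in `candidates` are made up for illustration.

```python
# str.isidentifier() reports whether a piece of text follows
# Python's rules for names.
candidates = ["yearly_salary", "salary2", "2salaries", "yearly salary"]

for candidate in candidates:
    print(candidate, "->", candidate.isidentifier())

# Upper- and lower-case letters count as different letters,
# so these are two distinct names.
ten = 10
Ten = 11
print(ten, Ten)
```

The first two candidates are valid names. The third starts with a number and the fourth contains a space, so both are rejected.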

Other than those rules, what you name something doesn't matter *to Python*.  For example, this cell does the same thing as the above cell, except everything has a different name:

In [27]:
a = 840
b = 2 * a
c = 12
d = c * b
d

20160

**However**, names are very important for making your code *readable* to yourself and others.  The cell above is shorter, but it's totally useless without an explanation of what it does.

According to a famous joke among computer scientists, naming things is one of the two hardest problems in computer science.  (The other two are cache invalidation and "off-by-one" errors.  And people say computer scientists have an odd sense of humor...)

**Question 3.2.** Assign the name `seconds_in_a_decade` to the number of seconds between midnight January 1, 2010 and midnight January 1, 2020.

*Hint:* If you're stuck, the next section shows you how to get hints.

In [31]:
# Change the next line so that it computes the number of
# seconds in a decade and assigns that number the name
# seconds_in_a_decade.
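seconds_in_a_minute = 60
minutes_in_an_hour = 60
hours_in_a_day = 24
# 10 years of 365 days, plus one leap day each in 2012 and 2016.
days_in_the_decade = 10 * 365 + 2
seconds_in_a_decade = days_in_the_decade * hours_in_a_day * minutes_in_an_hour * seconds_in_a_minute

# We've put this line in this cell so that it will print
# the value you've given to seconds_in_a_decade when you
# run it.  You don't need to change this.
seconds_in_a_decade

315532800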
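
One way to check an answer to Question 3.2 is to let Python's standard `datetime` module do the calendar arithmetic. This is only a sketch, and the names `start`, `end`, `elapsed_days`, and `elapsed_seconds` are ours.

```python
from datetime import datetime

# Subtracting two datetimes gives a timedelta; total_seconds()
# converts it to seconds. The calendar arithmetic accounts for
# the leap days in 2012 and 2016.
start = datetime(2010, 1, 1)
end = datetime(2020, 1, 1)

elapsed_days = (end - start).days
elapsed_seconds = (end - start).total_seconds()

print(elapsed_days)      # 3652
print(elapsed_seconds)   # 315532800.0
```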

## 3.1. Comments
You may have noticed this line in the cell above:

    # Change the next line so that it computes the number of

That is called a *comment*.  It doesn't make anything happen in Python; Python ignores anything on a line after a #.  Instead, it's there to communicate something about the code to you, the human reader.  Comments are extremely useful.

<img src="http://imgs.xkcd.com/comics/future_self.png">

## 3.2. Application: A physics experiment

On the Apollo 15 mission to the Moon, astronaut David Scott famously replicated Galileo's physics experiment in which he showed that gravity accelerates objects of different mass at the same rate. Because there is no air resistance for a falling object on the surface of the Moon, even two objects with very different masses and densities should fall at the same rate. David Scott compared a feather and a hammer.

You can run the following cell to watch a video of the experiment.

In [32]:
from IPython.display import YouTubeVideo
# The original URL is:
#   https://www.youtube.com/watch?v=U7db6ZeLR5s
YouTubeVideo("U7db6ZeLR5s")

Here's the transcript of the video:

**167:22:06 Scott**: Well, in my left hand, I have a feather; in my right hand, a hammer. And I guess one of the reasons we got here today was because of a gentleman named Galileo, a long time ago, who made a rather significant discovery about falling objects in gravity fields. And we thought where would be a better place to confirm his findings than on the Moon. And so we thought we'd try it here for you. The feather happens to be, appropriately, a falcon feather for our Falcon. And I'll drop the two of them here and, hopefully, they'll hit the ground at the same time. 

**167:22:43 Scott**: How about that!

**167:22:45 Allen**: How about that! (Applause in Houston)

**167:22:46 Scott**: Which proves that Mr. Galileo was correct in his findings.

**Newton's Law.** Using this footage, we can also attempt to confirm another famous bit of physics: Newton's law of universal gravitation. Newton's laws predict that any object dropped near the surface of the Moon should fall

$$\frac{1}{2} G \frac{M}{R^2} t^2 \text{ meters}$$

after $t$ seconds, where $G$ is a universal constant, $M$ is the moon's mass in kilograms, and $R$ is the moon's radius in meters.  So if we know $G$, $M$, and $R$, then Newton's laws let us predict how far an object will fall over any amount of time.

To verify the accuracy of this law, we will calculate the difference between the predicted distance the hammer drops and the actual distance.  (If they are different, it might be because Newton's laws are wrong, or because our measurements are imprecise, or because there are other factors affecting the hammer for which we haven't accounted.)

Someone studied the video and estimated that the hammer was dropped 113 cm from the surface. Counting frames in the video, the hammer falls for 1.2 seconds (36 frames).

**Question 3.3.1.** Complete the code in the next cell to fill in the *data* from the experiment.

In [33]:
# t, the duration of the fall in the experiment, in seconds.
# Fill this in.
time = 1.2

# The estimated distance the hammer actually fell, in meters.
# Fill this in.
estimated_distance_m = 1.13

**Question 3.3.2.** Now, complete the code in the next cell to compute the difference between the predicted and estimated distances (in meters) that the hammer fell in this experiment.

This just means translating the formula above ($\frac{1}{2}G\frac{M}{R^2}t^2$) into Python code.  You'll have to replace each variable in the math formula with the name we gave that number in Python code.

In [34]:
# First, we've written down the values of the 3 constants
# that show up in Newton's formula.

# G, the universal constant measuring the strength of gravity.
gravity_constant = 6.674 * 10**-11

# M, the moon's mass, in kilograms.
moon_mass_kg = 7.34767309 * 10**22

# R, the radius of the moon, in meters.
moon_radius_m = 1.737 * 10**6

# The distance the hammer should have fallen over the
# duration of the fall, in meters, according to Newton's
# law of gravity.  The text above describes the formula
# for this distance given by Newton's law.
# **YOU FILL THIS PART IN.**
predicted_distance_m = 1/2*gravity_constant*moon_mass_kg/(moon_radius_m**2)*(time**2)

# Here we've computed the difference between the predicted
# fall distance and the distance we actually measured.
# If you've filled in the above code, this should just work.
difference = predicted_distance_m - estimated_distance_m
difference

0.040223694659304865
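
Another way to read the same comparison is in terms of acceleration (in meters per second squared) rather than distance. The sketch below repeats the values from the cells above so that it runs on its own. The names `predicted_acceleration` and `measured_acceleration` are ours.

```python
# Values copied from the cells above.
gravity_constant = 6.674 * 10**-11   # G
moon_mass_kg = 7.34767309 * 10**22   # M
moon_radius_m = 1.737 * 10**6        # R
time = 1.2                           # duration of the fall, in seconds
estimated_distance_m = 1.13          # measured fall distance, in meters

# Newton's prediction for the acceleration, G * M / R^2.
predicted_acceleration = gravity_constant * moon_mass_kg / moon_radius_m**2

# Solving distance = 1/2 * a * t^2 for a gives a = 2 * distance / t^2.
measured_acceleration = 2 * estimated_distance_m / time**2

print(predicted_acceleration)
print(measured_acceleration)
```

Both values come out near 1.6, which is another sign that the prediction and the rough measurement from the video agree reasonably well.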

# 4. Lists

It is often convenient to structure data, in machine learning as elsewhere. The most basic data structure in Python is the sequence. Each element of a sequence has a number: its position, or *index*. The first index is zero, the second index is one, and so forth.
One of the most common and important sequences is the list.

In [36]:
# To create a list, put comma-separated values between square brackets.

list1 = ['physics', 'chemistry', 1997, 2000]
list2 = [1, 2, 3, 4, 5]
list3 = ["a", "b", "c", "d"]

In [39]:
# To access a value in a list, put its index in square brackets after the list's name.
# Watch out! In Python, indices start at 0.

print(list1[0])
print(list2[3])

# A negative index counts from the end of the list, so -1 is the last element.
# (A range of values can be selected with "slicing": a start and an end index separated by a colon.)

print(list3[-1])

physics
4
d
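
Here is a short sketch of slicing. The list `letters` is ours and simply has the same contents as `list3`.

```python
letters = ["a", "b", "c", "d"]

# A slice gives a new list running from the start index up to,
# but not including, the end index.
print(letters[1:3])   # ['b', 'c']

# Leaving out the start (or the end) means "from the beginning"
# (or "to the end").
print(letters[:2])    # ['a', 'b']
print(letters[2:])    # ['c', 'd']

# len gives the number of elements in a list.
print(len(letters))   # 4
```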


# 5. Control Flow

In the programs we have seen until now, there has always been a series of statements faithfully executed by Python in exact top-down order. What if you wanted to change the flow of how it works?


## 5.1. For loop

The for loop is very convenient if, for example, you want to perform repeated operations on a series of numbers. It allows you to automate this, rather than writing everything one by one.

In [40]:
# Let's say we want to double each number in the following list and print it.

a_list = [5, 10, 7, 3.5, 9]

# We could simply write 

print(2 * a_list[0])
print(2 * a_list[1])
# etc...

# But imagine if the list contained thousands of numbers!
# This can be done very conveniently with a for loop.
# Notice how the syntax is similar to spoken English, which is one of the beauties of Python.

for element in a_list:
    print(2 * element)

print('hi')

for index in range(4):
    print(a_list[index])

10
20
10
20
14
7.0
18
hi
5
10
7
3.5
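
A loop doesn't have to print. It can also build up a result step by step. The following is a minimal sketch; the names `doubled` and `total` are ours.

```python
a_list = [5, 10, 7, 3.5, 9]   # same numbers as in the cell above

# Collect the doubled values in a new list instead of printing them.
doubled = []
for number in a_list:
    doubled.append(2 * number)
print(doubled)

# Add up the numbers, one at a time.
total = 0
for number in a_list:
    total = total + number
print(total)
```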


## 5.2. If statement

The `if` statement is central to programming. It allows you to perform different operations depending on what happens in the program.

This can be used for basic arithmetic:
"If the variable is greater than 10, double it. Else, add 2 to it."


But it's the same idea for very advanced applications like self-driving cars:
"If there is a pedestrian, stop. Else, keep driving."

In [None]:
### Perform different arithmetic operations depending on the value of x.

# Notice the basic comparison conditions:
# x greater than z: x > z (greater than or equal is >=)
# x smaller than z: x < z (smaller than or equal is <=)
# x equal to z: x == z

# Change the value of x to see how the value of y is impacted.

x = -50

if x >= 50:
    y = 2 * x
elif x == 10:
    y = x + 6
else:
    y = -x

print(y)
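
The arithmetic rule quoted at the start of this section can be written almost word for word in Python. This is a sketch with a hypothetical starting value for `value`.

```python
# Hypothetical starting value; change it and rerun to see the other branch.
value = 12

if value > 10:
    value = 2 * value   # greater than 10: double it
else:
    value = value + 2   # otherwise: add 2 to it

print(value)
```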
    

# 6. Functions
## 6.1. Calling functions

The most common way to combine or manipulate values in Python is by calling functions. Python comes with many built-in functions that perform common operations.

For example, the `abs` function takes a single number as its argument and returns the absolute value of that number.  The absolute value of a number is its distance from 0 on the number line, so `abs(5)` is 5 and `abs(-5)` is also 5.

In [None]:
abs(5)

In [None]:
abs(-5)
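
A few other built-in functions work the same way. The sketch below is only a small sample, and the names on the left of each `=` are ours.

```python
# Each call takes its arguments inside parentheses and returns a value.
largest = max(2, 7)                 # the larger of its arguments: 7
rounded = round(3.14159, 2)         # rounded to 2 decimal places: 3.14
length = len("Hello, World!")       # number of characters: 13

# Calls can be nested: the inner call is evaluated first.
nested = abs(min(-5, 3))            # min gives -5, abs turns it into 5

print(largest, rounded, length, nested)
```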

## 6.2. Defining functions

You can also define your own functions to perform a set of operations. To define a function, use the `def` keyword.

In [None]:
# For example, we define a multiplication function as follows:
def multiplication(x, y):
    return x * y

# And then the function can be called using its name and by providing its input:
print(multiplication(2, 3))
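
A function can also wrap up a multi-step computation, such as the salary calculation from Section 3. The function name `yearly_salary` below is our own choice for this sketch.

```python
def yearly_salary(bimonthly_salary):
    # Names defined inside the function exist only while it runs.
    monthly_salary = 2 * bimonthly_salary
    return 12 * monthly_salary

# The same function can be reused with different inputs.
print(yearly_salary(840))
print(yearly_salary(1000))
```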

**Question 6.2.1.** Define a function that takes the height and width of a rectangular object and returns its perimeter. Test your function by calculating the perimeter of the Dubai Frame.

In [None]:
... 

# 7. Classes

Python supports the programming paradigm called object-oriented programming. It allows you to define "real world" objects, their behavior, and the way they interact with each other. This can make code more intuitive, because it mirrors the world we experience around us.
The great thing about classes is that once you've implemented the class `cat` (see below), you can reuse it and create as many objects (cats) as you want.

In [None]:
class cat:
    # The __init__ function is the one you will define most often.
    # It gets called when you create an object of this class (like in the cell below),
    # and it stores the given parameters on the new object.
    
    def __init__(self, name, age):
        # Notice how the keyword 'self' appears in every function:
        # this simply refers to the object itself, to which we for example can assign
        # parameters, like the way we assign the name below
        self.name = name
        self.age = age 
        self.legs = 4 
        
    def __repr__(self):
        # The __repr__ function is useful: it controls the text shown when you
        # call 'print' on your object, so you can output any sentence you want
        return "This is a cat, his name is " + self.name
    
    def meow(self):
        # Besides a few predefined functions such as __init__ and __repr__
        # you can define ANY function you want, to define behaviour for your class.
        # For instance, a cat will meow and eat, or a car will turn.
        print("MEOOOOW")
        
    def eat(self, cat_food):
        # A nice thing about objects is that they can interact with each other.
        # Here, the cat interacts with an object of the class 'catFood' (defined below)
        # and can modify its attributes.
        if cat_food.num_chunks > 0:
            cat_food.num_chunks -= 1
        else:
            print("No more food :(")
    
    
class catFood:
    def __init__(self, num_chunks):
        self.num_chunks = num_chunks
    
    def __repr__(self):
        return "There are " + str(self.num_chunks) + " chunks of cat food"

In [None]:
Cookie_the_cat = cat(name = "Cookie", age = 10)
Bob_the_cat = cat(name = 'Bob', age = 2)
food = catFood(num_chunks = 100)

### Note: we have defined a class, and this class has methods (or functions) to define its behavior 
### How do we call the functions of an object? Simply object.function()
### /!\ You don't need to include the 'self' argument.

Cookie_the_cat.meow()

### Another cool thing about objects, is that you can change their variables after you've defined them ! 

print("The cat is " + str(Cookie_the_cat.age) + ' years old')

Cookie_the_cat.age = 4 # You can modify this

print("Now the cat is " + str(Cookie_the_cat.age) + ' years old')
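
To see that each object keeps its own data, here is a minimal sketch using a hypothetical class `Counter` that is not part of this exercise.

```python
class Counter:
    # Hypothetical class, used only for this sketch.
    def __init__(self, start):
        self.count = start

    def increment(self):
        # self is the particular object the method was called on.
        self.count = self.count + 1


first = Counter(start=0)
second = Counter(start=10)

first.increment()
first.increment()

# Changing one object leaves the other untouched.
print(first.count)    # 2
print(second.count)   # 10
```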


**Question 7.1.** Print the number of chunks of cat food. Note that the class `catFood` has the method `__repr__`, so you can call `print()` directly on an object of the class `catFood`.
Make Cookie eat some food, then print the number of chunks again.