# 1. Very simple 'programs'
## 1.1 Running Python code
Typically, you run Python code from the command line. In this Jupyter Notebook, however, you can type commands into the code cells and execute them, running the code right in your browser.<br>

To start, we will print a "Hello World". In the field below, type:<br>
```python
print('Hello, World')
```
Then press `<shift> + <return>` to execute the command. 



In [None]:
# type the code below this line
print('Hello, World')

Hello, World


What happened?

You just created a program that prints the words 'Hello, World'. The Python environment that you are in immediately executes whatever you have typed in; Python is an interpreted language, so there is no separate [compile](https://en.wikipedia.org/wiki/Compiler) step for you to run. This is useful for testing things, e.g. defining a few variables and then checking whether a certain line will work. That will come in a later lesson, though.

## 1.2 Math in Python

Next, we will introduce you to some basic math exercises.

Type<br>
```python
1 + 1
```
and execute the code.

In [None]:
 1 + 1

2

Now type
```python
20 + 80
```
and execute the code.

In [None]:
20 + 80


100

These are additions. We can of course use other mathematical operators.<br>
Try this subtraction:<br>
```python
6 - 5
```

In [None]:
6 - 5

1

and this multiplication:<br>
```python
2 * 5
```

Try:
```python
5 ** 2
```

`**` is the exponentiation operator, so we calculated 5 squared.

Type:
```python
print('1 + 2 is an addition')
```

You see that the `print` function writes something on the screen.

Try this:
```python
print('one kilobyte is 2^10 bytes, or', 2 ** 10, 'bytes')
```

This demonstrates that you can print text and calculations in one sentence. The commas separate the strings (text) from the calculations or variables; `print` writes each item in turn, with a space between them.

Now try this:
```python
23 / 3
```

And this:<br>
```python
23 % 3
```

`%` returns the remainder of the division.
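
To see how `/` and `%` relate, here is a small sketch. It also uses `//` (floor division), which is not covered above and is brought in here only as an illustration; the variable names are made up for the example.

```python
# Divide 23 by 3 in three different ways.
exact = 23 / 3        # true division, always gives a float
whole = 23 // 3       # floor division: how many times 3 fits into 23
remainder = 23 % 3    # what is left over after taking out the whole 3s

print(exact)
print(whole, remainder)  # 7 2

# The whole part and the remainder always rebuild the original number.
print(whole * 3 + remainder)  # 23
```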

## 1.3 Order of Operations

Remember that thing called order of operations that they taught in maths? Well, it applies in Python, too. Here it is, if you need reminding:<br>
1. Parentheses `()`
2. Exponents `**`
3. Multiplication `*`, division `/` and remainder `%`
4. Addition `+` and subtraction `-`

Here are some examples that you might want to try, if you're rusty on this:<br>
```python
1 + 2 * 3
(1 + 2) * 3
```
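
If you want to check your answers, the sketch below stores each result in a variable (the names are only for illustration) and adds one extra case for exponents.

```python
# Multiplication happens before addition.
no_parentheses = 1 + 2 * 3

# Parentheses are evaluated first, so here the addition happens first.
with_parentheses = (1 + 2) * 3

# Exponents come before multiplication.
exponent_first = 2 * 3 ** 2

print(no_parentheses, with_parentheses, exponent_first)  # 7 9 18
```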

## 1.4 Comments, Please
The final thing you'll need to know to move on to multi-line programs is the comment. Type the following (and yes, it produces no output):
```python
# I am a comment. Fear my wrath!
```

A comment is a piece of code that does not run. In Python, you make something a comment by putting a hash (#) in front of it. A hash comments everything after it in the line, and nothing before it. So you could type this:
```python
print("food is very nice") #this is a print sentence
```

This results in a normal output, without the smutty comment, thank you very much.

Now try this:
```python
# print("food is very nice")
```

Nothing happens, because the code came after the hash, so it is part of the comment.

Comments are important for adding information that another programmer needs to read but the computer does not: for example, an explanation of what a section of code does, or what is wrong with it. You can also comment out bits of code by putting a `#` in front of them, if you don't want them to run but can't delete them because you might need them later.

# 2. Programs in a file, variables and strings


## 2.1 Writing scripts
Python programs are simply text documents: you can open and edit them in Notepad, or in an [IDE](https://en.wikipedia.org/wiki/Integrated_development_environment) like [Visual Studio Code](https://code.visualstudio.com/) or [Sublime](https://www.sublimetext.com/). Think of a program as a collection of statements and commands that execute from top to bottom according to some predefined rules. If you copy the code cell below and save it in a file called poem.py, you've got yourself a first program that can be executed from the command line with:

```
python poem.py
```

However, this is a Jupyter Notebook, which in a way is a Python program too. If you execute the code below by clicking on the "Play" button, it will run the statements from top to bottom, just like a normal .py program would.

In [None]:
#A simple program.
print("Mary had a little lamb")
print("its fleece was white as snow;")
print("and everywhere that Mary went", end = " ")
print("her lamb was sure to go.")

Mary had a little lamb
its fleece was white as snow;
and everywhere that Mary went her lamb was sure to go.


However, there is one major distinction between traditional Python programs and Jupyter Notebooks: in a notebook, a cell can be run by itself or as part of a group of cells. Code cells are not isolated from each other, though. A code cell can refer to a variable or function defined in another cell, and everything will work fine as long as the cells are executed in the right sequence. For instance, below we have two code cells: in the first we define the variable "foo", and in the second we print it.

Everything will work fine if you execute the first cell and then the second. If you try to execute the second cell without having run the first one, you will receive a `NameError` exception, because the second cell does not know what "foo" is. Go ahead, try it. **Keep this in mind when working with notebooks.**



In [None]:
foo = "This is the foo variable and it is a string"

In [None]:
print(foo)

This is the foo variable and it is a string


## 2.2 Variables
Now let's start introducing variables. A variable stores a value that can be looked at or changed at a later time. Let's make a program that uses variables:

In [None]:
#variables demonstrated
print("This program is a demo of variables")
v = 1
print("The value of v is now", v)
v = v + 1
print("v now equals itself plus one, making it worth", v)
v = 51
print("v can store any numerical value, to be used elsewhere.")
print("for example, in a sentence. v is now worth", v)
print("v times 5 equals", v * 5)
print("but v still only remains", v)
print("to make v five times bigger, you would have to type v = v * 5")
v = v * 5
print("there you go, now v equals", v, "and not", v / 5)

This program is a demo of variables
The value of v is now 1
v now equals itself plus one, making it worth 2
v can store any numerical value, to be used elsewhere.
for example, in a sentence. v is now worth 51
v times 5 equals 255
but v still only remains 51
to make v five times bigger, you would have to type v = v * 5
there you go, now v equals 255 and not 51.0


Run the script and try to understand the results.

Note that we can also write `v = v + 1` as `v += 1`. This can be used for all operators (e.g. `-=`, `*=`,`/=`). Try it in the code above.
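
As a quick illustration of these shorthand operators, here is a small sketch with a made-up starting value. Each comment shows the long form that the shorthand stands for.

```python
v = 10
v += 1    # same as v = v + 1
print(v)  # 11
v -= 3    # same as v = v - 3
print(v)  # 8
v *= 5    # same as v = v * 5
print(v)  # 40
v /= 4    # same as v = v / 4; division always produces a float
print(v)  # 10.0
```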

It is good practice to use lowercase names for variables, with words separated by underscores (the style Python's PEP 8 guide recommends) or written in camelCase. Don't use special characters, and don't start with a number!

## 2.3 Strings
As you can see, variables store values for use at a later time, and you can change them at any time. You can put in more than numbers, though. Variables can hold things like text, and a piece of text is called a string. Try this program:

In [None]:
#giving variables text, and adding text.
word1 = "Good"
word2 = "morning"
word3 = "to you too!"
print(word1, word2)
sentence = word1 + " " + word2 + " " + word3
print(sentence)

As you see, the variables above hold text. Variable names can be almost anything you like, such as `word1`, `word2`, and `word3`.

As you can also see, strings can be added together to make longer words or sentences. However, adding two strings does not put a space between them. To add spaces between words, you can use `" "` as we did in the example above.

Try the following code and explain what it does:

In [2]:
text = "Anyone_AI"
len(text)

9

Yes, it shows us the number of characters in a string.

Now try this:

In [None]:
print(text[4])

n


Here we wanted to print the character at position 4. Note that the first character "A" is at position 0! So position 4 gives us back "n".

Now try:

In [None]:
print(text[:4])

Anyo


In [None]:
print(text[4:])

ne_AI


Here you see that `[:4]` selects characters 0,1,2,3, which is "Anyo". With `[4:]` we start with position 4 (counting from 0!) until the end of the string, which results in "ne_AI".

We can also specify a range. Try this:

In [None]:
print(text[4:8])

ne_A


Was it what you expected?
With ranges, the end position is not included in the selection: `[4:8]` selects positions 4, 5, 6 and 7.
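
One way to convince yourself that the end of a range is excluded is the sketch below. The names `start`, `stop` and `piece` are only there for illustration.

```python
text = "Anyone_AI"
start = 4
stop = 8

piece = text[start:stop]
print(piece)  # ne_A

# Because the end position is excluded, the slice has stop - start characters.
print(len(piece) == stop - start)  # True

# It also means that two slices sharing a boundary never overlap.
print(text[:start] + text[start:] == text)  # True
```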

Now try this:

In [None]:
print(text[-2])

A


What did it do?

Right, you have selected the second character from the right (as if you were reading from right to left)! Note that here we start counting with -1 for the first character on the right.

Try this:

In [3]:
print(text[-4:])

e_AI


In human language: start with the 4th character from the right and give me all characters from that position to the end of the string.
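
Negative positions can be translated into ordinary ones with the help of `len`. The sketch below shows the idea; the helper names `length` and `last_char` are made up for the example.

```python
text = "Anyone_AI"
length = len(text)

# A negative position counts from the right: -1 is the last character.
last_char = text[-1]
print(last_char)  # I

# Position -2 is the same as position length - 2.
print(text[-2] == text[length - 2])  # True

# The same holds for ranges.
print(text[-4:] == text[length - 4:])  # True
```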

You can also insert values into strings, using `{}` placeholders and the `.format()` method:

In [None]:
sentence = "Hello world" 
nrOfCharacters = len(sentence)
print("The sentence 'Hello World' has {} characters".format(nrOfCharacters))

The sentence 'Hello World' has 11 characters


So you can easily insert variables inside strings using `{}`. The variable will be printed in the place where you have the `{}`, inside the string. The variable has to go inside the parentheses of `.format()`

This can also be written as:

In [None]:
print(f"The sentence has {nrOfCharacters} characters")
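
The braces of an f-string can hold more than a variable name. As a small sketch (the variable names `message` and `same_message` are only for illustration), here is the earlier sentence built both ways:

```python
sentence = "Hello world"

# Any expression can go between the braces, not just a variable name.
message = f"The sentence '{sentence}' has {len(sentence)} characters"
print(message)  # The sentence 'Hello world' has 11 characters

# The same text built with .format() for comparison.
same_message = "The sentence '{}' has {} characters".format(sentence, len(sentence))
print(message == same_message)  # True
```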

There are also other operations that we can apply to strings.
We can count the number of occurrences of a specific character in a string:

In [None]:
print(sentence.count('o'))

2


So, "Hello world" contains the character 'o' twice.

We can also find the position of a character:

In [None]:
print(sentence.find('e'))

1


The character 'e' is at position 1, i.e. it is the 2nd character (remember that we start counting at 0).

In [None]:
sometext = "Hey you, how are you doing?"
print(sometext.rfind("you"))

`rfind` returns the position of the last occurrence of a string. In `sometext` the word "you" appears twice; `rfind` returns `17`, meaning that the last "you" starts at position 17 (counting from 0).
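
To compare the two searches side by side, here is a short sketch. The variable names are made up, and the last line shows what happens when the text is not found at all.

```python
sometext = "Hey you, how are you doing?"

first = sometext.find("you")     # searches from the left
last = sometext.rfind("you")     # searches from the right
missing = sometext.find("them")  # text that is not in the string

# Both methods return -1 when nothing is found.
print(first, last, missing)  # 4 17 -1
```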

There are a few other useful string operations. Run the code below and it will be obvious what each one does:

In [None]:
# Changes the string to upper case
print(sometext.upper())

# Splits the string on a character and returns it as list items. You'll learn about lists later
print(sometext.split(","))

# Replaces strings
print(sometext.replace("?","!"))

There are also some special characters:

`\n` jumps to a new line

`\` is the escape character. Put it before a character that would otherwise have a special meaning in the code, such as a quote, and that character is treated as plain text within the string. Backslashes themselves come up often in file names on Windows (e.g. `"C:\\folder\\filename.txt"`): because `\` is already the escape character, we need to write it twice to escape the escape character!

Examples:

In [None]:
print("This is a very long sentence and I want to split it into two lines.")
print("This is a very long sentence\nand I want to split it into two lines.")

print("This sentence contains a quote and I don't want the string to end (yet)\"")

This is a very long sentence and I want to split it into two lines.
This is a very long sentence
and I want to split it into two lines.
This sentence contains a quote and I don't want the string to end (yet)"
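
If doubling every backslash gets tiresome, Python also offers raw strings, written with an `r` in front of the opening quote. They are not used elsewhere in this lesson; the sketch below (with made-up variable names) only shows that both spellings give the same string.

```python
# Doubling each backslash gives one real backslash in the string.
escaped_path = "C:\\folder\\filename.txt"

# A raw string (note the r prefix) leaves backslashes untouched.
raw_path = r"C:\folder\filename.txt"

print(escaped_path)
print(escaped_path == raw_path)  # True

# "\n" counts as a single character, even though we type two.
newline_length = len("a\nb")
print(newline_length)  # 3
```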


# 3. Loops, Loops, Loops, Loops...


## 3.1 Introduction
This is our final lesson before we get into interacting with human input. Can't wait, can you?

Just imagine you needed a program to do something 20 times. What would you do? You could copy and paste the code 20 times, and end up with an unreadable program that is painful to change. Or, you could tell the computer to repeat a chunk of code between point A and point B, until the time comes that you need it to stop. Such a thing is called a loop.

## 3.2 The 'While' loop
So let's assume we want to write a simple program that prints the numbers from 1 to 10. The following is an example of a type of loop, called the 'while' loop:


In [None]:
a = 0
while a < 10:
    a = a + 1
    print(a)

1
2
3
4
5
6
7
8
9
10


How does this program work? Let's go through it in English:
```
'a' now equals 0
As long as 'a' is less than 10, do the following:
   (a) Make 'a' one larger than what it already is.
   (b) Print on-screen what 'a' is now worth.
```

What does this do? Let's go through what the computer would be 'thinking' when it is in the 'while' loop:

```
#JUST GLANCE OVER THIS QUICKLY
#(It looks fancy, but is really simple)
Is 'a' less than 10? YES (it's 0)
Make 'a' one larger (now 1)
print on-screen what 'a' is (1)

Is 'a' less than 10? YES (it's 1)
Make 'a' one larger (now 2)
print on-screen what 'a' is (2)

Is 'a' less than 10? YES (it's 2)
Make 'a' one larger (now 3)
print on-screen what 'a' is (3)

Is 'a' less than 10? YES (it's 3)
Make 'a' one larger (now 4)
print on-screen what 'a' is (4)

Is 'a' less than 10? YES (it's 4)
Make 'a' one larger (now 5)
print on-screen what 'a' is (5)

Is 'a' less than 10? YES (it's 5)
Make 'a' one larger (now 6)
print on-screen what 'a' is (6)

Is 'a' less than 10? YES (it's 6)
Make 'a' one larger (now 7)
print on-screen what 'a' is (7)

Is 'a' less than 10? YES (are you still here?)
Make 'a' one larger (now 8)
print on-screen what 'a' is (8)

Is 'a' less than 10? YES (it's 8)
Make 'a' one larger (now 9)
print on-screen what 'a' is (9)

Is 'a' less than 10? YES (it's 9)
Make 'a' one larger (now 10)
print on-screen what 'a' is (10)

Is 'a' less than 10? NO (it's 10, therefore isn't less than 10)
Don't do the loop
There's no code left to do, so the program ends
```

So in short, try to think of it that way when you write 'while' loops. This is how you write them, by the way (syntax):
```
while {condition that the loop continues}:
    {what to do in the loop}
    {have it indented, usually four spaces}
the code here is not looped
because it isn't indented
```
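
As one more instance of this pattern, here is a small sketch that adds up the numbers from 1 to 10. The variable names `total` and `number` are made up for the example.

```python
total = 0
number = 1
while number <= 10:
    # these two indented lines are repeated while the condition holds
    total = total + number
    number = number + 1
# this line is not indented, so it runs once, after the loop has ended
print(total)  # 55
```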

Now try to understand this example, then run it to see if it does what you expected.

In [None]:
x = 10
while x != 0:
    print(x)
    x = x - 1
    print("wow, we've counted x down, and now it equals", x)
print("And now the loop has ended.")

## 3.3 Boolean Expressions (Boolean... what!?)
What we wrote right after the `while` keyword is a boolean expression, meaning that the result of its evaluation is either TRUE or FALSE (written `True` and `False` in Python).<br>
What? A boolean expression is just a question that can be answered with TRUE or FALSE. For example:

*   If your age is the same as that of the person next to you, you would type: `My age == the age of the person next to me` and the statement would be TRUE. 
*   If you were younger than the person opposite, you'd say: `My age < the age of the person opposite me` and the statement would be TRUE. 
*   If, however, the person opposite you was younger than you and you said `My age < the age of the person opposite me`, the statement would be FALSE: the truth is that it is the other way around. 

This is how a loop thinks - if the expression is true, keep looping (keep executing code). If it is false, don't loop (it basically continues the execution after the while loop). 

With this in mind, let's have a look at the operators (symbols that represent an action) that are involved in boolean expressions:<br>

| Expression | Function  |
|   :---:    |   :---:   |
|    `<`     | Less than |
| `<=` | Less than or equal to |
| `>` | Greater than |
| `>=` | Greater than or equal to |
| `!=` | Not equal to |
| `<>` | Not equal to (Python 2 only; removed in Python 3, use `!=`) |
| `==` | Equal to |

Don't confuse `=` and `==`. 

*   The `=` operator will make what is on the left equal to what is on the right. 
*   The `==` operator says whether the thing on the left is the same as what is on the right, and returns True or False.
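
The age comparisons from above can be tried out directly. The ages in this sketch are invented values; only the operators matter.

```python
my_age = 30
neighbour_age = 30
opposite_age = 25

print(my_age == neighbour_age)  # True
print(my_age < opposite_age)    # False
print(my_age != opposite_age)   # True

# '=' stores a value; '==' only asks a question and changes nothing.
is_same_age = my_age == neighbour_age
print(is_same_age)  # True
```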




## 3.4 Conditional Statements
OK! We've (hopefully) covered 'while' loops. Now let's look at something a little different - conditionals.<br>
Conditionals are where a section of code is only run if certain conditions are met. This is similar to the 'while' loop you just wrote, which only runs while x doesn't equal 0. However, a conditional runs its code at most once. The most common conditional in any programming language is the 'if' statement. Here is how it works:<br>
```
if {conditions to be met}:
    {do this}
    {and this}
    {and this}
{but this happens regardless}
{because it isn't indented}
```
<br>
Now some examples in Python:

In [None]:
#EXAMPLE 1
y = 1
if y == 1:
    print("y still equals 1, I was just checking")

The following example (Example 2) is tricky. The code evaluates the `if` statement every time the while loop runs (so 20 times). Inside the `if` statement we use the operator `%` which, if you remember, returns the remainder of a division. So this `if` statement checks that when we divide `n` by 2, there is nothing left over: if the remainder of the integer division is 0, it prints `n`, e.g.: <br>
1/2 = 0.5 (remainder 1) -> prints nothing (as the if statement is FALSE)<br>
2/2 =   1 (remainder 0) -> prints 2<br>
3/2 = 1.5 (remainder 1) -> prints nothing (as the if statement is FALSE)<br>
4/2 =   2 (remainder 0) -> prints 4 <br>
...<br>
As you can see, the result is that we print the even numbers and skip the odd ones. Interesting, right?

In [None]:
#EXAMPLE 2
print("We will show the even numbers up to 20")
n = 1
while n <= 20:
    if n % 2 == 0:
        print(n)
    n = n + 1
print("there, done.")

We will show the even numbers up to 20
2
4
6
8
10
12
14
16
18
20
there, done.


## 3.5 `else` and `elif` - When it isn't True
There are two ways you can extend the `if` statement to deal with situations where your boolean expression is FALSE: `else` and `elif`.<br>
`else` simply tells the computer what to do if the conditions of `if` aren't met. For example, read the following:

In [None]:
a = 1
if a > 5:
    print("This shouldn't happen.")
else:
    print("This should happen.")

`a` is not greater than five, therefore what is under `else` is done.

`elif` is just a shortened way of saying `else if`. When the `if` condition turns out to be false, `elif` will do what is under it, but only if its own condition is met. For example:

In [None]:
z = 4
if z > 70:
    print("Something is very wrong")
elif z < 7:
    print("This is normal")

The `if` statement, along with `else` and `elif`, follows this form:
```Python
if {conditions}:
    {run this code}
elif {conditions}:
    {run this code}
elif {conditions}:
    {run this code}
else:
    {run this code}
#You can have as many or as few elif statements as you need,
#anywhere from zero to the sky.
#You can have at most one else statement,
#and only after the if and all the elifs.
```
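
Filled in with real conditions, the form above could look like the sketch below. The variable `temperature`, its value and the labels are all invented for the example.

```python
temperature = 18  # made-up value for the example

if temperature > 30:
    label = "hot"
elif temperature > 15:
    label = "mild"
else:
    label = "cold"

# Only the first branch whose condition is true gets to run.
print(label)  # mild
```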

***One of the most important syntax rules to remember is that you MUST have a colon `:` at the end of every line with an `if`, `elif`, `else` or `while` in it.***

## 3.6 Indentation
Another important syntax rule to remember is that the code to be executed if the conditions are met, MUST BE INDENTED. That means that if you want to loop the next five lines with a `while` loop, you must put a set number of spaces at the beginning of each of the next five lines. This is good programming practice in any language, but Python requires that you do it. Here is an example of both of the above points:

In [None]:
a = 10
while a > 0:
    print(a)
    if a > 5:
        print("Big number!")
    elif a % 2 != 0:
        print("This is an odd number")
        print("It isn't greater than five, either")
    else:
        print("this number isn't greater than 5")
        print("nor is it odd")
        print("feeling special?")
    a = a - 1
    print("we just made 'a' one less than what it was!")
    print("and unless a is not greater than 0, we'll do the loop again.")
print("well, it seems as if 'a' is now no bigger than 0!")
print("the loop is now over, and without further ado, so is this program!")

Notice the three levels of indents there:
1.	Each line in the first level starts with no spaces. It is the main program, and will always execute.
2.	Each line in the second level starts with four spaces. When there is an `if` or loop on the first level, everything on the second level after that will be looped/'ifed', until a new line starts back on the first level again.
3.	Each line in the third level starts with eight spaces. When there is an `if` or loop on the second level, everything on the third level after that will be looped/'ifed', until a new line starts back on the second level again.
4.	This goes on infinitely, until the person writing the program has an internal brain explosion, and cannot understand anything he/she has written.

There is another loop, called the 'for' loop, but we will cover that in a later lesson, after we have learnt about lists.


# 4. Functions


## 4.1 Introduction
In the last lesson we said that we would introduce to you some purposeful programming. That involves user input, and user input requires a thing called ***functions***.

What are functions? Well, in effect, functions are little self-contained programs that perform a specific task, which you can incorporate into your own, larger programs. After you have created a function, you can use it at any time (by calling it), in any place. This saves you the time and effort of having to re-write repeatable portions of code to do a common task every time.


## 4.2 Using a Function
Python has lots of pre-made functions, that you can use right now, simply by 'calling' them. 'Calling' a function involves you giving a function input, and it will return a value (like a variable would) as output. Don't understand? Here is the general form that calling a function takes:<br>
`returned values = function_name(input parameters)`

- `function_name` is the name that identifies the function you want to call.
- `input parameters` are the values you pass to the function for it to use inside. For example, let's say you've made a new function which multiplies any value `X` by 5. The value you pass as the parameter tells the function which number it should multiply by five. If you pass in the number 70 (so `X = 70`), the function will compute `70 * 5`.
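
To make the general form concrete, here is a small sketch that calls three functions that come with Python: `len`, `max` and `abs`. The variable names on the left are just examples.

```python
# returned value = function_name(input parameters)
length = len("Python")    # one parameter: a string
largest = max(3, 70, 12)  # several parameters, separated by commas
distance = abs(-5)        # one parameter: a number

print(length, largest, distance)  # 6 70 5
```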


## 4.3 Parameters and Returned Values. Communicating with Functions
Well, it's all well and good that the function can multiply a number by five, but your program needs to get hold of the result of what the function did. So how does the function show its result?

Well, in fact, when a computer runs a function, it doesn't see the function name as we do, but the computer sees the result of what the function did. Variables do the exact same thing - the computer doesn't see the variable name, it sees the value that the variable holds. 

So let's now build our function that multiplies any number `X` by five. Let's call the function `multiply()`. 

Now, you put the value `X` you want to multiply in the brackets (the parameter of the function). So if you typed this:

`a = multiply(70)`

The computer would actually get as a result this:

`a = 350`

The function ran itself, then returned a number to the main program, based on what parameters it was given.

Note: the function `multiply()` is called above, but it isn't a real function yet, we have to create it.

In [None]:
# Below is the multiply function implementation
def multiply(X):
    aux = 5*X
    return aux

# And here is the function being called with an input parameter 70
a = multiply(70)
print(a)

350


## 4.4 Making A Calculator Program
Let's write another program, that will act as a calculator. This time it will do something more adventurous than what we have done before. There will be a menu, that will ask you whether you want to multiply two numbers together, add two numbers together, divide one number by another, or subtract one number from another. 

To start, we will use a pre-defined function from Python called `input`. The function asks the user to type in something. It then turns it into a string of text. Try the code below and see what it does:

In [None]:
# this line makes 'a' equal to whatever you type in
a = input("Type in something, and it will be repeated on screen: ")
# this line prints what 'a' is now worth
print(a)

Type in something, and it will be repeated on screen: 1
1


Now, we have to write code that looks at what the user entered and, depending on the option chosen (1, 2, 3, or 4), executes the portion of code for the matching calculator operation, e.g. option 1 is `add`, option 2 is `subtract`, etc.


There is only one problem, though. The `input` function returns what you type in as a string, and we want the number 1, not the character "1". 

Luckily, there is a built-in function called `eval`, which evaluates the text you typed in as a Python expression and returns the result to the main program. So if you enter a number, it comes back as a number (an integer or a float) and not as text (a string). Be aware that `eval` will run *any* expression it is given, so only use it on input you trust. 



In [None]:
# this line makes 'a' equal to the value of what you type. Plain text without quotes (e.g. aaa) raises a NameError
a = eval(input("Type in something, and it will be repeated on screen: "))
# this line prints what 'a' is now worth
print(a)

Type in something, and it will be repeated on screen: 1
1
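
Since `eval` evaluates whatever text it is given as Python code, a commonly used alternative for turning typed text into a number is the built-in `int` or `float` function. The sketch below is not what this lesson's calculator uses; it only illustrates the idea, and `typed_text` is a stand-in for what `input()` would return.

```python
typed_text = "42"  # stands in for what input() would return

whole_number = int(typed_text)  # the string "42" becomes the integer 42
decimal_number = float("3.5")   # the string "3.5" becomes the float 3.5

print(whole_number + 1)    # 43
print(decimal_number * 2)  # 7.0

# Without conversion, + joins the text instead of adding numbers.
print(typed_text + "1")    # 421
```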


Now, let's design this calculator properly! We want a menu that is shown to the user every time an operation (add, subtract, etc.) finishes. In other words, for the program to run continuously, it has to loop, asking the user to choose an option each time.
Once the user has chosen an option, we want the program to execute the portion of code for that operation. Say the user types in option 1 (add): the first `if` will evaluate to TRUE, so the code inside it will be executed. Inside that `if`, the program will ask the user to enter a number (a.k.a. input 1) and another number (a.k.a. input 2), and then it will perform the `add` operation. Then it will return to the first line of the while loop and wait for the user to enter another option.<br>
Let's write it out in understandable English first (pseudocode):

```
START PROGRAM
print opening message

while we let the program run, do this:
    #Print what options you have
    print Option 1 - add
    print Option 2 - subtract
    print Option 3 - multiply
    print Option 4 - divide
    print Option 5 - quit program

    ask for which option it is you want
    if it is option 1:
        ask for first number
        ask for second number
        add them together
        print the result onscreen
    if it is option 2:
        ask for first number
        ask for second number
        subtract one from the other
        print the result onscreen
    if it is option 3:
        ask for first number
        ask for second number
        multiply!
        print the result onscreen
    if it is option 4:
        ask for first number
        ask for second number
        divide one by the other
        print the result onscreen
    if it is option 5:
        tell the loop to stop looping
Print onscreen a goodbye message
END PROGRAM
```
Let's put this in something that Python can understand:

In [None]:
#calculator program

#this variable tells the loop whether it should loop or not.
#1 means loop. Anything else means don't loop.

loop = 1

#this variable holds the user's choice in the menu:

choice = 0

while loop == 1:
    #print what options you have
    print("Welcome to calculator.py")

    print("your options are:")
    print(" ")
    print("1. Addition")
    print("2. Subtraction")
    print("3. Multiplication")
    print("4. Division")
    print("5. Quit calculator.py")
    print(" ")

    choice = eval(input("Choose your option: "))
    if choice == 1:
        add1 = eval(input("Add this: "))
        add2 = eval(input("to this: "))
        print(add1, "+", add2, "=", add1 + add2)
    elif choice == 2:
        sub2 = eval(input("Subtract this: "))
        sub1 = eval(input("from this: "))
        print(sub1, "-", sub2, "=", sub1 - sub2)
    elif choice == 3:
        mul1 = eval(input("Multiply this: "))
        mul2 = eval(input("with this: "))
        print(mul1, "*", mul2, "=", mul1 * mul2)
    elif choice == 4:
        div1 = eval(input("Divide this: "))
        div2 = eval(input("by this: "))
        print(div1, "/", div2, "=", div1 / div2)
    elif choice == 5:
        loop = 0
        
print("Thank you for using calculator.py!")

Welcome to calculator.py
your options are:
 
1. Addition
2. Subtraction
3. Multiplication
4. Division
5. Quit calculator.py
 
Choose your option: 1
Add this: 1
to this: 1
1 + 1 = 2
Welcome to calculator.py
your options are:
 
1. Addition
2. Subtraction
3. Multiplication
4. Division
5. Quit calculator.py
 
Choose your option: aaa


NameError: name 'aaa' is not defined

Play around with it: try all the options, entering integers (numbers without decimal points) and numbers with digits after the decimal point (known in programming as floating-point numbers). Try typing in text, and see how the program hits an error and stops running (that can be dealt with using error handling, which we can address later).

## 4.5 Define Your Own Functions
Well, it is all well and good that you can use other people's functions, but what if you want to write your own? This is where the `def` keyword comes in. (A keyword or operator is just something that tells Python what to do, e.g. the `+` operator tells Python to add things, the `if` keyword tells Python to do something if conditions are met, and the `def` keyword tells Python to define a function.)

This is how `def` works:

```
def function_name(parameter_1,parameter_2):
    {this is the code in the function}
    {more code}
    {more code}
    return {value to return to the main program}
{this code isn't in the function, because it isn't indented}
#remember to put a colon ":" at the end of the line that starts with 'def'
```


`function_name` is the name of the function. You write the code that is in the function below that line, and have it indented. (We will worry about `parameter_1` and `parameter_2` later, for now imagine there is nothing between the parentheses).

Functions run independently of the main program. When the computer comes to a function call, it doesn't see the function; instead it sees the value that the function returns. Variables are similar: to the computer, the variable 'a' doesn't look like 'a', it looks like the value that is stored inside it. Functions work the same way: to the main program (that is, the program that is calling the function), they look like the value they give back after running their portion of code.

A function is like a miniature program that is given some parameters: it runs, and then returns a value. Your main program sees only the returned value. Because it is a separate little program, the variables created inside a function are not visible to your main program, and the cleanest way to give a function a value from the main program is to pass it in as a parameter. For example, here is a function that prints the word `"hello"` on screen, and then returns the number `1234` to the main program:

In [None]:
# Below is the function
def hello():
    print("hello")
    return 1234

# And here is the function being used
result = hello()
print(result)

So what happened?
1.	When `def hello()` was run, a function called `hello` was created
2.	When the line `result = hello()` was run, the function `hello` was executed (the code inside it was run)
3.	The function `hello` printed `"hello"` on screen, then returned the number `1234` back to the main program
4.	The main program now sees the line as `result = 1234`, so `print(result)` printed `1234`

That accounts for everything that happened. Remember that the main program had NO IDEA that the word `hello` was printed on screen. All it saw was `1234`, and it printed that on screen.
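
The separation between a function and the main program can be seen in the sketch below. The function `make_greeting` and its variable `greeting` are hypothetical, and the `try`/`except` lines belong to error handling, which has not been covered yet; they are only there so that the example keeps running after the error.

```python
def make_greeting():
    greeting = "hello"  # 'greeting' exists only while the function runs
    return greeting

result = make_greeting()
print(result)  # hello

# The main program never sees the function's own variable.
try:
    print(greeting)
    saw_local_variable = True
except NameError:
    saw_local_variable = False

print(saw_local_variable)  # False
```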

## 4.6 Passing Parameters to functions
There is one more thing we will cover in this lesson - passing parameters to a function. Think back to how we defined functions:<br>
```
def function_name(parameter_1,parameter_2):
    {this is the code in the function}
    {more code}
    {more code}
    return {value (e.g. text or number) to return to the main program}
```

Where `parameter_1` and `parameter_2` are (between the parentheses), you put the names of variables that you want to put the parameters into. You can have as many variables as you need, just have them separated by commas. When you run a function, the first value you put inside the parentheses would go into the variable where `parameter_1` is. The second one (after the first comma) would go to the variable where `parameter_2` is. This goes on for however many parameters there are in the function (from zero, to the sky). For example:

In [None]:
def funnyfunction(first_word, second_word, third_word):
    print("The word created is: " + first_word + second_word + third_word)
    return first_word + second_word + third_word

When you run the function above, you would type in something like this: `funnyfunction("meat","eater","man")`. The first value (that is, "meat") would be put into the variable called first_word. The second value inside the brackets (that is, "eater") would be put into the variable called second_word, and so on. This is how values are passed from the main program to functions - inside the parentheses, after the function name.
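As a quick sketch of the call described above, the cell below repeats the function so it runs on its own and stores the returned value in a variable. The name `combined` is just an illustrative choice.

```python
def funnyfunction(first_word, second_word, third_word):
    print("The word created is: " + first_word + second_word + third_word)
    return first_word + second_word + third_word

# "meat" goes into first_word, "eater" into second_word, "man" into third_word
combined = funnyfunction("meat", "eater", "man")

# The returned value is an ordinary string we can keep using
print(len(combined))
```

Try swapping the order of the three values and see how the created word changes: the position in the call decides which parameter each value lands in.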

## 4.7 A Final Program
Think back to that calculator program. Did it look a bit complicated to you? I think it did, so let's re-write it, now using functions.

To design it: first we will define all the functions we are going to use with the `def` keyword. Then we will have the main program, with all that messy code replaced with nice, neat functions. This will make it so much easier to look at again in the future.

In [None]:
# Calculator program

# Here we will define our functions
# this prints the main menu, and prompts for a choice
def menu():
    #print what options you have
    print("Welcome to calculator.py")
    print("your options are:")
    print(" ")
    print("1. Addition")
    print("2. Subtraction")
    print("3. Multiplication")
    print("4. Division")
    print("5. Quit calculator.py")
    print(" ")
    return eval(input("Choose your option: "))
    
# this adds two numbers given
def add(a,b):
    print(a, "+", b, "=", a + b)
    
# this subtracts two numbers given
def sub(a,b):
    print(b, "-", a, "=", b - a)
    
# this multiplies two numbers given
def mul(a,b):
    print(a, "*", b, "=", a * b)
    
# this divides two numbers given
def div(a,b):
    print(a, "/", b, "=", a / b)
    
# PROGRAM REALLY STARTS HERE, AS CODE IS RUN
loop = 1
choice = 0
while loop == 1:
    choice = menu()
    if choice == 1:
        add(eval(input("Add this: ")),eval(input("to this: ")))
    elif choice == 2:
        sub(eval(input("Subtract this: ")),eval(input("from this: ")))
    elif choice == 3:
        mul(eval(input("Multiply this: ")),eval(input("by this: ")))
    elif choice == 4:
        div(eval(input("Divide this: ")),eval(input("by this: ")))
    elif choice == 5:
        loop = 0

print("Thank you for using calculator.py!")

# NOW THE PROGRAM REALLY FINISHES

The initial program had 34 lines of code. The new one actually had 35 lines of code! It is a little longer, but if you look at it the right way, it is actually simpler.

You defined all your functions at the top. They really aren't part of your main program - they are just lots of little programs that you will call upon later. You could even re-use these in another program if you needed them, and didn't want to tell the computer how to add and subtract again.

If you look at the main part of the program (between the line `loop = 1` and `print("Thank you for...")`), it is only 15 lines of code. That means that if you wanted to write this program differently, you would only have to write 15 or so lines, as opposed to the 34 lines you would normally have to without functions.

# 5.	Tuples, Lists, and Dictionaries

## 5.1	Introduction
Your brain still hurting from the last lesson? Never worry, this one will require a little less thought. We're going back to something simple - variables - but a little more in depth.

Variables are great at what they do - storing a piece of data that may change over time.

But what if you need to store a long list of information, which doesn't change over time? Say, for example, the names of the months of the year. Or maybe a long list of information that does change over time? Say, for example, the names of all your cats. You might get new cats, some may die one day. What about a phone book? For that you need to do a bit of referencing - you would have a list of names, and attached to each of those names, a phone number. How would you do that? Here's where lists, tuples and dictionaries come in handy.

## 5.2	The Solution - Lists, Tuples, and Dictionaries
For these three problems, Python uses three different solutions - Tuples, Lists, and Dictionaries:
* **Lists** are what they seem - a list of values. You can remove values from the list, and add new values to the end. Each one of the values is numbered, starting from zero - the first one is numbered zero, the second 1, the third 2, etc. Example: Your many cats' names.
* **Tuples** are just like lists, but you can't change their values. The values that you give it first, are the values that stay for the rest of the program. Again, each value is numbered starting from zero, for easy reference. Example: the names of the months of the year.
* **Dictionaries** are similar to what their name suggests - a dictionary. In a dictionary, you have an 'index' of words, and for each of them a definition. In Python, the word is called a 'key', and the definition a 'value'. The values in a dictionary aren't numbered - you look them up by their key instead. (Since Python 3.7 a dictionary remembers the order in which entries were added, but you still can't ask for an entry by its position.) You can add, remove, and modify the values in dictionaries. Example: telephone book.

### 5.2.1 Tuples
Tuples are pretty easy to make. You give your tuple a name, then after that the list of values it will carry. For example, the months of the year:

In [None]:
months = ('January','February','March','April','May','June',\
'July','August','September','October','November','December')

* Note that the `\` at the end of the first line carries that line of code over to the next line. It is a useful way of making long lines more readable. (Inside parentheses, brackets or braces Python lets you break a line without it, so here it is optional.)
* You may have spaces after the commas if you feel it necessary - it doesn't really matter

Python then organizes those values in a handy, numbered index - starting from zero, in the order that you entered them in. It would be organized like this:<br>

| Index | Value |
| :---: | :---: |
| 0 | January |
| 1 | February |
| 2 | March |
| 3 | April |
| 4 | May |
| 5 | June |
| 6 | July |
| 7 | August |
| 8 | September |
| 9 | October |
| 10 | November |
| 11 | December |


You can recall values from a tuple by index, as in the example below. For example, to print the month November, which is the 11th value in the tuple (index 10, because counting starts at zero), you can do:

In [None]:
print(months[10])

November
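
Since tuples can't be changed, it may help to see what happens if you try. This is a small sketch with a shortened, made-up tuple called `first_months` so that the real `months` tuple stays untouched:

```python
# A shortened tuple, just for this illustration
first_months = ('January', 'February', 'March')

# Trying to replace a value in a tuple raises a TypeError
try:
    first_months[0] = 'Janvier'
    outcome = "the tuple was changed"
except TypeError as error:
    outcome = "tuples cannot be changed: " + str(error)

print(outcome)
print(first_months)   # still the original three months
```

The tuple comes out exactly as it went in, which is the whole point of using one for values like month names.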


### 5.2.2 Lists
Lists are extremely similar to tuples. Lists are modifiable (or 'mutable', as a programmer may say), so their values can be changed. Most of the time we use lists, not tuples, because we want to easily change the values of things if we need to.

Lists are defined very similarly to tuples. Say you have FIVE cats, called Tom, Snappy, Kitty, Jessie and Chester. To put them in a list, you would do this:<br>

In [None]:
cats = ['Tom', 'Snappy', 'Kitty', 'Jessie', 'Chester']

As you see, the code is exactly the same as a tuple, EXCEPT that all the values are put between square brackets, not parentheses. Again, you don't have to have spaces after the comma.

You recall values from lists exactly the same as you do with tuples. For example, to print the name of your 3rd cat you would do this:

In [None]:
print(cats[2])

Kitty


You can also recall a range of values. For example, `cats[0:2]` would recall your 1st and 2nd cats in the list. Try it in the field above.
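
Here is a short sketch of recalling ranges (programmers call this 'slicing'). The variable names `first_two`, `from_third` and `last_cat` are only for illustration. Note that the value at the end index is not included:

```python
cats = ['Tom', 'Snappy', 'Kitty', 'Jessie', 'Chester']

first_two = cats[0:2]    # index 0 and 1; index 2 is NOT included
from_third = cats[2:]    # from index 2 to the end of the list
last_cat = cats[-1]      # negative indexes count from the end

print(first_two)
print(from_third)
print(last_cat)
```

The same slicing works on tuples and strings, too.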

Now, remember, lists can be modified. To add a value to a list, you use the `append()` function. Let's say you got a new cat called Catherine. To add her to the list you'd do this:

In [None]:
cats.append('Catherine')

Use the field below to check if the cat has been added to the list.

In [None]:
print(cats)

['Tom', 'Snappy', 'Kitty', 'Jessie', 'Chester', 'Catherine']


So this is the way of adding a new value to a list:
```Python
#add a new value to the end of a list:
list_name.append(value-to-add)

#e.g. to add the number 5038 to the list 'numbers':
numbers.append(5038)
```

Now to a sad situation - one of your cats, Snappy, disappeared, OMG! Very sad, but now you need to remove him (or her) from the list. Removing that sorry cat is an easy task, thankfully, using the `del` statement.

In [None]:
#Remove your 2nd cat, Snappy. 
del cats[1]

Check again which cats are in the list:

In [None]:
print(cats)

['Tom', 'Kitty', 'Jessie', 'Chester', 'Catherine']


You've just removed the 2nd cat in your list - poor Snappy.
And with that morbid message, let's move on to...

### 5.2.3 Dictionaries
Okay, so there is more to life than the names of your cats. You need to call your sister, mother, son, the fruit man, and anyone else who needs to know that their favorite cat is, unfortunately, missing. For that you need a telephone book.

Now, the lists we've used above aren't really suitable for a telephone book. For this scenario, you need to know a number based on someone's name. For this we need *Dictionaries*.

So how do we make a dictionary? Remember, dictionaries have keys (e.g. a person's name), and values (e.g. their phone number).

When you initially create a dictionary, it is very much the same as creating a tuple or a list. Remember, tuples had `(` and `)`, lists had `[` and `]`. Guess what! Dictionaries have `{` and `}` - curly braces. 

Here is an example below, showing a dictionary with four phone numbers in it:

In [None]:
#Make the phone book:
phonebook = { 'Andrew Parson':8806336, \
              'Emily Everett':6784346, \
              'Peter Power':7658344, \
              'Lewis Lame':1122345}
print(phonebook['Lewis Lame'])

1122345


When you run it you see that Lewis Lame's number is printed on screen. Notice how instead of identifying the value by a number, like in the cats and months examples, we identify the value, using another value - in this case the person's name.

OK, you've created a new phone book. Now you want to add new numbers to the book. What do you do? A very simple line of code:

In [None]:
#Add the person 'Gingerbread Man' to the phonebook:

phonebook['Gingerbread Man'] = 1234567

All that line is saying is that there is a person called Gingerbread Man in the phone book, and his number is `1234567`. In other words - the key is `Gingerbread Man`, and the value is `1234567`.

Check if it's added using the field below.

In [None]:
print(phonebook)

{'Andrew Parson': 8806336, 'Emily Everett': 6784346, 'Peter Power': 7658344, 'Lewis Lame': 1122345, 'Gingerbread Man': 1234567}


You delete entries in a dictionary just like in a list. Let's say Andrew Parson is your neighbor, and was involved in the disappearance of your cat. You never want to talk to him again, and therefore don't need his number. Just like in a list, you'd do this:

In [None]:
del phonebook['Andrew Parson']

Again, very easy: the `del` statement deletes any function, variable, or entry in a list or dictionary. (An entry in a dictionary behaves much like a variable with a number or text string as a name. This comes in handy later on.)

Check if the number is gone using the field below.

In [None]:
print(phonebook)

{'Emily Everett': 6784346, 'Peter Power': 7658344, 'Lewis Lame': 1122345, 'Gingerbread Man': 1234567}


Remember that append function that we used with the list? Well, there are quite a few of those that can be used with dictionaries. Below, I will write you a program, and it will incorporate some of those functions in. It will have comments along the way explaining what it does. Experiment as much as you like with it.

In [None]:
#A few examples of a dictionary

#First we define the dictionary
#it will have nothing in it this time
ages = {}

#Add a couple of names to the dictionary
ages['Sue'] = 23
ages['Peter'] = 19
ages['Andrew'] = 78
ages['Karren'] = 45

#Use an 'if' statement to check whether a key is in the dictionary.
#Remember - this is how 'if' statements work -
#they run if something is true
#and they don't when something is false.
if 'Sue' in ages:
    print("Sue is in the dictionary. She is", \
ages['Sue'], "years old")

else:
    print("Sue is not in the dictionary")

#Use the function keys() - 
#This function returns a view
#of all the names of the keys.
#E.g.
print("The following people are in the dictionary:")
print(ages.keys())

#You could use this function to
#put all the key names in a list:
keys = list(ages.keys())

#You can also get a view
#of all the values in a dictionary.
#You use the values() function:
print("People are aged the following:", \
ages.values())

#Put it in a list:
values = list(ages.values())

#You can sort lists, with the sorted() function
#It returns a new list with all values sorted
#alphabetically, numerically, etc...
#A dictionary has no sort method of its own - 
#sort its keys or its values instead
print(keys)
sortedkeys = sorted(keys)
print(sortedkeys)

print(values)
sortedvalues = sorted(values)
print(sortedvalues)

#You can find the number of entries
#with the len() function:
print("The dictionary has", \
len(ages), "entries in it")

Of course, keys in a dictionary don't have to be of type string. Here's an example having numbers as the key in the dictionary:

In [None]:
phone_lookup = {
    8005551212: "Info number",
    9495551212: "John Bates",
    3105551212: "Angie Higgs"}

print(phone_lookup[3105551212])

Angie Higgs


And the dictionary can store any arbitrary object/value. How about a dictionary that holds a dictionary, that holds another dictionary?

In [None]:
offices = {
    "atlanta": {
        "phone": "8885551212",
        "city": "Atlanta",
        "state": "GA"
    },
    "sf": {
        "phone": "8005551212",
        "city": "San Francisco",
        "state": "CA"
    }
}

print(offices)
print("San Francisco phone: %s" % offices["sf"]["phone"])

{'atlanta': {'phone': '8885551212', 'city': 'Atlanta', 'state': 'GA'}, 'sf': {'phone': '8005551212', 'city': 'San Francisco', 'state': 'CA'}}
San Francisco phone: 8005551212


Quite powerful, isn't it? But with great power comes great responsibility. What if a key is not in the dictionary and you get an error, like in the example below?

In [None]:
offices = {
    "atlanta": {
        "phone": "8885551212",
        "city": "Atlanta",
        "state": "GA"
    },
    "sf": {
        "phone": "8005551212",
        "city": "San Francisco",
        "state": "CA"
    }
}

# here we will try to get a phone for an undefined office. This will blow up with a KeyError exception
print("Get the phone of the NY office: %s" % offices["ny"]["phone_mobile"])

KeyError: 'ny'

You will get a KeyError. It's no big deal for hard-coded, predefined values, but what if the key comes from user input or some other dynamic source?

There are a few ways to deal with that. The most [Pythonic](https://docs.python-guide.org/writing/style/) might be using the "in" keyword

```
KEY in DICT
```
 

In [None]:
offices = {
    "atlanta": {
        "phone": "8885551212",
        "city": "Atlanta",
        "state": "GA"
    },
    "sf": {
        "phone": "8005551212",
        "city": "San Francisco",
        "state": "CA"
    }
}

if "sf" in offices:
  print("sf key is in the offices dictionary")

if "ny" not in offices:
  print("ny key is NOT in the offices dictionary")

sf key is in the offices dictionary
ny key is NOT in the offices dictionary
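
Two other common ways to deal with a missing key are the dictionary's `get()` method, which returns a default value instead of raising an error, and catching the `KeyError` yourself. The sketch below uses a trimmed-down, made-up dictionary called `small_offices` and the placeholder text `"unknown"`; both are assumptions for the illustration only.

```python
# A trimmed-down copy of the offices dictionary, for illustration
small_offices = {"sf": {"phone": "8005551212", "city": "San Francisco"}}

# get(key, default) returns the default when the key is missing.
# Using {} as the default lets us chain a second get() safely.
sf_phone = small_offices.get("sf", {}).get("phone", "unknown")
ny_phone = small_offices.get("ny", {}).get("phone", "unknown")
print(sf_phone)
print(ny_phone)

# Alternatively, try the lookup and handle the error if it happens
try:
    print(small_offices["ny"]["phone"])
except KeyError as missing_key:
    print("No entry found for key", missing_key)
```

Which one you pick is mostly a matter of taste: `in` reads nicely in an `if`, `get()` is handy when a sensible default exists.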


And of course you can always ask the dictionary for the keys that it holds:

In [None]:
offices = {
    "atlanta": {
        "phone": "8885551212",
        "city": "Atlanta",
        "state": "GA"
    },
    "sf": {
        "phone": "8005551212",
        "city": "San Francisco",
        "state": "CA"
    }
}

print(offices.keys())

dict_keys(['atlanta', 'sf'])


or for the list of values that it holds:

In [None]:
offices = {
    "atlanta": {
        "phone": "8885551212",
        "city": "Atlanta",
        "state": "GA"
    },
    "sf": {
        "phone": "8005551212",
        "city": "San Francisco",
        "state": "CA"
    }
}

print(offices.values())

dict_values([{'phone': '8885551212', 'city': 'Atlanta', 'state': 'GA'}, {'phone': '8005551212', 'city': 'San Francisco', 'state': 'CA'}])


# 6. For Loop

## 6.1 Introduction
Well, in the first lesson about loops, I said I would put off teaching you the for loop, until we had reached lists. Well, here it is!

## 6.2 The `for` Loop
Basically, the `for` loop does something for every value in a list. Here is an example of it in code:

In [None]:
# Example 'for' loop
# First, create a list to loop through:
newList = [45, 'eat me', 90210, "The day has come, the walrus said, \
to speak of many things", -67]

# create the loop:
# Goes through newList, and sequentially puts each bit of information
# into the variable value, and runs the loop
for value in newList:
    print(value)

45
eat me
90210
The day has come, the walrus said, to speak of many things
-67


As you see, when the loop executes, it runs through all of the values in the list mentioned after `in`. It then puts them into `value`, and executes through the loop, each time with value being worth something different. 

In [None]:
# example
sentence = "This is Anyone AI"

for word in sentence:
    print(word)
    


T
h
i
s
 
i
s
 
A
n
y
o
n
e
 
A
I


A couple of things you've just learned:
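* As you see, strings (remember - strings are lines of text) behave like lists of characters when you loop over them.
* The program went through each of the characters (or values) in `sentence`, including the spaces, and printed them on screen. The loop variable is called `word`, but it really holds one character at a time.<br>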
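
If you actually want to loop over the words rather than the characters, one way is to cut the string into a list first with the string's `split()` method. A minimal sketch (the variable name `words` is just for illustration):

```python
sentence = "This is Anyone AI"

# split() with no arguments cuts the string at whitespace
# and returns a list of the pieces
words = sentence.split()
print(words)

for word in words:
    print(word)
```

Now the loop runs four times, once per word, instead of once per character.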

Loops are very powerful and used everywhere. Remember the dictionary from the last lesson? When we call `keys()`, the dictionary returns a view of its keys that we can loop through:



In [None]:
offices = {
    "atlanta": {
        "phone": "8885551212",
        "city": "Atlanta",
        "state": "GA"
    },
    "sf": {
        "phone": "8005551212",
        "city": "San Francisco",
        "state": "CA"
    }
}

for key in offices.keys():
  print(key)

atlanta
sf


You can also use the `for` loop to go through a dictionary with both key and value. For instance:

In [None]:
phonebook = {
    "John": "3105551212",
    "Bob": "9496667777",
    "Alice": "7144445555"
}

# similar to a simple loop, but items() returns two values: the key and the value from the dictionary
for key, item in phonebook.items():
  print("Here is a key item combo. Key: %s, Number: %s" % (key, item))

Here is a key item combo. Key: John, Number: 3105551212
Here is a key item combo. Key: Bob, Number: 9496667777
Here is a key item combo. Key: Alice, Number: 7144445555


When would you use a loop like that? Well, when for instance you want to look at the key or value and make a decision to do something based on the value.

All these examples are simple and the lists are small but in the real world you will deal with much more data and you have to keep in mind that looping a large dataset takes time. Usually we loop through a list because we are looking for something or we want to do something with the value. If you are looking for specific data or value, it's a good practice to finish the loop as soon as you can. Like most languages, Python has the ```break``` keyword that will exit the current loop when instructed:



In [None]:
loop_count = 0

# the range() function will give us the numbers 0 through 999
for number in range(1000):
  loop_count += 1
  if number == 667:
    break

print("Found the number: %s after %s loops" % (number, loop_count))

Found the number: 667 after 668 loops


Python also has the ```continue``` statement that will "skip" the for loop to the next value. The statement is usually used when you want to ignore certain values:

In [None]:
my_list_of_numbers = []
for number in range(1, 100):
  # add all numbers to the list except 15
  if number == 15:
    continue
  my_list_of_numbers.append(number)

print(my_list_of_numbers)


[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]


## 6.3 Nested loops
A nested loop is a loop inside the body of another (outer) loop. The inner or outer loop can be any type, such as a ```while``` loop or ```for``` loop. For example, an outer ```for``` loop can contain a ```while``` loop and vice versa.

The outer loop can contain more than one inner loop. There is no limit on how deeply loops can be nested.

In a nested loop, the total number of iterations is the number of iterations of the outer loop multiplied by the number of iterations of each inner loop. Three nested loops of 100 iterations each, as in the example below, run 100 × 100 × 100 = 1,000,000 times.

So be careful, things might get slow if you nest too many large loops.

In [None]:
total_loop_count = 0

for first in range(0, 100):
  for second in range(0, 100):
    for third in range(0, 100):
      total_loop_count += 1

print("total_loop_count is %s" % total_loop_count)


total_loop_count is 1000000


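The ```break``` and ```continue``` statements are your friends when dealing with nested loops; use them whenever they let you skip work you don't need. Keep in mind that ```break``` only exits the loop it is written in, not the loops around it.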
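
Here is a small sketch of stopping nested loops early. The list `numbers`, the `target` value and the variable names are made up for this illustration. Because `break` only leaves the loop it sits in, the outer loop needs its own check:

```python
# Hypothetical task: find the first pair of numbers that adds up to target
numbers = [3, 8, 11, 5, 2]
target = 13

found_pair = None
checks = 0

for i in range(len(numbers)):
    for j in range(i + 1, len(numbers)):
        checks += 1
        if numbers[i] + numbers[j] == target:
            found_pair = (numbers[i], numbers[j])
            break              # leaves the inner loop only
    if found_pair is not None:
        break                  # leaves the outer loop as well

print("Found", found_pair, "after", checks, "checks")
```

Without the two `break` statements the loops would compare every possible pair, even after the answer was already known.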

## 6.4 List comprehension
As you have seen, lists and loops are very powerful, and you will often find yourself looping through a list and checking each value in order to build a new list from a subset of the values. Something like this is quite common:

In [None]:
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = []

for x in fruits:
  if "a" in x:
    newlist.append(x)

print(newlist)

['apple', 'banana', 'mango']


The ```newlist``` now contains ```['apple', 'banana', 'mango']```.

A Pythonic way to write the above code is to use the list comprehension syntax. Outside of Python, not many languages have this concept, but it is very prevalent in Python, so you are bound to encounter it everywhere:

In [None]:
fruits = ["apple", "banana", "cherry", "kiwi", "mango"]
newlist = [fruit for fruit in fruits if "a" in fruit]

print(newlist)

['apple', 'banana', 'mango']


Same result, less code, but it will take a while to get used to! The above example is quite contrived. Something more common is to modify values of the list. For instance:

In [None]:
integers = [1,2,3,4,5,6,7,8,9]
floats = [float(i) for i in integers]
print(floats)

[1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
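
A list comprehension can also filter and modify in one go. As a small sketch (the name `even_squares` is only for illustration), this keeps the even numbers and squares them:

```python
integers = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Read it as: "n * n, for every n in integers, but only if n is even"
even_squares = [n * n for n in integers if n % 2 == 0]

print(even_squares)
```

The part before `for` says what goes into the new list, and the optional `if` at the end decides which values are let through.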


# 7. File I/O

## 7.1 Introduction
As a data scientist, your job will be to deal with data all day long. Whether you read it from the web or from a file, or you output it to a file or to some server, it's all Input and Output. Where it gets a little bit tricky is that in a hosted Jupyter Notebook environment you are not running Python on your local computer, and hence you don't have access to your local file system.

## 7.2 Opening a file
To open a text file you use, well, the `open()` function. Seems sensible. You pass certain parameters to `open()` to tell it in which way the file should be opened - `r` for read only, `w` for writing only (if there is an old file, it will be written over), `a` for appending (adding things on to the end of the file) and `r+` for both reading and writing. But less talk, let's open a normal text file for reading:

In [None]:
filename = './sample_data/california_housing_test.csv'
fl = open(filename, 'r') # Open the file
linecount = 0
for line in fl:
    print(line)
    linecount += 1
    if linecount > 10: break # let's not print too many lines to the console
fl.close() #close the file 

"longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income","median_house_value"

-122.050000,37.370000,27.000000,3885.000000,661.000000,1537.000000,606.000000,6.608500,344700.000000

-118.300000,34.260000,43.000000,1510.000000,310.000000,809.000000,277.000000,3.599000,176500.000000

-117.810000,33.780000,27.000000,3589.000000,507.000000,1484.000000,495.000000,5.793400,270500.000000

-118.360000,33.820000,28.000000,67.000000,15.000000,49.000000,11.000000,6.135900,330000.000000

-119.670000,36.330000,19.000000,1241.000000,244.000000,850.000000,237.000000,2.937500,81700.000000

-119.560000,36.510000,37.000000,1018.000000,213.000000,663.000000,204.000000,1.663500,67000.000000

-121.430000,38.630000,43.000000,1009.000000,225.000000,604.000000,218.000000,1.664100,67000.000000

-120.650000,35.480000,19.000000,2310.000000,471.000000,1341.000000,441.000000,3.225000,166900.000000

-122.840000,38.400000,15.000000,3080.000000,617.000000,144

A better way to write this is:

In [None]:
filename = './sample_data/california_housing_test.csv'
linecount = 0
with open(filename, 'r') as fl:
    for line in fl:
        print(line)
        linecount += 1
        if linecount > 10: break # let's not print too many lines to the console

With the second method you don't have to call `fl.close()`; the file is closed automatically when the `with` block ends, even if an error occurs inside it.

## 7.3 Read the whole thing...
If you want to print the whole file, instead of looping over the lines you can read it all at once with `read()`:


In [None]:
filename = './sample_data/california_housing_test.csv'
with open(filename, 'r') as fl:
  print(fl.read())

While this is convenient, it's not really advisable for large files, because `read()` loads the entire file into memory at once!

## 7.4 Other I/O Functions
There are many other functions that help you with dealing with files. They have many uses that empower you to do more, and make the things you can do easier. Let's have a look at `tell()`, `readline()`, `readlines()`, `write()` and `close()`.

`tell()` returns where the cursor is in the file. It has no parameters, just type it in (like what the example below will show). This is infinitely useful, for knowing what you are referring to, where it is, and simple control of the cursor. To use it, type `fileobjectname.tell()` - where fileobjectname is the name of the file object you created when you opened the file (in `openfile = open('pathtofile', 'r')` the file object name is `openfile`).

`readline()` reads from where the cursor is till the end of the line. Remember that the end of the line isn't the edge of your screen - the line ends when you press enter to create a new line. This is useful for things like reading a log of events, or going through something progressively to process it. There are no parameters you have to pass to `readline()`, though you can optionally tell it the maximum number of bytes/letters to read by putting a number in the brackets. Use it with `fileobjectname.readline()`.<br>

`readlines()` is much like `readline()`, however `readlines()` reads all the lines from the cursor onwards, and returns a list, with each list element holding one line of text (including the newline character at its end). Use it with `fileobjectname.readlines()`. For example, if you had the text file:
```
Line 1

Line 3
Line 4

Line 6
```
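then the returned list from `readlines()` would be (each entry keeps its newline character `\n`; the last entry assumes the file ends with a newline):<br>

| Index | Value |
| :--: | :--: |
| 0 | `'Line 1\n'` |
| 1 | `'\n'` |
| 2 | `'Line 3\n'` |
| 3 | `'Line 4\n'` |
| 4 | `'\n'` |
| 5 | `'Line 6\n'` |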

The `write()` function, writes to the file. How did you guess??? It writes from where the cursor is, and overwrites text in front of it - like in MS Word, where you press 'insert' and it writes over the top of old text. To utilize this most purposeful function, put a string between the brackets to write e.g. `fileobjectname.write('this is a string')`.

`close()`, you may figure, closes the file so that you can no longer read or write to it until you reopen it again. Simple enough. To use, you would write `fileobjectname.close()`. Simple!

Later in the course you can try this in the Python command line. Open up a test file (or create a new one...) and play around with these functions. You can do some simple (and very inconvenient) text editing.
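
To get a feel for these functions, here is a small sketch that writes a scratch file and reads it back. The file name `io_demo.txt` and the temporary folder are assumptions made for this example, so that nothing in your own files gets touched:

```python
import os
import tempfile

# Hypothetical scratch file inside a fresh temporary folder
path = os.path.join(tempfile.mkdtemp(), "io_demo.txt")

# 'w' creates the file (or overwrites an old one) for writing
with open(path, "w") as fl:
    fl.write("Line 1\n")
    fl.write("Line 2\n")
    fl.write("Line 3\n")

with open(path, "r") as fl:
    start = fl.tell()          # where the cursor is before reading
    first = fl.readline()      # reads up to and including the first newline
    after_first = fl.tell()    # the cursor has moved past the first line
    rest = fl.readlines()      # everything from the cursor onwards, as a list

print(start, repr(first), after_first)
print(rest)
```

Notice that `readlines()` only returns the two lines that were still ahead of the cursor, because `readline()` had already consumed the first one.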

## 7.5 File I/O Hands-on

Luckily for us, the Google Colab environment ships with a little bit of sample data. If you click on the small folder icon in the left-hand toolbar, you will see a "sample_data" folder that contains a few CSV files. You can download these files if you want and take a look at them in Excel or a similar app. If you take a look at california_housing_test.csv you will see this:

```
"longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income","median_house_value"
-122.050000,37.370000,27.000000,3885.000000,661.000000,1537.000000,606.000000,6.608500,344700.000000
-118.300000,34.260000,43.000000,1510.000000,310.000000,809.000000,277.000000,3.599000,176500.000000
-117.810000,33.780000,27.000000,3589.000000,507.000000,1484.000000,495.000000,5.793400,270500.000000
-118.360000,33.820000,28.000000,67.000000,15.000000,49.000000,11.000000,6.135900,330000.000000
-119.670000,36.330000,19.000000,1241.000000,244.000000,850.000000,237.000000,2.937500,81700.000000
.
.
.
.
```

Let's open the file and print out the header line plus the first 5 lines of data:

In [None]:
filename = './sample_data/california_housing_test.csv'
line_count = 0
with open(filename, 'r') as fl:
    for line in fl:
        if line_count > 5: break
        print(line)
        line_count += 1


"longitude","latitude","housing_median_age","total_rooms","total_bedrooms","population","households","median_income","median_house_value"

-122.050000,37.370000,27.000000,3885.000000,661.000000,1537.000000,606.000000,6.608500,344700.000000

-118.300000,34.260000,43.000000,1510.000000,310.000000,809.000000,277.000000,3.599000,176500.000000

-117.810000,33.780000,27.000000,3589.000000,507.000000,1484.000000,495.000000,5.793400,270500.000000

-118.360000,33.820000,28.000000,67.000000,15.000000,49.000000,11.000000,6.135900,330000.000000

-119.670000,36.330000,19.000000,1241.000000,244.000000,850.000000,237.000000,2.937500,81700.000000



## 7.6 Parsing data

Reading some text out of a file is easy but to work with the data you actually have to parse it from a format (csv, json etc.) into something that you can work with in Python.

Python has MANY ways to do this. Later on you will most likely use something like pandas, but for now let's use the plain old ```csv``` module:

In [None]:
import csv

filename = './sample_data/california_housing_test.csv'
row_count = 0
with open(filename, 'r') as fl:
  csvreader = csv.reader(fl, delimiter=',', quotechar='"')
  print("Here are CSV contents")
  for row in csvreader:
    # only process the first few rows
    if row_count > 2: break

    # row is a list
    print("Row: %s | %s | %s ... and so on" % (row[0], row[1], row[2]))
    row_count += 1


Here are CSV contents
Row: longitude | latitude | housing_median_age ... and so on
Row: -122.050000 | 37.370000 | 27.000000 ... and so on
Row: -118.300000 | 34.260000 | 43.000000 ... and so on


Pretty much all the parsing of data follows the same format: open the source, pass it to some reader that understands the format and finally read and consume the data.

Of course, now that you have access to the data and it has been parsed into rows and columns (like an Excel spreadsheet), you might have to do some conversion, cleaning and sometimes even interpretation of data.

Case in point: a record might have a "date_created" value like '2017-11-13 10:22:54.677'. You know it's a date, but when you loop through the rows and columns, the CSV parser returns it as a simple string. If you need to perform any calculation on the date, or change the format in which it is displayed, you will need to parse the string into a date/time type. Fortunately, Python has many modules that will help you to parse and convert data from strings.

## 7.7 Dealing with dates
Dates and times are complicated. You will have to deal with many formats, timezones and at times just plain bad data. Python has the [```datetime```](https://docs.python.org/3/library/datetime.html) module, which can deal with reasonable data, and there are quite a few other libraries if something more robust is needed.

The code below imports the datetime module and uses its datetime class (it's unfortunate that both have the same name), specifically the ```strptime``` method, to parse a string and convert it into a date by telling it what format the date has:

In [None]:
# import the lib
import datetime

# store a string representation of a date
my_date = "2017-11-13 10:22:54.677"

# parse the string into a date
date = datetime.datetime.strptime(my_date, '%Y-%m-%d %H:%M:%S.%f')

print("Date:%s" % date)
print("Day: %s" % date.day)
print("Month: %s" % date.month)
print("Year: %s" % date.year)

Once you have a date type, you can easily perform calculations on it using [```timedelta```](https://docs.python.org/3/library/datetime.html#timedelta-objects) or format it a certain way using [```strftime```](https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior).
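
As a minimal sketch of both ideas, the cell below adds 30 days to the parsed date with ```timedelta``` and formats the result with ```strftime```. The 30-day offset, the output format and the names ```due_date```, ```elapsed``` and ```formatted``` are assumptions chosen purely for illustration.

```python
import datetime

# parse the same string used above
date = datetime.datetime.strptime("2017-11-13 10:22:54.677", '%Y-%m-%d %H:%M:%S.%f')

# adding a timedelta to a date gives a new date
due_date = date + datetime.timedelta(days=30)

# subtracting two dates gives a timedelta
elapsed = due_date - date

# strftime turns the date back into a string in the format you choose
formatted = due_date.strftime('%d/%m/%Y %H:%M')

print("Due: %s" % formatted)
print("Days elapsed: %s" % elapsed.days)
```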

## 7.8 Dealing with numbers
Converting strings to numbers, for instance integers and floats, is very easy:

In [None]:

numbers = ["1", "99", "45", "9999"]

for n in numbers:
  print("This is an integer: %s" % int(n))
  print("and this is a float: %s" % float(n))
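
Not every string is a valid number: ```int``` and ```float``` raise a ```ValueError``` when they cannot convert the text. One possible way to handle this is sketched below. The helper ```to_float``` and the sample ```values``` are hypothetical and not part of the lesson's data.

```python
def to_float(value, default=None):
  """Return the string value as a float, or default if it cannot be converted."""
  try:
    return float(value)
  except ValueError:
    # hypothetical policy: fall back to a default instead of stopping the program
    return default

# made-up sample values, some of which are not numbers
values = ["1", " 99 ", "4.5", "n/a", ""]
converted = [to_float(v) for v in values]
print(converted)
```

Whether to substitute a default, skip the row, or stop with an error depends on what the data is used for.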


## 7.9 Dealing with bad strings
"Don't make any assumptions about your data, and for the most part don't trust it" is a good mantra to live by. Any data supplied by an external source, especially user-entered data, will have many anomalies that you will have to deal with.

CustomerId, FirstName, LastName, Phone
1, JOHN, SMITH, 9497778888
1, JOhn, smith , 9497778888

Are the above the same customer? Most likely, but to your Python program they might not be the same.

We covered strings briefly in section 2.5, but here are a few more string functions that will make your life easier when dealing with bad data:

In [None]:
# dealing with white space: strip, lstrip and rstrip
string_with_white_space = " smith  "
print("Value:%s!" % string_with_white_space.strip())
print("Value:%s!" % string_with_white_space.lstrip())
print("Value:%s!" % string_with_white_space.rstrip())

# converting case: upper, lower, capitalize
last_name = string_with_white_space.strip()
print("Lower case: %s" % last_name.lower())
print("Upper case: %s" % last_name.upper())
print("Capitalized: %s" % last_name.capitalize())

Here is a great summary of all the string functions.
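
To tie this back to the two customer records at the start of this section, here is a minimal sketch that cleans each field with ```strip``` and ```lower``` before comparing. The helper ```normalize``` is a hypothetical name, and splitting on commas is a simplifying assumption that only works because no field contains a comma.

```python
record_a = "1, JOHN, SMITH, 9497778888"
record_b = "1, JOhn, smith , 9497778888"

def normalize(record):
  # hypothetical helper: split on commas, then strip white space and lower-case each field
  return [field.strip().lower() for field in record.split(",")]

# the raw strings differ in case and white space
print("Raw records equal: %s" % (record_a == record_b))

# after cleaning, the fields match
print("Normalized records equal: %s" % (normalize(record_a) == normalize(record_b)))
```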