# Python Workshop Week 2: Conditions, Loops and Files

Katy Brown

kab84@cam.ac.uk

After the workshop last week, you are now familiar with functions and variables.

We also covered the following types of variable:
* string `str`
* integer `int`
* float `float`
* boolean `bool`
* list `list`
* tuple `tuple`
* dictionary `dict`

We also covered using Python to perform simple arithmetic and True / False statements.

This week, the code examples will be in Jupyter, but we'll use Spyder, a Python integrated development environment (**IDE**), to do the exercises.

In Spyder you have two windows - the **editor** and the **console**.

The console is used to test out Python commands interactively.

The editor is used to write longer bits of code or **scripts**.

**Exercise** - using the Spyder **console**
* Create two integer variables
* Add them together
* Create a dictionary with your variables as the *values* and any two strings as *keys*
* Compare your two variables to give a `True` and a `False` result

This week we will cover more complex functions, then conditions and loops.

Conditions and loops are two of the most important parts of any programming language - once you understand these you can write code to do almost anything.

We will then cover using Python to read and write files.

## Modules

There are lots of things we can do with basic Python alone, but there are also hundreds of additional Python **modules**.  These are essentially sets of extra functions you can add to Python.

Many additional modules are automatically included with all Python versions, and even more are included with Anaconda.

We add the functions from these modules to Python using the `import` command.

For example, to import the `os` module.

In [None]:
import os

We can then access the functions from the `os` module by typing `os.functionname`.

In [None]:
os.getcwd() # Returns the current working directory (usually the directory the notebook is saved in)

In [None]:
x = os.getcwd()
os.listdir(x) # Lists the contents of the current directory

**Exercise**

In the **console** in Spyder, import the `math` module and use the `math.sqrt` function to find the square root of 144.

*Bonus: Import the numpy module and use the `numpy.mean` function to find the mean of a list*

## More Complex Functions

All the functions we discussed last week (`print`, `type`, `sorted`, `len`) take a single variable as their input.

However, it is common for functions to take multiple **arguments** instead - the function needs more than one piece of information.  These arguments are provided after the function name and separated by commas.

Some arguments are required, others are optional.

For example, the `round` function rounds a float to a certain number of decimal places.

If we just provide one argument, it will round to the default number of decimal places, zero.

In [None]:
round(100.141041)

However, if we add an additional argument after a comma we can specify the number of decimal places.

In [None]:
round(100.141041, 2)

The `help` function can help us with this.

In [None]:
help(round)

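In the help text, optional arguments are shown either in square brackets or with a default value such as `ndigits=None`, depending on your Python version; the others are required.  So we have to pass `round` a float to round (`number`) and we can optionally pass it the number of decimal places to round to (`ndigits`).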

We can either use the order listed in the `help` function or provide the names of the arguments.

In [None]:
another_rounded_number = round(ndigits=4, number=5.12341234)
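
As a quick check, assuming Python 3 (where `round` accepts its arguments by name), the sketch below calls `round` both ways and compares the results. The variable names are made up for illustration.

```python
# Positional: arguments are matched by their order (number first, then ndigits)
rounded_by_position = round(5.12341234, 4)

# Keyword: arguments are matched by name, so the order no longer matters
rounded_by_name = round(ndigits=4, number=5.12341234)

print(rounded_by_position)
print(rounded_by_name)
print(rounded_by_position == rounded_by_name)  # True if both styles agree
```

Naming the arguments is longer to type, but it can make it clearer to a reader which value is which.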

The `randint` function from the `random` module generates a random integer between two values.

In [None]:
import random

This function needs two pieces of information:
* The minimum value
* The maximum value

This function therefore has two required arguments.

The `help` function tells us the order to provide this information.

In [None]:
help(random.randint)

This function first expects the minimum, which is referred to as `a`, then the maximum, referred to as `b`.

In [None]:
a_random_integer = random.randint(3, 8)

In [None]:
print (a_random_integer)

In [None]:
another_random_integer = random.randint(a=10, b=50)

In [None]:
print(another_random_integer)

**Exercise**

Look at the help page for the `sample` function from the `random` module (`random.sample`).

In the **console** in Spyder, try to run this function and provide the two arguments it needs in the order specified in the help.

*Hint - population should be a list and k should be an integer*

*Bonus: Provide the arguments to the function in a different order, using their names to tell Python which is which*

## Conditions

Conditions are an important part of programming in any language.

These allow you to have parts of your program which only run under some conditions but not others.

Conditions are built by combining **if statements** with the True / False statements you learned about last week.

In [None]:
x = 1
print (x == 1)

In [None]:
if x == 1:
    print ("x is one")

If the statement is `True`, the code following the statement will be run.

In [None]:
x = 2
print (x == 2)

In [None]:
if x == 1:
    print ("x is one")

If the statement is `False`, nothing happens.

Python decides which code to run when the condition is met using **indentations**.

In many languages it is encouraged to use indents to make your code easier to read, but in Python it is required.

In [None]:
x = 2
if x == 2:
    print ("This code is run when x is equal to 2")
    print ("So is this code")
print ("But this code runs whether x == 2 is True or False")

**Exercise**

Change `x` to equal `3` in the above example and run it again.

**Exercise** - in the Spyder **editor**

Write a statement which is `True` and then an `if` statement which prints something if it is `True`.

Change the statement so it is `False` and run the `if` statement again.

Add another `print` statement at the end which runs regardless of whether the statement is `True` or `False`.

*Bonus: Repeat this but include an `and` or `or` in your `if` statement.*

*Bonus: Assign a boolean variable and use this in your `if` statement.*

We make `if` statements more useful by combining them with `else` and `elif` (else if) statements.

`else` and `elif` can only be used after an `if` statement

`else` runs the code when the `if` statement is not `True`

In [None]:
x = 1
if x == 2:
    print ("x equals two")
else:
    print ("x doesn't equal two")

In [None]:
x = 2
if x == 2:
    print ("x equals two")
else:
    print ("x doesn't equal two")

`elif` is used when you want to make a chain of `if` statements - it is short for else if.

If the first statement is `True`, its code will run; if it is `False`, the following `elif` statement will be evaluated.

In [None]:
x = 2
if x == 1:
    print ("x equals one")
elif x == 2:
    print ("x equals two")

In [None]:
x = "hello"
if "h" in x:
    print ("yes")
elif "y" in x:
    print ("no")

In [None]:
x = "yello"
if "h" in x:
    print ("yes")
elif "y" in x:
    print ("no")

You can add `else` to the end of a series of `elif` statements to provide code to run if nothing else is `True`.

In [None]:
x = [1, 2, 3]

if type(x) == int:
    print ("x is an integer")
elif type(x) == float:
    print ("x is a float")
elif type(x) == tuple:
    print ("x is a tuple")
else:
    print ("x is not an integer, float or tuple")

In [None]:
x = 1

if type(x) == int:
    print ("x is an integer")
elif type(x) == float:
    print ("x is a float")
elif type(x) == tuple:
    print ("x is a tuple")
else:
    print ("x is not an integer, float or tuple")

In [None]:
x = ((1, 2, 3))

if type(x) == int:
    print ("x is an integer")
elif type(x) == float:
    print ("x is a float")
elif type(x) == tuple:
    print ("x is a tuple")
else:
    print ("x is not an integer, float or tuple")

It's important to remember that once one statement is `True`, none of the following `elif` statements will be evaluated, so their code will not run whether they are `True` or not.

In [None]:
x = 1
if type(x) == int:
    print ("x is an integer")
elif x == 1: # this is not evaluated because the previous statement is True
    print ("x is one") 

However, if you use two `if` statements they will both be evaluated regardless of the result of the first one.

In [None]:
x = 1
if type(x) == int:
    print ("x is an integer")
if x == 1: # this is always evaluated
    print ("x is one") 

**Exercise**

Write a script in the Spyder editor using `if`, `elif` and `else` to do the following.
* Print "ONE" if x is greater than 5
* Print "TWO" if x is less than or equal to 5 but greater than 2
* Print "THREE" if x is less than or equal to 2

Run the script with three different values of x - `10`, `3` and `1`.

*Bonus: Add a step to filter out non-integer values of x*

*Bonus: Alter your script to add 1, 2 or 3 to the original value of x instead of printing a string.*

## Loops

A **loop** is used to repeat something lots of times.

Loops are one of the main ways programming saves time compared to doing things manually - you can repeat the same action millions of times in a few seconds.

We define loops in Python using `for` or `while`.  Today we will focus on `for` loops, as these are more common.

With `for` loops we use the syntax `for x in y:`

`y` represents an **iterable** - something containing a sequence of items.  This could be a list, tuple, dictionary or range (we'll cover ranges in a minute).

`x` represents the current item in the iterable - so on each **iteration** of the loop, the value of `x` changes to the next thing in the iterable.

In [None]:
y = [1, 2, 3, 4, 5]
print (type(y))

In [None]:
for x in y:
    print ("The current value of x is:", x)

We can use any variable name instead of x and y.

In [None]:
banana = ((1, "some_text", True))
for pyjama in banana:
    print (pyjama)
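
Strings can be looped over in the same way, one character at a time. This is a minimal sketch with a made-up string and variable names, added here only to illustrate the idea.

```python
word = "cat"  # a made-up example string
letter_count = 0

for letter in word:
    print(letter)                    # prints c, then a, then t
    letter_count = letter_count + 1  # count how many times the loop has run

print("Number of letters:", letter_count)
```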

With a dictionary, we automatically iterate through the **keys**.

In [None]:
D = {"a": 100, "b": "hello", "c": 0.25}

In [None]:
for item in D:
    print (D[item])

**Exercise**

Write a script in Jupyter to iterate through a list and print each item one by one.

*Bonus: Repeat this action but with a dictionary.  Try printing the keys, the values, then both.*

Instead of just printing the items, we can do things to them.

In [None]:
my_list = [10, 20, 30, 40, 50, 60]
for list_item in my_list:
    print (list_item ** 2) # print list_item squared

In [None]:
my_list = [10, 20, 30, 40, 50, 60]
y = 0
for list_item in my_list:
    y = y + list_item
print (y)

We can also store the results in a new variable - the easiest is a list.

In [None]:
original_list = [10, 20, 30, 40]
new_list = [] # This makes an empty list which we can use to store things
for item in original_list:
    new_list.append(item * 2) # add item * 2 to the new list
print (new_list)

**Exercise**

Write a loop to take a list of numbers - `3`, `6`, `9`, `12`, iterate through them, add 100 to each value and store the output in a new list.

*Bonus: Try storing the output in a dictionary with the input list as the keys and the output list as the values.*

Sometimes we want to do something a fixed number of times rather than for each value in a specific iterable.

For this we can use the `range` function.

`range(x, y, z)` is a function which produces a sequence of integers starting at `x` and stopping before `y` (so `y` itself is not included), with an interval of `z`.

If you don't provide a value of `z`, the default is 1.

In [None]:
my_range = range(0, 10)

In [None]:
for m in my_range:
    print (m)

In [None]:
my_range = range(5, 9)
for m in my_range:
    print (m)

In [None]:
my_range = range(0, 10, 2)
for m in my_range:
    print (m)
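
One way to see a whole range at once, without a loop, is to convert it to a list with the `list` function. The short sketch below uses this to show that the end value is never included.

```python
# list() turns a range into a list so the whole sequence can be seen at once
five_to_eight = list(range(5, 9))
even_numbers = list(range(0, 10, 2))
first_three = list(range(3))  # with one argument, the range starts at 0

print(five_to_eight)  # [5, 6, 7, 8] - the end value 9 is not included
print(even_numbers)   # [0, 2, 4, 6, 8] - the end value 10 is not included
print(first_three)    # [0, 1, 2]
```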

**Exercise**

* Generate a range from 2 to 20, with an interval of 1 and print it using a loop.
* Generate a range from 5 to 15 with an interval of 5 and print it using a loop.

*Bonus: Generate a range which runs backwards from 10 to 0 with an interval of 1*

We can use this range for lots of things, one of the most common is to generate an `index`, which we can then use to access items in a `list` or `tuple`.


For example, we might have several lists with data in the same order.

In [None]:
L1 = ['John', 12, 'Cambridge']
L2 = ['Jim', 22, 'London']
L3 = ['Mary', 20, 'Manchester']

In [None]:
for i in range(0, 3):
    print ("i = ", i)
    print (L1[i], L2[i], L3[i])
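
Rather than typing the number of items by hand, the `len` function from last week can supply the end of the range. This is a sketch which assumes all three lists are the same length.

```python
L1 = ['John', 12, 'Cambridge']
L2 = ['Jim', 22, 'London']
L3 = ['Mary', 20, 'Manchester']

# len(L1) is 3, so this is the same as range(0, 3),
# but it keeps working if another item is added to every list
for i in range(0, len(L1)):
    print(L1[i], L2[i], L3[i])
```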

We also might just want to repeat an action lots of times without using an iterable.

For example, we could add up all the integers from 0 to 100.

In [None]:
x = 0
for i in range(0, 101):
    x = x + i

In [None]:
print (x)
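
A loop like this can be checked against Python's built-in `sum` function, which adds up everything in an iterable. The variable names below are made up for illustration.

```python
# Add up the integers from 0 to 100 with a loop
loop_total = 0
for i in range(0, 101):
    loop_total = loop_total + i

# Do the same with the built-in sum function
builtin_total = sum(range(0, 101))

print(loop_total)
print(builtin_total)
print(loop_total == builtin_total)  # True if the two approaches agree
```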

**Exercise**

* Add up the square (`x**2`) of every number less than 1000 using a `for` loop and the `range` function.
* Make two lists and access the values using an index and the `range` function

*Bonus: Generate a for loop to calculate the "factorial" of 5 (i.e. 5x4x3x2x1).*

## Combining `for` loops and `if` statements

Now you know about `for` loops and `if` statements you have a lot of power to do things using Python.


We will spend a bit of time on the following few examples.

### 21

**Exercise**

* Use a `for` loop to generate and print a range of integers from 1 to 21 (including 1 and 21)
* Using a series of `if` and `else` statements, alter it to:

    * Print "abracadabra" instead of the integer for numbers which are a multiple of both three and five.  *Hint: to find out if a number is divisible by another number, use the modulo operator `%`: `if x % 3 == 0` means "if x is divisible by 3".*

    * Print "shazam" instead of the integer for numbers which are only a multiple of three and not five.
    
    * Print "hullaballoo" instead of the integer for numbers which are only a multiple of five and not three.
    
*Bonus : Alter the `for` loop to print the square root of any integer which is divisible by 7, instead of the original value (using the `math.sqrt` function).*

*Bonus : Alter the `for` loop to print "hocus pocus" if the square of the original integer is less than or equal to 100, **as well as** the text described above.*
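
The modulo operator from the hint gives the remainder left over after a division. Here is a small sketch using 4 as the divisor (chosen arbitrarily, so the exercise itself is left for you).

```python
print(10 % 4)  # 2, because 10 = 2 * 4 with 2 left over
print(12 % 4)  # 0, because 12 divides exactly by 4

number = 12  # a made-up value to test
if number % 4 == 0:
    print(number, "is a multiple of 4")
else:
    print(number, "is not a multiple of 4")
```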

### Random Number Game

To do this exercise we need one extra function and one extra Python command.

Inside a `for` loop, the `break` command stops the loop prematurely.

In [None]:
for i in range(1, 10):
    print (i)
    if i == 4:
        break

The `input` function asks the user to input a value before continuing.

In [None]:
x = input("What is your name? ")

In [None]:
print ("Your name is", x)
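
Whatever the user types, `input` returns it as a string, even if it looks like a number. The sketch below uses a fixed string as a stand-in for the user's answer so that it runs without any typing.

```python
typed_value = "7"  # stand-in for what input() would return if the user typed 7
print(type(typed_value))

# Convert the string to an integer before doing arithmetic with it
as_integer = int(typed_value)
print(type(as_integer))
print(as_integer + 1)
```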

**Exercise**
We will generate a short game.

* Python will generate a random number between one and ten.
* The user gets 6 guesses to try to guess the number.
* If the guess is too high, `print` "Too high!  Try again!".
* If the guess is too low, `print` "Too low! Try again!".
* If the guess is correct, `print` "Correct!" and `break` the loop.

*Bonus: If the user runs out of tries `print` "Too slow!  The answer is" and the answer.  You will need to create an extra variable to keep track of if the attempt was a success or a failure*

*Bonus: Add a second for loop around the first to allow the user three "plays" of the game.  `print` "Attempt 1" on the first try, "Attempt 2" on the second try, etc.*


I will give you the first part of this code - copy it into your Spyder script.

This code does the following:
* Using the `random.randint` function from the `random` module, generate a random number between 1 and 10 (both included)
* Loop over `range(0, 6)` (to run the code 6 times)
* On each iteration through the loop, ask the user to guess the number.


In [None]:
import random
x = random.randint(1, 10) # randint includes both end points, so 10 is the largest possible value
for i in range(0, 6):
    val = int(input("Guess a number between 1 and 10:  "))

*NB: I have converted the variable generated by the `input` function from a string to an integer using the `int` function.*
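
Unlike `range`, which stops before its end value, `random.randint(a, b)` can return both `a` and `b`. A rough way to see this is to draw lots of numbers from a small interval and record which values turn up. The names below are made up for illustration.

```python
import random

seen_values = []  # every distinct value that randint has returned so far

for i in range(0, 1000):
    value = random.randint(1, 3)
    if value not in seen_values:
        seen_values.append(value)

# With 1000 draws we would expect to have seen 1, 2 and 3 (and nothing else)
print(sorted(seen_values))
```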

## Reading and Writing Files

To understand reading and writing files in Python we have to learn another new concept - **methods**

Methods are very similar to functions.
There are two main differences:
* Methods are usually used on a single type of variable
* Method names are written after the variable name instead of before it, separated by a `.`

We already know two methods: the `append` and `pop` methods for lists.

In [None]:
L = ['a', 'b', 'c']

In [None]:
print (L)

In [None]:
L.append('d')

In [None]:
print (L)

In [None]:
L.pop()

In [None]:
print (L)

There are hundreds of methods for different types of variables.
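
As an illustration, here are three of the methods which belong to strings. The example string is made up. Note that these methods return a new value and leave the original string unchanged.

```python
greeting = "hello world"

upper_case = greeting.upper()                   # "HELLO WORLD"
number_of_os = greeting.count("o")              # 2
replaced = greeting.replace("world", "Python")  # "hello Python"

print(upper_case)
print(number_of_os)
print(replaced)
print(greeting)  # the original string has not changed
```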

## Reading Files

The simplest way to read an input file is using the `open` function and the `readlines` method.
This reads the whole file into memory - it is only suitable for relatively short files (maybe less than a million lines).

The `open` function opens a file to either read or write.
The `readlines` method reads all the lines in the open file into a `list`.

I have provided you with a file called "List1.txt".

In [None]:
lines = open("List1.txt").readlines()

In [None]:
print (lines)

Each `\n` in these strings is a "line break" character.  When a list is printed, Python shows these characters literally instead of turning them into line breaks.

We can remove them using a `for` loop and the `strip` method. 

In [None]:
for line in lines:
    line = line.strip()
    print (line)
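
A loop like this prints the stripped lines but leaves the original list unchanged. If you need the cleaned lines later, one option is to store them in a new list. The sketch below uses a made-up list as a stand-in for the output of `readlines`, so it runs without needing a file.

```python
# Made-up stand-in for the list that readlines would return
example_lines = ["apple\n", "banana\n", "cherry\n"]

stripped_lines = []  # empty list to hold the cleaned lines
for line in example_lines:
    stripped_lines.append(line.strip())

print(example_lines)   # the original list still contains the \n characters
print(stripped_lines)  # the new list does not
```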

Depending on what we are trying to do, we may be able to use this list directly.

For example, if we wanted to find all the lines with an "s" in them.

In [None]:
for line in lines:
    line = line.strip()
    if "s" in line:
        print (line)

We can also use indices to access parts of the list.

To get the first line in the file:

In [None]:
lines[0].strip()

To get the last line in the file:

In [None]:
lines[-1].strip()

**Exercise**
* Read the file "List2.txt" into a variable
* Loop through the file and remove the line break characters
* Print the second line in the file
* Print the last line in the file
* Print all the lines with the word "the" in them.

## Writing Files

To write a file, we first need to open it with the `open` function.

In [None]:
output = open("List3.txt", "w")

In [None]:
output.write("Hello")

You always need to remember to close the output file when you've finished writing to it, using the `close` method.

In [None]:
output.close()
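
Python also has a `with` statement, not otherwise covered in this workshop, which closes the file for you when the indented block ends. This is a minimal sketch of that alternative; the file name `Example_output.txt` is made up for illustration.

```python
# The file is open only inside the indented block
with open("Example_output.txt", "w") as output:
    output.write("Hello\n")

# After the block, Python has already closed the file for us
print(output.closed)

# Reading works the same way
with open("Example_output.txt") as input_file:
    lines = input_file.readlines()

print(lines)
```

Either approach works; the explicit `close` call is easy to forget, which is the main reason some people prefer `with`.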

We can only write strings to our output file, and we also need to provide all the formatting characters such as line breaks and tab characters, as nothing will be added automatically.

The cleanest way to do this is using **string formatting**.

With string formatting, we use the following symbols to represent different types of variable:
* `%s` string
* `%i` integer
* `%.2f` float (2 is the number of decimal places)

We insert these symbols into a text string, then list the values we want to replace them with afterwards, in brackets and separated by commas.

In [None]:
x = 'string'
y = 5
z = 1.00
my_string = 'This %s contains %i integers and %.2f floats' % (x, y, z)

In [None]:
print (my_string)
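
The number in a float symbol such as `%.2f` controls how many decimal places are shown. The sketch below, using made-up values, formats the same float three ways.

```python
value = 3.14159

one_place = "%.1f" % value    # a single value does not need brackets
two_places = "%.2f" % value
four_places = "%.4f" % value

print(one_place)    # 3.1
print(two_places)   # 3.14
print(four_places)  # 3.1416

# Integers and floats can be mixed in one string
price_string = "%i items cost %.2f" % (3, 4.5)
print(price_string)  # 3 items cost 4.50
```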

In [None]:
a = 100
b = 200
c = 300

my_string = "Here are some integers: %i, %i, %i" % (a, b, c)

In [None]:
print (my_string)

There are two important non-text characters we can include:
* `\n` is a line break
* `\t` is a tab

In [None]:
a = 100
b = 200
c = 300

my_string = "Here are some integers:\t%i, %i\nHere is another one:\t%i" % (a, b, c)

In [None]:
print (my_string)

We can output this string directly into a file.

In [None]:
output_file = open("String_output.txt", "w")
output_file.write(my_string)
output_file.close()

We can then put the output of our programs into a file.

In [None]:
output_file = open("Square_numbers.txt", "w")
for i in range(0, 100):
    i_squared = i**2
    output_file.write("%i squared = %i\n" % (i, i_squared))
output_file.close()

We can also read a file, process it and write the results to another file.

In [None]:
input_file = open("List_of_numbers.txt").readlines()
output_file = open("output_numbers.txt", "w")

for item in input_file:
    item = int(item.strip())
    output_file.write("%i squared = %i\n" % (item, item ** 2))
output_file.close()

**Exercise**
* Read the "List_of_numbers.txt" file again, multiply each number by two and write it to an output file.
* Generate a file containing the square root (using the `math.sqrt()` function) of all numbers up to 50

*Bonus: Generate a file containing the square root of all numbers up to 50 only if they are multiples of 3*

|**Term**|**Definition**|
|--------------|------------------------------------------------------------------------------------|
|**IDE**|Integrated development environment, a piece of software designed to make programming easier|
|**console**|The interactive component of an IDE|
|**editor**|The static component of an IDE|
|**module**|A set of additional functions you can import into Python|
|**argument**|A variable or another piece of information passed to a function|
|**if statement**|A clause which only runs a piece of code when the following statement is `True`|
|**loop**|A programming construct used to repeat an action a number of times|
|**iterable**|A variable containing a sequence of other variables (often a list, tuple, dictionary or range) which you can step through|
|**range**|A sequence of integers|
|**method**|A function which belongs to a specific variable type|
|**string formatting**|A technique for adding your variables to a text-based output file|

## Homework (optional)

* Using a `for` loop, generate a file containing a list of numbers from 1 to 100.
* Read this file into a `list` in Python.
* Generate a new `list` containing only the even numbers (remember to convert them to integers using the `int` function)
* Add 50 to each number in the new list 
* Open an output file and write "x + 50 = y" where x is the original value and y is the final value for each of the values in the list.

It would be great if you could fill in this feedback form, especially if there are any improvements you'd like to see next week.

https://katybrown2.typeform.com/to/zH0xoI