# Python Workshop

## Overview

This workshop covers the basics of Python and puts you in a position to begin your journey into machine learning. We will be covering:
+ basic operators
+ variables
+ strings
+ conditional logic
+ lists and strings
+ selection
+ iteration 
+ functions
+ basic file io
+ plotting


## Introduction to Python

Python is an interpreted, high-level language. An interpreted language is run by an interpreter that executes the program statement by statement, as opposed to a compiled language, in which all code must be compiled before it can be executed. Python is a general-purpose language with many uses, but recently it has been used heavily for data manipulation and machine learning, thanks to its ease of use and the powerful libraries available for it.

## Basic Operators

Python lets us perform many operations, including mathematical ones. Addition, subtraction, multiplication, and division are done using `+`, `-`, `*`, and `/` respectively.

In [7]:
# add 1 and 2

In [2]:
# subtract 4 from 5

In [3]:
# multiply 2 and 5

In [4]:
# divide 10 by 2

<details><summary>Click for a hint</summary>
   Think about how you would enter the equation into a calculator but press run instead of the equals button.
<details><summary>Click to cheat</summary>
  <p>

  ---

  ```python
  1 + 2 # add 1 and 2
  ```

  ---

  ```python
  5 - 4 # subtract 4 from 5
  ```

  ---

  ```python
  2 * 5 # multiply 2 and 5
  ```

  ---

  ```python
  10 / 2 # divide 10 by 2
  ```
  </p>
</details>
</details>

As you can see, this Jupyter environment displays the output for us. However, in a terminal environment you need to use the ```print()``` function.

In [5]:
# print the number 1

<details><summary>Click for a hint</summary>

   Use ```print()``` like a mathematical function
<details><summary>Click to cheat</summary>

  ```python
  print(1)  # print the number 1
  ```
</details>
</details>

If we want to print text we must first enclose it in "" or ''. Try printing the text "Hello World".

In [6]:
# quotations examples

<details><summary>Click to cheat</summary>

  ```python
  print("Hello World")
  ```
  or
  ```python
  print('Hello World')
  ```
</details>

In the cell below, the second `'` is not escaped, so it closes the string early and causes an error. To fix this, put a `\` before that `'`.

In [7]:
print('This isn't so hard is it') # escape character

<details><summary>Click to cheat</summary>

  ```python
  print('This isn\'t so hard is it')
  ```
</details>

You can also print multiple strings in one print statement.

In [None]:
print("Hello", "world!")

## More Mathematical Operators

Returning to the math operations, there are a few more. We can also use exponentiation, floor division, and the remainder (modulo). These are done with `**`, `//`, and `%` respectively. Brackets can also be used to control the order of operations.

In [8]:
# Raise 5 to the power of 2

In [9]:
# integer division 5 by 2

In [10]:
# modulo 5 by 2

In [11]:
# Use brackets to add 5 and 2 before multiplying by 10

<details><summary>Click to cheat</summary>

  ```python
  5 ** 2 # Raise 5 to the power of 2
  ```
  ---
  ```python
  5 // 2 # integer division 5 by 2
  ```
  ---
  ```python
  5 % 2 # modulo 5 by 2
  ```
  ---
  ```python
  (5 + 2) * 10 # Use brackets to add 5 and 2 before multiplying by 10
  ```
</details>
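
To compare the operators side by side, here is a short sketch. The numbers are arbitrary and chosen only for illustration; the comments show what each line should print.

```python
print(7 / 2)                  # 3.5  (true division)
print(7 // 2)                 # 3    (floor division)
print(7 % 2)                  # 1    (remainder)

# / always gives a float, even when the answer is a whole number.
print(10 / 2)                 # 5.0

# Floor division and remainder fit together to rebuild the original number.
print((7 // 2) * 2 + 7 % 2)   # 7

# Without brackets, multiplication happens before addition.
print(5 + 2 * 10)             # 25
print((5 + 2) * 10)           # 70
```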

## Variables

Variables can be thought of as named containers that store a piece of data. Variables in Python allow data to be retained for further processing later on. We use the equals symbol (`=`) to assign a value to a variable.

Create a variable called `var`, assign it a number, then print the result. Create a new variable `var2` by multiplying `var` by 2 and print out the new value.

In [12]:
  # Create the variable by assigning it a number
  # Print the value
  # Multiply by 2
  # Print the new value

<details><summary>Click to cheat</summary>

  ```python
  var = 10 # Create the variable by assigning it a number
  print(var) # Print the value
  var2 = var * 2 # Multiply by 2
  print(var2) # Print the new value
  ```
</details>

Notice that after multiplying by 2, `var` still has the old value. You can confirm this by printing `var` after creating `var2`.

There are however some rules about what we can name variables:
+ A variable name can only contain alpha-numeric characters and underscores (`A-Z`, `a-z`, `0-9`, and `_`)
+ A variable name must start with a letter or the underscore character (`_`)
+ A variable name cannot start with a digit
+ Variable names are case-sensitive (`age`, `Age` and `AGE` are three different variables)
+ A variable name can't be a reserved word such as ```for```, ```if```, or ```while```, as these words have special meaning within Python
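
If you are unsure whether a name follows the rules above, Python can check it for you. The sketch below assumes the standard-library `keyword` module and the string method `isidentifier()`; the example names are made up for illustration.

```python
import keyword  # standard-library module that knows Python's reserved words

valid_name = "my_income".isidentifier()     # letters and underscores: allowed
starts_with_digit = "2fast".isidentifier()  # starts with a digit: not allowed
has_hyphen = "tax-rate".isidentifier()      # contains a hyphen: not allowed
reserved = keyword.iskeyword("for")         # reserved word: cannot be a name

print(valid_name, starts_with_digit, has_hyphen, reserved)

# Names are case-sensitive, so these are two separate variables.
age = 30
Age = 40
print(age, Age)
```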

We can also add comments to our code. Comments let us explain parts of the code and improve its overall readability, which is very important as we want other people and our future selves to be able to read it easily. (Most of programming is reading and using existing code, not writing it!)

In [13]:
#This is a comment

As shown, we can comment using the ```#``` character. We can put this character anywhere on a line, and everything after it on that line is treated as a comment and will not run as code.

In [14]:
var1 = 10 # we can also comment at the end of a line

While comments are useful, they are no substitute for good variable naming! Comments should complement good naming, not compensate for bad naming!

Try renaming the variables based on the comments then delete the comments.

In [15]:
yay = 100  # my income
no = 0.1  # tax rate
sad = yay * no   # the taxes I have to pay
# Display taxes that need to be paid
print(sad)

<details><summary>Click to cheat</summary>

  Using camelCase,
  ```python
  myIncome = 100
  taxRate = 0.1
  myTaxes = myIncome * taxRate
  print(myTaxes)
  ```
  ---
  or using snake_case,
  ```python
  my_income = 100
  tax_rate = 0.1
  my_taxes = my_income * tax_rate
  print(my_taxes)
  ```

</details>

### Types

Variables have types, which describe the kind of data they hold. So far we have seen integers, floats, and strings (text). Python has many built-in types, some of which we will cover later. Python is dynamically typed, meaning a variable's type is not fixed; for example, we can store an integer in a variable and then overwrite it with a string.

#### Integers
These are whole numbers, such as `1`, `12343`, and `1080`.

#### Floats
These are also numbers, but they contain a decimal point (the "floating point"), such as `3.14` and `2.18`.

#### Strings
These are text, or any sequence of characters, such as `"Hello World"`. As strings are sequences, we can access the individual elements within them; this will be touched on later. As stated before, to declare something as a string it must be enclosed in either ```''``` or ```""```.

Try declaring some variables of the specified types below

In [16]:
# this is an integer

# this is a float 

# this is a string

<details><summary>Click to cheat</summary>

  ```python
  # this is an integer
  var1 = 10

  # this is a float
  var2 = 3.14

  # this is a string
  var3 = "hello"
  ```
</details>

It is also possible to convert a value from one type to another, such as converting an integer to a string, or a string to an integer. This is called type casting. The built-in functions ```int()``` and ```str()``` are used to achieve this.

In [None]:
# Convert 10 from a string to an int


In [None]:
# Convert 10 from an int to a string


<details><summary>Click to cheat</summary>

  ```python
  # Convert 10 from a string to an int
  int("10")
  ```
  ---
  ```python
  # Convert 10 from an int to a string
  str(10)
  ```
</details>
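
To see why the type matters, here is a small sketch of what happens after each conversion. The variable names are made up for illustration, and `float()` is assumed as the matching built-in for decimals.

```python
text_number = "10"
as_int = int(text_number)   # the string "10" becomes the integer 10
print(as_int + 5)           # 15

as_text = str(10)           # the integer 10 becomes the string "10"
print(as_text + "5")        # 105, because + joins text rather than adding

as_float = float("3.14")    # float() does the same job for decimals
print(as_float)             # 3.14

truncated = int(3.99)       # converting a float drops the decimal part
print(truncated)            # 3
```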

### Strings in-depth


As stated earlier, Jupyter will display a value for us if we enter it on its own at the end of a cell. Otherwise we must make use of the ```print()``` function.

There are also special characters that make use of the escape character (`\`). These are the carriage return character `\r`, newline character `\n`, and tab character `\t`.

In [None]:
print("This \n represents a new line")
print("This \t is a tab")
print("This \r is a carriage return")

Once again we can use `\` to "escape" special characters such as `\n` and also single quotes!

In [17]:
# \ allows us to print special characters including single quotes
print("Now we can print \\n")
print('I\'m also able to use single quotes now')

The length of a string can be found using the ```len()``` function. This will give us the number of characters in a string, as it counts all the characters, including spaces.

In [None]:
# Find the length of any string you want


<details><summary>Click to cheat</summary>

  ```python
  len("This is a very long string")
  ```
</details>
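
Combining `len()` with the escape characters from earlier shows that an escape sequence is stored as a single character, even though we type two. A minimal sketch, with variable names chosen only for illustration:

```python
newline = "\n"
two_lines = "a\nb"

print(len(newline))     # 1
print(len(two_lines))   # 3: 'a', the newline, and 'b'

# Escaping the backslash itself gives two ordinary characters.
literal = "\\n"
print(len(literal))     # 2: a backslash and the letter n
```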

We can concatenate strings using the ```+``` operator, as it is not just used for addition.

In [None]:
# Concatenate the strings "Hello" and "World"


<details><summary>Click to cheat</summary>

  ```python
  "Hello" + "World"
  ```
</details>

This concatenation can also be used with variables that are strings

In [18]:
# two string variables
hello = "Hello"
world = "World"

# string concatenation


<details><summary>Click to cheat</summary>

  ```python
  # two string variables
  hello = "Hello"
  world = "World"

  # string concatenation
  hello + world
  ```
</details>

We can also use the `*` operator to repeat a string multiple times. Once again this operator is not just for maths, so be careful with operators and the data types they are operating on.

In [None]:
myString = "my string"
# Multiply the string variable by 4


<details><summary>Click to cheat</summary>

  ```python
  myString = "my string"
  # Multiply the string variable by 4
  myString * 4
  ```
</details>

Notice that spaces are not automatically inserted between concatenated strings.
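
A minimal sketch of how to deal with this: add the space yourself, and convert numbers with `str()` before joining them to text. The variable names are made up for illustration.

```python
first = "Hello"
second = "World"

# Add the space yourself when joining strings.
greeting = first + " " + second
print(greeting)    # Hello World

# Numbers must be converted with str() before they can be joined to a string.
count = 3
message = "Count: " + str(count)
print(message)     # Count: 3

# The * operator repeats the string exactly, again without adding spaces.
echo = "ab" * 3
print(echo)        # ababab
```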

It is also possible to read a string from the user. This is done through the function ```input(prompt)```, where `prompt` is a string shown to the user. Note that `input()` always returns a string.

In [None]:
# Use input() to prompt the user to enter their name


<details><summary>Click to cheat</summary>

  ```python
  # Use input() to prompt the user to enter their name
  input("Enter your name: ")
  ```
</details>

## Variables revisited

There are a few more things to mention about variables. A variable can be reassigned, and the new value can be computed from its current value.

In [None]:
num = 10
  # Increase num by 10
print(num)

<details><summary>Click to cheat</summary>

  ```python
num = 10
num = num + 10 # Increase num by 10
print(num)
  ```
</details>

This is possible because the right-hand side of the assignment is evaluated first, and the result is then assigned to the variable. This logic can also be simplified using `+=`. There are also `-=`, `*=`, and `/=`.

In [19]:
  # Increase num by 10 using +=
print(num)

<details><summary>Click to cheat</summary>

  ```python
num += 10 # Increase num by 10 using +=
print(num)
  ```
</details>
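
The other shorthand operators work the same way. Here is a short sketch that applies each one in turn, starting from an arbitrary value:

```python
num = 10
num += 10   # same as num = num + 10, so num is 20
num -= 5    # same as num = num - 5, so num is 15
num *= 2    # same as num = num * 2, so num is 30
print(num)  # 30

num /= 3    # same as num = num / 3; division always gives a float
print(num)  # 10.0
```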

### Using type

```type()``` allows us to find the type of a variable. Run the cell below to find out the types.

In [None]:
var = 10
var2 = '10'
var3 = 10.0

print(type(var))
print(type(var2))
print(type(var3))
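
Because a variable's type is not fixed, `type()` can report a different answer each time the variable is reassigned. A small sketch, with names chosen only for illustration:

```python
value = 10
first_type = type(value)
print(first_type)    # <class 'int'>

value = "ten"        # the same name now refers to a string
second_type = type(value)
print(second_type)   # <class 'str'>

value = 10.0         # and now to a float
third_type = type(value)
print(third_type)    # <class 'float'>
```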

## Conditional Logic

We can also perform logic operations in Python using the boolean type. A boolean can take one of two values: ```True``` or ```False```.

In [None]:
# Create a boolean variable

# Print the type of the variable


<details><summary>Click to cheat</summary>

  ```python
# Create a boolean variable
var = True
# Print the type of the variable
print(type(var))
  ```
</details>

The logic operators we have access to are ```and```, ```or```, and ```not```. These are best described with examples.

In [None]:
# and will only result in True if both sides are True
print("True and True is", True and True)
print("True and False is", True and False)
print("False and True is", False and True)
print("False and False is", False and False)

# or will result in True if either side is True
print("True or True is", True or True)
print("True or False is", True or False)
print("False or True is", False or True)
print("False or False is", False or False)

# lastly, not flips a boolean to the other value, e.g.:
print("not True is", not True)
print("not False is", not False)

These logic operators are very simple; however, when chained together they allow for some really powerful expressions. They can also be used with variables and with the comparison operators ```==``` (equal to) and ```!=``` (not equal to), which are once again best explained with examples.

In [None]:
# using == and !=
var1 = 10
print("var1 == 10: ", var1 == 10)
print("var1 != 10: ", var1 != 10)
print("var1 == 12: ", var1 == 12)

# we can then chain these expressions together
print(not (var1 == 10 and var1 != 12))
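
A chained expression can be easier to follow when it is split into steps. The sketch below evaluates the same expression one piece at a time; the intermediate variable names are made up for illustration.

```python
var1 = 10

is_ten = var1 == 10               # True
is_not_twelve = var1 != 12        # True
both = is_ten and is_not_twelve   # True and True gives True
result = not both                 # not True gives False
print(result)                     # False

# The one-line version gives the same answer.
one_line = not (var1 == 10 and var1 != 12)
print(result == one_line)         # True
```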

Boolean logic can also be used for selection and iteration within a program, which is what makes it so powerful. This will make sense shortly.

## Lists and Strings

A list is an ordered group of data, and it can hold any kind of data. An example is ```[1,2,3,4]```, which is a list containing 4 elements or members. The elements are arranged in a specific order, so a list can be thought of as a sequence. Unlike strings, lists are mutable, which means the elements inside them can be changed.

Lists are constructed with ```[]``` and commas separating each element of the list.

In [20]:
# Assign a list with the elements 1, 2, 3, and 4 to variable named lst


<details><summary>Click to cheat</summary>

  ```python
  lst = [1, 2, 3, 4]
  ```
</details>

We created a list of integers, but lists can hold any object type.

In [None]:
# Create a list with an integer, a float, and a string


<details><summary>Click to cheat</summary>

  ```python
  lst = [10, 3.14, "pi"]
  ```
</details>

Lists, just like strings, have access to the `len()` function to determine their length.

In [None]:
# Find the length of lst


<details><summary>Click to cheat</summary>

  ```python
  len(lst)
  ```
</details>

### Indexing and slicing
Strings and lists are both sequences, so they can be indexed and sliced.

For example to access the first element of a string or a list:

In [None]:
lst = [1, 2, 3, 4]
lst[0]

In [None]:
string = "Hello"
print(string[0])

Indexing for both lists and strings starts at 0 and ends at the length of the sequence (list or string) - 1

In [21]:
# Last letter of Hello

print(string[len(string) - 1])

# Last element of the list
print(lst[len(lst) - 1])
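
Indexing is also where the difference between mutable lists and immutable strings shows up. A minimal sketch, with example values chosen only for illustration:

```python
lst = [1, 2, 3, 4]
lst[0] = 100   # lists are mutable: this replaces the first element in place
print(lst)     # [100, 2, 3, 4]

word = "Hello"
# word[0] = "J"   # uncommenting this raises a TypeError: strings are immutable

# Reading characters by index is fine; to "change" a string, build a new one.
new_word = "J" + word[1] + word[2] + word[3] + word[4]
print(new_word)   # Jello
print(word)       # Hello (the original string is untouched)
```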

Python also lets us access the last element using the index `-1`.

In [22]:
# print the last letter of string using a negative index


<details><summary>Click to cheat</summary>

  ```python
  string[-1]
  ```
</details>

Following this trend the second last can be grabbed using -2 and so on.

In [23]:
# Second last letter of string


# Third last letter of string


<details><summary>Click to cheat</summary>

  ```python
  string[-2]
  string[-3]
  ```
</details>

We can use slicing to grab a range of elements. We use `lst[start:end]` or `string[start:end]` to specify a slice. Note that `start` is inclusive and `end` is exclusive, so `lst[0:3]` gets elements 0, 1, and 2, but not 3. If `start` is omitted the slice begins at index `0`, and if `end` is omitted it runs to the end of the sequence (including the last element).

In [24]:
lst = [1, 2, 3, 4, 5]

# Grab index 1 and everything past it


In [25]:
# Grab everything UP TO index 3


<details><summary>Click to cheat</summary>

  ```python
  lst[1:]

  lst[:3]
  ```
</details>

The full slicing syntax is `start:stop:step`, where `start` is the starting index, `stop` is the ending index (exclusive), and `step` is the increment between indexes. If `step` is omitted, 1 is assumed.

In [28]:
# Step only
lst[::2]

# Start and step
lst[2::2]

# stop and step
lst[:4:2]

# Start, stop, and step
lst[1:4:3]

**Slicing works like this with strings as well as lists.**
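
For reference, here is a sketch of what the slices above should produce for a five-element list, together with the same idea applied to a string. The variable names are made up for illustration.

```python
lst = [1, 2, 3, 4, 5]

every_second = lst[::2]    # [1, 3, 5]
from_two = lst[2::2]       # [3, 5]
up_to_four = lst[:4:2]     # [1, 3]
one_to_four = lst[1:4:3]   # [2]
print(every_second, from_two, up_to_four, one_to_four)

# An omitted end runs to the end of the list; -1 as the end stops one short.
print(lst[1:])     # [2, 3, 4, 5]
print(lst[1:-1])   # [2, 3, 4]

# The same rules apply to strings.
text = "Hello"
print(text[1:4])   # ell
print(text[::2])   # Hlo
```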

### Basic List Methods
Lists have no fixed size and no fixed type constraint, which makes them more flexible than arrays in other languages.

There are some more special methods for lists:



In [33]:
# Create a new list
newList = [1, 3, 4]

We can use ```append()``` to add an item to the end of a list:

In [34]:
# Append the string "append"


# Show the list


<details><summary>Click to cheat</summary>

  ```python
  # Append the string "append"
  newList.append("append")

  # Show the list
  newList
  ```
</details>

We can also "pop off" an item from the list. By default this would be the last index but it can also be specified.

In [35]:
# Pop off the item at index 1


# Show


<details><summary>Click to cheat</summary>

  ```python
# Pop off the item at index 1
newList.pop(1)

# Show
newList
  ```
</details>

In [37]:
# Hence we can assign a variable the value returned from pop()


# Show remaining list


<details><summary>Click to cheat</summary>

  ```python
# Hence we can assign a variable the value returned from pop()
poppedItem = newList.pop()

# Show remaining list
newList
  ```
</details>

We can also add items to a list using the ```extend()``` method; however, this is a little different to the ```append()``` method. The difference is shown best when adding a list to another list.

In [39]:
# Append the list [1, 2, 3]


# Show


<details><summary>Click to cheat</summary>

  ```python
# Append the list [1, 2, 3]
newList.append([1, 2, 3])

# Show
newList
  ```
</details>

As you can see, this makes the next element exactly what was passed to ```append()```. This is useful if you want nested lists. However, if you don't, you can use the ```extend()``` method.

In [40]:
# Extend the list [1, 2, 3]


# Show


<details><summary>Click to cheat</summary>

  ```python
# extend the list [1, 2, 3]
newList.extend([1, 2, 3])

# Show
newList
  ```
</details>

As you can see, this time it added a new element for each element of the list we passed to ```extend()```.
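
Putting the two methods side by side makes the difference easier to see. This sketch starts from two fresh lists (the names are made up for illustration) so that earlier cells do not affect the result.

```python
appended = [1, 3, 4]
appended.append([1, 2, 3])   # adds the whole list as ONE new element
print(appended)              # [1, 3, 4, [1, 2, 3]]
print(len(appended))         # 4

extended = [1, 3, 4]
extended.extend([1, 2, 3])   # adds each element of the argument separately
print(extended)              # [1, 3, 4, 1, 2, 3]
print(len(extended))         # 6
```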

### Nesting Lists

As shown above, we can nest lists. This allows us to create 2D lists, which are similar to a matrix. You can then access each element in a 2D list using *list[row][column]*. For example:

In [41]:
# Start off by making some lists
list1 = [1,2,3]
list2 = [4,5,6]
list3 = [7,8,9]

# Make a list of the three lists for a 2D list
list2D = [list1, list2, list3]

# Show
list2D

In [44]:
# Grab the item from the second row and third column


<details><summary>Click to cheat</summary>

  ```python
list2D[1][2]  # indexes start at 0: the second row is index 1, the third column is index 2
  ```
</details>
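
Since indexes start at 0, the row and column numbers are each one less than their position. A minimal sketch that rebuilds the same 2D list and picks out a few items:

```python
list2D = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]

second_row = list2D[1]       # indexing once gives a whole row
print(second_row)            # [4, 5, 6]

item = list2D[1][2]          # second row, third column
print(item)                  # 6

last_item = list2D[-1][-1]   # negative indexes work at both levels
print(last_item)             # 9
```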

## Selection Statements

We are able to select which path to go down within our code using logic expressions and selection statements. Selection statements allow us to only execute certain code if a condition is met (is ```True```).

These are `if`, `elif`, and `else`.

The best way once again to explain these is to show them and what they do.

In [45]:
# The if statement allows us to execute code only if the statement is true
if var1 == 10:
    # Any code indented after the above ':' will be executed.
    print("var1 is 10")

We must indent the code for it to be treated as part of the `if` statement. If nothing is indented after the `:`, we get an error.

In [None]:
if var1 == 12:
print('var1 is 12')

If the condition in the `if` statement is not met, we can have other code execute instead through the use of an ```else:``` statement.

In [None]:
var = int(input("Enter a number"))

if var == 12:
    print('input is 12')
# Write an else statement that prints 'input is not 12'


<details><summary>Click to cheat</summary>

  ```python
var = int(input("Enter a number"))

if var == 12:
    print('input is 12')
# Write an else statement that prints 'input is not 12'
else:
    print('input is not 12')
  ```
</details>

Lastly, there is ```elif```, which works as an 'else if'. It comes after an `if` and checks another condition, which, if true, causes the indented code beneath it to execute.

In [None]:
var = int(input("Enter your number: "))

if var == 12:
    print('input is 12')
# Write an elif statement here for checking if input is 10
else:
    print('input is not 12 nor 10')


<details><summary>Click to cheat</summary>

  ```python
var = int(input("Enter your number: "))

if var == 12:
    print('input is 12')
# Write an elif statement here for checking if input is 10
elif var == 10:
    print('input is 10')
else:
    print('input is not 12 nor 10')
  ```
</details>

Any number of `elif` statements can be used after an `if` statement and before an `else` statement.

In [None]:
var2 = int(input("Enter your number: "))

if var2 == 1:
    print('var2 is 1')
elif var2 == 2:
    print('var2 is 2')
elif var2 == 3:
    print('var2 is 3')
elif var2 == 4:
    print('var2 is 4')
elif var2 == 5:
    print('var2 is 5')
else:
    print('var2 is ?')

Selection statements can also be used within other selection statements. These are called nested statements, as they are inside another.

In [None]:
var = int(input("Enter your number: "))

if var >= 1:
    print('input is positive')
    # Write an if statement that checks if var is 1


<details><summary>Click to cheat</summary>

  ```python
var = int(input("Enter your number: "))

if var >= 1:
    print('input is positive')
    # Write an if statement that checks if var is 1
    if var == 1:
        print('input is 1')
  ```
</details>

These statements give us control over how our programs run, as code can be made to execute only if certain conditions are met, which is very powerful. However, it is important to think about which conditions we use and how we use `if` statements. The more you use, the harder your code will be to follow, so think about the best way to write your code so that it works and is easy to read. You won't be the only one reading your code, and you might forget all about it by the next time you look at it.
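
One detail worth keeping in mind when ordering conditions: they are checked from top to bottom, and only the first one that is true runs. The sketch below uses a made-up `score` value and grade boundaries purely for illustration.

```python
score = 72  # hypothetical value, used only for this illustration

# 72 is not >= 90 or >= 80, so the first branch that matches is >= 70.
if score >= 90:
    grade = "A"
elif score >= 80:
    grade = "B"
elif score >= 70:
    grade = "C"
else:
    grade = "F"

print(grade)   # C
```

If the checks were written in the opposite order (lowest boundary first), a score of 95 would stop at the first branch and never reach the `"A"` check.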

## Iteration 

Next we will go over how to run the same block of code multiple times. This allows code to be reused rather than repeated. It is done through loops, where code is executed repeatedly until the loop is exited. Python has two types of loop: ```for``` loops and ```while``` loops. Some other languages have a third type, the `do-while` loop, which is absent from Python.

The `for` loop executes the indented code once for each element of a list or other collection. The `while` loop keeps executing while a certain condition is met. The `do-while` loop is very similar to the `while` loop but checks the condition at the end of the code inside the loop, which means it will *always* run at least once.

### for loops

There are a few different ways to use `for` loops. We will start with `range()` and then come back to iterating over lists and strings.

The general format of a `for` loop in Python is:

```python
for item in object:
    # perform operation len(object) many times
```

or

```python
for i in range(n):
    # perform operation n many times
```

The name used for the item can be whatever you want it to be, which lets us make our code more readable. The item name can then be referenced inside the loop, for example in `if` statements that perform checks.

When using a `range(n)` object, the `for` loop will iterate `n` times, from `0` to `n-1`. The `i` variable will hold these values in ascending order.

In [None]:
# Write a for loop using a range() object that prints the values 0, 1, 2, ... , 9 in two lines of code


<details><summary>Click to cheat</summary>

  ```python
# Write a for loop using a range() object that prints the values 0, 1, 2, ... , 9 in two lines of code
for i in range(10):
    print(i)
  ```
</details>
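
`range()` can also take a start value and a step. In the sketch below, `list()` is used only to display all the values at once; the variable names are made up for illustration.

```python
# range(n) counts from 0 up to, but not including, n.
zero_to_four = list(range(5))
print(zero_to_four)    # [0, 1, 2, 3, 4]

# range(start, stop) begins at start instead of 0.
two_to_seven = list(range(2, 8))
print(two_to_seven)    # [2, 3, 4, 5, 6, 7]

# range(start, stop, step) moves in jumps of step.
every_third = list(range(0, 10, 3))
print(every_third)     # [0, 3, 6, 9]
```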


To explain what this code does, the `range()` function first needs to be explained. It takes a number and produces the sequence of integers from 0 up to, but not including, that number. `i` takes on each value in that sequence in turn, so when we print it in each iteration it is the respective number from the sequence.

We can also iterate through an existing list, in which case the loop variable takes on each element of the list in turn, as shown below. **Note: the loop variable (`i` above) can be named anything a variable can be named. In Python it also remains accessible after the `for` loop ends, holding the last value it was given.**

In [46]:
# Create a list of any numbers you want


# Use a for loop to print out the values in the list


<details><summary>Click to cheat</summary>

  ```python
# Create a list of any numbers you want
lst = [10, -5, 20]

# Use a for loop to print out the values in the list
for item in lst:
    print(item)
  ```
</details>


A lot of the time we will be iterating through a list and doing something with each value:

In [47]:
# Create a list
lst = [1, 2, 3, 4, 5, 6, 7]

# Only print the even numbers with a for loop


<details><summary>Click to see hint</summary>

You can use `num % 2 == 0` to check if `num` is even.

<details><summary>Click to cheat</summary>

  ```python
# Create a list
lst = [1, 2, 3, 4, 5, 6, 7]

# Only print the even numbers with a for loop
for num in lst:
    if num % 2 == 0:
        print(num)
  ```
</details>
</details>


Another common use of a for loop is to keep a counter on some value. For example, let's create a for loop that sums up the list:

In [48]:
#Start sum at 0
listSum = 0

# Add the elements of the list using a for loop


print(listSum)

<details><summary>Click to cheat</summary>

  ```python
#Start sum at 0
listSum = 0

# Add the elements of the list using a for loop
for num in lst:
    listSum += num

print(listSum)
  ```
</details>

We've used `for` loops with lists, now let's try strings. Remember strings are a sequence so when we iterate through them we will be accessing each character in that string.

In [None]:
# Iterate over a string and print out each character


<details><summary>Click to cheat</summary>

  ```python
for char in "my string":
    print(char)
  ```
</details>

Sometimes, we want to iterate over both the index of an element and the element itself at the same time. We can do this using the `enumerate()` function, which returns the index first, then the element of the collection.

In [None]:
lst = ["bob", 12, 3.14, "snale"]

# Use enumerate() to iterate over the indexes and elements of lst


<details><summary>Click to see hint</summary>

`enumerate()` takes in `lst` and returns `i, item`.

<details><summary>Click to cheat</summary>

  ```python
lst = ["bob", 12, 3.14, "snale"]

# Use enumerate() to iterate over the indexes and elements of lst
for i, item in enumerate(lst):
    print("The", i, "element is", item)
  ```
</details>
</details>
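
`enumerate()` counts from 0 by default, but it accepts an optional `start` argument when a different first number is wanted. A small sketch with made-up names:

```python
names = ["bob", "alice", "carol"]  # hypothetical data for this illustration

# enumerate() yields (index, element) pairs, counting from 0 by default.
pairs = list(enumerate(names))
print(pairs)       # [(0, 'bob'), (1, 'alice'), (2, 'carol')]

# The optional start argument changes the first index.
numbered = []
for position, name in enumerate(names, start=1):
    numbered.append(str(position) + ". " + name)
print(numbered)    # ['1. bob', '2. alice', '3. carol']
```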

Now we have covered lists. But what about multi-dimensional lists, such as the lists in lists we covered earlier? These can also be accessed and used within `for` loops. To access each element of these we require two `for` loops.

In [None]:
lst = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]

# Use an outer for loop to iterate over the rows in lst

    # Use an inner for loop to iterate over the numbers in the row

        # Print the number


<details><summary>Click to see hint</summary>

The outer `for` loop is `for row in lst:`

<details><summary>Click to cheat</summary>

  ```python
lst = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]

# Use an outer for loop to iterate over the rows in lst
for row in lst:
    # Use an inner for loop to iterate over the numbers in the row
    for num in row:
        # Print the number
        print(num)
  ```
</details>
</details>

This can also be done using indexes and the `range()` function, as shown earlier.

In [None]:
lst2 = [[4, 2, 1],
        [0, -3, 4],
        [0, 0, 5]]

for i in range(len(lst2)):
    for j in range(len(lst2[i])):
        print(lst2[i][j])

Notice that `len(lst2) == 3`, and not `9` as each inner list only counts as one element, even if the inner list is empty!

If you want to get the number of innermost elements of the list, you can use
```python
len(lst2) * len(lst2[0])
```
**Note: This only works if the lists inside the outer list are all the same length, e.g. a list of lists that are each 4 elements long.**
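
When the inner lists have different lengths, one option is to add up the length of each row in a loop. A minimal sketch, using a made-up list called `ragged`:

```python
ragged = [[4, 2, 1],
          [0, -3],
          []]   # hypothetical list whose rows have different lengths

total = 0
for row in ragged:
    total += len(row)   # add the length of each inner list

print(len(ragged))   # 3 rows
print(total)         # 5 innermost elements
```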

As you can see, `for` loops allow us to write more condensed code: we can perform operations over whole lists inside a loop instead of repeating the same code for every element.

### while loops

`while` loops also allow us to perform iteration, but in a different way to the `for` loop. A `while` statement will continue to execute a statement or group of statements as long as a specified condition is true. The general format of a `while` loop is:

```python
while condition: 
    # nested code
else:
    # exit code
```

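The optional `else` block is slightly different to other languages: it runs once when the condition becomes false, and is skipped if the loop is exited with `break` (covered below). Now let's go over some examples!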

In [None]:
x = 0

while x < 5:
    print("x is currently:", x)
    print("x is still less than 5, adding 1 to x")
    x += 1

print("Exited the while loop")

In [None]:
x = 0

while x < 5:
    print("x is currently:", x)
    print("x is still less than 5, adding 1 to x")
    x += 1
else:
    print("All Done!")

Now try your own `while` loop:

In [None]:
# Prompt the user to enter a number using input() and save it to a variable

# If the number isn't 5, enter the while loop

    # Use a print() statement to tell the user they guessed wrong

    # Prompt the user for another number

# Exit the while loop
# Print that the user guessed the correct number


<details><summary>Click to see hint</summary>

The `while` loop starts as
```python
while num != 5:
```

<details><summary>Click to cheat</summary>

  ```python
# Prompt the user to enter a number using input() and save it to a variable
num = int(input("Guess the number: "))

# If the number isn't 5, enter the while loop
while num != 5:
    # Use a print() statement to tell the user they guessed wrong
    print("Wrong!")
    # Prompt the user for another number
    num = int(input("Try again: "))

# Exit the while loop
# Print that the user guessed the correct number
print("Correct!")
  ```
</details>
</details>

#### break, continue, pass

We can use `break`, `continue`, and `pass` statements in our loops to add functionality. However, if used badly they can lead to very unreadable code that is hard to debug. For this reason we will show that they exist and how they work, but it is recommended that you avoid them.

```break```: Breaks out of the current closest enclosing loop. 

```continue```: Goes to the top of the closest enclosing loop. 

```pass```: Does nothing at all. It is used as a placeholder where a statement is required, such as the body of a loop that has not been written yet.

An example of these:

In [None]:
x = 0

while x < 5:
    if x==2:
        x += 1
        print("continuing!!!")
        continue

    print("x is currently: ",x)
    print(" x is still less than 5, adding 1 to x")
    x+=1

Now try to rewrite the `while` loop above without the `continue` statement.

In [None]:
x = 0

while x < 5:
    if x==2:
        x += 1
        print("continuing!!!")
        continue

    print("x is currently: ",x)
    print(" x is still less than 5, adding 1 to x")
    x+=1

<details><summary>Click to see hint</summary>

Use an `else` statement.

<details><summary>Click to cheat</summary>

  ```python
x = 0

while x < 5:
    if x == 2:
        x += 1
        print("continuing!!!")
    else:
        print("x is currently: ", x)
        print(" x is still less than 5, adding 1 to x")
        x += 1
  ```
</details>
</details>

In [None]:
num = int(input("Enter a number: "))

while num < 10:
    print('Your number is too small')

    if num==3:
        print('Forbidden guess!')
        break

    num = int(input("Try again: "))

print("Fin")

Try rewriting the above `while` loop without the `break` statement.

In [None]:
num = int(input("Enter a number: "))

while num < 10:
    if num==3:
        print('Forbidden guess!')
        break

    print('Your number is too small')
    num = int(input("Try again: "))

print("Fin")

<details><summary>Click to see hint</summary>

Use an additional check for the exit condition.

<details><summary>Click to cheat</summary>

  ```python
num = int(input("Enter a number: "))

while num < 10 and num != 3:
    print('Your number is too small')
    num = int(input("Try again: "))

if num == 3:
    print('Forbidden guess!')

print("Fin")
  ```
</details>
</details>
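
Neither exercise above uses `pass`, so here is a minimal sketch of it. `pass` does nothing at all; it is handy as a placeholder where Python expects an indented block but we have nothing to run yet. The variable names below are made up for illustration.

```python
# pass does nothing; it only fills a spot where Python expects a block of code
evens = []

for number in range(6):
    if number % 2 == 1:
        pass  # placeholder: nothing to do for odd numbers (yet)
    else:
        evens.append(number)

print(evens)
```

If the `pass` line were deleted, Python would refuse to run the cell because the `if` block would be empty.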

**Note: It is possible to create infinitely running loops with while statements.** Sometimes this is useful for creating an event loop for programs that repeat the same actions over and over again, such as servers.

In [50]:
#DO NOT RUN THIS CODE!!!!
print("Again...")
while True:
    print("and again...")

## Stretch Goal: Calculator app

With iteration and selection you are now able to create very powerful programs. A simple program you can now make is a calculator: one that keeps looping, asking for numbers and an operation, until the user chooses to exit. It's entirely up to you what operations you want to include (e.g. addition and subtraction).

In [None]:
# define initial variables
op = ''

# Use a while loop here

    # Get the user's operation and two numbers

    # Check that the operation is valid

        # Compute and print the result
    
        # Otherwise, print an error message

<details><summary>Click to see hint</summary>

Inside a while loop, get the user to enter an operation and two numbers.

If the operation is valid, calculate the results and print it out.

<details><summary>Click to cheat</summary>

  ```python
# define initial variables
op = ''

while op != 'q':
    op = input("Enter an operation or 'q' to quit: ")

    if op != 'q':
        num1 = int(input("Enter your first number: "))
        num2 = int(input("Enter your second number: "))

        if op == '+':
            res = num1 + num2
        elif op == '-':
            res = num1 - num2
        elif op == '*' or op == 'x':
            res = num1 * num2
        elif op == '/':
            if num2 != 0:
                res = num1 / num2
            else:
                res = 'divide by zero error'
        elif op == '//':
            if num2 != 0:
                res = num1 // num2
            else:
                res = 'divide by zero error'
        elif op == '**':
            res = num1 ** num2
        elif op == '%':
            if num2 != 0:
                res = num1 % num2
            else:
                res = 'divide by zero error'
        else:
            res = 'unknown operation'
        
        print(num1, op, num2, '==', res)
  ```
</details>
</details>

## Functions

We have seen how to write simple programs. However, continuing with the idea of code reuse, wouldn't it be nice not to have to repeat similar code? We can do this with functions. These allow us to call the same block of code multiple times. We can also pass values into these functions, which makes them very powerful. This will make sense with some examples.

To create a function the keyword ```def``` is used. This allows us to define a function. A function needs a name and may have input parameters. We would write a simple function that takes no inputs as seen below.

In [None]:
def print10():
    print(10)

We are now able to call this function anywhere after it has been defined. Because Python is an interpreted language (it runs the code line by line instead of compiling it all first), the function must be defined above where we are going to use it. This ensures that Python knows about the function by the time it is called.

Run the cell below, read the error, then fix it.

In [None]:
print2()

def print2():
    print(2)

<details><summary>Click to see hint</summary>

Place the function call after the function definition.

<details><summary>Click to cheat</summary>

  ```python
def print2():
    print(2)

print2()

  ```
</details>
</details>

Notice the brackets after the function name. This is how the function is called: the brackets tell Python to run the function rather than just refer to it. If we leave them out the function will not run, as shown below. So remember to use the brackets to run functions.

In [None]:
print2
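
To make the difference concrete, here is a small sketch. The names `sameFunction` and `result` are only for illustration. Without brackets we get the function itself; with brackets the function runs.

```python
def print2():
    print(2)

# Without brackets nothing runs: we only get another name for the same function
sameFunction = print2
print(callable(sameFunction))

# With brackets the function body runs and 2 is printed
result = sameFunction()

# print2 has no return statement, so calling it gives back None
print(result)
```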

### Input parameters

We are also able to pass values into the function. We define these input parameters inside the brackets in the definition of the function. This can be seen below in a simple function that finds the area of a circle given its radius:

In [None]:
## Define the function
def calcCircleArea(radius):
    print(radius * radius * 3.14)

## Call the function
calcCircleArea(5.0)

Notice that the function name starts with a verb, as functions are "doing" blocks of code. Variable names should be nouns, as variables represent independent entities, ideas, or objects.

### Returning values

Continuing with the idea of a function that finds a circle's area, we can also have a function that instead of printing the area it returns it. This means we can assign variables to the output of functions as can be seen below.

In [None]:
## Define the function
def calcCircleArea(radius):
    return radius * radius * 3.14

## Call the function
circleArea = calcCircleArea(5.0)


### Print and Return

Print and return may seem similar. However, all `print` does is display something on the screen, whereas `return` hands a value back to the caller so that we can save it in a variable for later, as shown in the example above. This is extremely useful for separating your code into blocks that perform calculations and blocks that do user IO. Keeping our code modular like this makes it easy to reuse code later.
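
The sketch below puts the two side by side. The function `printCircleArea` is a made-up name for the printing version, so that both versions can live in the same cell.

```python
def printCircleArea(radius):
    # only displays the area; nothing is handed back to the caller
    print(radius * radius * 3.14)

def calcCircleArea(radius):
    # hands the area back so the caller can keep using it
    return radius * radius * 3.14

printed = printCircleArea(2.0)   # displays the area, but printed is None
returned = calcCircleArea(2.0)   # displays nothing, but returned holds the area

print(printed)
print(returned * 2)  # we can keep calculating with a returned value
```

A function without a `return` statement gives back `None`, which is why `printed` cannot be used in later calculations.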

### More Advanced Functions

From here you can now write much more advanced functions to do multiple things. Just remember that the purpose of a function is to reduce the number of times you need to write out an operation, so use them instead of having repeated chunks of code.

### Check For A Value In A List

An example of a more complicated function is one that will return true if it finds a value in a list. First I will give you some time to come up with an answer then we will go over an incorrect approach before the correct one.

The incorrect approach is as follows:

In [None]:
# Incorrect
def checkForValue(sequence, value):
    for element in sequence:
        if element != value:
            return False
        else:
            return True

In [None]:
list1 = [1, 2, 3, 4]
value = 2

checkForValue(list1, value)

As we can see, the function did not work. It returns on the very first element: `False` if that element does not equal the value we are looking for and `True` if it does, so the rest of the list is never checked. Also notice the multiple returns; this is not best practice as it reduces the readability of the code. Try to fix the example above in the cell below.

In [None]:
# TODO: fix me!
def checkForValue(sequence, value):
    for i in sequence:
        if i != value:
            return False
        else:
            return True

<details><summary>Click to see hint</summary>

You need a boolean variable to keep track of whether the value has been found.

<details><summary>Click to cheat</summary>

  ```python
def checkForValue(sequence, value):
    contains = False
    
    for element in sequence:
        if element == value:
            contains = True
    
    return contains
  ```
</details>
</details>

In [None]:
# Now test your function by running this cell
list1 = [1, 2, 3, 4]
value1 = 3
value2 = 7

print(checkForValue(list1, value1))
print(checkForValue(list1, value2))
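
Python also has a built-in `in` operator that performs the same membership check on a list. The sketch below repeats the corrected function from the cheat section so that the cell runs on its own, then compares it against `in`.

```python
def checkForValue(sequence, value):
    contains = False

    for element in sequence:
        if element == value:
            contains = True

    return contains

list1 = [1, 2, 3, 4]

# the built-in in operator should agree with our function for every value
for value in [3, 7]:
    print(value, checkForValue(list1, value), value in list1)
```

Writing the function ourselves is good practice, but in everyday code `value in list1` is the usual way to ask this question.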

### Testing functions

It is also good practice to write tests for our functions to ensure they have the expected functionality. For a function like this we would just test each case. A generic test function could look like the template below.

In [None]:
def functionTest():
    print("Testing function")

    # output1, input1, etc. and function are placeholders: replace them with real values
    expectedOutputs = [output1, output2]
    inputs = [input1, input2]

    for i in range(len(inputs)):
        print("Test " + str(i) + " : ")
        if function(inputs[i]) == expectedOutputs[i]: 
            print("\tPassed")
        else:
            print("\tFailed")


For the `checkForValue` function we could write the test function as below. However, this is just a guide; all the test function needs to do is test each case of inputs for the function.

In [None]:
# define function
def testCheckForValue():
    # write definition (replace pass with your code)
    pass

# run the function
testCheckForValue()

<details><summary>Click to see hint</summary>

Copy the generic `functionTest()` above and modify it.

<details><summary>Click to cheat</summary>

  ```python
def testCheckForValue():
    print("Testing checkForValue")
    
    expectedOutputs = [True, False]
    inputVal = [1, 7]
    inputList = [[1, 2, 3], [1, 2, 3]]
    
    for i in range(len(inputVal)):
        print("Test " + str(i) + " : ")
        
        if checkForValue(inputList[i], inputVal[i]) == expectedOutputs[i]:
            print("\tPassed.")
        else:
            print("\tFailed.")

testCheckForValue()
  ```
</details>
</details>
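
Another common way to write such checks is Python's `assert` statement, which raises an `AssertionError` when its condition is false. This is a sketch rather than a replacement for the test function above; `checkForValue` is repeated here so the cell runs on its own, and `testWithAsserts` is a made-up name.

```python
def checkForValue(sequence, value):
    contains = False
    for element in sequence:
        if element == value:
            contains = True
    return contains

def testWithAsserts():
    # each assert stops the program with an AssertionError if the check fails
    assert checkForValue([1, 2, 3], 1) == True
    assert checkForValue([1, 2, 3], 7) == False
    assert checkForValue([], 1) == False   # edge case: an empty list
    return "All tests passed"

print(testWithAsserts())
```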

## File IO



Although variables are very powerful and allow us to store data, this data will be lost when the program ends. This is because it is stored in RAM, which will then be overwritten by other programs. If we want to keep any data from our program to use later or for other purposes we need to write it to files. There are many types of files and file formats. We will go over txt files and csv files, as these are a good starting point and will allow you to retain data after a program's runtime.

### Reading

To read files in python we first need to open the file. The syntax for this is as follows:

```f = open(filename, "r")```

This gives us a variable called `f` that we can now reference in our program. In Python there are three main methods for reading a file: ```readline()```, ```readlines()``` and ```read()```

```readline()``` : Reads a single line from the file and returns that line as a string

```readlines()``` : Reads all the lines of the file and returns a list of lines, where each line ends with the `\n` character (except possibly the last)

```read()``` : Reads the whole file and returns it as one string. We can also specify how many characters to read (we usually won't use this latter functionality)

We also need to close files. This can be done using ```f.close()```

Some examples:

In [None]:
f = open("testfile.txt", "r")

print(f.readline())
print("\n-----")
f.close()

f = open("testfile.txt", "r")
print(f.readlines())
print("\n-----")
f.close()

f = open("testfile.txt", "r")
print(f.read())
f.close()

Instead of having to close the file each time we can make use of the `with` keyword, which will clean up our file for us. This is good practice, as the file will still be closed even if an exception occurs. This is demonstrated below:

In [None]:
with open("testfile.txt", "r") as f:
    print(f.read())

Each subsequent read will move where the text cursor is in the file, so that subsequent reads will continue from that position.

In [None]:
with open("testfile.txt", "r") as f:
    print(f.readline())
    print("---")
    print(f.readlines())
    print("---")
    print(f.read())
    print("---")

Hence it is possible to use any of these functions to read the whole file: with `read` we get all the text, with `readlines` we get it all separated into lines, and with `readline` we just keep reading until there are no lines left.
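
As a sketch of the last option, `readline()` returns an empty string once the end of the file is reached, so a `while` loop can keep reading until that happens. The file name `readline_demo.txt` is made up for this example, and the cell writes the file first (using write mode, covered in the Writing section) so that it can run on its own.

```python
# create a small file to read (hypothetical file name)
with open("readline_demo.txt", "w") as f:
    f.write("first\nsecond\nthird\n")

lines = []
with open("readline_demo.txt", "r") as f:
    line = f.readline()
    # readline() returns an empty string at the end of the file
    while line != "":
        lines.append(line)
        line = f.readline()

print(lines)
```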

We can use a `for` loop to iterate over the lines in a file.

In [None]:
# Open the file in a with statement

    # Use a for loop

        # print the line's contents


<details><summary>Click to cheat</summary>

  ```python
# Open the file in a with statement
with open("testfile.txt", "r") as f:
    # Use a for loop
    for line in f:
        # print the line's contents
        print(line)
  ```
</details>
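
Each line read from a file still ends with its `\n` character, and `print()` adds another one, so the loop above shows a blank line between lines. One way around this, sketched below with a made-up file name, is to strip the trailing newline with the string method `rstrip()`.

```python
# create a small file to read (hypothetical file name)
with open("strip_demo.txt", "w") as f:
    f.write("alpha\nbeta\n")

cleaned = []
with open("strip_demo.txt", "r") as f:
    for line in f:
        # remove the trailing newline before printing or storing the line
        text = line.rstrip("\n")
        cleaned.append(text)
        print(text)
```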

### Writing

When we write to a file we need to open the file in write mode as shown below:

```f = open(filename, "w")```

However, as shown before, we can make use of the `with` keyword and hence would write it as:

```with open(filename, "w") as f:```

To write a string to a file we use the `write()` function. This is shown below, where we write to a new file and then read back the content we added to it.

In [None]:
with open("test.txt", "w") as f:
    f.write("This is the first line")
    f.write("hello")

# To check what was written
with open("test.txt", "r") as f:
    print(f.read())


As we can see, this does not add in the `\n` after each write. We need to put this in ourselves.

In [None]:
with open("test.txt", "w") as f:
    f.write("This is the first line\n")
    f.write("hello\n")

# To check what was written
with open("test.txt", "r") as f:
    print(f.read())


**WARNING:** Whenever we open a file in write mode, if a file already exists with the same filename, it **WILL** be overwritten!
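
The sketch below shows the overwrite happening, along with one possible safeguard: checking whether the file exists with `os.path.exists()` from the standard library before opening it in write mode. The file name is made up for this example.

```python
import os

filename = "overwrite_demo.txt"   # hypothetical file name

with open(filename, "w") as f:
    f.write("old contents\n")

# opening in write mode again throws away the old contents
with open(filename, "w") as f:
    f.write("new contents\n")

with open(filename, "r") as f:
    contents = f.read()
print(contents)

# a simple safeguard: look before you write
if os.path.exists(filename):
    print(filename, "already exists, writing to it would overwrite it")
```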

### Appending

If we want to add lines to an existing file without overwriting it, we have two options. We can read the file first, add the data we want, and then write everything out again, or we can open the file in append mode, which adds to the end of the file. Append mode works exactly the same as write mode except we use `"a"`:
```
open(filename, "a")
```

In [1]:
with open("test.txt", "a") as f:
    f.write("This is the lastline\n")

# To read the file to see the line above being added at the end
with open("test.txt", "r") as f:
    print(f.read())    

This is the first line
hello
This is the lastline



### CSV files

A CSV (comma-separated values) file is a file commonly used to store data. It is similar to an Excel spreadsheet, where all the values are separated into rows and columns. An example of such a file is:

```
day,temperature
Monday,26
Tuesday,29
Wednesday,30
```

All values are separated by commas. Each row is one record, and each column is associated with a particular variable (here, the day and its temperature).



In [51]:
## reading a csv file
with open('names.csv', 'r') as namesfile:
    # read the first line and remove the trailing newline
    line = namesfile.readline().strip()
    # split the line into a list of values
    linelist = line.split(',')
    print(linelist)

Then to read the whole file we just need to loop through and read each line, or read all the lines and then split on the `\n`. We can then pass the data on to whatever requires it.
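
Here is a sketch of the first approach, using the day and temperature example from above. The file name `temps.csv` is made up, and the cell writes the file first so that it can run on its own.

```python
# create the example CSV file (hypothetical file name)
with open("temps.csv", "w") as f:
    f.write("day,temperature\nMonday,26\nTuesday,29\nWednesday,30\n")

rows = []
with open("temps.csv", "r") as f:
    for line in f:
        # drop the trailing newline, then split the line into its columns
        rows.append(line.strip().split(","))

header = rows[0]   # the column names
data = rows[1:]    # everything after the header row
print(header)
print(data)
```

Note that every value is still a string at this point; the temperatures would need `int()` before we could do arithmetic with them.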

In [52]:
## writing a csv file
with open('names3.csv', 'w') as newnames:
    newline = ','.join(linelist)
    newnames.write(newline)

# Checking that the file's contents was saved
with open('names3.csv', 'r') as newnames:
    print(newnames.read())

Now we can apply this to writing and reading much more complex csv files.

The task is to parse the weather data in a CSV file from the BOM. The file is included in the directory. It contains weather data from March last year, where the 1st row is the minimum temperature of each day and the 2nd row is the maximum temperature of each day. Hence, there are 31 columns, one for each day of March.

In [None]:
# Open the file called "marchweather.csv"

# Get the file's contents and store inside a variable

# close the file

# print the file's contents

<details><summary>Click to cheat</summary>

  ```python
# Open the file called "marchweather.csv"
with open('marchweather.csv', 'r') as f:
    # Get the file's contents and store inside a variable
    lines = f.readlines()
# the with statement has already closed the file for us

# print the file's contents
mins = lines[0]
maxs = lines[1]

print("maxs:", str(maxs) + '\n')
print("mins:", str(mins))
  ```
</details>
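
The lines read from the file are still strings. If we want to do arithmetic on the temperatures, each line has to be split and converted to numbers. The sketch below uses two short made-up lines in place of the real file contents and assumes every field is a number; the helper name `parseTemperatures` is hypothetical.

```python
def parseTemperatures(line):
    # turn a line such as "12.5,14.0\n" into a list of floats
    temperatures = []
    for text in line.strip().split(","):
        temperatures.append(float(text))
    return temperatures

# made-up sample lines standing in for lines[0] and lines[1]
mins = parseTemperatures("12.5,14.0,13.2\n")
maxs = parseTemperatures("28.1,30.4,27.9\n")

print("lowest min:", min(mins))
print("highest max:", max(maxs))
```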

## Pandas

Python's built in file IO functions provide convenient ways of loading data to and from a file. However, when the data becomes complex, Python's built-in file IO functions become tedious to work with.

Thankfully, a third party library called Pandas allows us to easily load, analyse, clean, and save text based data.

First, let's import pandas and load a dataset:

In [19]:
import pandas as pd

planets = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/planets.csv')

Pandas offers two key data structures: Series and DataFrames. Series are simply 1D sequences of data whereas DataFrames are 2D. Since CSV files are 2D in nature, `pd.read_csv()` returns a DataFrame if successful. We can view key information about our DataFrame using:
+ `head()`: view the first few rows of your DataFrame.
+ `tail()`: view the last few rows of your DataFrame.
+ `columns`: view the columns our DataFrame contains.
+ `describe()`: view basic statistical information for each column of your DataFrame.
+ `info()`: view basic information about the DataFrame, such as the number of rows, the columns and their types, and the number of non-null values in each column.

In [None]:
# Let's view the first few rows of our data
planets.head(10)

In [None]:
# Now let's look at the columns our data is made up of
planets.columns

In [None]:
# We can also slice the DataFrame like a list to view a range of rows by position
planets[50:60]

In [None]:
# Let's view the mass column by itself
planets['mass']

In [None]:
# Lastly, let's see the planets whose year is 2008
planets[planets['year'] == 2008]

You may have noticed several values that are `NaN`. This is a special value meaning "Not a Number", which here indicates that the value is missing or erroneous.

We can see just how many `NaN` values are in our dataset with `info()`.

In [None]:
# info shows us how many rows there are in total and, for each column, how many rows are non-null
planets.info()

In [None]:
# describe() attempts to calculate basic stats for each column, if possible
# Note that NaN values are ignored
planets.describe()

We can count how many `NaN` values there are for each column by summing the `isnull()` output.

In [None]:
planets.isnull().sum()

A quick and dirty fix is to simply drop the rows with missing values by using `dropna()`. However, this would mean we would **drop at least 522 rows**, as that is how many rows are missing a `mass` value!

Sometimes, it's better to replace the `NaN` values with the mean or median to avoid dropping too much data.
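
Here is a minimal sketch of the idea on a tiny made-up DataFrame, so the effect is easy to check by hand before we apply it to the real data. The column name `weight` and its values are invented for illustration.

```python
import pandas as pd

# a tiny made-up dataset with one missing value (None becomes NaN)
df = pd.DataFrame({"weight": [1.0, 2.0, None, 9.0]})

mean_weight = df["weight"].mean()      # NaN is ignored: (1 + 2 + 9) / 3
median_weight = df["weight"].median()  # NaN is ignored: middle of 1, 2, 9

# fillna returns a new Series; df itself is left unchanged here
filled_with_mean = df["weight"].fillna(mean_weight)
filled_with_median = df["weight"].fillna(median_weight)

print(filled_with_mean.tolist())
print(filled_with_median.tolist())
```

The single large value pulls the mean well above the median, which is one reason the median can be the safer replacement when a column contains outliers.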

Let's do an example with the `orbital_period` column:

In [None]:
# Get the orbital_period column
# notice that this is a Pandas Series
orb_period = planets['orbital_period']
orb_period.info()

In [None]:
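# Calculate the mean
mean_orb_period = orb_period.mean()

# replace the NaN values with the mean and store the result back in the DataFrame
planets['orbital_period'] = orb_period.fillna(mean_orb_period)

Assigning the filled Series back to the column updates the `planets` DataFrame. Calling `fillna(..., inplace=True)` on a column taken out of a DataFrame is not reliable, because recent versions of pandas may hand back a copy of the column rather than a view.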

Now do the same for the other columns with `NaN` values.

In [None]:
# Load the columns

# Calculate the means

# Replace the NaN values


<details><summary>Click to cheat</summary>

```python
# Load the columns
mass = planets["mass"]
distance = planets["distance"]

# Calculate the means
mass_mean = mass.mean()
dist_mean = distance.mean()

# Replace the NaN values and store the results back in the DataFrame
planets["mass"] = mass.fillna(mass_mean)
planets["distance"] = distance.fillna(dist_mean)
```

</details>

Pandas is an incredibly powerful library that this workshop does not adequately cover. It is strongly recommended that you visit the official documentation [here](https://pandas.pydata.org/).

## Plotting

In science, engineering and especially machine learning, being able to display our results in an easy to understand way is very important. The most popular python library for this task would be `matplotlib.pyplot`. It allows us to graph our results and display them and save them with minimal code.

We start by importing the library

In [53]:
# importing the library is done through the import statement
import matplotlib.pyplot

Python also allows aliases for imports, so let's import it as `plt` to simplify its usage later on.

In [54]:
# the as keyword gives the imported library a shorter alias
import matplotlib.pyplot as plt

Now let's create a simple plot.

In [None]:
plt.plot([1,2,3,4])
plt.title('Example plot with only y-values')
plt.ylabel('Some data')
plt.xlabel('Input values')
plt.show()

As shown above we can plot a list of data using the `plot()` function from the `matplotlib.pyplot` library. We can add a title using ```title()``` and an x or y label using ```xlabel()``` or ```ylabel()```. Lastly, to show the plot the ```show()``` function is used.

When only one array is passed in, it is assumed that the data is for y-values whereas the values 0, 1, 2, ... , n-1 are used for the x-values.

Now let's plot something more interesting.

In [None]:
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [5, 6, 2, 1, 9]
plt.plot(x, y)
plt.title('Some plot of data')
plt.show()

We can also plot with other colours and symbols.

In [None]:
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [5, 6, 2, 1, 9]
plt.plot(x, y, 'ro')
plt.title('Some plot of data')
plt.show()

In [None]:
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [5, 6, 2, 1, 9]
plt.plot(x, y, 'g^')
plt.title('Some plot of data')
plt.show()

Or plot multiple things on the same plot

In [None]:
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y1 = [5, 6, 2, 1, 9]
y2 = [9, 2, 5, 2, 1]
y3 = [5, 7, 2, 9, 1]
y4 = [0, 10, 20, 4, 5]
plt.plot(x, y1, 'r--', x, y2, 'bs', x, y3, 'g^', x, y4, 'kp')
plt.title('Some plot of data')
plt.show()

You may have noticed some strange looking strings in the `plt.plot()` function calls. These are flags that allow us to change the linestyles, markers, and colours.

Some of the other symbols that you can plot with are:\
**linestyles**: `'-'` (solid line), `'--'` (dashed line), `'-.'` (dash-dot), `':'` (dotted line)\
**markers**: `'s'` (square), `'^'` (triangle), `'p'` (pentagon), `'+'` (plus), `'x'` (cross), `'o'` (circle), `','` (pixel), `'.'` (point), `'1'` (tri-down), `'2'` (tri-up), `'3'` (tri-left), `'4'` (tri-right)\
**colours (short names)**: `'b'`, `'g'`, `'r'`, `'c'`, `'m'`, `'y'`, `'k'`, `'w'`\
**colours (long names)**: `'blue'`, `'green'`, `'red'`, `'cyan'`, `'magenta'`, `'yellow'`, `'black'`, `'white'`

Take a look at the graphs above and see how the different linestyles, markers, and colours are being used.
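
The format string is a shorthand. The same styling can also be written out with the `color`, `marker`, and `linestyle` keyword arguments of `plt.plot()`, which some people find easier to read. The sketch below draws the same made-up data both ways; the names `shortLines` and `longLines` are only for illustration.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [5, 6, 2, 1, 9]

# short form: colour, marker and linestyle packed into one string
shortLines = plt.plot(x, y, 'r^--')

# long form: the same style written with keyword arguments
longLines = plt.plot(x, y, color='red', marker='^', linestyle='--')

plt.title('Two ways of writing the same style')
plt.show()
```

Both calls draw a red dashed line with triangle markers, so the second line sits exactly on top of the first.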

More information on the different colours and symbols can be found in the `matplotlib.pyplot` documentation, which can be accessed [here](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.html).
Reading documentation is very important as a programmer, because it is how we find out which functions a language or library provides. The Python documentation can be accessed [here](https://docs.python.org/3/) if we need to look anything up.

Continuing with plotting, we can also have multiple plots in the same image. This is done through the ```subplot()``` function. It works by rows and columns. This can be explained best with an example.

In [None]:
import matplotlib.pyplot as plt

plt.figure(1) # To set up the sub plot
plt.subplot(221) # This means the figure will have two rows and two columns, and this is the first plot of the four
plt.plot(x, y1, '--')
plt.ylabel('some data')
plt.title('A plot of some data')

plt.subplot(222) # This is the second plot of the four. The first number is the rows, then columns, then plot in the sequence
plt.plot(x, y2, 'ro')

plt.subplot(223) # Now the third plot
plt.plot(x, y3, 'g^')
plt.ylabel('some data')

plt.subplot(224) # And finally the fourth
plt.plot(x, y4, 'k+')

plt.show()

### Bar charts
There are other types of plots that we can use as well, such as bar charts. These use ```plt.bar()``` instead. A horizontal bar chart can be drawn using the ```plt.barh()``` function.

In [None]:
# Let's create some data from rolling a die
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng()
rollCounts = rng.integers(2, 10, size=6)

plt.title('Dice rolling')
plt.xlabel('number')
plt.ylabel('count')

plt.bar(range(1, 7), rollCounts)
plt.show()

### Histograms
The last type of plot covered today is the histogram, which is best for showing distributions. These are plotted using ```plt.hist()```.

In [None]:
import matplotlib.pyplot as plt
from numpy.random import normal

gaussian_numbers = normal(size=1000)
plt.hist(gaussian_numbers)
plt.title("Gaussian Histogram")
plt.xlabel("Value")
plt.ylabel("Frequency")
plt.show()

## Final Stretch Goal

Now let's put it all together! See if you can master this final goal. There's lots of ways of doing this so feel free to experiment.

Remember what you've learnt: functions, saving data to a file, `for` loops, Pandas DataFrames, and plotting, because you'll need all of them for this task!

We're going to play with the penguins dataset, clean the data, save the statistical information to a CSV file, and plot the data on a single plot.

In [None]:
# import the required libraries
import pandas as pd
import matplotlib.pyplot as plt

# Load the dataset
penguins = pd.read_csv('https://raw.githubusercontent.com/mwaskom/seaborn-data/master/penguins.csv')

# Do some basic analysis with Pandas and see what needs to be cleaned up


# Get the basic statistical info and save it to a CSV file


# Plot the cleaned data using subplots
# Each plot should have its own plot type or linestyle, colour, and marker type




Now with everything covered, you may be wondering where to go from here. If you wish to learn more about Python there are plenty of resources online, and there are also a few other workshops I have written, which can be found here: https://github.com/Curtin-Machine-Learning-Club/Week-1-Content/tree/master/PythonIntroduction . If you wish to apply your new skills to machine learning, you're in luck, as we will be running machine learning workshops.