# Welcome to Python programming
For many of you this will be your first experience of programming, while others will have learnt some at school, at college or during an undergraduate degree (maybe even in a previous job or career). Our goal is to introduce the Python programming language and the principles behind computer programming in a way that will allow you to continue using this new skill, or develop it further if you so wish.

You are currently reading a Jupyter notebook. It works like an integrated development environment (IDE), allowing a programmer to write and test code in short blocks. This is an _ideal_ way to learn programming as it will allow you to play around and test out bits of Python code to understand how they work. Notebooks are commonly used in data science as they allow the quick execution of code for analysis. We will use Jupyter notebooks for all practical sessions, and I recommend using them for your assignment too. If you have done a lot of programming previously then you may use your favourite IDE or text editor, but for trouble-shooting and assignments please use the Jupyter notebook.


## Using Jupyter notebooks
This text is written in a language called 'Markdown'. It's not a programming language but something called a mark-up language that allows text formatting, including maths equations and other special symbols using a plain text editor.


Jupyter notebook blocks can contain code or markdown - you can switch between the two using the drop down menu:
![Switching between code and Markdown](Code_markdown_menu.png "Screenshot")

Go ahead and play around with writing some text in a new block and see what happens when it is Markdown vs. code. You might see something like these two examples.

![Markdown](Markdown_example.png)

![Code](Code_example.png)

The text is coloured in the code - this is called syntax highlighting, and is one of the reasons why we use an IDE - it makes it easier for us to write code as special types of code are coloured differently. Below is an example of syntax highlighting on a function.

![Syntax highlighting](syntax_highlight.png)

The <span style="color:green">green</span> code marks special built-in functions that come with Python, the <span style="color:blue">blue</span> code is the name of my function and the <span style="color:rgb(175,0,255)">pink</span> symbols are operators. Anything that is a string is in <span style="color:red">red</span> and normal code is in <span style="color:black">black</span>.

# Variables and assignment
Variables are how we store values in programming languages. When we make a new variable or change its value we call this _assignment_. In Python we use the equals operator (<span style="color:rgb(175,0,255)">=</span>) to assign a value to a variable. The name of the variable always goes on the left and the value to be assigned on the right.

In [1]:
x = 3
my_name = "Mike"

It doesn't matter if we are assigning a <span style="color:green">**number**</span> or a <span style="color:red">**string**</span>, we use the same operator - Python figures out what you are assigning. For instance, it knows that <span style="color:red">**"Mike"**</span> is a string because it is in quotes, whereas <span style="color:green">**3**</span> is not, so it knows to store it as a number (specifically an integer).
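As a small illustration of the point above, the sketch below assigns a value to a variable and then assigns a different kind of value to the same name. The variable name `magic_number` is made up for this example, and the `type()` function it uses is introduced properly in the Data types section further down.

```python
# Assign a whole number, then check what kind of value Python stored
magic_number = 3
first_type = type(magic_number)
print(magic_number, first_type)

# Assigning again replaces the old value; the type follows the new value
magic_number = "three"
second_type = type(magic_number)
print(magic_number, second_type)
```

The same name can hold a number at one point and a string later on, because Python works out the type from whatever is on the right of the equals sign.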

### **Exercise break**

Try creating some of your own variables, both numbers and strings.

When naming variables it is best to use something descriptive. **'x'** isn't very informative because it doesn't tell me anything about what is stored in the variable, but **'my_name'** gives me an idea that it is a name and therefore must be stored as a string. There is no limit on how long a variable name can be, so you could call it **'a_string_is_stored_here'**, but if you create another string variable it will need a different name, and calling it **'a_string_is_stored_here2'** isn't really very informative. You get the picture. Have a try and see what works and what doesn't.

If you have a variable already assigned you can _evaluate_ it (what we say when we want to know what is stored in the variable) or you can display the value using the <span style="color:green">**print()**</span> function.

In [2]:
print(x)

3


In [3]:
x

3

Evaluating **'x'** like this only works in an interactive Python session, such as this notebook. If you are running a script (more on these in a later practical) then you will need to use the <span style="color:green">**print()**</span> function to display the value.

<span style="color:green">**print()**</span> can take multiple arguments, separated by commas, which it will **concatenate** together with a single space between each one. You can mix numbers and strings.

In [4]:
print("My name is", my_name, "and the magic number is ", x)

My name is Mike and the magic number is  3
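A short sketch of how `print()` joins its arguments. Because it places a space between them by default, the text itself does not need a trailing space (which is where the double space in the output above comes from). The `sep` argument is part of the standard `print()` function but is not otherwise used in this notebook, so treat it as an optional extra.

```python
my_name = "Mike"
x = 3

# print() puts one space between each argument by default
print("My name is", my_name, "and the magic number is", x)

# The optional sep argument replaces that space with something else
print("My name is", my_name, sep=": ")

# An empty separator joins the arguments with nothing in between
print("magic", "number", x, sep="")
```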


To evaluate a variable or use it in a function you have to assign it first. If you try to evaluate a variable that doesn't exist you will get an error.

In [5]:
your_name

NameError: name 'your_name' is not defined

Errors are useful and not something to be scared of. They tell us that there is a problem in the code and the message gives us a clue about what the cause is. Here the error is a <span style="color:red">NameError</span> which is what happens when you evaluate a variable that doesn't exist. In fact, it tells us which variable doesn't exist - **'your_name'**.

### A note of caution
Jupyter notebooks can have strange behaviour if you execute blocks of code out of order. If I try to use a variable in a code block that comes before the block that assigns it, I will get an error, as above. _However_, if I run the assignment code block _then_ the preceding block I will not get an error because the Python session can see that the variable exists. To make sure your code really works from top to bottom it is _strongly_ recommended that you restart the Python kernel (session) and rerun everything using the **Kernel > Restart & Run All** option above.

![Kernel restart](kernel_restart.png)

# Data types
To write effective programming code we need to be able to represent different types of data. As with a spoken/written language we have letters and numbers, and we can put letters together to create words. We can add, subtract, multiply and divide numbers to make other numbers. We can also ask if two things are the same or different. All of these aspects can be represented in a programming language with the following basic data types:

* strings - letters, words, sentences
* integers - whole numbers (not fractions)
* floats - fractional numbers, e.g. 1.43 or pi
* boolean - logical values - True or False

There are more complex data types that we will consider later that are useful for storing these different basic data types. You can find out what the data type is using the <span style="color:green">type</span>() function.


In [6]:
type(1)

int

In [7]:
type("Mike")

str

In [8]:
type(1.43)

float

In [9]:
type(True)

bool

In [10]:
type(False)

bool

## Container data types
There are several container data types that we will use a lot, the main ones being the <span style="color:green">list</span>, <span style="color:green">dict</span> and <span style="color:green">tuple</span>. 

### Lists
A <span style="color:green">list</span> can contain any number of elements, which can be of different types.

In [11]:
number_list = [1, 4, 6, 12, 9871]
string_list = ["Mike", "Ellie", "Kim", "Luna"]
mixed_list = ["words", 1, 4, "and numbers mix"]
type(number_list), type(string_list), type(mixed_list)

(list, list, list)

You can add new elements to a list in a number of different ways, the easiest of which is to use the special <span style="color:blue">append</span> function. It adds whatever is inside the brackets to the end of the list as a new element. Repeated calls of <span style="color:blue">.append()</span> will keep adding to the end of the list.

In [12]:
number_list.append(string_list) # this adds the list as a new element to the end
number_list

[1, 4, 6, 12, 9871, ['Mike', 'Ellie', 'Kim', 'Luna']]

### **Exercise break**

What happens if you keep running the code above? Try it out and explain what you think is happening.

### Tuple
A <span style="color:green">tuple</span> contains a fixed number of elements that can be of mixed types - these are _immutable_, which means once a tuple is made it can't be changed except to make a new tuple.

In [13]:
my_tuple = (4, "apples", "basket", 12)
my_tuple, type(my_tuple)

((4, 'apples', 'basket', 12), tuple)

In [14]:
my_tuple + my_tuple # this makes a new tuple by adding together these two.

(4, 'apples', 'basket', 12, 4, 'apples', 'basket', 12)

We can't keep adding to a tuple like we can with a list. Try this out by using the .append() function on the tuple:

In [15]:
my_tuple.append(my_tuple) # we'll come back to this one later

AttributeError: 'tuple' object has no attribute 'append'
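To make the difference between mutable lists and immutable tuples concrete, here is a minimal sketch. It uses two things this notebook has not covered yet, so treat them as assumptions for now: square brackets with a position (counting from 0) to pick out one element, and a `try`/`except` block so that the error is reported without stopping the rest of the code. The variable names are made up for this example.

```python
shopping_list = [4, "apples", "basket", 12]
shopping_tuple = (4, "apples", "basket", 12)

# Lists are mutable: we can replace an element in place
shopping_list[0] = 5
print(shopping_list)

# Tuples are immutable: the same operation raises a TypeError
try:
    shopping_tuple[0] = 5
except TypeError as error:
    error_message = str(error)
    print("TypeError:", error_message)

# The tuple still holds its original values
print(shopping_tuple)
```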

### Dict
<span style="color:green">dict</span>s can store lots of different elements using a key:value pair organisation. This is very useful for arranging other containers into different categories. For instance, I could organise names and numbers into a dictionary. Each key is separated from its value by a colon, and each key:value pair is separated from the next by a comma.

In [16]:
phone_book = {"Names": ["Mike", "Daniel", "Ellie", "Heather"],
             "Office": ["6.17", "5.23", "3.20", "3.18"],
             "Department": ["Immunology", "Neuroscience", "Developmental Biology", "Immunology"]}
type(phone_book)

dict
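Once a dict exists, values are retrieved by key. The sketch below repeats the `phone_book` from above so that it runs on its own; the names `all_names`, `key_list` and `first_office` are made up for illustration.

```python
phone_book = {"Names": ["Mike", "Daniel", "Ellie", "Heather"],
              "Office": ["6.17", "5.23", "3.20", "3.18"],
              "Department": ["Immunology", "Neuroscience", "Developmental Biology", "Immunology"]}

# Look up the value stored under a key using square brackets
all_names = phone_book["Names"]
print(all_names)

# List the keys that the dict contains
key_list = list(phone_book.keys())
print(key_list)

# The value here is a list, so a position (counting from 0) picks one element
first_office = phone_book["Office"][0]
print(all_names[0], "is in office", first_office)
```

Asking for a key that doesn't exist raises a `KeyError`, in the same way that evaluating an unassigned variable raises a `NameError`.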

# Arithmetic
Computers are _amazing_ at fast arithmetic, which is a common task in programming. We can add numbers together or we can add strings together, but we can't add a number to a string.

## Numeric arithmetic
Adding and subtracting numbers works in the same way that you would expect it to. The exception is for floating point numbers because they have a _precision_ associated with them.

In [17]:
1 + 3

4

In [18]:
5 - 2

3

We can also use variables to which we have assigned numbers.

In [19]:
x + 5

8

In [20]:
y = x + 5
print(y)

8


In [21]:
y - x

5

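Floating point numbers _mostly_ work as expected, but can behave unexpectedly because they are stored with limited precision; this is most noticeable when they are _very_ small or _very_ large. Note how adding a float and an integer automatically converts the answer to a float; this is called **_casting_** (or, more precisely, implicit type conversion).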

In [22]:
1.5 + 3

4.5

In [23]:
4.5 - 1.5 # the answer is still a float

3.0

If you are really interested in learning about how floating points are actually represented in Python see this link [here](https://docs.python.org/3/tutorial/floatingpoint.html). Multiplying works as expected, however, the multiplication operator in Python is **'\*'** and _not_ 'x' <- that's a variable name (if we have assigned it as such).

In [24]:
3 * 2

6

In [25]:
3 x 2

SyntaxError: invalid syntax (1418951104.py, line 1)

Division with the **'/'** operator always results in a float because we are dealing with fractions rather than whole numbers, even if the result would be a whole number.

In [26]:
6/2

3.0

In [27]:
6.0/2

3.0

In [28]:
6/2.0

3.0

We can also check if two values are the same - but beware that this can go awry when using floats because of the aforementioned _precision_. To check for equality, i.e. are two things _exactly the same_ we use the '<span style="color:rgb(175,0,255)"> == </span>' operator.

In [29]:
x = 4

x == 4

True

In [30]:
y = 2 * 2
x == y

True

In [31]:
# Dividing two integers always gives a float
x/y

1.0
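The warning about floats and equality can be seen with a classic example. This is a minimal sketch; `math.isclose()` comes from Python's standard `math` module, which has not been introduced in this notebook, and the variable names are made up.

```python
import math

total = 0.1 + 0.2

# The stored value is very slightly off, so exact equality fails
print(total)
exactly_equal = (total == 0.3)
print(exactly_equal)

# math.isclose() checks whether two numbers are equal within a small tolerance
nearly_equal = math.isclose(total, 0.3)
print(nearly_equal)
```

When comparing floats it is usually safer to ask whether they are close enough rather than exactly the same.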

## String arithmetic
String arithmetic is very simple - it is used to concatenate (add together) strings. Note that it will _not_ introduce a space (whitespace) into the output - you have to tell the computer to do this.

In [32]:
"add this" + "and this"

'add thisand this'

In [33]:
"add this" + " " + "and this"

'add this and this'

By the way, subtraction, multiplication and division between two strings don't exist in string arithmetic.

In [34]:
"subtract this " - " from this"

TypeError: unsupported operand type(s) for -: 'str' and 'str'

In [35]:
"multiply this " * "by this"

TypeError: can't multiply sequence by non-int of type 'str'

In [36]:
"divide this " / " by this"

TypeError: unsupported operand type(s) for /: 'str' and 'str'

You can however multiply a string by a number to repeat it.

In [37]:
"repeat me " * 3

'repeat me repeat me repeat me '
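Strings and numbers can't be added together directly, but they can be converted first. The sketch below uses the built-in `str()` and `int()` functions; the variable `age` and its value are hypothetical, and the `try`/`except` block (not covered in this notebook) is only there so the error is reported without stopping the code.

```python
age = 30  # hypothetical value used only for this example

# Adding a string and a number raises a TypeError
try:
    sentence = "I am " + age
except TypeError as error:
    print("TypeError:", error)

# Converting the number to a string first makes the concatenation work
sentence = "I am " + str(age) + " years old"
print(sentence)

# Going the other way, int() turns a string of digits into a number
next_age = int("30") + 1
print(next_age)
```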

### **Exercise break**
What other arithmetic operators are there? Can you figure out what the modulus % operator does? How about adding and subtracting bools? What does <span style="color:green">True</span> + <span style="color:green">True</span> give you? Why do you think you get this answer? What other bool operators are there?

# Conditional flow
When writing a program we often have multiple choices to make depending on the value of some variable. Conditional flow allows us to make these choices and execute different code depending on what decision is made. For instance, if we have a variable containing a number less than 100 we might perform one operation, but if it's more than 100 we could do something different.

There are special statements that we use for this: <span style="color:green">if</span>, <span style="color:green">else</span> and <span style="color:green">elif</span>. A conditional statement always begins with <span style="color:green">if</span>, and _usually_ finishes with <span style="color:green">else</span>. If there is a choice between more than 2 options then we can also use <span style="color:green">elif</span>.

In [38]:
my_name = "Mike"
#my_name = "Julie"

if my_name == "Mike":
    print("Howdy " + my_name)
else:
    print("Nice to meet you " + my_name)

Howdy Mike


In [39]:
if my_name == "John":
    print("Welcome " + my_name)
elif my_name == "Sandra":
    print("Lovely to meet you " + my_name)
else:
    print("We haven't met - what is your name?")

We haven't met - what is your name?
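The same pattern works with numbers. Here is a minimal sketch of the "less than 100 or more than 100" situation described at the start of this section; the variable names and the value 42 are made up, so try changing the value to see the other branches run.

```python
measurement = 42  # hypothetical value; try 100 or 250 instead

if measurement < 100:
    category = "below 100"
elif measurement == 100:
    category = "exactly 100"
else:
    category = "above 100"

print(measurement, "is", category)
```

Only the first branch whose condition is True gets executed; the remaining branches are skipped.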


# Iteration - for and while loops
In a programming task we often want to perform the same operation multiple times on a series of numbers or strings. For instance, if I want to compute the first 50 square numbers then I could write out 50 lines of code and calculate each square number one at a time.

Instead we can _iterate_ over a collection of numbers using a <span style="color:green">**for**</span> loop and compute the square of each number.

In [40]:
for x in range(50):
    print(x ** 2)

0
1
4
9
16
25
36
49
64
81
100
121
144
169
196
225
256
289
324
361
400
441
484
529
576
625
676
729
784
841
900
961
1024
1089
1156
1225
1296
1369
1444
1521
1600
1681
1764
1849
1936
2025
2116
2209
2304
2401


The anatomy of this for loop is always

* <span style="color:green">for</span> _some variable name_ <span style="color:green">in</span> _collection of types_:

I used the <span style="color:green">range</span>() function here, which creates a _range_ object that produces a sequence of numbers on demand, here from 0 to 49 (the end value, 50, is not included). Equally, we could include any set of numbers in an appropriate container.
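A short sketch of how `range()` chooses its numbers. It always stops just before its end value, and an optional start value can be given first. The variable names are made up for illustration.

```python
# range(5) produces 0, 1, 2, 3, 4; the end value is not included
first_five = list(range(5))
print(first_five)

# Giving a start and an end produces the numbers 1 to 50 inclusive
one_to_fifty = list(range(1, 51))
print(one_to_fifty[0], one_to_fifty[-1])

# Squares of 1 to 5, collected in a list rather than printed one by one
squares = []
for number in range(1, 6):
    squares.append(number ** 2)
print(squares)
```

Wrapping the range in `list()` is only needed to see all of its numbers at once; a for loop can use the range directly.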

We could also iterate over a list of strings and do some operation on them, like adding sequential elements to create one long string.

In [41]:
string_list = ["Mike", "is", "an", "awesome", "programmer"]

concat = ""
for mystring in string_list:
    concat = concat + " " + mystring
    
print(concat)    

 Mike is an awesome programmer


<span style="color:green">while</span> loops can be another useful way to iterate over a set of elements when you don't know how many elements there are, or if you want to wait until a particular event has happened.

**NB: Use with caution** While loops can be tricky to work with and debug, as they can result in an infinite loop, i.e. one that never ends because the event to break the loop doesn't ever happen in the code. This will manifest as the code hanging, i.e. taking a long time to finish (an infinite amount of time in fact).

In [42]:
# a while loop that works
number_adder = 0

while number_adder < 10:
    print(number_adder)
    number_adder += 1
    
print(number_adder*2)

0
1
2
3
4
5
6
7
8
9
20
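A while loop is most useful when the number of repetitions isn't known in advance. As a minimal sketch, the code below keeps halving a number until it drops below 1 and counts how many halvings that took. The starting value and variable names are made up for this example.

```python
value = 100.0   # hypothetical starting value
halvings = 0

# Keep halving until the value drops below 1
while value >= 1:
    value = value / 2
    halvings += 1

print(halvings, value)
```

The loop is guaranteed to finish because each pass moves `value` closer to the point where the condition becomes False.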


In [43]:
# a while loop that never ends
number_adder = 0
threshold = -10
while number_adder >= threshold:
    number_adder += 1
    
print(number_adder*2)

KeyboardInterrupt: 

I used the 'keyboard interrupt' button to stop this as I know that it will run infinitely - can you figure out why it would never exit?

# Functions
Functions take an input and produce an output. They can be mathematical, or work on strings, or do any number of operations. They are especially useful for tasks that we wish to perform multiple times as they reduce the amount of code that we need to write. They can contain conditional flow, iteration loops, and any other operation that we've covered so far (plus more).

Coming back to our first 50 square numbers problem, I could write 50 lines of code that computes the square for each number, or write a for loop, or I could write a short function and apply it to my set of numbers.

In [44]:
def squareFunction(last_number):
    '''
    A function to compute the squares of the whole numbers from 0 up to,
    but not including, last_number
    '''
    
    number_list = []
    for x in range(last_number):
        number_list.append(x ** 2)
    
    return number_list

In [45]:
squareFunction(50)

[0,
 1,
 4,
 9,
 16,
 25,
 36,
 49,
 64,
 81,
 100,
 121,
 144,
 169,
 196,
 225,
 256,
 289,
 324,
 361,
 400,
 441,
 484,
 529,
 576,
 625,
 676,
 729,
 784,
 841,
 900,
 961,
 1024,
 1089,
 1156,
 1225,
 1296,
 1369,
 1444,
 1521,
 1600,
 1681,
 1764,
 1849,
 1936,
 2025,
 2116,
 2209,
 2304,
 2401]

With 4 lines of code and a definition I've computed the first 50 square numbers - much more efficient than writing out 50 lines of code, one for each number. Although this is slightly longer than the for loop I now have a function that I can use to compute the first N square numbers, where N is _any_ number. For example:

In [46]:
squareFunction(2)

[0, 1]

In [47]:
squareFunction(25)

[0,
 1,
 4,
 9,
 16,
 25,
 36,
 49,
 64,
 81,
 100,
 121,
 144,
 169,
 196,
 225,
 256,
 289,
 324,
 361,
 400,
 441,
 484,
 529,
 576]
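Functions can also contain conditional flow, as mentioned at the start of this section. The sketch below is a made-up example (the function name `classifyNumber` and its arguments are not part of Python) that takes two inputs, makes a decision, and returns a single label.

```python
def classifyNumber(number, threshold):
    '''
    Return a label saying how number compares with threshold
    '''
    if number < threshold:
        label = "below"
    elif number == threshold:
        label = "equal"
    else:
        label = "above"

    return label


# Apply the function to several values using a for loop
labels = []
for value in [5, 100, 250]:
    labels.append(classifyNumber(value, 100))

print(labels)
```

Writing the decision once inside a function means it can be reused with any number and any threshold.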

### Exercise break
How would you modify the <span style="color:blue">squareFunction()</span> to be able to compute _any_ number to _any_ power?

## Methods
Methods are special types of functions that only work on the data type they are defined for. We saw this above with the <span style="color:blue">.append()</span> function which adds to the end of a list. When you tried this on a tuple you got an <span style="color:red">AttributeError</span> which stated that the .append function hasn't been defined for a tuple.

Different data types have different built-in methods. The ones you will meet most often belong to lists, strings, dicts, tuples, sets and files. Some of these work in-place, i.e. they change the variable directly without returning a new version of the data type, while others return new values - refer to this [Python reference](https://www.w3schools.com/python/python_reference.asp) to find out which methods do which.

A few examples of useful methods are given below, try playing around with these and others on different data types.

In [48]:
"This string ends with a !".endswith("!")

True

In [49]:
"two hundred".isnumeric()

False

In [50]:
"200".isnumeric()

True

In [51]:
string_list = ["Mike", "is", "an", "awesome", "programmer"]
join_my_string = " ".join(string_list) # remember how we did this with a for loop earlier?
join_my_string

'Mike is an awesome programmer'

In [52]:
join_my_string.split(" ") # split and join can reverse each other's operations if you use the same separator/joiner

['Mike', 'is', 'an', 'awesome', 'programmer']

In [53]:
"a lower case sentence".upper()

'A LOWER CASE SENTENCE'

In [54]:
# You can chain methods together if each one returns the type that the next one works on
"a lower case sentence".upper().lower()

'a lower case sentence'

In [55]:
string_list.reverse()
string_list

['programmer', 'awesome', 'an', 'is', 'Mike']

In [56]:
first_word = string_list.pop() # remove the last element in-place and return it
first_word

'Mike'
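The difference between methods that work in-place and methods that return a new value is easy to miss. This is a minimal sketch using `list.sort()`, `str.upper()` and the built-in `sorted()` function; the variable names are made up for illustration.

```python
numbers = [3, 1, 2]

# list.sort() works in-place: the list changes and the method returns None
result = numbers.sort()
print(numbers, result)

# str.upper() returns a new string; the original is left untouched
greeting = "hello"
shouted = greeting.upper()
print(greeting, shouted)

# sorted() is a built-in function (not a method) that returns a new sorted list
original = [3, 1, 2]
new_list = sorted(original)
print(original, new_list)
```

A common mistake is writing something like `numbers = numbers.sort()`, which replaces the list with `None`.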

### Exercises
1) What happens if you keep applying the .pop() function to a list? What happens when you get to the end of the list?
2) Can you find the method to count the number of occurrences of an element in a list? Try this with different lists and different elements.
3) The .append method adds to the end of a list, which method do you think glues two lists together?

# Final exercise
You have been given an mRNA sequence from a bacterial gene. Write a function or working code that will return the reverse complement of this mRNA sequence.

Can you then convert it into a DNA sequence, knowing what you do about which nucleotides are found in DNA compared to RNA?

Bonus exercise: can you write a function or working code to translate the original mRNA into its amino acid sequence?
