# Aims and objectives

Learning how to fully use a programming language, especially for the first time, can easily take several months. Indeed, the Python programming language is rather vast, and fully covering it would never be achievable within the timeframe allowed by this course. Instead, this tutorial aims to offer a first impression of the basic concepts of Python programming. It is our hope that this will serve as an entry point towards further studies in both Python and other programming languages.
The main objectives this tutorial covers are:
1.	Understanding how to execute Python code using Jupyter notebooks


2.	Understanding the basic Python object types


3.	Understanding how to use operators to assign and manipulate variables


4.	Gaining basic insights into built-in Python functions and how to import external modules


5.	Understanding how to use loops and conditionals to perform repetitive tasks


# Python programming - what and why?

The Python programming language is a general-purpose, interpreted, object-orientated programming language which has existed for over 30 years. Originally developed by Guido van Rossum, the language has matured over the years into an incredibly large ecosystem with millions of developers. Python is now under the care of the Python Software Foundation and usually has yearly releases, comprising new features and bug fixes, with the latest release at the time of writing being Python 3.10. More information on Python and its development process can be found here: https://www.python.org/
This short course concentrates solely on the use of the Python programming language. One could rightly ask why Python and not one of the hundreds of other programming languages that currently exist?
There are several reasons as to why this choice was made:

1.	Python has over recent years become a commonly used tool by scientists looking to rapidly create, explore and analyse data. Many existing scientific tools either exist purely in Python or offer Python interfaces (for example [AlphaFold](https://github.com/deepmind/alphafold) or the biomolecular viewer [ChimeraX](https://www.cgl.ucsf.edu/chimerax/)). It has therefore become important for many scientists to have a working understanding of how to use basic Python commands.


2.	Python is very flexible and easy to play with. In part due to its nature as an `interpreted language` (unlike languages such as C++ and Fortran, which are compiled languages that must be passed through a compiler before a computer can understand the written code), it is easy to make modifications to code and directly see how they affect outcomes. This does come at the cost of performance, with Python being slower than compiled languages; however, there are many ways in which savvy programmers can work around this limitation.


3.	Python is portable, free to use and open-source. Unlike programming languages such as Matlab, it does not require a paid license to operate. It also works equally well on any operating system, be it Linux, Windows or macOS.


4.	Python comes with an incredibly large ecosystem with many software packages to extend its usage. We unfortunately won’t be touching upon this in this tutorial, but there are many ways to extend Python, with tools such as NumPy for scientific computing (https://numpy.org/), scikit-learn for machine learning (https://scikit-learn.org/stable/), and matplotlib for drawing figures (https://matplotlib.org/). These massively expand the base utility of the Python programming language, allowing for complex tools and packages to be written.


That being said, this tutorial does not aim to advocate for one programming language over any other. Indeed, each programming language has its own advantages and disadvantages based on the intended use case. With most programming languages having very similar core components and concepts, we hope that this primer on Python programming will prove useful for whichever programming language you may end up using in the future.


# License and Acknowledgements

These tutorial materials are licensed under a [CC BY-NC 4.0 license](./README.md). The tutorial materials are primarily based on the contents of two other workshops, the [Oxford Computational Biochemistry Python Course](https://github.com/bigginlab/OxCompBio/tree/master/tutorials/Python) and [The Carpentries “Programming with Python” workshop](https://swcarpentry.github.io/python-novice-inflammation/). We thank and acknowledge the contributions of the authors from both of these workshops.

# How to use this notebook

In order to better demonstrate how the Python programming language works, we use Jupyter notebooks in this tutorial. [Jupyter notebooks](https://jupyter.org/) are a useful way to interactively demonstrate Python code in a clear and organised manner, whilst avoiding the need to write out files. They also allow us to run Python in the cloud using tools such as Google Colab or Binder (see “Running the notebook from home” below). In practice, one may need to write code to a file and run it directly through the command line. How to do this will not be covered by this tutorial, but do ask a demonstrator for more information if you are interested!
Jupyter notebooks are composed of a series of `cells` which can either contain text (such as this one) or code (which will usually have the words `In [ ]:` or just `[ ]:` next to them). You can edit a cell by clicking on it and typing whatever changes you want to make. Code cells can be used to run any valid Python code. To do this, simply click on the cell and press the `Run` button on the toolbar above, or press `Shift + Enter`.
Important note: All code executed is retained in the memory of the notebook. That means that if you `import` a module or declare a variable, it will be visible to, and can interact with, code in cells executed later in other parts of the notebook. If things behave in a way that you do not expect, it is often worth checking that you did not declare a conflicting variable in another part of the notebook. If need be, the notebook can be cleared by navigating to `Kernel > Restart & Clear Output`. A further prompt will ask you if you wish to clear all outputs; confirming this will return the notebook to a state where none of the cells have been run. When finished, a notebook should be shut down by clicking on `File > Close and Halt`.
More information on using a Jupyter notebook (recommended pre-reading for the tutorial) can be found here: https://www.codecademy.com/article/how-to-use-jupyter-notebooks


# Running the notebook from home

Instructions on starting the tutorial notebook in the Biochemistry computer lab will be provided on the day. However, should you wish to run this notebook online from the convenience of your own computer, you can do so using either Google Colab or mybinder. Our recommendation would be to use Google Colab if possible, as it is directly supported by Google infrastructure and therefore runs a lot more reliably than mybinder (which operates on community donated server instances). That being said, Google Colab has a slightly different interface from conventional Jupyter notebooks, and therefore requires you to read through its [“getting started documentation”](https://colab.research.google.com/?utm_source=scs-index#scrollTo=-gE-Ez1qtyIA).

__Links can be found here__

Google colab notebook: https://colab.research.google.com/github/bigginlab/TB5-IntroductionToPython/blob/main/TB5%20-%20Introduction%20to%20Python%20Programming.ipynb

Google colab getting started documentation: https://colab.research.google.com/?utm_source=scs-index#scrollTo=-gE-Ez1qtyIA

Mybinder notebook: https://mybinder.org/v2/gh/bigginlab/TB5-IntroductionToPython/HEAD

# 1. A first Python command

Let’s start things off by executing our first Python command. In the cell below we will use `print` to write out “Hello World”. Try it out by clicking on the cell and pressing `Run` in the toolbar above.

In [None]:
# A first Python command
print("Hello World")

There is a lot happening here which we will cover in later parts of the tutorial. In the first instance, let us just consider that `print` is a function which is offered by default in Python to print out text.
The `print` function takes an input value (in this case some text, in quotation marks) and prints the text output (without quotation marks). One should note that Python is case-sensitive, i.e. we can’t use `Print` or `PRINT`.

Despite its simplicity, `print` is a very powerful way to get a direct output from code, allowing us to interrogate how the code is progressing and see what values are being produced.

### A note on Python comments
You will have noticed in the above code block the presence of the words `# A first Python command` and how they do not affect the notebook output. In Python, anything written after a `#` is considered to be a “comment”, that is to say that it is ignored by the Python interpreter and exists solely for the convenience of annotating code. It is generally considered good practice to comment your code in order to make it more readable for yourself and others. In this tutorial you will see that we often place comments in code blocks in order to better communicate what should be happening.

### Question 1
In the cell below, use `print` to output the phrase “This is the second command”. Copy the working code line in canvas as your answer.

In [None]:
# Question 1

# 2. Basic Python types

Python, being an object-orientated programming language, essentially works on the idea of manipulating various building blocks, which we call objects, in order to achieve a given outcome. Covering the philosophy of object-orientated programming and how objects work is beyond the scope of this tutorial. Here we will primarily look at the basic types of Python objects and how they work.
Whilst each object is created dynamically, they all have what we call a `type` which defines how that object behaves.
For example, the “Hello World” (introduced in the previous section) is a text based object, also known as a string.
Object types include:

1.	Integers (i.e. whole numbers, both positive and negative)


2.	Floats (i.e. decimal numbers)


3.	Strings (i.e. text)


4.	Boolean values (i.e. True or False)


## 2.1 Integers and floating points

Integers are essentially represented as whole numbers in Python. They require no extra treatment.

In [None]:
# print the number 10
print(10)

Similarly a float (or floating point number) is just represented as a decimal:

In [None]:
# print the float 10.5
print(10.5)

As with a calculator, you can perform arithmetic operations on integers and floats. You can add and subtract using the `+` and `-` operators respectively:

In [None]:
print(2 + 4)

In [None]:
print(5 - 3)

Note: the spacing between numbers and symbols usually does not matter, although using a single empty space around the operator does improve readability.

In [None]:
print(7-   10)

Multiplying and dividing can be achieved using `*` and `/` respectively.

In [None]:
print(2 * 5)

In [None]:
print(4 / 3)

As you can see above when dividing integers, a floating point type value will be returned. Similarly, when doing any operation between a float and integer, a float will be returned.

In [None]:
print(2.0 * 3)
print(4 - 1.0)
print(2 + 0.5)

In certain cases, you may wish to carry out integer division. This can be achieved using the `//` operator:

In [None]:
print(2 // 3)
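
As a small illustrative sketch (the numbers here are our own choice, not part of the original exercise), the cell below contrasts ordinary division with integer division. Note that `//` rounds down, which matters when negative numbers are involved.

```python
# Ordinary division always gives a float
print(7 / 2)     # 3.5

# Integer division discards the fractional part
print(7 // 2)    # 3

# With a negative number the result is rounded down, not towards zero
print(-7 // 2)   # -4
```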

### Question 2

There are other Python arithmetic operators not covered here, what does `**` do? Try it out below and answer on canvas.

In [None]:
# Question 2

### Question 3

What about `%`? Try it out below and answer on canvas.

In [None]:
# Question 3

## 2.2 Strings

As mentioned above, strings are essentially representations of text. This text is contained within quotes.

In [None]:
print("this is a string")

Failing to encapsulate the string within quotes will prevent the Python interpreter from knowing that it is a string.

In [None]:
# NBVAL_RAISES_EXCEPTION
## Note: ignore the above comment, this to allow us to test the notebook
print(this is not a string)

Question: do we have to use double quotes to indicate a string? Will single quotes (`''`) work as well?

Answer: yes, single quotes, double quotes, and even triple quotes (i.e. `'''something'''` or `"""something"""`) all tell Python that you want a string. Which one to use depends on what you are trying to do.
You can also use double quotes within single quotes and the other way around:

In [None]:
print("Single 'quotes' can be used within double quotes.")
print('Double "quotes" can be used within single quotes.')
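
The following short sketch, using example text of our own, shows two situations where the choice of quotes tends to matter: a string that spans several lines, and a string that contains an apostrophe.

```python
# Triple quotes allow a string to span more than one line
print("""This string
spans two lines.""")

# An apostrophe is fine inside a double-quoted string
print("It's a string")
```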

### Question 4

Like integers and floats, strings can be added and multiplied.
What happens if you multiply the string “key” by 3? Try it out below and log your answer on canvas.

In [None]:
# Question 4

### Question 5

There are some types of operations that don’t work on some objects. What happens if you try to instead divide “key” by 3? Try it out below and log your answer on canvas.

In [None]:
# Question 5

## 2.3 Booleans

The last basic Python object type is the Boolean. Named after George Boole, these objects are used to represent “truth” and take either the form `True` or `False`.

In [None]:
print(True)
print(False)

On their own Boolean objects have very little meaning, but combined with “relational operators” they can be very powerful tools.
For example to compare if a number is greater than another:

In [None]:
# The greater than operator
1 > 2

In general, there are 6 relational operators for comparisons:
1.	`>` (greater than)
2.	`<` (less than)
3.	`>=` (greater than or equal to)
4.	`<=` (less than or equal to)
5.	`!=` (not equal to)
6.	`==` (equal to)

Using these to compare objects will always return a Boolean object.

In [None]:
print("dog" != "cat")

In [None]:
print(1.5 == 2.5)
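
For reference, here is a brief sketch exercising each of the six relational operators in turn. The values compared are arbitrary examples chosen for illustration.

```python
print(3 > 2)           # True
print(3 < 2)           # False
print(2 >= 2)          # True
print(2 <= 1)          # False
print("dog" == "dog")  # True

# An integer and a float holding the same value compare as equal
print(1 != 1.0)        # False
```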

## 2.4 Seeing types

It is possible to interrogate the type of an object using the `type` command in Python.
This can be useful when trying to keep track of what is contained within a variable (see next section).

In [None]:
print(type(1.5))

You’ll notice that the output for the above is `<class 'float'>`. In Python, the `type` of an object and its `class` are the same thing.
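
As a quick sketch, the same check can be applied to the other basic types introduced in this section (the example values are our own):

```python
print(type(10))      # <class 'int'>
print(type("text"))  # <class 'str'>
print(type(True))    # <class 'bool'>
```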

### Question 6

What is the type of `print`? Try it out below and answer on canvas.

In [None]:
# Question 6

# 3. Variables

Up to this point we have purely looked at how we can directly interact with Python objects. This can be quite useful for basic arithmetic, but the practical applications are limited. To do anything useful with the data we have, we need to assign its value to a _variable_.
In Python, we can assign a value to a variable using the equals sign `=`. This is called an assignment statement and it associates a variable name (on the left of the equal sign) with a given object (on the right of the equal sign). For example, we could record the weight of a person by assigning the variable `weight_kg` as shown below:


In [None]:
weight_kg = 60
print(weight_kg)

With this, we can then do further manipulations (as we have done in the previous section):

In [None]:
weight_pounds = weight_kg * 2.20462
print(weight_pounds)

You can be as creative as you like when it comes to giving names to variables, however you need to keep in mind the following:
1.	You cannot have a space in a variable name (i.e. `many numbers` would not work)
2.	Variables are case-sensitive (i.e. `Pi` is different from `pi`)
3.	Variables cannot start with a number (i.e. `1number` is not acceptable but `numb3r` would be)

**Important note**: A very frequent mistake is to mix up the use of the `==` comparison operator with the variable assignment operator `=`. It is important to remember that they do different things.
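
A minimal sketch of the difference is shown below. The variable names `temperature_c` and `is_freezing` are hypothetical and used only for illustration.

```python
# A single = assigns the value 25 to the variable
temperature_c = 25

# A double == compares two values and gives back a Boolean
is_freezing = temperature_c == 0

print(temperature_c)  # 25
print(is_freezing)    # False
```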

### Question 7

What type does the variable `weight_pounds` above have? Try it out here and answer on canvas.

In [None]:
# Question 7

# 4. Lists

Sometimes it is useful to work with a collection of objects, arranged in a specific order, rather than keeping track of lots of variables. A `list` is one of the ways in which this can be achieved in Python.
Note: there are other types of collections available in Python, such as `sets`, but these will not be covered here.


## 4.1 Creating lists

A list is constructed by putting together a series of objects separated by commas and surrounded by square brackets. For example:

In [None]:
exampleList = [1, 4, 5, 3, 2]
print(type(exampleList))

Note: list items do not need to have the same type; `[1, 'one', 2, 'three']` is a perfectly acceptable list construction.

## 4.2 Accessing list items (aka indexing)

You can retrieve entries in a `list` by specifying the index of the entry in the `list`. This is done by putting the index in square brackets after the name of the `list`.
Counting in Python starts at zero, so to pick the first entry in a `list`, specify index 0.
e.g.


In [None]:
exampleList = [1, 4, 5, 3, 2]

# get the first entry
print(exampleList[0])

To pick out the third entry:

In [None]:
print(exampleList[2])

You can also use negative numbers as indices, which count backwards from the end of the list. This is useful if you don’t know how long your list is, but want to pick, for example, the second-to-last entry:

In [None]:
print(exampleList[-2])
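
To make the correspondence between negative and positive indices more concrete, here is a short sketch that re-uses the same `exampleList` values:

```python
exampleList = [1, 4, 5, 3, 2]

# Index -1 is the last entry, the same entry as index 4 in this list
print(exampleList[-1])  # 2
print(exampleList[4])   # 2

# Index -5 counts all the way back to the first entry
print(exampleList[-5])  # 1
```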

### Question 8

What happens if you try to pick out an entry that doesn’t exist, for example the tenth entry? Try it out below and answer on canvas.

In [None]:
# Question 8

## 4.3 Changing specific entries in a list

We can use list indexing to change the value of a particular entry in a list.
For example, to change the value of the second entry in `exampleList` we do:

In [None]:
exampleList[1] = 42

# print out the new values
print(exampleList)

## 4.4 Some other common list operations

Here are some other common list operations you can do.

### Finding the length of a list
The built-in `len` function can be used to find the length of a list:

In [None]:
print(exampleList)

# Get the length of the list
print(len(exampleList))

### Adding an extra item to a list
You can “append” to a list by using the `list.append` method:

In [None]:
print(exampleList)

# Let’s add a 9 to the list
exampleList.append(9)

# print out the new list contents
print(exampleList)

### Sorting a list

The built-in function `sorted` can be used to sort the elements in a list:

In [None]:
unsortedList = [9,2,7,4,1,8,12,6,4,1]

print(sorted(unsortedList))
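
One detail worth noting is that `sorted` returns a new list and leaves the original untouched. The sketch below illustrates this; the name `sortedList` is our own choice.

```python
unsortedList = [9, 2, 7, 4, 1, 8, 12, 6, 4, 1]

# Store the sorted copy in a new variable
sortedList = sorted(unsortedList)

print(sortedList)    # [1, 1, 2, 4, 4, 6, 7, 8, 9, 12]
print(unsortedList)  # [9, 2, 7, 4, 1, 8, 12, 6, 4, 1]
```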

### Summing a list

Similarly, the built-in function `sum` can be used to add up the elements in a list:

In [None]:
print(sum(unsortedList))

### Converting a string to a list
Strings can be considered to be a sequence of characters. It is therefore possible to convert a string into a list by calling `list` on it:

In [None]:
sequence = 'ACAATGCGATACGTATTTGCG'
sequence_list = list(sequence)
print(sequence_list)

### Slicing a smaller list from an existing list

This is more of an advanced feature and we don’t expect it to be of much use within the context of this tutorial. However, it is worth noting that you can “slice” the indices of a list. For example, if you wanted to get the 2nd to the 4th entries in the previous `sequence_list` you could do:

In [None]:
new_list = sequence_list[1:4]
print(new_list)

The list slicing construct is `list[start:end]` where `end` is the value of the last index we want + 1.
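
As a further sketch, either `start` or `end` can be left out, in which case the slice runs from the beginning or to the end of the list respectively. The short list `short_list` below is a hypothetical example.

```python
short_list = list("ACAATG")

# Entries at indices 1, 2 and 3 (index 4 is not included)
print(short_list[1:4])  # ['C', 'A', 'A']

# Leaving out the start gives everything up to, but not including, index 3
print(short_list[:3])   # ['A', 'C', 'A']

# Leaving out the end gives everything from index 3 onwards
print(short_list[3:])   # ['A', 'T', 'G']
```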

# 5. Using Functions

In previous sections we have been frequently using Python functions such as `print`, `sum`, `sorted`, and `len`. Here we provide a little bit more context on them.
Python functions are essentially named operations that usually take a set of inputs and return a value, in the form `f(x) -> y`.
Up until now we have only encountered cases where functions have taken a single input; however, there are some functions that take more than one. For example, the `pow` function raises its first input to the power of its second:


In [None]:
print(pow(10, 2))

Some functions also have arguments which are usually set by default but can be overridden. For example, the `sorted` function has an optional `reverse` argument which can allow you to reverse a sort.

In [None]:
unsortedList = [9,2,7,4,1,8,12,6,4,1]
reverse_sorted = sorted(unsortedList, reverse=True)
print(reverse_sorted)
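
A few more built-in functions follow the same patterns. The sketch below uses `round` and `max`, which are not otherwise covered in this tutorial, with example values of our own.

```python
# round takes the number to round and the number of decimal places to keep
print(round(3.14159, 2))  # 3.14

# max accepts several inputs and returns the largest
print(max(3, 8, 5))       # 8

# reverse is optional: leaving it out gives the default ascending order
print(sorted([3, 8, 5]))                # [3, 5, 8]
print(sorted([3, 8, 5], reverse=True))  # [8, 5, 3]
```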

You can inspect the type of arguments a function might take by calling `??` on the function in the following manner (click on the `x` to close the dialogue box after you have read the help message):

In [None]:
sorted??

The contents of the printed help documents will offer insights on what options can be used.

### Question 9

Which of the following is not an optional keyword argument of `print`?

1.	`file`

2.	`newline`

3.	`sep`

4.	`end`

Try it out below and answer on canvas.

In [None]:
# Question 9

# 6. The `for` loop

Sometimes when programming you might need to do repetitive tasks, for example performing a given operation on progressively larger numbers. Python offers loops as a way to make your life easier and your code much shorter. Here we will be specifically looking at the use of `for` loops. There are other loops, such as `while`, but they will not be covered in this tutorial.
`for` loops take an _iterator statement_ and loop over it until the iteration is exhausted. To demonstrate this, we use the `range` function to generate an iterator of numbers:

In [None]:
for number in range(1, 11):
    print(number)

Here `number in range(1, 11)` is the iterator statement which generates values of `number` progressively increasing from 1 to 10 as the loop is executed. The iterator statement is terminated by a colon `:` at its end.
*note*: the range is given as going to 11 but the loop stops at 10. This is something where Python is not very intuitive: the end value passed to `range` is not included in the values it generates.
The loop contents `print(number)` are specifically indented by 4 spaces. This tells the Python interpreter that anything indented in this way should be done each time `number` takes its next value.
Keeping track of spaces is very important! Failing to indent, or indenting with an inconsistent number of spaces, will lead to errors or unintended behaviour.
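
The effect of indentation can be sketched as follows. The variable name `total` is hypothetical; the indented lines run on every pass through the loop, whereas the final unindented line runs only once, after the loop has finished.

```python
total = 0

for number in range(1, 4):
    # Indented: executed for number = 1, 2 and 3
    total = total + number
    print("running total:", total)

# Not indented: executed once, after the loop has finished
print("final total:", total)  # final total: 6
```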


### Using `for` loops with lists

It is possible to use a `list` as part of the _iterator statement_ of a `for` loop.


In [None]:
myList = [1, 3, 4, 8, 2]
for item in myList:
    print(item)

As shown above, the statement `for item in myList` assigns the next `myList` entry to `item` at each iteration of the `for` loop. This then allows us to print out all the elements in `myList`.
It is possible to do more advanced things here, like creating new lists or appending new list items in a loop. For the sake of brevity these will not be covered here, but we do encourage you to play around with `for` loops and see what you can achieve.
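
As a starting point for such experiments, here is a minimal sketch of building a new list inside a loop. The name `doubledList` is our own, and doubling each entry is just one possible operation.

```python
myList = [1, 3, 4, 8, 2]

# Start from an empty list and append to it at each iteration
doubledList = []
for item in myList:
    doubledList.append(item * 2)

print(doubledList)  # [2, 6, 8, 16, 4]
```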

# 7. `if` conditionals

Decision making plays a key role in programming and Python provides the `if.. elif.. else` statement to achieve this.

## 7.1 The `if` statement

The `if` statement is used to check if a condition is fulfilled, in which case a task (or set of tasks) is executed. The following example shows how it could be used to check if a number is even:

In [None]:
number = 2

if number % 2 == 0:
    print("number is even")

The operator `%` is called the modulus and gives the remainder of an integer division (see section on object types). The _conditional expression_ follows directly after the word `if` and ends with a colon `:` (just as in the syntax for loops). The task(s) to be executed in case the condition is `True` are then provided (indented by 4 spaces, see previous section on `for` loops). In this case it is a print statement telling us that the number is even. Note that if the condition is not `True`, the task is simply not executed.

In [None]:
number = 3

if number % 2 == 0:
    print("number is even")

## 7.2 The `if.. else` statement

An `else` statement can be combined with an `if` statement. An `else` statement contains the block of code that is executed if the conditional expression in the `if` statement resolves as `False`.

In [None]:
number = 3

if number % 2 == 0:
    print("number is even")
else:
    print("number is odd")


If there is more than one condition you want to check, you can add a number of `elif` statements between the `if` and `else` for each additional condition you might want to check.

In [None]:
number = -2

if number > 0:
    print("number is positive")
elif number < 0:
    print("number is negative")
else:
    print("number is zero")

The first condition that evaluates to `True` will have its indented block of code executed; any conditions after it are skipped.
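
The order of the conditions therefore matters. In the sketch below (with hypothetical thresholds and a hypothetical variable `size`), the number 15 satisfies both of the first two conditions, but only the block belonging to the first one is run.

```python
number = 15

if number > 10:
    size = "large"
elif number > 5:
    # Also true for 15, but never reached because the first condition matched
    size = "medium"
else:
    size = "small"

print(size)  # large
```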

## 7.3 Combining `if` and `for` loops

It is possible to combine loops and conditionals by “nesting” them. For example, you could have a `for` loop that iterates through a set of numbers and then an `if` statement within it that checks what the numbers are:

In [None]:
for number in range(1, 20):
    if number % 2 == 0:
        print("number is even")
    else:
        print("number is odd")
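
A variation on this idea, sketched below, is to collect the values that pass the check rather than printing a message. The list name `even_numbers` is our own choice.

```python
even_numbers = []

for number in range(1, 11):
    # Only numbers that divide by 2 with no remainder are kept
    if number % 2 == 0:
        even_numbers.append(number)

print(even_numbers)  # [2, 4, 6, 8, 10]
```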

# 8. Imports

Up to this point we have purely shown you how to use the basic built-in functions and objects which are available in Python. As mentioned in `Python programming – what and why?` there are plenty of different extensions to Python which can be used to extend functionality and make your life easier.
How to install and use them is beyond the scope of this tutorial; however, it is worth briefly showing how extra functions can be imported into a Python session using the `import` statement.


## 8.1 Converting from degrees to radians

The basic formula for converting an angle from degrees to radians is:
`radians = degrees * pi / 180.0`
This can be easily done using the standard Python objects:

In [None]:
degrees = 25
radians = degrees * (3.141 / 180.0)
print(radians)

However, the above code is rather unwieldy and would be quite annoying to have to rewrite every time we wanted to do this conversion. Instead, the `math` library has a specific `radians` function which does this for us:

In [None]:
import math
print(math.radians(25))
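
The `math` library also provides a more precise value of pi as `math.pi`. As a brief sketch, the manual formula and the library function can be compared directly; the variable names `manual` and `library` are hypothetical.

```python
import math

degrees = 25

# The manual formula, this time using the library's value of pi
manual = degrees * math.pi / 180.0

# The dedicated library function
library = math.radians(degrees)

print(manual)
print(library)

# isclose checks that two floats agree to within a small tolerance
print(math.isclose(manual, library))  # True
```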

## 8.2 A whole world of libraries

As mentioned above there are many extensions to the base Python behaviour.
The standard Python distribution offers a limited number of libraries, termed the "standard library". Details about them can be found here: https://docs.python.org/3/library/

This includes the `math` library mentioned above.

Beyond the "standard library", there are also many amazing community driven packages, here is a non-exhaustive list of them:
-	NumPy (http://www.numpy.org/) and SciPy (https://www.scipy.org/) are excellent libraries that do a lot more than simple mathematical operations (like the degrees to radians conversion above). These also interface very well with plotting libraries.

-	Matplotlib (https://matplotlib.org/) is the most commonly used plotting library available. It is simple, and can do a lot. Other plotting libraries include GGPlot (http://ggplot.yhathq.com/), SeaBorn (https://seaborn.pydata.org/), and Bokeh (https://bokeh.pydata.org/en/latest/).

-	Pandas (https://pandas.pydata.org/) is a great library to manipulate data structures. Its ease of use makes it ideal to work with large data sets.

-	MDAnalysis (https://www.mdanalysis.org/), MDTraj (http://mdtraj.org/1.9.0/), and PyTraj (https://github.com/Amber-MD/pytraj) are some of the libraries used to process Molecular Dynamics trajectories and other files. These interface very well with NumPy, Pandas and Matplotlib.

-	SciKit-Learn (http://scikit-learn.org/stable/) for machine learning in Python.

-	MPI4Py (http://mpi4py.readthedocs.io/en/stable/) allows Python scripts to be parallelised (run over multiple processors).

-	Cython (http://cython.org/) allows you to write parts of your code in C/C++, making it very fast.

Each library has its own means of installation; however, keeping track of various installs and avoiding clashes between libraries can be quite tedious. We recommend that a package manager like conda be used instead.
See here for more details about conda: https://www.anaconda.com/distribution/

# 9. Review and tying it all back together

In this tutorial we covered the following:
1. The basic Python types; integers, floats, strings and booleans
2. Arithmetic and relational operators
3. Assigning variables
4. Creating and using lists
5. Using functions
6. Using `for` loops
7. Using `if` conditionals
8. Importing extensions to Python

Finally, we tie most of what we have learnt together into one final question.

### Question 10

Below is a string containing a sequence of single letter amino acids for a construct of Thrombin. Take this string, turn it into a list and count the number of glutamic acid residues which are present:

Hint: You can increase the value of a variable by 1 by doing `variable_name += 1`.

In [None]:
# Question 10

sequence = "IVEGSDAEIGMSPWQVMLFRKSPQELLCGASLISDRWVLTAAHCLLYPPWDKNFTENDLLVRIGKHSRTRYERNIEKISMLEKIYIHPRYNWRENLDRDIALMKLKKPVAFSDYIHPVCLPDRETAASLLQAGYKGRVTGWGNLKETWTANVGKGQPSVLQVVNLPIVERPVCKDSTRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFVMKSPFNNRWYQMGIVSWGEGCDRDGKYGFYTHVFRLKKWIQKVIDQFGE"

# 10. Beyond this tutorial

This tutorial has offered a very small view into the world of Python. Should you be interested in learning more about Python, the following tutorials might be of interest:

1.	The Oxford Computational Biochemistry Python course: https://github.com/bigginlab/OxCompBio/tree/master/tutorials/Python


2.	The Codecademy Python tutorial: https://www.codecademy.com/learn/learn-python


3.	The Carpentries Python workshop: https://swcarpentry.github.io/python-novice-inflammation/

We would also suggest looking at how Python is used in practice to achieve real research goals. For example the following notebooks / workshops created by members of our lab might be of interest:

1.	Calculating binding free energies using the OpenMM / Open Free Energy framework: https://github.com/OpenFreeEnergy/ExampleNotebooks/blob/master/openmm-rbfe/OpenFE_showcase_1_RBFE_of_T4lysozyme.ipynb


2.	Analysing simulations using the MDAnalysis framework: https://github.com/MDAnalysis/WorkshopPrace2021
