# An introduction to solving biological problems with Python - day two

- Our course webpage: http://pycam.github.io
- Python website: https://www.python.org/ 
- Python docs: [Python 3 Standard Library](https://docs.python.org/3/library/index.html)

## Learning objectives - day two

- **Recall** what we've learned so far on variables, common data types and collections
- **Propose and create** solutions using these concepts in an exercise
- **Use** conditions to execute a specific code block
- **Employ** loops to repeat a code block
- **Practice** reading and writing files with Python
- **Solve** more complex exercises

## Course schedule - day two

- 09:30-09:45: [0h15] **Introduction**
- 09:45-10:45: [1h00] **Session 2.1** - Conditional execution
- 10:45-11:00: *break*
- 11:00-12:30: [1h30] **Session 2.2** - Loops
- 12:30-13:30: *lunch break*
- 13:30-15:00: [1h30] **Session 2.3** - Files
- 15:00-15:15: *break*
- 15:15-16:15: [1h00] **Session 2.4** - Delimited files
- 16:15-16:30: *break*
- 16:30-17:00: [0h30] **Wrap-up**

## What we've learned so far

- Simple data types, Collections
- Functions used so far...

## Simple data types

In [1]:
## Integer
i = 1
print('Integer:', i)
## Float
x = 3.14
print('Float', x)
## Boolean
print(True)

Integer: 1
Float 3.14
True


In [2]:
## String
s0 = '' # empty string
s1 = 'ATGTCGTCTACAACACT' # single quotes
s2 = "spam's" # double quotes
print(s1 + s2) # concatenate
print(s1, s2) # print

ATGTCGTCTACAACACTspam's
ATGTCGTCTACAACACT spam's


## Collections

In [3]:
## Tuple - immutable
my_tuple = (2, 3, 4, 5)
print('A tuple:', my_tuple)
print('First element of tuple:', my_tuple[0])

A tuple: (2, 3, 4, 5)
First element of tuple: 2


In [4]:
## List
my_list = [2, 3, 4, 5]
print('A list:', my_list)
print('First element of list:', my_list[0])
my_list.append(12)
print('Appended list:', my_list)
my_list[0] = 45
print('Modified list:', my_list)

A list: [2, 3, 4, 5]
First element of list: 2
Appended list: [2, 3, 4, 5, 12]
Modified list: [45, 3, 4, 5, 12]


In [5]:
## String - immutable, tuple of characters
text = "ATGTCATTT"
print('Here is a string:', text)
print('First character:', text[0])
print('Slice text[1:3]:', text[1:3])
print('Number of characters in text', len(text))

Here is a string: ATGTCATTT
First character: A
Slice text[1:3]: TG
Number of characters in text 9


In [6]:
## Set - unique unordered elements
my_set = set([1,2,2,2,2,4,5,6,6,6])
print('A set:', my_set)

A set: {1, 2, 4, 5, 6}


In [7]:
## Dictionary
my_dictionary = {"A": "Adenine", 
                 "C": "Cytosine", 
                 "G": "Guanine", 
                 "T": "Thymine"}
print('A dictionary:', my_dictionary)
print('Value associated to key C:', my_dictionary['C'])

A dictionary: {'A': 'Adenine', 'C': 'Cytosine', 'G': 'Guanine', 'T': 'Thymine'}
Value associated to key C: Cytosine


## Functions used so far...

In [8]:
my_list = ['A', 'C', 'A', 'T', 'G']
print('There are', len(my_list), 'elements in the list', my_list)
print('There are', my_list.count('A'), 'letter A in the list', my_list)
print("ATG TCA CCG GGC".split())

There are 5 elements in the list ['A', 'C', 'A', 'T', 'G']
There are 2 letter A in the list ['A', 'C', 'A', 'T', 'G']
['ATG', 'TCA', 'CCG', 'GGC']


# Research before writing
Or how to avoid writing code that's already been written
<center><img src="img/research_before_writing.png" width="400"></center>

- In the same way that before carrying out an experiment you would review the available literature, it is sensible to research if there are resources that will make it easier for you to write your program.


- Python has **MANY** libraries of functions (called modules), which can be easily installed with the **pip** command and then imported into your code. Look at [https://pypi.org](https://pypi.org) for a searchable index.

<center><img src="img/pypi-logo.svg" width="160"></center>

- Look on https://github.com for examples of code from authors tackling similar problems. It is a useful place to learn programming tricks, although you will find both good and bad code on display there.
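
As a small illustration of reusing existing code, the sketch below counts the bases in a sequence with `collections.Counter` from the standard library instead of hand-written counting code. The sequence and variable names are made up for this example; a third-party module would first need to be installed with **pip** before it could be imported in the same way.

```python
from collections import Counter

# Hypothetical DNA sequence, for illustration only
sequence = "ATGTCGTCTACAACACT"

# Counter tallies every character of the string in a single call
base_counts = Counter(sequence)

print("Number of A:", base_counts["A"])
print("Number of G:", base_counts["G"])
print("All counts:", dict(base_counts))
```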


**Read the Documentation**

- Python has extensive online documentation at https://docs.python.org/. This is for Python 3.x. There are pages for Python 2.x as well.
- There is an extensive tutorial at: https://docs.python.org/3/tutorial/index.html

**Online Fora**

Useful for searching, but if you wish to post a query, please follow the site’s ‘netiquette’ and check that the question hasn’t been asked before.
- https://stackoverflow.com – mostly programming but not just Python
- https://www.biostars.org - Bioinformatics Q&A with some programming questions
- http://seqanswers.com - Similar to Biostars



**Measure twice, cut once**

Don’t rush to code. Plan & describe what your program will do and what it needs (Specification).

- Can just write a few paragraphs of text into a `README.md` file
- Or draw a flowchart e.g. https://www.draw.io/ 

<center><img src="img/better-prog.png" width="300"></center>

- Try to anticipate user responses (PEBKAC – Problem Exists Between Keyboard And Chair) and handle the resulting errors
- When writing anything beyond trivial programs, try a bottom-up approach, i.e. write and test components before including them in your main program
- Learn to use **pip** and you will then be able to use a vast library of Python modules – this will save you writing code that reinvents the wheel
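
A minimal sketch of the first two points: a small component that anticipates invalid user input, tested on its own before it goes into a larger program. The function name `parse_sequence_length` and the rule that a length must be a positive whole number are assumptions made for this example. It also uses functions and `try`/`except`, which this course has not covered at this point.

```python
def parse_sequence_length(user_input):
    """Convert user input to a positive integer, or return None if it is invalid."""
    try:
        length = int(user_input)
    except ValueError:
        # The user typed something that is not a whole number
        return None
    if length <= 0:
        # Zero or negative lengths make no sense for a sequence
        return None
    return length

# Test the component on its own before using it in the main program
print(parse_sequence_length("12"))
print(parse_sequence_length("twelve"))
print(parse_sequence_length("-3"))
```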

## What to do when the program doesn’t work?
Zen and the art of debugging

**We ALL make mistakes when we code...**

[Nov. 10, 1999: Metric Math Mistake Muffed Mars Meteorology Mission](https://www.wired.com/2010/11/1110mars-climate-observer-report/): A disaster investigation board reports that NASA's Mars Climate Orbiter burned up in the Martian atmosphere because engineers failed to convert units from English to metric. The $125 million satellite was supposed to be the first weather observer on another world. But as it approached the red planet to slip into a stable orbit Sept. 23, the orbiter vanished. Scientists realized quickly it was gone for good. 

|   |   |
| - | - |
| <img src="img/mars.jpg" width="200"> | 1 Pound (force) = 4.44822 Newtons |


- Try things like printing (or logging) the contents of variables – does the output match what we expected?
  - Can get very tedious when a lot of variables & data structures are involved
- Try writing values to a text file
  - Can output many variables – still somewhat tedious to inspect
- Track which bits of code are being executed and in what order by printing messages
  - Better but can get messy and doesn’t scale well
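
The printing and logging approach can be sketched with the standard `logging` module, which lets you switch the messages off later by raising the level instead of deleting `print` calls. The sequence, variable names and message format below are choices made for this example.

```python
import logging

# Show messages of level DEBUG and above; the format is a choice made for this example
logging.basicConfig(level=logging.DEBUG, format="%(levelname)s: %(message)s")

sequence = "ATGTCATTT"  # hypothetical test data
gc_count = sequence.count("G") + sequence.count("C")
logging.debug("sequence=%s gc_count=%d", sequence, gc_count)

gc_fraction = gc_count / len(sequence)
logging.debug("gc_fraction=%.3f", gc_fraction)
```

Passing `filename="debug.log"` (a hypothetical file name) to `logging.basicConfig` would write the same messages to a text file instead of the screen.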


- Using a debugger tool (Read [*'How To Use the Python Debugger'*](https://www.digitalocean.com/community/tutorials/how-to-use-the-python-debugger))
  - Many programming languages have tools like this
  - They basically ‘step’ through your code stopping at ‘breakpoints’ and displaying a dump of the values of the variables
  - They can take time to learn to use and to get the best out of them
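
As a hedged sketch of how a breakpoint looks in practice: from Python 3.7 onwards, the built-in `breakpoint()` pauses the program and opens the standard debugger `pdb`. The function `gc_content` is made up for this example, and the call is commented out so that the code runs without stopping.

```python
def gc_content(sequence):
    """Return the fraction of G and C characters in a sequence (example function)."""
    gc_count = sequence.count("G") + sequence.count("C")
    # breakpoint()  # uncomment to pause here and inspect sequence and gc_count
    return gc_count / len(sequence)

print(gc_content("ATGTCATTT"))
```

Once paused at the `(Pdb)` prompt, `p gc_count` prints a variable, `n` runs the next line and `c` continues to the end or to the next breakpoint.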



- Other tools like the Online Python Tutor http://pythontutor.com/visualize.html#mode=edit
  - This is a web-based tool that visualizes what is going on in your program
  - Not suited to large or complex programs, so use it to test small pieces of code, e.g. individual modules #betterprogramming

<center><img src="img/python-tutor.png"></center>

**Ways of testing your code**

- Try to write test functions that call your code with test data and compare the result with a defined expected output: a match is a pass
- If you alter code to ‘improve’ it, then run the tests. If they do not pass then you’ve broken something!
- It is often useful if another package does the same calculations (e.g. ANOVA in R), so that you can run it alongside your code – do you get the same/similar results with the same data?
- Get another programmer to review your code (Many eyes principle). If nothing else, it will be a test of your documentation and code comments (You did do that, didn’t you?).
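
A test function of the kind described above might look like the following sketch. The helper `complement` and its test are hypothetical examples, not part of the course code.

```python
def complement(base):
    """Return the complementary DNA base (hypothetical helper for this example)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return pairs[base]

def test_complement():
    # Each assert compares the actual result with the defined expected output
    assert complement("A") == "T"
    assert complement("T") == "A"
    assert complement("C") == "G"
    assert complement("G") == "C"

# Run the test again after every change to complement()
test_complement()
print("All tests passed")
```

If a later change breaks `complement`, the failing `assert` raises an `AssertionError` and the final message is never printed.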

## Next session

Go to our next notebook: [python_basic_2_1](python_basic_2_1.ipynb)