In [None]:
from IPython.display import Image
from IPython.display import clear_output
from IPython.display import FileLink, FileLinks
import matplotlib.pylab as plt
import pandas        as pd
import os
import time

## Intro to Scientific Programming with:

![title](img/python-logo-master-flat.png)

#### - Day 1

### About me


<img src="img/claudio-mirabello.jpg" alt="Drawing" style="width: 220px;"/>

Claudio Mirabello, Ph.D.
* Principal Research Engineer, bioinformatics division
* Work at National Bioinformatics Infrastructure Sweden
* Background as computer engineer
* Expertise: machine learning, data science
* Italian 🤌
* Food motivated


### About me


<img src="img/claudio-mirabello.jpg" alt="Drawing" style="width: 220px;"/>

Claudio Mirabello, Ph.D.
* Visiting address: B-huset 2B:586
* E-mail: claudio.mirabello@liu.se
* Fridays 10-11AM only


### About the course

* Course website:
https://github.com/clami66/workshop-python/tree/0422/
* Two lectures per week, for a total of 3 hours
* Who knows how much we'll cover!
* One final project applying what you have learned to your research
* Please make sure you have installed conda and jupyter notebook
* Based on [NBIS course for bioinformatics](https://github.com/NBISweden/workshop-python/)

### About you

- Short personal introduction:
    - Name, department
    - What do you work on?
    - Why are you here?
    - ...

### What is programming?

Wikipedia:  

"Computer programming is the process of building and designing an executable computer program for accomplishing a specific computing task"

### What can we use it for?

Endless possibilities!  
- compute complex equations from raw measurements
- fit curves to your data
- plot your results
- automate your stuff -> no more manual mistakes!

## Why Python?

### Typical workflow

1. Get data
2. Clean, transform data in spreadsheet
3. Copy-paste, copy-paste, copy-paste
4. Run analysis & export results
5. Realise the columns were not sorted correctly
6. Go back to step 2, repeat


<img src="img/picard.jpg" alt="Drawing" style="width: 400px;"/>

<h2>Course content</h2>

- Core concepts about Python syntax: Data types, blocks and indentation, variable scoping, iteration, functions, methods and arguments  
- Different ways to control program flow using loops and conditional tests  
- (Regular expressions and pattern matching) 
- Writing functions and best-practice ways of making them usable  
- Reading from and writing to files  
- Using extra Python libraries:
    - Numpy for scientific computing
    - Matplotlib for plotting
    - pandas for table handling

<h2>Learning outcomes</h2>

After this course you should be able to:

- Describe and apply basic concepts in Python, such as:
    - Loops
    - If/else statements
    - Functions
    - Reading/writing to files
- Edit and run Python code
- Write file-processing Python programs that produce output to the terminal and/or external files
- Create stand-alone python programs to process your data
- Know how to develop your skills in Python after the course (including debugging)

## Some good advice

- 5 weeks to learn Python is not much
- Amount of information will decrease over days
- Complexity of tasks will increase over days
- Read the error messages!
- Save all your code

<u>How to seek help:</u>  
- Google
- Ask your neighbour
- Ask me (slow response time)



## Day 1

- Computer architecture
- Types and variables
- Operations
- Loops
- if/else statements

## Computer Architecture

<img src="img/Von_Neumann_Architecture.png"/>

(src: [wikipedia](https://en.wikipedia.org/wiki/Von_Neumann_architecture))

## Computer architecture

* Very important to keep in mind how memory (RAM) works
* (Will scribble a lot, good to take notes!)

<img src="img/array.png"/>

(src: [Geeksforgeeks](https://www.geeksforgeeks.org/how-to-copy-elements-of-an-array-in-a-vector-in-c/))

## Example of a simple Python script

In [None]:
# A simple loop that adds 2 to a number
i = 0
while i < 10:
    u = i + 2
    print('u is ' + str(u))
    i += 1

## Example of a simple Python script

<img src="img/simple_while_loop_comment.png" alt="Drawing" style="width: 400px;"/>

### Comment

All lines starting with # are interpreted by Python as comments and are not executed. Comments are important for documenting code and are considered good practice in all types of programming.

## Example of a simple Python script

<img src="img/simple_while_loop_literal.png" alt="Drawing" style="width: 400px;"/>


### Literals

All literals have a type:

- Strings (str) &emsp; &emsp; &nbsp;      'Hello' "Hi"
- Integers (int)	&emsp; &emsp;             5
- Floats (float)	&emsp; &emsp;             3.14
- Boolean (bool) &emsp; &nbsp;  True or False

### Literals define values

In [None]:
'this is a string'
"this is also a string"
3       # here we can put a comment so we know that this is an integer
3.14    # this is a float
True    # this is a boolean

type('this is a string')

### Collections

In [None]:
[3, 5, 7, 4, 99]       # this is a list of integers

('a', 'b', 'c', 'd')   # this is a tuple of strings
{'a', 'b', 'c'}        # this is a set of strings
{'a':3, 'b':5, 'c':7}  # this is a dictionary with strings as keys and integers as values

type([3, 5, 7, 4, 99])

### What operations can we do with different values?

That depends on their type:

In [None]:
'a string'+' another string'
#2 + 3.4
#'a string ' * 3
#'a string ' *3.4

<b>Type &emsp; &emsp; &emsp; &emsp;  Operations </b>

int &emsp; &emsp; &emsp; &emsp; &emsp;        +  -  *  /  **  %  // ...  
float &emsp; &emsp; &emsp; &emsp; &nbsp;      +  -  *  /  **  %  // ...  
string &emsp; &emsp; &emsp; &ensp; &nbsp;           + *
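
As a small illustration of the table above, the sketch below applies a few of these operators to each type. The variable names are made up for this example.

```python
# Arithmetic on integers
a = 7
b = 2
true_division  = a / b    # true division always gives a float: 3.5
floor_division = a // b   # floor division drops the remainder: 3
remainder      = a % b    # modulo gives the remainder: 1
power          = a ** b   # exponentiation: 49

# Mixing an int and a float gives a float
mixed = a + 0.5           # 7.5

# Strings only support + (concatenation) and * (repetition with an int)
joined   = 'rs' + '123'   # 'rs123'
repeated = 'ab' * 3       # 'ababab'

print(true_division, floor_division, remainder, power, mixed, joined, repeated)
```

Multiplying a string by a float, as in `'a string ' * 3.4`, raises a `TypeError`.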

### Example of a simple Python script


<img src="img/simple_while_loop_identifier.png" alt="Drawing" style="width: 300px;"/>

### Identifiers

Identifiers are used to identify a program element in the code. 

For example:  
- Variables
- Functions
- Modules
- Classes 

### Variables

Used to store values and to assign them a name.

Examples:  
- `i       = 0`
- `counter = 5`
- `snpname = 'rs2315487'`
- `snplist = ['rs21354', 'rs214569']`    

In [None]:
width  = 23564
height = 10

snpname = 'rs56483 '
snplist = ['rs12345','rs458782']

width * height

#### How to correctly name a variable



<img src="img/variable_name.png" alt="Drawing" style="width: 600px;"/> 

__Allowed: &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp;           Not allowed:__  
Var\_name  &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp;                  2save  
\_total &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &ensp;  \*important  
aReallyLongName  &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp;                           Special%  
with\_digit\_2 &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &emsp; &nbsp;          With  &nbsp; spaces  
dkfsjdsklut &emsp; _(well, allowed, but NOT recommended)_

__NO special characters:__  
\+ - * $ % ; : , ? ! { } ( ) < > " ' | \ / @
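
One way to check these rules yourself is the string method `isidentifier()`. The sketch below tests the names from the table above; the list names `candidates` and `valid` are only for illustration.

```python
# Candidate names taken from the table above
candidates = ['Var_name', '_total', 'aReallyLongName', 'with_digit_2',
              '2save', '*important', 'Special%', 'With spaces']

# str.isidentifier() returns True if the string is a syntactically valid name
valid = []
for name in candidates:
    print(name, '->', name.isidentifier())
    if name.isidentifier():
        valid.append(name)

print('Valid names:', valid)
```

Note that `isidentifier()` only checks the characters; it does not check whether the word is a reserved keyword.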

## Reserved keywords

<img src="img/python_keywords.png" alt="Drawing" style="width: 600px;"/> 


<b>These words cannot be used as variable names</b>
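
The list of reserved words can also be looked up from Python itself with the standard `keyword` module. A minimal sketch; the two test words are chosen only as examples.

```python
import keyword

# keyword.kwlist holds the reserved words of the Python version you are running
print(keyword.kwlist)

# keyword.iskeyword() tests a single word
is_while_reserved = keyword.iskeyword('while')   # True
is_width_reserved = keyword.iskeyword('width')   # False
print(is_while_reserved, is_width_reserved)
```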

## Summary

- Comment your code!
- Literals define values and can have different types (strings, integers, floats, booleans)
- Values can be collected in lists, tuples, sets, and dictionaries
- The operations that can be performed on a value depend on its type
- Variables are identified by a name and are used to store a value or a collection of values
- Name your variables using descriptive words, avoiding special characters and reserved keywords


## NOTE!

### How to get help?

- [Google](https://www.google.com/) and [Stack Overflow](https://stackoverflow.com/) are your best friends!
- Official [Python documentation](https://docs.python.org/3/)
- Ask your neighbour
- Ask me

## Python standard library

<img src="img/built-in_functions.png" alt="Drawing" style="width: 800px;"/> 



### Example `print()` and `str()`

<img src="img/simple_while_loop_functions.png" alt="Drawing" style="width: 400px;"/> 

__Note!__  
Here we convert everything to a string before printing it
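
The conversion matters because `+` cannot join a string and a number directly. A minimal sketch with made-up variable names; the `try`/`except` block and the f-string are not covered in these slides and are shown only to make the point visible.

```python
u = 4

# str() converts the integer to a string so that + can concatenate
message = 'u is ' + str(u)
print(message)

# Without str(), Python raises a TypeError
try:
    'u is ' + u
    failed = False
except TypeError:
    failed = True
print('Concatenating str and int failed:', failed)

# Alternative: an f-string does the conversion for you
message_f = f'u is {u}'
print(message_f)
```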

## Python standard library

<img src="img/built-in_functions.png" alt="Drawing" style="width: 700px;"/> 



In [None]:
width  = 5
height = 3.6
snps   = ['rs123', 'rs5487']
snp    = 'rs2546'
active = True
nums   = [2,4,6,8,4,5,2]

float(width)

## More on operations

<img src="img/operations.png" alt="Drawing" style="width: 600px;"/> 


In [None]:
x = 4
y = 3
z = [2, 3, 6, 3, 9, 23]
pow(x, y)

## Comparison operators

<img src="img/comparison_operator.png" alt="Drawing" style="width: 600px;"/> 

Can be used on int, float, str, and bool. Outputs a boolean.

In [None]:
x = 5
y = 3

y > x

## Logical operators

<img src="img/logical_operator.png" alt="Drawing" style="width: 600px;"/> 



## Membership operators

<img src="img/membership_operator.png" alt="Drawing" style="width: 600px;"/> 


In [None]:
x = 2
y = 3

x == 2 and y == 5

#x = [2,4,7,3,5,9]
#y = ['a','b','c']

#2 in x
#4 in x and 'd' in y

In [None]:
# A simple loop that adds 2 to a number and checks if the number is even
i    = 0
even = [2,4,6,8,10]
while i < 10:
    u = i + 2
    print('u is '+str(u)+'. Is this number even? '+str(u in even))
    i += 1

In [None]:
# A simple loop that adds 2 to a number and checks if the number is even and below 5
i    = 0
even = [2,4,6,8,10]
while i < 10:
    u = i + 2
    print('u is '+str(u)+'. Is this number even and below 5? '+\
          str(u in even and u < 5))
    i += 1

### Order of precedence

There is an order of precedence for all operators:

<img src="img/order_of_precedence.png" alt="Drawing" style="width: 600px;"/> 


### Word of caution when using operators

In [8]:
x = 5
y = 7
z = 2
x == 5 and y < 7 or z > 1

#x > 6 and (y == 7 or z > 1)

# `and` binds more tightly than `or`
#x > 4 or y == 6 and z > 3
#x > 4 or (y == 6 and z > 3)
#(x > 4 or y == 6) and z > 3

True

In [11]:
# BEWARE!
x = 5
y = 8

#xx == 6 or xxx == 6 or x > 2
x > 42 or (y < 7 and xx > 1000)

False

__Python does short-circuit evaluation of `and` and `or`: it stops evaluating as soon as the result is known__
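
A minimal sketch of what this means in practice. The name `undefined_name` is deliberately never defined, and the list `values` is made up for this example.

```python
x = 5

# The left side is True, so `or` never looks at the right side,
# even though `undefined_name` does not exist
result_or = x > 2 or undefined_name > 0

# The left side is False, so `and` never looks at the right side
result_and = x > 42 and undefined_name > 0

# Practical use: guard an index lookup on a list that may be empty
values = []
first_is_big = len(values) > 0 and values[0] > 100

print(result_or, result_and, first_is_big)
```

The flip side is that a typo on the right-hand side can stay hidden until the left-hand side changes value.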

## More on sequences <small><b>(For example strings and lists)</b></small>

Lists (and strings) are an ORDERED collection of elements where every element can be accessed through an index.

<img src="img/operations_on_sequences.png" alt="Drawing" style="width: 600px;"/> 
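
A short sketch of indexing and slicing, using the same list and string as this slide's code cell. The names on the left-hand side are only for illustration.

```python
l = [2, 3, 4, 5, 3, 7, 5, 9]
s = 'some longrandomstring'

third_item  = l[2]       # indexing starts at 0, so this is 4
first_seven = s[0:7]     # characters 0 to 6: 'some lo'
every_other = s[0:8:2]   # characters 0, 2, 4 and 6: 'sm o'
second_last = s[-2]      # negative indices count from the end: 'n'
length      = len(l)     # number of elements: 8

print(third_item, first_seven, every_other, second_last, length)
```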


In [17]:
l = [2,3,4,5,3,7,5,9]
s = 'some longrandomstring'

'o' in s

#l[2]
#s[0:7]
#s[0:8:2]
#s[-2]
#l[0] = 42
#l
#s[0] = 'S'

True

## Mutable vs Immutable objects

<br></br>

Mutable objects can be altered after creation, while immutable objects can't.


__Immutable objects:&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;Mutable objects:__  
- `int`    &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;  &bull;  `list`
- `float` &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&ensp;&ensp;&ensp;&ensp;  &bull;  `set`
- `bool` &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&ensp;&ensp;&ensp; &bull;  `dict`
- `str`
- `tuple`
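
The difference can be sketched as follows. The variable names are made up, and the `try`/`except` block is only there so the example runs to the end.

```python
# A list is mutable: the same object is changed in place
numbers = [1, 2, 3]
numbers[0] = 42
numbers.append(4)
print(numbers)

# A string is immutable: item assignment raises a TypeError
word = 'snp'
try:
    word[0] = 'S'
    string_changed = True
except TypeError:
    string_changed = False
print('String changed in place:', string_changed)

# To "change" a string, build a new one and reassign the name
word = 'S' + word[1:]
print(word)
```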


## Operations on mutable sequences



<img src="img/operations_on_mutable_sequences.png" alt="Drawing" style="width: 600px;"/> 

In [None]:
s = [0,1,2,3,4,5,6,7,8,9]
s.insert(5,10)
#s.reverse()
#s

## Summary

- The Python standard library has many built-in functions that are used regularly
- Operators are used to carry out computations on different values
- Three types of operators: comparison, logical, and membership
- Order of precedence is crucial!
- Mutable objects can be changed after creation, while immutable objects cannot

<br></br>
<br></br>


## Loops in Python

In [None]:
fruits = ['apple','pear','banana','orange']

print(fruits[0])
print(fruits[1])
print(fruits[2])
print(fruits[3])

In [None]:
fruits = ['apple','pear','banana','orange']

for fruit in fruits:
    print(fruit)

#    print('end')
#print('done')

__Always remember to INDENT your loops!__

### Different types of loops

### `for` loop

In [None]:
fruits = ['apple','pear','banana','orange']

for fruit in fruits:
    print(fruit)

### `while` loop

In [None]:
fruits = ['apple','pear','banana','orange']

i = 0
while i < len(fruits):
    print(fruits[i])
    i = i + 1

### Different types of loops

__`for` loop__

Is a control flow statement that performs a fixed operation over a known number of steps.

__`while` loop__

Is a control flow statement that allows code to be executed repeatedly based on a given Boolean condition.

<br></br>

__Which one to use?__

`for` loops are better for simple iterations over lists and other iterable objects

`while` loops are more flexible and can iterate an unspecified number of times
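
A minimal sketch of the two situations. The numbers and variable names are invented for this example.

```python
# for loop: the number of iterations is known (one per item in the list)
measurements = [4.0, 2.5, 8.1]
total = 0
for value in measurements:
    total = total + value
print('Sum:', total)

# while loop: we do not know in advance how many steps are needed
amount   = 100.0
halvings = 0
while amount >= 1:
    amount = amount / 2
    halvings += 1
print('Halvings needed to get below 1:', halvings)
```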




## Example of a simple Python script

<br></br>

<img src="img/simple_while_loop.png" alt="Drawing" style="width: 600px;"/> 

__&rarr; Notebook Day_1_Exercise_3  (~20 minutes)__

## Conditional `if/else` statements


<img src="img/if_else_statement.png" alt="Drawing" style="width: 600px;"/> 

In [None]:
shopping_list = ['bread', 'egg', 'butter', 'milk']

if len(shopping_list) > 5:
    print('Go shopping!')
else:
    print('Nah! I\'ll do it tomorrow!')

In [None]:
shopping_list = ['bread', 'egg', 'butter', 'milk']
tired         = False

if len(shopping_list) > 5:
    if not tired:
        print('Go shopping!')
    else:
        print('Too tired, I\'ll do it later')
else:
    if not tired:
        print('Better get it over with today anyway')
    else:
        print('Nah! I\'ll do it tomorrow!')

### This is an example of a nested conditional
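
Nested conditionals can often be flattened by combining conditions with logical operators. The sketch below is one possible rewrite of the shopping example; it uses `elif` ("else if"), which is not introduced in these slides, and the helper variables `long_list` and `message` are made up here.

```python
shopping_list = ['bread', 'egg', 'butter', 'milk']
tired         = False

# Store the first test in a variable so it can be reused
long_list = len(shopping_list) > 5

if long_list and not tired:
    message = 'Go shopping!'
elif long_list and tired:
    message = "Too tired, I'll do it later"
elif not tired:
    message = 'Better get it over with today anyway'
else:
    message = "Nah! I'll do it tomorrow!"

print(message)
```

Both versions make the same decisions; which one reads better is a matter of taste.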

## Putting everything into a Python script

Any longer piece of code that has been used and will be reused SHOULD be saved

Two options:
- Save it as a text file and make it executable
- Save it as a notebook file

### Things to remember when working with scripts

- Put `#!/usr/bin/env python` on the first line of the file
- Make the file executable to run it with `./script.py`
- Otherwise run the script with `python script.py`
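
As a sketch, a script file could look like the block below. The file name `script.py` is taken from the list above; the loop is a shortened version of the earlier example.

```python
#!/usr/bin/env python
# Contents of a file called script.py

# A simple loop that adds 2 to a number
i = 0
while i < 3:
    u = i + 2
    print('u is ' + str(u))
    i += 1
```

On Linux or macOS the file can be made executable with `chmod +x script.py`. On systems where the interpreter is only installed as `python3`, the first line may need to name `python3` instead.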

## Working on files

In [None]:
fruits = ['apple','pear','banana','orange']

for fruit in fruits:
    print(fruit)

<img src="img/fruits.png" alt="Drawing" style="width: 300px;"/> 

In [None]:
fh = open('../files/fruits.txt', 'r', encoding = 'utf-8')

for line in fh:
    print(line)

fh.close()

### Additional useful methods:
<br></br>

`'string'.strip()` &emsp; &emsp; &emsp; Removes leading and trailing whitespace  
`'string'.split()` &emsp; &emsp; &emsp; Splits on whitespace into list  

In [None]:
s  = '  an example string to split with whitespace in end   '
sw = s.strip()
sw
#l  = sw.split()
#l
#l  = s.strip().split('\t')
#l

<img src="img/fruits.png" alt="Drawing" style="width: 300px;"/> 

In [None]:
fh = open('../files/fruits.txt', 'r', encoding = 'utf-8')

for line in fh:
    print(line.strip())

fh.close()

### Another example

<img src="img/bank_statement.png" alt="Drawing" style="width: 300px;"/>

How much money is spent on ICA?

In [None]:
fh    = open("../files/bank_statement.txt", "r", encoding = "utf-8")

total = 0

for line in fh:
    expenses = line.strip().split()  # split line into list
    store    = expenses[0]           # save what store
    price    = float(expenses[1])    # save the price
    if store == 'ICA':               # only count the price if store is ICA
        total = total + price
fh.close()

print('Total amount spent on ICA is: '+str(total))  

### Slightly more complex...

<img src="img/bank_statement_extended.png" alt="Drawing" style="width: 400px;"/> 

How much money is spent on ICA in September?

In [None]:
fh    = open("../files/bank_statement_extended.txt", "r", encoding = "utf-8")

total = 0

for line in fh:
    if not line.startswith('store'):
        expenses = line.strip().split()
        store    = expenses[0]
        year     = expenses[1]
        month    = expenses[2]
        day      = expenses[3]
        price    = float(expenses[4])
        if store == 'ICA' and month == '09':   # store has to be ICA and month september
            total = total + price
fh.close()

out = open("../files/bank_statement_results.txt", "w", encoding = "utf-8")   # open a file for writing the results to
out.write('Total amount spent on ICA in september is: '+str(total))
out.close()

In [None]:
for file in os.scandir("../files/"):
    print(time.ctime(os.stat(file).st_mtime), '\t', file.name)

<img src="img/bank_statement_results.png" alt="Drawing" style="width: 400px;"/> 
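
The read, filter, and write pattern used above can also be written with `with` blocks, which close the file automatically even if an error occurs. This is a self-contained sketch, not the course's own solution: the file names and the three expense lines are made up, and a temporary folder is used so nothing in `../files/` is touched.

```python
import os
import tempfile

# Create a small example file so the sketch is self-contained
# (file names and contents are invented for this example)
folder   = tempfile.mkdtemp()
in_path  = os.path.join(folder, 'statement.txt')
out_path = os.path.join(folder, 'results.txt')

with open(in_path, 'w', encoding='utf-8') as out:
    out.write('store price\n')
    out.write('ICA 100.5\n')
    out.write('COOP 50\n')
    out.write('ICA 20\n')

# Read the file line by line; `with` closes it when the block ends
total = 0
with open(in_path, 'r', encoding='utf-8') as fh:
    for line in fh:
        if not line.startswith('store'):      # skip the header line
            expenses = line.strip().split()   # split line into list
            if expenses[0] == 'ICA':          # only count ICA
                total = total + float(expenses[1])

# Write the result to a new file
with open(out_path, 'w', encoding='utf-8') as out:
    out.write('Total amount spent on ICA is: ' + str(total) + '\n')

print(total)
```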

## Summary

- Python has two types of loops, `for` loops and `while` loops
- `for` loops can be used on any iterable object
- `if/else` statements are used to choose between actions depending on a condition that evaluates to a boolean
- Several `if/else` statements can be nested
- Save code as a notebook or as a text file to be run with Python
- The function `open()` can be used to read and write text files
- A text file is iterable, meaning it is possible to loop over the lines

__&rarr; Notebook Day_1_Exercise_4__