# Lecture 01: Variable assignment, Lists, Object Methods and Control Flow

## Chapters
Chapter 1: The Basics: Getting Started Quickly  <br>
Authors: Jurre Hagemane & Ronald Wedema

## Jupyter Notebook

Jupyter is a web application that allows you to combine text, live code, equations and visualizations in Jupyter notebooks. It is very powerful for data science because it lets you combine code with descriptions, figures, tables, etc., all in one document. This presentation was actually made as a Jupyter notebook:

![Jupyter notebook](figs/fig2.png)

All presentations and notebooks can be downloaded from:
https://github.com/rwedema/DSLS_PrepProgramming

You can clone the repo as follows:

`git clone https://github.com/rwedema/DSLS_PrepProgramming.git`

Git should be installed on your system. Alternatively, you can download the zip-file or download the individual notebooks.

## Microsoft Visual Studio Code

During the programming courses we will be working in **Microsoft Visual Studio Code**. 
This is an Integrated Development Environment (IDE).

You can start this on our Linux systems by clicking on: Applications Menu (top-left corner) -> Development -> Visual Studio Code.

Alternatively, open a terminal emulator (**console**) and type `code`.

![Microsoft Visual Studio Code](figs/startVisualCode.png)

To create a Python file (a file ending in **.py**) to work in, click **New File...**

Next click **File** and **Save As...**

Give the file a clear name (without spaces in the name!) that ends in .py

You can now work in this newly created Python script.

Traditionally, the first program you write in a new programming language prints the string **Hello world** to the screen.

So let's do that to show how to run a Python script.

Type in your script: `print("Hello world!")`

![Hello World Script](figs/HelloWorld.png)

## Assigning variables
In Python, assigning data means that you use the **assignment operator** (**=**) to store some data in a variable. As the word variable already implies, the content can change.

Here is an example of variable assignment:

In [139]:
x = 1
print(x)
print(type(x))


1
<class 'int'>


[Here](https://www.youtube.com/watch?v=aeoGGabJhAQ) you can find a video that explains
variables.

Information (data) comes in many forms, such as numbers, characters, words, pictures, sound.
In programming languages like Python, information is stored in variables. You can think of a variable as a box that has a label on it and stuff stored inside. You can find or use the information later by looking for the label of the box in which you stored it.
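
To make the box analogy a bit more concrete, here is a minimal sketch of a variable whose content changes over time. The name `temperature` and its values are made up purely for illustration.

```python
# Hypothetical variable, used only to illustrate reassignment
temperature = 20          # the label "temperature" now refers to the int 20
print(temperature, type(temperature))

temperature = 21.5        # the same label now refers to a float
print(temperature, type(temperature))

temperature = "warm"      # ... and now to a string
print(temperature, type(temperature))
```

Each assignment replaces what is stored under the label; the old content is no longer reachable through that name.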

Here are some more examples of variable assignment:

In [140]:
amino_acid = "alanine"

In [141]:
number_of_atoms = 13

In [142]:
mw = 89.09

What happens when I type:

In [143]:
amino_acid = "alanine"

- `amino_acid` is the name of the variable you have just created.
- The **assignment** operator (**=**) was used to assign the **string alanine** to the **variable amino_acid**.
- `"alanine"` is a literal: the value to be assigned to the variable amino_acid.

Python has several data types. We will cover
the following types in this lesson:
- Integers
- Floats
- Strings

## Integers

An integer (abbreviated int) is a whole number, positive or negative, without decimals, of unlimited length.

In [144]:
x = 3
y = -3

print(type(x))
print(type(y))

<class 'int'>
<class 'int'>


In the above example we not only assigned a whole number (**int**) using the assignment operator. We also used the Python built-in **type()** function to look up the type of each variable, and the built-in **print()** function to print that type to the screen.
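
The "unlimited length" of integers can be checked with a quick experiment. The sketch below is only an illustration; the variable name `big` is an assumption, and `len()` and `str()` are built-ins that are explained later in the course.

```python
# Python integers grow as large as needed; there is no fixed maximum
big = 2 ** 100
print(big)
print(type(big))

# Count the digits by converting the number to a string first
print(len(str(big)))
```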

## Floats

Floats represent real numbers and are written with a decimal point dividing the integer and fractional parts. Floats may also be in scientific notation, with E or e indicating the power of 10.
Some examples:

In [145]:
x = 2.5
y = 12E64

print(type(x))
print(type(y))

<class 'float'>
<class 'float'>
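
As a small additional sketch of the scientific notation described above (the variable names `a` and `b` are arbitrary), the number after `e` or `E` is the power of 10 and may also be negative:

```python
a = 1.5e3      # 1.5 * 10**3
b = 2E-3       # 2 * 10**-3

print(a)       # 1500.0
print(b)       # 0.002
print(type(a), type(b))
```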


## Basic calculations with Python

Of course, Python (being a programming language) supports calculations.  
Here are some examples:

In [146]:
x = 6
y = 3
print(x + y)
print(x - y)
print(x * y)
print(x / y)

9
3
18
2.0


Note that division with `/` changes the datatype from an integer to a float, even when the result is a whole number!

## Other calculations

Some other (though still basic) calculations might be a bit less obvious. For example exponentiation:

In [147]:
x = 3
print(x**3)

27


> You might be tempted to use ^ (just as in Excel), but this will NOT give the expected result. ^ is not covered here.
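
To see why `^` is risky for powers: in Python it is the bitwise exclusive-or (XOR) operator, so it returns a different number instead of raising an error. A minimal illustration:

```python
x = 3
print(x ** 3)   # exponentiation: 27
print(x ^ 3)    # bitwise XOR of 3 and 3: 0
print(2 ^ 3)    # binary 10 XOR 11 = 01: 1
```

Because no error is raised, this kind of mistake is easy to overlook.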

Floor division is another frequently used operator. It performs a normal division but returns the largest whole number that is less than or equal to the result (for positive numbers this chops off the decimal part). It is, for example, convenient for calculating how many whole hours there are in 134 minutes:

In [148]:
x = 134
print(x // 60)

2


The modulus operator (%) is another frequently used operator. It calculates the remainder after a division. For example 17 % 5 = 2. It is, for example, convenient when converting 134 minutes to hours:minutes notation:

In [149]:
x = 134
print( x % 60)

14


Putting both together to convert 134 minutes to hours:minutes notation:

In [150]:
x = 134
print(x // 60)
print(x % 60)

# bit of formatting you will learn later:

print(x // 60, ":", x % 60, sep="")

2
14
2:14
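
As an aside, Python's built-in `divmod()` function returns the floor division and the remainder in one step. The sketch below is one possible way to write the conversion; the variable names are assumptions, and the f-string formatting is a topic for a later lesson.

```python
minutes_total = 134

# divmod(a, b) returns the pair (a // b, a % b)
hours, minutes = divmod(minutes_total, 60)
print(hours, minutes)

# :02d pads the minutes with a leading zero when needed
print(f"{hours}:{minutes:02d}")   # 2:14
```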


## Getting help

Python comes with great help built in. You can use the `help` function to get help on functions, data objects, etc. Here is an example for help on the print function:

In [151]:
help(print)

Help on built-in function print in module builtins:

print(...)
    print(value, ..., sep=' ', end='\n', file=sys.stdout, flush=False)
    
    Prints the values to a stream, or to sys.stdout by default.
    Optional keyword arguments:
    file:  a file-like object (stream); defaults to the current sys.stdout.
    sep:   string inserted between values, default a space.
    end:   string appended after the last value, default a newline.
    flush: whether to forcibly flush the stream.



Here we only pass the name of the function (print) as an argument to the help function; we don't execute it. The following will therefore not give help on the print function, because `print()` is executed first and help then describes the value it returns (`None`):

In [152]:
help(print()) # not the same as the previous example. Will be explained in a future lesson.


Help on NoneType object:

class NoneType(object)
 |  Methods defined here:
 |  
 |  __bool__(self, /)
 |      self != 0
 |  
 |  __repr__(self, /)
 |      Return repr(self).
 |  
 |  ----------------------------------------------------------------------
 |  Static methods defined here:
 |  
 |  __new__(*args, **kwargs) from builtins.type
 |      Create and return a new object.  See help(type) for accurate signature.



But you can also get help on a self-declared variable:

In [153]:
x = 3
help(x)

Help on int object:

class int(object)
 |  int([x]) -> integer
 |  int(x, base=10) -> integer
 |  
 |  Convert a number or string to an integer, or return 0 if no arguments
 |  are given.  If x is a number, return x.__int__().  For floating point
 |  numbers, this truncates towards zero.
 |  
 |  If x is not a number or if base is given, then x must be a string,
 |  bytes, or bytearray instance representing an integer literal in the
 |  given base.  The literal can be preceded by '+' or '-' and be surrounded
 |  by whitespace.  The base defaults to 10.  Valid bases are 0 and 2-36.
 |  Base 0 means to interpret the base from the string as an integer literal.
 |  >>> int('0b100', base=0)
 |  4
 |  
 |  Methods defined here:
 |  
 |  __abs__(self, /)
 |      abs(self)
 |  
 |  __add__(self, value, /)
 |      Return self+value.
 |  
 |  __and__(self, value, /)
 |      Return self&value.
 |  
 |  __bool__(self, /)
 |      self != 0
 |  
 |  __ceil__(...)
 |      Ceiling of an Integral retur

The `dir` function is very convenient for exploring which `methods` and `properties` are associated with any Python object. This will be explained more thoroughly at a later stage, but let's just look at an example:

In [154]:
x = 10
dir(x)

['__abs__',
 '__add__',
 '__and__',
 '__bool__',
 '__ceil__',
 '__class__',
 '__delattr__',
 '__dir__',
 '__divmod__',
 '__doc__',
 '__eq__',
 '__float__',
 '__floor__',
 '__floordiv__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__index__',
 '__init__',
 '__init_subclass__',
 '__int__',
 '__invert__',
 '__le__',
 '__lshift__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__neg__',
 '__new__',
 '__or__',
 '__pos__',
 '__pow__',
 '__radd__',
 '__rand__',
 '__rdivmod__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rfloordiv__',
 '__rlshift__',
 '__rmod__',
 '__rmul__',
 '__ror__',
 '__round__',
 '__rpow__',
 '__rrshift__',
 '__rshift__',
 '__rsub__',
 '__rtruediv__',
 '__rxor__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__sub__',
 '__subclasshook__',
 '__truediv__',
 '__trunc__',
 '__xor__',
 'bit_length',
 'conjugate',
 'denominator',
 'from_bytes',
 'imag',
 'numerator',
 'real',
 'to_bytes']

This all might seem a bit overwhelming, but with a little practice it will soon be clear. Just remember that you can get help using the `help` function and that you can "inspect" any Python object using the `dir` function.
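
If the long list is distracting, one possible trick is to hide the names that start with two underscores. This sketch uses a list comprehension and the `startswith()` string method, both of which are covered later; the name `public_names` is just an illustrative choice.

```python
x = 10

# Keep only the names that do not start with a double underscore
public_names = [name for name in dir(x) if not name.startswith("__")]
print(public_names)
```

The exact names printed can differ slightly between Python versions.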

## Python strings
A word or multiple words are called a **string** in Python. Strings can be created by putting different quotes around the words: **single**, **double** or even **triple**. The next examples show string assignment:

In [155]:
single_word_string = 'foo'
print(single_word_string)

foo


In [156]:
single_quoted_string = 'In single quotes you can use " as part of the string'
print(single_quoted_string)

In single quotes you can use " as part of the string


In [157]:
double_quoted_string = "In between double quotes 'single' quotes can be used"
print(double_quoted_string)

In between double quotes 'single' quotes can be used


In [158]:
multi_line_string = """To have a string that contains multiple lines
you have to use 
triple double quotes
"""
print(multi_line_string)

To have a string that contains multiple lines
you have to use 
triple double quotes



Strings can be combined using the **+** operator; this is called **string concatenation**.

In [159]:
first_word = 'Hello '
second_word = 'world'
combined_words = first_word + second_word
print(combined_words)

Hello world
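
Concatenation with **+** only works between strings. As a small sketch (reusing the amino acid values from earlier; the name `description` is an assumption), a number first has to be converted with the built-in `str()` function:

```python
amino_acid = "alanine"
number_of_atoms = 13

# amino_acid + number_of_atoms would raise a TypeError (str + int),
# so the integer is converted to a string first
description = amino_acid + " has " + str(number_of_atoms) + " atoms"
print(description)   # alanine has 13 atoms
```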


## String slicing
Characters in a string can be accessed using **indexing** (more on indexing in the second lecture). Every character in a string has a position starting from **0**. Using the variable name and an index we can get the character at that position. To do this we need to place the index in between brackets **\[\]**

In [160]:
first_letter = combined_words[0]
print(first_letter)

H


We can also specify a **range** of positions to retrieve from the string using the following bracket notation: **[start:stop:step]**. The character at position **stop** itself is not included.

In [161]:
first_word = combined_words[0:5]
print(first_word)

Hello


In [162]:
second_word = combined_words[6:11]
print(second_word)

world


If you omit the **start**, Python will start from the beginning; omitting the **stop** lets Python continue through the last position.

In [163]:
first_word = combined_words[:5]
print(first_word)

Hello


In [164]:
second_word = combined_words[6:]
print(second_word)

world


Using the **step** in the slice means that only the characters at every step interval are retrieved. In the next example the **start** and **stop** are left blank, which indicates that we want to start at the beginning and continue to the end.

In [165]:
every_second_character = combined_words[::2]
print(every_second_character)

Hlowrd


One extra neat trick we can do with string slicing is the use of **negative indices**. A negative index counts from the end of the string. By using a **negative step** we can reverse the string.

In [166]:
reversed_combined_words = combined_words[::-1]
print(reversed_combined_words)

dlrow olleH
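
Negative indices can also be used for a single position or for the start and stop of a slice. A short sketch, assuming the same "Hello world" string as above:

```python
combined_words = "Hello world"

print(combined_words[-1])     # last character: d
print(combined_words[-5:])    # last five characters: world
print(combined_words[:-6])    # everything except the last six: Hello
```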


## String methods
Everything in Python is an **object** (more on objects in lecture 5). An object has content and methods that can operate on that content. A string is no exception: it is also a Python object, and its content is the word or words it holds. To use a method of an object we use the **dot operator** (**.**). We can show which methods are available for any given object by using the Python built-in **dir()** function, or **help()** if we want more information.

In [167]:
DNA = "ATGC"
print(dir(DNA))

['__add__', '__class__', '__contains__', '__delattr__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__getitem__', '__getnewargs__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__iter__', '__le__', '__len__', '__lt__', '__mod__', '__mul__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__rmod__', '__rmul__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', 'capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']


We could also use **dir()** directly on the str type itself by typing **dir(str)**, but we leave that for you to try out. To get more information on how to actually use the str methods we can use the **help()** function, either on all methods or on just a single one. To get information on all of them just type **help(str)**; for now we will focus on a single method. To get help on a single method we pass that method to help() using the **dot notation**. The method we will focus on is the **replace()** method.

In [168]:
help(DNA.replace)

Help on built-in function replace:

replace(old, new, count=-1, /) method of builtins.str instance
    Return a copy with all occurrences of substring old replaced by new.
    
      count
        Maximum number of occurrences to replace.
        -1 (the default value) means replace all occurrences.
    
    If the optional argument count is given, only the first count occurrences are
    replaced.



Let's say we want to convert our DNA string to an RNA string (T->U). We could use the **replace()** method to accomplish this. From the help of the replace method we can see that we need to specify the old substring (T) and what it should become (U).

In [169]:
print(DNA.replace("T", "U"))

AUGC


**Note: strings are immutable, meaning that after they are created we cannot change them anymore.** There are more immutable data types, which you will learn about in lecture 2.

In [170]:
print(DNA)

ATGC


As you can see in the above code example, when **DNA** is printed, it still contains the **old DNA string**. To keep the result of the replace() operation **you have to save the output of the operation** into a variable.

In [171]:
RNA = DNA.replace("T", "U")
print(DNA)
print(RNA)

ATGC
AUGC
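
Immutability also means that a single character cannot be overwritten through its index. The sketch below shows what happens when you try; it uses `try`/`except`, which is only introduced in a later lecture, so read it as an illustration rather than something you need to master now.

```python
DNA = "ATGC"

try:
    DNA[1] = "U"          # strings do not support item assignment
except TypeError as error:
    print("Not allowed:", error)

print(DNA)   # the original string is unchanged: ATGC
```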


Some other useful str methods are **str.upper**, **str.lower**, **str.find** and **str.index**. The first two turn the string to uppercase and lowercase respectively, and the last two return the index of a given substring. The difference between **find()** and **index()** is that if nothing is found, find() will **return** **-1** and index() will **raise** a **ValueError** (more on exceptions in lecture...). Given the DNA and RNA strings from above, we will show the usage of these str methods.

In [172]:
print(DNA.lower())
print(RNA.lower())

atgc
augc


In [173]:
print(DNA.find("T"))

1


In [174]:
print(DNA.lower().find("t"))

1


In [175]:
print(DNA.index("GC"))

2


In [176]:
print(DNA.index("GCT"))

ValueError: substring not found
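
The difference between the two methods can be seen side by side in the following sketch. The `try`/`except` construct is an assumption borrowed from a later lecture; it is used here only to keep the script running after the error.

```python
DNA = "ATGC"

# find() signals "not found" with the return value -1
position = DNA.find("GCT")
print(position)              # -1

# index() signals "not found" by raising a ValueError
try:
    position = DNA.index("GCT")
except ValueError as error:
    print("index() failed:", error)
```

Which one to prefer depends on whether a missing substring is an expected situation (find) or a genuine error (index).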

## Control flow
In programming we want to be able to check the content of variables and make decisions based on that content. For example: if we want to change a DNA string to RNA, we need to be sure that we do not already have an RNA string.

### if statement
If statements can be used to decide if a variable fits a given condition and execute a piece of code when it does. All the code that should be executed needs to be **indented** with **4 spaces** (or a tab). 

The basic syntax of an if statement is:
```
if condition:
    indented code block
```

The **conditional** tests that can be used are (without quotes!):
- '==' ; equals
- '!=' ; not equals
- '>' ; greater than
- '<' ; less than
- '>=' ; greater than or equal
- '<=' ; less than or equal
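
Each of these tests produces either `True` or `False`, which you can check by printing the result directly. A minimal sketch with an arbitrary example value:

```python
number = 5

print(number == 5)   # True
print(number != 5)   # False
print(number > 3)    # True
print(number < 3)    # False
print(number >= 5)   # True
print(number <= 4)   # False
```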

In [177]:
number = 5

if number == 5:
    print("Number is 5")
    
print("After the if")

Number is 5
After the if


In the above code we first created a variable named **number** and we **assigned** (using the assignment operator) the **value 5** to it. <br>
Next, we started the **if** statement and added our **condition** that **evaluates** to **true** if our number **is equal to 5**. <br>
Finally, the block that should be **executed** (the print statement) when the condition is true is **indented** with **4 spaces**. <br>
The code continues again starting from the margin.<br><br>


In [178]:
number = 6

if number == 5:
    print("Number is 5")
    
print("After the if")

After the if


In the above code block we changed the number to 6. If we run this piece of code the **condition** in the if statement **evaluates to false**, because 6 is not equal to 5. The indented code block does not get executed, and the code continues at the first line that starts in the margin again.

In [179]:
if number == 5:
    print("Number is 5")

**Note:** the **:** after the if statement! **Code after a colon should be indented**; forgetting the indentation will raise an IndentationError (a kind of syntax error).

### if..else..
The if statement has an additional **else** that can be used to **execute** a code block when the if condition **evaluates to false**.

The if..else.. syntax now looks like:
```
if condition:
    indented code block1
else:
    indented code block2
```


In [180]:
if number < 5:
    print("Number is less than 5")
else:
    print("Not a number that is less than 5")

Not a number that is less than 5


### if..elif..else..
To test multiple conditions we can use the if..**elif**..else.. statement. If we now add the elif statement we come to the following structure:
```
if condition:
    indented code block1
elif condition1:
    indented code block2
else:
    indented code block3
```

There can be multiple **elif** statements following an if. In the next example we test if a grade for an exam fits a condition, when the condition evaluates to true the code block following the condition is executed.

In [181]:
grade = 8
if grade < 5.5:
    print("Bummer, insufficient!")
elif grade >= 5.5 and grade <= 6:
    print("Needs a bit of work")
elif grade > 6 and grade < 8:
    print("Getting there")
else:
    print("Top job!")

Top job!


Note the use of the **and** logical operator in the above example code: when it is used, the conditions on **both sides** of the **and should evaluate to true**. <br>
There is also the **or** logical operator; then at least **one** of the **sides** should **evaluate to true**.
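
The effect of both logical operators can be checked by printing the combined conditions. In this sketch the grade value is arbitrary, and the last line shows a chained comparison, a Python shorthand that is not used elsewhere in this lecture:

```python
grade = 7

print(grade > 6 and grade < 8)    # both sides true: True
print(grade < 5.5 or grade > 9)   # neither side true: False

# Shorthand for "grade > 6 and grade < 8"
print(6 < grade < 8)              # True
```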

Finally, now that we know that we can use indentation to indicate a block of code, we can use this indentation to have nested if statements. <br>
Each if statement can have its own if..elif..else.. structure.

In [182]:
number = 5
if number >= 0:
    if number == 0:
        print("Zero")
    else:
        print("Positive number")
else:
    print("Negative number")

Positive number


## for loop
Executing code many times can be done naively by just typing (or copy-pasting) the code many times. Luckily, Python comes with built-in functionality that allows us to **repeat code** many times, even if we do not know beforehand how many times a piece of code should be executed. Looping is usually carried out on strings (lists, tuples and dictionaries can also be looped over; this is covered in lecture 2), but **any object that is iterable can be looped over**.

The basic syntax of a for loop is as follows:
```
for x in iterable:
    code block to be executed on item
```

**x** is a placeholder that holds just **one** item of the iterable at a time. Every time the loop is repeated the **next item is placed in x**. We can give x any name we want, and it helps to name it according to the type of value it will contain.

In the next example we use the for loop to loop over a DNA string, every time taking one character and placing this in the letter placeholder. In the indented block we print the content of the placeholder.

In [183]:
DNA = 'ATGC'
for letter in DNA:
    print(letter)

A
T
G
C


Another example using a different iterable type (**list**). Note that we used a different name for the placeholder. The result is exactly the same as in the previous example.

In [184]:
myDNAList = ['A', 'T', 'G', 'C']
for character in myDNAList:
    print(character)

A
T
G
C
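
A for loop becomes more useful when it is combined with an if statement. As a sketch (the sequence and the name `gc_count` are made up for illustration), the following counts how many letters in a DNA string are a G or a C:

```python
DNA = "ATGCGGA"
gc_count = 0

for letter in DNA:
    # Only count the letter when it is a G or a C
    if letter == "G" or letter == "C":
        gc_count = gc_count + 1

print(gc_count)   # 4
```

Note the two levels of indentation: the if belongs to the for loop, and the counting line belongs to the if.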


## while loop
With the while loop we can execute a code block as long as a condition is true.

In [186]:
number = 1
while number < 6:
    print(number)
    number = number + 1

1
2
3
4
5


Note: remember to increment number, or else the loop will continue forever!
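
A while loop is a natural fit when the number of repetitions is not known in advance. The sketch below redoes the minutes-to-hours conversion from earlier by repeatedly subtracting 60; it is meant as an illustration of the loop, not as a replacement for `//` and `%`.

```python
minutes = 134
hours = 0

# Keep taking whole hours out of the minutes for as long as possible
while minutes >= 60:
    minutes = minutes - 60
    hours = hours + 1

print(hours, minutes)   # 2 14
```

Here the condition becomes false on its own because `minutes` gets smaller on every pass through the loop.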

## Exercises:
Time to put all our new coding skills to the test!

Create a program that checks whether a given sequence is DNA (and not RNA) and, if it is DNA, reverses the sequence.
Finally the reversed sequence should be converted to RNA by replacing the T characters with U.

For this exercise you may define your own DNA sequence and assign it to a variable. Or use the following DNA sequence "ATGAGTAGGATAGGCTAGATGGCGATGAATT"

Tip: think of the difference between DNA and RNA. How could you use this to check if a sequence is DNA and not RNA?

## End