# Basic Python

This notebook introduces basic Python language constructs. We do not present Python as a tool for programming applications but as a programming tool for data science. We use it together with the Jupyter notebook environment and stay specific to that setting. General beginners' tutorials can be found [here](https://wiki.python.org/moin/BeginnersGuide/Programmers). Nevertheless, some basic constructs are introduced.

## The data science viewpoint

What do we actually want? We want to be able to process data, be it from forestry data repositories, or other sources like the SMEAR stations network, the [ICOS Carbon Portal](https://www.icos-cp.eu), the [NOAA climate data](https://www.ncdc.noaa.gov/cdo-web/datasets) and more.

Usually we have some task or idea like how to compare the monthly average carbon dioxide concentration in forests. We would need to find the data, download them and load them into a Jupyter/Python notebook, transform them to the wanted structure (monthly averages of different years at different places), make some analyses (visual, statistical,...), produce meaningful output (graphs, tables,...) and save the results.
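The steps above can be sketched in a few lines of plain Python. This is only a rough illustration: the `measurements` list and all its values are invented for this example, and several constructs used here (dictionaries, imports) are not covered in this notebook.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical measurements: (date as 'YYYY-MM-DD', CO2 concentration in ppm).
# The values are invented purely for illustration.
measurements = [
    ('2020-01-03', 412.0),
    ('2020-01-17', 414.0),
    ('2020-02-05', 413.0),
    ('2020-02-20', 415.0),
]

# Transform: group the values by month ('YYYY-MM').
values_by_month = defaultdict(list)
for date, co2 in measurements:
    month = date[:7]
    values_by_month[month].append(co2)

# Analyse: compute the average of each month.
monthly_average = {month: mean(values) for month, values in values_by_month.items()}

# Output: print a small table.
for month, average in monthly_average.items():
    print(month, average)
```

With real data, the first step would be loading a downloaded file instead of typing the values by hand.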

Using Jupyter notebooks further lets us treat the notebook as a computational document where we can add our thoughts, and we can even use it to publish our results.

## The Python REPL

OK, what's that? First, programming languages can be interpreted or compiled. In simple words, interpreted code is translated into machine code (what the computer actually understands) while the program or app is running, whereas compiled code is first translated into machine code and then executed. That means, again in simple words, that interpreted code can be changed and re-executed, and you get the result as feedback immediately. With compiled code, you first need to write it in some editor, compile it and execute it; if there is need for change (there almost always is), this cycle (edit-compile-run) begins again.

In Python, the interpreter follows:

**R**ead 

**E**valuate

**P**rint

**L**oop

which gives the acronym **REPL**. The interpreter reads the code (a number, a string, ...), evaluates it, prints the result to the "console" (or, in a Jupyter notebook, into the output cell), and finally loops: it waits for the next input to be read. Some examples are:


In [1]:
10

10

The interpreter read "10", evaluated it and recognised it as an integer, printed it into the output cell and waited for the next input.

In [3]:
10 + 10

20

The interpreter read "10 + 10", evaluated it as the number "10" to which another number "10" is added by "+", performed this calculation, printed the result "20" to the output cell and was ready for the next input.

You can test this by changing the numbers or the operators (+, -, \*, /) in the cell above and running the cell by pressing ```[SHIFT] + [ENTER]```.
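To make the four steps more tangible, here is a toy version of such a loop. It is a simplified sketch, not how the real interpreter is implemented: instead of waiting for keyboard input it reads from a fixed list (`inputs`, a name chosen for this example), and it uses the built-in `eval()`, which should only ever be given text you trust.

```python
# A toy version of the read-evaluate-print loop.
inputs = ['10', '10 + 10', '(2 + 3) * 3']

outputs = []
for source in inputs:        # Loop over the pieces of input
    value = eval(source)     # Read the text and Evaluate it as a Python expression
    outputs.append(value)    # keep the value so we can inspect it later
    print(repr(value))       # Print the result
```

Running it prints 10, 20 and 15, one result per input.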

## Python basic operators

|**Operator**|**Name**|**Example**|
|---|---|---|
| + | Addition | 2 + 3 == 5 |
| - | Subtraction | 8 - 3 == 5 |
| * | Multiplication | 2 * 3 == 6 |
| / | Division | 12 / 3 == 4 |
| % | Modulus | 5 % 2 == 1 |
| // | Floor Division | 9 // 2 == 4 |
| ** | Exponentiation | 4 ** 2 == 16 |

That is already enough for a lot of calculations, and you can of course use the operators like a pocket calculator on your computer. The usual operator precedence we know from our math courses applies, and you can test it in the notebook.
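A few quick checks of how the operators behave may be helpful here. The numbers are arbitrary; the comments state what each line prints.

```python
print(2 + 3 * 3)      # multiplication comes before addition: 11
print(12 / 3)         # / always gives a floating point number: 4.0
print(9 // 2, 9 % 2)  # quotient and remainder of the same division: 4 1
print(2 ** 3 ** 2)    # ** groups from the right, so this is 2 ** 9: 512
```

Parentheses override the default order whenever something else is needed.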

In [8]:
(2 + 3) * 3

15

## Python variables

In the examples before, we did not actually store or keep the values of the calculations. To keep something for later use, Python uses **variables**. In simple words, we choose a name, ideally one that describes what we want to keep, and refer to it later.

To store the results of our calculations we might use a variable ```result```. Let's check whether Python already knows it. Just run the following code cell:

In [9]:
result

NameError: name 'result' is not defined

Obviously not. The interpreter gave an error message telling us that we have to **define** the variable before we can use it. To do so, we use the equals sign "=" and we get:

In [10]:
result = 4

Now we do not get an error message, but Python does not show the result either! Python made an **assignment**: it assigned the value 4 to the variable ```result```. We can check that by typing:

In [11]:
result

4

Variables can be reused. In the next cell we assign a new value to ```result``` and show the effect:

In [12]:
result = 5 * 3
result

15
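
Variables can also be used inside calculations. The following small sketch uses two made-up variable names to convert a temperature; it also shows that a variable keeps its value until we assign a new one.

```python
temperature_celsius = 20
temperature_fahrenheit = temperature_celsius * 9 / 5 + 32
print(temperature_fahrenheit)   # 68.0

# Reassigning the first variable does not update the second one automatically.
temperature_celsius = 30
print(temperature_fahrenheit)   # still 68.0
```

To get the new Fahrenheit value, the calculation has to be run again.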

## Python strings

Sometimes we want to give textual feedback; this can be done with **strings**. Strings also appear as identifying keys of a map or dictionary, or as the header line of the table above. Strings are defined by either single quotes **' '** or double quotes **" "**.

In [13]:
greeting = 'Hello'
greeting

'Hello'

In [14]:
sayHi = "Hello everybody!"
sayHi

'Hello everybody!'

Strings can be concatenated by simply adding them together:

In [15]:
greeting + sayHi

'HelloHello everybody!'

We can print them using the builtin function ```print()```. 

In [16]:
print(sayHi)

Hello everybody!


Can you see a difference? The printed string has no quotes in the output. Just evaluating the variable shows its content the way it would be written in code, as a string including the quotes. When displaying a string this way, Python uses single quotes by default.
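The two ways of displaying a string can be reproduced explicitly with the built-in `repr()` function, which returns the form shown when a cell just evaluates a value. A small sketch (the variable name `say_hi` is chosen for this example):

```python
say_hi = "Hello everybody!"

print(say_hi)         # Hello everybody!
print(repr(say_hi))   # 'Hello everybody!'

# The quotes only mark where the string starts and ends;
# single and double quotes create the same string.
print('Hello' == "Hello")   # True
```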

Another operation we can do on strings is to ask for their length:

In [17]:
len(sayHi)

16

## Python functions

To structure our code into tasks, we can use **functions**. The keyword that tells Python we are defining a function is ```def```. Let's try to define a function that says hi to someone.

In [21]:
def say_hi_to(name):
    print('Hi', name)

In [22]:
say_hi_to('Lea')

Hi Lea


If you look at the function, a certain structure is needed. After ```def``` comes the name of the function, with which we can call it, followed by the arguments in parentheses and a colon. All commands that belong to the function then have to be indented so that Python can recognise them as belonging to the function.

Let's make a function that takes multiple arguments and returns a value.

In [44]:
def welcome(name, location):
    print('Hi', name, 'welcome to', location)
    return name + ' is here in ' + location + '!'

In [45]:
result = welcome('Lea', 'Tartu')

Hi Lea welcome to Tartu


In [46]:
result

'Lea is here in Tartu!'

In [47]:
welcome('Lea', 'Tartu')

Hi Lea welcome to Tartu


'Lea is here in Tartu!'

In the last example, we gave two arguments to the function; the function printed a welcome greeting and also returned a string. In contrast to the ```say_hi_to(name)``` function, where nothing is returned, we can store the returned string in a variable. If we do not do so, the returned string is automatically shown in the output.
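The difference between printing and returning becomes visible when we store the result of each kind of function. In this sketch `greeting_for` is a hypothetical helper written only for the comparison.

```python
def say_hi_to(name):
    print('Hi', name)          # prints, but returns nothing


def greeting_for(name):        # hypothetical helper for this example
    return 'Hi ' + name        # returns a string, prints nothing


printed_only = say_hi_to('Lea')   # shows: Hi Lea
returned = greeting_for('Lea')    # shows nothing

print(printed_only)   # None
print(returned)       # Hi Lea
```

A function without a `return` statement gives back the special value `None`.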

## Decisions and booleans

Programs are great at making decisions depending on values that come from calculations, are stored in strings, etc. Python also provides comparison operators, which yield boolean values (```True``` or ```False```). We use them to control the flow of the program.

Let's see some examples.

In [48]:
bear_is_home = True
if bear_is_home:
    print('better run...')

better run...


In [49]:
bear_is_home = False
if bear_is_home:
    print('better run...')
else:
    print('let\'s grab the honey')

let's grab the honey


### Basic boolean operators

|**Operator**|**Name**|**Example**|
|---|---|---|
| > | greater than | 2 > 5 -> False |
| < | less than | 2 < 5 -> True |
| >= | greater than or equal to | 3 >= 3 -> True |
| <= | less than or equal to | 3 <= 2 -> False |
| == | is equal | 3 == 3 -> True |
| != | is not equal | 4 != 7 -> True | 
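
The result of a comparison is itself a value that can be stored in a variable and used in an ```if``` statement. A minimal sketch with an invented temperature value:

```python
temperature = -3   # invented value in degrees Celsius

is_freezing = temperature <= 0   # the comparison gives True or False
print(is_freezing)               # True

if is_freezing:
    advice = 'watch out for ice'
else:
    advice = 'no ice expected'
print(advice)                    # watch out for ice
```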

## Python control structures and loops

Computers are happy to do things we usually find boring or even tedious. Counting from 1 to 1000 in steps of 1 is a somewhat boring task, and reading sentences letter by letter is not very exciting either. The ```for``` loop does this without complaining.

In [50]:
for letter in 'Hi there, I am some sentence!':
    print(letter)

H
i
 
t
h
e
r
e
,
 
I
 
a
m
 
s
o
m
e
 
s
e
n
t
e
n
c
e
!


For loops are ideal when we already know how long an object is. Lists, for example, can hold different things like numbers, strings etc., and they have a certain length. In the next example we loop over a list of things.

In [51]:
mylist = [13, 'z', 'Yeah']
for item in mylist:
    print(item)

13
z
Yeah
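
The counting task mentioned at the start of this section can be written with the built-in ```range()``` function, which produces a sequence of whole numbers. A short sketch that adds up the numbers from 1 to 1000:

```python
total = 0
for number in range(1, 1001):   # 1, 2, ..., 1000 (the end value is excluded)
    total = total + number

print(total)   # 500500
```

Note that ```range(1, 1001)``` stops before 1001, so the last number used is 1000.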


But what do we do if we do not know how long an object is, or if we want to repeat something as long as some condition holds? In this case Python uses the ```while``` loop.

In [52]:
i = 1
while i <= 5:
    print(i)
    i = i + 1

1
2
3
4
5


In some cases, we want to run something forever. This can be done by

```python
while True:
    pass  # do something here
```
Be careful: in a terminal window this loop can be stopped by pressing ```CTRL-C```. In Jupyter notebooks, it can be stopped by interrupting or stopping the Python kernel via the kernel menu. However, if your program produces massive output in this situation, your browser may get stuck. You then need to use your system's way of stopping that process.
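A ```while True``` loop does not have to run forever. The ```break``` statement, which is not discussed further in this notebook, leaves the loop from the inside. A small sketch with a made-up counter:

```python
attempts = 0
while True:
    attempts = attempts + 1
    if attempts >= 3:    # exit condition
        break            # leave the loop immediately

print(attempts)   # 3
```

This pattern is handy when the exit condition is only known in the middle of the loop body.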


## Python comments

It is a good idea to write some comments into the code. In Python, the hash symbol **#** tells the interpreter to skip the rest of the line, starting from the hash. These comments help us remember what a piece of code is actually doing. Well-documented code helps you understand your own code even after some days or weeks.

In Jupyter notebooks, you can as well use the markdown cells to write longer texts and thoughts or even explanations of the pieces of code which are in the notebook. 


In [56]:
# This is a comment.

a = 40 # upper limit of our loop
b = 0  # result goes here

# sum the integers from 0 up to (but not including) the upper limit and print the result
for i in range(a):
    b += i  # short version of b = b + i
    
print(b)

780


The example here is actually somewhat overcommented, but we wanted to show how comments can be used in code. How many comments there should be is a question that is not easily answered. While it is good to have some explanations, it does not make sense to comment everything. Because we can choose meaningful names for variables, having to comment long lists of single-letter variable names is rather a sign of "bad programming style". Let's try once more:

In [57]:
upper_limit = 40
result      = 0

# calculate the total of the integers from 0 up to (but not including) upper_limit
for i in range(upper_limit):
    result += i
    
print(result)

780


In the last example, it is already very clear what the variables are for and no extra comments are needed. The actual comment is now used to explain what the loop is doing. It, hopefully, shows that it is possible to make "readable" programs.

Another way to provide documentation and comments is **docstrings**. These are mostly used in function definitions to give a longer description of the function. Docstrings are enclosed in three single or three double quotes:

```python
""" here some text """ == ''' here some text '''
```

The text between the quotes can span over several lines.


In [60]:
def do_nothing():
    """This function doesn't do anything."""
    pass

In [61]:
def do_still_nothing():
    """
    This function does still nothing.
    
    Parameters:
        none
    
    Returns:
        nothing
    """
    pass

In [62]:
help(do_nothing)

Help on function do_nothing in module __main__:

do_nothing()
    This function doesn't do anything.



In [63]:
help(do_still_nothing)

Help on function do_still_nothing in module __main__:

do_still_nothing()
    This function does still nothing.
    
    Parameters:
        none
    
    Returns:
        nothing



From the last cells we can see that these docstrings serve a very important purpose: we can write the documentation for the functions we provide directly in the code. This documentation can then be accessed via Python's help system.
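Docstrings become more useful once a function actually takes arguments and returns something. The function `monthly_mean` below is a hypothetical example written for this purpose; the docstring layout follows the one used above.

```python
def monthly_mean(values):
    """
    Return the arithmetic mean of a list of numbers.

    Parameters:
        values: a non-empty list of numbers

    Returns:
        the mean as a float
    """
    return sum(values) / len(values)


print(monthly_mean([410, 412, 414]))   # 412.0
print(monthly_mean.__doc__)            # the raw docstring text
```

The docstring is stored with the function in its ```__doc__``` attribute, which is also what ```help()``` displays.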

## The Python program structure in Jupyter notebooks

A Python program consists of a series of commands in which we define variables and functions, and use boolean operators to control the flow of the program and decide what to do in one case or the other.

In the Jupyter environment, we can use code cells to run pieces of code, annotate them via markdown cells, and develop the program, or maybe better the workflow, that our task needs in a very interactive way.
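As a closing sketch, the following cell combines the pieces from this notebook: variables, a function with a docstring, a decision and a loop. The function name `classify` and the readings are invented for illustration only.

```python
def classify(value, limit):
    """Return 'high' if value is above limit, otherwise 'low'."""
    if value > limit:
        return 'high'
    else:
        return 'low'


# Invented readings, for illustration only.
readings = [398, 405, 412]
limit = 400

labels = []
for reading in readings:
    label = classify(reading, limit)
    labels.append(label)      # collect the label in a list
    print(reading, label)
```

In a notebook, each of these parts could live in its own code cell, with markdown cells in between explaining the reasoning.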
