# A quick overview

In this course you will get to know the programming language **Python**, up to a level where you can write scripts and run Python code as an executable program or in a notebook environment.

Since notebooks are the preferred way of working for a majority of data scientists, and a convenient form for educational purposes, we will start in a notebook setting. The document you are currently reading is a **Jupyter** notebook. A scientific notebook is an electronic document used for **literate programming**. In literate programming, text (usually in the form of **Markdown**), code, and its output (both textual and graphical) are combined.

## An example

Suppose I want to calculate the area of three different equilateral triangles, with sides 3, 4 and 2.21 respectively.
First I define a **variable** representing the side and assign an initial value to it. To inspect the value of a variable at any given time you can `print()` it.

In [5]:
side = 3
print(side) # print the side

3


Note the use of `#text` to add human-readable comments to the code. Everything on a line after `#` will be ignored when evaluating the cell.

<div class="alert alert-block alert-info">
    <img src="pics/64px-Simple_Information.png" style="float:left;margin-right:10px;">
    <span style="display:block;overflow:hidden;">
        <strong>Code comments</strong><br> Any text occurring on a line after a hash symbol <code>#</code> will be ignored by the Python interpreter. Text after a hash is called a code comment. <br><br>
    </span>
</div>

`print()` is a **function** that can be used to display text on the console (or in a notebook).  
If the last statement in a notebook cell is just the name of a variable, its value is displayed automatically, so there is no need to wrap it in a call to `print()`.


<div class="alert alert-block alert-info">
    <img src="pics/64px-Simple_Information.png" style="float:left;margin-right:10px;">
    <span style="display:block;overflow:hidden;">
        <strong>Functions</strong><br>A function is a re-usable piece of code, accessible by its name. It can receive input data (arguments) specified by its parameters. These are passed to the function between parentheses. For instance, <code>print(..., sep=" ")</code> is a function named print that can receive any number of arguments to print (...) that will be displayed on the console separated by the value specified with the "sep" parameter. You can create your own functions very easily as will be shown later.
    </span>
</div>
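The effect of the `sep` parameter described above can be tried out with a small sketch. The values of `side` and `area` are made up for illustration only.

```python
side = 3
area = 4.5

# By default, print() separates its arguments with a single space
print("side", side, "area", area)

# The sep parameter replaces that default separator
print("side", side, "area", area, sep=" | ")
```

The first call prints `side 3 area 4.5`, the second prints `side | 3 | area | 4.5`.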

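Next, I use the `side` variable to calculate the area of a 3 - 3 - 3 triangle. The general formula is $\frac{1}{2} * width * height$. To keep the example simple, the code below uses $\frac{1}{2} * side^2$, which treats the height as equal to the side. Strictly speaking, the height of an equilateral triangle is $\frac{\sqrt{3}}{2} * side$, so its exact area is $\frac{\sqrt{3}}{4} * side^2$.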

In [8]:
side = 3
area =  0.5 * side**2

f'The area of an equilateral triangle with side {side} is {area}'


'The area of an equilateral triangle with side 3 is 4.5'
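For comparison, the geometrically exact area of an equilateral triangle is $\frac{\sqrt{3}}{4} * side^2$, which differs from the simplified $\frac{1}{2} * side^2$ used in the cell above. The sketch below computes both. The function name `equilateral_area` is hypothetical and not part of the course code, and it uses the `math` module, which has to be imported first.

```python
import math

def equilateral_area(side):
    """Return the exact area of an equilateral triangle (hypothetical helper)."""
    return math.sqrt(3) / 4 * side**2

side = 3
simplified = 0.5 * side**2        # the simplified formula used in this chapter
exact = equilateral_area(side)    # the geometrically exact value

print(f'simplified: {simplified}, exact: {round(exact, 2)}')
```

For a side of 3 this prints `simplified: 4.5, exact: 3.9`. The rest of this chapter keeps using the simplified formula, since the point here is the Python code and not the geometry.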

<div class="alert alert-block alert-info">
    <img src="pics/64px-Simple_Information.png" style="float:left;margin-right:10px;">
    <span style="display:block;overflow:hidden;">
        <strong>Operators</strong><br> 
        Operators such as <code>*</code> and <code>**</code> are (combinations of) symbols within Python code that <i>operate</i> on (usually) the two values on either side of them (named <i>operands</i>), and that generate a new value from these two input values. 
        We call this combination an <i>expression</i>. <br> 
        For instance, <code>+</code> is the addition operator that takes the values on either side and returns the sum of the two.
        Other mathematical operators are, not surprisingly, subtraction (<code>-</code>), multiplication (<code>*</code>), division (<code>/</code>) and exponentiation (<code>**</code>). <br> 
        There are many more operators, classified as <i>mathematical</i> (like the above), <i>assignment</i> (like the <code>=</code> symbol in the above cells), <i>comparison</i> operators (such as "&lt;"), and some more categories. A complete listing of operators can be found <a href="https://www.w3schools.com/python/python_operators.asp">here</a> or 
        <a href="https://www.tutorialspoint.com/python/python_basic_operators.htm">here</a>.
    </span>
</div>
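A few of the operators mentioned above can be seen at work in the following sketch. The variable names and values are arbitrary and only serve as illustration.

```python
a = 7
b = 2

total = a + b         # addition
difference = a - b    # subtraction
product = a * b       # multiplication
quotient = a / b      # division; the result is always a float
power = a ** b        # exponentiation: a to the power b
is_smaller = a < b    # comparison; the result is True or False

print(total, difference, product, quotient, power, is_smaller)
```

This prints `9 5 14 3.5 49 False`. Each right-hand side is an expression, and the `=` sign assigns its value to the name on the left.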

This:
```python
f'The area of an equilateral triangle with side {side} is {area}'
```
is a format string (often called an _f-string_): a series of characters, prefixed with the letter `f`, in which the values of Python variables are inserted at the location of the curly braces to produce this:  
  
`'The area of an equilateral triangle with side 3 is 4.5'`


These _format strings_ were introduced in Python 3.6. Before that, we needed to write something like this:

`print("The area of an equilateral triangle with side", side, "is", area)`

instead of this:

`print(f'The area of an equilateral triangle with side {side} is {area}')`
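To make the comparison concrete, here is a minimal sketch that builds the same message in two ways: with a format string and with the older `str.format()` method. The `:.2f` part is a format specifier that displays a number with two decimals. The variable names `message_fstring` and `message_format` are chosen for this illustration only.

```python
side = 2.21
area = 0.5 * side**2

# Format string: the variables are named inside the curly braces
message_fstring = f'The area of an equilateral triangle with side {side} is {area:.2f}'

# Older style: empty placeholders are filled in order by the arguments of format()
message_format = 'The area of an equilateral triangle with side {} is {:.2f}'.format(side, area)

print(message_fstring)
print(message_format)
```

Both lines print `The area of an equilateral triangle with side 2.21 is 2.44`. The format string tends to be easier to read because each value sits at the place where it will appear.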

Now I want to do the same calculation and reporting for other triangles. I could just copy-and-paste all code, as in the cell below:

In [7]:
side = 4
area =  0.5 * side**2

print(f'The area of an equilateral triangle with side {side} is {area}') 

side = 2.21
area =  0.5 * side**2

# note the use of round() in the statement below
print(f'The area of an equilateral triangle with side {side} is {round(area, 2)}') 

The area of an equilateral triangle with side 4 is 8.0
The area of an equilateral triangle with side 2.21 is 2.44


This is awful!   

Copy-and-paste activities are a real no-no in programming.  
Whenever you catch yourself using Ctrl+C & Ctrl+V, stop and consider a better way to do it.  
In many cases this will result in extracting the copied code into a **function**.  

Below, the re-used piece of code is embedded in the function `print_triangle_area()`.

In [2]:
def print_triangle_area(side):
    area =  0.5 * side**2
    print(f'The area of an equilateral triangle with side {side} is {round(area, 2)}')


We'll visit the theory and practice of functions in a later chapter. Suffice it to say here that a named piece of code has been defined, which can be used by that name when it is given a value for its parameter, `side`. 

With this function defined (and loaded) it is easy to repeat the operation for a series of values:

In [8]:
print_triangle_area(3)
print_triangle_area(4)
print_triangle_area(2.21)

The area of an equilateral triangle with side 3 is 4.5
The area of an equilateral triangle with side 4 is 8.0
The area of an equilateral triangle with side 2.21 is 2.44


But wait! There is still copied code. Taking a leap forward to another Python programming concept, the `for` loop, here is an even better way to do it.

In [8]:
sides = [3, 4, 2.21]        # a list of values
for n in sides:             # iterate using for
    print_triangle_area(n)  # the iterated block

The area of an equilateral triangle with side 3 is 4.5
The area of an equilateral triangle with side 4 is 8.0
The area of an equilateral triangle with side 2.21 is 2.44


The `for` loop _iterates_ over an iterable (a collection of values) and executes the given block once for each of the values. We call this type of construct a **flow control** element because it controls the flow of the program.  
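Flow control also covers conditional execution. As a small sketch, and assuming for illustration an input list that contains some invalid sides, a `for` loop can be combined with an `if` statement so that only part of the values is processed. The names `sides` and `valid_sides` and the values are hypothetical.

```python
sides = [3, -1, 4, 0, 2.21]   # hypothetical input; -1 and 0 are not valid sides

valid_sides = []              # start with an empty list
for n in sides:               # iteration: visit each value in turn
    if n > 0:                 # conditional execution: only for positive values
        valid_sides.append(n) # add the value to the end of the list

print(valid_sides)
```

This prints `[3, 4, 2.21]`. Note that the indentation determines which statements belong to the loop and which belong to the `if`.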

# A real program

By now you may be thinking that this does not look like programming at all. Where is the program?
Actually there is no program here. A program is usually a piece of functionality on a computer or other device that receives some input (keyboard, mouse, touchscreen, sensor, etc.) and generates some output (screen, terminal, file, database).


Given the triangle example above, a standalone program for calculating and reporting triangle surface area would look like this:

```python
'''This program calculates the surface of equilateral triangles with sides given on the command line.'''

import sys

def print_triangle_area(side):
    area =  0.5 * side**2
    print(f'The area of an equilateral triangle with side {side} is {round(area, 2)}')

for arg in sys.argv[1:]:
    side = float(arg)
    print_triangle_area(side)
```


Suppose this code is stored in a text file (e.g. 'triangle_surface.py') on your computer. It can then be run from a terminal (Linux or macOS) or command prompt (Windows) using the following command:

```bash
> python3 triangle_surface.py 3 4 2.21
The area of an equilateral triangle with side 3.0 is 4.5
The area of an equilateral triangle with side 4.0 is 8.0
The area of an equilateral triangle with side 2.21 is 2.44
```

The code for this program, which we usually call a **script**, can be found [here](./scripts/triangle_surface.py).  

There are some elements that you may not understand yet. For those who cannot proceed without a little premature explanation:  

- `import sys` says "load the functionality located in module `sys` and make it available here". The functionality that is available by default in any Python program is rather limited to keep it lean. Any additional functionality must be loaded from modules using an **import statement**.
- `sys.argv` is the list of arguments entered on the command line, in this case `["triangle_surface.py", "3", "4", "2.21"]`. We'll get to lists in the next chapter. 
- `for arg in sys.argv[1:]:` says: "iterate over the command-line arguments but skip the first one" (which is the name of the script). Again, this is the subject of a later chapter.
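What happens to the command-line arguments can be imitated without a terminal. In the sketch below, the list `argv` is a hypothetical stand-in for `sys.argv` as it would look for the command shown above; the names `arguments` and `first_side` are likewise made up for illustration.

```python
# Hypothetical stand-in for sys.argv; all elements are strings
argv = ["triangle_surface.py", "3", "4", "2.21"]

arguments = argv[1:]               # everything except the first element
print(arguments)                   # the script name is gone

first_side = float(arguments[0])   # convert the string "3" to a number
print(first_side)
```

This prints `['3', '4', '2.21']` followed by `3.0`. Command-line arguments always arrive as text, which is why the script converts each one with `float()` before doing arithmetic with it.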

## Exercises

These assume that you already have Python 3 installed on your device. Use the Qt console from Anaconda Navigator for these. Alternatively, you can start the Python 3 interpreter in the Terminal or Command prompt by typing `python3` (without any arguments). Both options provide an interactive Python session that we call a REPL (Read, Evaluate, Print Loop).  

1. First enter `import math` and press enter to load the math module. 

2. Inspect the value of `math.pi` and try out the function `math.sqrt()`.

3. Calculate the following (using `math.pi` and `math.sqrt()` where relevant).  

    - $\frac{4.6 + 1.2}{2.09}$ 
    - $3\times4^\frac{7}{12}$
    - $r = 6$ <br />
      $\frac{2}{3}\times \pi r^3$  
    - $5 \times (\frac{4 + 2}{\sqrt{7} \times 9})$
  
4. On your computer, create a folder that will hold the exercises of this course. In it, put a copy of the above script [triangle_surface.py](./scripts/triangle_surface.py). Run the script as demonstrated above. Try out some other command-line arguments.

5. Change the script at some points and investigate the effect:
    - comment out (i.e. put a hash symbol `#` in front of) the `import sys` statement
    - change `float(arg)` to `int(arg)`
    - change `sys.argv[1:]` to `sys.argv[2]` and to `sys.argv[2:]`
    - use your imagination and experiment further


# Key concepts

- **(computer) program**: a computer program is a sequence or set of instructions in a programming language that a computer (or other device) can execute or interpret.
- **flow control**: programming elements that control the flow of a program. Flow control elements are used for iteration and conditional execution.
- **function**: A function is a chunk of code (usually named) that you can re-use, rather than copying it multiple times. Functions enable programmers to break down a problem into smaller pieces, each of which performs a particular task.
- **import**: A statement making functionality available that is not loaded by default.
- **Jupyter**: a notebook platform in which you do interactive _literate programming_. It supports Julia, Python and R, among other languages.  
- **Markdown**: a lightweight markup language that you can use to add formatting elements to plain-text documents. It is used in a wide range of settings: Jupyter notebooks, R Markdown, eBook authoring, et cetera.
- **operator**: a symbol that operates on operands, usually on both sides, together forming an expression.
- **Python**: a very popular _programming language_, praised for its ease of learning and use and applicability in a wide range of programming challenges.  
- **script**: a text file with computer code that can be executed as a program, usually by the _interpreter_ for the programming language used.
- **variable**: a program element that couples a name to a memory location with some contents.