# Overview of basic programming concepts

## Before we start

A notebook is essentially a list of cells (chunks), each containing either text or executable code together with its output. To enter the input mode and type code or text into a cell, simply click on the cell. However, you do not need to use the mouse. There are quite a few extremely useful shortcuts that let you navigate the document almost entirely from the keyboard (you will quickly find that this is what you want to do most of the time).

First, you need to remember that there are two modes: `input` mode and `navigation` mode. The former is used to type, while the latter lets you move between cells using the arrow keys. To enter the input mode press `enter`, and you will be able to type. To leave the input mode without executing the cell, simply press `esc`.

However, before entering a cell it is good to know what you will write inside, whether it will be code or text. Press `m` to write plain text and `y` to write code (you need to be in the navigation mode, otherwise you will just type the letter m or y). The default cell type is code.

To execute the cell press `shift + enter` or `ctrl + enter`. The difference between these two shortcuts is subtle. The former executes the code and moves to the next cell in the navigation mode, while the latter executes the code and stays on the same cell in the navigation mode.

Below are a few useful tips on how to use Jupyter Notebook effectively:
- use arrows `up` and `down` to navigate
- use `enter` to enter the input mode
- use `esc` to leave the input mode
- use `m` when in the navigation mode to change the cell to text cell
- use `y` when in the navigation mode to change the cell to code cell
- use `a` when in the navigation mode to add a cell above the current cell
- use `b` when in the navigation mode to add a cell below the current cell
- use `c` when in the navigation mode to copy the current cell
- use `v` when in the navigation mode to paste the cell below the current cell
- press `dd` (double d) when in the navigation mode to delete the current cell.

## More Resources

More resources and tutorials can be found on Google Colab's [main page](https://colab.research.google.com/notebooks/welcome.ipynb).

## Loading notebooks directly from GitHub

This is a useful feature and we will use it in class. Here are some [instructions](https://colab.research.google.com/github/googlecolab/colabtools/blob/master/notebooks/colab-github-demo.ipynb#scrollTo=K-NVg7RjyeTk). Fear not if you do not understand them yet! It will become clear soon enough, and I will provide you with exact instructions and links every time anyway.

# Basic programming concepts (in Python 3)

In a nutshell (and simplifying a bit), programming is the art of communicating with a machine that is extremely efficient at performing simple logical and arithmetic operations and at storing the results, but beyond that is extremely dumb (for amusement, check out [this](https://www.youtube.com/watch?v=FN2RM-CHkuI) video showing what would happen if people followed instructions the way computers do). Hence, it is crucial to be **perfectly** explicit. Otherwise, the machine will not understand or, even worse, will misunderstand our intentions without telling us, which may lead to results that look fine but are wrong.

### Algorithms

Therefore, when we communicate with a computer, we use instructions that it can understand. In computer science and mathematics, a precise sequence of such instructions is usually called an algorithm. An algorithm describes a set of computations that, when executed on a set of inputs, proceeds through a sequence of well-defined states and eventually produces an output. You can think of an algorithm as a recipe, like this one from [Jamie Oliver's cookbook](https://www.jamieoliver.com/features/how-to-cook-pasta-in-6-easy-steps/):

1. Fill a large saucepan with water, put the lid on, and bring to a boil over high heat.
2. Add a good pinch of sea salt.
3. Once the water is boiling, stir in the pasta.
4. Cook the pasta according to the packet instructions. To tell if your pasta is cooked, try a piece about a minute or so before the end of the cooking time. It’s ready when it’s soft enough to eat, but still has a bit of bite. The Italians say ‘al dente’.
5. Scoop out a mugful of the starchy cooking water and set it aside. This will help emulsify the pasta sauce.
6. Drain the pasta in a colander over the sink. Now it’s ready to toss through your favorite sauce -- it’s best to do this in the pan, adding splashes of cooking water and mixing as you go until your sauce coats the pasta and is a perfect consistency.

Or, in a more mathematical vein, consider the method of Heron of Alexandria, who was allegedly the first to document a way to compute the square root of a number $x$:

1. Start with a guess, $g$.
2. If $g\times g$ is close enough to $x$, stop and say that $g$ is the answer.
3. Otherwise create a new guess by averaging $g$ and $x/g$, i.e. $\frac{(g + x/g)}{2}$.
4. Using this new guess, which we again call $g$, repeat the process until $g \times g$ is close enough to $x$.
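
The four steps above translate almost directly into code. The following is only a minimal sketch: the function name `heron_sqrt`, the starting guess (the number itself), and the relative tolerance `epsilon` are assumptions made for illustration, because the description leaves both the first guess and the meaning of "close enough" open. It also uses a function and a loop, which are introduced later.

```python
def heron_sqrt(x, epsilon=1e-9):
    """Approximate the square root of a positive number x with Heron's method."""
    if x <= 0:
        raise ValueError("x must be positive")
    g = x                                    # step 1: start with a guess (assumed: x itself)
    while abs(g * g - x) >= epsilon * x:     # step 2: stop once g * g is close enough to x
        g = (g + x / g) / 2                  # step 3: average g and x / g
    return g                                 # step 4 is the repetition done by the loop


approx_root_of_25 = heron_sqrt(25)
approx_root_of_2 = heron_sqrt(2)
print(approx_root_of_25)
print(approx_root_of_2)
```

A smaller `epsilon` gives a more accurate answer at the cost of a few more repetitions.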

### Python

In general, there is no such thing as the best programming language, because if you can do something in one programming language you can also do it in another (all modern programming languages are [Turing complete](https://en.wikipedia.org/wiki/Turing_completeness)). However, different languages are better or worse suited to different kinds of applications. You have already learned a bit of _R_, which is perfect for statistical analysis. I would also say it is good for data wrangling and data manipulation on a small scale, but I would not recommend it for big data or any of the purposes I mentioned before. Similarly, neither _R_ nor _Python_ is very good for website creation. It is doable, but there are better languages for that.

Python, like every other programming language, has a set of primitive constructs, a syntax, a static semantics, and a semantics. In this regard it is similar to natural languages, e.g. in English the primitive constructs are words, the syntax describes which strings of words constitute well-formed sentences, the static semantics defines which sentences are meaningful, and the semantics defines the meaning of those sentences.

So, typically in English, when you want to form a sentence you cannot simply throw together five nouns like: [Person, Woman, Man, Camera, TV](https://youtu.be/tRjGHHYe9nA?t=24). To create a syntactically valid sentence you need some parts of speech other than nouns. In Python it is similar. The sequence of primitives $3.2 + 3.2$ is syntactically correct but $3.2\ 3.2$ is not.

The static semantics defines which syntactically valid strings have a meaning. For example, in English, you cannot just write 'I runs 5km every day'. It is incorrect (check in Grammarly if you do not believe me). The sentence is syntactically well formed (pronoun, verb, object), but the verb form 'runs' does not agree with the subject 'I'.

The semantics of a language associates a meaning with each syntactically correct string of symbols that has no static semantic errors. In other words, it defines what action the program should perform. In programming languages, each correct sentence (program) has exactly one meaning, unlike in natural languages.

### Errors

Syntax errors are the most common, and you should be prepared to get a lot of them. It is going to be a bit frustrating, but soon enough you will learn first how to read error messages and second how to avoid syntax errors. Semantic errors are more dangerous and harder to avoid. That is because _Python_, like any other programming language, cannot read your mind. If there are no syntax errors and no static semantic errors, the program has a meaning. However, sometimes that meaning is different from what you intended because you made a mistake in designing the program.
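
To make the three kinds of errors concrete, here is a small sketch. The helper `classify` and its labels are hypothetical and exist only for this illustration. Note also the assumption behind the second label: Python reports a meaningless but well-formed expression such as `3.2 / 'abc'` as a `TypeError` when the code runs, and calling that a static semantic error simply follows the terminology used above.

```python
def classify(source):
    """Report which kind of error, if any, a one-line expression triggers."""
    try:
        code = compile(source, "<example>", "eval")   # the syntax is checked here
    except SyntaxError:
        return "syntax error"
    try:
        eval(code)                                    # the expression is evaluated here
    except TypeError:
        return "static semantic error"
    return "no error reported"


print(classify("3.2 3.2"))       # two numbers with no operator between them
print(classify("3.2 / 'abc'"))   # well formed, but dividing a number by text is meaningless
print(classify("3.2 + 3.2"))     # well formed and meaningful

# A semantic error: the intention is the average of 4 and 8 (which is 6),
# but without parentheses Python computes 4 + (8 / 2) and reports nothing wrong.
wrong_average = 4 + 8 / 2
right_average = (4 + 8) / 2
print(wrong_average, right_average)
```

The last case is the dangerous one: the program runs without complaint and only a careful look at the result reveals the mistake.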


### Arithmetic operators

In general, in programming languages, we have two types of objects: scalars and non-scalars. The difference is quite simple. The former are indivisible while the latter have an internal structure. For now, it might sound a bit cryptic but soon it will become clearer. Let's first focus on scalars. In _Python_ we have four types of scalars:

* `int` -- represents integers, e.g. `-3` or `10002`.
* `float` -- represents real numbers. Float literals include a decimal point, e.g. `3.0`. It is also possible to write them in scientific notation, e.g. `1.6E10`.
* `bool` -- represents boolean values, either `True` or `False`. NOTE: the spelling is different than in _R_.
* `NoneType` -- has a single value, `None`, which represents missing data.
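
If you are ever unsure which type a value has, the built-in `type` function will tell you. A quick sketch using the example values from the list above:

```python
print(type(-3))       # <class 'int'>
print(type(3.0))      # <class 'float'>
print(type(1.6E10))   # <class 'float'>
print(type(True))     # <class 'bool'>
print(type(None))     # <class 'NoneType'>
```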

For starters, we will consider the computer just a big, fat calculator. Thus, we will start by introducing ourselves to basic arithmetic operations.

In [None]:
## Addition
3 + 7

In [None]:
## Subtraction
10 - 7

In [None]:
## Multiplication
10 * 11

In [None]:
## Division
7 / 3

Let us now note that something important has just happened. What is it? Any ideas?

We used two integers and, as a result, we got a float (a floating-point number). This is not a big surprise (at least it should not be if you graduated from high school), but it is important to realize that the `/` operator produces a float. Always, even when the result is a whole number.

The second important observation is that the result is an approximation. The correct result is $2\frac{1}{3}$, not $2.3333333333333335$. That is because of how _Python_ and other programming languages represent floating-point numbers. Under the hood, they are stored as base 2 (binary) fractions. For example, $.125$ is represented as $\frac{0}{2} + \frac{0}{4} + \frac{1}{8}$. In the case of $\frac{1}{3}$, it is impossible to find a finite base 2 fraction that represents it precisely. What is more, even $.1$ cannot be written precisely as a base 2 fraction. The closest approximation is $\frac{3602879701896397}{2^{55}}$, which equals $.1000000000000000055511151231257827021181583404541015625$. This problem, which affects most modern programming languages, is known as [representation error](https://docs.python.org/3/tutorial/floatingpoint.html). However, unless we want to perform very precise calculations, we should not be bothered by it. Moreover, there are ways in _Python_ to perform more precise calculations, but we will not need them here.
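
The claims above can be checked directly. The sketch below uses the standard-library modules `fractions`, `decimal`, and `math` purely as inspection tools (they are not used elsewhere in this notebook): `Fraction` and `Decimal` reveal the exact value that is stored for a float.

```python
import math
from decimal import Decimal
from fractions import Fraction

# 0.125 is a sum of powers of two, so it is stored exactly.
print(Fraction(0.125) == Fraction(1, 8))                    # True

# 0.1 is not, so the closest binary fraction is stored instead.
print(Fraction(0.1) == Fraction(3602879701896397, 2**55))   # True
print(Decimal(0.1))   # the exact decimal value of what is stored

# A visible consequence: tiny rounding errors make an "obvious" equality fail.
total = 0.1 + 0.2
print(total == 0.3)               # False
print(math.isclose(total, 0.3))   # True
```

In practice this means floats should be compared with a tolerance (for example with `math.isclose`) rather than with `==`.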

Instead, let us also note another 'weird' behavior of division.

In [None]:
7 / 0 

What happened? We just did something illegal in mathematics. Python does not like it and will not allow it, so it "threw an error".

Something like this will always happen whenever we do something illegal from the vantage point of the programming language and/or the software libraries we are using.

Notice that Python is kind enough to be quite explicit and tells us what happened that made it angry. A message like this is called a traceback. It shows the error but also tries to pinpoint exactly the place in the code where the error happened. The one we see here is very simple, but in real-world settings, tracebacks can be quite complicated and intimidating at first. However, the best way to approach them is just to be calm and read carefully through them.
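
The last line of a traceback names the kind of error and gives a short message. As a small sketch, the same two pieces of information can be captured in code with a `try`/`except` block. This construct has not been introduced yet and appears here only as an illustration; the variable name `error_name` is made up for the example.

```python
try:
    result = 7 / 0
except ZeroDivisionError as error:
    error_name = type(error).__name__
    print(error_name)   # the kind of error: ZeroDivisionError
    print(error)        # the short message shown on the last line of the traceback
```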

We will see some more advanced examples later on.

---

It is also possible to compute powers and roots in _Python_.

In [None]:
## Raising to a power
2**4

In [None]:
## Square root
4**(1/2)

In general, the `**` operator raises a number to an arbitrary power, including fractional powers (roots), which is legal in mathematics. This may sometimes lead us to weird places...

In [None]:
(-4)**.5
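
The weird place in question is the complex numbers: in Python 3 the square root of a negative number computed with `**` is a value of type `complex`. The sketch below inspects the result and also shows why the parentheses around `-4` matter; the variable names are made up for the example.

```python
root = (-4) ** 0.5
print(type(root))   # <class 'complex'>
print(root)         # approximately 2j, with a tiny rounding error in the real part

# Without the parentheses the power is computed first and the result is then negated.
no_parentheses = -4 ** 0.5
print(no_parentheses)   # -2.0
```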

There are also some more esoteric arithmetic operators.

In [None]:
## Integer division
13 // 4

In [None]:
## Modulo / modulus operator (division remainder)
13 % 4
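
Integer division and modulo belong together: the first says how many whole times the divisor fits, the second says what is left over. A short sketch of that relationship, using the built-in `divmod` to get both at once and showing that for negative numbers `//` rounds down rather than toward zero:

```python
quotient = 13 // 4     # 3, how many whole times 4 fits into 13
remainder = 13 % 4     # 1, what is left over
print(quotient * 4 + remainder == 13)   # True

print(divmod(13, 4))   # (3, 1), both results at once

# With a negative number the quotient is rounded down (toward minus infinity).
print(-13 // 4)        # -4
print(-13 % 4)         # 3
```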

Complex expressions are read from left to right, following the standard precedence of operations we know from school. We can also use parentheses to modify the order of operations.

In [None]:
2*3 + 4    ## 10

In [None]:
2*(3 + 4)  ## 14
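
A few more cases are worth trying. One detail the school rules do not cover is a chain of powers: `**` is the exception to the left-to-right rule and is evaluated from right to left.

```python
print(10 - 4 - 3)      # 3, evaluated left to right as (10 - 4) - 3
print(2 + 3 * 4)       # 14, multiplication before addition
print(2 ** 3 ** 2)     # 512, evaluated right to left as 2 ** (3 ** 2)
print((2 ** 3) ** 2)   # 64, parentheses force the other order
```

When in doubt, adding parentheses costs nothing and makes the intention explicit.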

Of course, we can also compare two values, which brings us to the realm of logical operators.

In [None]:
## Note that we use a double 'equal' sign
2 + 2 == 5

In [None]:
2 + 2 == 4
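
Equality is only one of several comparison operators. A brief sketch of the others; every comparison produces a value of type `bool`, and the name `comparison_result` is made up for the example.

```python
print(2 + 2 != 5)    # True, "not equal"
print(3 < 4)         # True
print(4 >= 5)        # False
print(1 < 2 < 3)     # True, comparisons can be chained as in mathematics

comparison_result = 2 + 2 == 4
print(type(comparison_result))   # <class 'bool'>
```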

### Logical operators

There are (of course) two basic, primitive logical values: `True` and `False`.

In [None]:
True

In [None]:
False

And using them we can express arbitrary logical operations.

In [None]:
## QUESTION: what are the results of the following operations?

## Unary operator: negation
not True

In [None]:
not False

In [None]:
## Binary operators: 
## conjunction (logical and)
True and True

In [None]:
True and False

In [None]:
False and True

In [None]:
False and False

In [None]:
## Disjunction (logical or)
True or True

In [None]:
True or False

In [None]:
False or True

In [None]:
False or False

So far so good, but what about other basic binary operators we know (right?) from our logic classes?

Usually there are at least two more:

* Implication: $p \Rightarrow q$
* Equivalence: $p \Leftrightarrow q$


This is a good moment to briefly introduce the concept of variables (we will talk more about it soon).
What we would like to do now is to assign logical values (`True` or `False`) to two variables (`p` and `q`) and express implication and equivalence using only the operators we have introduced so far.

In [None]:
## Assigning values to names (variables)
p = True
q = True
## Note that we use a single 'equal' sign to assign to variables
## Double 'equal' is used for comparisons

In [None]:
## NOTE: variables defined in one chunk can be accessed in another one
p

In [None]:
## EXERCISE: define implication
## IMPLICATION: The truth table
# p | q | p => q
# 1 | 1 |   1
# 1 | 0 |   0
# 0 | 1 |   1
# 0 | 0 |   1

not p or q

In [None]:
## EXERCISE: define equivalence
## EQUIVALENCE: The truth table
# p | q | p <=> q
# 1 | 1 |    1
# 1 | 0 |    0
# 0 | 1 |    0
# 0 | 0 |    1
(p and q) or (not p and not q)

Sometimes people also use the so-called `XOR` operator, which is the exclusive or. Can you implement it in Python with basic logical operators?

In [None]:
## EXERCISE: define XOR
## XOR: The truth table
# p | q | p XOR q
# 1 | 1 |  0
# 1 | 0 |  1
# 0 | 1 |  1
# 0 | 0 |  0

(p and not q) or (not p and q)
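
The three expressions above were evaluated for a single pair of values. To check them against the full truth tables, one can try all four combinations of `p` and `q`. The sketch below does that; the function names `implies`, `equivalent`, and `xor` are hypothetical, and it relies on functions, loops, and `itertools.product`, none of which have been introduced yet.

```python
from itertools import product


def implies(p, q):
    return not p or q


def equivalent(p, q):
    return (p and q) or (not p and not q)


def xor(p, q):
    return (p and not q) or (not p and q)


rows = []
for p, q in product([True, False], repeat=2):
    row = (p, q, implies(p, q), equivalent(p, q), xor(p, q))
    rows.append(row)
    print(row)   # one line of the combined truth table

# For booleans, equivalence gives the same answers as ==, and XOR the same as !=.
shortcuts_agree = all(
    eq == (p == q) and x == (p != q) for p, q, _, eq, x in rows
)
print(shortcuts_agree)   # True
```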

### Strings

We have already seen quite a few different types of values, namely integers (whole numbers), floating-point (real) numbers, and logical values. However, it would be nice if we could write something and compute on textual values, right? Fortunately, we can.

In [None]:
"Bob and Alice are two generic persons."

The thing above is a so-called **string** (abbreviated as `str` in _Python_). NOTE: `str` in _Python_ is not the same as `chr` in _R_. That is because Python strings are divisible: they are made up of individual characters that can be accessed separately. We will see it in a bit.

Summing up, we have the following primitive types in _Python_ (sorted by generality):

1. Logical values (or booleans; called `bool` in _Python_)
2. Integers (called `int` in _Python_)
3. Floating-point numbers (called `float` in _Python_)
4. Strings (called `str` in _Python_)

We have already seen that there are quite a lot of different operators that allow us to compute on numbers and booleans. It turns out that there are also some basic operators for computing on strings.

In [None]:
s1 = "Text"
s2 = "More text"

In [None]:
## Adding two strings concatenates them
s1 + s2

In [None]:
## Multiplying a string by an integer repeats it n times
s1*4

In [None]:
## Does it always make sense to multiply a string?
s1 * 2.5

In [None]:
## We can also ask for the length of a string (number of individual characters)
len(s1)
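
The operations above can be collected into one short sketch. It also previews two things that have only been hinted at so far: strings are divisible, so a single character can be picked out by its position (counting starts at 0), and multiplying a string by a float is not allowed. Indexing and `try`/`except` are introduced properly later; the names `first_character` and `error_name` are made up for the example.

```python
s1 = "Text"
s2 = "More text"

print(s1 + " " + s2)   # Text More text
print(s1 * 2)          # TextText
print(len(s2))         # 9, the space counts as a character

# Strings are divisible: individual characters can be accessed by position.
first_character = s1[0]
print(first_character)   # T

# Repeating a string 2.5 times has no meaning, so Python refuses to do it.
try:
    s1 * 2.5
except TypeError as error:
    error_name = type(error).__name__
    print(error_name)    # TypeError
```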

There are many more tricks you can do with strings, but we will not go further into that for now.