**NB** The order of the notebook is not logical: it is simply a dump of the mindmap, followed by `notedown` conversion to `ipynb`!

# Computer programs  
  
“A **program** is a sequence of instructions that specifies how to perform a computation. The computation might be something mathematical, such as solving a system of equations or finding the roots of a polynomial, but it can also be a symbolic computation, such as searching and replacing text in a document or something graphical, like processing an image or playing a video.” (Downey, 2015) 

![Mindmap](Computer-programs-mindmap-20170728.png)

 
## Data types  
  
### Dictionaries, hashes, arrays, etc.  
  
The use of more advanced structures often enables efficient implementation of complex computations (such as signal-processing operations on large datasets). The types of data structures available are highly language-dependent, and using them becomes natural if and when the domain-specific problem calls for it. No need to learn something you won’t use...  
  
### Lists and other collections  
  
A list is an ordered sequence of elements of another type. In Python, we can assign a list to a variable using the following syntax:

In [None]:
one_to_five = [1, 2, 3, 4, 5]  

Depending on the language, the power of lists ranges from great to humongous.

In [None]:
a_crazy_list_in_python = ['one', 2, 3.0, one_to_five]  # lists can contain any objects, including other lists...  

The word ‘list’ is usually reserved for an _ordered_ set of elements. Some languages also have the concept of a ‘collection’ of (unordered) elements (in Python, this is known as a ‘set’).  
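As a minimal sketch of the difference between a list and a set in Python (the variable names are illustrative and not part of the original notebook): a list keeps its elements in order and allows duplicates, whereas a set keeps only unique elements and has no positions to index.

```python
# Illustrative names; not part of the original notebook.
measurements = [3, 1, 3, 2]        # a list: ordered, duplicates allowed

first = measurements[0]            # list elements are addressed by position
measurements.append(5)             # lists can grow after they are created

unique_values = set(measurements)  # a set: unordered, duplicates removed

print(first)
print(len(measurements), len(unique_values))
print(2 in unique_values)          # membership tests work on lists and sets alike
```

Use a list when position or repetition matters, and a set when only membership does.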
  
### Characters and strings  
  
A string is an ordered sequence of (one or more) characters. There is all manner of mayhem associated with strings in programming languages (‘encoding’ is a thorny issue), but we don’t need to get into those here.  
  
In Python, strings are particularly nimble (don’t try this in other languages!):

In [None]:
one_string = 'I am a '  
another_string = 'Horse'  
one_string + another_string  
3 * another_string  

### Floating point numbers  
  
Like integers, decimal numbers need to be stored using some number of bits. ‘Float’ and ‘single’ usually refer to decimal numbers stored using 32 bits, whereas a ‘double’ refers to a 64-bit representation of a decimal number (the naming is not universal: Python’s built-in `float` type is a 64-bit double). It is beyond the scope of this course to go into the details of under which conditions one representation/resolution is more appropriate than the other. Suffice it to say: it’s better to err on the side of caution and use doubles. This is indeed what mainstream interpreted languages such as Python, Matlab and Javascript do by default.  
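The effect of the number of bits can be sketched with Python's standard library alone. The example assumes the usual IEEE 754 formats and uses the `struct` module to squeeze a value through a 32-bit representation and back. The variable names are illustrative.

```python
import struct
import sys

# Number of binary digits in the mantissa of Python's float
# (53 for a 64-bit IEEE 754 double).
print(sys.float_info.mant_dig)

as_double = 0.1

# Pack 0.1 into 4 bytes (a 32-bit 'single'), then read those bytes back.
as_single = struct.unpack('f', struct.pack('f', as_double))[0]

print(as_double)
print(as_single)   # no longer exactly the same number as as_double

# Size of the error introduced by storing the value in only 32 bits
error_single = abs(as_single - as_double)
print(error_single)
```

The error is tiny for a single number, but it can accumulate over long computations, which is one reason to prefer doubles when in doubt.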
  
### Integers  
  
Whole numbers, the maximum size of which depends on how many _bits_ are used to store the number (in memory and/or on disk).  
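To make the link between bits and maximum size concrete, here is a small sketch. It assumes signed integers in the common two's complement representation, and the function name `max_signed_integer` is made up for this illustration.

```python
def max_signed_integer(bits):
    """Largest value of a signed integer stored in `bits` bits (two's complement)."""
    # One bit is used for the sign, leaving bits - 1 for the magnitude.
    return 2 ** (bits - 1) - 1

for bits in (8, 16, 32, 64):
    print(bits, max_signed_integer(bits))

# Python's own int type is not limited to a fixed number of bits:
# it grows as needed, so this does not overflow.
big = 2 ** 64 + 1
print(big)
```

Many other languages, and numerical libraries used from Python, do use fixed-size integers, so the limits above still matter in practice.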
  
## Variables, expressions and statements  
  
Following Chapter 2 in (Downey, 2015).  
  
### Assignment & variables  
  
“One of the most powerful features of a programming language is the ability to manipulate **variables**. A variable is a name that refers to a value.” (Downey, 2015)  
  
Variables are **assigned** values; in _most_ programming languages, the assignment **operator** is the equals sign (`=`). In an **assignment statement**, the variable name to the left of the operator is made to refer to the value on the right. Here are a couple of Python assignments:

In [None]:
message = "Is it time for a break yet?"  
n = 42  
electron_mass_MeV = 0.511  

The programmer is free to choose the _names_ of the variables she wishes to use in the code, as long as they do not violate the language’s syntactic rules.  
  
Common to most languages is that the name of a variable may _never_ begin with a digit, or pretty much any other non-letter character (there are a few exceptions, such as the _underscore_, `_`). The following variable names are therefore _syntactically incorrect_:

In [None]:
2fast = 140  
@home = False  

Note that _a variable name is like a pointer to whatever data the assignment operation dictates_. In interpreted languages, one does not need to pre-define what the variable is to contain: the interpreter figures this out on-the-fly. Similarly, the assignment to a variable can happen multiple times in the ‘lifetime’ of a program: each assignment simply moves the pointer to a new data object.  
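A short sketch of the ‘pointer’ picture in Python (all names are illustrative): the same name can be re-assigned to data of a different type, and two names can point to one and the same object.

```python
value = 42
print(type(value))            # the name refers to an integer

value = 'forty-two'           # re-assignment: the same name now refers to a string
print(type(value))

first_name = [1, 2, 3]
second_name = first_name      # no copy is made: both names refer to the same list
second_name.append(4)

print(first_name)             # the change is visible through either name
same_object = first_name is second_name
print(same_object)
```

The second half is a frequent source of surprises: modifying a list through one name also changes what the other name ‘sees’.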
  
In Python, to see what a variable contains (maybe we forgot!), we can simply write the name of the variable on a line and execute it, or use the built-in `print` function:

In [None]:
n  
print(message)  

### Keywords  
  
Each programming language ‘reserves’ some portion of typically the English (natural) language as **keywords** that are associated with specific (typically low-level) computational operations.  
  
Practically all languages use the keywords `if`, `else` and `for` for ‘control flow’ of program execution. In Python, `not`, `True`, `False` and `None` are among the keywords used to express logic:

In [None]:
two_greater_than_one = (2 > 1)  
print(two_greater_than_one)  
if two_greater_than_one:  
    print('Correct')  

**NB** Keywords cannot be used as variable names! Try, and you will get a `SyntaxError`.  
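Python can report its own keywords through the standard-library `keyword` module. The sketch below also uses the built-in `compile` function to show, without stopping the notebook, that assigning to a keyword is rejected. The file name `'<example>'` is just a placeholder label.

```python
import keyword

print(keyword.iskeyword('if'))        # a reserved word
print(keyword.iskeyword('message'))   # an ordinary name, free to use
print(len(keyword.kwlist))            # how many keywords this Python version reserves

# Try to use the keyword 'for' as a variable name.
try:
    compile('for = 1', '<example>', 'exec')
    outcome = 'compiled'
except SyntaxError:
    outcome = 'SyntaxError'

print(outcome)
```

The exact number of keywords differs slightly between Python versions.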
  
### Expressions and operators  
  
An **expression** is a combination of values, variables, and operators. An expression gets **evaluated** when it is executed, and a value is found for it:

In [None]:
n = 42  
m = n + 25  
print(m)  

A **statement** is a unit of code that has an effect, like creating a variable or displaying a value (there are three statements in the code block above). When you type a statement, the interpreter **executes** it, which means that it does whatever the statement says.  
  
**Operators** are one of the ways we can manipulate the data referenced by our variables. The usual mathematical operators are omnipresent, but things get interesting when we start applying operations to non-numeric objects. In Python, we could _e.g._ multiply a string by an integer, or in Matlab divide a 2D _matrix_ of numbers by a 1D _vector_ to obtain the least-squares estimate of the solution to a set of linear equations! But we digress...  
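A few Python operators in action, as a brief sketch. The meaning of an operator depends on the type of the objects it is applied to.

```python
# The usual arithmetic operators on numbers
print(7 / 2)     # division: 3.5
print(7 // 2)    # integer (floor) division: 3
print(7 % 2)     # remainder: 1
print(2 ** 10)   # exponentiation: 1024

# The same symbols mean something else for non-numeric objects
repeated = 3 * 'ab'     # repeat a string
joined = [1, 2] + [3]   # concatenate two lists

print(repeated)
print(joined)
```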
  
## Programming languages (Downey, 2015)  
  
**Natural languages** are the languages people speak, such as English, Spanish, and French. They were not designed by people (although people try to impose some order on them); they evolved naturally.   
  
**Formal languages** are languages that are designed by people for specific applications. For example, the notation that mathematicians use is a formal language that is particularly good at denoting relationships among numbers and symbols. Chemists use a formal language to represent the chemical structure of molecules. And most importantly:   
  
**Programming languages are formal languages that have been designed to express computations.**  
  
### Textual “code” files  
  
The set of formal statements that constitute a program are known as **code**. The programmer writes code into _files_ that are saved using a file extension that indicates the language the code is written in; examples include:  
  
* _.py_ for Python  
* _.c_ for C  
* _.m_ for Matlab  
  
Note that code files are **textual**, _i.e._, stored in a human-readable format (as opposed to machine code).  
  
### Syntax  
  
Formal programming languages have associated _syntax rules_ that come in two flavors: one pertaining to **tokens** and another to **structure** [how tokens may be combined]. Tokens are the basic elements of the language, such as words, numbers, and chemical elements. Programming languages differ greatly in both the specific form of the tokens used and the structures they may form.  
  
One of the most common error messages a programmer encounters when beginning to use a new language is the infamous: `Syntax Error`! For learners of natural languages, structural errors much more common are ;)  
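The distinction between tokens and structure can be made visible with Python's standard `tokenize` and `ast` modules. This is only a sketch: the helper names `list_tokens` and `has_valid_structure` are made up for this illustration.

```python
import ast
import io
import tokenize

def list_tokens(source):
    """Return (kind, text) pairs for the names, numbers and operators in `source`."""
    wanted = (tokenize.NAME, tokenize.NUMBER, tokenize.OP)
    readline = io.StringIO(source).readline
    return [(tokenize.tok_name[tok.type], tok.string)
            for tok in tokenize.generate_tokens(readline)
            if tok.type in wanted]

def has_valid_structure(source):
    """Check whether the tokens in `source` are combined in a legal way."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False

# Valid tokens, valid structure
print(list_tokens('m = n + 25\n'))
print(has_valid_structure('m = n + 25\n'))

# Every token is valid on its own, but the combination '+*' is not
print(list_tokens('3 +* 2\n'))
print(has_valid_structure('3 +* 2\n'))
```

In the second example each token is fine in isolation, so the `SyntaxError` comes from the structure rather than from the tokens.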
  
### Unambiguous and literal  
  
“The meaning of a computer program is unambiguous and literal, and can be understood entirely by analysis of the tokens and structure.  
  
Formal languages are more dense than natural languages, so it takes longer to read them. Also, the structure is important, so it is not always best to read from top to bottom, left to right. Instead, learn to parse the program in your head, identifying the tokens and interpreting the structure. Finally, the details matter. Small errors in spelling and punctuation, which you can get away with in natural languages, can make a big difference in a formal language.” (Downey, 2015)  
  
### Comments  
  
Partly because programming languages have rather terse syntax (some are worse than others!), it is considered a _good custom_ to annotate the computational “business”-portions of programs with __comments__. Comments are portions of the _code_ that are not **parsed** by either the interpreter or the compiler, _i.e._, these are “left out” from the translation to machine instructions. Comments are thus exempt from the syntax-checking and meaning-parsing applied to all other code.  
  
Here is an example of a comment in the Python language. Whenever the hash-sign (‘#’) is encountered, parsing of the current line is stopped and the parser (interpreter or compiler) moves to the next line.

In [None]:
2  # this is the number two  

## Compiled programs/languages  
  
This is the ‘traditional’ way of thinking about programming, as a two-stage process of execution:  
  
1. a **compiler** (_e.g._, gcc) passes through _all code_, checks it (for syntax & structure), then writes out **machine code**  
1. the **non-human-readable** machine code can be executed as a ‘program’  
  
The ‘compilation’ stage allows, amongst other things, the compiler to _optimize_ the execution of the CPU-level instructions with respect to more-or-less detailed knowledge of the processor and memory layout of the computer performing the computations. Note also that once a program has been compiled to be executable, it is no longer human-readable.  
  
Examples of (important) compiled languages include:  
  
- Fortran  
- Java  
- C / C++ / C#  
  
## Interpreted programs/languages  
  
Interpreted programs are executed line-by-line by an **interpreter**; execution stops at the first error (fail-on-error).  
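The fail-on-error behaviour can be mimicked in a few lines of Python. This is a simplified picture rather than how the interpreter is actually implemented (Python, for one, checks the syntax of a whole file before running any of it). The four-line ‘program’ is hypothetical, and its third line fails at run time.

```python
# Hypothetical four-line 'program'; the third line divides by zero.
program_lines = [
    "x = 1",
    "y = x + 1",
    "z = y / 0",
    "w = 99",
]

variables = {}   # holds the variables created by the program
completed = 0    # number of lines that ran without error

for line in program_lines:
    try:
        exec(line, variables)   # run a single line
    except ZeroDivisionError as error:
        print('stopped at line', completed + 1, ':', error)
        break                   # fail-on-error: nothing after this line is run
    completed += 1

print(variables['y'])     # lines before the error did take effect
print('w' in variables)   # the line after the error was never executed
```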
  
### Javascript  
  
- used to make webpages interactive and provide online programs, including video games.  
  
### Matlab  
  
- GUI by Mathworks (costly license)  
- Numerical algorithms public domain  
  
### Python / IPython  
  
- command-line interface  
- open-source, free software  
  
### Jupyter notebook  
  
“The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and explanatory text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, machine learning and much more.  
  
The Notebook has support for over 40 programming languages, including those popular in Data Science such as Python, R, Julia and Scala.” ([jupyter.org][1])  
  
Common to them all is the interactive, _interpreted_ nature.  
  
## References  
  
Allen Downey, _Think Python: How to Think Like a Computer Scientist_, 2nd edition, Green Tea Press, 2015. URL: http://thinkpython2.com  
  
Quotes from the book, which may be freely downloaded [here](http://thinkpython2.com), are used under the terms of the Creative Commons Attribution-NonCommercial 3.0 Unported License, which is available at http://creativecommons.org/licenses/by-nc/3.0/.  
  
## Exercises  
  
### Syntax  
  
Try running these code blocks. Some of them have syntax appropriate to the Python programming language, others are invalid; can you predict which is which?

In [None]:
3 + 2  
Add together the numbers 3 and 2  
sum(3, 2)  # this is tricky, we’ll return to it later  
3 + -2  
3 + 2 .  
3 +* 2  
3 ** 2  

### Variables  
  
* In math notation you can multiply `x` and `y` like this: `xy`. What happens if you try that in Python? Why?  
* Suppose the cover price of a book is 229 kr, but bookstores get a 40% discount. Shipping costs 49 kr for the first copy and 3 kr for each additional copy. What is the total wholesale cost for 60 copies? _Write a sequence of statements, using variable assignments and expressions/operators, and print the answer._  
  
### Data types  
  
### Functions  
  
**Before running it!**, figure out what the output of this code block will be:

In [None]:
def double_the_input(input):  
	return(2 * input)  
  
def add_two_items(item_a, item_b):  
	return(item_a + item_b)  
  
a = 4.5  
b = 7  
  
print(add_two_items(double_the_input(a), b))  

### Debugging  
  
#### Debugging errors  
  
Fix the code below until it runs through without error. Hint: use the `print` function to inspect the contents of variables. You should be able to make the code run even without understanding what it is doing!

In [None]:
from glob import glob  
import numpy as np  

## Debugging  
  
Errors and mistakes inevitably find their way into code. The execution (or compilation) of a program stops at an error, after which it is your job to   
  
By ‘mistake’, we here refer to something less than an ‘error’, _i.e._, the program may  
  
## Conditionals and iteration  
  
See Chapters 5 and 7 in (Downey, 2015)  
  
## Functions  
  
“In the context of programming, a **function** is a named sequence of statements that performs a computation. When you define a function, you specify the name and the sequence of statements. Later, you can “call” the function by name.   
  
A function call is like a detour in the flow of execution. Instead of going to the next statement, the flow jumps to the body of the function, runs the statements there, and then comes back to pick up where it left off.  
  
That sounds simple enough, until you remember that one function can call another. While in the middle of one function, the program might have to run the statements in another function. Then, while running that new function, the program might have to run yet another function!  
  
In summary, when you read a program, you don’t always want to read from top to bottom. Sometimes it makes more sense if you follow the flow of execution.” (Downey, 2015; Chapter 3)  
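The ‘detour’ can be traced with a small sketch, in which each step records itself in a list so the order of execution can be inspected afterwards. The function names `inner` and `outer` are illustrative.

```python
call_order = []   # records the order in which things happen

def inner():
    call_order.append('inner')

def outer():
    call_order.append('outer starts')
    inner()                          # detour into another function...
    call_order.append('outer ends')  # ...then pick up where we left off

call_order.append('before call')
outer()
call_order.append('after call')

print(call_order)
```

Note that the two `def` blocks come first in the file but contribute nothing to the list until the functions are actually called.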
  
### Arguments  
  
### Why functions?  
  
But is there a deeper point/advantage in using functions instead of just writing out the code you want to have executed? One reason is that: “Creating a new function gives you an opportunity to name a group of statements, which makes your program easier to read and debug. [and] Well-designed functions are often useful for many programs. Once you write [...] one, you can reuse it.” A third reason relates to efficiency: “Functions can make a program smaller by eliminating repetitive code. Later, if you make a change, you only have to make it in one place.” (Downey, 2015)  
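A small sketch of the ‘one place’ argument, using the area of a circle as a stand-in computation (the function name `circle_area` is made up for this example):

```python
import math

# Without a function: the same formula is typed out for every value,
# including a hand-written approximation of pi that must be kept in sync.
area_small = 3.14159 * 1.0 ** 2
area_large = 3.14159 * 2.5 ** 2

# With a function: the formula has a name and lives in one place.
def circle_area(radius):
    """Area of a circle with the given radius."""
    return math.pi * radius ** 2

print(area_small, circle_area(1.0))
print(area_large, circle_area(2.5))
```

Switching from the rough `3.14159` to `math.pi` required touching every line in the first version, but only the body of `circle_area` in the second.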
  
### Namespaces  
  
## Scripts  
  
When working in a field with computational elements, you are likely to come across the term *script*.  
  
  
[1]: http://jupyter.org