# Lesson 1: Introduction, variables and _for_-loops

## About me
Kasper Fyhn Jacobsen

BA in Linguistics from AU

Currently doing the 3rd semester of my MA as an intern at UNSILO (a company that works with NLP)

Mostly self-taught in programming: Java, Python and R

General interest in NLP and computer science

Contact: 201308063@post.au.dk

## Readings
Heinold (2012): _A Practical Introduction to Python Programming_

- Less focus on the underlying computer science, more on practical use
- No linguistics, though

Reading before or after class?

The reading plan is subject to change depending on how well it goes

## First things first
Is everyone set up with Anaconda?

I’m familiar with Windows, Mac and Linux, so just ask if you run into problems

Jupyter Notebook

Who has experience with coding? If so, which language(s) and what kind of projects?

## The format
When introducing concepts, I plan to cover:
- The general concept
- How it works in Python
- Style conventions
- (Potential problems and how to handle them)

## Coding style - why?

Imagine that you are reading an article and come across something like this:

> "Several studies have suggested that aliens would most likely speak a language that we would not be able to produce [11, 13]. In experiments with Martians and Earthling toddlers, we saw that this was not the case (Blargon 2077, pp. 14-17), but the point may still be valid when we talk about aliens from other solar systems due to their different life forms (Johnson 2079: 43)."

Sure, it may work, but it would be much nicer for both the writer and the readers if it had followed a consistent style. The same goes for code!

## Variables

## The concept of a variable

Recall from math that a variable is a "placeholder" for an actual value, e.g. in the Pythagorean theorem:

\begin{equation}
a^2 + b^2 = c^2
\end{equation}

With variables, we can make generalized rules and later fill in actual values for specific cases.

\begin{equation}
a = 3, b = 4 \\
c^2 = 3^2 + 4^2 = 25 \\
c = \sqrt{25} = 5
\end{equation}

## How it works in Python

A variable is declared by being given a name and a value.

Python stores a variable and its value.

We refer to variables by their names; Python keeps track of the values stored in them.

In [3]:
import math

a = 3
b = 4
print('a =', a, ', b =', b)

a = 3 , b = 4


In [4]:
c_squared = a**2 + b**2
c = math.sqrt(c_squared)
print('c² =', c_squared, ', c =', c)

c² = 25 , c = 5.0


## A metaphor

It may be useful to think of a variable as a box containing a value.

Then, when we do something with those variables, we do something with the value(s) contained in the variable(s).

That is, we can put the value `4` in the box labeled `a`. When we do something with `a`, we open the box and take out the value `4`.
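
The box metaphor can be sketched in a few lines. This is only an illustration; the names `a` and `b` and the values are arbitrary choices.

```python
# Put the value 4 in the box labeled a
a = 4

# Open the box a, take out 4, add 1, and put the result in a new box labeled b
b = a + 1

# Put a new value in the box labeled a; the old value 4 is replaced
a = 10

print('a =', a)  # a = 10
print('b =', b)  # b = 5
```

Note that `b` keeps the value `5`: it was calculated from what `a` contained at that moment and does not change when `a` gets a new value later.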

### The example from Heinold (2012: 8)

What value do the variables contain at each line?

In [5]:
x = 3
y = 4
z = x + y
z = z + 1
x = y
y = 5

## The things we put in the variables: basic data types

Numbers are written simply as numbers, e.g. `12` (integers) or `5.67` (floating point numbers).

We can do arithmetic with numbers:

- addition `3 + 4` -> 7
- subtraction `5 - 2` -> 3
- multiplication `3 * 5` -> 15
- division `7 / 3` -> 2.333... 
- exponentiation `3**3` -> 27
- integer division `7 // 3` -> 2 (the remainder of 1 is thrown away)
- modulo `7 % 3` -> 1 (the remainder of integer division)
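
To try the operators above yourself, something like the following cell should work. The names `quotient` and `remainder` are just chosen for this sketch.

```python
print(3 + 4)   # addition: 7
print(5 - 2)   # subtraction: 3
print(3 * 5)   # multiplication: 15
print(7 / 3)   # division: 2.333... (regular division always gives a float)
print(3 ** 3)  # exponentiation: 27

# Integer division and modulo belong together:
quotient = 7 // 3   # how many whole times 3 fits into 7
remainder = 7 % 3   # what is left over
print(quotient, remainder)         # 2 1
print(quotient * 3 + remainder)    # 7, we are back where we started
```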

Characters and strings of characters - _strings_ more generally - are written with single or double quotes around them, e.g. `'c'`, `'a string'`, `"a long string with several characters"` or `"23"`.

We cannot do arithmetic with strings, but we can concatenate them:

- `"Monty" + " " + "Python"` -> `"Monty Python"`
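
A small sketch of concatenation, using made-up variable names. It also shows why `"23"` with quotes is a string and not a number.

```python
first_name = "Monty"
last_name = "Python"

# + glues strings together; we have to add the space ourselves
full_name = first_name + " " + last_name
print(full_name)  # Monty Python

# "23" is a string, so + concatenates instead of adding
number_as_string = "23"
print(number_as_string + "1")  # 231
print(23 + 1)                  # 24
```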

We can also do tons of other things with strings, but more on this later!

## Coding style

Variable names can contain letters, numbers and the underscore, e.g.

- `c_squared`
- `var1` and `var2`
- `a_pretty_long_name`

Variable names cannot contain spaces or start with a number.
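
If you are unsure whether a name is allowed, one way to check is the string method `isidentifier()`, which tests whether a string follows these rules. The candidate names below are just examples.

```python
# Allowed: letters, numbers and underscores, not starting with a number
valid_name = 'c_squared'.isidentifier()
print(valid_name)  # True

# Not allowed: starts with a number
starts_with_number = '1st_var'.isidentifier()
print(starts_with_number)  # False

# Not allowed: contains a space
contains_space = 'my var'.isidentifier()
print(contains_space)  # False
```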

The PEP 8 style guide says to stick to lowercase for regular variable names.

For multiword variables, e.g. `a_pretty_long_name`, the words should be separated by underscores.

`CamelCase` and `CAPITALS` are reserved for other things.

Variable names can make the difference between transparent code and something incomprehensible. We can name variables after what they contain or what they are for, e.g. `cleaned_transcripts` or `longest_utterance`.
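
As an illustration, here is the same calculation written twice. The numbers and names are invented for this sketch.

```python
# Hard to follow: the names say nothing about the values
x = 120
y = 8
z = x / y

# The same calculation with descriptive names
total_words = 120
number_of_utterances = 8
mean_utterance_length = total_words / number_of_utterances

print(mean_utterance_length)  # 15.0
```

Python does not care which version we write, but a reader (including ourselves in a month) does.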

## _for_-loops

## The concept of _for_-loops

Computers are extremely helpful when we have repetitive tasks, or many things that we need to do the same thing to.

Imagine that you want to get the number of speakers in each of your 1000 transcripts. By hand, that would probably take several days. A computer does it in seconds.

The _for_-loop is also called _for each_ in other programming languages. The core idea is to do the same task **for each object in a row of objects**.

That is, you tell the computer: for each transcript in this pile of transcripts, figure out the number of speakers and note it down.

Or you can say, for each number from 1 to 100, give me some calculation with that number.
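
To give an idea of where we are heading, the transcript example might look roughly like this. The data is a hypothetical toy version: each transcript is simply a list of speaker labels, one per utterance. `set()` and `len()` are built-in functions that we have not covered yet.

```python
# Hypothetical toy data: three very short "transcripts"
transcripts = [
    ['A', 'B', 'A', 'B'],
    ['A', 'B', 'C', 'A'],
    ['A', 'A', 'A'],
]

for transcript in transcripts:
    # set() keeps only the distinct labels, len() counts them
    number_of_speakers = len(set(transcript))
    print('Number of speakers:', number_of_speakers)
```

This prints 2, 3 and 1 speakers. With 1000 transcripts in the list, the loop itself would stay exactly the same.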

## How it works in Python

There are two important elements in a _for_-loop:

- The row/list/... that the computer **iterates** over.
- The **loop variable** which changes to the next object in the list for each iteration.

The **list** is central here. You will learn more about these in detail later, but let's have an early look at them now and what we can do with them.

The block of code that should be repeated must be indented!
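
A minimal sketch of what indentation does. The names `number` and `doubled` are arbitrary.

```python
for number in range(1, 4):
    # Indented: these lines run once for each iteration
    doubled = number * 2
    print(number, 'doubled is', doubled)

# Not indented: this line runs only once, after the loop has finished
print('Done')
```

The output is three lines ending in 2, 4 and 6, followed by a single `Done`. PEP 8 recommends four spaces per indentation level.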

In [6]:
print(list(range(10)))

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


In [7]:
list_of_integers = [3, 7, 24, 56, 72]
print(list_of_integers)

[3, 7, 24, 56, 72]


In [8]:
for i in range(10):
    print(i)

0
1
2
3
4
5
6
7
8
9


In [9]:
for number in list_of_integers:
    print(number)

3
7
24
56
72


## More on the loop variable

Generally, we want to do _something_ with the things in our lists. By iterating over the list, we can do it **one by one**.

For each new iteration, the loop variable takes on the next value in line, which we can then do something with.
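
One way to watch the loop variable change is to keep a running total. This is just a sketch; `total` is a name chosen for the example.

```python
list_of_integers = [3, 7, 24, 56, 72]

total = 0
for number in list_of_integers:
    # number holds the next value from the list in each iteration
    total = total + number
    print('number is', number, 'and the total so far is', total)

print('Final total:', total)  # Final total: 162
```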

In [10]:
for i in range(3, 22, 2):
    # the first number is where it should start
    # the second is the number that it should stop before
    # the third number is the value by which the loop variable should be incremented (increased)
    print('i is now', i)
    input()
    i_squared = i**2
    print('i to the power of 2 is', i_squared)
    input()
    

i is now 3

i to the power of 2 is 9

i is now 5

i to the power of 2 is 25

i is now 7

i to the power of 2 is 49

i is now 9

i to the power of 2 is 81

i is now 11

i to the power of 2 is 121

i is now 13

i to the power of 2 is 169

i is now 15

i to the power of 2 is 225

i is now 17

i to the power of 2 is 289

i is now 19

i to the power of 2 is 361

i is now 21

i to the power of 2 is 441



## Nested loops (advanced)

Loops can be nested. That is, we can have a loop within a loop.

In such a case, the inner loop runs all the way through for each iteration in the outer loop.
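
A smaller sketch makes the order of the iterations easier to see. The counter `iterations` is only there to show how many times the innermost block runs.

```python
iterations = 0

for i in range(1, 3):        # outer loop: i is 1, then 2
    for j in range(1, 4):    # inner loop: j is 1, 2, 3 for each value of i
        iterations = iterations + 1
        print('i =', i, ', j =', j)
    # Indented once: runs each time the inner loop has finished
    print('Inner loop finished for i =', i)

print('Total iterations:', iterations)  # Total iterations: 6
```

The inner block runs 2 × 3 = 6 times, which is why nested loops over long lists can quickly produce a lot of output.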

In [12]:
# lots of different triangles

for i in range(1, 11):
    for j in range(1, 11):
        a = i
        b = j
        print('a =', a, ', b =', b)
        c_squared = a**2 + b**2
        c = math.sqrt(c_squared)
        print('c² =', c_squared, ', c =', c)
        print()

a = 1 , b = 1
c² = 2 , c = 1.4142135623730951

a = 1 , b = 2
c² = 5 , c = 2.23606797749979

a = 1 , b = 3
c² = 10 , c = 3.1622776601683795

a = 1 , b = 4
c² = 17 , c = 4.123105625617661

a = 1 , b = 5
c² = 26 , c = 5.0990195135927845

a = 1 , b = 6
c² = 37 , c = 6.082762530298219

a = 1 , b = 7
c² = 50 , c = 7.0710678118654755

a = 1 , b = 8
c² = 65 , c = 8.06225774829855

a = 1 , b = 9
c² = 82 , c = 9.055385138137417

a = 1 , b = 10
c² = 101 , c = 10.04987562112089

a = 2 , b = 1
c² = 5 , c = 2.23606797749979

a = 2 , b = 2
c² = 8 , c = 2.8284271247461903

a = 2 , b = 3
c² = 13 , c = 3.605551275463989

a = 2 , b = 4
c² = 20 , c = 4.47213595499958

a = 2 , b = 5
c² = 29 , c = 5.385164807134504

a = 2 , b = 6
c² = 40 , c = 6.324555320336759

a = 2 , b = 7
c² = 53 , c = 7.280109889280518

a = 2 , b = 8
c² = 68 , c = 8.246211251235321

a = 2 , b = 9
c² = 85 , c = 9.219544457292887

a = 2 , b = 10
c² = 104 , c = 10.198039027185569

a = 3 , b = 1
c² = 10 , c = 3.1622776601683795

a = 3 , b 