**Table of contents**<a id='toc0_'></a>    
- 1. [Your first notebook session](#toc1_)    
  - 1.1. [Execution of code in cells](#toc1_1_)    
- 2. [Fundamentals](#toc2_)    
  - 2.1. [The work flow of your computer](#toc2_1_)    
  - 2.2. [What is a variable?](#toc2_2_)    
- 3. [Atomic types](#toc3_)    
  - 3.1. [Type conversion](#toc3_1_)    
  - 3.2. [Operators](#toc3_2_)    
  - 3.3. [Augmentation](#toc3_3_)    
  - 3.4. [Logical operators](#toc3_4_)    
  - 3.5. [Summary](#toc3_5_)    
- 4. [Containers](#toc4_)    
  - 4.1. [Lists](#toc4_1_)    
    - 4.1.1. [Slicing a list](#toc4_1_1_)    
    - 4.1.2. [Referencing](#toc4_1_2_)    
  - 4.2. [Tuples](#toc4_2_)    
  - 4.3. [Dictionaries](#toc4_3_)    
  - 4.4. [Summary](#toc4_4_)    
- 5. [Extra](#toc5_)    
  - 5.1. [SimpleNamespace](#toc5_1_)    


Types and operators
==============

You will be given an in-depth introduction to the **fundamentals of Python** (objects, variables, operators, classes, methods, functions, conditionals, loops). You will learn to distinguish between different **types** such as integers, floats, strings, lists, tuples and dictionaries, and to determine whether they are **subscriptable** (sliceable) and/or **mutable**. You will learn about **referencing** and **scope**, and a tiny bit about **floating point arithmetic**.

**Take-away:** This lecture is rather abstract compared to the rest of the course. The central take-away is **a language** to speak about programming in: an overview of the map, before we later study the terrain in detail. It is not about **memorizing**. Almost no code projects begin from scratch; you start by copying in similar code you have written for another project.

Hopefully, this notebook can later be used as a **reference sheet**. When you are done with the DataCamp courses, read through this notebook, play around with the code, and ask questions if there is stuff you do not understand. 

**Links:**

* **Tutorial:** A more detailed tutorial is provided [here](https://www.python-course.eu/python3_course.php).
* **Markdown:** All text cells are written in *Markdown*. A guide is provided [here](https://www.markdownguide.org/basic-syntax/).

# 1. <a id='toc1_'></a>[Your first notebook session](#toc0_)

**Optimally:** You have this notebook open as well on your own computer.

**Download course material**

1. Follow the [installation guide](https://sites.google.com/view/numeconcph-introprog/installation)
1. Follow the download part of the [git guide](https://sites.google.com/view/numeconcph-introprog/guides/git)

**Updating your local version of a notebook.**

* 1: Close down all tabs.
* 2: Press the tab **Git**.
* 3: Press **Open Git Repository in Terminal**
* 4: Make sure that you are in the repository folder you want to update (`IntroProg-lectures` or `IntroProg-exercises`, or your own repo).  
    * On **Windows** write `cd`. 
    * On **Mac** write `pwd`. 
    * This will display your current location. 
* 5: **See if YOU have any changes**
    * Write `git status`. 
    * Note if it says `modified: 02_Fundamentals/Primitives_Types_and_operators.ipynb`, or shows modifications to **other files**.
* 6: **View incoming changes**
    * Write `git fetch`
    * Write `git diff --name-status main..origin/main` 

* 7: **Remove conflicting notebooks**
    * Were **any** of the files listed in Step 6 **also found** on the list produced in Step 5? E.g. `02_Fundamentals/Primitives_Types_and_operators.ipynb` in both places?
    * If there are any overlaps (conflicts), you need to discard your own changes (you'll learn to stash later). 
    * Of course, if you made notes or experiments that you want to keep, you can always **make a copy** of your conflicting file and keep that. Just use a good old copy-paste and give your own file a new name.  
    * Then write `git checkout -- 02_Fundamentals/Primitives_Types_and_operators.ipynb` **only if** there was a conflict for that file. Do so with **all** overlapping files.
* 8: **Accept incoming changes**
    * Write `git merge`

As you may have guessed, the command `git pull` is, by default, **equivalent** to a `git fetch` followed by a `git merge`. 

**Note:** This guide is _only a rough start_, meant to avoid all conflicting updates. You will soon learn to do better and **not** have to discard all your local changes in case of overlaps.  

**PROBLEMS?** Ask your teaching assistant ASAP.

## 1.1. <a id='toc1_1_'></a>[Execution of code in cells](#toc0_)

* **Movements**: Arrows and scrolling
* **Run cell and advance:** <kbd>Shift</kbd>+<kbd>Enter</kbd>
* **Run cell**: <kbd>Ctrl</kbd>+<kbd>Enter</kbd>
* **Edit:** <kbd>Enter</kbd>
* **Toggle sidebar:** <kbd>Ctrl</kbd>+<kbd>B</kbd>
* **Change to markdown cell:** <kbd>M</kbd>
* **Change to code cell:** <kbd>Y</kbd>

# 2. <a id='toc2_'></a>[Fundamentals](#toc0_)


**Computers**

Before we start looking at code, let's check out the structure of a **computer**:

<img src="computer.gif" alt="computer" width=50% />

## 2.1. <a id='toc2_1_'></a>[The work flow of your computer](#toc0_)
* You give it some command through an input device (e.g. keyboard)
* The control unit figures out if any new data from the hard disk (external storage) is needed
* If it is needed, the control unit loads that data and puts it at an **address** in memory
* From memory, the data can be accessed by the arithmetic unit in the CPU to do the prompted calculations 
* The resulting data is stored in memory
* The control unit can then pass the resulting data in memory to output devices (e.g. screen) and to the hard disk

**Figuratively speaking**: 
* **Memory** is like a well organized file cabinet, where all data is neatly stored and quickly accessible.
* Each drawer of this "file cabinet" is an *address* where data can be stored. 
* We can see the address of any variable in memory by applying the function `id()`.   
* In turn, the **hard disk** is like a cellar with reports in boxes. It contains much more data but is also slower to retrieve from.  

## 2.2. <a id='toc2_2_'></a>[What is a variable?](#toc0_)

* A variable in Python is thus a **reference** (or *pointer*) to a place in memory where data resides.
* There are *many* types of variables. 
* Some store data **directly**, some are **containers** for data.
* There are **4 types** of data:
    * Booleans (true/false)
    * Integers
    * Floats (decimal numbers)
    * Strings
* The 4 kinds of data use **different amounts of memory** per unit. It is important not to waste memory! 
* A variable that references one of these data types **directly** is an **Atomic type**.  
    The data of an atomic type is **unchangeable** at the address. 
* Variable types that are containers are eg. **lists**, **dictionaries**, **data frames**, etc.  
    Their data is allowed to change.
* **All variables are objects**: bundles of data and functions. 
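
To make the last bullet concrete, here is a small sketch showing that even a plain number and a string carry functions (called *methods*) with them. The variable names `n` and `s` are only illustrative.

```python
n = 5
s = 'abc'

# Methods are functions bundled with the object; they are called with a dot
print(n.bit_length())  # number of bits needed to represent 5 (binary 101)
print(s.upper())       # returns a new upper-case string; s itself is unchanged

# isinstance checks whether a value is an object of a given type
print(isinstance(n, int), isinstance(s, str))
```

Which methods exist depends on the type of the object, which is one reason why types matter.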

# 3. <a id='toc3_'></a>[Atomic types](#toc0_)

The simplest types are called **atomic**, because their values cannot be changed in place, only overwritten by a new object. 

**Integers (int):** -3, -2, -1, 0, 1, 2, 3, $\ldots, \infty$  
There is no cap on the size of ints in Python!

In [46]:
import sys # Don't worry about this, it is a built in module which allows us to see how much memory objects fill on our computer

In [47]:
# variable x references an integer type object with a value of 1
x = 1 

print('x is a', type(x)) # prints the type of x
print('x =', x)
print('Address of x is',id(x)) 
print('x uses',sys.getsizeof(x),'bytes')

x = x*2
print('\nNote that the address is new, as x gets a new value!')
print('x =', x)
print('Address of x is',id(x)) 

x is a <class 'int'>
x = 1
Address of x is 2482308540720
x uses 28 bytes

Note that the address is new, as x gets a new value!
x = 2
Address of x is 2482308540752
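
As a rough illustration of the claim that ints have no size cap, the sketch below builds a number far larger than a 64-bit integer can hold. The names `small` and `big` are only illustrative, and the byte counts reported by `sys.getsizeof` are specific to the Python implementation you run.

```python
import sys

small = 1
big = 2**100  # far beyond the 64-bit range, yet still an exact int

print(type(big))
print(big.bit_length())  # number of bits needed to represent the value
print(sys.getsizeof(small), sys.getsizeof(big))  # larger ints use more memory
```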


**Decimal numbers (float)**: 3.14, 2.72, 1.0, etc.

In [2]:
x = 1.2
# variable x references a floating point (decimal number) type object 
# with a value of 1.2 

print('x is a',type(x))
print('x =',x)
print('x uses',sys.getsizeof(x),'bytes')

x is a <class 'float'>
x = 1.2
x uses 24 bytes
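
A float uses a fixed number of bytes, so most decimal numbers can only be stored approximately. The sketch below is a minimal illustration of this; `math.isclose` is the standard-library way to compare floats with a tolerance.

```python
import math

a = 0.1 + 0.2
print(a)                     # slightly off from 0.3
print(a == 0.3)              # False: exact comparison of floats is fragile
print(math.isclose(a, 0.3))  # True: comparison within a small tolerance
```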


**Strings (str)**: 'abc', '123', 'this is a full sentence', etc.

In [3]:
x = 'abc' 
# variable x references a string type object 
# with a value of 'abc'

print('x is a',type(x))
print('x =',x)
print('x uses',sys.getsizeof(x),'bytes')

x is a <class 'str'>
x = abc
x uses 52 bytes


**Note:** Alternatively, use double quotes instead of single quotes.

In [4]:
x = "abc" 
# variable x references a string type object 
# with a value of 'abc'

print('x is a',type(x))
print('x =',x)
sys.getsizeof("abc")

x is a <class 'str'>
x = abc


52

**Booleans (bool)**: True and False

In [5]:
x = True 
# variable x references a boolean type object 
# with a value of True

print('x is a',type(x))
print('x =',x)
print('x uses',sys.getsizeof(x),'bytes')

x is a <class 'bool'>
x = True
x uses 28 bytes


**Atomic types:**

1. Integers, *int*
2. Floating point numbers, *float*
3. Strings, *str*
4. Booleans, *bool*

## 3.1. <a id='toc3_1_'></a>[Type conversion](#toc0_)

Objects of one type can (sometimes) be **converted** into another type.  
This obviously changes the address of an atomic type.  
As an example, from float to string:

In [6]:
x = 1.2
# variable x references a floating point (decimal number) type object 
# with a value of 1.2 

y = str(x) 
# variable y now references a string type object 
# with a value created based on x 

print('x =', x)
print('x is a',type(x))
print('y =', y)
print('y is a',type(y))

x = 1.2
x is a <class 'float'>
y = 1.2
y is a <class 'str'>


From float to integer: the decimal part is **always** dropped (truncation toward zero), so positive numbers are rounded down.

In [7]:
x = 2.9

y = int(x) # variable y now references an integer type object  
print('x =', x)
print('y =', y)
print('y is a',type(y))

x = 2.9
y = 2
y is a <class 'int'>
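
The behavior of `int()` matters most for negative numbers, where dropping the decimal part is not the same as rounding down. A short sketch comparing it with `math.floor` and the built-in `round`:

```python
import math

print(int(2.9))          # 2
print(int(-2.9))         # -2: the decimal part is dropped (toward zero)
print(math.floor(-2.9))  # -3: floor always rounds down
print(round(2.9))        # 3: round goes to the nearest integer
```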


**Limitation:** Not every conversion is possible. You cannot, for example, convert a string **with letters** to an integer.

In [8]:
try: # try to run this block
    x = int('222a')
    print('can be done')
    print(x)
except: # if any error found run this block instead
    print('canNOT be done')

canNOT be done
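
A slightly narrower variant of the cell above catches only the error that `int()` raises for an invalid string, namely `ValueError`. The helper name `to_int_or_none` is hypothetical and only used for illustration.

```python
def to_int_or_none(text):
    """Hypothetical helper: return int(text), or None if text is not a valid integer."""
    try:
        return int(text)
    except ValueError:  # raised by int() for strings such as '222a'
        return None

print(to_int_or_none('222'))   # conversion works
print(to_int_or_none('222a'))  # conversion fails, so None is returned
```

Catching a specific error type means that unrelated mistakes (e.g. a misspelled variable name) are still reported instead of being silently hidden.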


**Note**: The indentation is required (typically 4 spaces).

**Question**: Can you convert a boolean variable `x = False` to an integer?

- **A:** No
- **B:** Yes, and the result is 0
- **C:** Yes, and the result is 1
- **D:** Yes, and the result is -1
- **E:** Don't know

## 3.2. <a id='toc3_2_'></a>[Operators](#toc0_)

Variables can be combined using **arithmetic operators** (e.g. `+`, `-`, `/`, `**`).<br>For numbers we have:

In [9]:
x = 3
y = 2
print(x+y)
print(x-y)
print(x/y)
print(x*y)
print(x**2)

5
1
1.5
6
9


For strings we can use an overloaded '+' for concatenation:

In [10]:
x = 'abc'
y = 'def'
print(x+y)

abcdef


A string can also be multiplied by an integer:

In [11]:
x = 'abc'
y = 2
print(x*y)

abcabc


**Question**: What is the result of `x = 3**2`?

- **A:** `x = 3`
- **B:** `x = 6`
- **C:** `x = 9`
- **D:** `x = 12`
- **E:** Don't know


**Note:** Standard division converts integers to floating point numbers.

In [12]:
x = 8
y = x/2 # standard division
z = x//3 # integer division
print(y,type(y))
print(z,type(z))

4.0 <class 'float'>
2 <class 'int'>
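
Integer division is usually paired with the remainder operator `%`. The sketch below shows how the two fit together, and that `//` rounds *down* for negative numbers; the variable names `q` and `r` are only illustrative.

```python
x = 8
q = x // 3  # integer division: 2
r = x % 3   # remainder: 2
print(q, r)
print(q*3 + r == x)  # True: quotient and remainder reconstruct x
print(divmod(x, 3))  # both results at once, as a tuple

print(-7 // 2)  # -4: floor division rounds down, not toward zero
print(-7 % 2)   # 1: the remainder takes the sign of the divisor
```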


## 3.3. <a id='toc3_3_'></a>[Augmentation](#toc0_)

Variables can be changed using **augmentation operators** (augmented assignment, e.g. `+=`, `-=`, `*=`, `/=`).

In [13]:
x = 3 
print('x =',x)

x += 1 # same result as x = x+1
print('x =',x)
x *= 2 # same result as x = x*2
print('x =',x)
x /= 2 # same result as x = x/2
print('x =',x)

x = 3
x = 4
x = 8
x = 4.0


## 3.4. <a id='toc3_4_'></a>[Logical operators](#toc0_)

Variables can be compared using **comparison operators** (e.g. `==`, `!=`, `<`, `<=`, `>`, `>=`), which return booleans. 

In [14]:
x = 3
y = 2
z = 10
print(x < y) # less than
print(x <= y) # less than or equal
print(x != y) # not equal
print(x == y) # equal

False
False
True
False


The comparison returns a boolean variable:

In [15]:
z = x < y # z is now a boolean variable
print(z)

False
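
Booleans from comparisons can be combined with the keywords `and`, `or` and `not`. A minimal sketch, reusing the values of `x`, `y` and `z` from above:

```python
x = 3
y = 2
z = 10

print(x > y and z > x)  # True: both comparisons hold
print(x < y or z > x)   # True: at least one comparison holds
print(not x < y)        # True: negation of False
print(y < x < z)        # True: chained comparison, same as (y < x) and (x < z)
```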


## 3.5. <a id='toc3_5_'></a>[Summary](#toc0_)

The new central concepts are:

1. Variable
2. Reference
3. Object
4. Type (int, float, str, bool)
5. Value
6. Operator (+, -, *, **, /, //, % etc.)
7. Augmentation (+=, -=, *=, /= etc.)
8. Comparison (==, !=, <, <= etc.)

# 4. <a id='toc4_'></a>[Containers](#toc0_)

* A more complicated type of variable is a **container**.  
* This is an object, which consists of several objects, for instance atomic types.  
* Therefore, containers are also called **collection types**. 
* **Types of containers**
    * Lists
    * Dictionaries
    * Tuples 
    * Pandas data frames
    * ...

## 4.1. <a id='toc4_1_'></a>[Lists](#toc0_)

A first example is a **list**.  A list contains **elements** each **referencing** some data in memory.

In [16]:
x = [1,'abc'] 
# variable x references a list type object with elements
# referencing 1 and 'abc'

print(x,'is a', type(x))

[1, 'abc'] is a <class 'list'>


The **length** (size) of a list can be found with the **len** function.

In [17]:
print(f'the number of elements in x is {len(x)}')

the number of elements in x is 2


A list is **subscriptable** and starts, like all sequences in Python, from **index 0**. Beware!

In [18]:
print(x[0]) # 1st element 
print(x[1]) # 2nd element

1
abc


A list is **mutable**, i.e. you can change its elements on the fly.  
That is, you can change its **references** to objects.

In [19]:
x[0] = 'def'
x[1] = 2
print('x =', x, 'has id =',id(x))

# Change x[1]
x[1] = 5
print('x =', x, 'has id =',id(x))

x = ['def', 2] has id = 2482392800448
x = ['def', 5] has id = 2482392800448


and add more elements

In [20]:
x.append('new_element') # add new element to end of list
print(x)

['def', 5, 'new_element']


**Link:** [Why is 0 the first index?](http://python-history.blogspot.com/2013/10/why-python-uses-0-based-indexing.html)  

### 4.1.1. <a id='toc4_1_1_'></a>[Slicing a list](#toc0_)

A list is **slicable**, i.e. you can extract a list from a list.

In [21]:
x = [0,1,2,3,4,5]
print(x[0:3]) # x[0] included, x[3] not included
print(x[1:3])
print(x[:3])
print(x[1:])
print(x[:99]) # Out-of-range slice bounds are clipped, no error is raised (unlike x[99])
print(x[:-1]) # x[-1] is the last element

print(type(x[:-1])) # Slicing yields a list
print(type(x[-1])) # Indexing (no colon) yields the element itself, here an int

[0, 1, 2]
[1, 2]
[0, 1, 2]
[1, 2, 3, 4, 5]
[0, 1, 2, 3, 4, 5]
[0, 1, 2, 3, 4]
<class 'list'>
<class 'int'>


**Explanation:** 
* Slices are half-open intervals. 
* ``x[i:i+n]`` means starting from element ``x[i]`` and create a list of (up to) ``n`` elements.
* Sort of nice if you have calculated ``i`` and know you need ``n`` elements. 

In [22]:
# splitting a list at x[3] and x[5] is: 
print(x[0:3])
print(x[3:5])
print(x[5:])

[0, 1, 2]
[3, 4]
[5]
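
The "(up to) `n` elements" part can be sketched by cutting a list into pieces of length `n`. The names `first` and `second` are only illustrative.

```python
x = [0, 1, 2, 3, 4, 5]
n = 4

first = x[0:0+n]   # starts at x[0], takes n elements
second = x[n:n+n]  # starts at x[n], only 2 elements are left

print(first)
print(second)
print(first + second == x)  # half-open intervals fit together without overlap
```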


**Question**: Consider the following code:

In [23]:
x = [0,1,2,3,4,5]

What is the result of `print(x[-4:-2])`?

- **A:** [1,2,3]
- **B:** [2,3,4]
- **C:** [2,3]
- **D:** [3,4]
- **E:** Don't know

### 4.1.2. <a id='toc4_1_2_'></a>[Referencing](#toc0_)
**Container types, incl. lists, are non-atomic** 
* Several variables can refer to the **same** list.
* If you change the data of a list that **one** variable refers to, **you change them all**.
* Variables referring to the same object have the same id.

In [24]:
x = [1,2,3]
print('initial x =',x)
print('id of x is',id(x))
y = x # y now references the same list as x
print('id of y is',id(y))
y[0] = 2 # change the first element in the list y
print('x =',x) # x is also changed because it references the same list as y

initial x = [1, 2, 3]
id of x is 2482391841600
id of y is 2482391841600
x = [2, 2, 3]


If you want to know if two variables contain the same reference, use the **is** operator. 

In [25]:
print(y is x) 
z = [1,2]
w = [1,2] 
print(z is w) # z and w have the same numerical content, but do not reference the same object. 

True
False
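
The `is` operator should not be confused with `==`, which compares content rather than identity. A small sketch of the difference (the variable `v` is only illustrative):

```python
z = [1, 2]
w = [1, 2]
v = z

print(z == w)  # True: same content
print(z is w)  # False: two separate list objects
print(v is z)  # True: v and z reference the same list object
```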


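**Conclusion:** The `=` sign copies the reference, not the content!

Atomic types cannot be changed in place. An augmentation such as `z += 5` therefore makes `z` reference a new object, while `w` keeps referencing the old one.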

In [26]:
z = 10
w = z
print(z is w) # w is now the same reference as z
z += 5
print(z, w)
print(z is w) # z was overwritten in the augmentation statement. 

True
15 10
False


If one variable is deleted, the other one still references the list.

In [27]:
del x # delete the variable x
print(y)

[2, 2, 3]


Containers should be **copied** by using the copy-module:

In [28]:
from copy import copy

x = [1,2,3]
y = copy(x) # y now a copy of x
y[0] = 2
print(y)
print(x) # x is not changed when y is changed
print(x is y) # as they are not the same reference

[2, 2, 3]
[1, 2, 3]
False


or by slicing:

In [29]:
x = [1,2,3]
y = x[:] # y now a copy of x
y[0] = 2
print(y)
print(x) # x is not changed when y is changed

[2, 2, 3]
[1, 2, 3]


**Advanced**: A **deepcopy** is necessary when the list contains mutable objects that the copy should not share with the original.

In [30]:
from copy import deepcopy

a = [1,2,3]
x = [a,2,3] # x is a list of a list and two integers
y1 = copy(x) # y1 now a copy x
y2 = deepcopy(x) # y2 is a deep copy

a[0] = 10 # change1
x[-1] = 1 # change2
print(x) # Both changes happened
print(y1) # y1[0] references the same list as x[0]. Only change1 happened 
print(y2) # y2[0] is a copy of the original list referenced by x[0]

[[10, 2, 3], 2, 1]
[[10, 2, 3], 2, 3]
[[1, 2, 3], 2, 3]


**Question**: Consider the following code:

In [31]:
x = [1,2,3]
y = [x,x]
z = x
z[0] = 3
z[2] = 1

What is the result of `print(y[0])`?

- **A:** 1
- **B:** 3
- **C:** [3,2,1]
- **D:** [1,2,3]
- **E:** Don't know

## 4.2. <a id='toc4_2_'></a>[Tuples](#toc0_)

* A **tuple** is an **immutable list**.
* Tuples are created with parentheses (round brackets), `t = (1,3,9)`.
* As with lists, elements are accessed by brackets, `t[0]`. 
* **Immutable:** `t[0]=10` will produce an error.
* We use tuples to pass variables around that should not change by accident.  
* **Functions** will **output** tuples if you specify multiple output variables.
* Tuples can also be used as arguments to functions.

In [32]:
x = (1,2,3) # note: parentheses instead of square brackets
print('x =',x,'is a',type(x))
print('x[2] =', x[2], 'is a', type(x[2]))
print('x[:2] =', x[:2], 'is a', type(x[:2]))

x = (1, 2, 3) is a <class 'tuple'>
x[2] = 3 is a <class 'int'>
x[:2] = (1, 2) is a <class 'tuple'>


But it **cannot be changed** (it is immutable):

In [33]:
try: # try to run this block
    x[0] = 2
    print('did succeed in setting x[0]=2')
except: # if any error found run this block instead
    print('did NOT succeed in setting x[0]=2')
print(x)

did NOT succeed in setting x[0]=2
(1, 2, 3)
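
The remark above that functions output tuples when there are multiple output variables can be sketched as follows. The function `min_and_max` is a hypothetical example, not something used elsewhere in the course.

```python
def min_and_max(values):
    """Hypothetical helper returning two results at once."""
    return min(values), max(values)  # the comma builds a tuple

result = min_and_max([3, 1, 2])
print(result, type(result))

low, high = min_and_max([3, 1, 2])  # the tuple is unpacked into two variables
print(low, high)
```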


## 4.3. <a id='toc4_3_'></a>[Dictionaries](#toc0_)

* A **dictionary** is a **key-based** container. 
* Lists and tuples use numerical indices.
* Initialized with curly brackets. `d={}` is an empty dictionary.
* Should be used when you need to look up data quickly by a name.
* Archetypal example: a **phone book**.  
    You know the name of a person (*key*), and want the phone number (*data*).
* Frequent use: want to make a **collection** of variables or parameters used in a model. 
* **Keys:** All hashable objects are valid keys; in practice this means immutable objects (e.g. `str`, `int`, or a tuple of these).
* **Values:** Fully unrestricted. 

In [34]:
x = {'abc': 1.2, 'D': 1, 'ef': 2.74, 'G': 30} # Create a dictionary
print("x['ef'] =", x['ef']) # Extracting content
x['abc'] = 100 # Changing content

x['ef'] = 2.74


Elements of a dictionary are **extracted** using their key, which can itself be stored in a variable:

In [35]:
key = 'abc'
value = x[key]
print(value)

100


**Content is deleted** using its key:

In [36]:
print(x)
del x['abc']
print(x)

{'abc': 100, 'D': 1, 'ef': 2.74, 'G': 30}
{'D': 1, 'ef': 2.74, 'G': 30}
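
The rule about valid keys can be illustrated with a tuple (immutable) and a list (mutable). The dictionary `grid` and its contents are made up for this sketch.

```python
grid = {}
grid[(0, 1)] = 'north'  # a tuple of ints is immutable, so it works as a key
grid[(1, 0)] = 'east'
print(grid[(0, 1)])

print((1, 0) in grid)    # membership test on the keys
print(grid.get((5, 5)))  # get returns None for a missing key instead of raising an error

try:
    grid[[0, 1]] = 'fails'  # a list is mutable and cannot be used as a key
except TypeError as err:
    print('lists cannot be keys:', err)
```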


**Task:** Create a dictionary called `capitals` with the capital names of Denmark, Sweden and Norway as values and country names as keys.

**Answer:**

In [37]:
capitals = {}
capitals['denmark'] = 'copenhagen'
capitals['sweden'] = 'stockholm'
capitals['norway'] = 'oslo'

capital_of_sweden = capitals['sweden']
print(capital_of_sweden)

stockholm


## 4.4. <a id='toc4_4_'></a>[Summary](#toc0_)

The new central concepts are:

1. Containers (lists, tuples, dictionaries)
2. Mutable/immutable
3. Slicing of lists and tuples
4. Referencing (copy and deepcopy)
5. Key-value pairs for dictionaries

**Note:** All atomic types are immutable, and among them only strings are subscriptable.

In [38]:
x = 'abcdef'
print(x[:3])
print(x[3:5])
print(x[5:])
try:
    x[0] = 'f'
except:
    print('strings are immutable')

abc
de
f
strings are immutable
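
Since a string cannot be changed in place, the usual approach is to build a *new* string. A minimal sketch using slicing and the string method `replace`:

```python
x = 'abcdef'

y = 'f' + x[1:]          # new string: 'f' followed by everything after x[0]
z = x.replace('a', 'f')  # new string with every 'a' replaced by 'f'

print(y)
print(z)
print(x)  # the original string is untouched
```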


# 5. <a id='toc5_'></a>[Extra](#toc0_)

Other interesting containers are e.g. **namedtuple** and **OrderedDict** (see [collections](https://docs.python.org/2/library/collections.html)), and [**sets**](https://docs.python.org/2/library/sets.html).

## 5.1. <a id='toc5_1_'></a>[SimpleNamespace](#toc0_)

[SimpleNamespace](https://docs.python.org/3/library/types.html) objects are also sometimes handy as replacements for dictionaries. They have fewer capabilities (they are not really containers) but offer shorter syntax, as they replace *x['abc']* with *x.abc*.

In [39]:
from types import SimpleNamespace

In [40]:
x = {'abc': 1.2, 'D': 1, 'ef': 2.74, 'G': 30} # Re-create a dictionary
y = SimpleNamespace(**x)  # Turn it into a SimpleNamespace. 
# Alternatively it can be created as SimpleNamespace(abc=1.2,D=1, ef=2.74, G=30)

print(y)

namespace(abc=1.2, D=1, ef=2.74, G=30)


Now we reference the elements using dot notation

In [41]:
print('From Dictionary:', x['abc'])
print('From SimpleNamespace:', y.abc)

print('From Dictionary:', x['G'])
print('From SimpleNamespace:', y.G)

From Dictionary: 1.2
From SimpleNamespace: 1.2
From Dictionary: 30
From SimpleNamespace: 30


The only reason for wanting to do this transformation is the simpler syntax, as you avoid writing *['']*. <br>
But you do lose some dictionary capabilities you will learn about later in the course. You can easily turn a SimpleNamespace into a dictionary:

In [42]:
y_asdict = y.__dict__
print(type(y_asdict))
print(y_asdict)

<class 'dict'>
{'abc': 1.2, 'D': 1, 'ef': 2.74, 'G': 30}
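
One detail worth knowing, tying back to the section on referencing: the dictionary obtained this way is the namespace's *own* storage, not a copy. The sketch below uses the built-in `vars`, which returns the same object as `__dict__`; the name `par` and its attributes are hypothetical.

```python
from types import SimpleNamespace

par = SimpleNamespace()  # hypothetical bundle of model parameters
par.alpha = 0.5          # attributes can be added one at a time
par.beta = 2
print(par)

d = vars(par)     # same dictionary as par.__dict__, not a copy
d['alpha'] = 0.9  # changing the dictionary changes the namespace too
print(par.alpha)

d_copy = dict(vars(par))  # an independent copy
d_copy['beta'] = 100
print(par.beta)           # unchanged
```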
