# Introduction to Python. Floating point arithmetic.
ECON 3127/4414/8014 Computational methods in economics  
Week 1  
Fedor Iskhakov  
<img src="../img/lecture.png" width="64px"/>


## Part 2: Introduction to Python

<img src="img/PythonLogo.jpg" width="512px"/>

- General-purpose programming language capable of performing _many different tasks_ including scientific computing
- Open source (free!), development coordinated through the [Python Software Foundation](https://www.python.org/psf/)
- Experienced rapid adoption in the last decade, and is now one of the most popular programming languages



### Popularity of Python

<img src="img/python_projections.png" width=700px style="margin: auto">

<div style="font-size:10px">Source: <a href="https://stackoverflow.blog/2017/09/06/incredible-growth-python/">StackOverflow</a></div>


### Scope of Python

<img src="img/python_usage.png" width=900px style="margin: auto">

<div style="font-size:10px">Source: <a href="https://www.quora.com/What-are-the-places-where-Python-is-used">Quora</a></div>



### Low and high level programming languages

We have to distinguish

1. Low level languages (Assembler, FORTRAN, C, C++)
    - Very fast
    - Very verbose
2. High level languages (Matlab, R, **Python**)
    - Slower (although not for all tasks and circumstances)
    - A lot more concise
    - Versatile with (usually) many libraries
    
What is the best practice?    

### How concise/verbose?

<img src="img/language_verbosity.png" width=740px style="margin: auto">

<div style="font-size:10px">Source: <a href="http://blog.revolutionanalytics.com/2012/11/which-programming-language-is-the-most-concise.html">blog.revolutionanalytics.com</a></div>



### Adding two numbers in Assembler

```
AddNumbers:
	std           ; go from LSB to MSB
	clc           ;
	pushf         ; save carry flag
.top
	mov ax,0f0fh  ; convert from ASCII BCD to BCD
	and al,[si]   ; get next digit of number2 in al
	and ah,[di]   ; get next digit of number1 in ah
	popf          ; recall carry flag
	adc al,ah     ; add these digits
	aaa           ; convert to BCD
	pushf         ;
	add al,'0'    ; convert back to ASCII BCD digit
	stosb         ; save it and increment both counters
	dec si        ;
	loop .top     ; keep going until we've got them all
	popf          ; recall carry flag
	ret           ;
    
```

<div style="font-size:10px">Source: <a href="http://assembly.happycodings.com/code1.html">assembly.happycodings.com</a></div>

### Reading from a file in Python

```Python
with open("data.txt") as data_file:  # the file is closed automatically on exit
    for line in data_file:
        print(line.capitalize())
```

### Speed comparisons

Aruoba, S. Borağan & Fernández-Villaverde, Jesús, 2015. "A comparison of programming languages in macroeconomics," Journal of Economic Dynamics and Control, Vol. 58(C), pages 265-273.  
http://econweb.umd.edu/~webspace/aruoba/research/paper24/Aruoba_FernandezVillaverde_Programming

Aruoba, S. Borağan & Fernández-Villaverde, Jesús, 2018. "A Comparison of Programming Languages in Economics: An Update"  
https://www.sas.upenn.edu/~jesusfv/Update_March_23_2018.pdf

Jules Kouatchou, NASA. "Basic Comparison of Python, Julia, Matlab, IDL and Java (2018 Edition)"  
https://modelingguru.nasa.gov/docs/DOC-2676


### A Comparison of Programming Languages in Economics: An Update

<img src="img/runtime1.png" width=700px style="margin: auto">
<img src="img/runtime2.png" width=730px style="margin: auto">



### Trade-off

<img src="img/tradeoffs.png" width=1100px style="margin: auto">


### Objective function

Development and maintenance time $+$ (run time $\times$ number of runs) $\longrightarrow$ MIN

**Minimizing one component only is suboptimal!**  
(premature optimization)

* High level language (Python) for overall structure and appearance
* Low level language (C or C++) for computational bottlenecks
* Necessary to also think about
    - Vectorization
    - Parallelization (scalability)

### Why Python for Computational Economics?

- Versatile high level programming language
- High quality scientific libraries (NumPy, SciPy, Pandas, Matplotlib)
- Parallelization and just-in-time (JIT) compilation
- Modern machine learning libraries (TensorFlow, scikit-learn)
- Vast array of free libraries in other fields (web, networks, natural language processing, etc.)
- Positive spillovers from popularity (Stack Overflow)

## Part 3: Floating point arithmetic

* Because computers only work with 0 and 1 internally, all real numbers have to be represented in _binary_ format
* This leads to many peculiar arithmetic properties of seemingly simple mathematical expressions
* Understanding how computers work with real numbers is essential for computational economics

### Simple example

```Python
a = 0.1
b = 0.1
c = 0.1
a + b + c == 0.3
```

Output: `False`
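The result above can be inspected a little more closely. The following is a minimal sketch using only the standard library; the variable names `total` and `stored` are illustrative and not part of the lecture. It prints the value that is actually stored for `0.1` and shows a comparison with a tolerance (`math.isclose`) as one common alternative to `==`.

```python
import math
from decimal import Decimal

total = 0.1 + 0.1 + 0.1

# The sum is not exactly 0.3: repr shows the shortest string that round-trips
print(repr(total))                # 0.30000000000000004

# Decimal(float) reveals the exact binary value stored for the literal 0.1
stored = Decimal(0.1)
print(stored)

# Compare with a tolerance instead of exact equality
print(total == 0.3)               # False
print(math.isclose(total, 0.3))   # True (default relative tolerance 1e-09)
```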

### So can we now trust the following calculation?

```Python
interest = 0.04
compounding = 365
investment = 1000
t = 10
daily = 1 + interest/compounding
total = investment*(daily**(compounding*t))  # avoid shadowing the built-in sum()

format(total, '.25f')
```

Output: `'1491.7920028601754438568605110'`

### Compare to exact calculation

```Python
from decimal import Decimal, getcontext
getcontext().prec = 100  # precision (significant digits) of decimal calculations

# using floats
interest1 = 0.04
compounding = 365*24  # hourly compounding
t = 100  # years
investment1 = 10e9  # ten billion (10e9 equals 1e10)
growth1 = 1 + interest1/compounding  # growth factor per compounding period
sum1 = investment1*(growth1**(compounding*t))

# the same using high-precision decimal arithmetic
# (Decimal(float) converts the stored binary value of the float exactly)
interest2 = Decimal(interest1)
growth2 = 1 + interest2/compounding
investment2 = Decimal(investment1)
sum2 = investment2*(growth2**(compounding*t))
diff = sum2 - Decimal.from_float(sum1)
format(diff, '.10f')
```

Output: `'14.7882694559'`

### So, what is happening?
- Real numbers are represented with finite precision
- In some cases, the errors may have economic significance
- In order to write robust code suitable for the task at hand we have to understand what we should expect and why

*Numerical stability* of the code is an important property!

### Number representation in decimal form
$r$ &mdash; real number  
$b$ &mdash; _base_ (radix)  
$d_0,d_1,d_2,...,d_k$ &mdash; digits (from lowest to highest)

$$
r = d_k \cdot b^k + d_{k-1} \cdot b^{k-1} + \dots + d_2 \cdot b^2 + d_1 \cdot b + d_0
$$

For example, for decimal numbers ($b=10$, digits $0,1,\dots,9$) we have

$$
7{,}631 = 7 \cdot 1000 + 6 \cdot 100 + 3 \cdot 10 + 1
$$

$$
19{,}048 = 1 \cdot 10000 + 9 \cdot 1000 + 0 \cdot 100 + 4 \cdot 10 + 8
$$
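The positional formula above translates directly into code. The sketch below is illustrative only: the function names `from_digits` and `to_digits` are hypothetical helpers, not part of the lecture, and digits are passed from highest to lowest.

```python
def from_digits(digits, base):
    """Evaluate d_k*b^k + ... + d_1*b + d_0, digits given from highest to lowest."""
    value = 0
    for d in digits:
        if not 0 <= d < base:
            raise ValueError(f"digit {d} is not valid in base {base}")
        value = value * base + d  # Horner's scheme
    return value


def to_digits(number, base):
    """Return the digits of a non-negative integer in the given base, highest first."""
    if number == 0:
        return [0]
    digits = []
    while number > 0:
        number, d = divmod(number, base)  # peel off the lowest digit
        digits.append(d)
    return digits[::-1]


print(from_digits([7, 6, 3, 1], 10))   # 7631
print(to_digits(19048, 10))            # [1, 9, 0, 4, 8]
print(to_digits(255, 16))              # [15, 15]
```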


### Number representation in binary form
Now let $b=2$, so we only have digits 0 and 1

$$
101011_{binary} = 1 \cdot 2^5 + 0 \cdot 2^4 + 1 \cdot 2^3 + 0 \cdot 2^2 + 1 \cdot 2 + 1 = 43_{decimal}
$$

$$
25_{decimal} = 16 + 8 + 1 = 2^4 + 2^3 + 2^0 = 11001_{binary}
$$

Other common bases are 8 and 16 (the latter with digits $0,1,2,\dots,9,a,b,c,d,e,f$)

$0_{binary}$ $\rightarrow$ $1_{binary}$ $\rightarrow$ $10_{binary}$ $\rightarrow$ $11_{binary}$ $\rightarrow$ ??

_Is it possible to count to 1000 using 10 fingers?_
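Python's built-in conversions can be used to check the binary examples above. This is a small sketch using only built-ins; the variable names `counting`, `fingers` and `largest` are illustrative. It also works out how many patterns ten binary digits can encode, under the assumption that each finger is either up or down.

```python
# Built-in conversions between decimal and binary notation
print(int("101011", 2))   # 43
print(bin(25))            # 0b11001
print(format(25, "b"))    # 11001

# Other common bases
print(oct(25), hex(25))   # 0o31 0x19

# Counting in binary: 0, 1, 10, 11, 100, ...
counting = [format(n, "b") for n in range(5)]
print(counting)           # ['0', '1', '10', '11', '100']

# Ten fingers treated as ten binary digits give 2**10 distinct patterns
fingers = 10
largest = 2**fingers - 1
print(largest)            # 1023
```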

### Similar structure for fractions

In base-$b$ using $k$ _fractional_ digits

$$
1.r = 1 + d_{-1} \cdot b^{-1} + d_{-2} \cdot b^{-2} + \dots + d_{-k} \cdot b^{-k}
$$

$$
1.5627 = \frac{15{,}627}{10{,}000} = 1 + 5 \cdot 10^{-1} + 6 \cdot 10^{-2} + 2 \cdot 10^{-3} + 7 \cdot 10^{-4}
$$

Yet, for some numbers there is no finite decimal representation

$$
\frac{4}{3} = 1 + 3 \cdot 10^{-1} + 3 \cdot 10^{-2} + 3 \cdot 10^{-3} + \dots = 1.333\dots
$$

$$
\frac{4}{3} = 1 + \frac{1}{3} = 1 + \frac{10}{3} 10^{-1} = 1 + 3 \cdot 10^{-1} + \frac{1}{3}10^{-1}
$$

$$
= 1.3 + \frac{10}{3} \cdot 10^{-2} = 1.3 + 3 \cdot 10^{-2} + \frac{1}{3}10^{-2}
$$

$$
= 1.33 + \frac{10}{3} \cdot 10^{-3} = 1.33 + 3 \cdot 10^{-3} + \frac{1}{3}10^{-3} = \dots
$$
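The digit-by-digit expansion above can be sketched in code: multiply the fractional part by the base, take the integer part as the next digit, and continue with the remainder. The helper `fractional_digits` is a hypothetical name introduced for illustration; exact rational arithmetic from `fractions.Fraction` is used so that no rounding interferes.

```python
from fractions import Fraction


def fractional_digits(x, base, k):
    """First k fractional digits of x (with 0 <= x < 1) in the given base."""
    digits = []
    for _ in range(k):
        x *= base            # shift the next digit in front of the point
        d = int(x)           # the integer part is that digit
        digits.append(d)
        x -= d               # keep the remainder for the next step
    return digits, x


digits, remainder = fractional_digits(Fraction(5627, 10000), 10, 4)
print(digits, remainder)     # [5, 6, 2, 7] 0  -> finite expansion

digits, remainder = fractional_digits(Fraction(1, 3), 10, 6)
print(digits, remainder)     # [3, 3, 3, 3, 3, 3] 1/3  -> the remainder never vanishes
```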

### In binary
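$$
1.1_{decimal} = 1 + \frac{1}{10} = 1 + \frac{16}{10} 2^{-4} = 1.0001_{binary} + \frac{6}{10} 2^{-4} =
$$

$$
1.0001_{binary} + \frac{12}{10} 2^{-5} = 1.00011_{binary} + \frac{2}{10} 2^{-5} =
$$

$$
1.00011_{binary} + \frac{16}{10} 2^{-8} = 1.00011001_{binary} + \frac{6}{10} 2^{-8} = 1.000110011\dots_{binary}
$$

The remainders $\frac{6}{10}$ and $\frac{2}{10}$ keep repeating, so the expansion never terminates. Therefore $0.1$ (and hence $1.1$) cannot be represented in binary exactly!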
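The binary expansion of $\frac{1}{10}$ can be reproduced by repeated doubling. The sketch below uses exact rational arithmetic for the expansion and then asks Python which fraction the float literal `0.1` actually stores; the variable names are illustrative.

```python
from fractions import Fraction

# Binary digits of 1/10 by repeated doubling (exact rational arithmetic)
x = Fraction(1, 10)
bits = []
for _ in range(12):
    x *= 2
    bit = int(x)      # 1 if doubling carried past the point, else 0
    bits.append(bit)
    x -= bit
print("0." + "".join(map(str, bits)))    # 0.000110011001

# The float literal 0.1 stores the nearest representable binary fraction
numerator, denominator = (0.1).as_integer_ratio()
print(numerator, denominator)            # 3602879701896397 36028797018963968
print(denominator == 2**55)              # True
print(Fraction(0.1) == Fraction(1, 10))  # False
```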

### Rounding error

Squeezing infinitely many real numbers into a finite number of _bits_ requires an approximate representation

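$p$ &mdash; number of digits after the leading digit $d_0$ in the representation of real number $r$  
$e$ &mdash; exponent between $e_{min}$ and $e_{max}$, taking up $p_e$ bits to encode

$$
r \approx \pm d_0. d_1 d_2 \dots d_p \cdot b^e
$$

In binary ($b=2$) the leading digit of a normalized number is always $d_0=1$ and need not be stored, so the float takes a total of $1 + p + p_e$ bits: one for the sign, $p$ for the digits and $p_e$ for the exponent. For the standard double precision float this is $1 + 52 + 11 = 64$ bits.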


### Bits in floating point representation
<img src="img/bit_map.gif" width=1100px style="margin: auto">
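The bit fields can also be extracted in Python. The sketch below assumes that Python floats are IEEE 754 double precision numbers (1 sign bit, 11 exponent bits, 52 stored digits), which holds on common platforms; `double_fields` is a hypothetical helper name used for illustration.

```python
import struct
import sys


def double_fields(x):
    """Split a Python float (assumed IEEE 754 binary64) into its bit fields."""
    (bits,) = struct.unpack(">Q", struct.pack(">d", x))  # reinterpret 8 bytes as an integer
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF        # 11 bits, stored with a bias of 1023
    fraction = bits & ((1 << 52) - 1)      # 52 stored digits d_1..d_52
    return sign, exponent, fraction


sign, exponent, fraction = double_fields(0.1)
print(sign, exponent - 1023, hex(fraction))   # 0 -4 0x999999999999a

# Rebuild the value: (-1)^sign * (1 + fraction / 2^52) * 2^(exponent - 1023)
rebuilt = (-1) ** sign * (1 + fraction / 2**52) * 2.0 ** (exponent - 1023)
print(rebuilt == 0.1)                         # True

print(sys.float_info.mant_dig)                # 53 (52 stored digits plus the implicit leading 1)
```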

### Distribution of representable real numbers
<img src="img/float_map.jpg" width=1100px style="margin: auto">
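The uneven spacing of floats can be measured directly. This is a minimal sketch that assumes Python 3.9 or later (for `math.ulp`) and IEEE 754 double precision floats; it shows that the gap between neighbouring floats grows with their magnitude.

```python
import math
import sys

# Spacing between 1.0 and the next float (machine epsilon)
print(sys.float_info.epsilon)          # 2.220446049250313e-16

# math.ulp gives the gap to the next float at a given magnitude (Python 3.9+)
for x in (1.0, 1e8, 1e16):
    print(x, math.ulp(x))

# Near 1e16 the gap is 2.0, so adding 1.0 is lost to rounding
print(1e16 + 1.0 == 1e16)              # True
```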

### The main issues to be aware of
1. Rounding errors $\leftrightarrow$ loss of precision when numbers are represented in binary form $\Rightarrow$ cannot compare floats for equality
2. Catastrophic cancellation $\leftrightarrow$ potentially drastic loss of precision when subtracting close real numbers represented by floats $\Rightarrow$ innocent-looking formulas may in fact be numerically unstable
3. Overflow $\leftrightarrow$ obtaining a real number that is too large to be represented as a float
4. Underflow $\leftrightarrow$ obtaining a real number that is too small in magnitude to be distinguished from zero as a float
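Each of the four issues can be reproduced in a few lines. The following is a minimal sketch, assuming IEEE 754 double precision floats; the specific numbers are chosen only for illustration and are not taken from the lecture.

```python
import math

# 1. Rounding: compare with a tolerance rather than ==
print(0.1 + 0.2 == 0.3)                 # False
print(math.isclose(0.1 + 0.2, 0.3))     # True

# 2. Catastrophic cancellation: subtracting close numbers exposes rounding error
x = 1e-10
naive = (1 + x) - 1                     # mathematically equal to x
rel_error = abs(naive - x) / x
print(naive, rel_error)                 # relative error far above 1e-16

# 3. Overflow: the result exceeds the largest float
big = 1e308 * 10
print(big, math.isinf(big))             # inf True

# 4. Underflow: the result is too small and becomes zero
tiny = 1e-200 * 1e-200
print(tiny == 0.0)                      # True
```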

_Will look at these cases in the Lab next Monday_

# Thank you. Questions?

## Further learning resources
- Simple guide to Git http://rogerdudler.github.io/git-guide/
- Comparing programming languages https://modelingguru.nasa.gov/docs/DOC-2676
- Counting on fingers https://www.youtube.com/watch?v=UixU1oRW64Q
- Floating point arithmetic http://www.itu.dk/~sestoft/bachelor/IEEE754_article.pdf
