# Lecture 2: Introduction & Python Part 2
## September 01, 2023



# Recap: Command line arguments

Command line arguments are "words" written after the program name when you run it.
```bash
python hello_world.py 10
```


We have used the `sys` module from the Python standard library. It is imported with:
```python
import sys
```

### The first example (lecture-2-test-sys.py):

In [None]:
import sys
print(f"In this program, {sys.argv[1]} is the command line argument")

In [None]:
!python ./scripts/lecture-2-test-sys.py 10

### Several command line arguments (lecture-2-test-sys2.py)

The "magic" `sys.argv` is a list of all the words (separated by spaces) given on the command line, including the program name as its first element.

```python
import sys

sys_argv_list = sys.argv
cmd_args = sys.argv[1:]

print("The sys.argv list looks like this: ", sys_argv_list)
print("But we are only interested in these arguments: ", cmd_args)
```
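
The entries of `sys.argv` are always strings, so numeric arguments must be converted before doing arithmetic with them. A minimal sketch, assuming a hypothetical helper `sum_arguments` (not part of the lecture scripts) and a hand-written list that mimics what `sys.argv` would contain:

```python
def sum_arguments(argv):
    """Convert every argument after the program name to float and add them up."""
    return sum(float(word) for word in argv[1:])


# Mimics: python3 scripts/lecture-2-test-sys2.py 10 20 30 40
example_argv = ["scripts/lecture-2-test-sys2.py", "10", "20", "30", "40"]
print(sum_arguments(example_argv))  # 100.0
```

In a real script you would pass `sys.argv` instead of `example_argv`.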

In [None]:
!python3 scripts/lecture-2-test-sys2.py 10 20 30 40

## argparse
`argparse` is a module in the Python standard library for parsing command-line arguments. It makes it easy to write user-friendly command-line interfaces: you define the arguments that your program requires, and `argparse` automatically generates help and error messages.

Why use `argparse` instead of `sys.argv`?
* Provides a lot of flexibility to specify how command-line arguments should be parsed.
* Automatically generates help messages.

```python
import argparse

# Initialize the parser
parser = argparse.ArgumentParser(description="This is a simple example.")

# Add arguments
parser.add_argument("name", help="Your name")
parser.add_argument("-a", "--age", help="Your age", type=int, default=0)

# Parse the arguments
args = parser.parse_args()

# Use the arguments
print(f"Hello, {args.name}!")
if args.age:
    print(f"You are {args.age} years old.")
```

In [None]:
!python ./scripts/lecture-2-argparse.py --help

In [None]:
!python ./scripts/lecture-2-argparse.py steven -a 29
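
To experiment without leaving the notebook, `parse_args` also accepts a list of strings in place of the real command line. The sketch below reuses the parser from the example above; the variable name `args_default` is only for illustration.

```python
import argparse

parser = argparse.ArgumentParser(description="This is a simple example.")
parser.add_argument("name", help="Your name")
parser.add_argument("-a", "--age", help="Your age", type=int, default=0)

# The list replaces the command line, here: steven -a 29
args = parser.parse_args(["steven", "-a", "29"])
print(args.name, args.age)       # steven 29
print(type(args.age).__name__)   # int, because of type=int

# Without -a the default value is used
args_default = parser.parse_args(["steven"])
print(args_default.age)          # 0
```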

# The input function
Another way of getting information from the user is the `input` function. The user does not need to provide command line arguments, but can instead reply to questions from the program:

In [None]:
number = input('Write a number:')
print(f'Your number is {number}')

In [None]:
numbers = input("Write many numbers separated by spaces")
print("Your numbers are", numbers)

In [None]:
numbers = input("Write many numbers separated by spaces")
print("Your numbers are", numbers.split())

The final example gives a result quite similar to `sys.argv[1:]`: a list of strings. I prefer `sys.argv`; it is faster to work with.
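
Both `input` and `sys.argv` give strings, so the words must be converted before they can be used as numbers. A small sketch, where a fixed string stands in for the reply that `input` would return:

```python
# A fixed string stands in for what input() would return
reply = "10 20.5 3"

words = reply.split()   # list of strings
numbers = []
for word in words:
    numbers.append(float(word))   # convert each word to a float

print(words)         # ['10', '20.5', '3']
print(numbers)       # [10.0, 20.5, 3.0]
print(sum(numbers))  # 33.5
```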

## Useful string operations 

In [None]:
info = input("Write the current day, date and time separated by spaces")
infolist = info.split()
day = infolist[0]
date = infolist[1:-1]
time = infolist[-1]
print(f"Today is {day}. The date is {date}. The time is {time}")

In [None]:
info = input("Write the current day, date and time separated by commas")
infolist = info.split(',')
day = infolist[0]
date = infolist[1]
time = infolist[2]
print(f"Today is {day}. The date is {date}. The time is {time}")

## The join method
Define the list:
```python
date = ['09', 'September']
```
Can you extract the info from the list into one string saying only "09 September"?

In [None]:
date = ['09', 'September']
' '.join(date)

## Adding strings

In [None]:
date[0]+' '+date[1]

# Working with text files
Text files are files containing sequences of characters. Unlike binary files, they are human-readable and are commonly used for storing data in a structured text format like JSON, CSV, or just plain text.

## Reading a file

To open a data file, use the syntax `open(filename)`, where `filename` is the path to the data file as a string. A relative path such as `data/example_data.txt` is interpreted relative to the directory the program is run from:
```python
infile = open('data/example_data.txt')
```
Here `example_data.txt` may look like this:
```bash
This is the first line of the file
This is the second line of the file
Below comes the interesting part of the file: 
10 20 30 
20 30 1
2.2 125 6.45
0.1 20 3.14
```

In [None]:
infile = open('data/example_data.txt')
infile

We can read the file line by line by using the method readline:

In [None]:
line1 = infile.readline()
line2 = infile.readline()
print(line1)
print(line2)
infile.close()

If we are not interested in the first (few) lines we can call infile.readline() a few times to skip those lines. 

In [None]:
infile = open('data/example_data.txt')
infile.readline()
infile.readline()
line3 = infile.readline()
print(line3)
# The file is left open so that the next cell can continue reading from it

The TextIOWrapper can be iterated over, and the iteration starts at the current line in the data file. We have already called `infile.readline()` three times since we last opened the file, so the first three lines are omitted in the for loop below:

In [None]:
for line in infile:
    print(line)
infile.close()

In [None]:
#full program to print the interesting lines: 
infile = open('data/example_data.txt')
infile.readline()
infile.readline()
infile.readline()
for line in infile:
    print(line)
infile.close()

### We can use readlines() to read all lines at once

In [None]:
infile = open('data/example_data.txt')
lines = infile.readlines()
print(lines)

Now assume that we wanted to store the numbers from the file in three lists/columns: c1, c2 and c3. In the end we should end up with: 
```python
c1 = [10, 20, 2.2, 0.1]
c2 = [20, 30, 125, 20]
c3 = [30, 1, 6.45, 3.14]
```

### `with` keyword
You can use the `with` keyword to automatically close the file after you are done reading/writing to it.

In [None]:
with open('data/example_data.txt') as f:
    lines = f.readlines()
    print(lines)

### Exercise (5-10 minutes)
1) Read the file "GRA4157/lectures/02-python-summary-2/data/example_data.txt" in Python

2) Create empty lists c1, c2, and c3. Then iterate over the infile and add the first number in each line to c1, the second number in each line to c2 and the third number to c3. The type of objects in the lists should be float. 
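
One possible solution sketch is shown below. To keep it self-contained, an `io.StringIO` object with the same content as `example_data.txt` stands in for the opened file (an assumption made only for this sketch); with the real file you would use `open(...)` instead.

```python
import io

# Stand-in for the opened data file; same content as example_data.txt above
infile = io.StringIO(
    "This is the first line of the file\n"
    "This is the second line of the file\n"
    "Below comes the interesting part of the file: \n"
    "10 20 30 \n"
    "20 30 1\n"
    "2.2 125 6.45\n"
    "0.1 20 3.14\n"
)

c1, c2, c3 = [], [], []

# Skip the three text lines at the top
for _ in range(3):
    infile.readline()

for line in infile:
    words = line.split()   # split() also discards the trailing newline
    c1.append(float(words[0]))
    c2.append(float(words[1]))
    c3.append(float(words[2]))
infile.close()

print(c1)  # [10.0, 20.0, 2.2, 0.1]
print(c2)  # [20.0, 30.0, 125.0, 20.0]
print(c3)  # [30.0, 1.0, 6.45, 3.14]
```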

### Useful string operations 2

The string methods `startswith` and `endswith` and the operator `in` are useful when reading files.

example_data2.txt
```bash
This is a header
This is a header
Numbers: 1 2 3
Numbers: 2 3 4
5 6 7
```
1) We are only interested in the lines that start with "Numbers":

In [None]:
infile = open('data/example_data2.txt')
for line in infile:
    if line.startswith('Numbers'):
        print(line)

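2) We are only interested in the lines that do not end with "header". Each line read from a file ends with a newline character, so we either include `\n` in the test or remove it with `strip()` first: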

In [None]:
infile = open('data/example_data2.txt')
for line in infile:
    if not line.endswith('header\n'):
        print(line)

In [None]:
infile = open('data/example_data2.txt')
for line in infile:
    if not line.strip().endswith('header'):
        print(line)

3) We are only interested in lines that have the number 2 in them:

In [None]:
infile = open('data/example_data2.txt')
for line in infile:
    if '2' in line:
        print(line)
infile.close()

## Writing to file

To write to file, we still use the open() function, but we have to specify that we want to write to file. 
```python
outfile = open('data/outfile.txt','w')
```
The mode, here "w", indicates that we want to write to the file. The default mode (used when nothing is provided, as in the previous examples) is "r", for reading. Warning: If the file outfile.txt already exists, its previous contents are erased as soon as it is opened in "w" mode.

In [None]:
outfile = open('data/outfile.txt','w')
outfile.write('This is the first line of the file')
outfile.close()

In [None]:
outfile = open('data/outfile.txt','w')
outfile.write('The previous line is deleted and this is the new line')
outfile.close()

We can append to existing files using the 'a' mode when opening:

In [None]:
outfile = open('data/outfile.txt','a')
outfile.write('The previous text is still there, and this line was just appended')
outfile.close()

Use `\n` to start a new line:

In [None]:
outfile = open('data/outfile.txt','a')      # use 'w' to write to new file 'a' to append to an already existing file
outfile.write("Now let's add a new line:\n")
outfile.write("This is the new line")
outfile.close()

We perform all the operations we want on a file before closing it. In the previous example we closed the file after each operation to inspect changes while we wrote. 
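
The `with` keyword shown earlier for reading works for writing too, and closes the file automatically. A minimal sketch; the temporary directory is an assumption made so that the example does not touch `data/outfile.txt`:

```python
import os
import tempfile

# A temporary directory is used so the sketch does not overwrite real files
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "outfile.txt")

    with open(path, "w") as outfile:   # closed automatically after the block
        outfile.write("First line\n")
        outfile.write("Second line\n")

    with open(path, "a") as outfile:   # 'a' keeps the existing text
        outfile.write("Appended line\n")

    with open(path) as infile:         # read everything back to check
        content = infile.read()

print(content)
```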

### Exercise: write a table to file
Assume that we have 10 numbers in a Python list: [1,2,3,4,5,6,7,8,9,10]. 
Use Python to write a file that contains the numbers as one column and the square root of each number as another column. The file should look like this: 
```bash
x sqrt(x)
1 1
2 1.41
...
```
You can decide on the number of decimals and the format used for sqrt(x) in the file. 

## Starting to work with bigger data sets
Example: How do people spend their time? 
<img src="figs/Timeuse.png" style="width: 90%; margin: auto;">



I have exported the data to a .txt file: 
```text
,Country,Category,Time (minutes)
0,Australia,Paid work,211.146629603892
1,Austria,Paid work,279.53226810278
2,Belgium,Paid work,194.476452188763
3,Canada,Paid work,268.660609647898
4,Denmark,Paid work,199.771595915566
...
```


Let us now assume that we are only interested in how much people in a given country work (Paid work). How would you extract this information? The file has 462 lines, so reading it manually is not effective.

In [None]:
infile = open('data/Time-use.txt')
for line in infile:
    if "Paid work" in line:
        print(line)

In [None]:
infile = open('data/Time-use.txt')
line0 = infile.readline()
line1 = infile.readline()
print(line1)
info = line1.split(',')
print(info)

In [None]:
infile = open('data/Time-use.txt')
work = {}
for line in infile:
    if "Paid work" in line:
        info = line.split(',')
        country = info[1].strip()
        minutes = info[-1]  # the last column is "Time (minutes)"
        work[country] = float(minutes.strip())
print(work)

In [None]:
print(work['India'])
print(work['Norway'])

In [None]:
number = min(work.values())
print(number)

In [None]:
idx = list(work.values()).index(number)
print(idx)

In [None]:
list(work.keys())[idx]
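
The three steps above (smallest value, its index, then the matching key) can also be written as a single call to `min` with the `key` argument. A sketch with made-up numbers; the country names and values below are hypothetical and not taken from Time-use.txt:

```python
# Hypothetical numbers, only for illustration (not taken from Time-use.txt)
work = {"A-land": 250.0, "B-land": 180.5, "C-land": 210.0}

# min iterates over the keys and compares them by their values
country = min(work, key=work.get)
print(country, work[country])  # B-land 180.5
```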

### Exercise

Locate the file GRA4157/lectures/02-python-summary-2/data/Time-use.txt.

1) Write a program that reads the file, and prints out all information about Norway. 

2) Write a program that reads the file, and prints out the information about Leisure time for all countries. 

3) Write a program that reads the file, and writes a new file sleep.txt, only consisting of the minutes of sleep per country. sleep.txt thus contains two columns, one column of countries and a corresponding column with minutes of sleep. The header should be "Country Sleep-minutes".

4) Write a program that computes a "happiness score" per country. The happiness score is computed via: `hours_of_sleep + seeing_friends + other_leisure + 1.2*education - 0.2*paid_work`

# More on errors and exceptions
We often want to convert data from files (strings) to floating point numbers:

```txt
This is the first line of the file
This is the second line of the file
Below comes the interesting part of the file: 
10 20 30 
20 30 1
2.2 125 6.45
0.1 20 3.14
```

In [None]:
infile = open('data/example_data.txt')
numbers = []
for line in infile:
    info = line.split()
    try:
        number = float(info[0])
        numbers.append(number)
    except (ValueError, IndexError):  # first word is not a number, or the line is empty
        print('Skipping line: ', line)
    
print(numbers)
    

## Raise (throw) an exception
When working with input data, we often want the program to fail when wrong input is provided:

In [None]:
message = input('Write hello')
if message != 'hello':
    raise Exception('The input should be hello')

Python has many built-in exception types. It is good practice to raise the most specific one that fits, for example `ValueError`:

In [None]:
number = float(input('Write a number between 0 and 10'))
if number <= 0 or number >= 10:
    raise ValueError('The number must be between 0 and 10')
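
A raised exception can be caught with `try`/`except`, just like the errors from `float` earlier. The sketch below wraps the check in a hypothetical function `parse_number` (the name is an assumption for this example) and tries it on a few fixed strings instead of calling `input`:

```python
def parse_number(text):
    """Convert text to float and check that it lies strictly between 0 and 10."""
    number = float(text)   # raises ValueError itself if text is not a number
    if number <= 0 or number >= 10:
        raise ValueError("The number must be between 0 and 10")
    return number


for reply in ["5", "12", "five"]:
    try:
        print(parse_number(reply))
    except ValueError as error:
        # Both our own ValueError and the one from float() end up here
        print(f"Invalid input {reply!r}: {error}")
```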

# More on reading files

In [None]:
with open('data/example_data.txt') as infile:
    for line in infile:
        print(line)

Gather all information about the countries in a nested dictionary:

In [None]:
all_data = {}
with open('data/Time-use.txt') as infile:
    headers = infile.readline().strip().split(',')  # the file is comma-separated
    for line in infile:
        info = line.split(',')
        country = info[1].strip()
        if country not in all_data:
            all_data[country] = {}
        # Store the value for every line, including the first line of each country
        all_data[country][info[2].strip()] = float(info[3])

all_data['Norway']

We will later work with pandas, which can deal with these "nested dictionaries" automatically.

# Essentials to better Python

**Overview**
 * List comprehensions
 * Decorators
 * PEP8

# List comprehensions

List comprehensions provide a compact and readable way to create lists. 


**Syntax**:

Create a list without list comprehension:

```python
from math import sin
old_list = [0.1, 0.3, -0.4, 0.2]
def is_positive(x):  # avoid the name filter, which is a Python built-in
    return x > 0

new_list = []
for x in old_list:
    if is_positive(x):
        new_list.append(sin(x))
```
The same task with list comprehension:

```python
new_list = [sin(x) for x in old_list if is_positive(x)]
```

### Example 1: List of even numbers

**Task**: Create a list of even numbers.

**Solution** without list comprehension:

In [None]:
def is_even(i):
    return i%2==0

even_numbers = []
for i in range(20):
    if is_even(i):
        even_numbers.append(i)
print(even_numbers)

**Solution** with list comprehension:

In [None]:
even_numbers = [i for i in range(20) if i%2==0]
even_numbers

### Example 2: Remove sensitive information from log data

**Task**: Remove all strings in a logfile that contain passwords

**Solution** without list comprehension:

In [None]:
fp = open("data/log.txt", "r")

log = []
for line in fp:
    if "password" not in line:
        log.append(line.strip())
fp.close() 

log

**Solution** with list comprehension:

In [None]:
with open('data/log.txt', "r") as fp:
    log = [line.strip() for line in fp if "password" not in line]
    
log    

## Functions as arguments

Like all objects, functions can be passed as arguments to functions.

In [None]:
def add(x,y):
    return x+y

def sub(x, y):
    return x-y

def apply(func, x, y):
    return func(x, y)

In [None]:
apply(add, 1, 2)

In [None]:
apply(sub, 7, 5)

## Functions inside functions

Python allows nested function definitions:

In [None]:
def g(x, y):
    
    def cube(x):
        return x*x*x
    
    return y*cube(x)

g(4, 6) 

## Functions returning functions

In [None]:
def h():
    pi = 0.13
    def inner_h():
        print("Inside inner_h but can access pi={}".format(pi))
        
    return inner_h

foo = h()
foo

In [None]:
foo()

## More functions returning functions: *decorators*

A toy example

In [None]:
def foo():
    return 1

def outer(func):
    def inner():
        print("before calling function")
        return func() + 100
    return inner

decorated = outer(foo)

The function `decorated` is a decorated version of function `foo`.
It is `foo` plus something more:

In [None]:
decorated()

To simplify, we could just write
```python 
foo = outer(foo)
```
to replace `foo` with its decorated version, so that every later call to `foo` runs the decorated version.

## A (slightly) more useful decorator

Suppose we have been given a function that only works for some numerical inputs:

In [None]:
from math import log
def f(x):
    return log(x) - 2  # Not defined for x<=0

In [None]:
f(5)

In [None]:
f(-1)

Suppose we want to limit the range of values sent to this function.

The idea is that we **wrap** the function inside another function.

## Interactive programming (15 minutes)

1) Implement the plain (undecorated) function f(x) in a Python script

2) Create a decorator function `checkrange` that calls f, but prints a custom message to the user if x <= 0. Hint: The decorator function `checkrange` should return a function, not the result of a function call. 

3) Optional: Perform a test using a test function (with assert) checking that your function works as intended

In [None]:
from math import log

def checkrange(func):
    """Provides a safe version of f. Avoids math domain error."""
    #def inner
        #...
        # return ...
    #return inner

def f(x):
    return log(x) - 2  # Not defined for x<=0

In [None]:
def checkrange(func):
    def inner(x):
        if x <= 0:
            print("Error: x must be larger than zero")
        else:
            return func(x)
    return inner

In [None]:
f_safe = checkrange(f)
f_safe(5)

In [None]:
f_safe(-1)

Voilà!!
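
For the optional third task, a test function could look like the sketch below. It repeats `checkrange` and `f` from above so that it runs on its own; the name `test_checkrange` is only a suggestion.

```python
from math import log


def checkrange(func):
    def inner(x):
        if x <= 0:
            print("Error: x must be larger than zero")
        else:
            return func(x)
    return inner


def f(x):
    return log(x) - 2  # Not defined for x<=0


def test_checkrange():
    f_safe = checkrange(f)
    # Valid input: the wrapped function gives the same result as f
    assert f_safe(5) == f(5)
    # Invalid input: a message is printed and None is returned implicitly
    assert f_safe(-1) is None
    assert f_safe(0) is None


test_checkrange()
print("All tests passed")
```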

## The `@decorator` syntax

Python provides a short notation for decorating a function with
another function:

In [None]:
@checkrange
def g(x):
    return log(x) - 2

In [None]:
g(0)

This is essentially the same as writing `g = checkrange(g)`.

A decorator is simply a function taking a function as input
and returning another function. 

The syntax `@decorator` is a
short-cut for the more explicit `f = decorator(f)`.

## A (much) more useful decorator: memoization

When we first learned multiplication, our strategy might have been to add cumulatively, e.g. 3x3 = 3 + 3 + 3 = 6 + 3 = 9:

In [None]:
from time import sleep

def slow_mult(x,y):
    res = 0
    for i in range(y):
        print("Thinking...")
        sleep(1)
        res += x
    return res

print(slow_mult(3,3))
print(slow_mult(3,3))

We call the function with the same input arguments, and hence perform the same (slow) calculations multiple times.

The idea of memoization (or caching) is to store the input-output pairs for which the function has been called.
If the function is called again with the same input arguments, we return the stored value.

An implementation of memoization as a decorator could look like this:

In [None]:

def memoize(func):
    ''' Caches a function's return value each time it is called.
        If called later with the same arguments, the cached value is returned
        (not reevaluated). '''
    cache = {}  # Stores all input-output pairs

    def inner(x, y):
        if (x, y) in cache:
            return cache[(x, y)]
        else:
            result = func(x, y)
            cache[(x, y)] = result
            return result
        
    return inner

Now we can apply the decorator to our slow function. Demo:

In [None]:
@memoize
def slow_mult(x, y):
    print("Thinking...")
    sleep(1)     # Simulate a long computation
    return x*y

@memoize
def slow_add(x, y):
    print("Thinking...")
    sleep(1)     # Simulate a long computation
    return x+y

... and test it out

In [None]:
slow_mult(3, 6)
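
To see that the cache is really used, we can record how often the undecorated function runs. This is a sketch without `sleep`, with a shortened but equivalent `memoize`; the list `calls` and the function `mult` are introduced only for this illustration.

```python
def memoize(func):
    cache = {}  # Stores all input-output pairs

    def inner(x, y):
        if (x, y) not in cache:
            cache[(x, y)] = func(x, y)
        return cache[(x, y)]

    return inner


calls = []  # records every time the undecorated function really runs


@memoize
def mult(x, y):
    calls.append((x, y))
    return x * y


print(mult(3, 6))  # 18, computed
print(mult(3, 6))  # 18, taken from the cache
print(mult(6, 3))  # 18, computed again: (6, 3) is a different key than (3, 6)
print(len(calls))  # 2
```

Note that the cache key is the tuple of arguments, so the order of the arguments matters.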

## Decorator summary 

* A function that takes a function as argument and returns a modified function
* The `@decorator` syntax is simply a shortcut for the standard function call `f = decorator(f)`.

## PEP8: How to write more Pythonic code

Clear and consistent style is critical for writing "good code".

* Python comes with an extensive programming style guideline: **PEP8**.
* It consists of a list of dos and don'ts for writing Python.
* Get familiar with the conventions once, and you will automatically start using them.
* I will give you some examples below

### Guide to Pythonic code: Binary operations

* Add whitespace around binary mathematical operations:

```python
# Do:
x = x + 1

# Don't:
x=x+1
```


### Guide to Pythonic code: Naming conventions 


* For **variables**:

```python
# Do
shopping_list = ["Bananas", "Apples"]
gravity_acceleration = 9.81
# Don't
ShoppingList = ["Bananas", "Apples"]
GRAVITYACCELERATION = 9.81
```

* For **functions**:
    
```python
def order_items(image):
    pass
```

* For **classes**:

```python
# Do:
class ElectricCar:
    pass

# Don't:
class electriccar:
    pass
```


### Guide to Pythonic code: Indentations and spacing


* Always use **four** white spaces when indenting (set your editor accordingly):

```python
# Do
def order_items(image):
    pass  # Four whitespaces


# Don't 
def order_items(image):
  pass    # Not four whitespaces
```

* Break long lines "nicely":

```python
# Do:
shopping_list = {"Apple": 2, "Banana": 10, "Chocolate": 1,
                 "Toothpaste": 1, "Shampoo": 2}

# Don't: everything on one overly long line (PEP8 recommends at most 79 characters)
shopping_list = {"Apple": 2, "Banana": 10, "Chocolate": 1, "Toothpaste": 1, "Shampoo": 2}
```