## Lesson 4 — Taking Input, Reading and Writing Files, Functions

### Readings

* Shaw: Exercises 11-26
* [Data Loading, Storage, and File Formats, Wes McKinney](https://wesmckinney.com/book/accessing-data)

### Table of Contents

* [Taking Input](#Taking-Input)
* [Reading Files](#Reading-Files)
* [Writing Files](#Writing-Files)
* [Functions](#Functions)


### Taking Input

In _Learn Python The Hard Way_, Shaw uses `input()` and `argv` to take input from the user. These don't work very well with Jupyter notebooks, but we will cover them because they can be useful in Python scripts.

#### `input()`

The `input()` function pauses the program, prompts the user to type a value, and returns whatever was typed so that it can be stored in a variable. It's important to note that **the returned value is always a string**.

```python
x = input("What is the value for x? ")  # <- won't work on a notebook
print(x)
```
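Because `input()` always returns a string, a number typed by the user has to be converted before doing arithmetic with it. The following is a minimal sketch of that conversion; the function name `ask_for_int` and its `reader` parameter are hypothetical, introduced here only so the example can run without a keyboard.

```python
def ask_for_int(prompt, reader=input):
    """Ask for a whole number and return it as an int.

    `reader` is a hypothetical hook: it defaults to the built-in input(),
    but any function that takes a prompt and returns a string will do.
    """
    text = reader(prompt)      # input() always returns a str
    return int(text.strip())   # int() raises ValueError for text like "abc"


# Simulate a user typing " 42 " so the example runs without a keyboard.
answer = ask_for_int("What is the value for x? ", reader=lambda prompt: " 42 ")
print(answer + 1)
```

In a script you would call `ask_for_int("What is the value for x? ")` and let the default `input()` do the asking.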

#### `argv`

Importing the special variable `argv` from the `sys` module lets you pass values such as strings, numbers, and filenames to your Python code from the command line. Every value arrives as a string. It also doesn't work in Jupyter notebooks, so you'll have to use a workaround: create a local script whose name ends in `.py` with the following content:

```python
from sys import argv

script = argv[0]
value1 = argv[1]

print(f"script: {script}\nfirst: {value1}")
```

You are then able to run it with:

```
python myscript.py foo bar
```

- `argv[0]` is the first element of the list; it holds the name of the script, in this case `myscript.py`
- `argv[1]` is the second element; it holds the first argument after the script name, in this case `foo`

This piece of code does not check whether the correct number of arguments is provided: `bar` is ignored, and if you provide fewer arguments than the script needs, it will raise an `IndexError` (just try it!).

Adding a check at the beginning of your script verifies that the correct number of arguments was entered and exits if not:

```python
from sys import argv
from sys import exit

if len(argv) != 2:
    print("Usage: python myscript.py value1")
    exit(1)

script = argv[0]
value1 = argv[1]

print(f"script: {script}\nfirst: {value1}")
```

You may now call your script using the following pattern:

```bash
python myscript.py foobar
```

Notice that if you pass a different number of arguments, the script will exit early.
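Values read from `argv` are strings, so a number passed on the command line has to be converted before it is used. The sketch below combines the argument-count check with that conversion. The function name `parse_args` is hypothetical, and plain lists stand in for `sys.argv` so that the example runs anywhere, including a notebook.

```python
def parse_args(args):
    """Return the single value argument as a float.

    `args` is expected to look like sys.argv: script name first, then values.
    """
    if len(args) != 2:
        raise SystemExit("Usage: python myscript.py value1")
    return float(args[1])      # command-line values are always strings


# A list standing in for sys.argv, so the sketch runs anywhere.
value = parse_args(["myscript.py", "2.5"])
print(value * 2)
```

In a real script you would call `parse_args(argv)` after `from sys import argv`.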

#### `ipywidgets`

Notebooks have their own way of providing dynamic input to your code through [`ipywidgets`](https://ipywidgets.readthedocs.io/en/stable/examples/Widget%20Basics.html). Widgets are eventful Python objects that have a representation in the browser, often as a control like a slider, textbox, etc.

The documentation of the library is very extensive, so we will cover only common examples:

In [1]:
# install and import the library
%pip install ipywidgets 
from ipywidgets import interact

Note: you may need to restart the kernel to use updated packages.


You can use the function `interact()` to create a control whose value is passed to a function every time it changes. Below are three examples of this kind of input: a **slider** for a number, a **text box** for arbitrary strings, and a **dropdown** for a list of choices.

In [2]:
def f(x):
    print(f"The chosen number is {x}")

interact(f, x=5)

interactive(children=(IntSlider(value=5, description='x', max=15, min=-5), Output()), _dom_classes=('widget-in…

<function __main__.f(x)>

In [3]:
def f(word):
    print(f"The {word} is the word")

interact(f, word="Bird")

interactive(children=(Text(value='Bird', description='word'), Output()), _dom_classes=('widget-interact',))

<function __main__.f(word)>

In [4]:
def f(fruit):
    print(f"I have chosen {fruit}")

interact(f, fruit=['apples', 'oranges']);

interactive(children=(Dropdown(description='fruit', options=('apples', 'oranges'), value='apples'), Output()),…

#### Beginning your notebook or script

Without using argv, a typical script or IPython notebook might begin like this:

In [5]:
# you might need to install pandas, if you haven't already
%pip install pandas numpy

Note: you may need to restart the kernel to use updated packages.


In [6]:
# import required packages
import pandas as pd
import numpy as np

In [7]:
# define file paths and variables
path_input_file = '~/sio209/input.txt'
path_output_file = '~/sio209/output.txt'
iterations = 10
evalue = 1e-5
color = 'dark blue'
title = 'My plot'

### Reading Files

We can read in a text file using `open()` and then print or use its contents all at once or one line at a time. Note that the file handle object (called a `TextIOWrapper`) keeps track of its position in the file: lines that have already been read are not returned again unless you reopen the file.
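One way to picture this is that the handle remembers how far into the file it has read, while the file on disk stays unchanged. The sketch below illustrates the idea with a throwaway temporary file (the file contents are made up for the example); `seek(0)` moves the position back to the start.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('first line\nsecond line\n')
    path = tmp.name

handle = open(path)
first_read = handle.read()    # reads everything; the position is now at the end
second_read = handle.read()   # nothing left to read, so this is ''
handle.seek(0)                # move the position back to the start
third_read = handle.read()    # the full text is available again
handle.close()
os.remove(path)               # clean up the temporary file

print(repr(second_read))
print(third_read == first_read)
```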

In [8]:
filename = '../data/woodchuck.txt'

#### Read all at once

Notice that we need to close the file after we are done with it. Closing a file releases the resources the operating system holds for it and, for files opened for writing, makes sure that buffered data is written to disk. The recommended approach is to use a `with` statement (see below).

In [9]:
txt = open(filename)

print(f"Content of file {filename}:")
print(txt.read())

txt.close()

Content of file ../data/woodchuck.txt:
How much wood
Would a woodchuck chuck
If a woodchuck could chuck wood?



In [10]:
txt = open(filename)

txt.read()

'How much wood\nWould a woodchuck chuck\nIf a woodchuck could chuck wood?\n'

In [11]:
txt.close()

In [12]:
type(txt)

_io.TextIOWrapper

#### Read one line at a time

In [13]:
txt = open(filename)

txt.readline()

'How much wood\n'

In [14]:
txt.readline()

'Would a woodchuck chuck\n'

In [15]:
txt.readline()

'If a woodchuck could chuck wood?\n'

In [16]:
txt.readline()

''

In [17]:
txt.close()

#### Read lines as a list

In [18]:
txt = open(filename)

txt.readlines()

['How much wood\n',
 'Would a woodchuck chuck\n',
 'If a woodchuck could chuck wood?\n']

In [19]:
txt.close()

#### Open in a `with` block. Then use `for` loop, `read()`, `readline()`, or `readlines()`.

A `with` statement works with a [context manager](https://realpython.com/python-with-statement/) to make sure that the file is closed, even if part of the application logic fails. When the `with` block ends, `close()` is called on the file automatically.
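To check this behaviour, the sketch below raises an error on purpose inside a `with` block and then inspects the file's `closed` attribute. The temporary file and the error message are made up for the example.

```python
import os
import tempfile

# Throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('How much wood\n')
    path = tmp.name

try:
    with open(path) as f:
        first_line = f.readline()
        raise ValueError('something went wrong while processing')
except ValueError:
    pass  # the error is handled here; the file was already closed by `with`

print(f.closed)
os.remove(path)
```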

In [20]:
with open(filename, 'r') as f:
    for line in f:
        line = line.rstrip()
        print(line)

How much wood
Would a woodchuck chuck
If a woodchuck could chuck wood?


In [21]:
with open(filename, 'r') as f:
    lines = f.read()
lines

'How much wood\nWould a woodchuck chuck\nIf a woodchuck could chuck wood?\n'

In [22]:
with open(filename, 'r') as f:
    line = f.readline()
line

'How much wood\n'

In [23]:
with open(filename, 'r') as f:
    lines = f.readlines()
lines

['How much wood\n',
 'Would a woodchuck chuck\n',
 'If a woodchuck could chuck wood?\n']

In [24]:
with open(filename, 'r') as f:
    lines = [line.rstrip() for line in f.readlines()]
lines

['How much wood',
 'Would a woodchuck chuck',
 'If a woodchuck could chuck wood?']
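An alternative to stripping each line is `str.splitlines()`, which splits a string at line boundaries and drops the newline characters. In this sketch the variable `text` stands in for the result of `f.read()`, so the example runs without the data file.

```python
text = 'How much wood\nWould a woodchuck chuck\nIf a woodchuck could chuck wood?\n'

# str.splitlines() splits on line boundaries and drops the newline characters.
lines = text.splitlines()
print(lines)
```

Unlike `rstrip()`, `splitlines()` leaves trailing spaces and tabs in place, so the two approaches differ when lines end with other whitespace.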

#### Pandas can also read files, but it's better with tables.

In [25]:
df = pd.read_csv(filename, header=None)

In [26]:
df

| | 0 |
|---|---|
| 0 | How much wood |
| 1 | Would a woodchuck chuck |
| 2 | If a woodchuck could chuck wood? |


### Writing Files

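We can write to a file by opening it in write mode (`'w'`) and calling `write()`. Note that `write()` does not add a newline for you, so we write `'\n'` ourselves.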

In [27]:
outfile = 'limerick.txt'

In [28]:
# some text to write (a limerick by Edward Lear)
line1 = "There was an Old Man with a beard\nWho said, 'It is just as I feared!"
line2 = "Two Owls and a Hen\nFour Larks and a Wren,"
line3 = "Have all built their nests in my beard!'"

#### Write the most basic way

In [29]:
target = open(outfile, 'w')

target.write(line1)
target.write('\n')
target.write(line2)
target.write('\n')
target.write(line3)
target.write('\n')

target.close()

In [30]:
type(target)

_io.TextIOWrapper

#### Write in a `with` block

Again, we can use `with` to simplify things (avoid having to `close()` the file).

In [31]:
with open(outfile, 'w') as target:
    target.write(line1)
    target.write('\n')
    target.write(line2)
    target.write('\n')
    target.write(line3)
    target.write('\n')
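The mode passed to `open()` decides what happens to text already in the file: `'w'` starts from an empty file every time, while `'a'` appends to the end. The sketch below shows the difference in a temporary directory; the file name `notes.txt` and its contents are made up for the example.

```python
import os
import tempfile

# Work in a temporary directory so no real files are touched.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'notes.txt')   # hypothetical file name

    with open(path, 'w') as target:    # 'w' creates the file (or empties it)
        target.write('first version\n')

    with open(path, 'w') as target:    # opening with 'w' again discards the old text
        target.write('second version\n')

    with open(path, 'a') as target:    # 'a' keeps the old text and adds to the end
        target.write('an extra line\n')

    with open(path) as f:
        content = f.read()

print(content)
```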

#### Write with Pandas to comma-separated values or tab-separated values

The dataframe `df` contains the woodchuck text from above.

In [32]:
df.to_csv('woodchuck_pandas.csv')

In [33]:
df.to_csv('woodchuck_pandas.tsv', sep='\t')

### Functions

Functions allow you to carry out the same task multiple times. This reduces the amount of code you write, reduces mistakes, and makes your code easier to read.

#### Printing

In [34]:
def say_hello():
    print('Hello, world!')

In [35]:
say_hello()

Hello, world!


In [36]:
def print_a_string(foo):
    print(foo)

In [37]:
print_a_string('Here is a string.')

Here is a string.


In [38]:
x = 'A string saved as a variable.'
print_a_string(x)

A string saved as a variable.


In [39]:
y = 300
print_a_string(y)

300


In [40]:
def print_two_things(one, two):
    print(f"{one} AND {two}")

In [41]:
x = 'yes'
y = 10
print_two_things(x, y)

yes AND 10


In [42]:
def print_three_things(*blob):
    v1, v2, v3 = blob
    print(f'{v1}, {v2}, {v3}')

In [43]:
print_three_things('a', 31, ['x', 'y', 'z'])

a, 31, ['x', 'y', 'z']


In [44]:
def add_two(num1, num2):
    print(num1 + num2)

In [45]:
add_two(10, 5)

15


In [46]:
add_two(1.3, 4.4)

5.7


In [47]:
add_two('AAA', 'bbb')

AAAbbb
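The functions so far print their result rather than handing it back to the caller. The sketch below contrasts the two styles; the names `add_and_print` and `add_and_return` are hypothetical, chosen to avoid clashing with the functions defined in this lesson. A function without a `return` statement returns `None`.

```python
def add_and_print(num1, num2):
    print(num1 + num2)      # shows the sum, but does not hand it back


def add_and_return(num1, num2):
    return num1 + num2      # hands the sum back to the caller


printed = add_and_print(10, 5)     # prints 15
returned = add_and_return(10, 5)   # prints nothing

print(printed)
print(returned * 2)
```

Only the returned value can be stored in a variable and used in later calculations.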


#### Returning

In [48]:
def return_sum(a, b):
    return a + b

In [49]:
return_sum(5, 8)

13

In [50]:
x = return_sum(8, 13)
x

21

In [51]:
x * 2

42

In [52]:
def combine_with_commas(*blob):
    mystring = ''
    for element in blob:
        mystring = mystring + str(element) + ','
    mystring = mystring[:-1]
    return mystring

In [53]:
combine_with_commas(40, 50, 60)

'40,50,60'

In [54]:
combine_with_commas(40, 50, 60, 70, 'hello')

'40,50,60,70,hello'

In [55]:
# we have to redefine this function to return instead of print
def add_two(num1, num2):
    return num1 + num2

In [56]:
x = 100
y = 100
z = add_two(x, y)
z

200

In [57]:
type(z)

int

In [58]:
a = '100'
b = '100'
c = add_two(a, b)
c

'100100'

In [59]:
type(c)

str

In [60]:
a = '100'
b = '100'
c = add_two(int(a), int(b))
c

200

In [61]:
type(c)

int

In [62]:
def sum_product_exponent(v1, v2):
    s = v1 + v2
    p = v1 * v2
    e = v1 ** v2
    return s, p, e

In [63]:
sum_product_exponent(2, 5)

(7, 10, 32)

In [64]:
my_sum, my_product, my_exponent = sum_product_exponent(2, 5)

In [65]:
my_sum

7

In [66]:
my_product

10

In [67]:
my_exponent

32

#### lambdas

Lambdas are small anonymous functions whose body is a single expression. They are often passed as arguments to other functions.

In [68]:
x = lambda a, b: a + b
x(1, 2)

3

In [69]:
a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
even_numbers = filter(lambda x: x % 2 == 0, a)
list(even_numbers)

[2, 4, 6, 8, 10]
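Two other built-ins that commonly take a lambda are `sorted()` (through its `key` argument) and `map()`. The word list and numbers in this short sketch are made up for the example.

```python
words = ['woodchuck', 'wood', 'chuck', 'a']

# key= expects a function; the lambda returns the value to sort by.
by_length = sorted(words, key=lambda word: len(word))

# map() applies the lambda to every element.
squares = list(map(lambda n: n ** 2, [1, 2, 3, 4]))

print(by_length)
print(squares)
```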