## Lesson 4 - Taking Input, Reading and Writing Files, Functions

* Shaw Exercises 11-26
* Lutz Chapters 9,14-17

### Taking input

In Shaw's _Learn Python The Hard Way_, he uses `raw_input()` and `argv` to take input from the user. These don't work very well with Jupyter notebooks, but we will cover them because they can be very useful.

#### `raw_input()`

In Python 3, `raw_input()` has been renamed `input()`. Newer versions of Jupyter Notebook support this kind of input, but the cell then waits for you to type a value every time it runs. In a notebook, it is usually better to just 'hard code' the value of a variable.

In [1]:
x = input()

5


In [2]:
y = 6
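
One detail worth knowing is that `input()` always returns a string, so the `5` typed above is stored as `'5'`, not the number 5. The sketch below shows one way to convert it; `parse_number` is a hypothetical helper, and the typed value is hard coded so the cell runs without waiting for input.

```python
def parse_number(text):
    """Convert user-typed text to a float (hypothetical helper)."""
    return float(text.strip())

typed = '5'            # stands in for: typed = input('Enter a number: ')
x = parse_number(typed)
print(x + 1)           # arithmetic works once the string is converted
```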

#### `argv`

`argv` is a list in the `sys` module that holds the command-line arguments given to your script: the script name first, then any values you passed. It lets you pass strings, numbers, and filenames to your Python code, although every value arrives as a string. It isn't useful in Jupyter Notebooks, however, so you'll have to use a workaround. We can comment out the `argv` references and hard code the values we would have passed. Later, when we select "Download as > Python (.py)", we can open up that .py file and uncomment the `argv` references. Either way, it's a good idea to define all your variables and file paths at the start of your notebook.

In [3]:
#from sys import argv

script = 'something.py' #argv[0]
value1 = 5 #argv[1]
value2 = 6 #argv[2]
value3 = 'hello' #argv[3]

print("script: %s\nfirst: %s\nsecond: %s\nthird: %s" % (script, value1, value2, value3))

script: something.py
first: 5
second: 6
third: hello
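
As an alternative to commenting and uncommenting, the sketch below falls back to default values when an argument is missing. `get_args` is a hypothetical helper, not something from Shaw or the standard library, and the argument list is passed in explicitly so the same code can be tried in a notebook.

```python
def get_args(argv, defaults):
    """Return command-line values, using defaults for any that are missing.

    Hypothetical helper: argv[0] is the script name, so values start at argv[1].
    """
    args = list(argv[1:])
    return args + list(defaults[len(args):])

# In a script you would write: from sys import argv; get_args(argv, [...])
value1, value2, value3 = get_args(['something.py', '5'], ['1', '2', 'hello'])

# Values from argv are strings, so convert before doing arithmetic
print(int(value1) + int(value2))
print(value3)
```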


For example, a typical Jupyter Notebook might begin something like this:

In [30]:
# import required packages
import pandas as pd
import numpy as np

In [5]:
# define file paths and variables
path_input_file = '~/sio209/input.txt'
path_output_file = '~/sio209/output.txt'
iterations = 10
evalue = 1e-5
color = 'dark blue'
title = 'My plot'
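
One caveat with paths like these: Python's built-in `open()` does not expand `~` to your home directory. A minimal sketch, assuming you want to keep the `~` shorthand, is to expand it once when the path is defined:

```python
import os

# expanduser replaces a leading '~' with the current user's home directory
path_input_file = os.path.expanduser('~/sio209/input.txt')
print(path_input_file)
```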

### Reading files

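We can read a text file by opening it with `open()` and then printing or using its contents all at once or one line at a time. Note that the file object (called a `TextIOWrapper`) keeps track of its position in the file: each read advances the position, so once you reach the end, further reads return an empty string. The file on disk is not changed.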

In [6]:
#from sys import argv

filename = 'input_file.txt' #argv[1]

#### Read all at once

In [15]:
filename = 'test.txt'
txt = open(filename)

print("Here's your file %r:" % filename)
print(txt.read())
print(txt.read())
txt.read()

Here's your file 'test.txt':
Line 1
Line 2
Line 3




''

In [11]:

txt.read()

''

In [6]:
type(txt)

_io.TextIOWrapper
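
The empty strings above come from the file position sitting at the end of the file. The sketch below, which writes its own throwaway file so it is self-contained (the file name `demo.txt` is made up for illustration), shows that `seek(0)` moves the position back to the start so the contents can be read again.

```python
import os
import tempfile

# Create a small throwaway file so the example does not depend on test.txt
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w') as f:
    f.write('Line 1\nLine 2\n')

with open(path) as txt:
    first = txt.read()     # reads everything; the position is now at the end
    second = txt.read()    # nothing left to read, so this is ''
    txt.seek(0)            # move the position back to the start
    third = txt.read()     # the full contents again

print(repr(second))
print(first == third)
```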

#### Read one line at a time

In [10]:
txt = open(filename)

txt.readline()

'1,This is line one.\n'

In [11]:
txt.readline()

'2,This is line two.\n'

In [12]:
txt.readline()

'3,This is line three.\n'

In [13]:
txt.readline()

''

#### Read lines as a list

In [14]:
txt = open(filename)

txt.readlines()

['1,This is line one.\n', '2,This is line two.\n', '3,This is line three.\n']

#### Open in a `with` block. Then use `for` loop, `read()`, `readline()`, or `readlines()`.

In [15]:
with open(filename, 'r') as f:
    for line in f:
        line = line.rstrip()
        print(line)

1,This is line one.
2,This is line two.
3,This is line three.


In [16]:
with open(filename, 'r') as f:
    print(f.read())

1,This is line one.
2,This is line two.
3,This is line three.



In [17]:
with open(filename, 'r') as f:
    print(f.readline())

1,This is line one.



In [18]:
with open(filename, 'r') as f:
    print(f.readlines())

['1,This is line one.\n', '2,This is line two.\n', '3,This is line three.\n']
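
Two things are worth illustrating here: the file is closed automatically when the `with` block ends, and each line is just a string that you can split into fields yourself. The sketch below writes a throwaway file in the same format as the example (the name `demo.txt` is made up) and then parses it.

```python
import os
import tempfile

# Throwaway file in the same "number,text" format as the example above
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w') as f:
    f.write('1,This is line one.\n2,This is line two.\n')

records = []
with open(path, 'r') as f:
    for line in f:
        # Strip the newline, then split on the first comma only
        number, text = line.rstrip('\n').split(',', 1)
        records.append((int(number), text))

print(f.closed)    # the with block has already closed the file
print(records)
```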


#### Pandas can also read files, and works best with tabular data

In [38]:
df = pd.read_csv(filename, header=None)

In [39]:
df

|   | 0 |
|---|---|
| 0 | Line 1 |
| 1 | Line 2 |
| 2 | Line 3 |


### Writing files

We can write files using `write()`.

In [18]:
outfile = 'output_file.txt'

In [19]:
line1 = 'Output line 1.'
line2 = 'Next line of output.'
line3 = 'Last line of output.'

#### Write the most basic way

In [27]:
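# 'w' creates the file if needed and overwrites any existing contents
# ('r+' would fail if the file did not already exist)
target = open(outfile, 'w')

target.write(line1)
target.write('\n')
target.write(line2)
target.write('\n')
target.write(line3)
target.write('\n')

# without a with block, we must close the file ourselves
target.close()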

#### Write in a `with` block

Again, we can use `with` to simplify things (avoid having to `close()` the file).

In [35]:
with open(outfile, 'w') as target:
    target.write(line1)
    target.write('\n')
    target.write(line2)
    target.write('\n')
    target.write(line3)
    target.write('\n')
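
The mode passed to `open()` controls what happens to existing contents: `'w'` starts the file over, while `'a'` appends to the end. The sketch below writes to a temporary directory (an assumption made so it does not touch your real `output_file.txt`) and reads the result back to check it.

```python
import os
import tempfile

outfile = os.path.join(tempfile.mkdtemp(), 'output_file.txt')
lines = ['Output line 1.', 'Next line of output.']

# 'w' creates the file, or empties it if it already exists
with open(outfile, 'w') as target:
    for line in lines:
        target.write(line + '\n')

# 'a' keeps the existing contents and adds to the end
with open(outfile, 'a') as target:
    target.write('Last line of output.\n')

with open(outfile) as check:
    contents = check.read()
print(contents)
```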

#### Write with Pandas to comma-separated values or tab-separated values

In [40]:
# df has a single column, labeled 0, so select it by that label
df2 = df[0]

In [36]:
df2

In [27]:
df2.to_csv('output_pandas.csv')

In [28]:
df2.to_csv('output_pandas.tsv', sep='\t')
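
By default, `to_csv()` also writes the row index as the first column. A minimal sketch of leaving it out with `index=False` follows; the DataFrame `df_demo` and its column names are made up for illustration, and the output goes to an in-memory buffer rather than a file so the result can be inspected directly.

```python
import io
import pandas as pd

# Made-up two-column table for illustration
df_demo = pd.DataFrame({'n': [1, 2], 'text': ['Line 1', 'Line 2']})

buffer = io.StringIO()
df_demo.to_csv(buffer, sep='\t', index=False)   # tab-separated, no index column
tsv_text = buffer.getvalue()
print(tsv_text)
```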

### Functions

Functions allow you to carry out the same task multiple times. This reduces the amount of code you write, reduces mistakes, and makes your code easier to read.

#### Printing

In [29]:
def print_a_string(string):
    print('%s' % string)

In [30]:
print_a_string('Here is a string.')

Here is a string.


In [31]:
x = 'A string saved as a variable.'
print_a_string(x)

A string saved as a variable.


In [32]:
def print_two_things(one, two):
    print('%s AND %s' % (one, two))

In [33]:
x = 'yes'
y = 10
print_two_things(x, y)

yes AND 10


In [34]:
def print_three_things(*blob):
    v1, v2, v3 = blob
    print('%s,%s,%s' % (v1, v2, v3))

In [35]:
print_three_things('a', 31, ['x', 'y', 'z'])

a,31,['x', 'y', 'z']


In [36]:
def add_two(num1, num2):
    print(num1 + num2)

In [37]:
add_two(10, 5)

15


In [38]:
add_two(1.3, 4.4)

5.7


In [39]:
add_two('AAA', 'bbb')

AAAbbb


#### Returning

In [40]:
def combine_three_with_commas(*blob):
    v1, v2, v3 = blob
    return '%s,%s,%s' % (v1, v2, v3)

In [41]:
combine_three_with_commas(40, 50, 60)

'40,50,60'

In [42]:
x = combine_three_with_commas(44, 55, 66)

In [43]:
x

'44,55,66'
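
Because `combine_three_with_commas` unpacks `*blob` into exactly three names, calling it with any other number of arguments raises a `ValueError`. A hedged variant that accepts any number of values (`combine_with_commas` is a hypothetical name) could look like this:

```python
def combine_with_commas(*blob):
    """Join any number of values with commas (hypothetical variant)."""
    # str() is needed because join only accepts strings
    return ','.join(str(v) for v in blob)

print(combine_with_commas(40, 50, 60))
print(combine_with_commas('a', 31))
```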

In [44]:
x = 100
y = 100
z = add_two(x, y)
z

200
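
Note that the `200` above is printed by `add_two` itself. Since `add_two` has no `return` statement, `z` is actually `None`. The sketch below contrasts the two behaviors; the function names are made up for the comparison.

```python
def add_two_print(num1, num2):
    print(num1 + num2)       # shows the result but returns None

def add_two_return(num1, num2):
    return num1 + num2       # hands the result back to the caller

z_print = add_two_print(100, 100)
z_return = add_two_return(100, 100)

print(z_print)
print(z_return)
```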


In [45]:
def sum_product_exponent(v1, v2):
    s = v1 + v2
    p = v1 * v2
    e = v1 ** v2
    return s, p, e

In [46]:
sum_product_exponent(2, 5)

(7, 10, 32)

In [47]:
my_sum, my_product, my_exponent = sum_product_exponent(2, 5)

In [48]:
my_sum

7

In [49]:
my_product

10

In [50]:
my_exponent

32

### Assignment for Lesson 4

This assignment will test what you learned in Shaw (_LPTHW_) Exercises 11-26 and the above Jupyter notebook. 

1. Write some code, without using functions, that calculates the average of 5 numbers. Do it three different ways:
    * Write a .py file that takes input from the command line using `raw_input()` (Python 2) or `input()` (Python 3). After the script works, paste the text of the file into your notebook.
    * Write a .py file that takes input from the command line using `argv`. After the script works, paste the text of the file into your notebook.
    * Enter code into two Jupyter notebook cells: the first stores the values as variables, and the second computes the average.
2. Using functions, write some code that takes two strings, prints them with the first letter capitalized, prints them with all letters capitalized, prints the first and last letter of each, prints the length of each, and then prints the concatenation of the two strings. Do it two different ways:
    * Write a .py file that uses `argv`. After the script works, paste the text of the file into your notebook.
    * In your Jupyter notebook, comment out the `argv` portions and hard code in the values of your strings. Then make sure the code runs the same.
3. Using a text editor, create a comma-separated values file with 5 columns and 5 rows. Save it in the same directory as your Jupyter notebook. In the Jupyter notebook, read and print the file in different ways, and write new files, as follows:
    * Read your .csv file using `read()`, `readline()`, or `readlines()`, and print the output to the screen (`print` command is optional in notebooks!).
    * Do the same but use a `with` block and a different one of `read()`, `readline()`, or `readlines()`.
    * Using either of the two above methods, then change one row of data, and write your csv data to a new file.
    * Read your .csv file using Pandas and display the resulting DataFrame.
    * Save your DataFrame to a new file using Pandas.