![Py4Eng](https://dl.dropboxusercontent.com/u/1578682/py4eng_logo.png)

# I/O: `input`, files, filesystem
### Yoav Ram

# I/O

## User input

The `input` function is useful to get string input from the user. It works in the notebook, as well as when running scripts in the console.

In [32]:
name = input("What's your name?\n")

print("Hi", name)

What's your name?
Yoav
Hi Yoav


In [33]:
n_icecreams = input("How many icecreams would you like?")
price = input("How much does an icecream cost?")
print("That would be", price * n_icecreams)

How many icecreams would you like?2
How much does an icecream cost?1.5


TypeError: can't multiply sequence by non-int of type 'str'

For security reasons, `input` returns strings. It is the program's responsibility to convert the string to the desired type:

In [36]:
n_icecreams = int(input("How many icecreams would you like?"))
price = float(input("How much does an icecream cost?"))
print("That would be", price * n_icecreams)

How many icecreams would you like?2
How much does an icecream cost?1.5
That would be 3.0
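
If the user types something that is not a number, `int` and `float` raise a `ValueError`. Here is a minimal sketch of catching that error. The helper name `total_price` is hypothetical, and it takes the strings as arguments (as they would come back from `input`) so that it can be tried without typing anything:

```python
def total_price(n_text, price_text):
    """Convert two input strings to numbers and return the total price.

    Returns None when either string cannot be converted.
    """
    try:
        n_icecreams = int(n_text)     # raises ValueError for "two" or "1.5"
        price = float(price_text)     # raises ValueError for "[1,2,3]"
    except ValueError:
        return None
    return price * n_icecreams


print(total_price("2", "1.5"))      # 3.0
print(total_price("two", "1.5"))    # None
```

Returning `None` is only one possible design; a real program might instead print a message and ask again.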


You can use `eval` to evaluate the input string as a Python expression, but don't do it if you don't trust the user: `eval` runs whatever code the string contains, so it can lead to strange behaviour and side effects.

Let's see what happens when we give valid input (`2` and `1.5`) and when we give invalid input (`2` and `[1,2,3]`). Try it with `eval` and with the above code (`int` and `float`).

In [39]:
n_icecreams = eval(input("How many icecreams would you like?"))
price = eval(input("How much does an icecream cost?"))
print("That would be", price * n_icecreams)

How many icecreams would you like?2
How much does an icecream cost?1.5
That would be 3.0
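
If the goal is only to accept Python literals (numbers, strings, lists and so on), the standard library's `ast.literal_eval` is a more restrictive alternative to `eval`: it refuses anything that is not a literal, such as a function call. The sketch below is one possible way to use it; the helper name `parse_literal` is hypothetical.

```python
import ast


def parse_literal(text):
    """Parse text as a Python literal; return None if it is anything else."""
    try:
        return ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return None


print(parse_literal("2"))          # 2
print(parse_literal("1.5"))        # 1.5
print(parse_literal("[1,2,3]"))    # [1, 2, 3]
print(parse_literal("__import__('os').getcwd()"))   # None: calls are rejected
```

Note that a list is still a valid literal, so the program must still check that it got the type it expected.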


## Exercise

Ask the user for a number between 1 and 10; if the number is not within that range, let the user know and ask again.

## Files

We'll start with simple text files and proceed to more complex formats.  
Let's read the list of crop plants located in `data/crops.txt`.

### Reading files
Whenever we want to work with a file, we first need to _open_ it. This is, not surprisingly, done using the `open` function.  
This function returns a file object which we can then use.

In [7]:
f = open(r'..\data\crops.txt','r')
crops = f.read()
f.close()
print(crops[:100])

Abelmoschus caillei
Abelmoschus esculentus
Acacia mearnsii
Acacia senegal
Acacia seyal
Acca sellowia


The `open` function receives two parameters: the path to the file you want to open and the mode of opening (both strings). In this case the mode is `r`, for reading.

Notice the `r` before the string. This tells Python to treat the string as a *raw string*, in which a backslash is kept as typed instead of starting an escape sequence. Without it, a path that happens to contain `\t` or `\n` would silently end up with a tab or a newline in it. This is important for Windows-style paths, but you can also give Python paths in Unix style (`data/crops.txt`) and it would work fine.
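
To see the difference a raw string makes, compare the two literals below. The path is hypothetical and chosen only because it contains `\t` and `\n`:

```python
plain = '..\temp\new.txt'    # \t becomes a tab and \n becomes a newline
raw = r'..\temp\new.txt'     # backslashes are kept as typed

print(len(plain))                     # 13
print(len(raw))                       # 15
print(raw == '..\\temp\\new.txt')     # True: doubling the backslash also works
```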

`read` returns *all* the text from the file as a string. 

`close` then closes the file handle.

A more idiomatic way to do this, in which Python takes care of closing the file handle, is:

In [8]:
with open(r'..\data\crops.txt','r') as f:
    crops = f.read()
print(crops[:100])

Abelmoschus caillei
Abelmoschus esculentus
Acacia mearnsii
Acacia senegal
Acacia seyal
Acca sellowia


This idiom uses a [context manager](https://www.python.org/dev/peps/pep-0343/), and the file handle `f` is closed when the context manager block ends.
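
Roughly speaking, the `with` block does the same job as a `try`/`finally` that always calls `close`. The sketch below shows both forms side by side; it writes to a throwaway file created with `tempfile.mkstemp` so that it does not depend on `crops.txt`.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
fd, path = tempfile.mkstemp(suffix='.txt')
os.close(fd)

f = open(path, 'w')
try:
    print('hello', file=f)
finally:
    f.close()          # runs even if the block above raises an exception

with open(path, 'r') as g:
    text = g.read()

print(g.closed)        # True: the with block closed the file
print(text.strip())    # hello
os.remove(path)        # clean up the throwaway file
```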

### Iterating over the file object

We can simply use a _for_ loop to go over all lines. This is the _best practice_, and also very simple to use:

In [10]:
with open(r'..\data\crops.txt','r') as f:
    for line in f:
        if line.startswith('Musa'):   # check if line starts with a given string
            print(line.strip())  # strip removes surrounding whitespace, including the \n at the end of the line

Musa balbisiana
Musa spp.
Musa textilis


#### Reading line by line with readline()
The `readline()` method allows us to read a single line each time. It works very well when combined with a _while_ loop, giving us good control of the program flow.

In [12]:
with open(r'..\data\crops.txt','r') as f:
    line = f.readline()    # read first line
    print(line)
    while line:            # readline returns '' only at the end of the file
        if line.startswith('Triticum'):
            print(line.strip())
        line = f.readline()    # read the next line

Abelmoschus caillei

Triticum aestivum
Triticum dicoccum
Triticum durum
Triticum monococcum
Triticum spelta
Triticum turanicum


There are other methods you can use to read files. For example, the `readlines()` method returns all the lines as a list of strings.
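
As a small illustration of `readlines()`, the sketch below uses `io.StringIO`, which behaves like an open text file, with three sample lines instead of the real `crops.txt`:

```python
import io

# io.StringIO behaves like an open text file; the three names are sample data.
f = io.StringIO("Musa balbisiana\nMusa spp.\nMusa textilis\n")

lines = f.readlines()
print(lines)    # ['Musa balbisiana\n', 'Musa spp.\n', 'Musa textilis\n']

f.seek(0)                        # go back to the start of the "file"
names = f.read().splitlines()    # same lines, without the trailing '\n'
print(names)    # ['Musa balbisiana', 'Musa spp.', 'Musa textilis']
```

Both approaches load the whole file into memory, so for large files the `for` loop over the file object is usually the better choice.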

## Exercise

1) Print the last line in the file.  
2) Find out how many _Garcinia_ species are in the file (use the `startswith()` method).

### Writing to a file

To write to a file, we first have to open it for writing. This is done using one of two modes: 'w' or 'a'.  
'w', for write, will let you write into the file. If the file doesn't exist, it will be created automatically. If it exists and already has some content, __the content will be overwritten__.  
'a', for append, is very similar, except that it does not overwrite: your text is added to the end of the existing file.

Writing is done using good, old `print()`, only we add the argument `file = <file object>`.

In [14]:
with open(r'tmp.txt','w') as f:
    print('This is the first line', file=f)
    line = 'Another line'
    print(line, file=f)
    msg1 = 'Hello'
    msg2 = 'World!'
    print(msg1 + msg2, file=f)

In [18]:
%less tmp.txt
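
The difference between the 'w' and 'a' modes can be checked with a short experiment. This is a sketch that works inside a throwaway directory; the file name `modes.txt` is hypothetical.

```python
import os
import tempfile

# Work inside a throwaway directory so no real file is touched.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'modes.txt')   # hypothetical file name

    with open(path, 'w') as f:
        print('first', file=f)
    with open(path, 'a') as f:      # append: keeps 'first'
        print('second', file=f)
    with open(path) as f:
        after_append = f.read()

    with open(path, 'w') as f:      # write: discards the previous content
        print('third', file=f)
    with open(path) as f:
        after_write = f.read()

print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_write))    # 'third\n'
```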

### Temporary files

Temporary files are easily created using the _tempfile_ module:

In [19]:
import tempfile

In [28]:
with tempfile.NamedTemporaryFile(mode='w', delete=False) as f:  # safer than the deprecated mktemp
    fname = f.name
    print("Writing to temp file", fname)
    print("This is a temporary file", file=f)

Writing to temp file C:\Users\yoavram\AppData\Local\Temp\tmpy7a2qmdo


In [29]:
%less $fname

See the other functions in *tempfile* for creating temporary directories, named temporary files, etc.
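
For example, `tempfile.TemporaryDirectory` creates a directory that is deleted, together with everything in it, when its `with` block ends. A minimal sketch, with a hypothetical file name `notes.txt`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    fname = os.path.join(folder, 'notes.txt')   # hypothetical file name
    with open(fname, 'w') as f:
        print("This is a temporary file", file=f)
    existed_inside = os.path.exists(fname)

print(existed_inside)            # True
print(os.path.exists(folder))    # False: the directory and its contents were removed
```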

# Filesystem

Python offers plenty of ways to interact with the filesystem through the `os` and `os.path` modules.

Importing `os` also makes `os.path` available:

In [40]:
import os

In [45]:
files = os.listdir()
for fname in files:
    if os.path.isdir(fname):
        print(fname, "is a folder")
    elif os.path.isfile(fname):
        size = os.path.getsize(fname)
        print(fname, "is a file with size", size, "bytes")
    else:
        pass # do nothing

.ipynb_checkpoints is a folder
csv.ipynb is a file with size 5950 bytes
data.csv is a file with size 377 bytes
lecture5.ipynb is a file with size 56155 bytes
regexp.ipynb is a file with size 32049 bytes
session1.ipynb is a file with size 32039 bytes
session2.ipynb is a file with size 44817 bytes
session3.ipynb is a file with size 51595 bytes
session3a.ipynb is a file with size 21963 bytes
session4.ipynb is a file with size 24914 bytes
session5.ipynb is a file with size 34603 bytes
tmp.txt is a file with size 51 bytes


Here's a combination of functions to get the current directory (`os.getcwd`), change the directory (`os.chdir`), check if a file exists (`os.path.exists`), and split a filename from its extension:

In [53]:
curdir = os.getcwd()
os.chdir(r'..\data')
fname = 'crops.txt'
print(fname, 'exists?', os.path.exists(fname))
fname = os.path.splitext('crops.txt')[0] + '.csv'
print(fname, 'exists?', os.path.exists(fname))
os.chdir(curdir)

crops.txt exists? True
crops.csv exists? False


See the [os](https://docs.python.org/3.4/library/os.html) and [os.path](https://docs.python.org/3.4/library/os.path.html#module-os.path) modules for more functions.

## Colophon
This notebook was written by [Yoav Ram](http://www.yoavram.com) and is part of the _Python for Engineers_ course.

The notebook was written using [Python](http://python.org/) 3.4.4, [IPython](http://ipython.org/) 4.0.3 and [Jupyter](http://jupyter.org) 4.0.6.

This work is licensed under a CC BY-NC-SA 4.0 International License.

![Python logo](https://www.python.org/static/community_logos/python-logo.png)