# Interactive Input & File I/O

## Overview

Up to now all our programs have run in their own little bubble: they define the variable values they use, and after they finish they leave little trace that they were ever run.

Dealing and interacting with files is a very common way of processing information stored in a computer, and is something that your programs might often do. It is when our programs work with files, and so go beyond their "logical bubble", that we can start doing some damage: delete information in files, delete files altogether, fill up the hard disk, or damage your installation (although the operating system should have safeguards against this). This should not intimidate you; it is just that with power comes responsibility.

In this notebook we will first go over the basic operations on files.

## Interactivity: reading user input

One important aspect when running a program outside of the notebook, where the user cannot see or change variable values easily, is to provide a way to prompt the user for input. The function *input()* prompts the user and collects input typed on the keyboard. This is the simplest way to create interactive programs.

    keys=input("I will collect what you type: ")
    print("What you typed was:",keys)

The first line of code presents the user with a string and, in the notebook, a text box that collects whatever you type on the keyboard until you hit return. (Outside of the notebook this text box does not appear, but we will have the chance to check that behaviour later on.) Whatever you type is stored as a string, in this case in the variable *keys*.

If we expect numerical input from the user, the string needs to be converted.

    number_string = input("How many carbon atoms are there in butene? ")
    number_number = int(number_string)  # convert the typed string to an integer
    if number_number == 4:
        print("Obviously!")
    else:
        print("Are you a physicist, or what?!")
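
If the user types something that is not a whole number, *int()* raises a *ValueError* and the program stops. The following is a minimal sketch of a more forgiving conversion; the helper name *to_int_or_none* is hypothetical and only serves as an illustration.

```python
def to_int_or_none(text):
    """Convert text to an integer, or return None if that is not possible."""
    try:
        # int() tolerates surrounding blank spaces, so " 4 " is accepted
        return int(text)
    except ValueError:
        # Raised for input such as "four" or "4.0"
        return None

print(to_int_or_none("4"))
print(to_int_or_none("four"))
```

In an interactive program the argument would be the string returned by *input()*, and a result of *None* could be used to ask the question again.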

## Dealing with files

On any computer hard disk there are typically two types of files: text files and binary files. Text files are those you can read with a text editor (think of Notepad, not Microsoft Word), and they do not necessarily contain prose. (If you open this notebook, which has the extension \*.ipynb, in a text editor you will see that it is text in some form. The Jupyter notebook is a text file.) Binary files, on the other hand, cannot be opened in a text editor, or when they do open they show up as unintelligible characters. (Picture files like PNG or JPEG, music files like MP3, video files, and Word DOC files are all binary files.)

We will only be looking at manipulating text files. Although Python can also manipulate binary files, the caveat of such files is that the programmer needs to know *a priori* how the content of the file is structured, and since one cannot just look at the file in a text editor, this task is harder. Binary files further require knowledge of some low-level computer architecture, which is beyond the scope of this course. Manipulating text files will nevertheless illustrate the process, and most instruments and computer programs are able to write data in some form of text file.

We will start by loading the content of the file <a href="test.out">test.out</a> (you can open the file to look at its content). First we need to open a stream to the file content and assign it to a variable. This is done with the function *open()*

    stream1=open('test.out','r')

This takes a string containing the name of our file (including the path if the file is not in the same directory as the notebook/script) and, as a second argument, a flag telling Python whether we are going to read from or write to the file ('r' for reading and 'w' for writing).
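
Opening a file with 'r' requires the file to exist; otherwise Python raises a *FileNotFoundError*. The sketch below shows this with a hypothetical file name inside a freshly created, empty temporary directory, so that it does not depend on any file of the course.

```python
import os
import tempfile

# Hypothetical file name inside a fresh, empty temporary directory
folder = tempfile.mkdtemp()
path = os.path.join(folder, "not_there.txt")

try:
    stream = open(path, "r")   # 'r' requires the file to exist
except FileNotFoundError:
    print("Cannot open the file for reading: it does not exist")
    opened = False
else:
    stream.close()
    opened = True
```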

Via the variable *stream1* we now have direct access to the content of the file. If we loop over the file stream we can read the lines of the file one by one as strings

    for line in stream1:
        print(line)

Once we are done with the file we should tidy up, close the stream and leave the file in peace

    stream1.close()

If we are interested in a list with every line in the file we could build a list with a loop, or simply operate with the function *list()* on the stream

    stream2=open('test.out','r')
    list_content=list(stream2)
    stream2.close()
    list_content

Note the line break at the end of each line/list element, represented by '\n', which counts as a single character.
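
The trailing '\n' is also why printing each line in a loop leaves a blank line between them: *print()* adds its own line break after the one already in the string. Below is a minimal sketch of removing it with the string method *.rstrip()*, using a hypothetical list in place of the real file content.

```python
# Hypothetical content, shaped like what list(stream) returns
lines = ["first line\n", "second line\n"]

clean = []
for line in lines:
    # rstrip("\n") removes line breaks from the right end only
    clean.append(line.rstrip("\n"))

print(clean)
print(len(lines[0]), len(clean[0]))   # the '\n' counts as one character
```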

We should be slightly careful when performing this operation, as it puts all the content of the file into a list, which will be unmanageable if the file is several gigabytes in size.
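
When only a summary of the file is needed, looping over the stream keeps a single line in memory at a time. The sketch below counts lines and characters this way; the file name and its content are hypothetical and are written to a temporary directory first so that the example is self-contained.

```python
import os
import tempfile

# Create a small hypothetical file to work on
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
stream = open(path, "w")
stream.write("one\ntwo\nthree\n")
stream.close()

n_lines = 0
n_chars = 0
stream = open(path, "r")
for line in stream:
    # Only the current line is held in memory
    n_lines += 1
    n_chars += len(line)
stream.close()

print(n_lines, n_chars)
```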

It is just as simple to read a specific number of lines, using a loop and the method *.readline()*

    stream3=open('test.out','r')
    for i in range(2):
        line=stream3.readline()
        print(line)
    stream3.close()
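
When there are no lines left, *.readline()* does not fail: it returns an empty string. A short sketch, assuming a hypothetical two-line file created in a temporary directory:

```python
import os
import tempfile

# Create a hypothetical file with exactly two lines
path = os.path.join(tempfile.mkdtemp(), "two_lines.txt")
stream = open(path, "w")
stream.write("alpha\nbeta\n")
stream.close()

stream = open(path, "r")
first = stream.readline()
second = stream.readline()
third = stream.readline()   # nothing left to read
stream.close()

print(repr(first), repr(second), repr(third))
```

An empty string can therefore be used as the signal that the end of the file has been reached. A blank line inside the file is not empty, since it still contains '\n'.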

The housekeeping step of closing the stream can be a bit tedious and is easily forgotten. The following construct will close the stream for us when we are done

    with open('test.out','r') as stream:
        list_content_again=list(stream)
    list_content_again
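
To check that the stream really is closed once the indented block ends, we can look at its *closed* attribute. The sketch below does so on a hypothetical file created in a temporary directory.

```python
import os
import tempfile

# Create a small hypothetical file to read back
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as stream:
    stream.write("some text\n")

with open(path, "r") as stream:
    closed_inside = stream.closed    # still open inside the block
closed_outside = stream.closed       # closed once the block ends

print(closed_inside, closed_outside)
```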

If we are just interested in some specific information in the file we can search for it as we read it.

The following code looks for the line with the substring 'leaves' and extracts the colour of the leaves in the text.

    with open('test.out','r') as stream:
        for line in stream:
            if 'leaves' in line:
                leaf_colour = line.split()[4]

    leaf_colour[:-1]
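
The slice *[:-1]* drops the last character of the word whatever it is, which is only what we want when the word ends in a punctuation mark. A possible alternative, sketched here on a hypothetical line of text rather than on the content of test.out, is the string method *.strip()* with the characters to remove.

```python
# Hypothetical line of text, not taken from test.out
line = "The grass is green, and the water is clear.\n"
words = line.split()

raw = words[3]                      # the word with its comma attached
by_slice = raw[:-1]                 # removes the last character, whatever it is
by_strip = raw.strip(".,;:!?")      # removes only punctuation at either end

# On a word without punctuation the two approaches differ
print("green"[:-1], "green".strip(".,;:!?"))
```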

Note that the variable *line* holds, in turn, each line of the file as a string. If we are interested in individual words, we can form a list from a string with the *.split()* method.

    "Read my words. One by one.".split()

By default *.split()* separates the string on whitespace (spaces, tabs and line breaks), but we can choose any other character

    "Read my words. One by one.".split(".")

For completeness, the opposite operation to *.split()* is performed by *.join()*

    "--".join(['three','two','one','go'])

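One detail worth knowing is that *.split()* without arguments treats any run of whitespace as a single separator, whereas an explicit separator is applied at every single occurrence. A short sketch on a hypothetical string:

```python
text = "one  two\tthree\n"          # two spaces, a tab and a line break

default_split = text.split()        # runs of whitespace count as one separator
space_split = text.split(" ")       # every single space is a separator
tidy = " ".join(text.split())       # words joined back with single spaces

print(default_split)
print(space_split)
print(tidy)
```
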
Is it clear how we are obtaining the colour of the leaves? Write some code below that extracts the colour of the sky instead.

The authors of those words were probably not in their best mood. So we are going to change the text to make it more cheerful (even if slightly psychedelic). The goal is thus to construct a list of the verses in the lyrics, but with *blue* leaves, a *bay* sky and a *glorious* day. Let us call this list *cheerful*.

This task could be done in one go with a single *for* loop (if you are feeling comfortable you can try to implement such a solution). We will however break the task into two. First, create the list *cheerful* where each element is a verse of the original text split into words.

Now change your list such that you replace the wanted words in the text, and put each verse together using the *.join()* method such that *cheerful* is a list of strings with the verses of the lyrics.

If we want to write our cheerful version to a file, we just do

    with open('cheerful.out','w') as stream:
        stream.writelines(cheerful)
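
Keep in mind that *.writelines()* writes the strings exactly as they are and does not add line breaks. Since *.split()* discards the '\n' at the end of each verse, verses rebuilt with *.join()* need one appended if each is to land on its own line. The sketch below illustrates the difference with a hypothetical stand-in list (*verses_demo*) and a temporary file.

```python
import os
import tempfile

verses_demo = ["first verse", "second verse"]   # hypothetical, no '\n' at the end
path = os.path.join(tempfile.mkdtemp(), "demo.out")

# Written as they are, the strings run together on a single line
with open(path, "w") as stream:
    stream.writelines(verses_demo)
with open(path, "r") as stream:
    run_together = list(stream)

# Appending '\n' to each string gives one verse per line
with open(path, "w") as stream:
    for verse in verses_demo:
        stream.write(verse + "\n")
with open(path, "r") as stream:
    one_per_line = list(stream)

print(run_together)
print(one_per_line)
```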

Note that if you open a file for writing you **will overwrite whatever was initially in the file**.
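
A small demonstration of this behaviour, carried out on a hypothetical scratch file in a temporary directory so that nothing of value is lost:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "scratch.txt")

with open(path, "w") as stream:
    stream.write("original content\n")

# Opening the same file again with 'w' empties it before anything is written
with open(path, "w") as stream:
    stream.write("new content\n")

with open(path, "r") as stream:
    content = list(stream)

print(content)
```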

The <a href="https://docs.python.org/3/library/os.html">os module</a> provides many functions for interaction with the operating system, including the file system. We can get the list of the files in the current directory and see if our newly created file is in place

    import os
    os.listdir()
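
Rather than scanning the list by eye, we can test for the file directly. The sketch below uses a hypothetical file name in a temporary directory and shows two equivalent checks: membership in the list returned by *os.listdir()*, and *os.path.exists()*.

```python
import os
import tempfile

# Create a hypothetical file in a fresh temporary directory
folder = tempfile.mkdtemp()
path = os.path.join(folder, "cheerful_demo.out")
with open(path, "w") as stream:
    stream.write("hello\n")

names = os.listdir(folder)                 # names of the entries in the directory
found = "cheerful_demo.out" in names       # membership test on the list
exists = os.path.exists(path)              # direct check on the path

print(names, found, exists)
```

Such a check can also be done before opening a file with 'w', to avoid overwriting something by accident.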

## Summary

Files are an important means of storing data permanently. Handling files is important not only for processing data generated by instruments or other programs, but also for storing results generated by your own programs.

In this notebook we have seen how to access data in files. The function *open()* creates a stream to access the file content, which can be looped over line by line. Each line is retrieved as a string, which is where the string method *.split()* becomes useful for further processing.
