# Input and output in Python

## References
- [Input and output tutorial](https://docs.python.org/3/tutorial/inputoutput.html) from the official Python tutorial.
- [Is explicitly closing files important?](https://stackoverflow.com/questions/7395542/is-explicitly-closing-files-important)


## Introduction
An important aspect to learn in a new programming language is input/output (or I/O). For our program to be useful, we need to be able to interact with the outside world. This means performing operations like:
- reading and writing data to a file
- printing text output
- asking for user input
- connecting to databases or other network services


The majority of these operations are covered by the Python standard library. We are going to see how to use them in this chapter.


⚠️ As discussed in the chapter on functional programming, these functions perform so-called *side effects*: any code containing them is no longer *pure* and is not referentially transparent. The same function can return different values for the same argument when called multiple times, and it can have *long-distance* effects, that is, it can modify the state of the program somewhere else and lead to unexpected results. I therefore suggest separating input and output from the other computations in your program. For example, if a complex calculation requires several user inputs at several stages of the process, consider writing a function that only performs the calculation given all inputs, and collect those inputs separately, for example from a single file. This makes your code easier to debug, test and understand.
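
As a minimal sketch of this separation, the calculation below is a pure function, while reading and printing happen only at the edge of the program. The names `compound_interest` and `parse_inputs` and the sample values are illustrative assumptions, and the "file" is simulated with a list of lines so the example runs on its own.

```python
def compound_interest(principal, rate, years):
    """Pure calculation: the same inputs always give the same result."""
    return principal * (1 + rate) ** years


def parse_inputs(lines):
    """Turn three lines of text into typed values (still no I/O here)."""
    principal, rate, years = (line.strip() for line in lines[:3])
    return float(principal), float(rate), int(years)


# I/O lives at the edge of the program. These lines stand in for the
# contents of a single input file (hypothetical values).
raw_lines = ["1000\n", "0.05\n", "2\n"]

principal, rate, years = parse_inputs(raw_lines)
result = compound_interest(principal, rate, years)
print(f"Final amount: {result:.2f}")
```

Because `compound_interest` never touches the console or the file system, it can be tested with plain function calls.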




## String input and output 
The most basic I/O operation in python is displaying text on the python console. This is done using the `print` function:

In [1]:
print("I am some text")

I am some text


The complementary operation, prompting for *user input*, is done through `input`.
The function takes one string as argument, which is the prompt displayed on the console. The return value of the function is the text the user types, as a string:

In [3]:
user_input = input("Write some text")
print(f"The user wrote {user_input}")

The user wrote fasdf


⚠️ Note that `input` is a **blocking operation**. The program stops and waits until the user provides an input. Therefore, use `input` **very carefully** and only when truly necessary. A common issue is that someone first writes an interactive program expecting user input through `input` and later integrates it into a larger application that is supposed to run automatically without any user interaction. Suddenly, the application stops somewhere and does not run further because of a well-hidden `input` call...
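
One possible way to reduce this risk, sketched here under assumptions, is to pass the prompting function in as a parameter and to prompt only when no value was supplied. The function `ask_name` and its parameters `prompt_fn` and `default` are hypothetical names introduced for illustration.

```python
def ask_name(prompt_fn=input, default=None):
    """Return `default` if one is given; otherwise prompt the user."""
    if default is not None:
        return default
    return prompt_fn("What is your name? ")


# Automated run: a value is supplied, so nothing blocks.
name = ask_name(default="Ada")

# Test run: a stand-in function replaces the real `input`.
fake_name = ask_name(prompt_fn=lambda prompt: "Grace")

print(name, fake_name)
```

With this design the blocking call is visible in the function signature instead of being hidden deep inside the code.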



## File I/O
A second very common I/O operation in most programming languages is reading from and writing to files. This is more complex than writing to the console and consists of several steps:

1. You need to find *where* in the file system the file is located. This gives you the so-called *path* to the file.
2. You need to *open* the file for reading or writing. In some operating systems, this *locks* the file, so that other users or processes cannot write simultaneously with you.
3. Now, you can write (or read) the contents of the file.
4. Finally, you need to *close* the file to make it accessible to other processes again and to finalise the writing: in some implementations, `write` only puts the text in a temporary location in memory (a so-called *buffer*), and the content only reaches the file when you close it. This is done to improve response times, as writing to memory is faster than writing to a file on disk.
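
The four steps above can be sketched as follows. To keep the example runnable anywhere, it writes to a temporary directory; the file name `steps.txt` is an assumption made for illustration.

```python
import tempfile
from pathlib import Path

# 1. Build the path (inside a temporary directory for this sketch).
folder = Path(tempfile.mkdtemp())
path = folder / "steps.txt"

# 2. Open the file for writing.
file_ob = open(path, "w")
try:
    # 3. Write; the text may still sit in a buffer at this point.
    file_ob.write("buffered text\n")
finally:
    # 4. Close, which flushes the buffer and releases the file.
    file_ob.close()

written = path.read_text()
print(repr(written))
```

The `try`/`finally` block makes sure step 4 runs even if step 3 raises an error.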

### Reading from a file

Let's see how to do this with an example: we want to open the file [hello.txt](./data/hello.txt) and read its contents.

1. The path is already identified: we know the file is in `./data/hello.txt`. We save this in a variable `path`.
2. We can now use the built-in function [`open`](https://docs.python.org/3/library/functions.html#open) to open the file. This function returns a [file object](https://docs.python.org/3/glossary.html#term-file-object) that we can use to further manipulate the file. To ensure we only open the file for reading, we pass the string `"r"` as the second argument of `open`.
3. Now we can read the contents using `read`, `readline` or `readlines`. `read(n)` reads at most `n` characters from the file (or everything that is left if `n` is omitted), `readline` reads a single line, while `readlines` reads the whole file content as a list of strings, one item per line in the file. This is useful when we only want to read part of a file, or when the file is too big to fit in memory and we can only read it piece by piece.
4. Finally, we close the file using the `close` method on the file object.




In [4]:
path = "./data/hello.txt"
file_ob = open(path, 'r')
contents = file_ob.readlines()
file_ob.close()
print(contents)

['Hello, I am a file with some text.\n']
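
To compare the three reading methods side by side, here is a small sketch. It first creates its own sample file in a temporary directory (the file name and contents are assumptions) so that it does not depend on the `data` folder.

```python
import tempfile
from pathlib import Path

# Create a small sample file to read from.
sample = Path(tempfile.mkdtemp()) / "sample.txt"
sample.write_text("first line\nsecond line\nthird line\n")

file_ob = open(sample, "r")
first_chars = file_ob.read(5)       # at most 5 characters
rest_of_line = file_ob.readline()   # from the current position to the end of the line
remaining = file_ob.readlines()     # every line that is left, as a list
file_ob.close()

print(repr(first_chars))   # 'first'
print(repr(rest_of_line))  # ' line\n'
print(remaining)           # ['second line\n', 'third line\n']
```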


Notice that calling `read`, `readline` or `readlines` *consumes* the file, either fully or to the corresponding location. This means that if we call `readlines` twice, we will get an empty list the second time:

In [20]:
path = "./data/hello.txt"
file_ob = open(path, 'r')
contents = file_ob.readlines()
print(contents)
other_contents = file_ob.readlines()
print(other_contents)
file_ob.close()


['Hello, I am a file with some text.\n']
[]
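
If the contents are needed a second time, one option is to move the read position back to the start with the file object's `seek` method. The sketch below uses a temporary copy of the text (an assumption, so that it runs without the `data` folder).

```python
import tempfile
from pathlib import Path

sample = Path(tempfile.mkdtemp()) / "hello_copy.txt"
sample.write_text("Hello, I am a file with some text.\n")

file_ob = open(sample, "r")
first_pass = file_ob.readlines()
empty_pass = file_ob.readlines()   # the position is at the end: nothing left
file_ob.seek(0)                    # move the position back to the start
second_pass = file_ob.readlines()
file_ob.close()

print(first_pass == second_pass, empty_pass)
```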


We can use this to read a file line-by-line by iterating over the file with a `for` loop or a list comprehension. The file object implements the [iterator](https://docs.python.org/3/glossary.html#term-iterator) protocol:

In [26]:
path = './data/lines.txt'
file_ob = open(path, 'r')
for line in file_ob:
    print(line)
file_ob.close()

this

file

has multiple

lines

how many

lines

does this file

have?


This is the most *pythonic* way to read a file line-by-line instead of reading the full contents at once.
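
The blank lines in the output above appear because each `line` still ends with its own `\n` and `print` adds another one. A minimal sketch of stripping the trailing newline while iterating, using a temporary file with assumed contents:

```python
import tempfile
from pathlib import Path

sample = Path(tempfile.mkdtemp()) / "lines_copy.txt"
sample.write_text("this\nfile\nhas multiple\nlines\n")

stripped = []
file_ob = open(sample, "r")
for number, line in enumerate(file_ob, start=1):
    text = line.rstrip("\n")   # drop the newline kept from the file
    stripped.append(text)
    print(number, text)
file_ob.close()
```

Only one line is held in memory at a time, so the same loop also works for files that are too large to read at once.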

### Writing to a file
The process of writing data to a file is very similar. The main differences are:
- We use `w` as the second argument of `open` to specify that we want to write to the file. If the file already exists, its contents are erased before we write anything to it. If you want to append to the file instead, use `a`.
- We use `write` to write a *string* to the file. Other types of object must be converted to a string before being written.

Let's see this in action by writing your name to a file called `me.txt` in [data](./data/).

In [5]:
path = "./data/me.txt"
file_ob = open(path, "w")
file_ob.write("Simone")
file_ob.close()

Congratulations! Your name is now written in stone. 
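
The difference between the `w` and `a` modes, and the need to convert non-string values, can be sketched as follows. The file name `log.txt` and the values written are assumptions, and a temporary directory is used so nothing in `data` is overwritten.

```python
import tempfile
from pathlib import Path

path = Path(tempfile.mkdtemp()) / "log.txt"

file_ob = open(path, "w")        # "w": create the file, or erase it if it exists
file_ob.write("first run\n")
file_ob.close()

file_ob = open(path, "a")        # "a": keep the contents and add to the end
file_ob.write(str(42) + "\n")    # non-string values are converted with str()
file_ob.close()

content = path.read_text()
print(repr(content))
```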

If we want to write the contents of an *iterable* to a file, we can use the `writelines` method:

In [28]:
path = "./data/numbers.txt"
file_ob = open(path, "w")
file_ob.writelines([str(i) + "\n" for i in range(10)])
file_ob.close()

Notice that `writelines` does not add line separators by itself: we append the newline character `\n` to each string so that every number ends up on its own line.




### Bonus: Context managers
As you can see, after *opening* a file and performing operations on it, we always have to remember to *close* it. If we forget, unexpected behavior can occur. If the program crashes later on, for example, buffered text might never be written to the file. If you open many files without closing them, the program can run out of the file handles that the operating system allows it to keep open. On some operating systems, the file contents are only updated after closing, and so on.

This pattern is very common when dealing with *resources*: files, connections, threads, servers, etc. You acquire access to the resource, do some work with it and finally clean up after yourself by releasing it again. Because of this, Python offers a construct called [*context manager*](https://docs.python.org/3/reference/datamodel.html#context-managers) which implements exactly this behavior:
- Get access to a resource
- Do some work
- Release the resource

In the case of files, we can replace the open-read-close or open-write-close sequence with a context manager. Context managers are used inside the `with` statement:

In [17]:
with open("./data/hello.txt", "r") as file_ob: 
    contents = file_ob.readlines()
    print(contents)

['Hello, I am a file with some text.\n']


`with open(path) as name` opens the file at `path` and assigns the resulting file object to `name`. The file stays open only within the *scope* of the context manager, that is, the indented block of code that follows the `:`. Once the Python interpreter leaves this block, `file_ob.close()` is called automatically, ensuring the file is properly closed even if an exception is raised inside the block.
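
A short sketch of this guarantee: the block below deliberately raises an error inside the `with` statement and then checks the file object's `closed` attribute. The temporary file and the error message are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

path = Path(tempfile.mkdtemp()) / "hello_copy.txt"
path.write_text("Hello, I am a file with some text.\n")

try:
    with open(path, "r") as file_ob:
        contents = file_ob.readlines()
        raise ValueError("something went wrong inside the block")
except ValueError:
    pass  # the error is caught here, after the file was already closed

closed_after_error = file_ob.closed
print(closed_after_error)
```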

This pattern can be extended to any other resource that should be managed in a similar way, for example database connections. If you want to learn how to implement context managers for other types of objects, please refer to the `contextlib` [documentation](https://docs.python.org/3/library/contextlib.html) in the Python standard library.
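
As a minimal sketch of a custom context manager, the example below uses the `contextmanager` decorator from `contextlib`. The resource is only simulated: `managed_resource` and the `events` list are hypothetical names that record the order in which things happen.

```python
from contextlib import contextmanager

events = []


@contextmanager
def managed_resource(name):
    events.append(f"acquire {name}")      # get access to the resource
    try:
        yield name                        # hand it to the `with` block
    finally:
        events.append(f"release {name}")  # always release it afterwards


with managed_resource("connection") as resource:
    events.append(f"use {resource}")

print(events)
```

The code before `yield` plays the role of `open`, and the code in `finally` plays the role of `close`.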

### Bonus: Binary I/O

Another aspect of file I/O is accessing files in [*binary mode*](https://docs.python.org/3/library/io.html#binary-i-o): instead of writing and reading *text*, we manipulate `bytes` in order to represent non-textual data. This is useful for interacting with measurement data and other non-textual information like images, machine learning model parameters and other complex structures, although in most cases you won't need the low-level control of binary I/O and will use libraries instead.

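To look at an example, let's write a sequence of `int` values to a file as a sequence of bytes. The code below uses `data/output.txt`, although an extension such as `.dat`[^1] is more typical for binary data. Because one byte corresponds to 8 bits, using one byte per integer means we can unambiguously store `2^8 = 256` distinct values (0 to 255).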


[^1]: `.dat` is a typical "generic" extension to indicate that the file contains some sort of data. Filename extensions do not have any binding meaning by themselves; they are simply a convention for users to quickly see what contents to expect.

In [20]:
with open("data/output.txt", "wb") as out_file:
    bs = b"".join([i.to_bytes(1, 'little') for i in  range(10)])
    print(bs)
    out_file.write(bs)

b'\x00\x01\x02\x03\x04\x05\x06\x07\x08\t'


Note that we used the mode `wb`, for *write, binary*. In this mode, the `write` method expects a [`bytes`](https://docs.python.org/3/library/stdtypes.html#bytes) object. A `bytes` literal looks like a string literal with a `b` prefix, which is why the output above is printed as `b'...'`. Here, we convert each integer to a single byte using [`to_bytes`](https://docs.python.org/3/library/stdtypes.html#int.to_bytes) and combine the results with the [`join`](https://docs.python.org/3/library/stdtypes.html#bytes.join) method of the empty bytes literal `b""`.
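
The following sketch checks, without touching any file, that the joined bytes match the literal printed above. It also shows two things the text does not cover and that are added here as illustration: the `bytes` constructor accepts an iterable of integers directly, and a value outside 0 to 255 does not fit in a single byte.

```python
# Three ways to obtain the same ten bytes.
joined = b"".join([i.to_bytes(1, "little") for i in range(10)])
direct = bytes(range(10))
literal = b"\x00\x01\x02\x03\x04\x05\x06\x07\x08\t"
same = joined == direct == literal

# 256 needs more than 8 bits, so converting it to one byte fails.
try:
    (256).to_bytes(1, "little")
    fits_in_one_byte = True
except OverflowError:
    fits_in_one_byte = False

print(same, fits_in_one_byte)
```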

Now that we wrote out our sequence, we can try to read it back from the file:

In [35]:
with open("data/output.txt", "rb") as in_file:
    data = in_file.read()
    seq = [b for b in data]
    print(seq)

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]


And, surprise! We obtain our original sequence.

Note that, because of a quirk of Python, the `read` method returns a `bytes` object, but when we access a single element (as we would do with a string), we get an `int` rather than a `bytes` object of length one.
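
This behavior can be sketched without any file. The variable names are illustrative, and `int.from_bytes` is shown as one way to turn a slice back into an integer.

```python
data = bytes(range(10))

single = data[3]      # indexing gives an int
piece = data[3:4]     # slicing gives a bytes object of length one
values = list(data)   # iterating also yields ints
back = int.from_bytes(piece, "little")

print(single, piece, values, back)
```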