# open and close files

## write to a file, the error-prone way

Until Python 2.5, the usual way to open a file and write something into it was like this:

In [None]:
out = open("say_hello.txt", "w")
out.write("hello\n")
out.write("world\n")
not_allowed = 1/0     # simulate the real world: an error happens while we are writing to a file

out.close()

In [None]:
!cat say_hello.txt

What has been written to the file? Nothing! The file is empty!

This is because the content is still in a **memory buffer** which has not been _flushed_ to the file. The reason for this is I/O optimization: writing to disk is costly, so Python collects the output and only writes it when the buffer is full, when the file is flushed, or when the file is closed.
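The buffering can be made visible with a small experiment. The following is a minimal sketch, assuming a regular file on disk and Python's default buffering; the scratch file `buffer_demo.txt` and the variable names are made up for illustration:

```python
import os
import tempfile

# hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "buffer_demo.txt")

out = open(path, "w", encoding="utf-8")
out.write("hello\n")                  # lands in the memory buffer only
size_before = os.path.getsize(path)   # size on disk before flushing

out.flush()                           # push the buffer to the operating system
size_after = os.path.getsize(path)    # size on disk after flushing
out.close()

print(size_before, size_after)
```

With default buffering, the first number should be 0 and the second one larger than 0, because the short string only reaches the file once `flush()` is called.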

We can force a write to disk after every call by passing `flush=True` to the `print` function:

In [None]:
out = open("say_hello.txt", "w")
print("hello", file=out, flush=True)
print("world", file=out, flush=True)
not_allowed = 1/0     # simulate the real world: an error happens during the write process

out.close()

Now the data was flushed and written just before the crash:

In [None]:
!cat say_hello.txt

**However, this is error-prone, a lot to type, and easy to forget.** 🤢

## always use the `with` statement to safely open and write to a file

The `with` statement is a safe way to open a file and write content. If an error occurs during the writing process, the memory buffer is automatically flushed and written to the file, and the file is closed properly:

In [None]:
with open("say_hello.txt", "w", encoding="utf-8") as out:  # always be explicit which encoding you use
    out.write("I ❤︎ ♚ and ♛\n")
    not_allowed = 1/0     # still raises an error, but now the content gets flushed before the program terminates


We still receive the error, but at least our content has now reached its destination:

In [None]:
!cat say_hello.txt
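Conceptually, `with` behaves much like a `try`/`finally` block that always calls `close()`. The following is a rough sketch of that equivalence, not the exact implementation; `with_demo.txt` is a made-up scratch file:

```python
import os
import tempfile

# hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

# Manual version: `finally` guarantees that close() runs, even if write() raises.
manual = open(path, "w", encoding="utf-8")
try:
    manual.write("hello\n")
finally:
    manual.close()      # flushes the buffer and releases the file

# `with` version: the same guarantee, with less typing.
with open(path, "a", encoding="utf-8") as out:
    out.write("world\n")

# Both file objects are closed once their block has ended.
print(manual.closed, out.closed)

with open(path, "r", encoding="utf-8") as src:
    content = src.read()
print(content, end="")
```

Neither version catches the error; both only make sure the file is flushed and closed before the error travels further.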

## Exercise 1
- [ ] catch the provoked error correctly and print out a message instead

## read from one file, write to another

The `with` statement also allows you to open several files at the same time, so you can copy content safely from one to another.

**Note:** The backslash `\` at the end of line 1 is needed to break the statement into two lines, which makes it more readable:

In [None]:
with open("say_hello.txt", "r", encoding="utf-8") as src, \
     open("say_many_hello.txt", "w", encoding="utf-8") as dest:
    
    content = src.read()             # read in all content
    content = content.rstrip("\n")   # remove trailing newline 
    
    for i in range(1,15):
        dest.write(f"{i}:\t{content}\n")


In [None]:
!cat say_many_hello.txt
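If the content should be copied unchanged, without numbering or other processing, the standard library function `shutil.copyfile` can do the job in one call. A small sketch, using made-up file names in a temporary directory:

```python
import os
import shutil
import tempfile

# hypothetical scratch files in a temporary directory
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "original.txt")
dest_path = os.path.join(workdir, "copy.txt")

# create a small source file to copy
with open(src_path, "w", encoding="utf-8") as out:
    out.write("hello\nworld\n")

# copy the file content byte by byte; no encoding is involved
shutil.copyfile(src_path, dest_path)

with open(dest_path, "r", encoding="utf-8") as src:
    copied = src.read()

print(copied, end="")
```

As soon as the content has to be modified on the way, the `with` approach with two open files is the better fit.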

## read a file, line by line

There is a `readline()` method available which does what it says on the tin: it reads a line!

In [None]:
with open("say_many_hello.txt", "r", encoding="utf-8") as src:
    line = src.readline()
    while line:
        print(line, end="")  # the line already contains a newline, so we set end="" to avoid double newlines
        line = src.readline()

This works, but it is not really convenient. Fortunately, Python allows us to iterate over the file object directly with **a `for` loop** instead:

In [None]:
with open("say_many_hello.txt", "r", encoding="utf-8") as src:
    for line in src:
        print(line)

## Exercise 2

- [ ] get rid of the additional newline above by modifying the `print` call
- [ ] get rid of the additional newline by removing the `\n` at the end of each line

## get all lines of a file as a list

For this task we could use the `readlines()` method:

In [None]:
with open("say_many_hello.txt", "r", encoding="utf-8") as src:
    all_lines = src.readlines()

In [None]:
all_lines

Almost. We still have the unnecessary newline in every item, which we want to get rid of. And we might want to get rid of the numbers and the tabs, too. The most elegant way is to use a **list comprehension**:

In [None]:
with open("say_many_hello.txt", "r", encoding="utf-8") as src:
    all_lines = [
        line.rstrip('\n').split("\t")[1]     # 2. remove the newline `\n`, split the rest on the TAB `\t`, use the second element `[1]`
        for line in src.readlines()          # 1. for every line in src.readlines()
    ]

In [None]:
all_lines
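Another way to drop the newlines is to read the whole file and call `str.splitlines()` on the result. The following is a small sketch with a made-up scratch file that mimics the `number:<TAB>text` layout used above. Note that `splitlines()` also splits on a few other line boundary characters, so it suits plain text like this, but not every file:

```python
import os
import tempfile

# hypothetical scratch file with the same "number:<TAB>text" layout
path = os.path.join(tempfile.mkdtemp(), "splitlines_demo.txt")
with open(path, "w", encoding="utf-8") as out:
    for i in range(1, 4):
        out.write(f"{i}:\thello\n")

with open(path, "r", encoding="utf-8") as src:
    rows = src.read().splitlines()    # the line endings are removed for us

# keep only the part after the first TAB
greetings = [row.partition("\t")[2] for row in rows]

print(rows)
print(greetings)
```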

## Exercise 3

- [ ] rewrite the example above, using a traditional `for` loop instead of a list comprehension
- [ ] list comprehensions can contain an `if` clause (after the `for line in src.readlines()`). Use it to keep only the lines starting with `1`

## in-place file editing

With the `open` function, you cannot easily edit a file in place: you would need to write the changes into a temporary file and later overwrite the original one. Because this is an often-needed task, there is a standard library module for it: [`fileinput`](https://docs.python.org/3/library/fileinput.html)

In [None]:
import fileinput

with fileinput.input(files='say_many_hello.txt', inplace=True) as f:  # a single filename can be passed as a plain string
    for line in f:
        # do some processing
        op = line.replace('♚', 'king (♚)')
        op = op.replace('♛', 'queen (♛)')
        # print() the text you want to write back to the input files
        print(op, end='')  # make sure you don't add another newline

In [None]:
!cat say_many_hello.txt
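In-place editing overwrites the original file, so a safety copy can be useful. `fileinput.input` accepts a `backup` parameter with a file extension; the original content is then kept in a file with that extension. A minimal sketch, using a made-up scratch file and the extension `.bak`:

```python
import fileinput
import os
import tempfile

# hypothetical scratch file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "inplace_demo.txt")
with open(path, "w", encoding="utf-8") as out:
    out.write("1:\thello\n2:\tworld\n")

# inplace=True redirects print() into the file; backup=".bak" keeps the original
with fileinput.input(files=path, inplace=True, backup=".bak") as f:
    for line in f:
        print(line.replace("hello", "goodbye"), end="")

with open(path, "r", encoding="utf-8") as src:
    edited = src.read()               # the modified content
with open(path + ".bak", "r", encoding="utf-8") as src:
    original = src.read()             # the untouched safety copy

print(edited, end="")
print(original, end="")
```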