# open and close files

## always use the `with` statement to open a file

Before Python 2.5 introduced the `with` statement, the usual way to open a file and write to it looked like this:

In [None]:
out = open("say_hello.txt", "w")
out.write("hello\n")
out.write("world\n")
not_allowed = 1/0     # simulate the real world: an error happens during the write process

out.close()

In [None]:
!cat say_hello.txt

What has been written to the file? Nothing! The file is empty. The content is still sitting in a memory buffer that has not been _flushed_ to the file, because the error occurred before `close()` was reached. We can force a flush after every write by passing `flush=True` to the `print` function:

In [None]:
out = open("say_hello.txt", "w")
print("hello", file=out, flush=True)
print("world", file=out, flush=True)
not_allowed = 1/0     # simulate the real world: an error happens during the write process

out.close()

Now the data was flushed and written to the file just before the crash:

In [None]:
!cat say_hello.txt
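
When you use `write()` instead of `print()`, the same effect needs an explicit call to the file object's `flush()` method. The following is a minimal sketch; the file name `flush_demo.txt` and the temporary directory are made up for this illustration and are not part of the example above:

```python
import tempfile
from pathlib import Path

# hypothetical demo file in a fresh temporary directory
demo_path = Path(tempfile.mkdtemp()) / "flush_demo.txt"

out = open(demo_path, "w", encoding="utf-8")
out.write("hello\n")
out.flush()                      # push the buffered text to the operating system now

# the file is still open, but the flushed line is already readable
before_close = demo_path.read_text(encoding="utf-8")

out.write("world\n")             # this line may still sit in the buffer
out.close()                      # close() flushes whatever is left

after_close = demo_path.read_text(encoding="utf-8")
print(repr(before_close), repr(after_close))
```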

**However, this is error prone, a lot to type and easy to forget.** 🤢

The `with` statement is a safe way to open a file and write content. If an error occurs inside the `with` block, the file is still closed properly, and closing it flushes the memory buffer to the file:

In [None]:
with open("say_hello.txt", "w", encoding="utf-8") as out:
    out.write("I ❤︎ ♚ and ♛\n")
    not_allowed = 1/0     # still raises an error, but now the content is flushed when the with block is left


We still receive the error, but at least our content has now reached its destination:

In [None]:
!cat say_hello.txt
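
For a file object, the `with` block behaves roughly like a `try` / `finally` construct that always calls `close()`. The sketch below spells this out by hand. The file name `finally_demo.txt` is made up, and the `except` branch is only there so that the cell keeps running after the simulated error; a plain `with` block would let the error propagate.

```python
import tempfile
from pathlib import Path

demo_path = Path(tempfile.mkdtemp()) / "finally_demo.txt"   # hypothetical demo file

error_seen = None
out = open(demo_path, "w", encoding="utf-8")
try:
    out.write("hello\n")
    not_allowed = 1/0            # simulate an error during the write process
except ZeroDivisionError as err:
    error_seen = err             # caught here only so the cell can continue
finally:
    out.close()                  # always runs; close() flushes the buffer first

print(out.closed, repr(demo_path.read_text(encoding="utf-8")))
```

This is more to type than the `with` version and easier to get wrong, which is why `with` is the recommended form.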

## read from one file, write to another

The `with` statement can also open several files at once, which makes it possible to copy content safely. **Note:** The backslash `\` at the end of the first line is needed to continue the statement on the next line:

In [None]:
with open("say_hello.txt", "r", encoding="utf-8") as src, \
     open("say_many_hello.txt", "w", encoding="utf-8") as dest:
    
    content = src.read()             # read in all content
    content = content.rstrip("\n")   # remove trailing newline 
    
    for i in range(1,11):
        dest.write(f"{i}:\t{content}\n")


In [None]:
!cat say_many_hello.txt
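
If you want to avoid the backslash, one option is `contextlib.ExitStack` from the standard library, which closes every file registered with it when the block ends. This is a sketch with made-up file names in a temporary directory. From Python 3.10 on, you can alternatively wrap the `open(...)` calls of a `with` statement in parentheses.

```python
import tempfile
from contextlib import ExitStack
from pathlib import Path

workdir = Path(tempfile.mkdtemp())            # hypothetical scratch directory
src_path = workdir / "src_demo.txt"
dest_path = workdir / "dest_demo.txt"
src_path.write_text("hello\n", encoding="utf-8")

with ExitStack() as stack:
    # every file registered here is closed when the block ends, even on error
    src = stack.enter_context(open(src_path, "r", encoding="utf-8"))
    dest = stack.enter_context(open(dest_path, "w", encoding="utf-8"))

    content = src.read().rstrip("\n")         # read everything, drop the trailing newline
    for i in range(1, 4):
        dest.write(f"{i}:\t{content}\n")

copied = dest_path.read_text(encoding="utf-8")
print(copied)
```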

## read a file, line by line

There is a `readline()` method, which does exactly what its name says: it reads one line!

In [None]:
with open("say_many_hello.txt", "r", encoding="utf-8") as src:
    line = src.readline()
    while line:
        print(line, end="")  # the line already contains a newline, so we set end="" to avoid double newlines
        line = src.readline()

This is not very convenient. Why not use **a for loop** instead?

In [None]:
with open("say_many_hello.txt", "r", encoding="utf-8") as src:
    for line in src:
        print(line, end="")   # the line already contains a newline, so we set end="" to avoid double newlines

## get all lines of a file as a list

For this task we could use the `readlines()` method:

In [None]:
with open("say_many_hello.txt", "r", encoding="utf-8") as src:
    all_lines = src.readlines()

In [None]:
all_lines

Almost. We still have the unnecessary newline at the end of every item, which we want to get rid of. And we might want to get rid of the numbers and the tabs, too.

In [None]:
with open("say_many_hello.txt", "r", encoding="utf-8") as src:
    all_lines = [line.rstrip('\n').split("\t")[1] for line in src]

The line above is rather compact. It contains:

1. a list comprehension that iterates over the file: `for line in src`
2. for every `line`, we remove the trailing newline with `line.rstrip("\n")`
3. the remaining string is split at the tab character: `split("\t")`
4. the `split` method returns a list, and because we are only interested in the second column, we add `[1]`

Voilà!

In [None]:
all_lines
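
The same steps can be written as an ordinary loop, which may be easier to follow. In this sketch an `io.StringIO` object with made-up content stands in for the opened file, so the cell runs on its own:

```python
import io

# stand-in for the opened file; the content is made up for this sketch
src = io.StringIO("1:\thello\n2:\tworld\n")

all_lines = []
for line in src:
    stripped = line.rstrip("\n")     # remove the trailing newline
    columns = stripped.split("\t")   # split at the tab character, giving a list
    all_lines.append(columns[1])     # keep only the second column

print(all_lines)
```

Note that `columns[1]` raises an `IndexError` for any line that contains no tab, so both versions assume that every line has at least two columns.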

## inplace file editing

With the `open` function alone, in-place file editing is awkward: you would need to write the changes to a temporary file and then overwrite the original one. Because this is a frequently needed task, the standard library has a module for it: [`fileinput`](https://docs.python.org/3/library/fileinput.html). With `inplace=True`, everything you `print()` inside the loop is written back to the file that is currently being processed:

In [None]:
import fileinput

with fileinput.input(files=('say_hello.txt','say_many_hello.txt'), inplace=True) as f:
    for line in f:
        # do some processing
        op = line.replace('♚', 'king')
        op = op.replace('♛', 'queen')
        # print() the text you want to write back to the input files
        print(op, end='')

In [None]:
!cat say_many_hello.txt

In [None]:
!cat say_hello.txt
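
For comparison, the manual approach (write to a temporary file, then overwrite the original) could look roughly like this. It is a sketch under assumptions: the file `inplace_demo.txt` and its content are made up, and the temporary file is created in the same directory so that `os.replace` can swap it in.

```python
import os
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())             # hypothetical scratch directory
target = workdir / "inplace_demo.txt"
target.write_text("I like ♚ and ♛\n", encoding="utf-8")

# write the edited lines to a temporary file in the same directory ...
with open(target, "r", encoding="utf-8") as src, \
     tempfile.NamedTemporaryFile("w", encoding="utf-8", dir=workdir, delete=False) as tmp:
    for line in src:
        tmp.write(line.replace("♚", "king").replace("♛", "queen"))

# ... then swap it in for the original, once both files are closed
os.replace(tmp.name, target)

result = target.read_text(encoding="utf-8")
print(result)
```

`fileinput` hides these steps, but the manual version gives you explicit control over the encoding and over what happens when the processing fails halfway through.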