# Working with files

Before a file can be read from or written to, it must be opened with the `open()` function. `open()` expects at least one argument: the name of the file, optionally including its path:

In [23]:
fh = open('data/names/names_short.txt')

It is generally recommended to specify the encoding of the file explicitly if it is known:

In [24]:
fh = open('data/names/names_short.txt', encoding='utf-8')

Excursus on encoding: An encoding determines how a computer interprets bit sequences as characters. We will cover this topic in more detail in one of the live sessions. For more in-depth coverage (or if you need to know right away), I recommend these texts:

* https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/
* https://docs.python.org/3/howto/unicode.html
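
To make the idea of an encoding more concrete, here is a minimal sketch that needs no file at all. It encodes one name from the sample file into bytes and decodes those bytes twice, once with the matching encoding and once with a different one. The variable names are only illustrative.

```python
# A name containing a non-ASCII character ('\u00e9' is the letter é).
name = 'B\u00e9la'

# Encoding turns characters into bytes; in UTF-8 the 'é' needs two bytes.
raw = name.encode('utf-8')
print(raw)                      # b'B\xc3\xa9la'

# Decoding with the matching encoding restores the original text.
decoded_ok = raw.decode('utf-8')
print(decoded_ok)

# Decoding the same bytes as Latin-1 raises no error but yields wrong text.
decoded_wrong = raw.decode('latin-1')
print(decoded_wrong)
```

The second decoding step shows why it is worth passing `encoding=` to `open()`: a wrong guess does not necessarily fail, it can silently produce garbled names.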

When we no longer need the file, i.e. once we have read it, we should close it so that the operating system can release the resource.

In [25]:
fh.close()
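
Whether a file object is still open can be checked with its `closed` attribute. The following sketch assumes a small hypothetical sample file that is created in a temporary directory, so that it runs independently of the course data.

```python
from pathlib import Path
import tempfile

# Hypothetical sample file, created only so that the example runs anywhere.
path = Path(tempfile.mkdtemp()) / 'sample.txt'
path.write_text('Astrid\nInes\n', encoding='utf-8')

fh = open(path, encoding='utf-8')
open_state = fh.closed          # False: the file is still open
fh.close()
closed_state = fh.closed        # True: the resource has been released

# Reading from a closed file is an error.
try:
    fh.read()
    error_raised = False
except ValueError:
    error_raised = True

print(open_state, closed_state, error_raised)
```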

The object representing the open file provides several ways to access the contents of the file, including an iterator that we can use in a `for` loop. Each iteration yields one line of the file:


In [26]:
fh = open('data/names/names_short.txt', encoding='utf-8')
for line in fh:
    print(line)
fh.close()

Astrid

Ines

Christoph

Markus

Çınar

Đželila

Niklas

Anna

Stefanie

Raphael

Anna-Lena

Silvia

Julian

Simon

Katharina

Michael

Dominik

Maria

Kevin

Bianca

Thomas

Nora

Manuel

Selina

Gabriel

Daniel

Thomas

Nina

Michael

Fabio

Theresa

Manuel

Carina

Philipp

Lukas

Wolfgang

Anna

Doris

Thomas

Muhammed

Christoph

Lisa-Marie

Jessica

Maria

Thomas

Florian

Martin

Anna

Oliver

Gregor

Helmut

Florian

Matteo

David

Marlene

Vanessa

Lea

Jan

Béla

Verena

Manuel

Björn

Tobias

Denise

Emma

Lukas

Sarah

Oliver

Janine

Manuel

Georg

Lorenz

Verena

Caroline

Laura

Felix

Simon

Lea

Peter

Sandra

Julia

Sophie

Jacqueline

Nina

Sebastian

David

Matthias

Patrick

Selina

Fabian

Daniel

Sabine

Josef

Lisa

Carina

Florian

Fabian

Viktoria

Christoph

Emilia
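
The blank lines in the output above arise because each `line` still ends with the newline character from the file, and `print()` appends a second one. A minimal sketch of how to avoid this, again using a small hypothetical sample file in a temporary directory instead of `data/names/names_short.txt`:

```python
from pathlib import Path
import tempfile

# Hypothetical sample file, created only so that the example runs anywhere.
path = Path(tempfile.mkdtemp()) / 'sample.txt'
path.write_text('Astrid\nInes\nChristoph\n', encoding='utf-8')

fh = open(path, encoding='utf-8')
for line in fh:
    print(repr(line))           # repr() makes the trailing '\n' visible
    name = line.rstrip('\n')    # remove only the newline at the end
    print(name)                 # printed without an extra blank line
fh.close()
```

An alternative is `print(line, end='')`, which keeps the newline from the file and tells `print()` not to add its own.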



## Open a file in a context manager
An opened file should always be closed again. But if, for example, an error occurs while the file is open, the `close()` call may never be reached. To avoid such problems, it is recommended to use a context manager, which closes the file automatically as soon as the `with` block is left:

In [27]:
with open('data/names/names_short.txt', encoding='utf-8') as fh:
    for line in fh:
        print(line)

Astrid

Ines

Christoph

Markus

Çınar

Đželila

Niklas

Anna

Stefanie

Raphael

Anna-Lena

Silvia

Julian

Simon

Katharina

Michael

Dominik

Maria

Kevin

Bianca

Thomas

Nora

Manuel

Selina

Gabriel

Daniel

Thomas

Nina

Michael

Fabio

Theresa

Manuel

Carina

Philipp

Lukas

Wolfgang

Anna

Doris

Thomas

Muhammed

Christoph

Lisa-Marie

Jessica

Maria

Thomas

Florian

Martin

Anna

Oliver

Gregor

Helmut

Florian

Matteo

David

Marlene

Vanessa

Lea

Jan

Béla

Verena

Manuel

Björn

Tobias

Denise

Emma

Lukas

Sarah

Oliver

Janine

Manuel

Georg

Lorenz

Verena

Caroline

Laura

Felix

Simon

Lea

Peter

Sandra

Julia

Sophie

Jacqueline

Nina

Sebastian

David

Matthias

Patrick

Selina

Fabian

Daniel

Sabine

Josef

Lisa

Carina

Florian

Fabian

Viktoria

Christoph

Emilia
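
What the context manager does can be approximated by hand with `try`/`finally`. The sketch below compares both variants and simulates an error inside the `with` block. The sample file and the `RuntimeError` are assumptions made for illustration only.

```python
from pathlib import Path
import tempfile

# Hypothetical sample file, created only so that the example runs anywhere.
path = Path(tempfile.mkdtemp()) / 'sample.txt'
path.write_text('Astrid\nInes\n', encoding='utf-8')

# Manual variant: finally guarantees that close() runs.
fh = open(path, encoding='utf-8')
try:
    content = fh.read()
finally:
    fh.close()

# Context manager variant: the file is closed even though an error occurs.
try:
    with open(path, encoding='utf-8') as fh2:
        fh2.read()
        raise RuntimeError('simulated error while the file is open')
except RuntimeError:
    pass

print(fh.closed, fh2.closed)    # True True
```

The `with` form is shorter and harder to get wrong, which is why it is the usual choice.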



## Other methods to read from a file

### read()
The `read()` method returns the entire contents of the file as a single (sometimes very long) string:

In [28]:
with open('data/names/names_short.txt', encoding='utf-8') as fh:
    data = fh.read()
print(data)    

Astrid
Ines
Christoph
Markus
Çınar
Đželila
Niklas
Anna
Stefanie
Raphael
Anna-Lena
Silvia
Julian
Simon
Katharina
Michael
Dominik
Maria
Kevin
Bianca
Thomas
Nora
Manuel
Selina
Gabriel
Daniel
Thomas
Nina
Michael
Fabio
Theresa
Manuel
Carina
Philipp
Lukas
Wolfgang
Anna
Doris
Thomas
Muhammed
Christoph
Lisa-Marie
Jessica
Maria
Thomas
Florian
Martin
Anna
Oliver
Gregor
Helmut
Florian
Matteo
David
Marlene
Vanessa
Lea
Jan
Béla
Verena
Manuel
Björn
Tobias
Denise
Emma
Lukas
Sarah
Oliver
Janine
Manuel
Georg
Lorenz
Verena
Caroline
Laura
Felix
Simon
Lea
Peter
Sandra
Julia
Sophie
Jacqueline
Nina
Sebastian
David
Matthias
Patrick
Selina
Fabian
Daniel
Sabine
Josef
Lisa
Carina
Florian
Fabian
Viktoria
Christoph
Emilia
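
`read()` also accepts an optional number of characters to read, which can help when a file is too large to load at once. A minimal sketch, assuming a small hypothetical sample file in a temporary directory:

```python
from pathlib import Path
import tempfile

# Hypothetical sample file, created only so that the example runs anywhere.
path = Path(tempfile.mkdtemp()) / 'sample.txt'
path.write_text('Astrid\nInes\nChristoph\n', encoding='utf-8')

with open(path, encoding='utf-8') as fh:
    first_chunk = fh.read(6)    # at most 6 characters
    rest = fh.read()            # everything from the current position on
    at_end = fh.read()          # an empty string signals the end of the file

print(repr(first_chunk))
print(repr(rest))
print(repr(at_end))
```

Each call continues where the previous one stopped, so the file is consumed piece by piece.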



### readlines()
The `readlines()` method returns the lines of the file as a list with one element per line (a list is another sequence type that we will learn about soon). Each element keeps its trailing newline character `\n`:

In [29]:
with open('data/names/names_short.txt', encoding='utf-8') as fh:
    lines = fh.readlines()
print(lines)    

['Astrid\n', 'Ines\n', 'Christoph\n', 'Markus\n', 'Çınar\n', 'Đželila\n', 'Niklas\n', 'Anna\n', 'Stefanie\n', 'Raphael\n', 'Anna-Lena\n', 'Silvia\n', 'Julian\n', 'Simon\n', 'Katharina\n', 'Michael\n', 'Dominik\n', 'Maria\n', 'Kevin\n', 'Bianca\n', 'Thomas\n', 'Nora\n', 'Manuel\n', 'Selina\n', 'Gabriel\n', 'Daniel\n', 'Thomas\n', 'Nina\n', 'Michael\n', 'Fabio\n', 'Theresa\n', 'Manuel\n', 'Carina\n', 'Philipp\n', 'Lukas\n', 'Wolfgang\n', 'Anna\n', 'Doris\n', 'Thomas\n', 'Muhammed\n', 'Christoph\n', 'Lisa-Marie\n', 'Jessica\n', 'Maria\n', 'Thomas\n', 'Florian\n', 'Martin\n', 'Anna\n', 'Oliver\n', 'Gregor\n', 'Helmut\n', 'Florian\n', 'Matteo\n', 'David\n', 'Marlene\n', 'Vanessa\n', 'Lea\n', 'Jan\n', 'Béla\n', 'Verena\n', 'Manuel\n', 'Björn\n', 'Tobias\n', 'Denise\n', 'Emma\n', 'Lukas\n', 'Sarah\n', 'Oliver\n', 'Janine\n', 'Manuel\n', 'Georg\n', 'Lorenz\n', 'Verena\n', 'Caroline\n', 'Laura\n', 'Felix\n', 'Simon\n', 'Lea\n', 'Peter\n', 'Sandra\n', 'Julia\n', 'Sophie\n', 'Jacqueline\n', 'Nina
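
If the newline characters are not wanted, one option is to read the whole file with `read()` and split the string with `splitlines()`. The following sketch contrasts both approaches on a small hypothetical sample file. It is one possible approach, not the only one.

```python
from pathlib import Path
import tempfile

# Hypothetical sample file, created only so that the example runs anywhere.
path = Path(tempfile.mkdtemp()) / 'sample.txt'
path.write_text('Astrid\nInes\nChristoph\n', encoding='utf-8')

with open(path, encoding='utf-8') as fh:
    lines = fh.readlines()          # newline characters are kept

with open(path, encoding='utf-8') as fh:
    names = fh.read().splitlines()  # newline characters are removed

print(lines)    # ['Astrid\n', 'Ines\n', 'Christoph\n']
print(names)    # ['Astrid', 'Ines', 'Christoph']
```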

<div class="alert alert-block alert-info">
<b>Exercise 1</b><p>How many lines does the file names_short.txt have?</p></div>

## Write to a file
So far we have only read from files. To write to a file, we need to open it in a different mode. The `open()` function accepts a string as its second argument that specifies how the file is opened. If we omit it, the default value `'r'` (for `read`) is assumed.

```
with open('data/names/names_short.txt', encoding='utf-8') as fh:
```
leads to the same result as

```
with open('data/names/names_short.txt', 'r', encoding='utf-8') as fh:
```


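If we want to open a file for writing, we use `'w'` (for `write`) instead of `'r'`. If the file already exists, its previous contents are overwritten; if it does not exist, it is created. To add text to the end of an existing file instead, we use `'a'` (for `append`).

In [30]:
with open('data/testfile.txt', 'w', encoding='utf-8') as fh:
    fh.write('I am a text.')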
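
The mode string decides what happens to existing content. The sketch below writes to a hypothetical file in a temporary directory (so that nothing in `data/` is touched) and compares `'w'`, which starts from an empty file, with `'a'`, which appends to the end.

```python
from pathlib import Path
import tempfile

# Hypothetical target file in a temporary directory.
path = Path(tempfile.mkdtemp()) / 'testfile.txt'

# 'w' creates the file (or empties an existing one) before writing.
with open(path, 'w', encoding='utf-8') as fh:
    written = fh.write('first line\n')  # write() returns the number of characters

# 'a' keeps the existing content and adds to the end.
with open(path, 'a', encoding='utf-8') as fh:
    fh.write('second line\n')

with open(path, encoding='utf-8') as fh:
    after_append = fh.read()

# Opening with 'w' again discards everything written so far.
with open(path, 'w', encoding='utf-8') as fh:
    fh.write('replaced')

with open(path, encoding='utf-8') as fh:
    after_overwrite = fh.read()

print(written)
print(repr(after_append))
print(repr(after_overwrite))
```

Note that `write()` does not add a newline on its own; line breaks have to be written explicitly as `'\n'`.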

<div class="alert alert-block alert-info">
<b>Exercise 2</b><p>
Write a program that does the following:
<ol>
<li>Read the contents of the file "data/names/names_short.txt".</li>
<li>Write the contents that were read into a new file "mynames.txt".</li>
<li>Ask the user (with input()) for their first name and store it in a variable.</li>
<li>Append the entered name to the end of the "mynames.txt" file.</li>
</ol>
</div>

## Write binary data
So far we have always assumed that we are reading text from a file or writing text to a file. When dealing with binary data (i.e. anything that isn't plain text, e.g. images, PDF files, or Word files), we need to state that explicitly by adding the letter `'b'` to the mode:

In [31]:
with open('img/string1.png', 'rb') as fh_in, open('img/testimage.png', 'wb') as fh_out:
    data = fh_in.read()
    fh_out.write(data)
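
In binary mode, `read()` returns a `bytes` object instead of a string, and no encoding is applied. A minimal sketch, assuming a hypothetical file that contains only the first four bytes of the PNG signature:

```python
from pathlib import Path
import tempfile

# Hypothetical binary file in a temporary directory.
path = Path(tempfile.mkdtemp()) / 'sample.bin'
payload = bytes([0x89, 0x50, 0x4E, 0x47])   # first four bytes of a PNG file

with open(path, 'wb') as fh_out:
    fh_out.write(payload)

with open(path, 'rb') as fh_in:
    data = fh_in.read()

print(type(data))   # <class 'bytes'>
print(data)         # b'\x89PNG'

# Opening the same file as UTF-8 text fails, because 0x89 cannot start a character.
try:
    with open(path, encoding='utf-8') as fh:
        fh.read()
    text_mode_failed = False
except UnicodeDecodeError:
    text_mode_failed = True

print(text_mode_failed)
```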

## Literature


  * Python Tutorial: https://docs.python.org/3/tutorial/inputoutput.html#reading-and-writing-files
  * Sweigart: https://automatetheboringstuff.com/2e/chapter9/
  * https://www.w3schools.com/python/python_file_handling.asp
  * https://www.geeksforgeeks.org/file-handling-python/
  