# Reading and Writing Files
## Introduction

This chapter is about reading files from the file system and writing to them. So you can finally store data and reuse it!

This notebook covers the [ninth chapter](https://automatetheboringstuff.com/2e/chapter9/) of the book.

Another good resource for learning file handling (especially creating files and writing to them): [W3Schools: Python File Open](https://www.w3schools.com/python/python_file_handling.asp)

## Summary

### Path Management

* Root-Folder (Anchor)
  * Folder-1 (Parent is the Root-Folder)
    * Folder-2 (Parent is Folder-1)
      * Folder-3 (Parent is Folder-2)
        * Filename.Extension
        
From the point of view of the file, the folder that directly contains it is its _parent_, and every folder further up the hierarchy is an ancestor. The filename without its extension is called the _stem_, and the extension is the _suffix_.

#### Windows
On Windows, _backslashes_ are used to address files: `Root-Folder\Folder-1\Folder-2\Folder-3\Filename.Extension`. An example would be `C:\Users\harry\school\potions.xlsx`. The `C:` points to the drive, for example a hard disk or a USB stick. `.xlsx` is the file extension.

#### Unix-based Systems
On Linux and macOS, _slashes_ are used to address files: `/Folder-1/Folder-2/Folder-3/Filename.Extension`. The path starts with a slash because the first slash indicates the root folder. An example would be `/home/harry/school/potions.xlsx`.

#### Paths in Python
You don't have to worry about the OS your Python app is running on. If you work with paths, use the `Path` class from the `pathlib` module:

In [None]:
from pathlib import Path

path = Path("/", "home", "harry", "school")  # The leading "/" makes the path absolute

filepath = path / "potions" / "polyjuice.xlsx"

Depending on the operating system the code runs on, you'll get either a `WindowsPath` or a `PosixPath` object. You can join paths with a slash because `Path` overloads the `/` operator (see the last line in the code above). This automatically applies the correct separators between the folder names. Note that at least one of the two operands must be a `Path` object when you use the slash to join paths.
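
The joining rule can be tried out with a minimal sketch. It uses `PurePosixPath`, a variant of `Path` that always follows Unix conventions, so the result is the same on every operating system. The folder names are taken from the example above.

```python
from pathlib import PurePosixPath

base = PurePosixPath("/home/harry/school")

# Path object on the left, plain strings on the right
left_join = base / "potions" / "polyjuice.xlsx"

# A plain string on the left also works, as long as the right side is a path object
right_join = "/home" / PurePosixPath("harry/school")

print(left_join)   # /home/harry/school/potions/polyjuice.xlsx
print(right_join)  # /home/harry/school

# Two plain strings cannot be joined with a slash
try:
    "home" / "harry"
except TypeError as error:
    print("TypeError:", error)
```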

To get the different parts of a filepath, make use of the following attributes:

In [None]:
filepath.anchor  # '/'
filepath.parent  # PosixPath('/home/harry/school/potions')
filepath.parents  # Sequence of all ancestors, from PosixPath('/home/harry/school/potions') up to PosixPath('/')
filepath.stem  # 'polyjuice'
filepath.suffix  # '.xlsx'
filepath.drive  # '', under Windows this could be 'C:'

import os

os.path.basename(filepath)  # 'polyjuice.xlsx'
os.path.dirname(filepath)  # '/home/harry/school/potions'
os.path.split(filepath)  # ('/home/harry/school/potions', 'polyjuice.xlsx')

You can retrieve the home directory path with `Path.home()`.
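
To compare how these attributes behave on both path flavors without switching machines, here is a small sketch with `PurePosixPath` and `PureWindowsPath`. These classes only manipulate path strings and never touch the file system. The two paths are the examples from the Windows and Unix sections above.

```python
from pathlib import PurePosixPath, PureWindowsPath

posix_path = PurePosixPath("/home/harry/school/potions.xlsx")
windows_path = PureWindowsPath(r"C:\Users\harry\school\potions.xlsx")

for p in (posix_path, windows_path):
    # repr() makes empty strings and backslashes visible
    print(repr(p.drive), repr(p.anchor), p.name, p.stem, p.suffix)

# ''   '/'     potions.xlsx potions .xlsx
# 'C:' 'C:\\'  potions.xlsx potions .xlsx

print(windows_path.parent)  # C:\Users\harry\school
```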

#### Absolute and Relative Paths

Absolute and relative paths work the same as they do in a terminal. If you started Python in the `/home/harry/school/potions` directory, you can switch to any other directory with

In [None]:
os.chdir("../../quidditch/players")

This means: go up two levels in the directory hierarchy, then change into the `quidditch` directory and finally into the `players` folder. This example uses a _relative path_. `..` is shorthand for the parent directory, and the single dot `.` is shorthand for the current directory. You can also pass an _absolute path_:

In [None]:
os.chdir("/home/harry/quidditch/players")  # (Absolute Path)

An absolute path starts with the root directory, meaning with `/` on Unix platforms and with the drive letter on Windows platforms. You can check if a path is absolute or not with the `.is_absolute()`-method on the path object.
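
A short sketch of the check, using the two paths from the `os.chdir` examples. `Path.cwd()` and `resolve()` are not covered above: the first returns the current working directory, the second turns a relative path into an absolute one based on that directory. The file name `notes.txt` is a placeholder and does not need to exist.

```python
from pathlib import Path, PurePosixPath

relative = PurePosixPath("../../quidditch/players")
absolute = PurePosixPath("/home/harry/quidditch/players")

print(relative.is_absolute())  # False
print(absolute.is_absolute())  # True

# The current working directory is always reported as an absolute path
cwd = Path.cwd()
print(cwd.is_absolute())  # True

# resolve() makes a relative path absolute, based on the current working directory
resolved = Path("notes.txt").resolve()
print(resolved.is_absolute())  # True
```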

#### Create Folders
There are two ways to create a directory with Python:

In [None]:
# Way 1
import os

os.makedirs(
    "/home/harry/school/transformation/cats"
)  # Creates a `transformation` folder with a `cats` folder in it

# Way 2
from pathlib import Path

# exist_ok=True avoids a FileExistsError, because Way 1 already created this folder
Path("/home/harry/school/transformation").mkdir(exist_ok=True)  # Creates a `transformation` folder

Path("/home/harry/school/divination/essays/november").mkdir(
    parents=True
)  # Creates a `divination` folder with an `essays` folder with a `november` folder in it
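
The following sketch shows what happens when a folder already exists. It works inside a temporary directory (an assumption made here so that nothing is left behind on your disk) instead of `/home/harry`.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    school = Path(tmp) / "school"

    # parents=True also creates the missing intermediate folders
    november = school / "divination" / "essays" / "november"
    november.mkdir(parents=True)
    created = november.is_dir()

    # Calling mkdir() on an existing folder raises FileExistsError ...
    try:
        november.mkdir()
        raised = False
    except FileExistsError:
        raised = True

    # ... unless exist_ok=True is passed
    november.mkdir(parents=True, exist_ok=True)

print(created, raised)  # True True
```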

#### Gathering Information
Get the file size and the directory contents with the `os` module.

In [None]:
import os

os.path.getsize("/home/harry/school/potions/polyjuice.xlsx")
# 653364 (bytes)

os.listdir("/home/harry/school/potions")
# ['felix-felicis.xlsx', 'love-potion.xlsx', 'polyjuice.pptx', 'polyjuice.xlsx']
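
The same information is also available through `pathlib`. This is a sketch, not part of the chapter's code: it creates two placeholder files in a temporary directory with `write_text()` and then reads their sizes with `iterdir()` and `stat()`.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    potions = Path(tmp)
    (potions / "polyjuice.txt").write_text("lacewing flies", encoding="utf-8")
    (potions / "felix-felicis.txt").write_text("", encoding="utf-8")

    # iterdir() yields Path objects in no guaranteed order, so sort them first
    sizes = {entry.name: entry.stat().st_size for entry in sorted(potions.iterdir())}

print(sizes)  # {'felix-felicis.txt': 0, 'polyjuice.txt': 14}
```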

#### Path Validity
If you are working with paths on a system, you'll always have to make sure that the path actually exists. Otherwise, you'll run into errors. The following methods on the path object can help you with that:

In [None]:
filepath.exists()  # Checks whether the path exists or not
filepath.is_file()  # Checks if the filepath points to a file or not
filepath.is_dir()  # Checks whether the path points to a directory or not
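
To see what the three methods return in each case, this sketch checks a folder, an existing file, and a missing file. The temporary directory and the file names are placeholders chosen for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    existing_file = folder / "polyjuice.txt"
    existing_file.touch()  # Create an empty file
    missing_file = folder / "veritaserum.txt"  # Never created

    # Each tuple holds (exists, is_file, is_dir)
    checks = {
        name: (p.exists(), p.is_file(), p.is_dir())
        for name, p in [
            ("folder", folder),
            ("existing_file", existing_file),
            ("missing_file", missing_file),
        ]
    }

for name, result in checks.items():
    print(name, result)
# folder (True, False, True)
# existing_file (True, True, False)
# missing_file (False, False, False)
```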

### Reading and Writing Files
Reading and writing files involves a few pitfalls, but Python simplifies most of them. The main difficulty is that several programs writing to the same file at the same time can overwrite each other's changes, so a file should only be kept open for as long as it is needed.

So when you create, modify or read a file, you first have to open it and indicate the mode you want to open it in. The modes we're looking at in this chapter are the following:
  * _r_ read
  * _w_ write
  * _a_ append
  
To open a file, use the `open(...)` function. It returns a file object, also called a file handler.

In [None]:
path = Path.home() / "todo.txt"
file = open(
    path, "r"
)  # The second argument indicates that you want to read the file. If you write to it, you'll get an error.

To release the file after you have used it, call the `close()` method on the file handler.

> You can use the `with`-block when opening files. If you do that, you don't need any close-call, because that will then be done automatically. It is not mentioned in the book, but it is best practice to use the with-block!

In [None]:
with open(path, "r") as file:
    file.read()

# file.close() gets called automatically
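
You can verify the automatic close with the file object's `closed` attribute. The sketch below writes a small `todo.txt` into a temporary directory first (an assumption, so that the example runs anywhere) and then shows that the file is unusable once the block has ended.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "todo.txt"
    path.write_text("Homework\n", encoding="utf-8")

    with open(path, "r", encoding="utf-8") as file:
        content = file.read()
        print(file.closed)  # False: still open inside the block

    print(file.closed)  # True: closed as soon as the block ends

    # Reading from a closed file raises a ValueError
    try:
        file.read()
    except ValueError as error:
        print("ValueError:", error)
```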

#### Read a File
When a file is opened for reading, the read pointer is placed at the very beginning of the file. To read the contents, use the `read()` method on the file handler. It returns everything from the pointer to the end of the file.

In [None]:
file = open(path, "r")  # Reopen the file, because the with-block above closed it
file.read()
# 'Homework\n\n- Essay on mandrakes\n- Divination chart\n- Train cup transfiguration\n'

You can see the `\n` newline-characters which are invisible in the original file:
```
Homework

- Essay on mandrakes
- Divination chart
- Train cup transfiguration
```

Based on these newlines, you can also read the file line-by-line:

In [None]:
file.seek(0)  # Resets the file-read-pointer to the beginning again

file.readlines()
# ['Homework\n', '\n', '- Essay on mandrakes\n', '- Divination chart\n', '- Train cup transfiguration\n']

file.close()  # Release the file once you are done with it
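
Instead of loading all lines at once, you can also loop over the file object, which yields one line per iteration. This sketch is not from the chapter. It recreates a shortened version of the todo file in a temporary directory and keeps only the list entries.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "todo.txt"
    path.write_text(
        "Homework\n\n- Essay on mandrakes\n- Divination chart\n", encoding="utf-8"
    )

    with open(path, "r", encoding="utf-8") as file:
        # Keep the lines that start with "- " and strip the trailing newline
        todos = [line.rstrip("\n") for line in file if line.startswith("- ")]

print(todos)  # ['- Essay on mandrakes', '- Divination chart']
```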

#### Write to a File
To write to a file, you have two possibilities. First, let's add a todo by opening the file in _append_ mode:

In [None]:
with open(Path.home() / "todo.txt", "a") as file:
    file.write("- Study hippogriffs\n")  # The newline lets the next entry start on its own line

On line two, you write a string to the file. Because the file was opened in append mode, the string is added after the existing content. You can also overwrite a file by opening it in _w_ mode, which discards the existing content. A more common reason to open a file in write mode is to create a new one. Writing works the same as appending, and in both modes the file is created if it does not exist yet:

In [None]:
with open(Path.home() / "newfile.txt", "w") as file:
    file.write("content of the new file")

![The Newly Created File](images/newfile.png)
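
The difference between the two modes becomes visible when you read the file back after each step. This is a minimal sketch in a temporary directory; `read_text()` is a `pathlib` shortcut that opens, reads, and closes the file in one call.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "todo.txt"

    with open(path, "w", encoding="utf-8") as file:  # Creates the file
        file.write("- Essay on mandrakes\n")

    with open(path, "a", encoding="utf-8") as file:  # Adds to the end
        file.write("- Study hippogriffs\n")
    after_append = path.read_text(encoding="utf-8")

    with open(path, "w", encoding="utf-8") as file:  # Replaces everything
        file.write("- Nothing left to do\n")
    after_write = path.read_text(encoding="utf-8")

print(repr(after_append))  # '- Essay on mandrakes\n- Study hippogriffs\n'
print(repr(after_write))   # '- Nothing left to do\n'
```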

#### Persist Application Data
##### Shelve Module
You can also store data such as variables, lists, and so on in a file. The file is written in a binary format, so it cannot be read in a text editor and is meant to be read back by your Python program. To do so, use the `shelve` module and store your data as you would in a dictionary.

In [1]:
import shelve

wizards = ["hermione", "ron", "harry", "hagrid", "snape"]

with shelve.open(
    "programdata"
) as shelf:  # This creates a file in your current working directory
    shelf["wizards"] = wizards  # Add the wizards to the shelf dictionary

with shelve.open("programdata") as shelf:
    loaded_wizards = shelf["wizards"]
    print(loaded_wizards)

['hermione', 'ron', 'harry', 'hagrid', 'snape']
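
Because a shelf behaves like a dictionary, the usual dictionary operations work on it as well. The sketch below stores two values in a temporary directory (an assumption, so that no data file is left behind) and then lists the keys, checks for a missing key, and updates a stored list. The `house_points` entry is made up for illustration.

```python
import shelve
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    shelf_path = str(Path(tmp) / "programdata")

    with shelve.open(shelf_path) as shelf:
        shelf["wizards"] = ["hermione", "ron", "harry"]
        shelf["house_points"] = {"gryffindor": 482}

    with shelve.open(shelf_path) as shelf:
        keys = sorted(shelf.keys())
        points = shelf["house_points"]["gryffindor"]
        has_spells = "spells" in shelf
        # Assign the whole value again to change it. Mutating the loaded
        # list in place would not be saved unless writeback=True is used.
        shelf["wizards"] = shelf["wizards"] + ["hagrid"]
        wizards = shelf["wizards"]

print(keys)        # ['house_points', 'wizards']
print(points)      # 482
print(has_spells)  # False
print(wizards)     # ['hermione', 'ron', 'harry', 'hagrid']
```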


##### Manually
With the `pprint` module's `pformat()` function, a list is converted to a string formatted as it would be written in Python code. You can therefore create a Python module out of your data and import it later. This works as follows:

In [2]:
import pprint

wizards = ["hermione", "ron", "harry", "hagrid", "snape"]

with open("hogwarts.py", "w") as file:
    file.write(f"wizards={pprint.pformat(wizards)}\n")

import hogwarts

print(hogwarts.wizards)

['hermione', 'ron', 'harry', 'hagrid', 'snape']
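
To see why this works, it helps to look at the string that `pformat()` returns. The sketch below also parses the string back with `ast.literal_eval()`, which is not part of the chapter and is used here only to show that the text is valid Python.

```python
import ast
import pprint

wizards = ["hermione", "ron", "harry", "hagrid", "snape"]

text = pprint.pformat(wizards)
print(type(text).__name__)  # str
print(text)  # ['hermione', 'ron', 'harry', 'hagrid', 'snape']

# The string is a valid Python literal, so it can be turned back into a list
restored = ast.literal_eval(text)
print(restored == wizards)  # True
```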


## Exercises

### Exercise 1: List Files
Create a program that lists the files of a given folder together with their file sizes. The file extension should not be printed.

In [None]:
# implement here

### Exercise 2: Grocery List
Create a tool that asks for items which will be added to a grocery list. When finished (when the user inputs `done`), the list should be available as a `groceries.txt` file.

In [None]:
# implement here

### Exercise 3: Save Battleships
Remember the battleships game from the dictionaries chapter? It can be enhanced a little bit further. When the user types in `save`, the state of the game should be saved (use the `shelve` module) and the game ends (use `raise KeyboardInterrupt` in Jupyter notebooks for that). If the user starts the game again and a data file is available, the game should continue at the same stage. Then, the state file should be deleted. Work with this code or your own:

In [None]:
ships = {
    "A1": False,
    "A2": False,
    "A3": False,
    "A4": False,
    "B1": True,
    "B2": True,
    "B3": False,
    "B4": False,
    "C1": False,
    "C2": False,
    "C3": False,
    "C4": True,
    "D1": False,
    "D2": False,
    "D3": False,
    "D4": False,
}

win = False
failed_hits = 0

while not win:
    cell = input("On which cell do you want to set off the bomb? ")
    hit = ships.get(cell, False)
    if hit:
        print("You hit a ship!")
        ships[cell] = False
        failed_hits = 0
    else:
        failed_hits += 1
        if failed_hits >= 3:
            break
    win = True not in ships.values()

if win:
    print("You won.")
else:
    print("You lost.")