# Files and filesystems

## Learning Objective

Today we will be learning how to:

* read and write files

## Success criteria

By the end of this notebook, you will have been successful if you have:

* read and printed a file
* checked and caught a file error
* written to, then read from a file.

## Vocabulary

 Word | Definition 
------|------------
 OS | Operating System
 cwd | current working directory
 path separator | the character that separates directory names in a path
 filename | the full name of the file within the directory
 filepath | the filename combined with some or all of its parent directories
 file extension | the final dot and the letters after it in the filename
 . | a shortcut to mean the current directory
 .. | a shortcut to mean the parent directory
 ~ | a shortcut to mean the user's home directory
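
The terms above can be explored with the standard `os` and `os.path` modules. This is a minimal sketch: the path `reports/2024/results.csv` is a made-up example and does not need to exist on your machine.

```python
import os
import os.path

# The current working directory (cwd): where relative paths start from.
cwd = os.getcwd()

# The path separator used by this OS.
separator = os.sep

# A hypothetical filepath built from two directory names and a filename.
filepath = os.path.join("reports", "2024", "results.csv")

filename = os.path.basename(filepath)          # the last part of the path
stem, extension = os.path.splitext(filename)   # split off the file extension

# "." is the current directory, ".." is its parent, "~" is the home directory.
here = os.path.abspath(".")
parent = os.path.abspath("..")
home = os.path.expanduser("~")

for value in (cwd, separator, filepath, filename, extension, here, parent, home):
    print(value)
```

Running this on different operating systems should show a different separator and different absolute paths, while `filename` and `extension` stay the same.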
 
 

## Review

Important concepts to recall are:

* binary representation

Data at rest or in transit is represented as binary numbers.
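
As a small illustration of this idea, the sketch below shows the bytes and the binary digits behind a short piece of text. The string `"Hi"` is just an example value.

```python
text = "Hi"

# Encoding turns characters into bytes, the form used on disk or on a network.
data = text.encode("utf-8")

# Each byte is a number from 0 to 255; show each one as 8 binary digits.
bits = [format(byte, "08b") for byte in data]

print(data)   # b'Hi'
print(bits)   # ['01001000', '01101001']
```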

In [None]:
import os.path    # import the library for working with file paths on your OS
fp = "hello.txt"
os.path.isfile(fp)


This code has an important **sequence**.

Run the code, then mix up the order of the lines and run it again.

In [None]:
fh = open("hello.txt")  # 1. Open a file handle
msg = fh.read()         # 2. Read the whole file into the variable `msg`
print(msg)              # 3. Print the message
fh.close()              # 4. Close the file handle


### File handles or descriptors

A file handle is what your OS uses to keep track of access to an open file. There is a limited number of these handles, so whenever you **open** one, you must take care to **close** it again, or you may run out of them.

A typical limit is about 1000 open files, so you can run out of handles easily if you open files in a loop without closing them.

Python can help here by using a **with** statement. Below is the same code, but using a **with** block to *guarantee* the file is closed again.

https://www.techopedia.com/definition/3313/file-handle

In [None]:
with open("hello.txt") as fh:
    msg = fh.read()
print(msg)
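
To see that the **with** block really does close the file, you can check the handle's `closed` attribute. This is only a sketch: it writes its own small file in a temporary folder (the name `demo.txt` is an assumption for illustration) so that it runs without `hello.txt`.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    fp = os.path.join(folder, "demo.txt")   # hypothetical file name

    with open(fp, "w") as fh:
        fh.write("hello")

    with open(fp) as fh:
        inside = fh.closed      # False: the file is still open inside the block
        msg = fh.read()
    outside = fh.closed         # True: the block closed it on the way out

    # The file is closed even if an error happens inside the block.
    try:
        with open(fp) as fh2:
            raise ValueError("something went wrong")
    except ValueError:
        closed_after_error = fh2.closed

print(msg, inside, outside, closed_after_error)   # hello False True True
```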

A common error when opening a file is `FileNotFoundError`,
so it's often worth checking that the filepath exists first.

In [None]:
with open("non-existing-file.txt") as fh:
    msg = fh.read()
print(msg)

In [None]:
import os.path as osp

fp = "./no-such-file.txt"
# Use an `if` statement to check
if osp.isfile(fp):
    with open(fp) as fh:
        msg = fh.read()
    print(msg)
else:
    print("No such file as {}".format(osp.abspath(fp)))
    

In [None]:
osp.abspath?

In [None]:
# Or use a try...except block
try:
    with open(fp) as fh:
        msg = fh.read()
    print(msg)
except FileNotFoundError:
    print("No such file as {}".format(osp.abspath(fp)))

### File modes


Character | Meaning
----------|---------------------------------------------------------------
'r'       | open for reading (default)
'w'       | open for writing, truncating the file first
'x'       | create a new file and open it for writing
'a'       | open for writing, appending to the end of the file if it exists
'b'       | binary mode
't'       | text mode (default)
'+'       | open a disk file for updating (reading and writing)

These uses of `open` are all equivalent:

* `open(filename)`
* `open(filename, 'r')`
* `open(filename, 'rt')`

A word of warning here: if you open a file in write mode with `'w'`,
then the file will be **truncated** (its existing contents are erased) before anything is written to it.

To **add** to the end of a file, you will need `append` mode, or `'a'`.
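
The difference between these modes is easiest to see side by side. The sketch below uses a temporary folder and a made-up file name, `modes.txt`, so it won't touch your own files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    fp = os.path.join(folder, "modes.txt")   # hypothetical file name

    with open(fp, "w") as fh:     # 'w' creates the file if it is missing
        fh.write("first\n")

    with open(fp, "a") as fh:     # 'a' keeps the contents and adds to the end
        fh.write("second\n")
    with open(fp) as fh:
        after_append = fh.read()

    with open(fp, "w") as fh:     # 'w' empties the file before writing
        fh.write("third\n")
    with open(fp) as fh:
        after_write = fh.read()

    try:
        open(fp, "x")             # 'x' refuses to open a file that already exists
        x_failed = False
    except FileExistsError:
        x_failed = True

print(repr(after_append))   # 'first\nsecond\n'
print(repr(after_write))    # 'third\n'
print(x_failed)             # True
```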

### Writing files

You often write to a file to record results.

Look at the code below and decide:

* what mode is the file opened in?
* what will go into the file?
* what will the last thing written be?

In [None]:
with open('countdown.txt', 'a') as result:
    for i in range(10, 0, -1):
        result.write("{}\n".format(i)) 
    result.write("Boom")

**Read** the code below and decide:

* will this code write to the bottom of the file?
* what will the last thing written be?
* what is the error in this code?

**Fix** and run the code so that it runs without error.

In [None]:
with open('countdown.txt', 'w') as result:
    for i in range(10, 0, -1):
        result.write("{}\n".format(i)) 
result.write("Boom")

**Write** some code that reads the `countdown.txt` file and prints the file.

Remember you will need to:

1. Use `with` to open a file in reading mode.
2. `read` the data from the file
3. `print` the data

In [None]:
# Read and print the data from countdown.txt here





A more memory-efficient way to read the file is to read it line by line.
Python allows you to use `for` to iterate over each line,
so that the *whole* file is *not* read into memory at once.
It can be faster too, as only a little data is read before you do something with it.

This is particularly important for very large files.
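
As a sketch of this pattern on a different file, the code below adds up numbers stored one per line, holding only one line in memory at a time. The file name `numbers.txt` and its contents are assumptions made for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    fp = os.path.join(folder, "numbers.txt")   # hypothetical file name

    # Create a small example file with one number per line.
    with open(fp, "w") as fh:
        fh.write("3\n4\n5\n")

    total = 0
    line_count = 0
    with open(fp) as fh:
        for line in fh:                   # one line at a time, not the whole file
            total += int(line.strip())    # strip() removes the trailing newline
            line_count += 1

print(line_count, total)   # 3 12
```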

**Write** some code that reads the `countdown.txt` file and prints each line

1. Use `with` to open a file in reading mode.
2. Iterate over each line
    * `print` the line


In [None]:
# Read and print each line from countdown.txt here





## Binary mode

Opening a file in text mode requires that every byte can be decoded into a character.
If there are bytes that *cannot* be decoded, then an error (`UnicodeDecodeError`) will be raised.
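
Here is a small sketch of what "cannot be decoded" means, assuming the UTF-8 encoding. The byte `0x89`, which is the first byte of every PNG file, is not valid UTF-8 text on its own, so decoding it fails.

```python
text_bytes = b"hello"
image_bytes = bytes([0x89, 0x50, 0x4E, 0x47])   # the first four bytes of a PNG file

print(text_bytes.decode("utf-8"))   # hello

try:
    image_bytes.decode("utf-8")
    decoded = True
except UnicodeDecodeError as err:
    decoded = False
    print("Cannot decode:", err.reason)
```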

![texture_bitmap](../images/simple_texture.bmp)

The image *above* is stored as bytes, many of which cannot be converted to text.
The code *below* will raise an error.

What change can you make to prevent the error from happening?

In [None]:
fp = "../images/simple_texture.png"
with open(fp, 'r') as im:
    msg = im.read()
print(msg)

In [None]:
open?

In [None]:
with open("hello.txt", 'rt') as fh:
    msg = fh.read()
print(msg)

In [None]:
import os
os.listdir('.')

In [None]:
import os.path as osp
from PIL import Image
fp = "./simple_texture.png"

assert osp.isfile(fp)

with Image.open(fp) as im:
    im.rotate(45).show()

In [None]:
import os

for f in os.listdir('.'):
    print(f)

In [None]:
x = None
y = None
x == y
type(x)

In [None]:
float('-0')


In [None]:
repr(None)