# Working with Files

## Reading Files

In [90]:
f = open('15_test.txt', 'r')


In [5]:
f.read()

"And Now for Something Completely Different\nMonty Python and the Holy Grail\nMonty Python's Life of Brian\nMonty Python Live at the Hollywood Bowl\nMonty Python's The Meaning of Life"

In [9]:
f.close()

In [7]:
f.read()

ValueError: I/O operation on closed file.

In [10]:
f = open('15_test.txt', 'r')
try:
    print(f.read())
finally:
    f.close()

And Now for Something Completely Different
Monty Python and the Holy Grail
Monty Python's Life of Brian
Monty Python Live at the Hollywood Bowl
Monty Python's The Meaning of Life


In [11]:
with open('15_test.txt', 'r') as f:
    print(f.read())

And Now for Something Completely Different
Monty Python and the Holy Grail
Monty Python's Life of Brian
Monty Python Live at the Hollywood Bowl
Monty Python's The Meaning of Life


In [13]:
f = open('15_test.txt', 'r')
1/0
f.close()

ZeroDivisionError: division by zero

In [14]:
f.read()

"And Now for Something Completely Different\nMonty Python and the Holy Grail\nMonty Python's Life of Brian\nMonty Python Live at the Hollywood Bowl\nMonty Python's The Meaning of Life"

In [15]:
with open('15_test.txt', 'r') as f:
    1/0

ZeroDivisionError: division by zero

In [16]:
f.read()

ValueError: I/O operation on closed file.
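The two experiments above suggest that a `with` block closes the file even when its body raises, while a plain `open()` leaves it open. A minimal sketch that checks this directly through the file object's `closed` attribute; the temporary directory and the file name `demo.txt` are assumptions made for illustration, not part of the exercise files.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'demo.txt')  # hypothetical file name

    # Plain open(): the exception skips f.close(), so the file stays open
    f = open(path, 'w')
    try:
        1 / 0
        f.close()  # never reached
    except ZeroDivisionError:
        pass
    closed_without_with = f.closed
    f.close()  # clean up by hand

    # with statement: the file is closed on the way out, exception or not
    try:
        with open(path, 'w') as g:
            1 / 0
    except ZeroDivisionError:
        pass
    closed_with_with = g.closed

print(closed_without_with)  # False
print(closed_with_with)     # True
```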

### More about With Statements

In [24]:
class SetupAndTeardown:
    def __init__(self):
        print('Instantiating')

    def __enter__(self):
        print('Setup')
        return self  # the object bound by `as st`

    def __exit__(self, exc_type, exc_value, traceback):
        print('Teardown')

In [25]:
with SetupAndTeardown() as st:
    print('Doing some stuff')

Instantiating
Setup
Doing some stuff
Teardown


In [26]:
class SetupAndTeardown:
    def __init__(self):
        print('Instantiating')

    def __enter__(self):
        print('Setup')
        return self  # the object bound by `as st`

    def __exit__(self, exc_type, exc_value, traceback):
        print(exc_type)
        print(exc_value)
        print(traceback)
        print('Teardown')

In [27]:
with SetupAndTeardown() as st:
    print(1/0)


Instantiating
Setup
<class 'ZeroDivisionError'>
division by zero
<traceback object at 0x1473aec00>
Teardown


ZeroDivisionError: division by zero
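Two details of the protocol are worth seeing in isolation: the name after `as` is bound to whatever `__enter__` returns, and a truthy return value from `__exit__` tells Python the exception has been handled. The sketch below uses a hypothetical class, `Suppressing`, that records its steps in a list instead of printing them; it is an illustration, not part of the original examples.

```python
class Suppressing:
    """Hypothetical context manager that swallows ZeroDivisionError."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append('setup')
        return self  # this is what `as cm` is bound to

    def __exit__(self, exc_type, exc_value, traceback):
        self.events.append('teardown')
        # Returning True suppresses the exception; False lets it propagate
        return exc_type is not None and issubclass(exc_type, ZeroDivisionError)


with Suppressing() as cm:
    1 / 0
    cm.events.append('never reached')

print(cm.events)  # ['setup', 'teardown']
```

Any other exception type would still propagate after the teardown step, because `__exit__` returns `False` for it.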

### Reading Files (continued)

In [18]:
with open('15_test.txt', 'r') as f:
    print(f.readline())
    print(f.readline())

And Now for Something Completely Different

Monty Python and the Holy Grail



In [19]:
with open('15_test.txt', 'r') as f:
    while line := f.readline():
        print(line)

And Now for Something Completely Different

Monty Python and the Holy Grail

Monty Python's Life of Brian

Monty Python Live at the Hollywood Bowl

Monty Python's The Meaning of Life


In [20]:
with open('15_test.txt', 'r') as f:
    print(f.readlines())

['And Now for Something Completely Different\n', 'Monty Python and the Holy Grail\n', "Monty Python's Life of Brian\n", 'Monty Python Live at the Hollywood Bowl\n', "Monty Python's The Meaning of Life"]


In [21]:
with open('15_test.txt', 'r') as f:
    lines = [line.strip() for line in f.readlines()]

In [22]:
with open('15_test.txt', 'r') as f:
    lines = f.read().split('\n')

In [23]:
with open('15_test.txt', 'r') as f:
    lines = f.read().splitlines()
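The three cells above build the same list for `15_test.txt`, but they can disagree on other input. A small sketch of the edge cases, assuming a throwaway file (the name `lines.txt` is hypothetical) whose first line has leading spaces and whose last line ends with a newline. It also shows that a file object can be iterated line by line directly.

```python
import os
import tempfile

text = '  spam\neggs\n'  # leading spaces on line 1, trailing newline at the end

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'lines.txt')  # hypothetical file name
    with open(path, 'w') as f:
        f.write(text)

    with open(path, 'r') as f:
        stripped = [line.strip() for line in f.readlines()]
    with open(path, 'r') as f:
        split_on_newline = f.read().split('\n')
    with open(path, 'r') as f:
        split_lines = f.read().splitlines()
    with open(path, 'r') as f:
        # A file object is itself an iterator over its lines
        iterated = [line.rstrip('\n') for line in f]

print(stripped)          # ['spam', 'eggs']
print(split_on_newline)  # ['  spam', 'eggs', '']
print(split_lines)       # ['  spam', 'eggs']
print(iterated)          # ['  spam', 'eggs']
```

`strip()` also removes leading whitespace, and `split('\n')` yields a trailing empty string when the file ends with a newline.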

## Writing Files

In [28]:
with open('15_new.txt', 'w') as f:
    f.write('Some text')

In [29]:
with open('15_new.txt', 'w') as f:
    f.write('Some text')

with open('15_new.txt', 'w') as f:
    f.write('Some other text')

In [30]:
with open('15_new.txt', 'r') as f:
    print(f.readlines())

['Some other text']


In [31]:
with open('15_new.txt', 'a') as f:
    f.write('Some text')
    f.write('Some other text')

In [32]:
with open('15_new.txt', 'r') as f:
    print(f.readlines())

['Some other textSome textSome other text']


In [33]:
with open('15_new.txt', 'w') as f:
    lines = ['Some text\n', 'Some other text\n']
    f.writelines(lines)


In [34]:
with open('15_new.txt', 'r') as f:
    print(f.readlines())

['Some text\n', 'Some other text\n']


In [35]:
with open('15_new.txt', 'a') as f:
    f.read()

UnsupportedOperation: not readable

In [36]:
with open('15_new.txt', 'w+') as f:
    print(f.read())




In [37]:
with open('15_new.txt', 'w+') as f:
    f.write('This is some new text')
    print(f'File contents are: {f.read()}')
    

File contents are: 


In [38]:
with open('15_new.txt', 'w+') as f:
    f.write('This is some new text')
    f.seek(0)
    print(f'File contents are: {f.read()}')

File contents are: This is some new text


In [39]:
with open('15_new.txt', 'w+') as f:
    f.write('This is some new text')
    f.seek(0)
    print(f'File contents are: {f.read()}')
    f.seek(0)
    f.write('Spam')
    f.seek(0)
    print(f'File contents are: {f.read()}')

File contents are: This is some new text
File contents are: Spam is some new text
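The last output shows that writing after `seek(0)` overwrites characters in place and leaves the rest of the old text behind. If the intent is to replace the contents, one option is to call `truncate()` after the write, which cuts the file off at the current position. A minimal sketch, assuming a temporary file named `overwrite.txt` (hypothetical):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'overwrite.txt')  # hypothetical file name
    with open(path, 'w+') as f:
        f.write('This is some new text')
        f.seek(0)
        f.write('Spam')   # overwrites only the first four characters
        f.truncate()      # discard everything after the current position
        f.seek(0)
        contents = f.read()

print(contents)  # Spam
```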


## Binary Files

In [41]:
with open('15_logo.jpg', 'rb') as f:
    print(f.read()[0:10])

b'\xff\xd8\xff\xe0\x00\x10JFIF'


In [42]:
with open('15_new.txt', 'wb') as f:
    f.write('asdf')

TypeError: a bytes-like object is required, not 'str'

In [43]:
with open('15_new.txt', 'wb') as f:
    f.write('asdf'.encode('utf-8'))
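Binary mode deals in `bytes`, so text has to be encoded on the way in and decoded on the way out. The sketch below round-trips a string in memory; the sample text `'café'` is an assumption chosen because its last character takes two bytes in UTF-8, which makes the difference between characters and bytes visible.

```python
text = 'café'
data = text.encode('utf-8')  # str -> bytes

print(data)                  # b'caf\xc3\xa9'
print(len(text), len(data))  # 4 5
print(data.decode('utf-8'))  # café
```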

In [44]:
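from itertools import batched  # requires Python 3.12+

def hex_pair_to_bytes(hex_pair):
    # Join two hex digits, e.g. ('F', 'F') -> 'FF', and parse them as base 16
    hex_str = ''.join(hex_pair)
    dec_number = int(hex_str, 16)
    # One byte holds 0-255; byte order is irrelevant for a single byte
    return dec_number.to_bytes(1, 'big')

def hex_to_bytes(hex_pairs):
    # 'ab' appends, so running this cell twice duplicates the data
    with open('15_colors.col', 'ab') as f:
        for pair in batched(hex_pairs, 2):
            f.write(hex_pair_to_bytes(pair))


hex_pairs = 'FF0000FFA500FFFF000080000000FF800080'
hex_to_bytes(hex_pairs)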

In [45]:
with open('15_colors.col', 'rb') as f:
    print(f.read())

b'\xff\x00\x00\xff\xa5\x00\xff\xff\x00\x00\x80\x00\x00\x00\xff\x80\x00\x80'
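As an aside, the built-in `bytes.fromhex` performs the same string-to-bytes conversion in one call. The sketch below only builds the bytes in memory and compares them with the output shown above; it is an alternative to the pair-by-pair approach, not a replacement for it.

```python
hex_pairs = 'FF0000FFA500FFFF000080000000FF800080'
color_bytes = bytes.fromhex(hex_pairs)

print(color_bytes)
print(len(color_bytes))  # 18 bytes, i.e. six three-byte colors
```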


## Buffering

In [50]:
with open('15_colors.col', 'rb') as f:
    print(type(f))

<class '_io.BufferedReader'>


In [54]:
with open('15_new.txt', 'w') as f:
    print(type(f))

<class '_io.TextIOWrapper'>


In [55]:
f = open('15_buffering.txt', 'a')
for i in range(100):
    f.write(str(i))

In [56]:
# Check file to see what's written before closing it
f.close()

In [75]:
# empty the file
with open('15_buffering.txt', 'w'):
    pass

f = open('15_buffering.txt', 'w', buffering=1)
for i in range(100):
    f.write(str(i))
    f.flush()    

In [76]:
f.close()

In [57]:
import io

io.DEFAULT_BUFFER_SIZE

8192
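Instead of opening the file in an editor, the effect of buffering can be checked from Python by comparing the file's size on disk before and after `flush()`. A minimal sketch, assuming default buffering, a regular file, and a hypothetical temporary file name `buffered.txt`:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'buffered.txt')  # hypothetical file name
    with open(path, 'w') as f:
        f.write('spam')
        size_before_flush = os.path.getsize(path)  # data is still in the buffer
        f.flush()
        size_after_flush = os.path.getsize(path)   # buffer has been written out

print(size_before_flush)  # 0
print(size_after_flush)   # 4
```

Four characters are far below `io.DEFAULT_BUFFER_SIZE`, so nothing reaches the file until the buffer is flushed or the file is closed.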

In [58]:
# empty the file
with open('15_buffering.txt', 'w'):
    pass

f = open('15_buffering.txt', 'wb', buffering=3)
for i in range(100):
    f.write(str(i).encode('utf-8'))

f.close()  # write out any bytes still buffered and release the file

## Creating and Deleting Files and Directories

In [59]:
import os

os.mkdir('some_directory')

In [60]:
os.mkdir('some_directory/level1')

In [61]:
os.makedirs('some_directory/level1/level2/level3/level4/level5')

In [62]:
os.rmdir('some_directory')

OSError: [Errno 66] Directory not empty: 'some_directory'

In [63]:
os.rmdir('some_directory/level1/level2/level3/level4/level5')

In [64]:
os.rmdir('some_directory/level1/level2/level3/level4')
os.rmdir('some_directory/level1/level2/level3')
os.rmdir('some_directory/level1/level2')
os.rmdir('some_directory/level1')

In [65]:
import shutil

shutil.rmtree('some_directory')

In [66]:
os.mkdir('some_directory')
with open('some_directory/another_file.txt', 'w') as f:
    pass

os.mkdir('some_directory/another_directory')

In [67]:
os.listdir('some_directory')

['another_directory', 'another_file.txt']

In [68]:
for name in os.listdir('some_directory'):
    path = os.path.join('some_directory', name)
    if os.path.isdir(path):
        print(f'{name} is a directory')
    elif os.path.isfile(path):
        print(f'{name} is a file')
    else:
        print(f'{name} is neither a file nor a directory')

another_directory is a directory
another_file.txt is a file


In [69]:
# create file
with open('to_be_removed.txt', 'w') as f:
    pass

# remove file
os.remove('to_be_removed.txt')

In [70]:
if os.path.exists('to_be_removed.txt'):
    os.remove('to_be_removed.txt')
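Checking `os.path.exists` before calling `os.remove` leaves a small window in which another process could delete the file between the two calls. A common alternative is to attempt the removal and catch `FileNotFoundError`. The helper name `remove_if_exists` below is hypothetical, and the sketch works in a temporary directory so it does not touch the exercise files.

```python
import os
import tempfile


def remove_if_exists(path):
    """Hypothetical helper: delete path, returning False if it was already gone."""
    try:
        os.remove(path)
    except FileNotFoundError:
        return False
    return True


with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'to_be_removed.txt')
    with open(path, 'w'):
        pass

    first = remove_if_exists(path)   # file exists, so it is removed
    second = remove_if_exists(path)  # file is already gone

print(first, second)  # True False
```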

## Serializing, Deserializing, and Pickling Data

In [71]:
import json

data = {
    'foo': 1,
    'bar': 2,
    'spam': ['spam', 'spam', 'spam']
}

print(json.dumps(data))


{"foo": 1, "bar": 2, "spam": ["spam", "spam", "spam"]}


In [72]:
with open('data.json', 'w') as f:
    f.write(json.dumps(data))

In [73]:
with open('data.json', 'r') as f:
    data = json.loads(f.read())

print(f'Data is now type: {type(data)}')
print(data)

Data is now type: <class 'dict'>
{'foo': 1, 'bar': 2, 'spam': ['spam', 'spam', 'spam']}
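The `json` module also provides `json.dump` and `json.load`, which write to and read from a file object directly, so the intermediate string is not needed. A minimal sketch of the same round trip, written to a temporary directory (an assumption, to avoid overwriting `data.json`):

```python
import json
import os
import tempfile

data = {'foo': 1, 'bar': 2, 'spam': ['spam', 'spam', 'spam']}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'data.json')
    with open(path, 'w') as f:
        json.dump(data, f, indent=2)  # indent is optional; it pretty-prints
    with open(path, 'r') as f:
        loaded = json.load(f)

print(loaded == data)  # True
```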


In [79]:
class User:
    def __init__(self, name, email):
        self.name = name
        self.email = email


    def to_json(self):
        return json.dumps(
            {
                'name': self.name,
                'email': self.email
            })

    @staticmethod
    def from_json(data):
        data = json.loads(data)
        return User(data['name'], data['email'])


In [80]:
u = User('Alice', 'alice@unlockingpython.com')
data = u.to_json()
new_u = User.from_json(data)
print(new_u.name)
print(new_u.email)

Alice
alice@unlockingpython.com


In [81]:
import pickle

u = User('Alice', 'alice@unlockingpython.com')

with open('alice.pkl', 'wb') as f:
    pickle.dump(u, f)

In [82]:
with open('alice.pkl', 'rb') as f:
    new_u = pickle.load(f)

In [83]:
new_u.name

'Alice'
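One practical difference between the two formats is the range of types they can carry. The sketch below uses a hypothetical `record` containing a tuple and a set: `pickle` restores both unchanged, while `json.dumps` raises `TypeError` because JSON has no set type. Keep in mind that unpickling can execute arbitrary code, so `pickle.load` should only be used on data from a trusted source.

```python
import json
import pickle

record = {'point': (1, 2), 'tags': {'a', 'b'}}  # hypothetical sample data

# pickle round-trips the tuple and the set unchanged
restored = pickle.loads(pickle.dumps(record))
print(restored == record)       # True
print(type(restored['point']))  # <class 'tuple'>

# JSON has no set type, so json.dumps refuses the same data
try:
    json.dumps(record)
except TypeError as error:
    json_error = str(error)
    print(json_error)
```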

## Exercises

**1.**
For each of the following pieces of code, predict whether it will run successfully or raise an error. You can double-check in Python if you’re unsure:

  **a.**
```
with open('test.txt', 'a') as f:
    f.write('line 1\n')
    f.write('line 2\n')
    f.read()
```
    
  **b.**
```
with open('test.txt', 'w+') as f:
    f.write('line 1\n')
    f.write('line 2\n')
    f.read()
```

  **c.**
```
with open('test.txt', 'r+') as f:
    f.write('line 1')
```

  **d.**
```
with open('test.txt', 'w+') as f:
    f.write('line 1')
f.read()
```

**2.**

Read the file `15_colors.col`, generated from the string `FF0000FFA500FFFF000080000000FF800080` (this string and the code that generates the file are included in the exercise files), and convert it back into pretty-printed hex codes of this form:

`#FF0000, #FFA500, etc.`

**3.** 

Write a function, `remove_directory`, that performs the same action as `shutil.rmtree`, using only functions from the `os` module. When a valid directory path is passed to the function `remove_directory`, it should empty it of all files and directories if it’s not already empty, and then remove it with `os.rmdir`.

Remember that any directories within the target directory can have files and directories of their own. The structure you are deleting may be several levels deep! You may want to consider a recursive solution.

**4.**

In Chapter 10, “Functions,” exercise problem 4, we covered the concept of run-length encoding and wrote the functions `encode` and `decode` in order to compress ASCII art. Working solutions for the `encode` and `decode` functions are provided in the exercise files for you to use:

In [85]:
def encode(data_str):
    encoded_data = []
    count = 0
    last_char = data_str[0]
    for char in data_str:
        if char != last_char: # encountered a new character!
            encoded_data.append((last_char, count))
            count = 0
            last_char = char
        count += 1
    encoded_data.append((last_char, count))
    return encoded_data

def decode(encoded_data):
    data_str = ''
    for char, count in encoded_data:
        data_str += char * count
    return data_str
        

Use these, or your own solution, to read the file `15_ascii.txt` (which is 2,754 bytes in size), compress it using the `encode` function, and write the result to a binary file `15_ascii.bin`.

Consider the minimum amount of information needed to be written to the binary file. You will probably want to write a single character, followed by an 8-bit number, followed by another character, followed by another 8-bit number, etc. What is the resulting size of this file?

Finally, write a function that reads the file back in and decompresses it using the decode function. Check that the decompressed version looks the same as the original!