# Working With Files

Another very common task is opening a file and reading its contents. In most data analysis contexts, specialized libraries handle this for us (e.g. `pandas` and its `read_csv()` function). However, we often need a more generalized technique for opening an arbitrary file and parsing its contents:

In [10]:
# We begin by using the open function to create a TextIOWrapper object
f = open("/home/jovyan/test_file.txt")

In [11]:
f

<_io.TextIOWrapper name='/home/jovyan/test_file.txt' mode='r' encoding='UTF-8'>

In [2]:
# We can read the contents of the file as a string
f.read()

'taco taco burrito\nllama llama alpaca\n'

In [3]:
# Note that the first read left the file position at the end, so there's nothing left to read...
f.read()

''

In [4]:
# We always want to close the file when we're done... this releases the file handle and flushes any buffered writes
# Pro Tip: This may or may not be a trap on a certain company's technical interview ;)
f.close()

In [6]:
# The object still exists but attempts to read will raise an exception
f.read()

ValueError: I/O operation on closed file.
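
The second `read()` above returned an empty string because the file position was already at the end of the file. As a minimal sketch, assuming a throwaway file created with `tempfile` (the path `example.txt` is a hypothetical stand-in for `test_file.txt`), we can rewind with `seek(0)` and use `try`/`finally` so the file is closed even if a read fails:

```python
import os
import tempfile

# Create a scratch file so the example is self-contained
# (hypothetical stand-in for the notebook's test_file.txt)
path = os.path.join(tempfile.mkdtemp(), "example.txt")
setup = open(path, "w", encoding="utf-8")
setup.write("taco taco burrito\nllama llama alpaca\n")
setup.close()

f = open(path, encoding="utf-8")
try:
    first = f.read()    # reads everything; the position is now at the end
    second = f.read()   # nothing left to read, so this is ''
    f.seek(0)           # move the position back to the start of the file
    third = f.read()    # the full contents again
finally:
    f.close()           # runs even if one of the reads raises

print(repr(second))
print(first == third)
print(f.closed)
```

The `closed` attribute is a quick way to check whether a file object is still usable before reading from it.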

## Writing to files

So far so good, but what about writing to a file? If we examine the `f` object above we can see that it defaults to mode `r`, which stands for read. We can open another file in mode `w` for write. Note that `w` creates the file if it doesn't exist and overwrites it if it does:

In [7]:
# Establishing our file connections:
f = open("/home/jovyan/test_file.txt")
w = open("/home/jovyan/test_output.txt", "w")

In [8]:
# Note that iterating over a file object goes line by line by default.
for num, line in enumerate(f):
    print(line)
    w.write(f"line {num}: {line}")

In [9]:
# Always clean up
f.close()
w.close()
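
Mode `w` is not the only write mode. The sketch below, which assumes a scratch file in a temporary directory (the name `output.txt` is hypothetical), compares `w`, which truncates the file every time it is opened, with `a`, which appends to whatever is already there:

```python
import os
import tempfile

# Scratch file for illustration only (hypothetical path)
path = os.path.join(tempfile.mkdtemp(), "output.txt")

w = open(path, "w", encoding="utf-8")
w.write("first\n")   # write() does not add a newline for us
w.close()

# Opening in "w" again empties the file before writing
w = open(path, "w", encoding="utf-8")
w.write("second\n")
w.close()

# Opening in "a" keeps the existing contents and writes at the end
a = open(path, "a", encoding="utf-8")
a.write("third\n")
a.close()

r = open(path, encoding="utf-8")
contents = r.read()
r.close()

print(repr(contents))
```

Only `second` and `third` survive, since the second `open(..., "w")` discarded the first line.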

## Better connection management with context blocks

A very helpful feature of Python is the `with` statement, which sets up a context block. This offers two advantages: it's more readable, and it automatically closes the file once the block is exited, even if an exception is raised inside it. Using context blocks is considered best practice whenever possible:

In [13]:
# We set up a context block with the `with` keyword

with open("/home/jovyan/test_file.txt") as f:
    for line in f:
        print(line)  # each line keeps its trailing "\n", so the output is double-spaced

# No need to close the connection... it closes itself!

taco taco burrito

llama llama alpaca
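
A single `with` statement can also manage several files at once. Here is a minimal sketch of the earlier read-and-write loop rewritten this way, assuming scratch files in a temporary directory (the names `example_input.txt` and `example_output.txt` are hypothetical):

```python
import os
import tempfile

# Scratch files so the example is self-contained (hypothetical paths)
tmp_dir = tempfile.mkdtemp()
in_path = os.path.join(tmp_dir, "example_input.txt")
out_path = os.path.join(tmp_dir, "example_output.txt")

with open(in_path, "w", encoding="utf-8") as setup:
    setup.write("taco taco burrito\nllama llama alpaca\n")

# Both files are closed when the block exits, even if the loop raises
with open(in_path, encoding="utf-8") as f, open(out_path, "w", encoding="utf-8") as w:
    for num, line in enumerate(f):
        w.write(f"line {num}: {line}")

with open(out_path, encoding="utf-8") as result:
    numbered = result.read()

print(numbered, end="")
print(f.closed, w.closed)
```

Because each `line` already ends in a newline, the f-string does not need to add one.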



## Other reading options

Often we want to get each line of a file into a list. We can use the `readlines` method for that:

In [15]:
with open("/home/jovyan/test_file.txt") as f:
    lines = f.readlines()
    
lines

['taco taco burrito\n', 'llama llama alpaca\n']
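
Notice that `readlines` keeps the trailing `\n` on every line. If we don't want it, two common alternatives are stripping each line in a list comprehension or calling `splitlines()` on the full string. A minimal sketch, again assuming a scratch file (the name `example.txt` is hypothetical):

```python
import os
import tempfile

# Scratch file so the example is self-contained (hypothetical path)
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with open(path, "w", encoding="utf-8") as setup:
    setup.write("taco taco burrito\nllama llama alpaca\n")

# Option 1: iterate over the file and strip the newline from each line
with open(path, encoding="utf-8") as f:
    stripped = [line.rstrip("\n") for line in f]

# Option 2: read the whole file and split it on line boundaries
with open(path, encoding="utf-8") as f:
    split = f.read().splitlines()

print(stripped)
print(stripped == split)
```

Both options load the whole file into memory, just like `readlines`, so for very large files it is usually better to loop over the file object and process one line at a time.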