# Week 06, Worksheet 1: Files

`string`s are all sorts of fun. That is, until we run out of things to come up with -- which happens pretty fast. (Even I can entertain myself for only so long: just about 1 minute.)

More often than not, using `string`s (and the practice of programming at large) is the practice of working with files, or information stored _outside_ of a program. To get at a file, we use the `open` function. Our use of this function features two parts:

```python
#   1. File's path (as string)
#     |
open(filename, mode)
#              |
#   2. File mode (see table below)
```
1. The path (passed as `filename` above) is the location of the file, written as a `string`
2. The `mode` tells the function what kind of work we intend to do on or with the file

Modes include:

|Mode |Purpose |
|-----|--------|
|`r`  |Read a file|
|`w`  |Write to a file|
|`a`  |Append to a file|
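
To make the table a little more concrete, here's a minimal sketch of all three modes at work. The file name `modes_demo.txt` and the temporary folder are made up for this example (they aren't part of the worksheet's files), and the `write` and `close` methods haven't been introduced yet, so treat this as a preview rather than something to memorize.

```python
import os
import tempfile

# Hypothetical scratch file in a temporary folder, so nothing real gets overwritten.
scratch_path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w" creates the file (or empties an existing one) before writing.
out_file = open(scratch_path, "w")
out_file.write("first line\n")
out_file.close()

# "a" keeps what is already there and adds to the end.
out_file = open(scratch_path, "a")
out_file.write("second line\n")
out_file.close()

# "r" opens the file for reading only.
in_file = open(scratch_path, "r")
contents = in_file.read()
in_file.close()

print(contents)
```

Notice that `"w"` is the destructive one: opening an existing file with it wipes the old contents, while `"a"` leaves them alone.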

This week, we'll focus on simply reading from and writing to a file. First, let's read from a file:

In [None]:
# Fun fact: if you recognize any of the names in this poem, you've clearly seen the musical _Cats_.
#           T.S. Eliot wrote the book on which that musical is based; this is its first poem.
cat_poem = open("eliot_the_naming_of_cats.txt", "r")

Once our file is open, like any object in Python, it offers certain _methods_ we can use. To access the file's content, we'll use one of two of them:

1. `read`
2. `readlines`

(As with anything in programming, there are many more ways to complete the above actions; as you extend your knowledge of programming -- particularly Python -- you'll discover them and will likely opt to use other methods.)

<div class="alert alert-block alert-warning">
    Using either of these methods will <em>automatically consume the contents of the file</em>. Once one of them has been called, calling it again returns nothing; the simplest way to get the contents back is to <b>open</b> the file again.
</div>
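
Here's a small sketch of what "consuming" a file looks like in practice. It builds its own throwaway file (the name `consume_demo.txt` is invented for this example), reads it twice, and then rewinds it with `seek(0)`. The `seek` method isn't covered in this worksheet, so take it as an optional extra; re-opening the file works just as well.

```python
import os
import tempfile

# Hypothetical throwaway file so the example runs anywhere.
demo_path = os.path.join(tempfile.mkdtemp(), "consume_demo.txt")
demo_file = open(demo_path, "w")
demo_file.write("one\ntwo\n")
demo_file.close()

demo_file = open(demo_path, "r")
first_read = demo_file.read()    # the whole file
second_read = demo_file.read()   # nothing left: an empty string
demo_file.seek(0)                # move back to the start of the file
third_read = demo_file.read()    # the whole file again
demo_file.close()

print(repr(first_read))
print(repr(second_read))
print(repr(third_read))
```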

Below, we'll see what using the `read` method looks like.

In [None]:
text = cat_poem.read()
cat_poem.close()  # we're done reading, so hand the file back
text

Perhaps that wasn't quite what you expected -- however, sometimes we get exactly what we ask for. This is one of those times.

`read` returns the exact contents of the file as a single `string`, _control characters_ and all -- another important part of `string`s. When the notebook displays that `string`, it shows those characters in their written form rather than acting on them. Recall the use of `\t` in the opening weeks of the course. Here, we see a new one, `\n`, which indicates that a _new line_ should occur _exactly at that point_.
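
A quick sketch, using a made-up `string` called `sample` rather than the poem, shows that `\n` really is a single character sitting inside the `string`, even though we type it with two keystrokes:

```python
# `sample` is a stand-in invented for this example.
sample = "first\nsecond"

print(len(sample))      # 12: five letters, one newline, six letters
print(repr(sample))     # the written form, with \n visible
print("\n" in sample)   # True
```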

If we `print` the `text` variable instead, the `\n` _control characters_ take effect and each line of the poem lands on its own line.

In [None]:
print(text)

_Much_ better. But, I told you there were other ways to do this, so there must be some benefit. Let's look at `readlines`.

In [None]:
cat_poem = open("eliot_the_naming_of_cats.txt", "r")
lines = cat_poem.readlines()
cat_poem.close()  # the contents now live in `lines`, so the file can be closed
print(lines)

Curious. We get a `list` of `string`s containing all of the lines in the file _including_ the `\n` control characters. Because it's a `list`, we can apply the usual _data structure_ operations to it, for example:

In [None]:
# Counting the number of lines in the poem
print("There are", len(lines), "lines in the poem.")

# Getting the second-to-last line
print(lines[-2])

# Getting a slice of the list from a spot in the middle
print(lines[10:15])
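
One of the "other ways" hinted at earlier is worth a quick look: a `with` block, which closes the file for us when the block ends, combined with a `for` loop directly over the file. This is a sketch under assumptions, not the worksheet's method; the file `sample_poem.txt` and its three lines are invented so the example can run on its own.

```python
import os
import tempfile

# Hypothetical three-line file standing in for the poem.
sample_path = os.path.join(tempfile.mkdtemp(), "sample_poem.txt")
with open(sample_path, "w") as sample_file:
    sample_file.write("line one\nline two\nline three\n")

# Looping over an open file hands us one line at a time, \n included.
collected = []
with open(sample_path, "r") as sample_file:
    for line in sample_file:
        collected.append(line)

print(collected)
print(sample_file.closed)  # True: the with block closed the file for us
```

The resulting `list` matches what `readlines` would give for the same file, so everything above (`len`, indexing, slicing) works on it too.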

### Formatting

Let's say, for argument's sake, we want to print the poem as poems are traditionally seen: line by line, with a line number to the left of each line, separated by some space. OK, here we go:

In [None]:
print("The Naming of Cats")
print("  T.S. Eliot")
print()
line_num = 1
for line in lines:
    print(line_num,line)
    line_num += 1

Cool and all, but what's with that space between the lines? 

Oh! We forgot that every line actually has a `\n` character at its end, and `print` adds a new line of its own on top of that. To rid ourselves of this `\n`, we'll use a new method: `rstrip`.
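
Here's a minimal sketch of `rstrip` on its own, using two made-up `string`s instead of lines from the poem. Called with nothing in the parentheses, it removes _all_ whitespace from the right-hand end (spaces and tabs as well as `\n`) and leaves the left-hand end alone.

```python
# Both strings are invented for this example.
raw_line = "A line from a file\n"
clean_line = raw_line.rstrip()

print(repr(raw_line))    # \n still attached
print(repr(clean_line))  # \n gone

# Only the right-hand side is trimmed; the leading spaces survive.
indented = "  T.S. Eliot  \n"
trimmed = indented.rstrip()
print(repr(trimmed))
```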

_And_, that number to the left looks a bit close. We _could_ solve this using our typical approach to `print`ing things. Or, we could learn a _new_ way to format our strings: the `f-string`.

The `f-string` allows us to create a _template_ `string` -- something that holds the variables we want to `print` and the formatting we want to use to `print` them. The syntax is only slightly modified: put an `f` before the opening quote, and wrap each variable in `{}`:

In [None]:
line_num = 1

print("The Naming of Cats")
# See what I did here with \n?
print("  T.S. Eliot\n")

for line in lines:
    print(f"{line_num}\t{line.rstrip()}")
    line_num += 1
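
To see the `f-string` in isolation, here's a small sketch with stand-in values (the `7` and `"some text\n"` are invented). It also shows one optional extra that the worksheet doesn't require: writing `:>3` after a variable right-aligns it in a space three characters wide, which is another way to keep line numbers tidy.

```python
# Stand-in values invented for this example.
line_num = 7
line = "some text\n"

# Whatever sits inside {} is replaced by its value.
plain = f"{line_num}\t{line.rstrip()}"

# Optional: ":>3" right-aligns the number in a 3-character-wide slot.
padded = f"{line_num:>3}  {line.rstrip()}"

print(plain)
print(padded)
```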

But, even that's a bit clunky -- _there are too many line numbers!_ Let's print a number only to the left of every _fifth_ line.

In [None]:
line_num = 1

print("The Naming of Cats")
print("  T.S. Eliot\n")

for line in lines:
    # "%" is the modulo operator -- it gives the remainder after division, so this asks "does 5 divide line_num evenly?"
    if line_num % 5 == 0:
        print(f"{line_num}\t{line.rstrip()}")
    else:
        print(f"\t{line.rstrip()}")
    line_num += 1

Much better!
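
As you pick up more Python, you may meet `enumerate`, which does the counting that `line_num += 1` does by hand. The sketch below is one possible variation, not a replacement for the approach above; `sample_lines` is a made-up ten-line stand-in for `lines` so the example runs without the poem file.

```python
# Hypothetical stand-in for `lines`: ten short lines, each ending in \n.
sample_lines = []
for n in range(1, 11):
    sample_lines.append(f"line {n}\n")

numbered = []
# enumerate(..., start=1) pairs each line with its number, counting from 1.
for line_num, line in enumerate(sample_lines, start=1):
    if line_num % 5 == 0:
        label = line_num   # every fifth line gets its number
    else:
        label = ""         # the rest get nothing before the tab
    numbered.append(f"{label}\t{line.rstrip()}")

print("\n".join(numbered))
```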

### Finishing this activity

Test yourself by completing the [final activity](f2_week-1-worksheet-yeats.md)!
