<h1>Reading and Writing Files</h1>

Reading from a file is particularly useful in data analysis applications, but it also applies to any situation in which you want to analyze or modify information stored in a file.

When you want to work with the information in a text file, the first step is to read the information into memory.

The first examples use the following files:
- `some_file.txt`
- `pi_digits.txt`

<h2>1. Reading a File </h2>


There are several ways that we could open, read and modify the files. 


<h3>1.1 First approach (recommended)</h3>
Consider the following cell:


In [6]:
with open('pi_digits.txt') as file_object:
    contents = file_object.read()
print(contents)

3.1415926535
  8979323846
  2643383279


To do any work with a file, even just printing its contents, you first need to *open* the file to access it. The `open()` function needs one argument: the name of the file you want to open. When the name is a relative path, Python looks for the file relative to the current working directory, which is usually the directory from which the program is run.

The `open()` function returns an object representing the file.

The keyword `with` closes the file once access to it is no longer needed. The second approach calls the `close()` method explicitly, but this first approach does not need it. You could open and close the file by calling `open()` and `close()`, but if a bug in your program prevents the `close()` method from being executed, the file may never close. This may seem trivial, but improperly closed files can cause data to be lost or corrupted. And if you call `close()` too early in your program, you'll find yourself trying to work with a *closed* file.

Once the file object representing *pi_digits.txt* is obtained, the `read()` method in the second line of the program reads the entire contents of the file and stores it as one long string in `contents`.
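
As a quick check of the behavior described above, the sketch below writes a small sample file and reads it back using `with`. The file name `sample_digits.txt`, the temporary directory, and the sample text are assumptions made for illustration. The `closed` attribute of the file object reports whether the file has been closed.

```python
import os
import tempfile

# Create a small sample file in a temporary directory (hypothetical name and content).
path = os.path.join(tempfile.mkdtemp(), 'sample_digits.txt')
with open(path, 'w') as file_object:
    file_object.write('3.1415926535\n')

with open(path) as file_object:
    contents = file_object.read()
    inside = file_object.closed   # checked while the with block is still active

outside = file_object.closed      # checked after the with block has ended

print(contents)
print(inside, outside)
```

Inside the block the file is still open; once the block ends, `with` has already closed it, with no call to `close()` in the code.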



<h3>1.2 Second approach</h3>

Consider the following cell:



In [8]:
f = open('some_file.txt','r')
file_data = f.read()
f.close()


- First, open the file using the built-in function `open`. This requires a string that gives the path to the file. The `open` function returns a file object, which is a Python object through which Python interacts with the file itself. Here, we assign this object to the variable `f`.

- There are optional parameters you can specify in the `open` function. One is the mode in which we open the file. Here, we use `'r'`, or read-only. This is actually the default value for the mode argument.

- Use the `read` method to access the contents of the file object. The `read` method takes the text contained in a file and puts it into a string. Here, we assign the string returned by this method to the variable `file_data`.

- When finished with the file, use the `close` method to free up any system resources taken up by the file.
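
If you open and close a file yourself, one common way to make sure `close()` still runs when an error occurs is to wrap the work in `try`/`finally`. This is a minimal sketch rather than part of the original cell; the temporary directory and the file's text are made up so the example can run on its own.

```python
import os
import tempfile

# Create a sample file to read (hypothetical content).
path = os.path.join(tempfile.mkdtemp(), 'some_file.txt')
with open(path, 'w') as sample:
    sample.write('Hello, file!')

f = open(path, 'r')
try:
    file_data = f.read()
finally:
    f.close()   # runs whether or not read() raised an exception

print(file_data)
print(f.closed)
```

This is essentially the bookkeeping that `with` does for you, which is one reason the first approach is usually preferred.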

### 1.3 Calling the `read` Method with an Integer

In the code in the previous cells, the call to `.read()` had no arguments passed to it. This defaults to reading all the remainder of the file from its current position, which for a newly opened file is the whole file. If you pass the `read` method an integer argument, it will read up to that number of characters, return them, and keep the *window* at that position, ready to read on.

Let's see this in an example that uses the following file, `camelot.txt`, which contains this text:
```
We're the knights of the round table
We dance whenever we're able
```

In [13]:
with open("camelot.txt") as song:
    print(song.read(2)) # prints 'We' and leaves the position just before the apostrophe
    print(song.read(8)) # prints "'re the ", up to and including the space before 'knights'
    print(song.read())  # prints the remainder of the text

We
're the 
knights of the round table
We dance whenever we're able
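
The same idea extends to reading a file in fixed-size pieces: `read(n)` returns an empty string once the end of the file is reached, so a loop can stop there. The sketch below assumes a chunk size of 16 characters and recreates the `camelot.txt` text in a temporary directory so that it can run on its own.

```python
import os
import tempfile

text = "We're the knights of the round table\nWe dance whenever we're able\n"

# Recreate camelot.txt in a temporary directory (assumption for this sketch).
path = os.path.join(tempfile.mkdtemp(), 'camelot.txt')
with open(path, 'w') as song:
    song.write(text)

chunk_size = 16   # arbitrary size chosen for illustration
chunks = []
with open(path) as song:
    while True:
        chunk = song.read(chunk_size)
        if chunk == '':          # empty string means end of file
            break
        chunks.append(chunk)

for chunk in chunks:
    print(repr(chunk))
```

Joining the chunks back together gives the original text, since each call picks up where the previous one stopped.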


### 1.4 Reading Line by Line

The `\n` sequences in a block of text are newline characters. The newline character marks the end of a line and tells a program (such as a text editor) to go down to the next line. However, looking at the stream of characters in the file, `\n` is just another character.

Fortunately, Python knows that these are special characters, and you can ask it to read one line at a time.

Conveniently, Python will loop over the lines of a file using the syntax `for line in file`. I can use this to create a list of lines in the file. Because each line still has its newline character attached, I remove it using `.strip()`, which also removes any other leading or trailing whitespace.


In [14]:
camelot_lines = []
with open("camelot.txt") as f:
    for line in f:
        camelot_lines.append(line.strip())
print(camelot_lines)

["We're the knights of the round table", "We dance whenever we're able"]
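
To see what the stripping methods actually remove, here is a small sketch on a made-up line with leading spaces, one trailing space, and a trailing newline. `strip()` removes whitespace from both ends, `rstrip()` removes it only from the right, and `rstrip('\n')` removes only trailing newline characters.

```python
# Hypothetical line: two leading spaces, one trailing space, then a newline.
line = '  8979323846 \n'

both_ends = line.strip()           # whitespace removed from both ends
right_only = line.rstrip()         # whitespace removed from the right end only
newline_only = line.rstrip('\n')   # only the trailing newline removed

print(repr(both_ends))
print(repr(right_only))
print(repr(newline_only))
```

Which one to use depends on whether leading indentation in the file is meaningful for your program.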


### 1.5 Making a List of Lines from a File

When you use `with`, the file object returned by `open()` is only available inside the `with` block that contains it. If you want to retain access to a file's contents outside the `with` block, you can store the file's lines in a list inside the block and then work with that list. You can process parts of the file immediately and postpone some processing for later in the program.




In [16]:
filename = 'pi_digits.txt'

with open(filename) as file_object:
    lines = file_object.readlines()
    
for line in lines:
    print(line.rstrip())

3.1415926535
  8979323846
  2643383279


In the previous cell, in line 4, the `readlines()` method takes each line from the file and stores it in a list. This list is then assigned to `lines`, which we can continue to work with after the `with` block ends. In line 7, `.rstrip()` removes the trailing newline from each line, much like the `.strip()` call in the section on reading line by line.

**NOTE: When Python reads from a text file, it interprets all text in the file as a string. If you read in a number and want to work with that value in a numerical context, you'll have to convert it to an integer using the `int()` function.**
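
Because text read from a file is always a string, numeric work needs an explicit conversion. The sketch below is a minimal illustration, assuming the three lines of `pi_digits.txt` shown earlier; they are recreated in a list here so the example runs without the file. It uses the built-in `float()` for the decimal value and `int()` for the whole-number part.

```python
# Lines as readlines() would return them for pi_digits.txt (recreated by hand).
lines = ['3.1415926535\n', '  8979323846\n', '  2643383279\n']

# Build one string of digits, dropping the whitespace around each line.
pi_string = ''
for line in lines:
    pi_string += line.strip()

pi_value = float(pi_string)                  # string -> float
whole_part = int(pi_string.split('.')[0])    # text before the dot -> int

print(pi_string)
print(type(pi_string).__name__, type(pi_value).__name__)
print(whole_part + 1)   # arithmetic works only after conversion
```

Note that a `float` keeps only about 15 to 17 significant digits, so the converted value is less precise than the string it came from.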

<h2>2. Writing to a file </h2>

One of the simplest ways to save data is to write it to a file. When you write text to a file, the output will still be available after you close the terminal containing your program's output. You can examine output after a program finishes running, and you can share the output files with others as well. You can also write programs that read the text back into memory and work with it again later.

Consider the following cell:

In [10]:
with open('programming.txt','w') as file_obj:
    file_obj.write("I love programming.")

The call to `open()` in line 1 has two arguments. The first argument is still the name of the file we want to open. The second argument is `'w'`, which tells Python that we want to open the file in *write mode*. You can open a file in **read mode (`'r'`)**, **write mode (`'w'`)**, or **append mode (`'a'`)**, or a mode that allows you to **read and write to the file (`'r+'`)**. If you omit the mode argument, Python opens the file in read-only mode by default.

The `open()` function automatically creates the file you're writing to if it doesn't exist. However, be careful opening a file in write mode (`'w'`) because if the file does exist, Python will erase the contents of the file before returning the file object.

In line 2 the method `write()` is used on the file to write a string to the file. This program has no terminal output, but if you open the file *programming.txt* you'll see the line *I love programming*.

**NOTE: Python can only write strings to a text file. If you want to store numerical data in a text file, you'll have to convert the data to string format first using the str() function.**

**NOTE: The `write()` method doesn't add any newlines to the text you write. So if you write more than one line without including newline characters, your file may not look the way you want it to.**
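
The two notes above can be sketched together: each `write()` call ends with an explicit `\n`, and a number is converted with `str()` before it is written. The temporary directory, the variable `count`, and the second sentence are assumptions made for this example.

```python
import os
import tempfile

# Write to a temporary directory so no existing file is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'programming.txt')
count = 3   # hypothetical number to store in the file

with open(path, 'w') as file_obj:
    file_obj.write("I love programming.\n")                           # newline added by hand
    file_obj.write("I have written " + str(count) + " programs.\n")   # number converted with str()

# Read the file back to check what was written.
with open(path) as file_obj:
    written = file_obj.read()

print(written)
```

Without the `\n` at the end of the first string, both sentences would end up on the same line of the file.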

<h2>3. Appending to a file</h2>

If you want to add content to a file instead of writing over existing content, you can open the file in *append mode*. When you open a file in append mode, Python doesn't erase the contents of the file before returning the file object. Any lines you write to the file will be added at the end of the file. If the file doesn't exist yet, Python will create an empty file for you.
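
A minimal sketch of append mode, assuming a file that already holds one line. The temporary directory and the appended sentence are made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'programming.txt')

# Start with a file that already has some content.
with open(path, 'w') as file_obj:
    file_obj.write("I love programming.\n")

# Open in append mode: the existing content is kept, and new text goes at the end.
with open(path, 'a') as file_obj:
    file_obj.write("I also love finding meaning in large datasets.\n")

with open(path) as file_obj:
    lines = file_obj.readlines()

print(lines)
```

Opening the same file with `'w'` in the second step would have discarded the first line instead of keeping it.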


## 4. Quiz - Flying Circus Cast List

You're going to create a list of the actors who appeared in the television programme Monty Python's Flying Circus. 

Write a function called `create_cast_list` that takes a filename as input and returns a list of actors' names. It will be run on the file `flying_circus_cast.txt` (this information was collected from imdb.com). Each line of that file consists of an actor's name, a comma, and then some (messy) information about roles they played in the programme. You'll need to extract *only* the name and add it to a list. You might use the `.split()` method to process each line.


In [10]:
def create_cast_list(filename):
    cast_list = []
    # use with to open the file filename
    # use the for loop syntax to process each line
    # and add the actor name to cast_list
    with open(filename) as file:
        for line in file:
            # the name is everything before the first comma
            cast_list.append(line.split(',')[0])
    return cast_list

cast_list = create_cast_list('flying_circus_cast.txt')
for actor in cast_list:
    print(actor)

Graham Chapman
Eric Idle
Terry Jones
Michael Palin
Terry Gilliam
John Cleese
Carol Cleveland
Ian Davidson
John Hughman
The Fred Tomlinson Singers
Connie Booth
Bob Raymond
Lyn Ashley
Rita Davies
Stanley Mason
David Ballantyne
Donna Reading
Peter Brett
Maureen Flanagan
Katya Wyeth
Frank Lester
Neil Innes
Dick Vosburgh
Sandra Richards
Julia Breck
Nicki Howorth
Jimmy Hill
Barry Cryer
Jeannette Wild
Marjorie Wilde
Marie Anderson
Caron Gardner
Nosher Powell
Carolae Donoghue
Vincent Wong
Helena Clayton
Nigel Jones
Roy Gunson
Daphne Davey
Stenson Falke
Alexander Curry
Frank Williams
Ralph Wood
Rosalind Bailey
Marion Mould
Sheila Sands
Richard Baker
Douglas Adams
Ewa Aulin
Reginald Bosanquet
Barbara Lindley
Roy Brent
Jonas Card
Tony Christopher
Beulah Hughes
Peter Kodak
Lulu
Jay Neill
Graham Skidmore
Ringo Starr
Fred Tomlinson
David Hamilton
Suzy Mandel
Peter Woods
