# 08. File Input Output

Python lets us create, open, read, update and delete files.

For A-Level work, the most common file types are plain text (`.txt`) and comma-separated values (`.csv`).

#### <u>File Handling</u>

To open a file, use the built-in `open()` function.

The syntax for opening a file is: 

`open(filename, mode)`

It takes two <i>parameters</i>: `filename` and `mode`.

There are 4 main modes for opening a file:

<table>
   <tr>
    <th>Character</th>
    <th>Mode</th>
    <th>Function</th>
   </tr>
   <tr>
    <td>"r"</td>
    <td>Read</td>
    <td>Opens a file for reading (throws an error if the file does not exist)</td>
   </tr>
   <tr>
    <td>"w"</td>
    <td>Write</td>
    <td>Opens a file for writing (creates the file if it does not exist)</td>
   </tr>
   <tr>
    <td>"a"</td>
    <td>Append</td>
    <td>Opens a file for appending (creates the file if it does not exist)</td>
   </tr>
   <tr>
    <td>"x"</td>
    <td>Create</td>
    <td>Creates the specified file (throws an error if the file already exists)</td>
   </tr>
</table>

If no <code>mode</code> is supplied, the file is opened in read mode (<code>"r"</code>) by default.
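
The two error cases in the table can be seen in a small sketch. It assumes a throwaway folder made with `tempfile` and a hypothetical file name `demo_modes.txt`, neither of which is part of this chapter's sample files, and it uses `try`/`except` only to catch the errors so the cell keeps running.

```python
import os
import tempfile

# Work in a throwaway folder so that no real files are touched.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo_modes.txt")  # hypothetical file name

# "r" on a file that does not exist raises FileNotFoundError.
try:
    f = open(path, "r")
    f.close()
    read_result = "opened"
except FileNotFoundError:
    read_result = "FileNotFoundError"

# "x" creates the file the first time it is used...
f = open(path, "x")
f.close()

# ...and raises FileExistsError once the file is already there.
try:
    f = open(path, "x")
    f.close()
    create_result = "created"
except FileExistsError:
    create_result = "FileExistsError"

print(read_result)    # FileNotFoundError
print(create_result)  # FileExistsError
```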

It is good practice to close every file with the <b>.close()</b> method once we are done with it. Data written to a file is often buffered, so changes may not appear in the file until it is closed.

<b>Example 1: </b>Opening a text file named sample1.txt

In [3]:
# assign the file object returned by open() to a variable

# use the open() function to open sample1.txt (read mode by default)
f1 = open("sample1.txt")

# use the close() method to close sample1.txt (important, as you will be marked for it)
f1.close()

#### <u>Reading a file</u>

To read the entire content of a file, use the `read()` method.

In [11]:
f1 = open('sample1.txt', 'r')

data = f1.read()

f1.close()

print(data)

Hello
World


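To read a single line of the file, use the `readline()` method. The string it returns includes the newline (`\n`) at the end of the line, which is why a blank line appears between the two printed lines in the output.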

In [8]:
# This code snippet reads the first two lines of sample1.txt

f1 = open('sample1.txt', 'r')

data = f1.readline() # calling it for the first time reads the FIRST line

data2 = f1.readline() # calling it AGAIN moves on to the next line, and so on for additional calls

f1.close()

print(data)
print(data2)

Hello

World
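
When the number of lines is not known in advance, `readline()` can be called in a loop, because it returns an empty string once the end of the file is reached. The sketch below assumes a hypothetical file `demo_lines.txt`, created in a temporary folder with the same two lines as sample1.txt.

```python
import os
import tempfile

# Create a small stand-in file (hypothetical name) in a temporary folder.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo_lines.txt")

f = open(path, "w")
f.write("Hello\nWorld")
f.close()

lines = []

f = open(path, "r")
line = f.readline()
while line != "":          # an empty string means there is nothing left to read
    lines.append(line)
    line = f.readline()    # move on to the next line
f.close()

print(lines)  # ['Hello\n', 'World']
```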


To read all the lines of the file at once, you can also use the `readlines()` method.

Unlike `read()`, which returns a single string, it returns a list. Each item in the list is one line of the file.

In the output below, there is a newline (`\n`) after Hello because "World" starts on the next line.

What do you observe from `readlines()` compared to `read()`?

In [9]:
f1 = open('sample1.txt', 'r')

data = f1.readlines()

f1.close()

print(data)

['Hello\n', 'World']


<b>Example 2:</b> Given that there are numbers in the file sample2.txt, each on a new row, find the sum of all the numbers. (using `readlines()`)

In [14]:
f2 = open('sample2.txt', 'r')

data = f2.readlines()

print("Raw data is: ", data)

total = 0

for n in data:
    number = int(n.strip())  # can you guess what .strip() does?
    total += number

f2.close()

print("The total sum is: ", total)

Raw data is:  ['2\n', '4\n', '7\n', '5\n', '4\n', '1\n', '4']
The total sum is:  27
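
Real data files sometimes contain blank lines, and `int("")` raises a `ValueError`. As a hedged variation on Example 2, the sketch below skips empty rows before converting. The file `demo_numbers.txt` and its contents are made up for illustration.

```python
import os
import tempfile

# Create a stand-in file (hypothetical name) that contains a blank line.
folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo_numbers.txt")

f = open(path, "w")
f.write("2\n4\n\n7\n")
f.close()

f = open(path, "r")
data = f.readlines()
f.close()

total = 0

for n in data:
    text = n.strip()       # remove the newline and any surrounding spaces
    if text != "":         # skip rows that are empty after stripping
        total += int(text)

print("Raw data is: ", data)
print("The total sum is: ", total)  # 13
```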


#### <u>Writing to a file</u>


To write to an existing file, the mode passed to the `open()` function must be one of:

- `"a"` : Append - will append to the end of the file
- `"w"` : Write - will overwrite any existing content of the file

<b>Example 3:</b> Append content to the file "sample3.txt"

In [17]:
f3 = open("sample3.txt" , "r")

initial = f3.read()
print("Initial data is: ",initial)

f3.close()

# now we will use the "a" mode

f3 = open("sample3.txt", "a")

f3.write("Now the file has more content!")
 
f3.close()

# lets read the new content of the file

f3 = open("sample3.txt" , "r")

print("New content is: ", f3.read())

f3.close()

Initial data is:  This is the content of sample3 text file!
New content is:  This is the content of sample3 text file!Now the file has more content!


<b>Example 4:</b> Opening "sample3.txt" and overwriting its content

In [18]:
f3 = open("sample3.txt", "w")
f3.write("Oops i have deleted the content of sample3!")
f3.close()

f3 = open("sample3.txt" , "r")
print(f3.read())
f3.close()

Oops i have deleted the content of sample3!
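
Note that `write()` does not add a newline for us, which is why the appended text in Example 3 ends up on the same line as the original content. A minimal sketch of writing one item per line, assuming a hypothetical file `demo_write.txt` in a temporary folder:

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo_write.txt")  # hypothetical file name

# "w" creates the file (or clears it) before writing.
f = open(path, "w")
f.write("first line\n")    # the "\n" ends the line explicitly
f.write("second line\n")
f.close()

# "a" keeps the existing content and adds to the end.
f = open(path, "a")
f.write("third line\n")
f.close()

f = open(path, "r")
lines = f.readlines()
f.close()

print(lines)  # ['first line\n', 'second line\n', 'third line\n']
```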


#### <u>Using the split() method</u>

Data in files is usually separated using `,` (commas), `" "` (white spaces), `\n` (newlines), etc.

The <b>.split()</b> method of a string conveniently extracts the data separated by a particular character into a list. When no argument is passed into the method, the string is split on any run of whitespace (spaces, tabs and newlines).

In [20]:
# With no argument

row1 = "1 2 3 4 5"
output1 = row1.split()
print(output1)

# With argument, split using comma

row2 = "6,7,8,9,10"
output2 = row2.split(",")
print(output2)

['1', '2', '3', '4', '5']
['6', '7', '8', '9', '10']
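
The difference between passing no argument and passing `" "` shows up when values are separated by more than one space. The string below is made up for illustration.

```python
row3 = "1  2   3"   # two spaces, then three spaces

# No argument: any run of whitespace counts as a single separator.
output3 = row3.split()
print(output3)  # ['1', '2', '3']

# Explicit " ": every single space is a separator, so empty strings appear.
output4 = row3.split(" ")
print(output4)  # ['1', '', '2', '', '', '3']
```

For data read from files, calling `split()` with no argument is usually the safer choice when the values are separated by spaces.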


#### <u>Using the strip() method</u>

Notice that when we use the `readlines()` method, there is a `\n` newline character at the end of each row (except possibly the last), marking where the next row begins.

When no argument is passed into it, the `strip()` method of a string removes leading and trailing whitespace, including this newline character at the end of each line. Whitespace in the middle of the string is left untouched.

Here are some examples of how to use it.

In [26]:
word1 = "      hello     " # removing leading and trailing white space
print(word1.strip())

word2 = "   he   llo    "
print(word2.strip())

word3 = "   he  llo \n\n" # removing newline characters
print(word3.strip())

hello
he   llo
he  llo


You can also specify your own characters to strip from both ends of the string!

In [27]:
word4 = ",,,hello,"
print(word4.strip(','))

word5 = "hey hey hey hhh"
print(word5.strip('h'))

hello
ey hey hey 
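
Two related behaviours are worth a quick sketch: the argument to `strip()` is treated as a set of characters rather than a single word, and the string method `rstrip()` (not covered above) strips from the right-hand end only. The sample strings are made up for illustration.

```python
word6 = "xyxhelloyyx"
stripped_both = word6.strip("xy")   # removes any mix of 'x' and 'y' from both ends
print(stripped_both)                # hello

line = "   indented text\n"
newline_only = line.rstrip("\n")    # removes the trailing newline, keeps the leading spaces
everything = line.strip()           # removes whitespace from both ends

print(repr(newline_only))  # '   indented text'
print(repr(everything))    # 'indented text'
```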


#### <u><b>WITH</b> statement</u>

In Python, the `with` statement is used when working with an external resource, such as a text file or a database file.

It ensures that the file is closed automatically when the nested code finishes running, or when an exception occurs that might otherwise leave the file open and put the stored data at risk.

Here is how to use the `with` statement.

In [30]:
numbers = []

with open("sample2.txt") as f5: # with open(filename, mode) as <variable>:
    data = f5.readlines()
    for n in data: 
        number_to_append = int(n.strip())
        numbers.append(number_to_append)

print(numbers)

[2, 4, 7, 5, 4, 1, 4]
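
To check that the file really is closed, we can look at the `closed` attribute of the file object after the `with` block ends. This is a small sketch using a hypothetical file `demo_with.txt` in a temporary folder, with a deliberate error in the second block.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo_with.txt")  # hypothetical file name

with open(path, "w") as f:
    f.write("2\n4\n")

closed_after_write = f.closed   # the block has ended, so the file is closed

try:
    with open(path) as f:
        number = int(f.readline().strip())   # reads 2
        number = int("oops")                 # deliberately raises ValueError
except ValueError:
    print("Something went wrong while reading")

closed_after_error = f.closed   # still closed, even though an error occurred

print(closed_after_write, closed_after_error)  # True True
```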


#### <u><b>CSV</b> Module</u>

The <b>CSV (Comma Separated Values)</b> format is the most common import and export format for spreadsheets and databases. Python's `csv` module offers another way for us to read from and write to files.

Take a look at how the <b>csv.reader()</b> function behaves, paying close attention to the second argument supplied to it.

In [32]:
import csv

with open("sample4.txt") as f1:
    csv_file = csv.reader(f1, delimiter=',')
    for line in csv_file:
        print(line)

['Hello', 'World', 'This', 'Is', 'The', 'Content', 'Of', 'Sample4', 'Text', 'File']
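
The `delimiter` argument tells the reader which character separates the values. As an illustration, the sketch below reads made-up data separated by semicolons. It uses `io.StringIO` to stand in for an open file so that no sample file is needed.

```python
import csv
import io

# Made-up data; io.StringIO behaves like an open text file.
text = "Hello;World\n1;2;3\n"
fake_file = io.StringIO(text)

rows = []
for line in csv.reader(fake_file, delimiter=";"):
    rows.append(line)   # each line is already a list of strings

print(rows)  # [['Hello', 'World'], ['1', '2', '3']]
```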


The following code shows how the <b>csv.writer()</b> function and the <b>.writerow()</b> method behave. Take a look at the files that are created after you run the code.

In [33]:
import csv

lst1 = ['eagle', 'fox', 'giraffe', 'horse']
lst2 = ['iguana', 'jaguar', 'kangaroo', 'llama']

with open("sample5.txt", "w") as f1:
    csv_file = csv.writer(f1)
    csv_file.writerow(lst1)
    csv_file.writerow(lst2)

with open("sample6.txt", "w") as f1:
    csv_file = csv.writer(f1, delimiter = "\n")
    csv_file.writerow(lst1)
    csv_file.writerow(lst2)
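
To see what the writer produces without opening the file by hand, we can read it straight back. The sketch below assumes a hypothetical file `demo_animals.csv` in a temporary folder. It also passes `newline=""` to `open()`, which the `csv` module documentation recommends so that no extra blank rows appear on some platforms.

```python
import csv
import os
import tempfile

folder = tempfile.mkdtemp()
path = os.path.join(folder, "demo_animals.csv")  # hypothetical file name

lst1 = ['eagle', 'fox', 'giraffe', 'horse']
lst2 = ['iguana', 'jaguar', 'kangaroo', 'llama']

with open(path, "w", newline="") as f1:
    csv_file = csv.writer(f1)   # values are separated by commas by default
    csv_file.writerow(lst1)     # each call writes one row
    csv_file.writerow(lst2)

# Read the file back as plain text...
with open(path) as f1:
    text = f1.read()

# ...and again as rows of values.
rows = []
with open(path, newline="") as f1:
    for row in csv.reader(f1):
        rows.append(row)

print(text)
print(rows)
```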

<font color=orange>Tip: </font>A CSV file can also be treated like a text file in which the values are separated by ",". Such a file can easily be handled with the `split()` method.
Since the delimiter of a CSV file is a comma, you can apply `.split(",")` to each row of data to get each individual value!
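
One caveat: `.split(",")` only works cleanly when no value contains a comma itself. The sketch below uses a made-up row with a quoted value to compare the two approaches.

```python
import csv

# Made-up row: the middle value contains a comma, so it is wrapped in quotes.
row = 'apple,"banana, ripe",cherry'

simple = row.split(",")              # splits at every comma, even inside the quotes
parsed = list(csv.reader([row]))[0]  # csv.reader understands the quotes

print(simple)  # ['apple', '"banana', ' ripe"', 'cherry']
print(parsed)  # ['apple', 'banana, ripe', 'cherry']
```

For simple files without quoted values, `.split(",")` is enough.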