# ISE224 LectureNote 6-1: How to work with File I/O
---

**Topics**
1. Introduction to File Input and Output
2. Processing Records
---

### Introduction to File Input and Output

**Concept: When a program needs to save data for later use, it writes the data in a file. The data can be read from the file at a later time.**

So far, the programs you have developed require the user to enter data each time they run. This is because data stored in RAM and referenced by variables disappears once the program stops running. To preserve data between runs, the program needs a way to save it. Data saved in a `file`, which is usually stored on a computer's disk, remains available after the program stops, and it can be retrieved and used at a later time.

#### Writing Data

Programmers usually refer to the process of saving data in a file as **“writing data”** to the file. When a piece of data is written to a file, it is ***copied from a variable in RAM to the file***. This is illustrated in the following figure. The term output file is used to describe a file that data is written to. It is called an `output file` because the program stores `output` in it.

<img src="https://raw.githubusercontent.com/cxc1920/ISE224/main/pictures/6-1.png">

#### Reading Data  

Retrieving data from a file is referred to as **“reading data”** from the file. When data is read, it is transferred from the file to RAM and associated with a variable. This concept is depicted in the following figure. A file that provides data for reading is known as an `input file`, since it supplies input to the program.

<img src="https://raw.githubusercontent.com/cxc1920/ISE224/main/pictures/6-2.png">

#### How does Python read and write files?

There are always three steps that must be taken when a file is used by a program.

- **Open the file.** ***Opening a file creates a connection between the file and the program***. Opening an output file usually creates the file on the disk and allows the program to write data to it. Opening an input file allows the program to read data from the file.

- **Process the file.** In this step, data is either written to the file (if it is an output file) or read from the file (if it is an input file).

- **Close the file.** When the program is finished using the file, the ***file must be closed***. Closing a file disconnects the file from the program.
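The three steps above can be sketched in a few lines. This is only an illustration: it uses Python's `open` function, which these notes introduce in a later section, and it writes to a hypothetical file `demo.txt` inside a temporary folder (created with the standard `tempfile` module) so that no real file is touched.

```python
import os
import tempfile

# Hypothetical file name, placed in a temporary folder so that
# running the example does not overwrite any real file.
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

# Step 1: open the file (here as an output file).
outfile = open(path, 'w')

# Step 2: process the file (write one line of text).
outfile.write('hello, file\n')

# Step 3: close the file.
outfile.close()

# The same three steps again, this time treating it as an input file.
infile = open(path, 'r')     # open
contents = infile.read()     # process
infile.close()               # close

print(contents)
```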

#### Types of Files

In general, there are two types of files: **text** and **binary**.

- A `text` file contains data that has been encoded as text, using a scheme such as ASCII or Unicode. Even if the file contains numbers, those numbers are stored in the file as a series of characters. As a result, the file may be opened and viewed in a text editor such as Notepad.

<img src="https://raw.githubusercontent.com/cxc1920/ISE224/main/pictures/6-3.png">

-  A `binary` file contains data that has not been converted to text. The data that is stored in a binary file is intended only for a program to read. As a consequence, you cannot view the contents of a binary file with a text editor.

<img src="https://raw.githubusercontent.com/cxc1920/ISE224/main/pictures/6-4.png">
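To make the difference concrete, the following sketch shows the same number encoded both ways without touching any file. The 4-byte little-endian layout is just one possible binary encoding, chosen here for illustration; real binary file formats define their own layouts.

```python
number = 1234

# Text encoding: each digit becomes one character (one byte in ASCII).
as_text = str(number).encode('ascii')

# One possible binary encoding (an assumption for this sketch):
# the integer's raw value stored in 4 bytes, least significant byte first.
as_binary = number.to_bytes(4, 'little')

print(as_text)    # b'1234'
print(as_binary)  # b'\xd2\x04\x00\x00'
```

The text form is readable in an editor as the characters `1234`. The binary form holds the same value, but its bytes do not correspond to readable characters.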


#### File Access Methods

Most programming languages provide two different ways to access data stored in a file: sequential access and direct access.  

- **Sequential Access:** When you work with a sequential access file, you access data from the beginning of the file to the end of the file. If you want to read a piece of data that is stored at the very end of the file, you have to read all of the data that comes before it—you cannot jump directly to the desired data.

- **Direct Access:** When you work with a direct access file (which is also known as a random access file), you can jump directly to any piece of data in the file without reading the data that comes before it. This is similar to the way a CD player or an MP3 player works. You can jump directly to any song that you want to listen to.  

In this class, we will use *sequential access files*. Sequential access files are easy to work with, and you can use them to gain an understanding of basic file operations.
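For comparison only, here is a rough sketch of the two access methods in Python. It relies on the file object's `seek` method and the binary modes `'wb'` and `'rb'`, none of which are otherwise used in these notes, and the file name `songs.bin` and its contents are made up for the example.

```python
import os
import tempfile

# Hypothetical file holding three 4-byte "songs", in a temporary folder.
path = os.path.join(tempfile.mkdtemp(), 'songs.bin')
outfile = open(path, 'wb')
outfile.write(b'AAAABBBBCCCC')
outfile.close()

# Sequential access: read everything that comes before the third item.
infile = open(path, 'rb')
first = infile.read(4)
second = infile.read(4)
third = infile.read(4)
infile.close()

# Direct access: jump straight to byte offset 8 and read from there.
infile = open(path, 'rb')
infile.seek(8)
jumped = infile.read(4)
infile.close()

print(third)   # b'CCCC'
print(jumped)  # b'CCCC'
```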

#### Filenames and File Objects

Most computer users are accustomed to the fact that files are identified by a `filename`. For example, when you create a document with a word processor and save the document in a file, you have to specify a filename. When you use a utility such as Windows Explorer to examine the contents of your disk, you see a list of filenames. The following figure shows how three files named `cat.jpg`, `notes.txt`, and `resume.docx` might be graphically represented in Windows.

<img src="https://raw.githubusercontent.com/cxc1920/ISE224/main/pictures/6-5.png">


Each operating system has its own rules for naming files. Many systems support the use of filename extensions, which are short sequences of characters that appear at the end of a filename preceded by a period (which is known as a “dot”). For example, the files depicted in the prior figure have the extensions `.jpg`, `.txt`, and `.docx`. The extension usually indicates the type of data stored in the file.

- For example, the `.jpg` extension usually indicates that the file contains a graphic image that is compressed according to the JPEG image standard. 

- The `.txt` extension usually indicates that the file contains text. 

- The `.doc` extension (as well as the .docx extension) usually indicates that the file contains a Microsoft Word document.


In order for a program to work with a file on the computer’s disk, the program must create a **file object** in memory. A **file object** is an object that is associated with a *specific file* and provides a way for the program to work with that file. In the program, a variable references the file object. This variable is used to carry out any operations that are performed on the file. 

### Opening a File

You use the `open` function in Python to open a file. 

The `open` function creates a **file object** and associates it with a file on the disk. Here is the general format of how the open function is used:

**Syntax:**
```
file_variable = open(filename, mode)
```

In the general format:

- **file_variable** is the name of the variable that will reference the file object.

- **filename** is a string specifying the name of the file.

- **mode** is a string specifying the mode (reading, writing, etc.) in which the file will be opened.

<img src="https://raw.githubusercontent.com/cxc1920/ISE224/main/pictures/6-6.png">


#### Example.

For example, suppose the file `customers.txt` contains customer data, and we want to open it for reading. Here is an example of how we would call the open function:

```
customer_file = open('customers.txt', 'r')
```

After this statement executes, the file named `customers.txt` will be opened, and the variable `customer_file` will reference a file object that we can use to read data from the file.

Suppose we want to create a file named `sales.txt` and write data to it. Here is an example of how we would call the open function:

```
sales_file = open('sales.txt', 'w')
```

After this statement executes, the file named `sales.txt` will be created, and the variable `sales_file` will reference a file object that we can use to write data to the file.
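As a small illustration, the file object returned by `open` can be inspected through a few of its attributes. The sketch below assumes a throwaway `sales.txt` in a temporary folder rather than a file in your working folder.

```python
import os
import tempfile

# Throwaway location so the example does not create files in your project.
path = os.path.join(tempfile.mkdtemp(), 'sales.txt')

sales_file = open(path, 'w')

# The variable references a file object, not the file's contents.
print(type(sales_file).__name__)  # TextIOWrapper
print(sales_file.name == path)    # True: the object remembers its filename
print(sales_file.mode)            # w

sales_file.close()
```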

---

#### Warning!!!!!

Remember, when you use the 'w' mode, you are creating the file on the disk. If a file with the specified name already exists when the file is opened, the contents of the existing file will be deleted.

---
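The warning can be checked with a short experiment. The sketch below uses a hypothetical file `notes.txt` in a temporary folder: it writes some data, reopens the file in `'w'` mode without writing anything, and then reads back what is left.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'notes.txt')

# Create the file and store one line in it.
f = open(path, 'w')
f.write('important data\n')
f.close()

# Reopen the same file in 'w' mode and close it without writing.
f = open(path, 'w')
f.close()

# Read back whatever is left in the file.
f = open(path, 'r')
remaining = f.read()
f.close()

print(repr(remaining))  # '' : the earlier contents are gone
```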

### Specifying the Location of a File

When you pass a filename that does not contain a path as an argument to the `open` function, the Python interpreter looks for the file in the current working directory, which is normally the folder that contains the program. For example, suppose a program is located in the following folder on a Windows computer:

`C:\Users\cxc1920\Documents`

If the program is running and it executes the following statement, the file test.txt is created in the same folder:

`test_file = open('test.txt', 'w')`

If you want to open a file in a different location, you can specify a path as well as a filename in the argument that you pass to the open function. If you specify a path in a string literal (particularly on a Windows computer), be sure to prefix the string with the letter **r**. Here is an example:

`test_file = open(r'C:\Users\cxc1920\temp\test.txt', 'w')`

- This statement creates the file `test.txt` in the folder `C:\Users\cxc1920\temp`. 

- The **r** prefix specifies that the string is a raw string. This causes the Python interpreter to read the **backslash characters** as literal backslashes. Without the **r** prefix, the interpreter would assume that the backslash characters were part of escape sequences, and an error would occur.

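Alternatively, you can write each backslash as a double backslash (`\\`) in an ordinary string:

```
test_file = open('C:\\tmp\\test.txt', 'w')
```

This is equivalent to the raw-string form:

```
test_file = open(r'C:\tmp\test.txt', 'w')
```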
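The effect of the `r` prefix can be seen by comparing the strings themselves, without opening any file. In this sketch the path `C:\tmp\test.txt` is only a sample string. Note that in this particular path the unprefixed version does not raise an error: each `\t` is silently read as a tab character, which gives the wrong path.

```python
raw_path = r'C:\tmp\test.txt'       # raw string: backslashes kept as-is
escaped_path = 'C:\\tmp\\test.txt'  # ordinary string with doubled backslashes
unescaped = 'C:\tmp\test.txt'       # ordinary string: each \t becomes a tab

print(raw_path == escaped_path)  # True
print('\t' in unescaped)         # True: the path now contains tab characters
print('\\' in unescaped)         # False: no backslashes are left
```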

### Writing Data to a File  

You have worked with several of Python’s library functions, and you have even written your own functions. Now, we will introduce you to another type of function, which is known as a `method`.

A `method` is a function that belongs to an `object` and **performs some operation using that object**. Once you have opened a file, you use the file object’s methods to perform operations on the file.

For example, file objects have a method named `write` that can be used to write data to a file. Here is the general format of how you call the `write` method:

`file_variable.write(string)`

Let’s assume `customer_file` references a file object, and the file was opened for writing with the `'w'` mode.

Here is an example of how we would write the string `'Charles Pace'` to the file:

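```
customer_file = open('customers.txt', 'w')
customer_file.write('Charles Pace')
```

```
12
```

The value `12` displayed after the cell is the return value of `write`: the number of characters that were written to the file.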

The following code shows another example:

```
name = "Charles Pace"
customer_file.write(name)
```

```
12
```

In Python, you use the file object’s `close` method to close a file. For example, the following statement closes the file that is associated with `customer_file`:

```
customer_file.close()
```
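Once a file is closed, its file object can no longer be used for reading or writing. The following sketch shows what happens if a program tries anyway. It uses a `try`/`except` block, which may not have been covered yet, purely to capture the error, and it writes to a temporary copy of `customers.txt`.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'customers.txt')

customer_file = open(path, 'w')
chars_written = customer_file.write('Charles Pace')
customer_file.close()

# Writing after close() raises a ValueError.
try:
    customer_file.write('Too late')
    error_message = None
except ValueError as err:
    error_message = str(err)

print(chars_written)         # 12
print(customer_file.closed)  # True
print(error_message)
```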

#### Example: open an output file, write data to it, then close it.

```
# Program 6 - 1
# This program writes three lines of data
# to a file.
def writefile():
    # Open a file named philosophers.txt using the 'w' mode.
    # This also creates a file object in memory and assigns
    # that object to the outfile variable.
    outfile = open('philosophers.txt', 'w')

    # Write the names of three philosophers
    # to the file.
    outfile.write('John Locke\n')
    outfile.write('David Hume\n')
    outfile.write('Edmund Burke\n')

    # Close the file.
    outfile.close()

writefile()
```

### Reading Data From a File

If a file has been opened for reading (using the `'r'` mode) you can use the file object’s read method to read its entire contents into memory. When you call the read method, it returns the file’s contents as a string. 

```
# Program 6 - 2
# This program reads and displays the contents
# of the philosophers.txt file.
def readfile():
    # Open a file named philosophers.txt.
    infile = open('philosophers.txt', 'r')

    # Read the file's contents.
    file_contents = infile.read()

    # Close the file.
    infile.close()

    # Print the data that was read into
    # memory.
    print(file_contents)

readfile()
```

```
John Locke
David Hume
Edmund Burke
```



Although the read method allows you to easily read the entire contents of a file with one statement, many programs need to read and process the items that are stored in a file **one at a time**. 

For example, suppose a file contains a series of sales amounts, and you need to write a program that calculates the total of the amounts in the file. The program would read each sale amount from the file and add it to an accumulator.

In Python, you can use the readline method to read a line from a file. (A line is simply a string of characters that are terminated with a \n.) The method returns the line as a string, including the \n.

```
# Program 6 - 3
# This program reads the contents of the
# philosophers.txt file one line at a time.
def Readbyline():
    # Open a file named philosophers.txt.
    infile = open('philosophers.txt', 'r')

    # Read three lines from the file
    line1 = infile.readline()
    line2 = infile.readline()
    line3 = infile.readline()

    # Close the file.
    infile.close()

    # Print the data that was read into
    # memory.
    print(line1)
    print(line2)
    print(line3)

Readbyline()
```

```
John Locke

David Hume

Edmund Burke

```



Before we examine the code, notice that a blank line is displayed after each line in the output. This is because each item that is read from the file ends with a newline character (\n). Later, you will learn how to remove the newline character.

<img src="https://raw.githubusercontent.com/cxc1920/ISE224/main/pictures/6-7.png">
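The sales-total scenario mentioned earlier can be sketched with `readline` and an accumulator. The file name `sales.txt` and the three amounts are invented for the example, and the `while` loop relies on the fact that `readline` returns an empty string once the end of the file is reached.

```python
import os
import tempfile

# Create a small sample file of sale amounts (made-up values).
path = os.path.join(tempfile.mkdtemp(), 'sales.txt')
outfile = open(path, 'w')
outfile.write('10.50\n')
outfile.write('20.25\n')
outfile.write('5.00\n')
outfile.close()

# Read the amounts one line at a time and add each to an accumulator.
infile = open(path, 'r')
total = 0.0
line = infile.readline()
while line != '':            # '' means the end of the file was reached
    total += float(line)     # float() tolerates the trailing '\n'
    line = infile.readline()
infile.close()

print(total)  # 35.75
```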


### Concatenating a Newline to a String

Program 6-1 wrote three string literals to a file, and each string literal ended with a \n escape sequence. In most cases, the data items that are written to a file are not string literals, but values in memory that are referenced by variables. This would be the case in a program that prompts the user to enter data and then writes that data to a file.

When a program writes data that has been entered by the user to a file, it is usually necessary to concatenate a `\n` escape sequence to the data before writing it. This ensures that each piece of data is written to a separate line in the file.

```
# Program 6 - 4
# This program gets three names from the user
# and writes them to a file.

def WriteNames2File():
    # Get three names.
    print('Enter the names of three friends.')
    name1 = input('Friend #1: ')
    name2 = input('Friend #2: ')
    name3 = input('Friend #3: ')

    # Open a file named friends.txt.
    myfile = open('friends.txt', 'w')

    # Write the names to the file.
    myfile.write(name1 + '\n') # myfile.write(f'{name1}\n')
    myfile.write(name2 + '\n')
    myfile.write(name3 + '\n')

    # Close the file.
    myfile.close()
    print('The names were written to friends.txt.')

WriteNames2File()
```

```
Enter the names of three friends.
Friend #1: Joe
Friend #2: Rose
Friend #3: Bob
The names were written to friends.txt.
```
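To see why the `\n` matters, the following sketch writes the same three names twice, once without and once with the newline, and compares what ends up in each file. The file names `joined.txt` and `separate.txt` are hypothetical and live in a temporary folder. The names are fixed values rather than user input so the example runs without prompting.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
joined_path = os.path.join(folder, 'joined.txt')
separate_path = os.path.join(folder, 'separate.txt')

name1, name2, name3 = 'Joe', 'Rose', 'Bob'

# Without a newline, the names run together on one line.
f = open(joined_path, 'w')
f.write(name1)
f.write(name2)
f.write(name3)
f.close()

# With a newline concatenated, each name gets its own line.
f = open(separate_path, 'w')
f.write(name1 + '\n')
f.write(name2 + '\n')
f.write(name3 + '\n')
f.close()

f = open(joined_path, 'r')
joined = f.read()
f.close()

f = open(separate_path, 'r')
separate = f.read()
f.close()

print(repr(joined))    # 'JoeRoseBob'
print(repr(separate))  # 'Joe\nRose\nBob\n'
```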


### Reading a String and Stripping the Newline from it

Sometimes complications are caused by the `\n` that appears at the end of the strings that are returned from the `readline` method. For example, did you notice in the sample output of Program 6-3 that a blank line is printed after each line of output? This is because each of the strings printed by the three `print` statements ends with a `\n` escape sequence. When the strings are printed, the `\n` causes an extra blank line to appear.

Any of the following, applied to the string returned by `readline`, removes the trailing `\n`:

- `[:-1]` 
- `.split('\n')[0]`
- `.rstrip('\n')` 

```
# Program 6 - 5
# This program reads the contents of the
# philosophers.txt file one line at a time.
# Use [:-1], .split('\n')[0], or .rstrip('\n') to remove the trailing '\n'.
def Readbyline():
    # Open a file named philosophers.txt.
    infile = open('philosophers.txt', 'r')

    # Read three lines from the file
    line1 = infile.readline().split('\n')[0]
    line2 = infile.readline().rstrip('\n')
    line3 = infile.readline()[:-1]

    # Close the file.
    infile.close()

    # Print the data that was read into
    # memory.
    print(line1)
    print(line2)
    print(line3)

Readbyline()
```

```
John Locke
David Hume
Edmund Burke
```
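The three techniques give the same result when the string really ends with `\n`, but they differ when it does not, for instance when the last line of a file was saved without a final newline. The sketch below compares them on two sample strings. No file is involved.

```python
with_newline = 'Edmund Burke\n'
without_newline = 'Edmund Burke'   # e.g. a last line with no final '\n'

# All three agree when the newline is present.
print(with_newline[:-1])              # Edmund Burke
print(with_newline.split('\n')[0])    # Edmund Burke
print(with_newline.rstrip('\n'))      # Edmund Burke

# Slicing always drops the last character, newline or not.
print(without_newline[:-1])           # Edmund Burk
print(without_newline.split('\n')[0]) # Edmund Burke
print(without_newline.rstrip('\n'))   # Edmund Burke
```

For this reason `.rstrip('\n')` is often the safer choice of the three, although any of them works for files written the way Program 6-1 writes them.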


### Appending Data to an Existing File

When you use the `'w'` mode to open an output file and a file with the specified filename already exists on the disk, the existing file will be deleted and a new empty file with the same name will be created. Sometimes you want to preserve an existing file and append new data to its current contents. Appending data to a file means writing new data to the end of the data that already exists in the file.

In Python, you can use the `'a'` mode to open an output file in append mode, which means the following.

- If the file already exists, it will not be erased. If the file does not exist, it will be created.  
- When data is written to the file, it will be written at the end of the file’s current contents.  

For example, assume the file `friends.txt` contains the following names, each in a separate line:

Joe  
Rose  
Bob  

The following program opens `friends.txt` in append mode and writes three more names after its existing contents:

```
# Program 6 - 6
myfile = open('friends.txt', 'a')
myfile.write('Matt\n')
myfile.write('Chris\n')
myfile.write('Suze\n')
myfile.close()
```
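The result of appending can be verified by reading the file back. This sketch recreates the scenario in a temporary folder, so it does not depend on an existing `friends.txt`: it first writes the three original names with `'w'`, then appends three more with `'a'`.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'friends.txt')

# Start with the three original names.
myfile = open(path, 'w')
myfile.write('Joe\n')
myfile.write('Rose\n')
myfile.write('Bob\n')
myfile.close()

# Append three more names; the existing contents are kept.
myfile = open(path, 'a')
myfile.write('Matt\n')
myfile.write('Chris\n')
myfile.write('Suze\n')
myfile.close()

# Read the whole file back to check the result.
myfile = open(path, 'r')
contents = myfile.read()
myfile.close()

print(contents)
```

The output lists all six names, one per line, with the appended names after the original ones.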

### Writing and Reading Numeric Data

Strings can be written directly to a file with the write method, but numbers must be converted to strings before they can be written. Python has a built-in function named str that converts a value to a string. For example, assuming the variable num is assigned the value 99, the expression str(num) will return the string '99'.

```
# Example.
myfile = open('tmp.txt', 'a+')
myfile.write(99)
myfile.close()
```

```
TypeError: write() argument must be str, not int
```

```
# Example
myfile = open('tmp.txt', 'a+')
myfile.write('99')
myfile.write('101')
myfile.close()
```

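If you open `tmp.txt` (assuming it was empty beforehand), you will find that the stored data is `99101`: the `write` method does not add a line break between items. To save each number on its own line, convert it to a string and concatenate a `\n`:

`outfile.write(str(num1) + '\n')`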

```
# Program 6 - 7
# This program demonstrates how numbers
# must be converted to strings before they
# are written to a text file.

def WriteNumber2File():
    # Open a file for writing.
    outfile = open('numbers.txt', 'w')

    # Get three numbers from the user.
    num1 = int(input('Enter a number: '))
    num2 = int(input('Enter another number: '))
    num3 = int(input('Enter another number: '))

    # Write the numbers to the file.
    outfile.write(str(num1) + '\n')
    outfile.write(str(num2) + '\n')
    outfile.write(str(num3) + '\n')

    # Close the file.
    outfile.close()
    print('Data written to numbers.txt')

WriteNumber2File()
```

```
Enter a number: 22
Enter another number: 14
Enter another number: -99
Data written to numbers.txt
```


When you read numbers from a text file, they are always read as `strings`. For example, suppose a program uses the following code to read the first line from the `numbers.txt` file:

```
infile = open('numbers.txt', 'r')
value = infile.readline()
print(value)
print(type(value))
infile.close()
```

```
22

<class 'str'>
```


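Recall from Chapter 2 that Python provides the built-in function `int` to convert a string to an integer, and the built-in function `float` to convert a string to a floating-point number. Both functions ignore leading and trailing whitespace, so the `\n` at the end of a line does not need to be removed first.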

For example, we could modify the code previously shown as follows:

```
infile = open('numbers.txt', 'r')
string_input = infile.readline()
value = int(string_input)
print(value)
print(type(value))
infile.close()
```

```
22
<class 'int'>
```


```
infile = open('numbers.txt', 'r')
value = int(infile.readline())
print(value)
print(type(value))
infile.close()
```

```
22
<class 'int'>
```
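A quick check, independent of any file, of how `int` and `float` treat the kind of strings that `readline` returns. The sample strings are made up. The last case matters because `readline` returns an empty string at the end of a file, and converting that raises a `ValueError` (captured here with `try`/`except` only so the example keeps running).

```python
# A line as readline() would return it, including the trailing newline.
print(int('22\n'))     # 22
print(float('3.5\n'))  # 3.5
print(int('-99\n'))    # -99

# An empty string (what readline() returns at end of file) cannot be converted.
try:
    int('')
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(conversion_failed)  # True
```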


A more complete demonstration. The contents of the numbers.txt file are read, converted to integers, and added together.

```
# Program 6 - 8
# This program demonstrates how numbers that are
# read from a file must be converted from strings
# before they are used in a math operation.

def main():
    # Open a file for reading.
    infile = open('numbers.txt', 'r')

    # Read three numbers from the file.
    num1 = int(infile.readline())
    num2 = int(infile.readline())
    num3 = int(infile.readline())

    # Close the file.
    infile.close()

    # Add the three numbers.
    total = num1 + num2 + num3

    # Display the numbers and their total.
    print(f'The numbers are: {num1}, {num2}, {num3}')
    print(f'Their total is: {total}')

# Call the main function.
if __name__ == '__main__':
    main()
```

```
The numbers are: 22, 14, -99
Their total is: -63
```


#### Note: What does `if __name__ == "__main__":` do in Python?

A Python program uses the condition `if __name__ == '__main__'` to run the code inside the `if` statement only when the program is run directly by the Python interpreter. The code inside the `if` statement is not executed when the file is imported as a module.
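A minimal sketch of the idea follows. The `main` function here is a hypothetical stand-in for a program's real work. Saved as a file and run directly, the script prints `'__main__'` and then calls `main`. Imported from another file, it prints the module's name instead and skips the call.

```python
def main():
    # Hypothetical stand-in for a program's real work.
    return 'main() was called'

# When this file is run directly, __name__ is the string '__main__'.
# When it is imported from another file, __name__ is the module's name.
print(f'__name__ is {__name__!r}')

if __name__ == '__main__':
    print(main())
```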