## **Objectives**

After completing this lab you will be able to:

* Read text files using Python libraries


---

## Download Data

In [2]:
# Download the example file to the current working directory (requires network access).

import urllib.request

url = 'https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/data/example1.txt'
filename = 'Example1.txt'
urllib.request.urlretrieve(url, filename)

# Confirm that the example file was downloaded
print(f"Downloaded {filename}")

Downloaded Example1.txt


---

## Reading Text Files

One way to read or write a file in Python is to use the built-in `open` function. The `open` function returns a **file object** that provides the methods and attributes you need to read, write, and manipulate the file. In this notebook, we will only cover **.txt** files. The first argument is the path to the file, including the file name. An example is shown below:

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/images/ReadOpen.png" width="500">

The mode argument is optional and the default value is **r**. In this notebook we only cover two modes:

<ul>
    <li><b>r</b>: Read mode for reading files</li>
    <li><b>w</b>: Write mode for writing files</li>
</ul>
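
As a minimal sketch of the two modes, the cell below writes a small scratch file with **w** and then reopens it without a mode argument to show that **r** is the default. The file name `modes_demo.txt` and the temporary directory are assumptions made for illustration; they are not part of the lab data.

```python
import os
import tempfile

# Hypothetical scratch file; the name "modes_demo.txt" is only for illustration.
demo_path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# "w" creates the file (or empties an existing one) and allows writing.
with open(demo_path, "w") as writer:
    writer.write("hello")

# No mode argument is given, so open() falls back to its default.
with open(demo_path) as reader:
    default_mode = reader.mode
    content = reader.read()

print(default_mode)
print(content)
```

Because **w** empties an existing file as soon as it is opened, it is worth double-checking the path before using it.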

For the next example, we will use the text file **Example1.txt**. The file is shown as follows:

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/images/ReadFile.png" width="100">

We read the file:

In [5]:
# Read the Example1.txt

example1 = "Example1.txt"
file1 = open(example1, "r")

We can view the attributes of the file.

The name of the file:

In [6]:
# Print the path of the file

file1.name

'Example1.txt'

The mode the file object is in:

In [7]:
# Print the mode of file, either 'r' or 'w'

file1.mode

'r'

We can read the file and assign its content to a variable:

In [8]:
# Read the file

FileContent = file1.read()
FileContent

'This is line 1 \nThis is line 2\nThis is line 3'

The `\n` is the newline character; it marks the point where a new line starts.
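
A short sketch of how the newline character behaves, using a string literal that copies the output above rather than reading the file:

```python
# Same text as the file content shown above, written as a literal.
text = "This is line 1 \nThis is line 2\nThis is line 3"

# repr() shows the escape sequence the way the notebook displays the string.
print(repr(text))

# "\n" is one character, not a backslash followed by the letter n.
newline_length = len("\n")
newline_count = text.count("\n")
print(newline_length, newline_count)

# splitlines() breaks the string at each newline and drops the newline itself.
lines = text.splitlines()
print(lines)
```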

We can print the file:

In [9]:
# Print the file with '\n' as a new line

print(FileContent)

This is line 1 
This is line 2
This is line 3


The file content is a string:

In [10]:
# Print out the type of the file content

type(FileContent)

str

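It is very important to close the file when you are done with it. Closing releases the file handle held by the operating system, and doing it explicitly means your code does not depend on when a particular Python implementation happens to clean up unused objects.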

In [12]:
# Close the file after we are done using it

file1.close()
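
When closing by hand, one common pattern is to place `close()` in a `finally` clause so that it runs even if reading fails. The sketch below assumes an in-memory `io.StringIO` object as a stand-in for the opened file, so that it runs without `Example1.txt`; the variable names are illustrative.

```python
import io

# io.StringIO stands in for open("Example1.txt", "r") in this sketch.
handle = io.StringIO("This is line 1 \nThis is line 2\nThis is line 3")
try:
    first_line = handle.readline()
finally:
    # This runs whether or not readline() raised an exception.
    handle.close()

print(first_line)
print(handle.closed)
```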

In [1]:
with open("Example1.txt", "r") as file1:
    file_stuff = file1.readline()

print(file_stuff)

This is line 1 



---

## A Better Way to Open a File

Using the `with` statement is better practice because it automatically closes the file, even if the code raises an exception. Python runs everything in the indented block and then closes the file object.

In [16]:
# Open the file using the 'with' statement

with open(example1, "r") as file1:
    FileContent = file1.read()
    print(FileContent)

This is line 1 
This is line 2
This is line 3


The file object is now closed. We can verify this by running the following cell:

In [17]:
# Verify the file has been successfully closed

file1.closed

True
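
The same check can be made for the exception case. The sketch below deliberately raises an error inside the `with` block and then inspects the `closed` attribute. The scratch file `with_demo.txt` and the `ValueError` are assumptions chosen for illustration.

```python
import os
import tempfile

# Hypothetical scratch file so the sketch does not depend on Example1.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")
with open(demo_path, "w") as writer:
    writer.write("This is line 1 \n")

try:
    with open(demo_path, "r") as reader:
        reader.read()
        # Simulate a failure inside the indented block.
        raise ValueError("something went wrong")
except ValueError:
    pass

# The with statement closed the file before the exception left the block.
was_closed = reader.closed
print(was_closed)
```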

The contents are still available, because they were saved in `FileContent` before the file was closed:

In [18]:
# Check the contents of the file

print(FileContent)

This is line 1 
This is line 2
This is line 3


The syntax can be a little confusing because the file object appears after the `as` keyword, and we never close the file explicitly. The figure below summarizes the steps:

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/images/ReadWith.png" width="500">

We don't have to read the entire file. For example, we can read the first four characters by passing 4 as an argument to the method `.read()`:

In [21]:
# Read the first four characters in the file

with open(example1, "r") as file1:
    print(file1.read(4))

This


Once the method `.read(4)` is called, the first 4 characters are read. If we call the method again, the next 4 characters are read, because the file object keeps track of its current position. The output of the following cell demonstrates the process for different arguments to the method `read()`:

In [22]:
# Read different amounts of characters

with open(example1,"r") as file1:
    print(file1.read(4))
    print(file1.read(4))
    print(file1.read(7))
    print(file1.read(15))

This
 is 
line 1 

This is line 2


The process is illustrated in the figure below, where each color represents the part of the file read by one call to the method `read()`:

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/images/read.png" width="500">


Here is an example of using the same file, but instead we read 16, 5, and then 9 characters at a time:

In [25]:
# Read different amounts of characters

with open(example1,'r') as file1:
    print(file1.read(16))
    print(file1.read(5))
    print(file1.read(9))

This is line 1 

This 
is line 2
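
Since each call to `read(n)` continues where the previous one stopped, a file can be consumed in fixed-size pieces until `read()` returns an empty string, which signals the end of the file. The following is a minimal sketch of that idea; it assumes an `io.StringIO` object holding the same text as `Example1.txt` in place of the real file, and the chunk size of 16 is arbitrary.

```python
import io

# io.StringIO stands in for the opened text file in this sketch.
original = "This is line 1 \nThis is line 2\nThis is line 3"
source = io.StringIO(original)

chunks = []
while True:
    chunk = source.read(16)
    # An empty string means there is nothing left to read.
    if chunk == "":
        break
    chunks.append(chunk)

print(chunks)
```

Joining the chunks back together reproduces the original text, which shows that no characters are skipped or read twice.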


We can also read one line of the file at a time using the method `readline()`:

In [28]:
# Read one full line

with open(example1,'r') as file1:
    print("First line: " + file1.readline())

First line: This is line 1 



We can also pass an argument to `readline()` to specify the maximum number of characters to read. However, unlike `read()`, `readline()` never reads past the end of the current line.

In [27]:
with open(example1,'r') as file1:

    # Does not read past the end of the line
    print(file1.readline(20))

    # Returns the next 20 characters
    print(file1.read(20))

This is line 1 

This is line 2
This 
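
The difference can be checked on a smaller scale. In this sketch, which again assumes an `io.StringIO` stand-in for the file, the first `readline()` call is given a limit shorter than the line and the second a limit far longer than what remains of it:

```python
import io

# io.StringIO stands in for the opened text file in this sketch.
source = io.StringIO("This is line 1 \nThis is line 2\nThis is line 3")

# The limit is shorter than the line, so only 4 characters come back.
part = source.readline(4)

# The limit is longer than the rest of the line, so reading stops at the newline.
rest = source.readline(100)

# read() has no such boundary and continues across lines.
across = source.read(20)

print(repr(part))
print(repr(rest))
print(repr(across))
```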


We can use a loop to iterate through each line:

In [30]:
# Iterate through each line using a for loop

# Open the file
with open(example1,'r') as file1:

    # Set the counter for the loop
    i = 0

    # Loop over the file one line at a time, print each line, and update the counter
    for line in file1:
        print("Iteration", str(i), ": ", line)
        i += 1

Iteration 0 :  This is line 1 

Iteration 1 :  This is line 2

Iteration 2 :  This is line 3
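
The blank lines in the output appear because each `line` still ends with `\n` and `print` adds a newline of its own. One possible variation, sketched here with an `io.StringIO` stand-in for the file, removes the trailing newline before printing and lets `enumerate` supply the counter:

```python
import io

# io.StringIO stands in for the opened text file in this sketch.
source = io.StringIO("This is line 1 \nThis is line 2\nThis is line 3")

cleaned = []
# enumerate() yields the counter and the line together, starting at 0.
for i, line in enumerate(source):
    # rstrip("\n") removes only the trailing newline and keeps other whitespace.
    cleaned.append(line.rstrip("\n"))
    print("Iteration", i, ":", cleaned[-1])
```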


We can use the method `readlines()` to save the lines of the text file to a list:

In [41]:
# Read all the lines from the text file and save them to a list

with open(example1,'r') as file1:
    FileAsList = file1.readlines()

Each element of the list corresponds to a line of text:

In [42]:
# Print the first line

FileAsList[0]

'This is line 1 \n'

In [43]:
# Print the second line

FileAsList[1]

'This is line 2\n'

In [34]:
# Print the third line

FileAsList[2]

'This is line 3'

In [44]:
# Print the entire list

FileAsList

['This is line 1 \n', 'This is line 2\n', 'This is line 3']

In [46]:
# Remove leading and trailing whitespace, including '\n', from each line

FileAsList = [line.strip() for line in FileAsList]
print(FileAsList)

['This is line 1', 'This is line 2', 'This is line 3']
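
Note that `strip()` removes all leading and trailing whitespace, which is why the space at the end of the first line is gone as well. If only the newline should be removed, `rstrip("\n")` is one alternative. The sketch below compares the two on a literal copy of the list shown earlier:

```python
# Literal copy of the list returned by readlines() above.
lines = ["This is line 1 \n", "This is line 2\n", "This is line 3"]

# strip() removes every leading and trailing whitespace character.
stripped = [line.strip() for line in lines]

# rstrip("\n") removes only trailing newline characters.
newline_only = [line.rstrip("\n") for line in lines]

print(stripped)
print(newline_only)
```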
