# Reading Files in Python

Estimated time needed: **40** minutes

## Objectives

After completing this lab you will be able to:

*   Read text files using Python libraries

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">
    <ul>
        <li><a href="#download">Download Data</a></li>
        <li><a href="#read">Reading Text Files</a></li>
        <li><a href="#better">A Better Way to Open a File</a></li>
    </ul>

</div>

---

<h2 id="download">Downloading Data over HTTP</h2>

In [1]:
## Import URL handling module 
import urllib.request

In [2]:
## Full path URL to the file for download
url = "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/data/example1.txt"
## Downloaded file name
downloaded_file = "DownloadedFile.txt"
## Download a file from the internet
urllib.request.urlretrieve(url, downloaded_file)

('DownloadedFile.txt', <http.client.HTTPMessage at 0x230e5d8db80>)

In [3]:
## Alternatively, download with wget, a command-line utility (it must be installed; it is not available by default on Windows)
!wget -O /resources/data/read_file_name.txt https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/data/example1.txt

'wget' is not recognized as an internal or external command,
operable program or batch file.
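
The error above appears because `wget` is not installed on this machine. A minimal sketch of a more portable approach is to check for the tool with `shutil.which` and otherwise fall back to `urllib.request`, as in the earlier cell. The helper names `choose_downloader` and `download` are hypothetical and introduced only for illustration.

```python
import shutil
import subprocess
import urllib.request


def choose_downloader():
    """Return "wget" if the wget executable is on PATH, otherwise "urllib"."""
    return "wget" if shutil.which("wget") else "urllib"


def download(url, destination):
    """Download url to destination with whichever tool is available."""
    if choose_downloader() == "wget":
        # Same flag as the shell command above: -O sets the output file
        subprocess.run(["wget", "-O", destination, url], check=True)
    else:
        # Pure-Python fallback that needs no external program
        urllib.request.urlretrieve(url, destination)
    return destination


# Show which tool would be used on this machine (no download is started here)
print(choose_downloader())
```

Calling `download(url, downloaded_file)` with the variables from the earlier cell would then work whether or not `wget` is present.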


<h2 id="read">Reading Text Files</h2>


One way to read or write a file in Python is to use the built-in `open()` function. The function returns a file object that contains the methods and attributes we need in order to read, save, and manipulate the file. In this notebook, we will only cover `.txt` files.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/images/ReadOpen.png" width="500" />


The mode argument is optional and the default value is `r`. In this notebook we only cover two modes:

- `r`: Read mode for reading files
- `w`: Write mode for writing files
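
As a small sketch of the two modes, the example below writes a throwaway file with `"w"` and then opens it without a mode argument to show that `"r"` is used by default. The temporary directory and the file name `example1.txt` are assumptions made so the example is self-contained.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example1.txt")  # hypothetical file name

    # "w" creates the file (or empties an existing one) for writing
    with open(path, "w") as sample_file:
        sample_file.write("This is line 1 \nThis is line 2\nThis is line 3")

    # No mode argument: open() falls back to "r"
    with open(path) as sample_file:
        default_mode = sample_file.mode
        content = sample_file.read()

print(default_mode)
print(content)
```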

The content of the example file is shown below.

```markdown
This is line 1
This is line 2
This is line 3
```

In [4]:
## Open the file in read mode
read_file_name = "DownloadedFile.txt"
read_file = open(read_file_name, "r")

We can view the attributes of the file.


In [5]:
## Print file name
print(read_file.name)
## Show the mode of the file 
print(read_file.mode)

DownloadedFile.txt
r
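
Besides `name` and `mode`, a file object also exposes `closed` and `encoding`. The sketch below inspects them on a throwaway file; the temporary path is an assumption for illustration, and the value of `encoding` depends on the platform, so no particular value is expected.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example1.txt")  # hypothetical file name
    with open(path, "w") as sample_file:
        sample_file.write("This is line 1 \n")

    read_file = open(path, "r")
    file_name = os.path.basename(read_file.name)
    closed_before = read_file.closed  # False while the file is open
    encoding = read_file.encoding     # platform-dependent default
    read_file.close()
    closed_after = read_file.closed   # True after close()

print(file_name, closed_before, closed_after, encoding)
```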


In [6]:
## Read the file content
file_content = read_file.read()
## The read-in file content as string
print(type(file_content))
## Print raw content
print(repr(file_content))
## Print the file content with "\n" as a new line
print(file_content)

<class 'str'>
'This is line 1 \nThis is line 2\nThis is line 3'
This is line 1 
This is line 2
This is line 3


`\n` is the newline character; it marks the end of a line.
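
To make the role of the newline concrete, the short sketch below works on a string with the same content as the example file. It shows that the escape sequence `\n` is a single character and that it separates the three lines.

```python
file_content = "This is line 1 \nThis is line 2\nThis is line 3"

# The escape sequence "\n" is one character, not two
newline_length = len("\n")

# Two newlines separate three lines
newline_count = file_content.count("\n")

# Splitting on the newline gives the individual lines
lines = file_content.split("\n")

print(newline_length)  # 1
print(newline_count)   # 2
print(lines)
```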

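It is important to close the file once we are done with it. Closing releases the operating-system resources held by the file object and does not rely on the interpreter to clean up, which keeps behavior consistent across Python versions and implementations.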

In [7]:
## Close file to free up resources
read_file.close()
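
When closing by hand, one common pattern is `try`/`finally`, so that `close()` still runs if an error occurs while the file is open. The following is a minimal sketch; the temporary file and the deliberately failing `int()` call are assumptions made for illustration.

```python
import os
import tempfile

error_seen = False

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example1.txt")  # hypothetical file name
    with open(path, "w") as sample_file:
        sample_file.write("This is line 1 \nThis is line 2\nThis is line 3")

    read_file = open(path, "r")
    try:
        first_line = read_file.readline()
        int(first_line)  # Raises ValueError: the line is not a number
    except ValueError:
        error_seen = True
    finally:
        # Runs whether or not an exception occurred
        read_file.close()

print(error_seen, read_file.closed)
```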

<h2 id="better">A Better Way to Open a File</h2>


Using the `with` statement is better practice. It automatically closes the file, even if the code raises an exception. Python runs everything in the indented block and then closes the file object.

In [8]:
## Automatically close the file with the "with" statement
with open(read_file_name, "r") as read_file:
	file_content = read_file.read()
	print(file_content)

This is line 1 
This is line 2
This is line 3


In [9]:
## Verify if the file is closed
print(read_file.closed)

True


The file is closed, but the content we read is still available in the variable `file_content`.

In [10]:
## Print the file content with "\n" as a new line
print(file_content)

This is line 1 
This is line 2
This is line 3


The syntax can look a little confusing because the file object comes after the `as` keyword. Also, we don't explicitly close the file.
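
The sketch below illustrates the automatic closing: an exception is raised on purpose inside the `with` block, and the file is closed afterwards anyway. It also shows that reading from a closed file raises a `ValueError`. The temporary file and the simulated `RuntimeError` are assumptions for illustration.

```python
import os
import tempfile

closed_after_error = None
message = ""

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example1.txt")  # hypothetical file name
    with open(path, "w") as sample_file:
        sample_file.write("This is line 1 \nThis is line 2\nThis is line 3")

    try:
        with open(path, "r") as read_file:
            raise RuntimeError("simulated failure inside the with block")
    except RuntimeError:
        # The with statement has already closed the file at this point
        closed_after_error = read_file.closed

    try:
        read_file.read()  # Reading a closed file is an error
    except ValueError as error:
        message = str(error)

print(closed_after_error)
print(message)
```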

We summarize the steps in the figure below.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/images/ReadWith.png" width="500" />

We don't have to read the entire file. For example, we can read just the first four characters.

In [11]:
## Read 1st N characters
with open(read_file_name, "r") as read_file:
	print(read_file.read(4))

This


Once the method `read(4)` is called, the first four characters are read. If we call the method again, the next four characters are read. The output of the following cell demonstrates this for different arguments to `read()`.

In [12]:
## Read a certain amount of characters each time
with open(read_file_name, "r") as read_file:
	print(read_file.read(4))
	print(read_file.read(4))
	print(read_file.read(7))
	print(read_file.read(15))

This
 is 
line 1 

This is line 2


See the illustration below. Each color represents the part of the file read after the method `read()` is called.

<img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-PY0101EN-SkillsNetwork/labs/Module%204/images/read.png" width="500" />

Here is an example using the same file, but instead we read 16, 5, and then 9 characters at a time.

In [13]:
## Read a certain amount of characters each time
with open(read_file_name, "r") as read_file:
	print(read_file.read(16))
	print(read_file.read(5))
	print(read_file.read(9))

This is line 1 

This 
is line 2
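
The same idea extends to reading a whole file in fixed-size pieces: `read(n)` returns an empty string once nothing is left, which can end a loop. A minimal sketch follows, assuming a chunk size of 16 characters and a temporary copy of the example file.

```python
import os
import tempfile

sample_text = "This is line 1 \nThis is line 2\nThis is line 3"
chunk_size = 16  # assumed size, chosen only for illustration
chunks = []

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example1.txt")  # hypothetical file name
    with open(path, "w") as sample_file:
        sample_file.write(sample_text)

    with open(path, "r") as read_file:
        while True:
            chunk = read_file.read(chunk_size)
            if chunk == "":  # Empty string: the end of the file was reached
                break
            chunks.append(chunk)

# The file has 45 characters, so the last chunk is shorter
print([len(chunk) for chunk in chunks])  # [16, 16, 13]
```

Joining the chunks with `"".join(chunks)` gives back the full file content.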


We can also use `readline()` to read one line of the file at a time.

In [14]:
## Read one line at a time
with open(read_file_name, "r") as read_file:
	print("Read a line: " + read_file.readline())
	print("Read a line: " + read_file.readline())

Read a line: This is line 1 

Read a line: This is line 2
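
The blank lines in the output above appear because each line returned by `readline()` keeps its trailing `\n` and `print()` adds another. `readline()` also returns an empty string at the end of the file, so it can drive a loop. The sketch below shows both points on a temporary copy of the example file (the path is an assumption).

```python
import os
import tempfile

lines_read = []

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example1.txt")  # hypothetical file name
    with open(path, "w") as sample_file:
        sample_file.write("This is line 1 \nThis is line 2\nThis is line 3")

    with open(path, "r") as read_file:
        while True:
            line = read_file.readline()
            if line == "":  # An empty line in the file would be "\n", not ""
                break
            # Drop the trailing newline before storing the line
            lines_read.append(line.rstrip("\n"))

for line in lines_read:
    print("Read a line: " + line)
```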



We can also pass an argument to `readline()` to specify the number of characters we want to read. However, unlike `read()`, `readline()` can only read one line at most.

In [15]:
with open(read_file_name, "r") as read_file:
	print(read_file.readline(20)) ## Does not read past the end of the line
	print(read_file.read(20)) ## Returns the next 20 characters, across line boundaries

This is line 1 

This is line 2
This 
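
When the argument is smaller than the line, `readline()` returns only part of the line, and the next call continues from that position. A minimal sketch of this, using a temporary copy of the example file (the path is an assumption):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example1.txt")  # hypothetical file name
    with open(path, "w") as sample_file:
        sample_file.write("This is line 1 \nThis is line 2\nThis is line 3")

    with open(path, "r") as read_file:
        first_part = read_file.readline(4)   # Stops after 4 characters
        rest_of_line = read_file.readline()  # Continues to the end of line 1
        next_line = read_file.readline(100)  # Stops at the end of line 2

print(repr(first_part))
print(repr(rest_of_line))
print(repr(next_line))
```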


We can use a loop to iterate through each line:


In [16]:
## Iterate through the lines in file
with open(read_file_name, "r") as read_file:
	for i, line in enumerate(read_file):
		print(f"Iteration {i}: {line}")

Iteration 0: This is line 1 

Iteration 1: This is line 2

Iteration 2: This is line 3
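
A variation on the loop above, sketched under the assumption that we want compact output: `enumerate` supplies the counter (starting at 1 here) and `rstrip("\n")` removes the newline that each line carries, so no blank lines are printed. The temporary file path is an assumption for illustration.

```python
import os
import tempfile

numbered_lines = []

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example1.txt")  # hypothetical file name
    with open(path, "w") as sample_file:
        sample_file.write("This is line 1 \nThis is line 2\nThis is line 3")

    with open(path, "r") as read_file:
        # Iterating over the file object yields one line at a time
        for number, line in enumerate(read_file, start=1):
            cleaned = line.rstrip("\n")
            numbered_lines.append((number, cleaned))
            print(f"Line {number}: {cleaned}")
```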


We can use the method `readlines()` to read all the lines of the text file into a list:


In [17]:
## Read all lines as a list
with open(read_file_name, "r") as read_file:
	list_lines = read_file.readlines()
	print(list_lines)

['This is line 1 \n', 'This is line 2\n', 'This is line 3']


Each element of the list corresponds to a line of text:


In [18]:
## Print the 1st line
print(list_lines[0])
## Print the 2nd line
print(list_lines[1])
## Print the 3rd line
print(list_lines[2])

This is line 1 

This is line 2

This is line 3
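
The elements returned by `readlines()` keep their trailing `\n`, which is why blank lines appear in the output above. If the newlines are not wanted, one possible alternative is `read().splitlines()`. The sketch below compares the two on a temporary copy of the example file (the path is an assumption).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example1.txt")  # hypothetical file name
    with open(path, "w") as sample_file:
        sample_file.write("This is line 1 \nThis is line 2\nThis is line 3")

    with open(path, "r") as read_file:
        with_newlines = read_file.readlines()

    with open(path, "r") as read_file:
        # splitlines() splits on line boundaries and drops the newlines
        without_newlines = read_file.read().splitlines()

print(with_newlines)
print(without_newlines)
```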


---

Author(s):

- [Joseph Santarcangelo](https://www.linkedin.com/in/joseph-s-50398b136/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkPY0101ENSkillsNetwork19487395-2022-01-01)

Other Contributor(s):

- [Mavis Zhou](https://www.linkedin.com/in/jiahui-mavis-zhou-a4537814a?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkPY0101ENSkillsNetwork19487395-2022-01-01)