<h1>Reading and Writing Files in Python</h1>

<p><strong>Welcome!</strong> This notebook will teach you about reading and writing text files in the Python programming language.</p>

<h2 id="download">Download Data</h2>

In [None]:
# Download Example file

!wget -O Example1.txt https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/labs/example1.txt

--2020-08-06 03:42:22--  https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/labs/example1.txt
Resolving s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)... 67.228.254.196
Connecting to s3-api.us-geo.objectstorage.softlayer.net (s3-api.us-geo.objectstorage.softlayer.net)|67.228.254.196|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 45 [text/plain]
Saving to: ‘Example1.txt’


2020-08-06 03:42:22 (7.92 MB/s) - ‘Example1.txt’ saved [45/45]



<hr>

<h2 id="read">Reading Text Files</h2>

One way to read or write a file in Python is to use the built-in <code>open</code> function. The <code>open</code> function returns a <b>file object</b> that provides the methods and attributes you need to read, save, and manipulate the file. In this notebook, we will only cover <b>.txt</b> files. The first parameter is the file path, including the file name. An example is shown as follows:

<img src="https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/Chapter%204/Images/ReadOpen.png" width="500" />

The mode argument is optional and the default value is <b>r</b>. In this notebook we cover three modes:
<ul>
    <li><b>r</b> Read mode for reading files</li>
    <li><b>w</b> Write mode for writing files (an existing file is overwritten)</li>
    <li><b>a</b> Append mode for adding to the end of a file</li>
</ul>
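
As a minimal sketch of the optional mode argument, the cell below writes a small throwaway file and then reopens it without passing a mode. The file name <code>modes_demo.txt</code> and the temporary folder are assumptions made for illustration; they are not part of the course data.

```python
import os
import tempfile

# Hypothetical throwaway file, so the cell does not depend on Example1.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(demo_path, "w") as f:   # "w" creates the file (or empties an existing one)
    f.write("hello\n")

with open(demo_path) as f:        # no mode given, so the default "r" applies
    default_mode = f.mode
    text = f.read()

print(default_mode)
print(repr(text))
```

The first line printed is the mode of the second file object, which should be <code>r</code> even though no mode was passed.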

For the next example, we will use the text file <b>Example1.txt</b>. The file is shown as follows:

<img src="https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/Chapter%204/Images/ReadFile.png" width="200" />

 We read the file: 

In [None]:
# Read the Example1.txt

example1 = "Example1.txt"
file1 = open(example1, "r")

type(file1)

_io.TextIOWrapper

 We can view the attributes of the file.

The name of the file:

In [None]:
# Print the path of file

print(file1.name)

Example1.txt


 The mode the file object is in:

In [None]:
# Print the mode of file, either 'r' or 'w'

file1.mode

'r'

We can read the file and assign its content to a variable:

In [None]:
# Read the file

FileContent = file1.read()
FileContent

'This is line 1 \nThis is line 2\nThis is line 3'

The <b>\n</b> means that there is a new line.

We can print the file: 

In [None]:
# Print the file with '\n' as a new line

print(FileContent)

This is line 1 
This is line 2
This is line 3


The file content is of type string:

In [None]:
# Type of file content

type(FileContent)

str
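
Because the content is an ordinary string, the usual string methods apply to it. The sketch below uses a literal copy of the text shown above (so it runs without the file) and the hypothetical names <code>content</code> and <code>lines</code>.

```python
# Same text as Example1.txt, typed out literally so the cell runs on its own.
content = "This is line 1 \nThis is line 2\nThis is line 3"

print(repr(content))                 # repr shows the \n escapes instead of breaking lines
newline_count = content.count("\n")  # how many newline characters the string holds
lines = content.splitlines()         # split at line boundaries and drop the \n

print(newline_count)
print(lines)
```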

 We must close the file object:

In [None]:
# Close file after finish

file1.close()

<hr>

<h2 id="better">A Better Way to Open a File</h2>

Using the <code>with</code> statement is better practice because it automatically closes the file, even if the code encounters an exception. The code runs everything in the indented block and then closes the file object.

In [None]:
# Open file using with

with open(example1, "r") as file1:
    FileContent = file1.read()
    print(FileContent)

This is line 1 
This is line 2
This is line 3


The file object is closed. You can verify this by running the following cell:

In [None]:
# Verify if the file is closed

file1.closed

True
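
The claim that the file is closed even when an exception occurs can be checked with a small experiment. This is only a sketch: the file name <code>with_demo.txt</code>, the temporary folder, and the deliberately raised <code>ValueError</code> are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical throwaway file, so the cell runs without Example1.txt.
demo_path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")
with open(demo_path, "w") as f:
    f.write("This is line 1\n")

handle = None
try:
    with open(demo_path, "r") as f:
        handle = f          # keep a reference so the object can be inspected later
        f.read()
        raise ValueError("simulated failure inside the with block")
except ValueError as err:
    print("caught:", err)

closed_after_error = handle.closed
print(closed_after_error)   # True: the with statement closed the file anyway
```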

 We can see the info in the file:

In [None]:
# See the content of file

print(FileContent)

This is line 1 
This is line 2
This is line 3


The syntax is a little confusing because the file object is named after the <code>as</code> keyword, and we don’t explicitly close the file. The figure below summarizes the steps:

<img src="https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/Chapter%204/Images/ReadWith.png" width="500" />

We don’t have to read the entire file. For example, we can read the first four characters by passing 4 as an argument to the method <code>.read()</code>:


In [None]:
# Read first four characters

with open(example1, "r") as file1:
    print(file1.read(4))

This


Once the method <code>.read(4)</code> is called, the first 4 characters are read. If we call the method again, the next 4 characters are read. The output of the following cell demonstrates the process for different inputs to the method <code>read()</code>:

In [None]:
# Read certain amount of characters

with open(example1, "r") as file1:
    print(file1.read(4))
    print(file1.read(4))
    print(file1.read(7))
    print(file1.read(15))

This
 is 
line 1 

This is line 2


The process is illustrated in the figure below. Each color represents the part of the file read by one call to the method <code>read()</code>:

<img src="https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/Chapter%204/Images/ReadChar.png" width="500" />

 Here is an example using the same file, but instead we read 16, 5, and then 9 characters at a time: 

In [None]:
# Read certain amount of characters

with open(example1, "r") as file1:
    print(file1.read(16))
    print(file1.read(5))
    print(file1.read(9))

This is line 1 

This 
is line 2
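
Since each call to <code>read(n)</code> continues where the previous one stopped, a loop can walk through a whole file in fixed-size pieces. The sketch below is one possible way to do this. It uses <code>io.StringIO</code> as a stand-in for an open text file (an assumption, so that no file on disk is needed) and a chunk size of 16 chosen only for illustration.

```python
import io

# Same text as Example1.txt; StringIO behaves like a file opened in text mode.
text = "This is line 1 \nThis is line 2\nThis is line 3"
fake_file = io.StringIO(text)

chunks = []
while True:
    chunk = fake_file.read(16)   # read up to 16 characters from the current position
    if chunk == "":              # an empty string means the end has been reached
        break
    chunks.append(chunk)

print(chunks)
```

Joining the chunks back together gives the original text, and the last chunk is shorter than 16 characters because the data runs out.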


We can also read one line of the file at a time using the method <code>readline()</code>: 

In [None]:
# Read one line

with open(example1, "r") as file1:
    print("first line: " + file1.readline())

first line: This is line 1 
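
Repeated calls to <code>readline()</code> return successive lines, and an empty string once nothing is left. A minimal sketch, again assuming <code>io.StringIO</code> as a stand-in for the open file:

```python
import io

# Same text as Example1.txt, held in memory instead of on disk.
fake_file = io.StringIO("This is line 1 \nThis is line 2\nThis is line 3")

first = fake_file.readline()    # includes the trailing "\n"
second = fake_file.readline()
third = fake_file.readline()    # last line; the data has no final "\n"
at_end = fake_file.readline()   # nothing left to read

print(repr(first), repr(second), repr(third), repr(at_end))
```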



 We can use a loop to iterate through each line: 


In [None]:
# Iterate through the lines

with open(example1, "r") as file1:
    for i, line in enumerate(file1):
        print("Iteration", str(i), ": ", line)

Iteration 0 :  This is line 1 

Iteration 1 :  This is line 2

Iteration 2 :  This is line 3
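
The blank lines in the output above appear because each line read from the file still ends in <code>\n</code>, and <code>print</code> adds a newline of its own. One way to avoid this, sketched here with <code>io.StringIO</code> standing in for the file (an assumption so the cell runs on its own), is to strip the trailing newline before printing:

```python
import io

# Same text as Example1.txt, held in memory instead of on disk.
fake_file = io.StringIO("This is line 1 \nThis is line 2\nThis is line 3")

cleaned = []
for i, line in enumerate(fake_file):
    stripped = line.rstrip("\n")   # remove only the trailing newline, keep other spaces
    cleaned.append(stripped)
    print("Iteration", i, ":", stripped)
```

Using <code>rstrip("\n")</code> rather than <code>strip()</code> is a deliberate choice here: it leaves the trailing space of the first line untouched.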


We can use the method <code>readlines()</code> to save the text file to a list: 

In [None]:
# Read all lines and save as a list

with open(example1, "r") as file1:
    FileasList = file1.readlines()

 Each element of the list corresponds to a line of text:

In [None]:
# Print the first line

FileasList[0]

'This is line 1 \n'

In [None]:
# Print the second line

FileasList[1]

'This is line 2\n'

In [None]:
# Print the third line

FileasList[2]

'This is line 3'
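
Like <code>read()</code>, the method <code>readlines()</code> starts from the current position, so a second call on the same file object returns an empty list. The sketch below illustrates this with <code>io.StringIO</code> as a stand-in for the file. The call to <code>seek(0)</code>, which moves back to the start, goes slightly beyond what this notebook covers and is included only to show the effect.

```python
import io

# Same text as Example1.txt, held in memory instead of on disk.
fake_file = io.StringIO("This is line 1 \nThis is line 2\nThis is line 3")

all_lines = fake_file.readlines()   # every line, each keeping its "\n"
again = fake_file.readlines()       # position is already at the end

fake_file.seek(0)                   # go back to the beginning
rewound = list(fake_file)           # iterating yields the same lines as readlines()

print(all_lines)
print(again)
```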

<h2 id="write">Writing Files</h2>

We can open a file object and use the method <code>write()</code> to save text to a file. To write, the mode argument must be set to <b>w</b>. Let’s write a file <b>Example2.txt</b> with the line: <b>“This is line A”</b>

In [None]:
# Write line to file

with open('Example2.txt', 'w') as writefile:
    writefile.write("This is line A")

 We can read the file to see if it worked:

In [None]:
# Read file

with open('Example2.txt', 'r') as testwritefile:
    print(testwritefile.read())

This is line A



In [None]:
# Write lines to file

with open('Example2.txt', 'w') as writefile:
    writefile.write("This is line A\n")
    writefile.write("This is line B\n")

The method <code>.write()</code> works similarly to the method <code>.readline()</code>, except that instead of reading a line it writes the given string at the current position in the file. A new line starts only where the string contains <b>\n</b>. The process is illustrated in the figure below. Each color in the grid represents the text added to the file by one method call.

<img src="https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/PY0101EN/Chapter%204/Images/WriteLine.png" width="500" />

You can check the file to see if your results are correct:

In [None]:
# Check whether write to file

with open('Example2.txt', 'r') as testwritefile:
    print(testwritefile.read())

This is line A
This is line B
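
Two details of <code>write()</code> are easy to miss: it does not add a newline by itself, and in text mode it returns the number of characters written. A minimal sketch, assuming a throwaway file named <code>write_demo.txt</code> in a temporary folder (both hypothetical):

```python
import os
import tempfile

# Hypothetical throwaway file, so Example2.txt is left untouched.
demo_path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")

with open(demo_path, "w") as f:
    n1 = f.write("This is line A")    # no "\n", so the next write continues this line
    n2 = f.write(" (continued)\n")    # the "\n" here ends the first line
    n3 = f.write("This is line B\n")

with open(demo_path, "r") as f:
    written = f.read()

print(n1, n2, n3)
print(repr(written))
```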



By setting the mode argument to append, <b>a</b>, you can add a new line to the end of the file as follows:

In [None]:
# Write a new line to text file

with open('Example2.txt', 'a') as testwritefile:
    testwritefile.write("This is line C\n")

 You can verify the file has changed by running the following cell:

In [None]:
# Verify if the new line is in the text file

with open('Example2.txt', 'r') as testwritefile:
    print(testwritefile.read())

This is line A
This is line B
This is line C
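
The difference between the two modes is easiest to see side by side. The sketch below assumes a throwaway file named <code>append_demo.txt</code> in a temporary folder (both hypothetical): appending keeps the existing text, while reopening with <b>w</b> discards it.

```python
import os
import tempfile

# Hypothetical throwaway file, so Example2.txt is left untouched.
demo_path = os.path.join(tempfile.mkdtemp(), "append_demo.txt")

with open(demo_path, "w") as f:   # create the file with one line
    f.write("first\n")

with open(demo_path, "a") as f:   # "a" adds to the end, existing text is kept
    f.write("second\n")

with open(demo_path, "r") as f:
    after_append = f.read()

with open(demo_path, "w") as f:   # "w" empties the file before writing
    f.write("third\n")

with open(demo_path, "r") as f:
    after_rewrite = f.read()

print(repr(after_append))
print(repr(after_rewrite))
```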



We can write a list of strings to a <b>.txt</b> file as follows:

In [None]:
# Sample list of text

Lines = ["This is line A\n", "This is line B\n", "Line 3 okay\n"]
Lines

['This is line A\n', 'This is line B\n', 'Line 3 okay\n']

In [None]:
# Write the strings in the list to text file

with open('Example3.txt', 'w') as writefile:
    for line in Lines:
        print(line)
        writefile.write(line)

This is line A

This is line B

Line 3 okay



 We can verify the file is written by reading it and printing out the values:  

In [None]:
# Verify if writing to file is successfully executed

with open('Example3.txt', 'r') as testwritefile:
    print(testwritefile.read())

This is line A
This is line B
Line 3 okay
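
As an alternative to the explicit loop, file objects also provide the method <code>writelines()</code>, which writes every string of a list in order. This is a sketch under assumptions: the file name <code>writelines_demo.txt</code> and the temporary folder are hypothetical, and the list is a copy of the one used above.

```python
import os
import tempfile

lines = ["This is line A\n", "This is line B\n", "Line 3 okay\n"]

# Hypothetical throwaway file, so Example3.txt is left untouched.
demo_path = os.path.join(tempfile.mkdtemp(), "writelines_demo.txt")

with open(demo_path, "w") as f:
    f.writelines(lines)        # writes each string as-is; it does not add "\n"

with open(demo_path, "r") as f:
    round_trip = f.readlines()

print(round_trip)
```

Despite its name, <code>writelines()</code> does not insert line endings, so each string has to carry its own <code>\n</code>, as the list above does.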



We can again append to the file by changing the second parameter to <b>a</b>. This adds the new text to the end of the file:

In [None]:
# Append the line to the file

with open('Example2.txt', 'a') as testwritefile:
    testwritefile.write("This is line E\n")

We can see the results of appending the file: 

In [None]:
# Verify if the appending is successfully executed

with open('Example2.txt', 'r') as testwritefile:
    print(testwritefile.read())

This is line A
This is line B
This is line C
This is line E



<h2 id="copy">Copy a File</h2> 

Let's copy the file <b>Example2.txt</b> to the file <b>Example3.txt</b>:

In [None]:
# Copy file to another

with open('Example2.txt', 'r') as readfile:
    with open('Example3.txt', 'w') as writefile:
        for line in readfile:
            writefile.write(line)

We can read the file to see if everything works:

In [None]:
# Verify if the copy is successfully executed

with open('Example3.txt','r') as testwritefile:
    print(testwritefile.read())

This is line A
This is line B
This is line C
This is line E
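
The line-by-line loop above shows how copying works. For everyday use, the standard library also offers <code>shutil.copyfile</code>, which copies a file's content in one call. The sketch below assumes two throwaway file names, <code>source_demo.txt</code> and <code>copy_demo.txt</code>, in a temporary folder (all hypothetical):

```python
import os
import shutil
import tempfile

# Hypothetical throwaway files, so Example2.txt and Example3.txt are left untouched.
folder = tempfile.mkdtemp()
src = os.path.join(folder, "source_demo.txt")
dst = os.path.join(folder, "copy_demo.txt")

with open(src, "w") as f:
    f.write("This is line A\nThis is line B\n")

shutil.copyfile(src, dst)      # copy the content of src into dst

with open(src, "r") as a, open(dst, "r") as b:
    same = a.read() == b.read()

print(same)
```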



After reading files, we can also write data into files and save them in different file formats such as <b>.txt</b>, <b>.csv</b>, and <b>.xls</b> (for Excel files). Let's take a look at some examples.

Now go to the directory to ensure the <b>.txt</b> file exists and contains the summary data that we wrote.
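
For the <b>.csv</b> format mentioned above, the standard library's <code>csv</code> module can be combined with <code>open</code>. This is a minimal sketch under assumptions: the rows, the file name <code>scores_demo.csv</code>, and the temporary folder are all invented for illustration.

```python
import csv
import os
import tempfile

# Invented sample rows; the first row acts as a header.
rows = [["name", "score"], ["A", 1], ["B", 2]]

# Hypothetical throwaway file in a temporary folder.
csv_path = os.path.join(tempfile.mkdtemp(), "scores_demo.csv")

# newline="" lets the csv module control line endings itself.
with open(csv_path, "w", newline="") as f:
    csv.writer(f).writerows(rows)

with open(csv_path, "r", newline="") as f:
    read_back = list(csv.reader(f))

print(read_back)
```

Note that values read back from a CSV file are strings, so the numbers written above come back as <code>"1"</code> and <code>"2"</code>.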