### A. Text Files and Lines

Recall that a Python string can be thought of as a sequence of characters. In a similar way, a text file can be thought of as a sequence of lines.

For example, consider the following sample of a text file:

    From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008
    Return-Path: <postmaster@collab.sakaiproject.org>
    Date: Sat, 5 Jan 2008 09:12:18 -0500
    To: source@collab.sakaiproject.org
    From: stephen.marquard@uct.ac.za
    Subject: [sakai] svn commit: r39772 - content/branches/
    Details: http://source.sakaiproject.org/viewsvn/?view=rev&rev=39772

This sample is in mbox format, a standard format for files containing multiple mail messages. The lines that start with “From ” separate the messages, and the lines that start with “From:” are part of the messages. For more information about the mbox format, see en.wikipedia.org/wiki/Mbox.

To break the file into lines, there is a special character that represents the “end of the line” called the newline character.

### B. Newline

In Python, the newline character is written as \n inside a string.

*(Even though this looks like two characters, it is actually a single character.)*

In [1]:
mystr = "A\nB"
print(mystr)

A
B


In [2]:
len(mystr)

3

**Note:** 
*So when we look at the lines in a file, we need to imagine that there is a special invisible character called the newline at the end of each line that marks the end of the line.*
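
To make the invisible newline visible, here is a minimal sketch that writes a tiny two-line file and prints each line with `repr()`. The file name `sample.txt` and its contents are made up for illustration; they are not the mbox file used in the rest of this notebook.

```python
import os
import tempfile

# Write a tiny two-line sample file (a stand-in, not the mbox file).
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", newline="") as f:
    f.write("Hello\nWorld\n")

# Collect the lines exactly as Python hands them to us.
with open(path) as f:
    lines = [line for line in f]

# repr() shows the trailing newline that print() would hide.
for line in lines:
    print(repr(line))
```

Each line comes back with its `\n` still attached, which is why printing lines straight from a file produces double spacing.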

### C. Reading Files

In [3]:
import os

In [4]:
os.chdir(r"C:\Users\Goutham-ROG\Downloads") #prefix the path with r (raw string) so the backslashes are not treated as escape characters

In [5]:
#File handle
fhand = open("mbox-short -S5.txt")

In [6]:
fhand

<_io.TextIOWrapper name='mbox-short -S5.txt' mode='r' encoding='cp1252'>

**Note:**
*The file handle does not contain the file's data; it is an object through which the data can be read.*

**1. Reading the data using a loop**

We can easily construct a for loop that reads through a file one line at a time, for example to print or count the lines:

In [None]:
#Reading all the lines:
for line in fhand:
    print(line)

In [7]:
#to get a specific line (here, line 6):
c = 0
fhand = open("mbox-short -S5.txt")
for line in fhand:
    c += 1
    if c == 6:
        print(line)
        break

X-Sieve: CMU Sieve 2.3



In [8]:
fhand = open("mbox-short -S5.txt")
for i, line in enumerate(fhand): #enumerate supplies the index i (starting at 0), so no separate counter is needed
    if i == 5:
        print(line)
        break

X-Sieve: CMU Sieve 2.3



In [9]:
#reading the first 5 lines
count = 0
fhand = open("mbox-short -S5.txt")
for line in fhand:
    print(line)
    count += 1
    if count == 5:
        break


From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008

Return-Path: <postmaster@collab.sakaiproject.org>

Received: from murder (mail.umich.edu [141.211.14.90])

	 by frankenstein.mail.umich.edu (Cyrus v2.3.8) with LMTPA;

	 Sat, 05 Jan 2008 09:14:16 -0500



In [10]:
#reading the first 5 lines, without the extra blank lines
count = 0
fhand = open("mbox-short -S5.txt")
for line in fhand:
    line = line.rstrip() #removes trailing whitespace, including the newline char at the end
    print(line)
    count += 1
    if count == 5:
        break

From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008
Return-Path: <postmaster@collab.sakaiproject.org>
Received: from murder (mail.umich.edu [141.211.14.90])
	 by frankenstein.mail.umich.edu (Cyrus v2.3.8) with LMTPA;
	 Sat, 05 Jan 2008 09:14:16 -0500


In [11]:
n = int(input("Enter the line no:")) #int() converts the typed text safely; eval() would run arbitrary input as code
fhand = open("mbox-short -S5.txt")
count = 0
for line in fhand:
    line = line.rstrip()
    count += 1
    if count == n:
        print(line)
        break

Enter the line no:5
	 Sat, 05 Jan 2008 09:14:16 -0500


In [12]:
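def l(n):
    #This function uses the global fhand without reopening it, so counting
    #starts from wherever the previous read stopped, not from the top of the file.
    #To count from the first line, reopen the file here:
    #fhand = open("mbox-short -S5.txt")
    for i, line in enumerate(fhand): #enumerate supplies the index i, so no separate counter is needed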
        if i == n-1:
            line = line.rstrip()
            print(line)
            break

l(25)

	by nakamura.uits.iupui.edu (8.12.11.20060308/8.12.11) with ESMTP id m05ECJVp010329


In [13]:
l(6)
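
The calls to `l` above read from a handle that had already been partly consumed, so their results depend on where the previous read stopped. The following is a minimal sketch of that behaviour on a made-up four-line file (`sample.txt` is hypothetical), together with `seek(0)` as one way to rewind:

```python
import os
import tempfile

# A small stand-in file with four known lines.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("one\ntwo\nthree\nfour\n")

fhand = open(path)
first = next(fhand).rstrip()               # reads the first line only

# A new loop continues after the line already read; it does not restart.
rest = [line.rstrip() for line in fhand]

# The handle is now exhausted, so another pass yields nothing.
again = [line.rstrip() for line in fhand]

fhand.seek(0)                              # rewind to the start of the file
restarted = [line.rstrip() for line in fhand]
fhand.close()

print(first, rest, again, restarted)
```

Reopening the file, as the earlier cells do, has the same effect as rewinding.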



In [14]:
#Total number of lines present in the file:
n_l = 0
fhand = open("mbox-short -S5.txt")
for i in fhand:
    n_l += 1
print("No. of lines:",n_l)


No. of lines: 1910


**2. Reading data using the read method for files**

In [None]:
#Reading data using read() method:
fhand = open("mbox-short.txt")



In [None]:
#length of the data


In [None]:
#Let's see how the data is read
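
The cells above are left empty, so here is a minimal sketch of what they might contain. It assumes a small made-up file in place of the mbox file, and the variable name `inp` is only illustrative. `read()` returns the entire file as one string, newlines included.

```python
import os
import tempfile

# Made-up two-line sample standing in for the mbox file.
sample = "From: a@example.com\nSubject: hello\n"
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w", newline="") as f:
    f.write(sample)

# read() pulls the whole file into a single string.
with open(path) as fhand:
    inp = fhand.read()

print(len(inp))    # length of the data, in characters
print(inp[:20])    # the first 20 characters of the data
```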


**Disadvantage:**

The read method loads the entire file into memory as a single string, so it should only be used if the file data will fit comfortably in the main memory of your computer. If the file is too large to fit in main memory, you should write your program to read the file in chunks using a for or while loop.
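
Reading in chunks with a while loop could look like the sketch below. The chunk size of 1024 characters and the generated sample file are arbitrary choices made for illustration.

```python
import os
import tempfile

# Generate a sample file of 2500 characters.
path = os.path.join(tempfile.mkdtemp(), "big.txt")
with open(path, "w", newline="") as f:
    f.write("x" * 2500)

chunk_size = 1024   # characters per read; an arbitrary choice
total = 0
n_chunks = 0
with open(path) as fhand:
    while True:
        chunk = fhand.read(chunk_size)  # at most chunk_size characters
        if not chunk:                   # empty string means end of file
            break
        total += len(chunk)
        n_chunks += 1

print("Read", total, "characters in", n_chunks, "chunks")
```

Only one chunk is held in memory at a time, so the approach works no matter how large the file is.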

### D. Letting the user choose the file name
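
This section has no code yet, so the following is a minimal sketch of the idea: keep the file name in a variable instead of hard-coding it, so the same code works on any file. The function name `count_subject_lines` and the sample file are hypothetical. The `input()` call is shown commented out so that the sketch runs without waiting for keyboard input.

```python
import os
import tempfile

def count_subject_lines(fname):
    """Count the lines in the named file that start with 'Subject:'."""
    count = 0
    with open(fname) as fhand:
        for line in fhand:
            if line.startswith("Subject:"):
                count += 1
    return count

# Build a small sample file so the sketch can run on its own.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as f:
    f.write("Subject: first\nTo: someone\nSubject: second\n")

# Interactive version: ask the user for the name instead.
# fname = input("Enter the file name: ")
fname = path
result = count_subject_lines(fname)
print("There were", result, "subject lines in", fname)
```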

### E. Using try, except and open
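
If the user types a name that does not exist, `open` raises an error and the program stops. A minimal sketch of guarding against that with try and except follows; the helper name `open_or_none` and the file names are hypothetical.

```python
import os
import tempfile

def open_or_none(fname):
    """Return an open file handle, or None if the file cannot be opened."""
    try:
        return open(fname)
    except OSError:  # covers FileNotFoundError and PermissionError
        print("File cannot be opened:", fname)
        return None

folder = tempfile.mkdtemp()

# A name that does not exist: the error is caught and reported.
missing = open_or_none(os.path.join(folder, "no-such-file.txt"))

# A file that does exist opens normally.
good_path = os.path.join(folder, "sample.txt")
with open(good_path, "w") as f:
    f.write("hello\n")

fhand = open_or_none(good_path)
if fhand is not None:
    first_line = fhand.readline().rstrip()
    fhand.close()
    print(first_line)
```

Catching `OSError` rather than using a bare `except` keeps unrelated mistakes, such as a misspelled variable name, from being hidden.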

### F. Searching through the file

a) For example, suppose we want to read a file and print only the lines that start with the prefix “From:”.
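
One possible approach is `str.startswith`. In this sketch an `io.StringIO` object holding a few lines from the sample at the top of the notebook stands in for the real file handle, so the code runs without the mbox file.

```python
import io

# A few lines from the sample above, standing in for the real file.
SAMPLE = (
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n"
    "Return-Path: <postmaster@collab.sakaiproject.org>\n"
    "From: stephen.marquard@uct.ac.za\n"
    "Subject: [sakai] svn commit: r39772 - content/branches/\n"
)

fhand = io.StringIO(SAMPLE)   # behaves like an open text file
from_lines = []
for line in fhand:
    line = line.rstrip()
    if line.startswith("From:"):   # the colon excludes the "From " separator
        from_lines.append(line)
        print(line)
```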

b) Can we get a list of the email addresses?
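
A minimal sketch: on each “From:” line the address is the second word, so splitting the line on whitespace is enough. The sample data is illustrative, and the address `someone@example.com` is made up.

```python
import io

# Illustrative data; the second address is hypothetical.
SAMPLE = (
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n"
    "From: stephen.marquard@uct.ac.za\n"
    "Subject: [sakai] svn commit: r39772 - content/branches/\n"
    "From: someone@example.com\n"
)

fhand = io.StringIO(SAMPLE)   # stands in for the open file
emails = []
for line in fhand:
    if line.startswith("From:"):
        words = line.split()
        if len(words) > 1:    # skip a "From:" line with no address
            emails.append(words[1])

print(emails)
```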

c) Extract lines which contain the string “@uct.ac.za” (i.e., they come from the University of Cape Town in South Africa):
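
The `in` operator tests whether one string occurs anywhere inside another, which is one simple way to do this. As before, the sketch uses an in-memory stand-in for the file, and `someone@example.com` is a made-up address.

```python
import io

SAMPLE = (
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n"
    "To: source@collab.sakaiproject.org\n"
    "From: stephen.marquard@uct.ac.za\n"
    "From: someone@example.com\n"
)

fhand = io.StringIO(SAMPLE)   # stands in for the open file
uct_lines = []
for line in fhand:
    line = line.rstrip()
    if "@uct.ac.za" in line:  # substring test, anywhere in the line
        uct_lines.append(line)
        print(line)
```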

d) How many emails were received from the University of Cape Town?
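
One way to count messages rather than lines is to look only at the “From ” separator lines, since each message has exactly one. This is a sketch under that assumption, run on illustrative data in which the second sender is made up.

```python
import io

# Two illustrative messages; the second sender is hypothetical.
SAMPLE = (
    "From stephen.marquard@uct.ac.za Sat Jan 5 09:14:16 2008\n"
    "From: stephen.marquard@uct.ac.za\n"
    "Subject: [sakai] svn commit: r39772 - content/branches/\n"
    "From someone@example.com Sat Jan 5 10:00:00 2008\n"
    "From: someone@example.com\n"
)

fhand = io.StringIO(SAMPLE)   # stands in for the open file
uct_count = 0
for line in fhand:
    # "From " (with a space) marks the start of a new message.
    if line.startswith("From ") and "@uct.ac.za" in line:
        uct_count += 1

print("Emails from the University of Cape Town:", uct_count)
```

Counting every line that contains “@uct.ac.za” would overcount, because the address appears on both the separator line and the “From:” header of the same message.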