## Python File Handling - Text Files


Many computer programs work with files, because files store information permanently. Reading from and writing to disk is slower than working in memory, so the operating system transfers disk data in large buffers. When you open a file, you get a "file handle"; opening sets up these buffers, and all reads and writes go through the handle. When the file is closed, any data still in the buffers is "flushed" to disk and the handle is "disconnected" from the file. That is why it is important to "close" every file you open for reading or writing.
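
The life cycle described above (open, write through the handle, flush, close) can be sketched as follows. This is only an illustration: the file name `buffer_demo.txt` and the use of a temporary directory are assumptions, while `flush()` and the `closed` attribute are standard on Python file objects.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, used only for this demo
path = os.path.join(tempfile.mkdtemp(), "buffer_demo.txt")

f = open(path, "w")
f.write("hello")               # the text may still sit in an in-memory buffer
f.flush()                      # push Python's buffer to the operating system
was_closed_before = f.closed   # False: the handle is still connected
f.close()                      # flushes remaining data and releases the handle
is_closed_after = f.closed     # True

# Reopen the file to confirm that the data reached the disk
check = open(path, "r")
content = check.read()
check.close()

print(was_closed_before, is_closed_after, content)
```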


### Text Files

A text file can be understood as a sequence of characters
consisting of letters, digits and other special
symbols. Files with extensions like .txt, .py, .csv, etc.
are some examples of text files. When we open a text file
using a text editor (e.g., Notepad), we see several lines
of text. However, the file contents are not stored in that
form internally. Rather, they are stored as a sequence
of bytes, each made up of 0s and 1s. An encoding scheme
such as ASCII or Unicode (e.g., UTF-8) defines which byte
values represent each character. So, when opening a
text file, the text editor translates the stored values
and shows us the equivalent characters in human-readable
form. For example, the ASCII value 65 will be displayed by a text
editor as the letter ‘A’, since the number 65 in the ASCII character set represents ‘A’.
Each line of a text file is terminated by a special character, called the End of Line (EOL). In Python, the EOL character is the newline, written as \n.
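
The mapping between characters and stored values can be explored with the built-in `ord()`, `chr()` and `str.encode()`. This is a small sketch; the sample characters are chosen only for illustration.

```python
# ord() gives the numeric code of a character, chr() goes the other way
code = ord("A")
char = chr(65)

# encode() turns text into the bytes that are actually stored on disk
ascii_bytes = "A\n".encode("ascii")      # the letter A followed by a newline
utf8_bytes = "\u00e9".encode("utf-8")    # e with an acute accent, not in ASCII

print(code, char, list(ascii_bytes), len(utf8_bytes))
```

Here the newline shows up as the value 10, and the accented character needs two bytes in UTF-8, which is why the encoding matters when a file contains non-ASCII text.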

### Binary Files

Binary files are also stored in terms of bytes (0s and 1s),
but unlike text files, these bytes do not represent the
ASCII values of characters. Rather, they represent the
actual content such as image, audio, video, compressed
versions of other files, executable files, etc. These files
are not human readable. Thus, trying to open a binary
file using a text editor will show some garbage values.
We need specific software to read or write the contents
of a binary file.
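
As a rough sketch of the difference, a file opened with the `"wb"` or `"rb"` mode deals in raw `bytes` objects rather than strings. The file name `demo.bin` and the byte values below are assumptions made for illustration.

```python
import os
import tempfile

# Hypothetical binary file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "demo.bin")

# "wb" writes raw bytes; no character encoding is applied
f = open(path, "wb")
f.write(bytes([0, 255, 65, 10]))
f.close()

# "rb" returns a bytes object instead of a str
f = open(path, "rb")
data = f.read()
f.close()

print(type(data), list(data))
```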

## open, close functions

In [1]:

# open a file for writing

# "file.txt" is the name of the file. "w" is the mode (write mode in this case).
# "w" mode overwrites any existing content of the file;
# use "a" instead to append more lines to the existing content

f = open("file.txt", "w")
f.close()

![file open modes](images/file_open_modes.png)
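
The effect of some common modes can be checked with a short experiment. This is a sketch under assumptions: the file name `modes_demo.txt` is hypothetical, and only the `"w"`, `"a"` and `"x"` modes are shown.

```python
import os
import tempfile

# Hypothetical file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

f = open(path, "w")      # creates the file (or empties an existing one)
f.write("first\n")
f.close()

f = open(path, "w")      # "w" again: the old content is discarded
f.write("second\n")
f.close()

f = open(path, "a")      # "a": new text goes after the existing content
f.write("third\n")
f.close()

try:
    open(path, "x")      # "x" creates a new file and fails if it already exists
    x_failed = False
except FileExistsError:
    x_failed = True

f = open(path, "r")
content = f.read()
f.close()

print(repr(content), x_failed)
```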

## The write() method

The write() method takes a string as an argument and writes
it to the text file. It returns the number of characters
written by that call.
write() does not add a newline automatically, so we need to add
a newline character (\n) wherever a line should end.

In [2]:
# open a file for writing

f = open("file.txt", "w")

# write a single line. Note the '\n' character. Without
# that "write" won't automatically put newline character!

f.write("this is simple text\n")
f.write("this is second line of text\n")

# close the file
f.close()
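
The return value of write() can be inspected directly. In this sketch the file name `write_demo.txt` is an assumption; the counts include the newline character when one is present.

```python
import os
import tempfile

# Hypothetical file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "write_demo.txt")

f = open(path, "w")
n1 = f.write("this is simple text\n")   # 19 visible characters plus the newline
n2 = f.write("no newline here")         # nothing is added automatically
f.close()

print(n1, n2)
```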

## The writelines() method

This method is used to write multiple strings to a file.
We need to pass an iterable such as a list or tuple
containing strings to the writelines() method. Like write(),
it does not add newline characters on its own.

In [3]:
# open a file for writing

f = open("file.txt", "w")


# write a single line. Note the '\n' character. Without
# that "write" won't automatically put newline character!

f.write("this is simple text\n")
f.write("this is second line of text\n")

# write more than one line at a time. Again "\n" needed explicitly
f.writelines([ "this is third line\n", "this is fourth line\n"])


# close the file
f.close()
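
When the strings do not already end in a newline, one possible approach is to add it while passing them to writelines(). This is a sketch; the list contents and the file name `writelines_demo.txt` are made up for illustration.

```python
import os
import tempfile

# Hypothetical file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "writelines_demo.txt")

lines = ["alpha", "beta", "gamma"]   # note: no trailing "\n"

f = open(path, "w")
# writelines() accepts any iterable of strings, including a generator
f.writelines(line + "\n" for line in lines)
f.close()

f = open(path, "r")
content = f.read()
f.close()

print(repr(content))
```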

## Using file as iterable object for reading line by line

In [4]:
# open the same file for reading. "r" (read) is the default mode
f = open("file.txt", "r")

# file can be iterated to get each line to process it
# Here we read each line from file and print it
for i in f:
    print(i)

f.close()

this is simple text

this is second line of text

this is third line

this is fourth line
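
The blank lines in the output above appear because each line read from the file still ends with "\n" and print() adds one more. A minimal sketch of two common ways around this follows; the file name `lines_demo.txt` and its content are assumptions.

```python
import os
import tempfile

# Hypothetical file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
f = open(path, "w")
f.writelines(["first line\n", "second line\n"])
f.close()

cleaned = []
f = open(path, "r")
for line in f:
    print(line, end="")                 # option 1: stop print() adding a newline
    cleaned.append(line.rstrip("\n"))   # option 2: remove the newline from the string
f.close()

print(cleaned)
```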



## The readline([n]) method

This method reads one complete line from a file, up to and
including the newline (\n) character that terminates it. It
can also be used to read at most a specified number (n) of
characters, stopping earlier if it reaches the newline
character (\n).

If no argument or a negative number is specified, it
reads a complete line and returns it as a string. At the end
of the file it returns an empty string.

In [5]:
myobject = open("file.txt",'r')
print(myobject.readline(10))
print(myobject.readline())
print(myobject.readline())
myobject.close()

this is si
mple text

this is second line of text



## The readlines() method

This method reads all the remaining lines and returns them
as a list of strings, each still ending with its newline character.

In [6]:
f = open("file.txt", "r")

# we can read all lines one shot into a list
lines = f.readlines()
print(type(lines))
for i in lines:
    print(i)
f.close()

<class 'list'>
this is simple text

this is second line of text

this is third line

this is fourth line



## read function to read entire file content as a string


### The read() method

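This method is used to read a specified number (n) of characters
from a text file (or bytes, if the file is opened in binary mode).
If n is omitted or negative, it reads the entire remaining content.
The syntax of the read() method is: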

```python
    file_object.read(n)
```

In [7]:
f = open("file.txt", "r")

# we can read entire file content in one-shot as a string
s = f.read()

print(s)
f.close()

this is simple text
this is second line of text
this is third line
this is fourth line



In [8]:
f = open("file.txt", "r")

# we can read next n characters as string
s = f.read(10)

print("first 10 chars:", s)

s = f.read(12) # read further 12 chars

print("next 12 chars:", s)

f.close()

first 10 chars: this is si
next 12 chars: mple text
th


## open file in 'append' mode

In [9]:
# open the same file for append. Append mode -> add more text at the end of file
# rather than starting from the beginning of the file

f = open("file.txt", "a")

# write a list of lines into the file
f.writelines(["This is fifth line\n", "This is sixth line\n"])

# close the file
f.close()


In [10]:
f = open("file.txt", "r")

# we can read entire file content in one-shot as a string
s = f.read()

print(s)
f.close()

this is simple text
this is second line of text
this is third line
this is fourth line
This is fifth line
This is sixth line



## readline function to read single line at a time

In [11]:
# read single line at a time

f = open("file.txt", "r")

while True:
    line = f.readline()
    if line == "":
        break
    print(line)
    
f.close()

this is simple text

this is second line of text

this is third line

this is fourth line

This is fifth line

This is sixth line



## handling file using 'with' statement

The with statement automatically closes the file when its block ends, so there is no need to remember to call close().

In [12]:
with open("file.txt", "r") as f:
    # we can read entire file content in one-shot as a string
    s = f.read()
    print(s)
    # file is closed automatically after with statement ends

this is simple text
this is second line of text
this is third line
this is fourth line
This is fifth line
This is sixth line
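
The with statement also closes the file when the block is left because of an error. The sketch below simulates this; the file name `with_demo.txt` and the raised `ValueError` are assumptions used only to show the behaviour.

```python
import os
import tempfile

# Hypothetical file in a temporary directory
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

try:
    with open(path, "w") as f:
        f.write("partial data\n")
        raise ValueError("simulated failure")   # hypothetical error inside the block
except ValueError:
    pass

# The name f still exists after the block, but the file is already closed
closed_after_error = f.closed

with open(path, "r") as f:
    saved = f.read()

print(closed_after_error, repr(saved))
```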



## Setting Offsets in a File

The methods that we have learnt so far access
the data in a file sequentially. To access
data at an arbitrary position instead, Python provides
the **seek()** and **tell()** methods.


### The tell() method

This function returns an integer that specifies the
current position of the file object in the file. The position
so specified is the byte position from the beginning of
the file till the current position of the file object. The
syntax of using tell() is:

```python
    file_object.tell()
```

### The seek() method
This method is used to position the file object at a
particular position in a file. The syntax of seek() is:

```python
    file_object.seek(offset [, reference_point])
```

In the above syntax, offset is the number of bytes by
which the file object is to be moved. reference_point
indicates the starting position of the file object. That is,
with reference to which position, the offset has to be
counted. It can have any of the following values:

* 0 - beginning of the file
* 1 - current position of the file
* 2 - end of file

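By default, the value of reference_point is 0, i.e.
the offset is counted from the beginning of the file. For
example, the statement fileObject.seek(5,0) will
position the file object at the 5th byte position from the
beginning of the file. For files opened in text mode, Python
only allows seeking relative to the beginning of the file; the
exceptions are seek(0, 1) and seek(0, 2), which use a zero offset.
Reference points 1 and 2 with a non-zero offset require a file
opened in binary mode.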

In [13]:
## Program 2-2 Application of seek() and tell()

print("Learning to move the file object")
fileobject = open("file.txt","r+")
print("File content: file.txt\n")
s = fileobject.read()
print(s)

print("Initially, the position of the file object is:", fileobject.tell())
fileobject.seek(0)

print("Now the file object is at the beginning of the file:", fileobject.tell())
fileobject.seek(5)

print("We are moving to 5'th byte position from the beginning of file")
print("The position of the file object is at", fileobject.tell())

print("File content from position 5\n")
s = fileobject.read()
print(s)

fileobject.close()

Learning to move the file object
File content: file.txt

this is simple text
this is second line of text
this is third line
this is fourth line
This is fifth line
This is sixth line

Initially, the position of the file object is: 125
Now the file object is at the beginning of the file: 0
We are moving to 5'th byte position from the beginning of file
The position of the file object is at 5
File content from position 5

is simple text
this is second line of text
this is third line
this is fourth line
This is fifth line
This is sixth line
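
Reference points 1 (current position) and 2 (end of file) with non-zero offsets work on files opened in binary mode. The following is a small sketch under that assumption; the file name `seek_demo.bin` and its ten-byte content are made up for illustration.

```python
import os
import tempfile

# Hypothetical file in a temporary directory, holding ten known bytes
path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")
with open(path, "wb") as f:
    f.write(b"0123456789")

with open(path, "rb") as f:
    f.seek(-3, 2)        # reference point 2: three bytes before the end
    tail = f.read()      # reads the last three bytes

    f.seek(2, 0)         # reference point 0: byte position 2 from the beginning
    f.seek(3, 1)         # reference point 1: three bytes forward from there
    pos = f.tell()       # current byte position
    middle = f.read(2)   # the two bytes starting at that position

print(tail, pos, middle)
```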



## Program 2-5 To perform reading and writing operation in a text file

In [22]:
# open file for w+ (write plus means write and read)

fileobject = open("report.txt", "w+")

print ("WRITING DATA IN THE FILE")
print() # to display a blank line
while True:
    line = input("Enter a sentence ")
    fileobject.write(line)
    fileobject.write('\n')
    choice = input("Do you wish to enter more data? (y/n): ")
    if choice in ('n','N'):
        break
        
print("The byte position of file object is ", fileobject.tell())
# now change file position
fileobject.seek(0) #places file object at beginning of file

print()
print("READING DATA FROM THE FILE")
s = fileobject.read()
print(s)
    
fileobject.close()

WRITING DATA IN THE FILE

Enter a sentence Hello World
Do you wish to enter more data? (y/n): y
Enter a sentence All the world's stage
Do you wish to enter more data? (y/n): n
The byte position of file object is  34

READING DATA FROM THE FILE
Hello World
All the world's stage



## standard input, output, error as files

Standard input (the keyboard), standard output and standard error (both the terminal screen by default) can be treated as files. These special file objects are available from the **sys** module as sys.stdin, sys.stdout and sys.stderr.

In [23]:
import sys

In [24]:
sys.stderr.write("Error!!")

Error!!

In [25]:
sys.stdout.write("Howdy")

Howdy

In [26]:
sys.stderr.writelines(["Error 1\n", "Error 2\n"])

Error 1
Error 2


In [27]:
sys.stdout.writelines(["hello\n", "howdy\n"])

hello
howdy
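
Because these streams behave like files, print() can send its output to any of them through its `file` argument. The sketch below uses a hypothetical helper named `report` and an in-memory `io.StringIO` object standing in for a file.

```python
import io
import sys

def report(message, stream=sys.stderr):
    """Hypothetical helper: write one line to the given file-like object."""
    print(message, file=stream)

report("Error 1")                    # goes to standard error

buffer = io.StringIO()               # an in-memory text file
report("captured", stream=buffer)    # same call, different destination
captured = buffer.getvalue()

sys.stdout.write(repr(captured) + "\n")
```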


In [28]:
# doesn't work in a Jupyter notebook. Try it with ipython
s = sys.stdin.read()

s

''

## file open with absolute path

f = open("c:\\mydir\\myfile.txt", "r")


or equivalently using a raw string file pathname

f = open(r"c:\mydir\myfile.txt", "r")


## file open with relative path

f = open("..\\parent_dir_file.txt", "r")

or equivalently using a raw string file pathname

f = open(r"..\parent_dir_file.txt", "r")
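
Another option is the standard `pathlib` module, which builds paths without manual backslash escaping. The sketch below uses `PureWindowsPath` so that it behaves the same on any operating system; the directory and file names are the illustrative ones from the examples above.

```python
from pathlib import PureWindowsPath

# The / operator joins path components; forward slashes are accepted as input
abs_path = PureWindowsPath("c:/mydir") / "myfile.txt"
rel_path = PureWindowsPath("..") / "parent_dir_file.txt"

print(abs_path)
print(rel_path)
```

On a real system, `pathlib.Path` can be used the same way and the resulting object can be passed directly to open().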

## CBSE Sample Question Paper (2020-21) Computer Science (083)

Write a function in Python that counts the number of “Me” or “My” words present in a text file “STORY.TXT”.

If the “STORY.TXT” contents are as follows:

```txt
My first book
was Me and
My Family. It 
gave me
chance to be
Known to the world.
```

The output of the function should be:

Count of Me/My in file: 4

In [8]:
# This is to prepare the input file

with open("story.txt", "w") as f:
    f.write("""
My first book
was Me and
My Family. It 
gave me
chance to be
Known to the world.    
    """)

In [10]:
def displayMeMy():
    num = 0
    # "rt" mode is read-text file mode. It is the same as "r"
    with open("story.txt", "rt") as f:
        text = f.read()
        words = text.split()
        for word in words:
            # compare in lowercase so that "me" and "my" are counted too,
            # which the expected output (4) requires
            if word.lower() in ("me", "my"):
                num = num + 1
    print("Count of Me/My in file:", num)
    
displayMeMy()

Count of Me/My in file: 4


## CBSE Sample Question Paper (2020-21) Computer Science (083)


Write a function AMCount() in Python, which should read each character of a text file STORY.TXT,
and should count and display the occurrences of the letters A and M (including lowercase a and m too).

Example:
    
If the file content is as follows:

```txt
Updated information
As simplified by official websites.
```

The AMCount() function should display the output as:

```txt
A or a: 4
M or m: 2
```

In [14]:
# This is to prepare the input file

with open("story.txt", "w") as f:
    f.write("""
Updated information
As simplified by official websites.  
    """)

In [15]:
def AMCount():
    with open("story.txt", "r") as f:
        A, M = 0, 0
        text = f.read()
        # iterating over a string yields one character at a time
        for ch in text:
            if ch == "A" or ch == "a":
                A = A + 1
            elif ch == "M" or ch == "m":
                M = M + 1
                
    print("A or a: ", A)
    print("M or m: ", M)
    
AMCount()

A or a:  4
M or m:  2
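
The same counts can also be obtained with the string method `count()` instead of an explicit loop. This is an alternative sketch, not the expected exam answer; the helper name `count_a_m` is hypothetical and it works on a string rather than reading the file itself.

```python
def count_a_m(text):
    """Hypothetical helper: return the counts of a/A and m/M in the text."""
    lowered = text.lower()          # makes the count case-insensitive
    return lowered.count("a"), lowered.count("m")

sample = "Updated information\nAs simplified by official websites.\n"
a_count, m_count = count_a_m(sample)

print("A or a:", a_count)
print("M or m:", m_count)
```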
