## Python File Handling - Text Files


Many computer programs work with files, which store information permanently. Reading from and writing to disk is much slower than working with memory, so file data passes through buffers rather than going to disk one character at a time. When you open a file, you get a "file handle" (in Python, a file object); opening sets up these buffers, and all reads and writes go through the handle. When the file is closed, any data still waiting in the buffers is "flushed" to the file and the handle is "disconnected" from it. That is why it is important to "close" every file you open, whether for reading or writing.
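The effect of closing a handle can be observed through the file object's `closed` attribute. The following is a minimal sketch, assuming a throwaway file called `demo.txt` (a name made up for this illustration) inside a temporary directory:

```python
import os
import tempfile

# Work in a temporary directory so that no real files are touched.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name

    f = open(path, "w")
    f.write("buffered text\n")   # may still be sitting in an in-memory buffer
    f.flush()                    # push Python's buffer out without closing
    closed_before = f.closed     # False: the handle is still connected
    f.close()                    # flushes whatever is left and releases the handle
    closed_after = f.closed      # True: the handle is now disconnected

print(closed_before, closed_after)
```

Calling `flush()` is optional here, since `close()` flushes as well; it is shown only to make the two steps visible.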


### Text Files

A text file can be understood as a sequence of characters consisting of letters, digits and other special symbols. Files with extensions such as .txt, .py and .csv are examples of text files. When we open a text file in a text editor (e.g., Notepad), we see several lines of text. Internally, however, the contents are not stored that way; they are stored as a sequence of bytes, each made up of 0s and 1s. An encoding scheme such as ASCII or a Unicode encoding (for example, UTF-8) defines which byte values represent each character. When opening a text file, the text editor translates these byte values back into characters that a human can read. For example, the value 65 is displayed by a text editor as the letter ‘A’, since the number 65 represents ‘A’ in the ASCII character set. Each line of a text file is terminated by a special End of Line (EOL) character, which Python represents as '\n'.
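The mapping between characters and byte values can be checked directly in Python. This small sketch uses only built-in functions; the sample characters are chosen just for illustration:

```python
# ord() gives the numeric code of a character; chr() goes the other way.
code = ord("A")
char = chr(65)

# encode() turns text into the bytes that would be stored in a file.
ascii_bytes = "A\n".encode("ascii")     # the letter A followed by a newline
utf8_bytes = "\u00e9".encode("utf-8")   # e with an acute accent, a non-ASCII character

print(code, char)          # 65 A
print(list(ascii_bytes))   # [65, 10]
print(len(utf8_bytes))     # 2
```

The last line shows that a character outside ASCII can take more than one byte in UTF-8.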

### Binary Files

Binary files are also stored as bytes (0s and 1s), but unlike text files, these bytes do not represent the encoded values of characters. Rather, they represent the actual content, such as an image, audio, video, a compressed version of other files, or an executable program. These files are not human readable, so opening a binary file in a text editor shows meaningless characters. We need specific software to read or write the contents of a binary file.
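In Python, binary files are handled by adding "b" to the mode, in which case reads and writes use `bytes` objects rather than strings. A minimal sketch, assuming a made-up file name `demo.bin` and four arbitrary byte values:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.bin")  # hypothetical file name

    # "wb" = write in binary mode: we pass bytes, not str
    f = open(path, "wb")
    f.write(bytes([0, 255, 65, 10]))
    f.close()

    # "rb" = read in binary mode: we get bytes back, with no decoding applied
    f = open(path, "rb")
    data = f.read()
    f.close()

print(type(data))
print(list(data))
```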

## open, write, close functions

In [2]:

# open a file for writing

# "file.txt" is the name of the file and "w" is the mode (write mode in this case).
# "w" mode overwrites any existing content of the file;
# you can use "a" to append more lines to the existing content instead

f = open("file.txt", "w")


# write a single line. Note the '\n' character:
# write() does not add a newline character automatically!

f.write("this is simple text\n")
f.write("this is second line of text\n")

# write more than one line at a time. Again "\n" needed explicitly
f.writelines([ "this is third line\n", "this is fourth line\n"])

# close the file
f.close()

![file open modes](images/file_open_modes.png)
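As a rough illustration of how some of the standard modes differ, the sketch below writes to the same file with "w", "w" again, and "a", then tries "x" (exclusive creation). The file name `modes_demo.txt` is made up for this example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "modes_demo.txt")  # hypothetical file name

    f = open(path, "w")      # "w" creates the file
    f.write("first\n")
    f.close()

    f = open(path, "w")      # "w" again: the old content is discarded
    f.write("second\n")
    f.close()

    f = open(path, "a")      # "a" keeps the content and adds to the end
    f.write("third\n")
    f.close()

    f = open(path, "r")      # "r" reads what is there now
    content = f.read()
    f.close()

    try:
        f = open(path, "x")  # "x" refuses to open a file that already exists
        f.close()
        exclusive_failed = False
    except FileExistsError:
        exclusive_failed = True

print(repr(content))         # 'second\nthird\n'
print(exclusive_failed)      # True
```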

## Using file as iterable object for reading line by line

In [3]:
# open the same file for reading. "r" (read) is the default mode
f = open("file.txt", "r")

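# a file can be iterated over to get one line at a time.
# Here we read each line from the file and print it.
# Each line keeps its trailing '\n' and print() adds one more newline,
# which is why the output shows a blank line after every line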
for i in f:
    print(i)

f.close()

this is simple text

this is second line of text

this is third line

this is fourth line
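The blank lines in the output above come from printing a line that already ends in '\n'. One possible way to avoid them is sketched here, using `io.StringIO` as an in-memory stand-in for the file so that the example runs on its own:

```python
import io

# io.StringIO behaves like an open text file but lives in memory.
f = io.StringIO("this is simple text\nthis is second line of text\n")

stripped = []
for line in f:
    # Option 1: tell print() not to add its own newline
    print(line, end="")
    # Option 2: remove the trailing newline from the line itself
    stripped.append(line.rstrip("\n"))

f.close()
print(stripped)
```

Either option works; `rstrip("\n")` is the usual choice when the lines are going to be processed further rather than just printed.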



## readlines function to read all lines as a list

In [4]:
f = open("file.txt", "r")

# we can read all lines in one shot into a list
lines = f.readlines()
print(type(lines))
for i in lines:
    print(i)
f.close()

<class 'list'>
this is simple text

this is second line of text

this is third line

this is fourth line
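Each string returned by `readlines()` still ends with '\n'. If the newlines are not wanted, one alternative is to read the whole content and call the string method `splitlines()`. A small sketch, again using `io.StringIO` in place of a real file:

```python
import io

f = io.StringIO("this is third line\nthis is fourth line\n")

with_newlines = f.readlines()             # every item ends with '\n'
f.seek(0)                                 # go back to the start of the file
without_newlines = f.read().splitlines()  # same lines, newlines removed
f.close()

print(with_newlines)
print(without_newlines)
```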



## read function to read entire file content as a string

In [5]:
f = open("file.txt", "r")

# we can read entire file content in one-shot as a string
s = f.read()

print(s)
f.close()

this is simple text
this is second line of text
this is third line
this is fourth line



## open file in 'append' mode

In [6]:
# open the same file for append. Append mode -> add more text at the end of file
# rather than starting from the beginning of the file

f = open("file.txt", "a")

# write a list of lines into the file
f.writelines(["This is fifth line\n", "This is sixth line\n"])

# close the file
f.close()


In [7]:
f = open("file.txt", "r")

# we can read entire file content in one-shot as a string
s = f.read()

print(s)
f.close()

this is simple text
this is second line of text
this is third line
this is fourth line
This is fifth line
This is sixth line



## read 'n' characters at a time

In [8]:
f = open("file.txt", "r")

# read only 'n' characters (here the first 22, counting the newline as one character)
s = f.read(22)

print(s)
f.close()

this is simple text
th
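Calling `read(n)` repeatedly continues from where the previous call stopped, and it returns an empty string once the end of the file is reached. This makes it possible to process a file in fixed-size chunks, as in the following sketch (the chunk size of 22 and the sample text are taken from the example above; `io.StringIO` stands in for the file):

```python
import io

text = "this is simple text\nthis is second line of text\n"
f = io.StringIO(text)

chunks = []
while True:
    chunk = f.read(22)   # read at most 22 characters
    if chunk == "":      # an empty string means end of file
        break
    chunks.append(chunk)

f.close()
print([len(c) for c in chunks])   # [22, 22, 4]
```

The last chunk is shorter because fewer than 22 characters were left.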


## readline function to read single line at a time

In [9]:
# read single line at a time

f = open("file.txt", "r")

while True:
    line = f.readline()
    if line == "":
        break
    print(line)
    
f.close()

this is simple text

this is second line of text

this is third line

this is fourth line

This is fifth line

This is sixth line



## handling file using 'with' statement

The with statement closes the file automatically when its block ends, even if an error occurs inside the block. There is no need to remember to call close().

In [10]:
with open("file.txt", "r") as f:
    # we can read entire file content in one-shot as a string
    s = f.read()
    print(s)
    # file is closed automatically after with statement ends

this is simple text
this is second line of text
this is third line
this is fourth line
This is fifth line
This is sixth line
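That the file really is closed after the block can be checked with the `closed` attribute. The sketch below also raises an error inside a with block on purpose, to show that the file is closed in that case too. The file name `with_demo.txt` is made up for this example:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "with_demo.txt")  # hypothetical file name

    with open(path, "w") as f:
        f.write("hello\n")
    closed_after_block = f.closed       # True: closed when the block ended

    try:
        with open(path, "r") as g:
            raise ValueError("simulated failure")
    except ValueError:
        closed_after_error = g.closed   # True: closed despite the error

print(closed_after_block, closed_after_error)
```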



## standard input, output, error as files

Standard input (the keyboard), standard output and standard error (both the terminal screen by default) can be treated as files. These special file objects are available from the **sys** module as sys.stdin, sys.stdout and sys.stderr.

In [11]:
import sys

In [12]:
sys.stderr.write("Error!!")

Error!!

In [14]:
sys.stdout.write("Howdy")

Howdy

In [20]:
sys.stderr.writelines(["Error 1\n", "Error 2\n"])

Error 1
Error 2


In [21]:
sys.stdout.writelines(["hello\n", "howdy\n"])

hello
howdy
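Because these are ordinary file objects, they can also be passed to `print()` through its `file` parameter, which is often more convenient than calling `write()` directly. A short sketch; the `io.StringIO` buffer is included only to show that any file-like object works the same way:

```python
import io
import sys

print("normal message", file=sys.stdout)        # same as a plain print()
print("something went wrong", file=sys.stderr)  # goes to standard error

# The same call works with any file-like object, e.g. an in-memory buffer.
buffer = io.StringIO()
print("captured", file=buffer)
captured = buffer.getvalue()
print(repr(captured))
```

Unlike `write()`, `print()` adds the trailing newline for you.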


In [32]:
# doesn't work in Jupyter notebook. Try with IPython
s = sys.stdin.read()

s

''

## file open with absolute path

f = open("c:\\mydir\\myfile.txt", "r")


or equivalently, using a raw string (prefix r), in which backslashes are not treated as escape characters:
    
f = open(r"c:\mydir\myfile.txt", "r")


## file open with relative path

f = open("..\\parent_dir_file.txt", "r")

or equivalently, using a raw string (prefix r), in which backslashes are not treated as escape characters:
    
f = open(r"..\parent_dir_file.txt", "r")
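As an alternative to writing backslashes by hand, the standard `pathlib` module can build paths from their parts. The sketch below uses `PureWindowsPath` so that it prints Windows-style paths on any operating system; in a real program one would more likely use `pathlib.Path`, which follows the conventions of the machine it runs on, and pass the result to `open()`. The directory and file names are the ones from the examples above:

```python
from pathlib import PureWindowsPath

# Forward slashes are accepted and converted to Windows separators.
abs_path = PureWindowsPath("c:/mydir/myfile.txt")

# The / operator joins path components.
rel_path = PureWindowsPath("..") / "parent_dir_file.txt"

print(abs_path, abs_path.is_absolute())
print(rel_path, rel_path.is_absolute())
print(abs_path.name)
```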