## Basic File IO in Python 

We already saw how to read and write CSV files using Pandas.
A file is just a sequence of bytes stored on a disk. Since all access to files (including the keyboard and monitor) is regulated by the operating system, working with a file means interfacing with the OS. Programming languages use abstractions to hide these details from the programmer.

In Python, any open file is represented abstractly as a **file** object, also known as a file handle. You **open** a file and get such a handle.

#### Opening a file 

In [1]:
fh = open("ThatFile","r")

FileNotFoundError: [Errno 2] No such file or directory: 'ThatFile'

Since there is no such file, an exception is raised to indicate this.
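A program usually should not stop just because a file is missing. The following is a minimal sketch of catching the exception instead. The temporary directory and the name **missing_path** are assumptions made for illustration, so that the path is guaranteed not to exist.

```python
import os
import tempfile

# Illustrative setup: a freshly created temporary directory is empty,
# so this path is guaranteed not to exist.
missing_path = os.path.join(tempfile.mkdtemp(), "ThatFile")

try:
    fh = open(missing_path, "r")
except FileNotFoundError as err:
    fh = None                  # no handle was created
    error_number = err.errno   # the OS error code shown as "Errno" in the message
    print("Could not open", err.filename)
```

Catching **FileNotFoundError** specifically, rather than every exception, lets other problems (such as missing permissions) still surface.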

In [8]:
fh = open("ThisFile","r")

#### Iterating over a file, linewise

If you simply iterate over the handle, you iterate over the file line by line:

In [9]:
for l in fh:
    print(l)

This is a file to test out python file io.

Though you see the text here as a sequence of lines,

internally it is just one long sequence of letters.

The newline is recorded as a letter too.

And at the end of the file there is an implicit EOF but that does not count.



Why is there a blank line after every line? 

Because each line in the file created by the text editor ends with a newline character. That character belongs to the string **l**, so when **l** is printed, a newline is printed. Further, **print** always adds its own newline at the end of the string it prints, as you can see below:

In [4]:
print("hello")
print("hello")

hello
hello
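To see the newline that is part of a line, it can help to look at the string's **repr**. This is a small sketch on a literal string standing in for a line read from a file; **rstrip** is shown as one possible way to drop the trailing newline.

```python
# A string standing in for one line as it would be read from a file.
line = "The newline is recorded as a letter too.\n"

shown = repr(line)            # repr makes the trailing "\n" visible
stripped = line.rstrip("\n")  # a copy without the trailing newline

print(shown)
print(stripped)  # print adds exactly one newline of its own
```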


In [10]:
for l in fh:
    print(l,end="")

Hmm. What went wrong? The file handle interacts with the OS and provides, among other things, a "current" location within the file. When you executed the previous for loop, this location reached the end of the file. So when we try to execute the loop again, there is nothing left to read (and nothing left to print).
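The effect of the current location can be reproduced in a self-contained way. The sketch below assumes a small two-line file created in a temporary directory (the path and contents are illustrative) and collects the lines from two passes over the same handle.

```python
import os
import tempfile

# Illustrative setup: create a small file to read from.
path = os.path.join(tempfile.mkdtemp(), "ThisFile")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

fh = open(path, "r")
first_pass = [line for line in fh]   # moves the current location to the end
second_pass = [line for line in fh]  # nothing is left to read
fh.close()

print(len(first_pass), len(second_pass))
```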

In [6]:
fh.close()
fh = open("ThisFile","r")

In [7]:
for l in fh:
    print(l,end="")

This is a file to test out python file io.
Though you see the text here as a sequence of lines,
internally it is just one long sequence of letters.
The newline is recorded as a letter too.
And at the end of the file there is an implicit EOF but that does not count.


#### Reading one line at a time

You can also read a file one line at a time. 

In [None]:
fhagain = open("ThisFile","r")
s = fhagain.readline()
print(s,end="")

In [None]:
s = fh.readline()
print(s,end="")

You can open the same file any number of times and get new handles (**file objects**), and each of them has its own value for the current location in the file. This is yet another way the handle abstraction helps.
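As a sketch of this independence, the example below opens the same file twice and advances only one of the handles. The file name, its contents, and the variable names are assumptions for illustration.

```python
import os
import tempfile

# Illustrative setup: a three-line file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "ThisFile")
with open(path, "w") as out:
    out.write("line one\nline two\nline three\n")

first = open(path, "r")
second = open(path, "r")

first.readline()                 # advances only the first handle
first.readline()
from_first = first.readline()    # the first handle is now at the third line
from_second = second.readline()  # the second handle is still at the start

first.close()
second.close()
```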

In [None]:
s = fhagain.readline()
print(s,end="")

#### Moving around the current location

You can move your current location to the beginning of a file at any time using **seek**.

In [None]:
fh.seek(0)
s = fh.readline()
print(s,end="")

You can also find your current location in the file using **tell**.

In [None]:
fh.tell()  

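If you want to move back by 10 positions, you can combine **tell** and **seek**. Note that for files opened in text mode, the Python documentation only guarantees seeking to 0 or to a value returned by **tell**, so arithmetic on positions like this is safest on files opened in binary mode.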

In [None]:
fh.seek(fh.tell() - 10)

In [None]:
print(fh.readline())
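In binary mode, positions are plain byte offsets, so moving relative to the current location is well defined. The following is a minimal sketch under that assumption; the file contents are made up, and **os.SEEK_CUR** tells **seek** to interpret the offset relative to the current location.

```python
import os
import tempfile

# Illustrative setup: 20 known bytes in a temporary file.
path = os.path.join(tempfile.mkdtemp(), "bytes.bin")
with open(path, "wb") as out:
    out.write(b"0123456789abcdefghij")

with open(path, "rb") as fb:
    fb.read(15)                  # current location is now 15
    fb.seek(-10, os.SEEK_CUR)    # move back by 10 bytes
    position = fb.tell()         # 5
    chunk = fb.read(3)           # b"567"

print(position, chunk)
```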

#### More ways to read a file

You can also use the function **read** to read from the current location to the end of the file.

In [None]:
s = fh.read()
print(s,end="")

**read** allows you to specify the number of characters you want to read. 

In [None]:
fh.seek(0)
print(fh.read(40))

But remember that the end of line is a character too, and it counts.

In [None]:
print(fh.read(5))
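A tiny file makes the counting easy to check by hand. In this sketch (file name and contents are illustrative), asking for four characters returns the first line, its newline, and one character of the second line.

```python
import os
import tempfile

# Illustrative setup: two very short lines.
path = os.path.join(tempfile.mkdtemp(), "short.txt")
with open(path, "w") as out:
    out.write("ab\ncd\n")

with open(path, "r") as fh:
    chunk = fh.read(4)   # "a", "b", the newline, and "c"

print(repr(chunk))
```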

You can also read the file as a list of lines.

In [None]:
fh.seek(0)
ls = fh.readlines()
print(ls[0],end="")
print(ls[1],end="")

In [None]:
fh.close()
fh.readline()

If you try to read from a **stale** (closed) handle, as above, Python raises an exception (a **ValueError**).
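The sketch below catches that exception and also shows the **with** statement, which is a common way to make sure a handle is closed when a block of code ends. The file name and contents are assumptions for illustration.

```python
import os
import tempfile

# Illustrative setup: a one-line file.
path = os.path.join(tempfile.mkdtemp(), "ThisFile")
with open(path, "w") as out:
    out.write("some text\n")

fh = open(path, "r")
fh.close()
try:
    fh.readline()            # reading from a closed handle fails
except ValueError as err:
    message = str(err)

# A with block closes the handle automatically when the block ends.
with open(path, "r") as safe:
    text = safe.read()
closed_after_with = safe.closed

print(message)
```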

#### Writing to a File 

In [11]:
fh = open("ThatFile","w")

This works because when you open a file for writing, it is created if it does not exist. Even more important to remember: if the file already exists, it is truncated and rewritten!
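Truncation happens as soon as the file is opened, even if nothing is written afterwards. A minimal sketch, using an illustrative temporary file and **os.path.getsize** to check the size on disk:

```python
import os
import tempfile

# Illustrative setup: a file that already has some contents.
path = os.path.join(tempfile.mkdtemp(), "ThatFile")
fh = open(path, "w")
fh.write("old contents that will be lost")
fh.close()

fh = open(path, "w")   # reopening with "w" truncates the existing file
fh.close()             # nothing was written this time

size_after = os.path.getsize(path)
print(size_after)
```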

In [12]:
fh.write("This is the first line\nand this is the second line")

50

**write** returns the number of characters it wrote.

In [13]:
fh.write("Yet another line")

16

As you will soon find out, it is not actually "yet another line" since there was no new line character inserted.
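Since **write** does not add a newline, one option is to append it yourself. The sketch below writes a list of strings as separate lines and reads them back; the list and the file name are made up for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "lines.txt")  # illustrative file
lines = ["first", "second", "third"]

fh = open(path, "w")
for text in lines:
    fh.write(text + "\n")   # write adds nothing, so add the newline explicitly
fh.close()

with open(path, "r") as fr:
    read_back = fr.readlines()

print(read_back)
```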

In [14]:
fr = open("ThatFile","r")
print(fr.read())
fr.close()




That tells you something important. Just because you **write** does not mean that the data has been written to the actual file. Writes are buffered, and the actual write may be postponed to a later time. This means that if you try to read the same file in parallel, you may not find what you believe is in the file. You can force the write by **flush**ing the file. When you close a file, it is automatically flushed.

In [15]:
fh.flush()

In [16]:
fr = open("ThatFile","r")
print(fr.read())
fr.close()

This is the first line
and this is the second lineYet another line
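The same experiment can be condensed into one self-contained sketch. What a second handle sees before the flush depends on buffering, so no particular result is assumed for that step; only the result after **flush** is relied on. The file name and the text are illustrative.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "ThatFile")  # illustrative file

writer = open(path, "w")
writer.write("buffered text")

with open(path, "r") as reader:
    before_flush = reader.read()   # often empty for a small write, but not guaranteed

writer.flush()                     # push the buffered data out to the file

with open(path, "r") as reader:
    after_flush = reader.read()

writer.close()
print(repr(before_flush), repr(after_flush))
```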


You can open a file for reading and writing with the same handle by using "w+" or "r+" as the second argument. The first one will create the file if necessary and truncate an existing file. The latter assumes that the file exists and raises an exception otherwise.
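The difference between the two modes can be sketched on a path that does not exist yet. The temporary directory and the variable names are assumptions for illustration.

```python
import os
import tempfile

# Illustrative setup: a path inside a fresh, empty temporary directory.
path = os.path.join(tempfile.mkdtemp(), "NewFile")

try:
    open(path, "r+")            # "r+" requires an existing file
    r_plus_failed = False
except FileNotFoundError:
    r_plus_failed = True

f = open(path, "w+")            # "w+" creates the file
f.write("hello")
f.seek(0)                       # go back to the start before reading
content = f.read()
f.close()

print(r_plus_failed, content)
```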

In [17]:
f = open("NewFile","r+")

In [18]:
f = open("NewFile","w+")

In [19]:
f.write("This is a sentence")

18

In [20]:
print(f.read(),end="")

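There is just one location for **f**, and it is used for both reading and writing. After the **write** above, that location is at the end of the file, so the **read** returns nothing.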

In [5]:
f.seek(0)

0

In [6]:
f.read()

'This is a sentence'

In [7]:
f.write(" A second sentence.\n\nAnd another Para")

37

In [8]:
f.seek(0)

0

In [9]:
print(f.read())

This is a sentence A second sentence.

And another Para


In [10]:
f.seek(f.tell() - 10)

47

In [11]:
print(f.tell())
print(f.read(3))
f.seek(f.tell())

47
oth


50

In [12]:
f.write("What Happens Now?")

17

In [13]:
f.seek(0)
print(f.read())

This is a sentence A second sentence.

And anothWhat Happens Now?


In [14]:
f.seek(0)
f.write("What about now?")

15

In [15]:
f.seek(0)
print(f.read())

What about now?nce A second sentence.

And anothWhat Happens Now?


In [16]:
f.seek(10)
f.write("Another")
f.seek(0)
print(f.read())

What aboutAnothere A second sentence.

And anothWhat Happens Now?
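The outputs above show that writing in the middle of a file overwrites the existing characters rather than inserting new ones. A compact sketch of the same behaviour, using an illustrative temporary file:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "NewFile")  # illustrative file

with open(path, "w+") as f:
    f.write("This is a sentence")
    f.seek(0)
    f.write("What")      # replaces "This", the rest is untouched
    f.seek(0)
    result = f.read()

print(result)
```

To actually insert text, one approach is to read the whole contents into a string, modify the string, and write it back from the beginning.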


## Modules in Python 

Any Python file is a module. You can import it using the **import** command. You then need to qualify any name in the module by prefixing it with the name of the module. You can also say **from module import \*** and have all those names available directly. What happens if multiple modules with common names are imported? You may also import a module under a short name, as in **import numpy as np**.
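The different import forms can be sketched with the standard library modules **math** and **cmath**, chosen here only because both define a function called **sqrt**. When the same name is imported directly from two modules, the later import replaces the earlier one.

```python
import math                # qualified access: math.sqrt
import math as m           # short alias, in the style of "import numpy as np"
from math import sqrt      # direct access to one name
from cmath import sqrt     # same name from another module: this one replaces it

real_root = math.sqrt(4)   # qualified names are never ambiguous
alias_root = m.sqrt(9)     # the alias refers to the same module
clashing = sqrt(-1)        # this is cmath's sqrt, which returns a complex number

print(real_root, alias_root, clashing)
```

Qualified names avoid this kind of silent replacement, which is one reason to prefer them over importing everything directly.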
