## File I/O

Python has a built-in function called `open()` that opens a file for
reading or writing. The help information for `open()` is below:

In [3]:
help(open)

Help on built-in function open in module io:

open(...)
    open(file, mode='r', buffering=-1, encoding=None,
         errors=None, newline=None, closefd=True, opener=None) -> file object
    
    Open file and return a stream.  Raise IOError upon failure.
    
    file is either a text or byte string giving the name (and the path
    if the file isn't in the current working directory) of the file to
    be opened or an integer file descriptor of the file to be
    wrapped. (If a file descriptor is given, it is closed when the
    returned I/O object is closed, unless closefd is set to False.)
    
    mode is an optional string that specifies the mode in which the file
    is opened. It defaults to 'r' which means open for reading in text
    mode.  Other common values are 'w' for writing (truncating the file if
    it already exists), 'x' for creating and writing to a new file, and
    'a' for appending (which on some Unix systems, means that all writes
    append to the end of the f

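In [6]:
help(strip)

NameError: name 'strip' is not defined

`strip` is a string method rather than a built-in function, so there is
no global name to look up; `help(str.strip)` shows its documentation.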

The two main parameters we'll need to worry about are the name of the
file and the mode, which determines whether we can read from or write to
the file. `open()` returns a file object, which acts like a pointer into the file.
An example will make this clear. The file used in the code below
contains two lines:

    $ cat testfile.txt
    abcde
    fghij

Now let's open this file in Python:

In [3]:
f = open('testfile.txt','r')
f.read()

'abcde\nfghij\n'

The second argument, 'r', means I want to open the file for reading only. I
cannot write to this handle.

In [5]:
#help(f.read)

In [None]:
f.close() # close the old handle

In [None]:
f.read()  # can't read anymore because the file is closed.
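
Reading from a closed handle raises a `ValueError`. The sketch below shows this, and also shows the `with` statement, which closes the file automatically when the block ends. The scratch file `demo.txt` and the temporary directory are illustrative assumptions, not files from this notebook.

```python
import os
import tempfile

# Hypothetical scratch file, so the example does not touch testfile.txt
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w') as out:
    out.write('abcde\nfghij\n')

# The with block closes the handle on exit, even if an error occurs inside it
with open(path, 'r') as f:
    contents = f.read()

print(f.closed)   # True

# Any further read on the closed handle fails
try:
    f.read()
except ValueError as err:
    message = str(err)
    print('Reading failed:', message)
```

Using `with` removes the need to remember `f.close()`, which is why it is often preferred for short read or write operations.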

The file we are using is one long series of characters, two of which
are newline characters. Read in sequence, it looks like
"abcde\nfghij\n". Splitting a file into lines is common enough that
there are two ways to read whole lines from a file. The first is to use
the readlines() method:

In [32]:
f = open('testfile.txt','r')
lines = f.readlines()
print(lines)
f.close()  # Always close the file when you are done with it

['abcde\n', 'fghij\n']


In [33]:
f = open('testfile.txt')
num_lines = sum(1 for line in f)
f.close() 
num_lines

2

In [24]:
sum([1,2])

3

A very important point about the readlines() method is that it *keeps* the
newline character at the end of each line. You can use the strip()
method to get rid of the newline.
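
A small sketch of the difference between `rstrip('\n')` and `strip()`. A hard-coded list stands in for the file here, which is an assumption made so that the example runs on its own.

```python
# What readlines() returns for a file containing "abcde" and "fghij"
lines = ['abcde\n', 'fghij\n']

# rstrip('\n') removes only trailing newline characters
stripped = [line.rstrip('\n') for line in lines]
print(stripped)   # ['abcde', 'fghij']

# strip() removes leading and trailing whitespace of any kind
padded = '  abcde \n'
print(padded.strip())          # abcde
print(repr(padded.rstrip('\n')))   # '  abcde '
```

If leading or trailing spaces in a line are meaningful, `rstrip('\n')` is the safer choice.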

File handles are also iterable, which means we can use them in for loops
or list comprehensions:

In [80]:
test_array = []

In [82]:
import glob
files = glob.glob("*.txt")
wanted = ('Country of origin', 'Previous institution', 'Area of interest')
for name in files:
    # 'testfile.txt' is opened on every pass, so its matching lines are
    # appended once per name in files; use open(name) to read each file.
    with open('testfile.txt', 'r') as file:   # closed when the block ends
        lines = [line.strip() for line in file]
    for line in lines:
        if any(key in line for key in wanted):
            test_array.append(line)

In [83]:
lines

['Name: Reem',
 'Surname: Omer',
 'Sex: F',
 'Country of origin: Sudan',
 'Previous institution: SUST',
 'Current institution: AIMS',
 'Area of interest: Reading']

In [84]:
test_array

['Country of origin: Sudan',
 'Previous institution: SUST',
 'Area of interest: Reading',
 'Country of origin: Sudan',
 'Previous institution: SUST',
 'Area of interest: Reading',
 'Country of origin: Sudan',
 'Previous institution: SUST',
 'Area of interest: Reading',
 'Country of origin: Sudan',
 'Previous institution: SUST',
 'Area of interest: Reading']

In [79]:
import glob
files = glob.glob("*.txt")
files

['richie.txt', 'testfile.txt', 'WriteaFile.txt']

In [10]:
Lines = []
f = open('testfile.txt','r')
for line in f:
    Lines.append(line.strip())
f.close()

print (Lines)

In [3]:
import fileinput
with fileinput.FileInput('sampleText.txt', inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace('time', 'test'), end='')

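In [20]:
f = open('testfile.txt','r')      # the file is only read, so 'r' is enough
filedata = f.read()
f.close()

newdata = filedata.replace("second","new data")   # replace() returns a new string

f = open('WriteaFile.txt','w')    # write the edited text to a different file
f.write(newdata)
f.close()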


In [108]:
line = 'test it for the ; first time'
rtr = line.partition('; ')[2].rstrip()
rtr

'first time'

In [92]:
url = 'abcdc.com'
if url.endswith('.com'):
    url = url[:-4]
url

'abcdc'

In [94]:
import re
url = 'abcdc.com'
url = re.sub(r'\.com$', '', url)   # raw string keeps the backslash for the regex engine
url

'abcdc'

In [9]:
#Confirm here that the two lists are the same


These are equivalent operations. It's often best to handle a file one
line at a time, particularly when the file is so large it might not fit
in memory.
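
One way to confirm that `readlines()` and line-by-line iteration give the same list. This is a sketch that assumes a scratch file (`demo.txt` in a temporary directory, both illustrative names) with the same two lines as the earlier example.

```python
import os
import tempfile

# Hypothetical scratch file with two lines
path = os.path.join(tempfile.mkdtemp(), 'demo.txt')
with open(path, 'w') as out:
    out.write('abcde\nfghij\n')

# First approach: read every line at once
with open(path) as f:
    from_readlines = [line.strip() for line in f.readlines()]

# Second approach: iterate over the handle one line at a time
with open(path) as f:
    from_iteration = [line.strip() for line in f]

print(from_readlines == from_iteration)   # True
```

The second approach never holds more than one line of the file in memory at a time while reading, apart from the list it builds.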


*Caution!* Opening a file with the 'w' option truncates it: any existing
content is erased as soon as the file is opened, and writing starts *at
the beginning*. If you want to add to the file without losing what is
already there, open it with 'a'.
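
The difference between the two modes can be seen in a short experiment. The sketch below assumes a scratch file in a temporary directory (the name `modes.txt` is illustrative).

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes.txt')   # hypothetical scratch file

with open(path, 'w') as f:
    f.write('first\n')

# Opening with 'w' again discards 'first\n' before anything is written
with open(path, 'w') as f:
    f.write('second\n')
with open(path) as f:
    after_w = f.read()

# Opening with 'a' keeps the existing content and adds to the end
with open(path, 'a') as f:
    f.write('third\n')
with open(path) as f:
    after_a = f.read()

print(repr(after_w))   # 'second\n'
print(repr(after_a))   # 'second\nthird\n'
```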

Writing to a file uses the write() method, which accepts a string.

## Writing to a file

Now suppose we want to create a file and write the following lines to it:

    The programming in design space class was cool today!
    I really enjoyed it.
    I can't afford to miss the next class.

First create the file you want to write to. Choose a suitable name for your file, say, WriteaFile.txt.

In [16]:
# Create WriteaFile.txt for writing ('w'); fpr is the file object
fpr = open('WriteaFile.txt', 'w')
# \n ends the line, so the next write starts on a new line
fpr.write('The programming in design space class was cool today!\n')
fpr.write('I really enjoyed it.\n')
# Double quotes are used here because the text contains a single quote (can't)
fpr.write("I can't afford to miss the next class.")

# Always close the file: buffered data may not reach the disk until you do
fpr.close()

If you forget the newline character \n, your file will contain only one line!
The file WriteaFile.txt is created in the current working directory. You can use gedit or your favorite editor to view the file you have created.
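
A brief sketch of what happens when a newline is left out. `write()` adds nothing of its own, so consecutive writes run together. The scratch file name `newline_demo.txt` is an assumption for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'newline_demo.txt')   # hypothetical scratch file

with open(path, 'w') as fpr:
    fpr.write('one')        # no \n, so the next write continues on the same line
    fpr.write('two\n')
    fpr.write('three\n')

with open(path) as f:
    lines = f.readlines()

print(lines)   # ['onetwo\n', 'three\n']
```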

In [None]:
outfile = open('outfile.txt','w')
outfile.write('This is the first line!')
outfile.close()

Suppose you are given a list of students and their respective marks and you want to write a file in which the first column contains the students' names and the second column contains their marks. That is, your file must look like this:

        Student Names        Marks
        Thabo                90
        Wilfrid              90.1
        Liu                  99.99
        Marc                 100

In [None]:
std = open('StudentsMarks.txt', 'w')          # create StudentsMarks.txt for writing
std.write("Student Names \t Marks\n")         # the tab character \t separates the two columns
std.write('Thabo \t ' + str(90) + '\n')       # convert numbers to strings before writing them
std.write('Wilfrid \t ' + str(90.1) + '\n')   # end each row with the newline character \n
std.write('Liu \t ' + str(99.99) + '\n')
std.write('Marc \t ' + str(100))

std.close()
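
Tabs do not always line columns up, because names differ in length. As an alternative sketch, format specifiers can pad each name to a fixed width. The column width of 20, the scratch file name, and the use of a list of tuples are all assumptions made for this illustration.

```python
import os
import tempfile

students = [('Thabo', 90), ('Wilfrid', 90.1), ('Liu', 99.99), ('Marc', 100)]
path = os.path.join(tempfile.mkdtemp(), 'StudentsMarks_demo.txt')   # hypothetical scratch file

with open(path, 'w') as std:
    # :<20 left-aligns the text in a field 20 characters wide
    std.write(f"{'Student Names':<20}Marks\n")
    for name, mark in students:
        std.write(f"{name:<20}{mark}\n")   # numbers are converted to text by the f-string

with open(path) as f:
    table = f.read()

print(table)
```

Because every name is padded to the same width, the marks start in the same column on each row.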

## Exercises

1. Write a program that reads a file where each line contains a number and returns a
   list containing the squares of those numbers.

2. Write a program that receives the name of a file as a command line parameter and counts
   the number of lines in that file.

3. Write a program that copies the contents of one file into another, both received as command line parameters.

4. Write a program that reads a file where each line contains a number and prints the sum of
   all the numbers.

5. Write a program that reads a file and writes another file that is its line-by-line reversal.
   That is, the first line of the new file is the last line of the former, and so on.
