
### Python Programming

##### by Narendra Allam
Copyright 2019

# Chapter 7

## File IO

#### Topics Covered

-	Creating file
-	File reading
-	File writing
-	File modes
-	Line by line file reading
-	Writing multiple lines
-	seek()
-	tell()
-	os.getcwd()
-	os.mkdir()
-	os.chdir()
-	os.remove()
-	os.rmdir()

-	Use Case - CSV file reading and writing

A file is a named collection of data, generally saved on a permanent storage device. The content of a file can be plain text or binary data (image, audio, video, etc.). Only text files are discussed in this chapter.

__open():__ This is the function in python to open a file.

```python
# Syntax:
file_handle = open(<filename>, <mode>)
```

The open() function opens a file and returns a file object, through which we perform all operations on the file.

__Note:__ 
    If we try to open a file for reading (for example, abc.txt) and it does not exist, we get "IOError" in Python 2.x and "FileNotFoundError" in Python 3.x.
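
The note above can be illustrated with a minimal sketch for Python 3. The file name `missing.txt` and the temporary directory are assumptions made for this illustration, chosen so that the file is guaranteed not to exist.

```python
import os
import tempfile

# A fresh temporary directory is used so the path below cannot exist.
with tempfile.TemporaryDirectory() as tmp_dir:
    missing_path = os.path.join(tmp_dir, 'missing.txt')  # hypothetical name
    try:
        f = open(missing_path, 'r')
    except FileNotFoundError as err:
        # Python 3 raises FileNotFoundError, a subclass of OSError.
        message = 'Cannot open file: ' + err.strerror
        print(message)
    else:
        print(f.read())
        f.close()
```

Catching the exception lets the program report the problem instead of stopping with a traceback.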

<b>Modes:</b><br>
<u>Text Modes</u>

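    r or rt - read mode; if the file does not exist, raises IOError (FileNotFoundError in Python 3.x)
    w or wt - write mode; creates the file if it does not exist, discards the existing content if it does
    a or at - append mode; like write mode, but keeps the existing content and starts writing from the end of the file

    r+ or rt+ - read and write; the file must exist and its content is kept
    w+ or wt+ - write and read; creates the file or discards its existing content
    a+ or at+ - append and read; writes always go to the end of the file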
    
<u>Binary Modes</u>

    rb - Binary read
    wb - Binary write
    ab - Binary append
    rb+ - read and write in binary (the file must exist)
    wb+ - write and read in binary (discards existing content)
    ab+ - read and append in binary
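
The difference between the write, append and read-write modes can be seen in a small sketch. The file name `modes_demo.txt` and the use of a temporary directory are assumptions made for this illustration only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'modes_demo.txt')  # hypothetical name

    with open(path, 'w') as f:    # 'w' creates the file
        f.write('first\n')
    with open(path, 'w') as f:    # 'w' again discards the old content
        f.write('second\n')
    with open(path, 'a') as f:    # 'a' adds to the end
        f.write('third\n')
    with open(path, 'r+') as f:   # 'r+' keeps content, pointer starts at 0
        f.write('SECOND')         # overwrites the first six characters
    with open(path, 'r') as f:
        content = f.read()

print(repr(content))
```

Note that 'r+' overwrites characters in place from the start of the file rather than inserting new text.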


__File creation and writing__

In [1]:
f = open('abc.txt', 'w')

f.write("Once upon a time in India, there was a king called Tippu.")
f.close()

<b> f </b> is the file object. It holds a buffer in RAM, whose content is synced to the hard disk later. The <b>close()</b> function flushes any content still in the buffer to the file on the hard disk and then releases the file.
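
As a rough sketch of the buffering described above, the flush() method of a file object pushes the buffer to the operating system without closing the file. The file name `buffer_demo.txt` and the temporary directory are assumptions made for this illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'buffer_demo.txt')  # hypothetical name

    writer = open(path, 'w')
    writer.write('buffered text')
    writer.flush()  # push the in-memory buffer to the file, keep it open

    # A second file object can now see the flushed content.
    with open(path) as reader:
        seen_after_flush = reader.read()

    writer.close()  # close() also flushes, then releases the file
    is_closed = writer.closed

print(seen_after_flush, is_closed)
```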

Let's check whether the file was created. For this, we can run OS commands from a Jupyter notebook by prefixing the command with the '!' symbol.

In [3]:
!ls # On windows run : !dir

'='
 abc.txt
 book1.json
 book.json
 books.json
 bookslist.json
 book.xml
 browser.png
'Chapter 10 Object Orientation.ipynb'
'Chapter 11.0 MYSQLDBSetupWalkthrough.ipynb'
'Chapter 11 MySQL DB Connection .ipynb'
'Chapter 12 ExceptionHandling.ipynb'
'Chapter 12 Object Orientation Latest.ipynb'
'Chapter 13 Threading.ipynb'
'Chapter 14 Logging.ipynb'
'Chapter 15 Email - FTP.ipynb'
'Chapter 16 UnitTesting.ipynb'
'Chapter 17 Regular Expressions.ipynb'
'Chapter 18 Numpy.ipynb'
'Chapter 19 Pandas.ipynb'
'Chapter 1 Introduction.ipynb'
'Chapter 20 Matplotlib.ipynb'
'Chapter 2 Strings.ipynb'
'Chapter 3 Control Structures.ipynb'
'Chapter 4 Data Structures.ipynb'
'Chapter 5 Functions.ipynb'
'Chapter 6 Modules.ipynb'
'Chapter 7 File IO.ipynb'
'Chapter 8 Comprehensions, Lambdas and Functional Programming.ipynb'
'Chapter 9 Serialization.ipynb'
 Comprehensions.ipynb
'Control Structures.ipynb'
 country.xml
 CountWays.ipynb
 create.png
 cs.ipynb
 data.csv
 data.txt
 d

Now, let's check the content of the file we created.

In [4]:
!cat abc.txt # On windows run : !type abc.txt

Once upon a time in India, there was a king called Tippu.

In [5]:
f = open('abc.txt')
txt = f.read()
print(txt)
f.close()

Once upon a time in India, there was a king called Tippu.


__Open file with context manager:__

In [6]:
with open('abc.txt') as f:
    txt = f.read()
    print(txt)

Once upon a time in India, there was a king called Tippu.


Do you want to check which folder you are in? The os.getcwd() function returns the current working directory.
```python
import os
print(os.getcwd())
```

In [7]:
import os
print(os.getcwd())

/hdd/notebooks/VASU/PythonNotebooks-final


In [2]:
!pwd

/hdd/notebooks/VASU/PythonNotebooks-final


__Reading an existing file__

read() function reads entire file content as a string.

In [8]:
f = open('abc.txt', 'r')
s = f.read()
print(s)
f.close()

Once upon a time in India, there was a king called Tippu.


__To Read n characters, read(n)__

In [9]:
f = open('abc.txt', 'r')
s = f.read(10)
print(s)
f.close()

Once upon 
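
read(n) can also be called repeatedly to process a file in fixed-size pieces. The following is a minimal sketch; the file name `chunks_demo.txt`, the chunk size of 10 and the temporary directory are assumptions made for this illustration.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'chunks_demo.txt')  # hypothetical name
    with open(path, 'w') as f:
        f.write('Once upon a time in India')

    chunks = []
    with open(path, 'r') as f:
        while True:
            chunk = f.read(10)   # at most 10 characters per call
            if chunk == '':      # empty string means end of file
                break
            chunks.append(chunk)

print(chunks)
```

Each call continues from where the previous one stopped, and the last chunk may be shorter than the requested size.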


__To Write Multiline text__

In [10]:
f = open('abc.txt', 'w')
f.write("""Once upon a time in India, there was a king called Tippu.
Tippu was so tall and handsome and brave. He was looking for a brave and beautiful
bride and sent the message to all of his citizens.""")
f.close()

__Check the file content__

In [11]:
!cat abc.txt # !type abc.txt for windows

Once upon a time in India, there was a king called Tippu.
Tippu was so tall and handsome and brave. He was looking for a brave and beautiful
bride and sent the message to all of his citizens.

__Reading text into a list of strings__

The readlines() function returns all lines in the file as a list of strings. Each string keeps its trailing newline character.

In [12]:
f = open('abc.txt', 'r')
l = f.readlines()
print(l)
f.close()

['Once upon a time in India, there was a king called Tippu.\n', 'Tippu was so tall and handsome and brave. He was looking for a brave and beautiful\n', 'bride and sent the message to all of his citizens.']
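
Since every line returned by readlines() keeps its newline character, it is often stripped before further processing. A minimal sketch, assuming a hypothetical file `lines_demo.txt` in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'lines_demo.txt')  # hypothetical name
    with open(path, 'w') as f:
        f.write('line one\nline two\nline three')

    with open(path, 'r') as f:
        raw_lines = f.readlines()

# Remove the trailing newline from each line, if present.
clean_lines = [line.rstrip('\n') for line in raw_lines]

print(raw_lines)
print(clean_lines)
```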


__Line by line file reading__

In [13]:
f = open('abc.txt', 'r')

for line in f:
    print(line, end='')
    
f.close()

Once upon a time in India, there was a king called Tippu.
Tippu was so tall and handsome and brave. He was looking for a brave and beautiful
bride and sent the message to all of his citizens.

__Writing multiple lines__

In [14]:
f = open('abc.txt', 'w')
l = ['Once upon a time in India, there was a king called Tippu.\n',
 'Tippu was so tall and handsome and brave. He was looking for a brave and beautiful\n',
 'bride and sent the message to all of his citizens.\n',
 'He was waiting for years...']
f.writelines(l)
f.close()

In [15]:
!cat abc.txt

Once upon a time in India, there was a king called Tippu.
Tippu was so tall and handsome and brave. He was looking for a brave and beautiful
bride and sent the message to all of his citizens.
He was waiting for years...

Opening an existing file in write mode ('w') discards its existing text, so we have to use append mode ('a') to add text at the end.

In [16]:
f = open('abc.txt', 'w')
f.write("Apple is sweet !!!")
f.close()

In [17]:
!cat abc.txt

Apple is sweet !!!

In [18]:
f = open('abc.txt', 'a')
f.write("Orange is sour!\n")
f.close()

In [19]:
!cat abc.txt

Apple is sweet !!!Orange is sour!


After writing to a file, the file pointer is at the end of the file, so a read immediately after a write returns nothing. Check the example below.

In [20]:
f = open('abc.txt', 'a+')

f.write("Sky is blue! \n Milk is White.")

s = f.read()

print ("File content = ", s)

f.close()

File content =  
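
One way to read the content back after writing is to move the file pointer to the start of the file first. This is a minimal sketch, assuming a hypothetical file `pointer_demo.txt` in a temporary directory:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'pointer_demo.txt')  # hypothetical name
    with open(path, 'w') as f:
        f.write('Apple is sweet!\n')

    with open(path, 'a+') as f:
        f.write('Sky is blue!\n')
        before_seek = f.read()   # pointer is at the end: nothing to read
        f.seek(0)                # move the pointer to the start of the file
        after_seek = f.read()    # now the whole content is returned

print(repr(before_seek))
print(repr(after_seek))
```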


__seek(): Moving the file pointer in file__

Syntax:
```python
f.seek(<Offset>, <Whence>)
```

1. f.seek(n, io.SEEK_SET)
 moves the file pointer to offset n from the start of the file, so the next read starts from there

In [21]:
!cat abc.txt

Apple is sweet !!!Orange is sour!
Sky is blue! 
 Milk is White.

In [22]:
import io
f = open('abc.txt', 'a+')
f.seek(20, io.SEEK_SET)
s = f.readline()
print(s)
f.close()

ange is sour!



io.SEEK_SET, io.SEEK_CUR and io.SEEK_END are the reference points from which offset needs to be considered in a file.

```python
f.seek(5, io.SEEK_SET) # moves file pointer to offset 5 from the start of the file
# In Python 3, non-zero offsets with SEEK_CUR or SEEK_END need a file opened in binary mode
f.seek(-10, io.SEEK_CUR) # moves file pointer 10 bytes back from the current position
f.seek(-10, io.SEEK_END) # moves file pointer to 10 bytes before the end of the file
```
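
The sketch below shows an end-relative seek in practice. In Python 3 a non-zero offset from io.SEEK_END works on a file opened in binary mode, while the same call on a text-mode file raises io.UnsupportedOperation. The file name `seek_demo.txt` and the temporary directory are assumptions made for this illustration.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'seek_demo.txt')  # hypothetical name
    with open(path, 'w') as f:
        f.write('Apple is sweet !!!Orange is sour!')

    # Binary mode: read the last 5 bytes of the file.
    with open(path, 'rb') as f:
        f.seek(-5, io.SEEK_END)
        tail = f.read()

    # Text mode: the same seek is rejected in Python 3.
    with open(path, 'r') as f:
        try:
            f.seek(-5, io.SEEK_END)
            text_seek_ok = True
        except io.UnsupportedOperation:
            text_seek_ok = False

print(tail, text_seek_ok)
```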

__f.tell():__ This function returns the current position of the file pointer, as an offset from the start of the file.

In [35]:
import io
f = open('abc.txt', 'rb')
f.seek(5, io.SEEK_SET)
print (f.tell())

print (f.readline())
print (f.tell())

f.seek(-3, io.SEEK_CUR)
print (f.readline())
print (f.tell())

f.seek(-6, io.SEEK_END)
print (f.readline())
print (f.tell())

f.seek(0, io.SEEK_SET)
print (f.readline())
print (f.tell())

f.seek(0, io.SEEK_END)
print (f.readline())
print (f.tell())

f.close()

5
b' is sweet !!!Orange is sour!\n'
34
b'r!\n'
34
b'White.'
63
b'Apple is sweet !!!Orange is sour!\n'
34
b''
63
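
A common use of tell() together with seek() is finding the size of a file: move the pointer to the end and ask for its position. A minimal sketch, assuming a hypothetical file `size_demo.txt` written in binary mode inside a temporary directory:

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'size_demo.txt')  # hypothetical name
    with open(path, 'wb') as f:
        f.write(b'Apple is sweet !!!Orange is sour!\n')

    with open(path, 'rb') as f:
        start_pos = f.tell()        # a freshly opened file starts at 0
        f.seek(0, io.SEEK_END)      # jump to the end of the file
        size_in_bytes = f.tell()    # position at the end equals the size

print(start_pos, size_in_bytes)
```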


In [24]:
!cat data.txt

here are plenty of artificial intelligence (AI) trends emerging from China these days, but arguably the most intriguing are those that show just how close humans are to meeting their match at the hands of machines. This week, we saw 100Credit, a big data platform company, has launched a low-cost robot called "Little 100Credit" that reportedly collects bad debts over the phone 90% as often as its human counterparts . Yes, a robot calls up customers who owe money and helps people repay their debts
almost as well as a human.


__Program:__ Read text from a text file and find the word with the highest number of occurrences.

In [25]:
from collections import Counter

f = open('data.txt')
s = f.read()
f.close()

letters = [char for char in s if char.isalnum() or char == ' ']   
words =  ''.join(letters).split()
print ('Most frequently occurring word:', Counter(words).most_common(1))

Most frequently occurring word: [('a', 4)]
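
The program above treats 'A' and 'a' as different words. A possible variation, shown here only as a sketch, lowercases the text and extracts words with a regular expression. The sample sentence is made up for this illustration and is not the content of data.txt.

```python
import re
from collections import Counter

# Made-up sample text, used instead of reading data.txt.
text = 'A robot calls a customer. The robot helps a human.'

# Lowercase first, then keep runs of letters and digits as words.
words = re.findall(r'[a-z0-9]+', text.lower())
most_common_word = Counter(words).most_common(1)

print('Most frequently occurring word:', most_common_word)
```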


__Storing data as CSV file__

In [26]:
import datetime

l = [(10001, datetime.date(1953, 9, 2), 'Georgi', 'Facello', 'M', datetime.date(1986, 6, 26)) ,
(10002, datetime.date(1964, 6, 2), 'Bezalel', 'Simmel', 'F', datetime.date(1985, 11, 21)) ,
(10003, datetime.date(1959, 12, 3), 'Parto', 'Bamford', 'M', datetime.date(1986, 8, 28)) ,
(10004, datetime.date(1954, 5, 1), 'Chirstian', 'Koblick', 'M', datetime.date(1986, 12, 1)) ,
(10005, datetime.date(1955, 1, 21), 'Kyoichi', 'Maliniak', 'M', datetime.date(1989, 9, 12)) ,
(10006, datetime.date(1953, 4, 20), 'Anneke', 'Preusig', 'F', datetime.date(1989, 6, 2)) ,
(10007, datetime.date(1957, 5, 23), 'Tzvetan', 'Zielinski', 'F', datetime.date(1989, 2, 10)) ,
(10008, datetime.date(1958, 2, 19), 'Saniya', 'Kalloufi', 'M', datetime.date(1994, 9, 15)) ,
(10009, datetime.date(1952, 4, 19), 'Sumant', 'Peac', 'F', datetime.date(1985, 2, 18)) ,
(10010, datetime.date(1963, 6, 1), 'Duangkaew', 'Piveteau', 'F', datetime.date(1989, 8, 24))]

In [27]:
','.join(['Apple', 'Orange', 'Banana', 'Peach'])

'Apple,Orange,Banana,Peach'

In [28]:
f = open('data.csv', 'w')
for rec in l:
    s = ','.join([str(rec[0]), rec[1].strftime('%Y-%m-%d'), 
         rec[2], rec[3], rec[4], rec[5].strftime('%Y-%m-%d')])
    s += '\n'
    f.write(s)
f.close()

In [29]:
!cat data.csv

10001,1953-09-02,Georgi,Facello,M,1986-06-26
10002,1964-06-02,Bezalel,Simmel,F,1985-11-21
10003,1959-12-03,Parto,Bamford,M,1986-08-28
10004,1954-05-01,Chirstian,Koblick,M,1986-12-01
10005,1955-01-21,Kyoichi,Maliniak,M,1989-09-12
10006,1953-04-20,Anneke,Preusig,F,1989-06-02
10007,1957-05-23,Tzvetan,Zielinski,F,1989-02-10
10008,1958-02-19,Saniya,Kalloufi,M,1994-09-15
10009,1952-04-19,Sumant,Peac,F,1985-02-18
10010,1963-06-01,Duangkaew,Piveteau,F,1989-08-24


In [30]:
import datetime
f = open('data.csv')
for rec in f:
    rec = rec.rstrip('\n')
    
    l = rec.split(',')
    l[0] = int(l[0])
    
    year, month, day = [int(x) for x in l[1].split('-')]
    l[1] = datetime.date(year, month, day)
    
    year, month, day = [int(x) for x in l[5].split('-')]
    l[5] = datetime.date(year, month, day)
    
    print(l)
f.close()

[10001, datetime.date(1953, 9, 2), 'Georgi', 'Facello', 'M', datetime.date(1986, 6, 26)]
[10002, datetime.date(1964, 6, 2), 'Bezalel', 'Simmel', 'F', datetime.date(1985, 11, 21)]
[10003, datetime.date(1959, 12, 3), 'Parto', 'Bamford', 'M', datetime.date(1986, 8, 28)]
[10004, datetime.date(1954, 5, 1), 'Chirstian', 'Koblick', 'M', datetime.date(1986, 12, 1)]
[10005, datetime.date(1955, 1, 21), 'Kyoichi', 'Maliniak', 'M', datetime.date(1989, 9, 12)]
[10006, datetime.date(1953, 4, 20), 'Anneke', 'Preusig', 'F', datetime.date(1989, 6, 2)]
[10007, datetime.date(1957, 5, 23), 'Tzvetan', 'Zielinski', 'F', datetime.date(1989, 2, 10)]
[10008, datetime.date(1958, 2, 19), 'Saniya', 'Kalloufi', 'M', datetime.date(1994, 9, 15)]
[10009, datetime.date(1952, 4, 19), 'Sumant', 'Peac', 'F', datetime.date(1985, 2, 18)]
[10010, datetime.date(1963, 6, 1), 'Duangkaew', 'Piveteau', 'F', datetime.date(1989, 8, 24)]


In [31]:
rec1 = '1234 John 23000 Male'
rec2 = '1235 Samantha 34000 Female'

f = open('data.csv', 'w')

l = rec1.split()
rec = ','.join(l) + '\n'
f.write(rec)

l = rec2.split()
rec = ','.join(l) + '\n'
f.write(rec)

f.close()


In [32]:
!cat data.csv

1234,John,23000,Male
1235,Samantha,34000,Female


__Reading data from CSV file__

In [33]:
f = open('data.csv')
for line in f:
    l = line.rstrip('\n').split(',')
    print(l)
f.close()

['1234', 'John', '23000', 'Male']
['1235', 'Samantha', '34000', 'Female']
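
The examples above build and split CSV lines by hand, which breaks if a field itself contains a comma. Python's standard csv module handles such cases. The following is a minimal sketch of the same write and read round trip using csv.writer and csv.reader; the file name `employees.csv` and the temporary directory are assumptions made for this illustration.

```python
import csv
import datetime
import os
import tempfile

records = [
    (10001, datetime.date(1953, 9, 2), 'Georgi', 'Facello', 'M', datetime.date(1986, 6, 26)),
    (10002, datetime.date(1964, 6, 2), 'Bezalel', 'Simmel', 'F', datetime.date(1985, 11, 21)),
]

def parse_date(text):
    # Convert 'YYYY-MM-DD' text back into a datetime.date object.
    return datetime.datetime.strptime(text, '%Y-%m-%d').date()

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'employees.csv')  # hypothetical name

    # newline='' is what the csv module documentation recommends.
    with open(path, 'w', newline='') as f:
        writer = csv.writer(f)
        for rec in records:
            writer.writerow([rec[0], rec[1].isoformat(), rec[2],
                             rec[3], rec[4], rec[5].isoformat()])

    loaded = []
    with open(path, newline='') as f:
        for row in csv.reader(f):
            row[0] = int(row[0])
            row[1] = parse_date(row[1])
            row[5] = parse_date(row[5])
            loaded.append(tuple(row))

print(loaded == records)
```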


__With:__ Open a file with context manager

In [34]:
with open('abc.txt') as f:
    print(f.read())

Apple is sweet !!!Orange is sour!
Sky is blue! 
 Milk is White.


__Note:__ The with statement closes the file automatically, even if an exception occurs.
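
This behaviour can be checked with the closed attribute of the file object. A minimal sketch, assuming a hypothetical file `with_demo.txt` in a temporary directory and a deliberately raised ValueError:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'with_demo.txt')  # hypothetical name
    with open(path, 'w') as f:
        f.write('Apple is sweet !!!')

    try:
        with open(path) as f:
            first_five = f.read(5)
            raise ValueError('simulated failure')  # error inside the block
    except ValueError:
        pass

    # The with statement has already closed the file.
    closed_after_error = f.closed

print(first_five, closed_after_error)
```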

__Useful functions from os module__

import os

_os.getcwd():_ Returns current working directory

_os.mkdir():_ Creates a directory

_os.chdir():_ Changes the current working directory

_os.remove():_ Removes a file

_os.rmdir():_ Removes an empty directory

_os.listdir('.'):_ Lists the entries of the current directory
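
The functions listed above can be combined as in the following sketch. The names `demo_dir` and `note.txt` are hypothetical, and everything happens inside a temporary directory so that no existing files are touched.

```python
import os
import tempfile

original_dir = os.getcwd()            # remember where we started

with tempfile.TemporaryDirectory() as tmp_dir:
    os.chdir(tmp_dir)                 # work inside the temporary directory
    try:
        os.mkdir('demo_dir')          # create a directory
        file_path = os.path.join('demo_dir', 'note.txt')
        with open(file_path, 'w') as f:
            f.write('hello')

        listing_before = os.listdir('demo_dir')

        os.remove(file_path)          # remove the file first
        os.rmdir('demo_dir')          # rmdir works only on empty directories
        listing_after = os.listdir('.')
    finally:
        os.chdir(original_dir)        # always return to the starting directory

print(listing_before, listing_after)
```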

### Interview questions

1. How do you read line by line from a file in Python?<br>
```python
with open('abc.txt') as f:
    for line in f:
        print(line, end='')
```
2. What happens if ‘abc.txt’ doesn’t exist?