<font size = "5"> File Handling and I/O </font>

When working in Python, we don't need to import any third-party library to open, read, or write files; Python supports these operations natively. Let's see how it is done.

In [48]:
import os
os.getcwd()

'C:\\Users\\pk662\\OneDrive\\CORPNCE\\Python_Basics_CORPNCE'

In [47]:
#Check the current directory
%pwd

'C:\\Users\\pk662\\OneDrive\\CORPNCE\\Python_Basics_CORPNCE'

By default, Python provides the basic functions and methods needed to manipulate files. Most file manipulation is done through a file object.

The <b><i>open</i></b> Function <br>
Before you can read or write a file, you have to open it using Python's built-in <u>open()</u> function. This function creates a file object, which is then used to call the other methods associated with it. <br><br>
The open function is usually called with two arguments: the name of the file and the mode
in which we would like to open the file.

By default, when only the filename is passed, the open function opens the file
in read mode.
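
As a small sketch of the default mode, the snippet below opens the same file once without a mode and once with 'r', and checks that both file objects report the same mode. The file name demo_open.txt and the temporary folder are made up for this illustration, and encoding='utf-8' is an optional extra argument of open() that is not covered in the text above.

```python
import os
import tempfile

# Work in a throwaway folder so that no real file is touched
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'demo_open.txt')  # hypothetical file name

    # Create the file first so that there is something to open for reading
    setup_file = open(path, 'w', encoding='utf-8')
    setup_file.write('hello')
    setup_file.close()

    default_file = open(path)                          # no mode given
    explicit_file = open(path, 'r', encoding='utf-8')  # mode and encoding given

    default_mode = default_file.mode
    explicit_mode = explicit_file.mode
    print(default_mode, explicit_mode)

    default_file.close()
    explicit_file.close()
```

Both names print the same mode, so leaving the mode out is equivalent to passing 'r'.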

Syntax <br>
file_object = open(file_name, access_mode)

In [51]:
# Opening a file in Python
file = open('sample.txt', 'r')
# We can pass just the file name because the file is in the same folder as our Python code
print(file)

file.close()

print(file.closed)
print(file)

<_io.TextIOWrapper name='sample.txt' mode='r' encoding='cp1252'>
True
<_io.TextIOWrapper name='sample.txt' mode='r' encoding='cp1252'>


As you can see in the above example, we passed two arguments: the file name and the mode. Here the mode is 'r', which stands for reading the file; it is also the default mode. The common modes are:

r = Read the file (default)<br>
w = Write to the file (creates the file if it does not exist, overwrites it if it does)<br>
a = Append to the end of the file<br>
r+ = Both read and write the file
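
The difference between these modes is easier to see on a small file. The following is only a sketch: the file name modes_demo.txt and the temporary folder are invented for the illustration, and the file is opened and closed by hand on purpose.

```python
import os
import tempfile

# A temporary folder keeps this experiment away from real files
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'modes_demo.txt')  # hypothetical file name

    f = open(path, 'w')    # 'w' creates the file, or empties it if it exists
    f.write('first')
    f.close()

    f = open(path, 'a')    # 'a' keeps the old content and adds to the end
    f.write(' second')
    f.close()

    f = open(path, 'r+')   # 'r+' reads and writes an existing file
    before = f.read()      # after read() the position is at the end
    f.write(' third')      # so this write lands after the old content
    f.close()

    f = open(path, 'r')
    after = f.read()
    f.close()

    print(before)
    print(after)
```

Note that 'r+' needs the file to exist already, while 'w' and 'a' create it when it is missing.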

In [52]:
print(file.name)# Print the name of the file
print(file.mode)# Print the current mode of the file
# file.close() # After you use the file, you need to close the file

sample.txt
r


Remember that with the syntax above we must always close the file manually after calling the open function, so that the resources held by the file are released and file handling stays clean. To avoid forgetting this, it is recommended to open files using a context manager or a try/finally block.
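
The try/finally approach can be sketched as follows. The file name try_finally_demo.txt and the temporary folder are assumptions made for this example; the point is only that the finally block runs whether or not reading succeeds.

```python
import os
import tempfile

# Temporary folder used only for this illustration
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'try_finally_demo.txt')  # hypothetical file name

    # Prepare a small file to read
    setup_file = open(path, 'w')
    setup_file.write('some text')
    setup_file.close()

    file = open(path, 'r')
    try:
        content = file.read()
        print(content)
    finally:
        # Runs even if read() raises an error, so the file is always closed
        file.close()

    closed_after = file.closed
    print(closed_after)
```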

In [74]:
file = open('sample.txt', 'r')
# print(file.tell())
# file_content = file.read()
# print(file.tell())
# file.seek(32)
# print(file.tell())
# file_content = file.readlines()
# print(file_content)
# print(file_content[1])
# file_content = file.readline()
# print(file_content, end = '')

# file_content = file.readline()
# print(file_content, end = '')

# How to check the total number of characters inside a file
# print(len(file.read()))

for line in file:
    print(line, end = '')  # The file object is an iterator that yields one line at a time

file.close()

This is a sample text file. 
This line is your second line. 
Now we are in the third line.

<b>Assignment - Create a file named 'sample.txt' on your system with 6 lines and read the first 100 characters from the file.</b>
    
Hint - Use read(100)

In [76]:
file = open('sample.txt','r')
file_50 = file.read(50)
print(file_50)
file.close()

This is a sample text file. 
This line is your sec


In [84]:
file = open('sample.txt','r')
print(file.tell())
file.seek(11)
file_50 = file.read(50)
print(file_50)
file.tell()
# file.close()

0
ample text file. 
This line is your second line. 



63

**Context Manager**
    

Context managers allow you to allocate and release resources precisely when you want to. <br>
They are most commonly used through the 'with' statement. <br>
Suppose you have two related operations that you’d like to execute as a pair (in our example, open and close), <br>
with a block of code in between. Context managers allow you to do exactly that. <br>
For example, the above code can be written as -

In [1]:
# Note the position of the variable name: it comes on the right (after 'as') instead of on the left
with open('sample.txt','r') as f:
    print(f.name)
    f_content = f.read(30)
    print(f_content)
    print(f.tell())
    
# Now we don't have to close the file manually as before; the context manager takes care of it for us
print(f.closed)

# f.read()

sample.txt
This is a sample text file. 
T
30
True
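
To get a feeling for what the 'with' statement does behind the scenes, here is a minimal sketch of a hand-written context manager. The class ManagedFile and the file name managed_demo.txt are hypothetical and exist only for this illustration; in real code you would simply use open() directly, which already behaves this way.

```python
import os
import tempfile

class ManagedFile:
    """Hypothetical helper that mimics how open() behaves in a with block."""

    def __init__(self, path, mode):
        self.path = path
        self.mode = mode
        self.file = None

    def __enter__(self):
        # Called when the with block starts: allocate the resource
        self.file = open(self.path, self.mode)
        return self.file

    def __exit__(self, exc_type, exc_value, traceback):
        # Called when the with block ends, even after an error: release it
        self.file.close()
        return False  # do not hide exceptions raised inside the block

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'managed_demo.txt')  # hypothetical file name
    with ManagedFile(path, 'w') as f:
        f.write('written inside the with block')

    closed_after_block = f.closed
    print(closed_after_block)
```

The value returned by __enter__ is what gets bound to the name after 'as', which is why f here is an ordinary file object.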


Reading a file in binary mode

In [2]:
with open('sample.txt','rb') as file:
    for line in file:
        print(line)

b'This is a sample text file. \n'
b'This line is your second line. \n'
b'Now we are in the third line.'
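
The b prefix in the output above shows that binary mode returns bytes objects rather than strings. A short sketch of the difference, assuming a made-up file binary_demo.txt whose content is plain UTF-8 text:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'binary_demo.txt')  # hypothetical file name

    # Write raw bytes so that the content is identical on every platform
    with open(path, 'wb') as f:
        f.write(b'first line\nsecond line')

    with open(path, 'rb') as f:
        raw = f.read()           # bytes, exactly as stored on disk

    text = raw.decode('utf-8')   # turn the bytes into a normal string
    print(type(raw).__name__, type(text).__name__)
    print(text)
```

Binary mode is the right choice for files that are not text, such as images, because no decoding or newline translation is applied.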


In [3]:
# Now let's see some basic operations that we can perform on our file
with open('sample.txt', 'r') as f:
    f_content = f.read()
    print(f_content)

This is a sample text file. 
This line is your second line. 
Now we are in the third line.


The read function loads all the text in the file at once. If we have a very large file and don't want to load or print all the lines at once, we can use 'readline', which returns one line per call.
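
A minimal sketch of processing a file one line at a time with readline is shown below. The file name readline_demo.txt and its three lines are invented for the example; the loop relies on the fact that readline() returns an empty string once the end of the file is reached.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'readline_demo.txt')  # hypothetical file name
    with open(path, 'w') as f:
        f.write('line 1\nline 2\nline 3\n')

    line_count = 0
    with open(path, 'r') as f:
        line = f.readline()
        while line != '':        # '' means there is nothing left to read
            line_count += 1
            print(line, end='')  # the line already ends with a line break
            line = f.readline()

    print(line_count)
```

Only one line is held in memory at a time, which is what makes this pattern suitable for large files.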

If we only need the next few characters of the text file, we can pass a size to read. For example, read(20) returns the next 20 characters:

In [7]:
with open('sample.txt', 'r') as f:
    f_content = f.read(20)
    print(f_content, end = '$$')
    f_content = f.read(20)
    print(f_content, end = '$$')
#     f.seek(0) # Reset the index position to 0 
#     f_content = f.read(20)
#     print(f_content, end = '$$')

This is a sample tex$$t file. 
This line i$$

So, how does this work behind the scenes? Whenever we open a file, the Python interpreter creates a file object whose position points to the beginning of the file. Each time we call readline(), the file object reads up to the next line break and leaves the position at the start of the following line.

The tell() function returns the current position of the file object, and calling seek(0) resets the position to the beginning of the file.
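
The movement of the file position can be sketched like this. The file seek_demo.txt is made up, and it deliberately contains a single line of plain ASCII text, because in text mode the number returned by tell() only matches the character count in simple cases like this one.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'seek_demo.txt')  # hypothetical file name
    with open(path, 'w', encoding='utf-8') as f:
        f.write('abcdefghij')

    with open(path, 'r', encoding='utf-8') as f:
        start = f.tell()          # position right after opening
        first_part = f.read(4)    # moves the position forward by 4
        middle = f.tell()
        f.seek(0)                 # jump back to the beginning
        again = f.read(4)         # reads the same 4 characters again

    print(start, middle)
    print(first_part, again)
```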

<b>  Read () </b><br>
There are three methods for reading: read(), readline() and readlines()

read()		#returns the whole file as one big string

readline()	#returns one line at a time

readlines()	#returns a list of lines
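
The three return types can be compared side by side in a short sketch. The file read_methods_demo.txt and its content are assumptions for this example; seek(0) is used between the calls so that each method starts from the beginning of the file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'read_methods_demo.txt')  # hypothetical file name
    with open(path, 'w') as f:
        f.write('first\nsecond\nthird\n')

    with open(path, 'r') as f:
        whole_text = f.read()       # one string containing everything
        f.seek(0)
        one_line = f.readline()     # only the first line
        f.seek(0)
        all_lines = f.readlines()   # a list with one string per line

    print(repr(whole_text))
    print(repr(one_line))
    print(all_lines)
```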

<b>  Write () </b><br>

These methods write strings to the file.

write()	#writes a single string to the file

writelines()	#writes a list of strings to the file (no line breaks are added between them)
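
One detail that often surprises beginners is that writelines() does not add line breaks for you. The sketch below, which uses an invented file name write_demo.txt, shows this together with the value returned by write().

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'write_demo.txt')  # hypothetical file name

    with open(path, 'w') as f:
        count = f.write('one line\n')      # returns the number of characters written
        f.writelines(['two', 'three\n'])   # items are written back to back

    with open(path, 'r') as f:
        content = f.read()

    print(count)
    print(repr(content))
```

If every item should end up on its own line, each string in the list has to end with '\n' itself.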

<b>  Append mode </b><br>

There is no separate append function. Appending means adding to the end of the file instead of overwriting it.

To append to an existing file, simply open the file in append mode ("a") and write to it as usual.

<b>  Close() </b><br>

When you’re done with a file, use close() to close it and free up any system
resources taken up by the open file.
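
What happens if a file is used after close() can be sketched as follows. The file name close_demo.txt is hypothetical; the example only relies on the fact that reading from a closed file object raises a ValueError.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'close_demo.txt')  # hypothetical file name

    f = open(path, 'w')
    f.write('data')
    f.close()

    f = open(path, 'r')
    f.close()                  # the file object still exists, but it is closed

    message = None
    try:
        f.read()               # not allowed any more
    except ValueError as error:
        message = str(error)
        print('Cannot read:', message)

    print(f.closed)
```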

<img src="C:/Users/ranja/OneDrive/Python File Handling acess modes.png" alt="Python file handling access modes" height="42" width="42">

Let's see some examples of file writing.

In [None]:
#To write to a file, use:
# Remember: in write mode, if the file does not exist, it will be created for you
# And if the file already exists, its current contents will be overwritten
with open("sample3.txt","w") as fh:
    fh.write("Hello World, I am learning file handling in Python today")

In [None]:
#To write a list of strings to a file, use:
with open("sample4.txt", "w") as wf:
    lines_of_text = ["a line of text\n", "another line of text\n", "a third line"]
    wf.writelines(lines_of_text)

<b>Assignment - Open an existing file from your computer and copy all of its contents to a new file, one line at a time.</b>

In [13]:
with open('sample.txt', 'r') as rf:
    with open('copy_sample.txt', 'w') as wf:
        for lines in rf:
            print(lines, end = '$')
            wf.write(lines)

This is a sample text file. 
$This line is your second line. 
$Now we are in the third line.$

In [None]:
#To append to a file, use:
with open("sample.txt", "a") as af:
    af.write("Hello World again")

In [None]:
#How to copy an image file to a new image file
#Keep a cat image named 'cat1.jpg' inside your folder to make this work
with open('cat1.jpg', 'rb') as rf:
    with open('cat1_copy.jpg', 'wb') as wf:
        for lines in rf:
            wf.write(lines)
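
Looping over a binary file splits the data wherever a newline byte happens to occur, so the pieces can have very uneven sizes. A common alternative, sketched here under some assumptions, is to copy fixed-size blocks with read(size). The file names, the stand-in data and the block size of 4096 bytes are all chosen just for this illustration; no real image is needed to run it.

```python
import os
import tempfile

CHUNK_SIZE = 4096  # arbitrary block size chosen for this sketch

with tempfile.TemporaryDirectory() as folder:
    source = os.path.join(folder, 'source.bin')  # hypothetical file names
    target = os.path.join(folder, 'target.bin')

    # Stand-in for an image: 10240 bytes of arbitrary binary data
    with open(source, 'wb') as wf:
        wf.write(bytes(range(256)) * 40)

    with open(source, 'rb') as rf:
        with open(target, 'wb') as wf:
            while True:
                chunk = rf.read(CHUNK_SIZE)  # at most CHUNK_SIZE bytes
                if not chunk:                # b'' means end of file
                    break
                wf.write(chunk)

    with open(source, 'rb') as a, open(target, 'rb') as b:
        same = a.read() == b.read()
    copied_size = os.path.getsize(target)

    print(same, copied_size)
```

For everyday use, the standard library function shutil.copyfile(source, target) performs this kind of copy in a single call.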

In [86]:
# with open('canada_company_sales_data.xlsx') as f:
#     print(f.readlines())
import pandas as pd
# df = pd.read_csv('sample_submission.csv')
# df

# Do this first to make it work - pip install openpyxl
# (recent pandas versions read .xlsx files with openpyxl; older versions used xlrd)
df = pd.read_excel('canada_company_sales_data.xlsx')
df.head(10)

Unnamed: 0,Order ID,Order Date,Order Priority,Order Quantity,Sales,Ship Mode,Shipping Cost,Province,Customer Segment,Product Category,Product Sub-Category,Product Container,Ship Date
0,928,2011-03-01,Low,26,390.2,Express Air,7.4,British Columbia,Consumer,Furniture,Office Furnishings,Small Box,2011-03-03
1,32323,2010-07-23,High,38,259.7175,Regular Air,5.03,Ontario,Small Business,Technology,Telephones and Communication,Medium Box,2010-07-25
2,48353,2012-12-15,Not Specified,18,71.22,Regular Air,0.7,British Columbia,Corporate,Office Supplies,Pens & Art Supplies,Wrap Bag,2012-12-17
3,10144,2011-01-02,Critical,1,192.49,Delivery Truck,30.0,British Columbia,Corporate,Furniture,Chairs & Chairmats,Jumbo Drum,2011-01-04
4,26756,2012-05-10,Medium,25,767.26,Regular Air,4.0,British Columbia,Home Office,Technology,Computer Peripherals,Small Box,2012-05-10
5,18144,2011-06-07,Critical,48,207.08,Regular Air,5.17,Northwest Territories,Corporate,Office Supplies,Paper,Small Box,2011-06-09
6,10369,2011-11-09,Low,23,683.68,Regular Air,8.99,British Columbia,Home Office,Technology,Computer Peripherals,Small Pack,2011-11-14
7,22912,2010-10-16,Low,33,10168.23,Express Air,19.99,Yukon,Corporate,Office Supplies,Binders and Binder Accessories,Small Box,2010-10-20
8,51008,2011-08-20,High,20,269.66,Regular Air,4.59,Quebec,Consumer,Office Supplies,"Scissors, Rulers and Trimmers",Wrap Bag,2011-08-22
9,18279,2009-02-26,Medium,20,10281.79,Regular Air,24.49,Alberta,Consumer,Technology,Copiers and Fax,Large Box,2009-02-27
