File Handling + Serialization & Deserialization

Some Theory    
           
Types of data used for I/O:
Text - '12345' stored as a sequence of Unicode characters
Binary - 12345 stored as the bytes of its binary representation

Hence there are 2 file types to deal with:
Text files - all program source code files are text files
Binary files - images, music, video, exe files
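
To make the text/binary distinction concrete, the sketch below stores the same value both ways. The variable names are illustrative, and the 2-byte big-endian layout is only one possible binary encoding, chosen here as an assumption for the example.

```python
# the same value, once as text and once as binary
as_text = '12345'.encode('utf-8')        # one byte per character
as_binary = (12345).to_bytes(2, 'big')   # the integer itself packed into 2 bytes (assumed layout)

print(as_text, len(as_text))             # b'12345' 5
print(as_binary.hex(), len(as_binary))   # 3039 2

# reading the binary form back gives the original number
print(int.from_bytes(as_binary, 'big'))  # 12345
```

The text form needs five bytes and must be parsed with int() before doing arithmetic, while the binary form is shorter but not readable in a text editor.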

How File I/O is done in most programming languages         
Open a file                 
Read/Write data              
Close the file                 
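
The three steps above can be sketched in Python as follows. The try/finally wrapper and the temporary file name are assumptions added for illustration; they only make sure the close step runs even if the write fails.

```python
import os
import tempfile

# hypothetical file inside a temporary folder, so nothing else is overwritten
path = os.path.join(tempfile.mkdtemp(), 'steps_demo.txt')

f = open(path, 'w')        # 1. open the file
try:
  f.write('Hello World')   # 2. write data
finally:
  f.close()                # 3. close the file, even if the write raised an error

f = open(path, 'r')        # same three steps for reading
try:
  content = f.read()
finally:
  f.close()

print(content)    # Hello World
print(f.closed)   # True
```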

Writing to a file

In [3]:
# case 1 - if the file is not present
f = open('sample.txt','w')
f.write('Hello World')
f.close()
# since the file is closed, this will not work
f.write('Hello')

ValueError: I/O operation on closed file.

In [5]:
# write multiline strings
f = open('sample1.txt','w')
f.write('hello world')
f.write('\nhow are you')
f.close()

In [6]:
# case 2 - if the file is already present
f = open('sample.txt','w')
f.write('ISRO')
f.close()

In [7]:
# how exactly open() works?

In [15]:
# problem with w mode -> it overwrites the existing content
# introducing append mode -> new data is added at the end
f = open('sample1.txt','a')
f.write('\nIndian Space Research Organization')
f.close()
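
The difference between 'w' and 'a' can be checked side by side. This is a small sketch on a hypothetical temporary file, not one of the files used in the cells above.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'modes_demo.txt')  # hypothetical file name

f = open(path, 'w')
f.write('first')
f.close()

f = open(path, 'w')    # 'w' empties the file first, so 'first' is lost
f.write('second')
f.close()

f = open(path, 'a')    # 'a' keeps the existing content and writes at the end
f.write(' third')
f.close()

f = open(path, 'r')
result = f.read()
f.close()

print(result)   # second third
```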

In [14]:
# write multiple lines with writelines() (it does not add '\n' for you)
L = ['hello\n','hi\n','how are you\n','i am fine']

f = open('sample1.txt','w')
f.writelines(L)
f.close()

In [16]:
# reading from files
# -> using read()
f = open('sample1.txt','r')
s = f.read()
print(s)
f.close()

hello
hi
how are you
i am fine
Indian Space Research Organization


In [17]:
# reading up to n chars
f = open('sample1.txt','r')
s = f.read(10)
print(s)
f.close()

hello
hi
h


In [19]:
# readline() - to read line by line
f = open('sample1.txt')
print(f.readline(),end='')
print(f.readline(),end='')
f.close()

hello
hi


In [23]:
# reading the entire file using readline()
f = open('sample1.txt','r')

while True:

  data = f.readline()

  if data == "":
    break
  else:
    print(data,end='')
f.close()

hello
hi
how are you
i am fine
Indian Space Research Organization
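
A file object can also be looped over directly, which gives one line per iteration and stops at the end of the file, so no manual check for an empty string is needed. The sketch below assumes a hypothetical temporary file with the same four lines.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'lines_demo.txt')  # hypothetical file name

f = open(path, 'w')
f.writelines(['hello\n', 'hi\n', 'how are you\n', 'i am fine'])
f.close()

lines = []
f = open(path, 'r')
for line in f:            # one line at a time, newline included
  lines.append(line)
  print(line, end='')
f.close()
```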

Using Context Manager (With)                  

It's a good idea to close a file after use, as this frees up the resources tied to it
If we don't close it, the garbage collector will eventually close it, but we cannot rely on when that happens
The with keyword closes the file as soon as its block ends
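
A quick way to see this is to inspect f.closed inside and outside a with block. The sketch below uses a hypothetical temporary file, and the RuntimeError is raised on purpose only to show that the file is closed even when the block fails.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'with_demo.txt')  # hypothetical file name

with open(path, 'w') as f:
  f.write('ISRO')
  inside = f.closed     # False: the file is still open inside the block

outside = f.closed      # True: closed automatically when the block ended

# the file is closed even if the block raises an error
try:
  with open(path, 'r') as g:
    raise RuntimeError('something went wrong')
except RuntimeError:
  pass

print(inside, outside, g.closed)   # False True True
```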

In [30]:
# with -> you don't need to close the file explicitly
with open('sample1.txt','w') as f:
  f.write('ISRO -> Indian Space Research Organization')

In [31]:
# try f.read() now

with open('sample1.txt','r') as f:
  print(f.read())

ISRO -> Indian Space Research Organization


In [34]:
# moving within the file -> each read(10) continues where the previous one stopped
with open('sample1.txt','r') as f:
  print(f.read(10))
  print(f.read(10))
  print(f.read(10))

ISRO -> In
dian Space
 Research 


In [35]:
# benefit -> a big file can be processed in chunks instead of loading all of it in memory
big_L = ['hello world' for i in range(1000)]

with open('big.txt','w')as f:
  f.writelines(big_L)

In [37]:
with open('big.txt','r') as f:

  chunk_size = 100

  while True:
    # read each chunk exactly once; every read() call moves the file position forward
    chunk = f.read(chunk_size)
    if not chunk:
      break
    print(chunk,end='')

hello worldhello worldhello worldhello worldhello world... (output truncated: 'hello world' is printed 1000 times on one line)
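
Every call to f.read(chunk_size) moves the file position forward, so each chunk should be read once and then used. One way to package that pattern is a small generator; read_in_chunks and the temporary file below are hypothetical names used only for this sketch.

```python
import os
import tempfile

def read_in_chunks(path, chunk_size=100):
  # hypothetical helper: yields the file piece by piece
  with open(path, 'r') as f:
    while True:
      chunk = f.read(chunk_size)
      if chunk == '':       # an empty string means end of file
        break
      yield chunk

path = os.path.join(tempfile.mkdtemp(), 'big_demo.txt')  # hypothetical file name
with open(path, 'w') as f:
  f.writelines(['hello world' for i in range(1000)])

total = 0
count = 0
for chunk in read_in_chunks(path):
  total += len(chunk)
  count += 1

print(total, count)   # 11000 110
```

Only one chunk of 100 characters is held in memory at a time, regardless of how large the file is.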

In [41]:
# seek and tell function
with open('sample1.txt','r')as f :
  print(f.read(10))
  print(f.tell())
  f.seek(0)
  print(f.read(14))
  print(f.tell())
  f.seek(15)
  print(f.read(10))
  print(f.tell())

ISRO -> In
10
ISRO -> Indian
14
Space Rese
25


In [43]:
# seek during write
with open('sample1.txt','w') as f:
  f.write('Hello')
  f.seek(0)
  f.write('Xa')
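
Reading the file back shows the effect: after seek(0), the second write overwrites the first two characters instead of inserting before them. A sketch on a hypothetical temporary file:

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'seek_demo.txt')  # hypothetical file name

with open(path, 'w') as f:
  f.write('Hello')
  f.seek(0)          # move back to the start
  f.write('Xa')      # replaces 'He', the rest stays untouched

with open(path, 'r') as f:
  result = f.read()

print(result)   # Xallo
```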

Problems with working in text mode

It can't handle binary files such as images
It is awkward for other data types like int/float/list/tuple, since everything has to be converted to and from str

---



In [44]:
# opening a binary file in text mode fails
with open('pic.jpg','r')as f:
  f.read()

UnicodeDecodeError: ignored

In [46]:
# working with binary file
with open('pic.jpg','rb')as f:
  with open('pic_copy.jpg','wb')as wf:
    wf.write(f.read())
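
For large binary files, the copy can also be done in chunks rather than with a single f.read(). The sketch below is an assumption-laden variant: it creates its own small binary file (data.bin, a stand-in for pic.jpg) so that it runs on its own, and the 4096-byte chunk size is arbitrary.

```python
import os
import tempfile

folder = tempfile.mkdtemp()
src = os.path.join(folder, 'data.bin')        # hypothetical stand-in for pic.jpg
dst = os.path.join(folder, 'data_copy.bin')

with open(src, 'wb') as f:
  f.write(bytes(range(256)) * 40)   # 10240 bytes, many of which are not valid text

with open(src, 'rb') as f:
  with open(dst, 'wb') as wf:
    while True:
      chunk = f.read(4096)          # in 'rb' mode read() returns bytes, not str
      if not chunk:
        break
      wf.write(chunk)

with open(src, 'rb') as a, open(dst, 'rb') as b:
  same = a.read() == b.read()

print(same, os.path.getsize(dst))   # True 10240
```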

In [47]:
# working with other data types
# in text mode write() accepts only str, no other data type
with open('sample1.txt','w')as f:
  f.write(5)

TypeError: ignored

In [51]:
with open('sample1.txt','w')as f:
  f.write('5')
# we convert int into str

In [52]:
with open('sample1.txt','r')as f:
  print(int(f.read()) +5)

10


In [54]:
# more complex data
d = {
    'name':'nitish',
     'age':33,
     'gender':'male'
}

with open('sample1.txt','w') as f:
  f.write(str(d))

In [55]:
with open('sample1.txt','r') as f:
  # read once and reuse the result; a second f.read() would return an empty string
  s = f.read()
  print(s)
  print(type(s))

{'name': 'nitish', 'age': 33, 'gender': 'male'}
<class 'str'>


In [56]:
with open('sample1.txt','r') as f:
  print(dict(f.read()))

ValueError: dictionary update sequence element #0 has length 1; 2 is required
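
dict() fails here because it expects key/value pairs, not the printed form of a dictionary. One possible workaround, which is not part of the cells above, is ast.literal_eval from the standard library, which parses a string containing a Python literal. This is only a sketch; the JSON approach in the next section is the more common choice.

```python
import ast

# the text that str(d) wrote into the file
text = "{'name': 'nitish', 'age': 33, 'gender': 'male'}"

d = ast.literal_eval(text)    # parses literals only, it does not run arbitrary code

print(type(d).__name__)   # dict
print(d['age'] + 1)       # 34
```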

Serialization and Deserialization    

Serialization - process of converting Python data types into a format that can be stored or transmitted (JSON in this section)
Deserialization - process of converting that format (JSON here) back into Python data types


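What is JSON?

JSON (JavaScript Object Notation) is a lightweight, human-readable text format for exchanging data, built from key-value objects, arrays, strings, numbers, true/false and null.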
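
As a rough illustration, JSON text looks much like Python's dict and list literals, with a few differences such as double quotes, true/false and null. The sketch below uses json.dumps and json.loads, which work on strings instead of files; the sample dictionary is made up for the example.

```python
import json

d = {'name': 'nitish', 'age': 33, 'passed': True, 'marks': [23, 45], 'address': None}

s = json.dumps(d)     # Python object -> JSON text
print(s)
# {"name": "nitish", "age": 33, "passed": true, "marks": [23, 45], "address": null}

back = json.loads(s)  # JSON text -> Python object
print(back == d)      # True
```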



In [61]:
# serialization using json module
# list

import json

L = [1,2,3,4,5]

with open('demo.json','w')as f:
  json.dump(L,f)

In [62]:
# dict
d = {
    'name':'nitish',
     'age':33,
     'gender':'male'
}

with open('demo.json','w')as f:
  json.dump(d,f,indent=4)

In [63]:
# deserializing
import json

with open('demo.json','r') as f:
  d = json.load(f)
  print(d)
  print(type(d))

{'name': 'nitish', 'age': 33, 'gender': 'male'}
<class 'dict'>


In [70]:
# serializing and deserializing a tuple
# a tuple cannot be stored as a tuple -> JSON always saves it as a list

import json

t = (1,2,3,4,5)

with open('demo.json','w')as f:
  json.dump(t,f)
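
The tuple-to-list conversion shows up when the data is loaded back. A minimal in-memory sketch using json.dumps and json.loads:

```python
import json

t = (1, 2, 3, 4, 5)

back = json.loads(json.dumps(t))

print(back)                  # [1, 2, 3, 4, 5]
print(type(back).__name__)   # list

# convert explicitly if a tuple is needed again
print(tuple(back) == t)      # True
```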

In [72]:
# serialize and deserialize a nested dict

d = {
    'student':'nitish',
    'marks':[23,45,85,75,25,65]
}

with open('demo.json','w')as f:
  json.dump(d,f)
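
Loading the nested dict back returns the same structure, with the inner list intact. The sketch below repeats the dump on a hypothetical temporary file so that it runs on its own.

```python
import json
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), 'nested_demo.json')  # hypothetical file name

d = {
    'student': 'nitish',
    'marks': [23, 45, 85, 75, 25, 65]
}

with open(path, 'w') as f:
  json.dump(d, f)

with open(path, 'r') as f:
  back = json.load(f)

print(back['marks'][2])   # 85
print(back == d)          # True
```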

Serializing and deserializing custom objects

In [78]:
class Person:

  def __init__(self,fname,lname,age,gender):
    self.fname = fname
    self.lname = lname
    self.age = age
    self.gender = gender

# format in which the object should be printed

In [79]:
person = Person('Nitish','Singh',33,'male')

In [80]:
# As a string
import json

with open('demo.json','w') as f:
  json.dump(person,f)

TypeError: Object of type Person is not JSON serializable

In [85]:
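# As a dict
import json

def show_object(person):
  # called by json.dump for every object it cannot serialize on its own
  if isinstance(person,Person):
    return {'name':person.fname + ' ' + person.lname,'age':person.age,'gender':person.gender}
  # anything else is still not serializable
  raise TypeError('Object of type ' + type(person).__name__ + ' is not JSON serializable')

with open('demo.json','w') as f:
  json.dump(person,f,default=show_object,indent=4)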
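
The default= hook covers only the serializing direction: loading such JSON with json.load gives back a plain dict, not a Person. If the object itself is needed again, one option is the object_hook parameter of json.load / json.loads. The sketch below is an assumption-based variant: to_dict and to_person are hypothetical helpers, and the first and last name are stored separately so the object can be rebuilt.

```python
import json

class Person:

  def __init__(self, fname, lname, age, gender):
    self.fname = fname
    self.lname = lname
    self.age = age
    self.gender = gender

def to_dict(obj):
  # hypothetical encoder hook, used through default=
  if isinstance(obj, Person):
    return {'fname': obj.fname, 'lname': obj.lname, 'age': obj.age, 'gender': obj.gender}
  raise TypeError('Object of type ' + type(obj).__name__ + ' is not JSON serializable')

def to_person(d):
  # hypothetical decoder hook, called for every JSON object that gets decoded
  if set(d) == {'fname', 'lname', 'age', 'gender'}:
    return Person(**d)
  return d

s = json.dumps(Person('Nitish', 'Singh', 33, 'male'), default=to_dict)
p = json.loads(s, object_hook=to_person)

print(type(p).__name__, p.fname, p.lname, p.age)   # Person Nitish Singh 33
```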

In [88]:
# deserializing
import json

with open('demo.json','r') as f:
  d = json.load(f)
  print(d)
  print(type(d))

{'name': 'Nitish Singh', 'age': 33, 'gender': 'male'}
<class 'dict'>


Pickling                         
Pickling is the process whereby a Python object hierarchy is converted into a byte stream, and unpickling is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy.
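
A minimal sketch of that round trip, using pickle.dumps and pickle.loads to work with a bytes object in memory instead of a file. The sample data is made up; it includes a tuple and a set, two types that JSON cannot store as they are.

```python
import pickle

data = {'name': 'nitish', 'marks': (23, 45), 'tags': {'a', 'b'}}

blob = pickle.dumps(data)      # object hierarchy -> byte stream
print(type(blob).__name__)     # bytes

back = pickle.loads(blob)      # byte stream -> object hierarchy
print(back == data)                    # True
print(type(back['marks']).__name__)    # tuple
print(type(back['tags']).__name__)     # set
```

Unlike JSON, the pickled bytes are Python-specific and not human-readable. Unpickling data from an untrusted source is unsafe, because loading a pickle can execute arbitrary code.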

In [89]:
class Person:

  def __init__(self,name,age):
    self.name = name
    self.age = age

  def display_info(self):
    print('Hi my name is',self.name,'and I am ',self.age,'years old')

In [90]:
p = Person('nitish',33)

In [66]:
# pickle dump
import pickle
with open('person.pkl','wb') as f:
  pickle.dump(p,f)

In [67]:
# pickle load
import pickle
with open('person.pkl','rb') as f:
  p = pickle.load(f)

p.display_info()

Hi my name is nitish and I am  33 years old
