### Some Theory
##### Types of data used for I/O:
    Text - '12345' stored as a sequence of Unicode characters
    Binary - 12345 stored as a sequence of bytes holding its binary value

Hence there are 2 file types to deal with:

    Text files - for example, all program source files are text files

    Binary files - images, music, video, exe files
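As a small illustration of the two representations (a sketch only; the variable names are made up for this example), the same value can be held as text or as raw bytes:

```python
# The number 12345 stored as text: one character per digit
text_form = '12345'
text_bytes = text_form.encode('utf-8')     # 5 bytes, one per character

# The same number stored as binary: the bytes of the integer itself
binary_form = (12345).to_bytes(2, 'big')   # 2 bytes: 0x30 0x39

print(len(text_bytes))                       # 5
print(len(binary_form))                      # 2
print(int.from_bytes(binary_form, 'big'))    # 12345
```

The text form needs one byte per digit here, while the binary form stores the whole number in two bytes.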

### How File I/O is done in most programming languages

    Open a file
    Read/Write data
    Close the file
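The three steps above can be sketched in Python as follows. This is only an illustration: the file name `steps_demo.txt` is hypothetical, and a temporary directory is used so that no real file is touched.

```python
import os
import tempfile

# A throwaway directory keeps this example from touching real files
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'steps_demo.txt')   # hypothetical file name

    f = open(path, 'w')       # 1. open
    try:
        f.write('hello')      # 2. read/write
    finally:
        f.close()             # 3. close, even if the write failed

    f = open(path, 'r')       # the same three steps, this time for reading
    try:
        content = f.read()
    finally:
        f.close()

print(content)   # hello
```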


### Writing to a file

In [3]:
# case 1 - the file does not exist yet

f = open('sample.txt', 'w')   # create sample.txt (or open it if it exists) in write mode
f.write('Hello world')        # write the string to the file
f.close()                     # close the file

# the file is now closed, so this would raise ValueError
#f.write('Hi')

In [4]:
# write a multiline string
# syntax: open(path, 'w'), where path is the location (drive and folders) at which the file is created

f = open('sample1.txt','w')      # open
f.write('how are you')           # write
f.write("\nwhat's going on?")    # write on a new line
f.close()                        # close

In [5]:
# case 2 - the file already exists

f = open('sample.txt','w')
f.write('Anand')     # 'w' mode empties the file first, so 'Anand' replaces the old content of sample.txt
f.close()

#### How exactly does open() work?

The file itself lives on the computer's hard drive, not in RAM. When we run f = open('sample.txt', 'w'), Python asks the operating system to open that file and sets up a buffer in RAM through which the file's data passes. While we perform read and write operations, the data we are working on sits in this buffer. When close() is called, whatever is still in the buffer is written back to the hard drive and the memory is released.
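A small sketch of this buffering behaviour, assuming Python's default buffering settings. Data passed to write() may sit in memory until the file is flushed or closed. The file name `buffer_demo.txt` is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'buffer_demo.txt')   # hypothetical file name

    f = open(path, 'w')
    f.write('Hello')
    size_before_close = os.path.getsize(path)   # often 0: the text is still in the buffer
    f.close()                                   # flushes the buffer to disk
    size_after_close = os.path.getsize(path)    # 5

print(size_before_close, size_after_close)
```

The size seen before close() depends on the buffering in use, so treat that first number as typical rather than guaranteed.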

In [7]:
# Problem with 'w' mode: it erases the existing content
# Introducing append mode

f = open("sample1.txt",'a')    # 'a' means append mode: new data is added at the end and the previous content is kept
f.write("\nIt's all good")
f.close()

In [8]:
# write multiple lines

L = ['Hey','\nEveryone','\nhere we are introducing','\nall new wagonR']

f = open('sample.txt','w')
f.writelines(L)      # writelines() writes every string in the list; it does not add newlines itself
f.close()
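Since writelines() writes the strings exactly as given, one alternative to putting `\n` in front of each item is to append it while writing. A minimal sketch, with a hypothetical file name:

```python
import os
import tempfile

lines = ['Hey', 'Everyone', 'here we are']

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'lines_demo.txt')   # hypothetical file name

    f = open(path, 'w')
    f.writelines(line + '\n' for line in lines)   # add the newline to each item ourselves
    f.close()

    f = open(path, 'r')
    written = f.read()
    f.close()

print(written)
```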

#### Why is close() a must?
##### Two reasons:
1.)  An open file holds a buffer in RAM and a file handle from the operating system. Data we have written may still be in that buffer, and a program that keeps many files open wastes memory and handles. Calling close() writes the pending data to disk and frees these resources.

2.)  Second, leaving the file open is risky for its content: while it stays open, any other code in the program can still read or change the file through the open handle. Closing it as soon as we are done avoids this.

In [10]:
# reading from files
# using read(): reads the entire content at once

f = open('sample.txt','r')
s = f.read()
print(s)
f.close()

# Reading from or writing to a text file always works with strings
# numbers, dictionaries, sets etc. must be converted to strings first

Hey
Everyone
here we are introducing
all new wagonR


In [11]:
# reading upto n characters

f = open('sample.txt','r')
s = f.read(14)
print(s)
f.close()

Hey
Everyone
h


In [12]:
# readline(): to read line by line
f = open('sample.txt','r')
d = f.readline()
s = f.readline()
print(d,end='')
print(s,end = '')   # end='' avoids an extra blank line, since each line already ends with '\n'
f.close()

Hey
Everyone


##### When to use read() and readline()
read(): use it when the file is small, so loading all of its content at once costs little memory.

readline(): use it when the file is large and memory has to be managed. Example: if the file has a lot of content and we only need to work on a few selected lines, we read it line by line with readline().

In [14]:
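# reading the entire file using readline()
# if we don't know how many lines the file has, we use a while loop that stops at the end of the file

f = open('sample.txt', 'r')

# this version does not work: each readline() call consumes a line, so lines get skipped,
# and the end-of-file value is '' (empty string), not ' '
# while f.readline() != " ":
#     print(f.readline(),end="")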

while True:
    data = f.readline()

    if data == '':
        break
    else:
        print(data,end='')

f.close()

Hey
Everyone
here we are introducing
all new wagonR
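The same line-by-line loop can also be written by iterating over the file object directly, which yields one line at a time. A minimal sketch with a hypothetical file name:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'iter_demo.txt')   # hypothetical file name

    f = open(path, 'w')
    f.write('Hey\nEveryone\nhere we are')
    f.close()

    collected = []
    f = open(path, 'r')
    for line in f:                 # yields one line at a time, like repeated readline()
        collected.append(line)
        print(line, end='')
    f.close()
```

The loop stops on its own at the end of the file, so no explicit check for an empty string is needed.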

### Using context manager (with)

    1.) It's a good idea to close a file after use, as it frees up resources
    2.) If we don't close it, the garbage collector closes it eventually, but we cannot rely on when
    3.) The with keyword closes the file as soon as the block ends
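To make these points concrete, here is a sketch comparing manual closing with `with`. The file name is hypothetical; `closed` is the file object's attribute that tells whether it has been closed.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'with_demo.txt')   # hypothetical file name

    # Manual version: close() must be called explicitly, even on errors
    f1 = open(path, 'w')
    try:
        f1.write('manual')
    finally:
        f1.close()

    # 'with' version: the file is closed when the block ends
    with open(path, 'w') as f2:
        f2.write('automatic')
        closed_inside = f2.closed    # False while inside the block

    closed_after = f2.closed         # True once the block has ended

print(closed_inside, closed_after)   # False True
```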

In [16]:
# 'with' keyword: opens the file and closes it automatically when the indented block ends

with open('sample1.txt','w') as f:
    f.write('Anand S')

In [17]:
f.write('Hello')    # raises an error: 'with' already closed the file when its block ended, so no more writes are possible

ValueError: I/O operation on closed file.

In [None]:
# f.read() using 'with' keyword

with open('sample.txt', 'r') as f:
    print(f.read())

# same we can do with readline()

In [None]:
# moving within a file: loading a file in chunks. For a large file (several GB) it is good practice to read it in chunks, as in the code below

with open('sample.txt', 'r') as f:
    print(f.read(10))
    print(f.read(10))

In [None]:
# benefit? a big file never has to be loaded into memory all at once. First, create a big file:

big_L = ['hello world' for i in range(1000)]

with open('big.txt', 'w') as f:
    f.writelines(big_L)

In [None]:
# loading the file big.txt created above in chunks

with open('big.txt', 'r') as f:
    chunk_size = 100

    # read one chunk at a time; read() returns an empty string at the end of the file
    chunk = f.read(chunk_size)
    while len(chunk) > 0:
        print(chunk)
        chunk = f.read(chunk_size)
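Chunked reading can also be wrapped in a small generator so the loop is reusable. This is a sketch: `read_in_chunks` is a hypothetical helper written for this example, not a standard library function, and the file name is made up.

```python
import os
import tempfile

def read_in_chunks(file_obj, chunk_size=100):
    """Yield successive chunks until read() returns an empty string."""
    while True:
        chunk = file_obj.read(chunk_size)
        if not chunk:
            break
        yield chunk

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'chunk_demo.txt')   # hypothetical file name

    with open(path, 'w') as f:
        f.write('hello world' * 1000)                # 11,000 characters

    with open(path, 'r') as f:
        chunks = list(read_in_chunks(f, chunk_size=100))

print(len(chunks))                    # 110
print(sum(len(c) for c in chunks))    # 11000
```

The chunks are collected in a list here only so they can be counted. With a really large file you would process each chunk inside a `for` loop and let it go.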

#### Seek and Tell

In [None]:
# Seek and Tell

with open('sample.txt', 'r') as f:
    print(f.read(10))

    # tell() reports the current position in the file, i.e. how much text has been processed so far and where the next read will start
    print(f.tell())

    # seek() moves the current position to the offset we pass in, as shown in the code below
    f.seek(0)
    print(f.read(10))
    print(f.tell())

    f.seek(12)
    print(f.read(10))
    print(f.tell())
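seek() also accepts a second argument that says where the offset is counted from. The sketch below uses binary mode, where offsets are plain byte counts, and `os.SEEK_END` to count from the end of the file. The file name is hypothetical.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'seek_demo.txt')   # hypothetical file name

    with open(path, 'wb') as f:
        f.write(b'Hey\nEveryone\nhere')

    with open(path, 'rb') as f:
        f.seek(4)                     # absolute offset from the start
        word = f.read(8)              # b'Everyone'
        position = f.tell()           # 12

        f.seek(-4, os.SEEK_END)       # 4 bytes before the end of the file
        tail = f.read()               # b'here'

print(word, position, tail)
```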

In [None]:
# seek during write

with open('sample.txt','w') as f:
    f.write('Hello')
    f.seek(0)
    #f.write('X')   # the file would contain Xello
    f.write('Xa')   # the file contains Xallo

### Problems with working in text mode
    1.) can't work with binary files like images
    2.) not good for other data types like int/float/list/tuples

In [None]:
# problem when working with a binary file in text mode

with open('car.jpeg','r') as f:
    f.read()   # typically raises UnicodeDecodeError, because image bytes are not valid text

In [None]:
# working with binary file

# 'rb' (read binary) and 'wb' (write binary) modes are required to read and write binary files
with open('car.jpeg', 'rb') as f:
    with open('car_copy.jpeg', 'wb') as wf:
        wf.write(f.read())
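For a large binary file, the copy can be done block by block instead of with a single f.read(). A sketch under stated assumptions: the file names are hypothetical, and the 4096-byte block size is an arbitrary choice.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    src = os.path.join(tmp_dir, 'source.bin')      # hypothetical file names
    dst = os.path.join(tmp_dir, 'copy.bin')

    original = bytes(range(256)) * 40              # 10,240 bytes of arbitrary binary data
    with open(src, 'wb') as f:
        f.write(original)

    # Copy 4096 bytes at a time instead of reading the whole file at once
    with open(src, 'rb') as rf, open(dst, 'wb') as wf:
        while True:
            block = rf.read(4096)
            if not block:
                break
            wf.write(block)

    with open(dst, 'rb') as f:
        copied = f.read()

print(copied == original)   # True
```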

In [None]:
# working with other data types

with open('sample.txt','w') as f:
    f.write(5)   # raises TypeError: write() needs a string, not an int

In [None]:
with open('sample.txt', 'w') as f:
    f.write('5')

In [None]:
with open('sample.txt', 'r') as f:
    print(f.read() + 5)    # raises TypeError: a string cannot be added to an int

In [None]:
with open('sample.txt', 'r') as f:
    print(int(f.read()) + 5)

In [None]:
# more complex data

d = {
    'name' : 'Anand',
    'age' : '33',
    'gender' : 'M'
}

with open('sample.txt','w') as f:
    f.write(d)        # raises TypeError: write() needs a string, not a dict

In [None]:
d = {
    'name' : 'Anand',
    'age' : '33',
    'gender' : 'M'
}

with open('sample.txt','w') as f:
    f.write(str(d))

In [None]:
with open('sample.txt','r') as f:
    content = f.read()     # read once; a second read() would return an empty string
    print(content)
    print(type(content))   # <class 'str'>

In [None]:
# dict() cannot turn the string back into a dictionary

with open('sample.txt','r') as f:
    print(dict(f.read()))   # raises ValueError

##### A string cannot be turned back into a dictionary this way. To store and restore such data we need the concept of Serialization and Deserialization

### Serialization and Deserialization
    Serialization: process of converting Python data types to JSON format
    Deserialization: process of converting JSON back to Python data types

What is JSON?

    JSON stands for JavaScript Object Notation. It is a text-based, language-independent data format that almost every programming language can read and write, and most web APIs use it.
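A quick sketch of both directions using strings instead of files. `json.dumps` and `json.loads` are the string counterparts of `json.dump` and `json.load`; the `record` dictionary is made up for this example.

```python
import json

record = {'name': 'Anand', 'marks': [18, 15], 'passed': True, 'note': None}

text = json.dumps(record)        # serialization: Python object -> JSON string
restored = json.loads(text)      # deserialization: JSON string -> Python object

print(text)
print(restored == record)        # True
```

In the JSON text, Python's `True` is written as `true` and `None` as `null`.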

In [None]:
# Serialization using json module
# list
import json

L = [1,2,3,4]

with open('demo.json','w') as f:
    json.dump(L,f)    

In [None]:
# dictionary

d = {
    'name' : 'Anand',
    'age' : '33',
    'gender' : 'M'
}

with open('demo.json', 'w') as f:
    json.dump(d,f)

In [None]:
# deserialization
import json

with open('demo.json', 'r') as f:
    d = json.load(f)
    print(d)
    print(type(d))

In [None]:
# serialize and deserialize tuple
import json

t = (1,2,3,4)

with open('demo.json', 'w') as f:
    json.dump(t,f)     # JSON has no tuple type, so the tuple is stored as a list; after loading, convert it back with tuple() if needed
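A short sketch of the tuple round trip, using JSON strings instead of a file to keep it compact:

```python
import json

t = (1, 2, 3, 4)

restored = json.loads(json.dumps(t))   # written as a JSON array, read back as a list
print(restored)                        # [1, 2, 3, 4]
print(type(restored))                  # <class 'list'>

as_tuple = tuple(restored)             # convert back explicitly if a tuple is needed
print(as_tuple == t)                   # True
```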

In [32]:
# serialization of a nested dictionary (it can be loaded back with json.load)

d = {
    'student' : 'Anand',
    'marks' : [18,15,16,12]
}

with open('demo.json', 'w') as f:
    json.dump(d,f)
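The deserialization half for a nested dictionary can be sketched like this. The file name is hypothetical and a temporary directory is used so that `demo.json` is left alone.

```python
import json
import os
import tempfile

d = {'student': 'Anand', 'marks': [18, 15, 16, 12]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'nested_demo.json')   # hypothetical file name

    with open(path, 'w') as f:
        json.dump(d, f)

    with open(path, 'r') as f:
        loaded = json.load(f)

print(loaded['marks'][0])      # 18
print(sum(loaded['marks']))    # 61
```

The nested list comes back as a real list, so indexing and arithmetic work without any string conversion.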

### Serialization and Deserialization of custom objects

In [41]:
class Person:

    def __init__(self,fname,lname,age,gender):
        self.fname = fname
        self.lname = lname
        self.age = age
        self.gender = gender

person = Person("Anand",'S',33,'M')

In [43]:
# As a string

import json

with open('demo1.json','w') as f:
    json.dump(person,f)    # does not work: json does not know how to serialize a Person object

TypeError: Object of type Person is not JSON serializable

In [51]:
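# store the object as a string

import json

def show_object(obj):
    # json calls this function for any object it cannot serialize by itself
    if isinstance(obj,Person):
        return "Name -> {} {} Age -> {} Gender -> {}".format(obj.fname,obj.lname,obj.age,obj.gender)
    raise TypeError("Object of type {} is not JSON serializable".format(type(obj).__name__))
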

with open('demo1.json','w') as f:
    json.dump(person,f,default=show_object)   

In [55]:
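# store the object as a dictionary

import json

def show_object(obj):
    # json calls this function for any object it cannot serialize by itself
    if isinstance(obj,Person):
        return {'Name' : obj.fname +' '+ obj.lname, 'Age' : obj.age, 'Gender' : obj.gender}
    raise TypeError("Object of type {} is not JSON serializable".format(type(obj).__name__))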

with open('demo1.json','w') as f:
    json.dump(person,f,default=show_object)   

In [63]:
# deserialization
import json

with open('demo1.json','r') as f:
    o = json.load(f)
    print(o)
    print(type(o))

{'Name': 'Anand S', 'Age': 33, 'Gender': 'M'}
<class 'dict'>
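As the output shows, json.load() gives back a plain dict, not a Person. One possible way to rebuild the object is the `object_hook` parameter of `json.loads`/`json.load`. The sketch below is an assumption-laden example: `encode_person` and `decode_person` are hypothetical helpers, and it stores `fname` and `lname` as separate keys (unlike the combined 'Name' key above) so the object can be reconstructed.

```python
import json

class Person:
    def __init__(self, fname, lname, age, gender):
        self.fname = fname
        self.lname = lname
        self.age = age
        self.gender = gender

def encode_person(obj):
    # Called by json for objects it cannot serialize on its own
    if isinstance(obj, Person):
        return {'fname': obj.fname, 'lname': obj.lname,
                'age': obj.age, 'gender': obj.gender}
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')

def decode_person(data):
    # Called by json for every decoded JSON object (dict)
    if set(data) == {'fname', 'lname', 'age', 'gender'}:
        return Person(**data)
    return data

text = json.dumps(Person('Anand', 'S', 33, 'M'), default=encode_person)
restored = json.loads(text, object_hook=decode_person)

print(type(restored).__name__)          # Person
print(restored.fname, restored.age)     # Anand 33
```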


### Pickling:
    Pickling is the process whereby a Python object hierarchy is converted into a byte stream, and Unpickling is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy.

In [72]:
class Person:

    def __init__(self,name,age):
        self.name = name
        self.age = age

    def display_info(self):
        print('Hi My name is:',self.name,'and I am:',self.age,'years old')

p = Person('Anand',33)

In [78]:
# pickle dump: converting an object to a byte stream
import pickle

with open('person.pkl', 'wb') as f:
    pickle.dump(p,f)

In [82]:
# pickle load

import pickle

with open('person.pkl', 'rb') as f:
    p = pickle.load(f)

p.display_info()

Hi My name is: Anand and I am: 33 years old


#### Pickle vs Json

###### Pickle: Lets the user store data in binary format. (If we need to keep an object's functionality and send it to another file or program, we use pickle.)

###### JSON: Lets the user store data in a human-readable text format. (If we need a text representation of our custom object, we use JSON.)
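A small sketch of the difference in practice, using only built-in types. The `data` dictionary is made up for this example, and `pickle.dumps`/`pickle.loads` are the in-memory counterparts of `pickle.dump`/`pickle.load`.

```python
import json
import pickle

data = {'point': (1, 2), 'tags': {'a', 'b'}}

# pickle keeps the tuple and the set exactly as they were
restored_pickle = pickle.loads(pickle.dumps(data))
print(restored_pickle == data)                 # True
print(type(restored_pickle['point']))          # <class 'tuple'>

# JSON has no tuple type, so the tuple comes back as a list
restored_json = json.loads(json.dumps({'point': (1, 2)}))
print(restored_json)                           # {'point': [1, 2]}

# JSON cannot represent a set at all
try:
    json.dumps(data)
    set_error = None
except TypeError as exc:
    set_error = exc
print(type(set_error).__name__)                # TypeError
```

One more practical difference: a pickle file should only be loaded if it comes from a source you trust, because unpickling can run arbitrary code.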