### ***Types of data***
#### There are two types of data used for I/O:
#### - Text data: '12345' stored as a sequence of Unicode characters
#### - Binary data: 12345 stored as the sequence of bytes of its binary representation
#### Correspondingly, there are two types of files:
#### - Text files: for example, program source files
#### - Binary files: images, videos, audio, executable files
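
To make the distinction concrete, here is a small sketch of the same value held as text and as raw bytes. The variable names, the UTF-8 encoding, and the two-byte big-endian layout are choices made for this illustration only.

```python
# The same value represented as text and as binary data.
number = 12345

# Text: a sequence of Unicode characters, encoded (UTF-8 here) when stored.
as_text = str(number)                   # '12345'
text_bytes = as_text.encode("utf-8")    # one byte per digit

# Binary: the bytes of the number's binary representation.
# Two bytes and big-endian order are assumptions for this example.
binary_bytes = number.to_bytes(2, byteorder="big")

print(as_text, len(text_bytes))         # 12345 5
print(binary_bytes, len(binary_bytes))  # b'09' 2
```

The binary form prints as `b'09'` only because its two bytes (0x30 and 0x39) happen to be the ASCII codes of the characters 0 and 9; it is still the number 12345, not the text "09".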

### ***How File I/O is done in most programming languages***
#### Working with a file takes three steps:
#### - Open the file
#### - Read from or write to it
#### - Close the file

### ***Writing to file***

In [1]:
# case 1: the file does not exist yet, so "w" mode creates it
# open() returns a file object (also called a file handle)
f = open("sample.txt", "w")
# the file object is used for every read or write operation
f.write("Hello World")
# always close the file when done, so buffered data is written and resources are released
f.close()

In [2]:
f = open("sample.txt", "w")
f.write("Hello World")
f.close()
# the file is already closed, so this raises ValueError
f.write("hello")

ValueError: I/O operation on closed file.

In [6]:
# write multiline strings

f = open("sample1.txt", "w")
# writing first line
f.write("Hello World")
# writing second line
f.write("\nhow are you?")
f.close()

### Opening an existing file in write mode erases its content; only the newly written content remains

In [7]:
# case 2: If the file is already present
f = open("sample.txt", "w")
f.write("Prabhas")
f.close()

### ***How exactly does open() work?***
#### - A file lives on disk (hard drive or SSD). When open() is executed, Python asks the operating system to open the file and returns a file object.
#### - The file object holds a buffer in RAM. Reads and writes go through this buffer, so the disk is not accessed for every single character.
#### - The buffer stays in memory until the file operations are finished.
#### - When the file is closed, any data still in the buffer is written to the file on disk and the buffer is released.
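
The buffering described above can be observed with a rough sketch like the following. The scratch path is hypothetical and is created in a temporary directory so that `sample.txt` is left alone.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "buffer_demo.txt")

f = open(path, "w")
f.write("Hello")          # usually lands in an in-memory buffer first
f.flush()                 # push the buffered text to the operating system
size_after_flush = os.path.getsize(path)
f.close()                 # close() also flushes whatever is still buffered

print(size_after_flush)   # 5
```

Checking the size before `flush()` would typically show 0 for such a small write, but that depends on the buffering in effect, so the sketch only checks the size after the flush.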

In [8]:
# problem with "w" mode: the existing content is erased
# append mode ("a") adds new content at the end of the file instead
f = open("sample1.txt", "a")
f.write("\nIam fine")
f.close()
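
The difference between the two modes can be checked side by side. This is a minimal sketch on a hypothetical scratch file in a temporary directory.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this comparison.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

f = open(path, "w")       # "w" creates the file
f.write("first")
f.close()

f = open(path, "w")       # "w" again truncates it: "first" is gone
f.write("second")
f.close()

f = open(path, "a")       # "a" keeps the content and writes at the end
f.write("\nthird")
f.close()

f = open(path, "r")
content = f.read()
f.close()

print(content)
# second
# third
```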

In [9]:
# writelines(): write a list of strings in one call (newlines are not added automatically)
l = ["hello\n", "hi\n", "how are you?\n", "Iam fine"]
f = open("sample.txt", "w")
f.writelines(l)
f.close()  # closing flushes buffered data and frees resources

In [10]:
# write to a file in a subdirectory; the ./temp directory must already exist
l = ["hello\n", "hi\n", "how are you?\n", "Iam fine"]
f = open("./temp/sample.txt", "w")
f.writelines(l)
f.close()

### ***Reading from files***
#### - read(): reads the entire file content at once
#### - readline(): reads one line of the file at a time

In [11]:
f = open("sample.txt", "r")
content = f.read()
print(content)
f.close()

hello
hi
how are you?
Iam fine


In [12]:
# read up to n characters
f = open("sample.txt", "r")
content = f.read(10)
print(content)
f.close()

hello
hi
h


In [13]:
# read line by line
f = open("sample.txt", "r")
print(f.readline(), end="")
print(f.readline(), end="")
f.close()

hello
hi


#### - read(): suitable for small files that fit comfortably in memory
#### - readline(): suitable for larger files, since only one line at a time is loaded into RAM
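
A file object can also be iterated directly, which yields one line per step without calling readline() by hand. The sketch below assumes a hypothetical scratch file with the same four lines used in this section.

```python
import os
import tempfile

# Hypothetical scratch file with the same content as sample.txt.
path = os.path.join(tempfile.mkdtemp(), "lines_demo.txt")
f = open(path, "w")
f.writelines(["hello\n", "hi\n", "how are you?\n", "Iam fine"])
f.close()

f = open(path, "r")
lines = []
for line in f:                        # the file object yields one line per iteration
    lines.append(line.rstrip("\n"))   # drop the trailing newline, if any
f.close()

print(lines)   # ['hello', 'hi', 'how are you?', 'Iam fine']
```

Like readline(), this keeps only the current line in memory, so it also suits large files.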

In [14]:
# reading the entire file using readline(); it returns "" at the end of the file
f = open("sample.txt", "r")
while True:
    data = f.readline()
    if data == "":
        break
    else:
        print(data, end="")
f.close()

hello
hi
how are you?
Iam fine

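### ***Using a context manager (with)***
#### - It is good practice to close a file after use, as this frees up resources
#### - If we don't close the file, it is typically closed only when the file object is garbage collected, and the timing of that is not guaranteed
#### - The with statement closes the file as soon as its block ends, even if an error occurs inside the block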
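
Roughly speaking, `with` gives the same guarantee as a manual try/finally. The sketch below compares the two on a hypothetical scratch file and simulates a failure inside the `with` block.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this comparison.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")

# Manual version: finally guarantees close() even if write() raises.
f = open(path, "w")
try:
    f.write("Prabhas")
finally:
    f.close()

# Same guarantee with less code: the file is closed even when the block raises.
try:
    with open(path, "r") as g:
        content = g.read()
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass

print(f.closed, g.closed, content)   # True True Prabhas
```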

In [15]:
# writing the file using with
with open("sample1.txt","w") as f:
    f.write("Prabhas")

In [16]:
f.write("hello")

ValueError: I/O operation on closed file.

In [17]:
# reading the file
with open("sample.txt", "r") as f:
    print(f.read())

hello
hi
how are you?
Iam fine


In [18]:
# each read() continues where the previous one stopped: 10 characters, then the next 10
with open("sample.txt", "r") as f:
    print(f.read(10))
    print(f.read(10))

hello
hi
h
ow are you


In [19]:
# benefit: a big file can be processed in chunks instead of being loaded into memory all at once
big_L = ["Hello World" for i in range(1000)]

with open("big.txt", "w") as f:
    f.writelines(big_L)

In [20]:
with open("big.txt", "r") as f:
    chunk_size = 100
    while True:
        # call read() exactly once per iteration so that no chunk is skipped
        chunk = f.read(chunk_size)
        if not chunk:
            break
        print(chunk, end=" ")

Hello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHello WorldH ello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHe llo WorldHello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHello WorldHel ... (output truncated)

In [21]:
with open("big.txt", "r") as f:
    chunk_size = 10
    while True:
        # call read() exactly once per iteration so that no chunk is skipped
        chunk = f.read(chunk_size)
        if not chunk:
            break
        print(chunk, end="***")

Hello Worl***dHello Wor***ldHello Wo***rldHello W***orldHello ***WorldHello*** WorldHell***o WorldHel***lo WorldHe***llo WorldH***ello World***Hello Worl***... (output truncated)

#### - seek(position): moves the file cursor to the given position
#### - tell(): returns the current position of the cursor
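
Positions are easiest to reason about in binary mode, where they are plain byte offsets. In text mode the value returned by tell() is an opaque number that need not equal the count of characters read, for example when line endings are translated. The sketch below therefore uses binary mode on a hypothetical scratch file.

```python
import os
import tempfile

# Hypothetical scratch file, used only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "seek_demo.bin")
with open(path, "wb") as f:
    f.write(b"hello world")

with open(path, "rb") as f:
    first = f.read(5)            # b'hello'
    pos_after_read = f.tell()    # 5: the cursor sits after the bytes just read
    f.seek(6)                    # jump to byte offset 6
    rest = f.read()              # b'world'
    f.seek(0)                    # back to the start of the file
    again = f.read(5)            # b'hello'

print(first, pos_after_read, rest, again)   # b'hello' 5 b'world' b'hello'
```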

In [22]:
with open("sample.txt", "r") as f:
    print(f.read(10))
    print(f.tell())
    f.seek(0)
    print(f.read(10))
    print(f.tell())

hello
hi
h
12
hello
hi
h
12


In [23]:
with open("sample.txt", "r") as f:
    print(f.read(10))
    print(f.tell())
    f.seek(15)
    print(f.read(10))
    print(f.tell())

hello
hi
h
12
are you?
I
26


In [24]:
# seek during write: moving the cursor back overwrites existing characters
with open("sample.txt", "w") as f:
    f.write("Hello")
    f.seek(0)
    f.write("X")   # the file now contains "Xello"
In [25]:
with open("sample.txt", "w") as f:
    f.write("Hello")
    f.seek(0)
    f.write("Xa")  # the file now contains "Xallo"

### ***Problems with working in text mode***
#### - Text mode cannot handle binary files such as images
#### - Text mode only reads and writes strings, so other data types such as int, float, list, tuple or dict have to be converted manually

In [26]:
# reading a binary file in text mode fails
with open("screenshot.png", "r") as f:
    f.read()

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 57: character maps to <undefined>

In [27]:
# binary modes ("rb", "wb") read and write raw bytes, so copying the image works
with open("screenshot.png", "rb") as f:
    with open("screenshot_copy.png", "wb") as wf:
        wf.write(f.read())

In [28]:
# working with other data types
with open("sample.txt", "w") as f:
    f.write(5)

TypeError: write() argument must be str, not int

In [29]:
with open("sample.txt", "w") as f:
    f.write("5")

In [30]:
with open("sample.txt", "r") as f:
    print(f.read()+5)

TypeError: can only concatenate str (not "int") to str

In [31]:
with open("sample.txt", "r") as f:
    print(int(f.read())+5)

10


In [32]:
# more complex data
d = {
    "name": "abc",
    "age" : 25,
    "gender": "male"
}
with open("sample.txt", "w") as f:
    f.write(d)

TypeError: write() argument must be str, not dict

In [33]:
with open("sample.txt", "w") as f:
    f.write(str(d))

In [34]:
with open("sample.txt", "r") as f:
    content = f.read()  # a second f.read() would return "" because the cursor is already at the end
    print(content)
    print(type(content))

{'name': 'abc', 'age': 25, 'gender': 'male'}
<class 'str'>


In [35]:
# dict() cannot parse the string back into a dictionary
with open("sample.txt", "r") as f:
    print(dict(f.read()))

ValueError: dictionary update sequence element #0 has length 1; 2 is required

### ***Serialization and Deserialization***
#### - Serialization: the process of converting Python objects into a format that can be stored or transmitted; here that format is JSON
#### - Deserialization: the process of converting JSON back into Python objects

### ***What is JSON?***
#### - JSON stands for JavaScript Object Notation. It is a language-independent text format for data that most programming languages can read and write
#### - Most web APIs today use JSON
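
Before working with files, the round trip can be tried in memory with json.dumps() and json.loads(). The sample record below is made up for illustration. It shows how Python types map to JSON: a tuple becomes an array, True becomes true, and None becomes null.

```python
import json

# Made-up sample record covering several Python types.
data = {"name": "abc", "age": 25, "marks": (23, 14), "active": True, "nickname": None}

text = json.dumps(data)        # serialize to a JSON string
restored = json.loads(text)    # deserialize back to Python objects

print(text)
# {"name": "abc", "age": 25, "marks": [23, 14], "active": true, "nickname": null}
print(restored)
# {'name': 'abc', 'age': 25, 'marks': [23, 14], 'active': True, 'nickname': None}
```

The tuple does not survive the round trip: it comes back as a list.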

In [36]:
# serialization using json
# list
import json
l = [1, 2, 3, 4]
with open("demo.json", "w") as f:
    json.dump(l, f)

In [37]:
# dict
d = {
    "name": "abc",
    "age" : 25,
    "gender": "male"
}
with open("demo.json", "w") as f:
    json.dump(d, f)

In [38]:
# dict, pretty-printed with indent=4
d = {
    "name": "abc",
    "age" : 25,
    "gender": "male"
}
with open("demo.json", "w") as f:
    json.dump(d, f, indent=4)

In [39]:
# deserialization
import json
with open("demo.json", "r") as f:
    d = json.load(f)
    print(d)
    print(type(d))

{'name': 'abc', 'age': 25, 'gender': 'male'}
<class 'dict'>


In [40]:
# serialize and deserialize a tuple (JSON has no tuple type, so it is stored as an array and loaded back as a list)
import json

t = (1, 2, 3, 4, 5)

with open("demo.json", "w") as f:
    json.dump(t, f)

In [41]:
import json
with open("demo.json", "r") as f:
    t = json.load(f)
    print(t)
    print(type(t))

[1, 2, 3, 4, 5]
<class 'list'>


In [42]:
# serialize and deserialize nested data (a dictionary containing a list)
import json

d = {
    "student": "abc",
    "marks": [23, 14, 34, 45, 56]
}

with open("demo.json", "w") as f:
    json.dump(d, f)

In [43]:
import json

d = {
    "student": "abc",
    "marks": [23, 14, 34, 45, 56]
}

with open("demo.json", "w") as f:
    json.dump(d, f, indent=4)

In [44]:
import json
with open("demo.json", "r") as f:
    d = json.load(f)
    print(d)
    print(type(d))

{'student': 'abc', 'marks': [23, 14, 34, 45, 56]}
<class 'dict'>


In [45]:
# serialize and deserialize custom objects
class Person:
    def __init__(self, fname, lname, age, gender):
        self.fname = fname
        self.lname = lname
        self.age   = age
        self.gender= gender

# target format for the string representation:
# abc def age -> 25 gender -> male

In [46]:
person = Person("abc", "def", 25, "male")

In [47]:
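# As a string
import json

def show_object(person):
    # json.dump() calls this for objects it cannot serialize on its own
    if isinstance(person, Person):
        return "{} {} age -> {} gender -> {}".format(person.fname, person.lname, person.age, person.gender)
    # without this, any other unsupported object would silently be written as null
    raise TypeError("Object of type {} is not JSON serializable".format(type(person).__name__))
with open("demo.json", "w") as f:
    json.dump(person, f, default=show_object)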

In [48]:
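# As a dict
import json

def show_object(person):
    if isinstance(person, Person):
        return {"name": person.fname+" "+ person.lname, "age": person.age, "gender": person.gender}
    # without this, any other unsupported object would silently be written as null
    raise TypeError("Object of type {} is not JSON serializable".format(type(person).__name__))
with open("demo.json", "w") as f:
    json.dump(person, f, default=show_object)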

In [49]:
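# As a dict, pretty-printed with indent=4
import json

def show_object(person):
    if isinstance(person, Person):
        return {"name": person.fname+" "+ person.lname, "age": person.age, "gender": person.gender}
    # without this, any other unsupported object would silently be written as null
    raise TypeError("Object of type {} is not JSON serializable".format(type(person).__name__))
with open("demo.json", "w") as f:
    json.dump(person, f, default=show_object, indent=4)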

In [50]:
import json
with open("demo.json", "r") as f:
    d = json.load(f)
    print(d)
    print(type(d))

{'name': 'abc def', 'age': 25, 'gender': 'male'}
<class 'dict'>
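
Loading the file gives back a plain dict, not a Person. One possible way to rebuild the object is the object_hook parameter of json.load() and json.loads(). In the sketch below, the helper names `person_to_dict` and `dict_to_person` and the `"__type__"` marker key are hypothetical conventions chosen for this example; they are not part of the json module.

```python
import json

class Person:
    def __init__(self, fname, lname, age, gender):
        self.fname = fname
        self.lname = lname
        self.age = age
        self.gender = gender

def person_to_dict(obj):
    # Hypothetical encoder: keeps the fields separate and tags the type.
    if isinstance(obj, Person):
        return {"__type__": "Person", "fname": obj.fname, "lname": obj.lname,
                "age": obj.age, "gender": obj.gender}
    raise TypeError("Object of type {} is not JSON serializable".format(type(obj).__name__))

def dict_to_person(d):
    # Hypothetical decoder: json calls this for every decoded JSON object.
    if d.get("__type__") == "Person":
        return Person(d["fname"], d["lname"], d["age"], d["gender"])
    return d

text = json.dumps(Person("abc", "def", 25, "male"), default=person_to_dict)
restored = json.loads(text, object_hook=dict_to_person)

print(type(restored).__name__, restored.fname, restored.age)   # Person abc 25
```

Storing first and last name as separate fields, instead of one combined "name" string, is what makes the reconstruction straightforward.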


### ***Pickling***
#### - Pickling is the process whereby a Python object hierarchy is converted into a byte stream
#### - Unpickling is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy

In [51]:
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def display_info(self):
        print("Hi my name is", self.name, "and Iam ", self.age, "years old")

In [52]:
p = Person("abc", 25)

In [53]:
# pickle dump
import pickle
with open("person.pkl", "wb") as f:
    pickle.dump(p, f)

In [54]:
# pickle load
import pickle
with open("person.pkl", "rb") as f:
    p = pickle.load(f)
    
p.display_info()

Hi my name is abc and Iam  25 years old


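### ***Pickle vs JSON***
#### - pickle stores data in a Python-specific binary format and can handle most Python objects, including instances of custom classes; only unpickle data from sources you trust
#### - json stores data in a human-readable text format that other languages can read, but it supports only basic types (dict, list, str, numbers, booleans, None)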
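
The comparison can be seen on a single made-up record, serialized in memory with both modules. This is a minimal sketch and the record itself is only an illustration.

```python
import json
import pickle

# Made-up record containing a tuple and a list.
record = {"point": (1, 2), "tags": ["a", "b"]}

as_json = json.dumps(record)       # human-readable text
as_pickle = pickle.dumps(record)   # Python-specific bytes

from_json = json.loads(as_json)
from_pickle = pickle.loads(as_pickle)

print(type(as_json).__name__, type(as_pickle).__name__)   # str bytes
print(from_json["point"])     # [1, 2]  (the tuple became a list)
print(from_pickle["point"])   # (1, 2)  (the tuple is preserved)
```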