```
Types of Data:
--------------
(1) Text data ==> sequence of Unicode characters
(2) Binary data ==> sequence of bytes (i.e. raw binary values)

Types of files:
---------------
(1) Text files ==> e.g. program source files
(2) Binary files ==> images, music, video, exe files etc.

Most programming languages follow the same 3 steps to perform file I/O:

==> Open
==> read/write
==> close

```
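The difference between text data and binary data can be seen directly in Python, where text is a `str` and binary data is `bytes`. The snippet below is a minimal sketch; the sample string is an assumption chosen only because it contains one non-ASCII character.

```python
# Text data is a str (Unicode characters); binary data is bytes.
text = "h\u00e9llo"              # "héllo", 5 characters
data = text.encode("utf-8")      # str -> bytes

print(type(text), len(text))     # 5 characters
print(type(data), len(data))     # 6 bytes: "é" needs two bytes in UTF-8

# Decoding the bytes with the same encoding gives the original text back.
round_trip = data.decode("utf-8")
print(round_trip == text)
```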

### ****Writing inside files:****

In [2]:
# if file is not present
f = open("sample.txt", "w")
f.write("hello world")
f.close()

```
==> To perform any file I/O operation, the first step is to open the file with the open("location+filename", "mode") function.
==> Here it takes two arguments, i.e. the filename and the mode.
==> filename: tells Python where the file is located.
==> mode: tells Python how the file should be opened (e.g. "w" for writing, "r" for reading).

==> In the above code snippet, open() first creates a file handler object. We use this object to write and read text inside the file. Once our work is done, we should close the file.

==> What if we perform an operation after the file is closed? Python raises a ValueError saying "I/O operation on closed file".
```
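The closed-file error described above can be reproduced safely with a short sketch. The temporary directory and the file name are assumptions made so the example does not touch any real file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")   # hypothetical file name

    f = open(path, "w")
    f.write("hello world")
    f.close()
    print(f.closed)                          # the handler knows it is closed

    try:
        f.write("more")                      # writing after close() fails
    except ValueError as err:
        message = str(err)
        print("ValueError:", message)
```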

In [3]:
# writing multiple strings inside file
f = open("sample1.txt", "w")
f.write("Hello World")
f.write("\nhow are you?")
f.close()

In [4]:
# If file is already present

f = open("sample.txt", "w")
f.write("Tony Stark")
f.close()

```
==> Whenever we open an already existing file in write mode, Python erases all the content of the file and lets us write new content into it.
```

```
How does the open function work?
--------------------------------

==> Files live on permanent storage such as the hard drive. When we execute open(), Python asks the operating system to locate the file and gives us back a file handler object; the whole file is not copied into RAM at this point.

==> Data moves between the disk and our program through a buffer in RAM, in chunks. The file object also keeps track of the current position in the file (similar to a cursor), which starts at the beginning and advances as we read or write.

==> Written data may stay in the buffer for a while. When the file is closed (or flushed), whatever is left in the buffer is written to the disk and the resources are released.
```
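The effect of the write buffer can be observed with a small experiment. This is only a sketch: the file name is hypothetical, and the content seen before `flush()` depends on the buffer size of the Python implementation (in CPython a five-character write usually stays in the buffer, so the first read is typically empty).

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "buffer_demo.txt")   # hypothetical file name

    f = open(path, "w")
    f.write("hello")                  # goes into the buffer first

    with open(path, "r") as reader:
        before_flush = reader.read()  # usually "" because nothing reached the disk yet

    f.flush()                         # push the buffered data to the file

    with open(path, "r") as reader:
        after_flush = reader.read()

    f.close()

print(repr(before_flush), repr(after_flush))
```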

In [5]:
# problem with "w" mode ==> use append "a" mode.

fo = open("sample1.txt", "a")
fo.write("\nI am fine")
fo.close()

```
==> In append mode, Python does not replace the existing content with the new content. Instead, it keeps the existing content and adds the new content at the end of it.
```
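The difference between "w" and "a" mode can be checked side by side. The sketch below uses a temporary directory and a hypothetical file name so that no existing file is overwritten.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "modes.txt")   # hypothetical file name

    with open(path, "w") as f:
        f.write("first")

    # Opening again in "w" mode erases "first" before writing.
    with open(path, "w") as f:
        f.write("second")
    with open(path, "r") as f:
        after_w = f.read()

    # Opening in "a" mode keeps the content and writes at the end.
    with open(path, "a") as f:
        f.write("\nthird")
    with open(path, "r") as f:
        after_a = f.read()

print(repr(after_w))
print(repr(after_a))
```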

In [6]:
# To write multiple strings at a time into the file
# (mistake: the file object fo is passed instead of the list l, which causes the error below)

fo = open("sample.txt", "w")
l = ["hello\n", "hi\n", "how are you\n", "I am fine"]
fo.writelines(fo)
fo.close()

UnsupportedOperation: not readable

In [7]:
fo = open("sample.txt", "w")
l = ["hello\n", "hi\n", "how are you\n", "I am fine"]
fo.writelines(l)
fo.close()

### ****reading files:****

In [8]:
# To read the entire content of file

fo = open("sample.txt", "r")
s  = fo.read()
print(s)
fo.close()

hello
hi
how are you
I am fine


In [9]:
# To read the first n characters of the file

fo = open("sample.txt", "r")
s  = fo.read(10)
print(s)
fo.close()

hello
hi
h


----
```
==> Always remember: when we work with text files, whether writing or reading, everything is handled as a string. Numbers, lists and dictionaries are all treated as strings.
```
----

In [10]:
# To read line by line

fo = open("sample.txt", "r")
print(fo.readline(), end="")
print(fo.readline(), end="")
fo.close()

hello
hi


```
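==> readline() returns the line together with its trailing newline, and print() adds a newline of its own; passing end="" to print() avoids the resulting blank lines.
==> use readline() when working with large files, since it keeps only one line in memory at a time
==> use read() when working with small files, since it loads the whole content at once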
```

In [11]:
# read entire content using readline

fo = open("sample.txt", "r")

while True:

	data = fo.readline()
	
	if data == "":
		break
	else:
		print(data, end="")
fo.close()

hello
hi
how are you
I am fine

```
==> Suppose we don't know the number of lines in a file. Then we can use the above code to print all of them seamlessly: readline() returns an empty string once the end of the file is reached.
```
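The same line-by-line loop can also be written by iterating over the file object directly, which reads one line at a time without an explicit `readline()` call. A minimal sketch, using a temporary file with the same four lines as the earlier cells (the path is an assumption):

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")   # hypothetical file name

    with open(path, "w") as f:
        f.writelines(["hello\n", "hi\n", "how are you\n", "I am fine"])

    lines = []
    with open(path, "r") as f:
        # A file object is iterable: each step yields the next line.
        for line in f:
            lines.append(line.rstrip("\n"))  # drop the trailing newline

print(lines)
```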

----

### ****Using context manager(with):****

```
==> It is a good idea to close the file after usage, as it frees up the resources
==> If we don't close it, the garbage collector closes it eventually, but we cannot rely on when that happens
==> the with keyword closes the file as soon as the block ends, even if an error occurs inside it
```
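To see what `with` saves us from writing, the sketch below compares it with a manual `try`/`finally` that closes the file by hand. The file name is hypothetical and lives in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")   # hypothetical file name

    # Manual version: try/finally guarantees close() even if write() fails.
    f = open(path, "w")
    try:
        f.write("hello")
    finally:
        f.close()

    # Roughly equivalent version using a context manager.
    with open(path, "a") as g:
        g.write(" world")

    with open(path, "r") as h:
        content = h.read()

print(f.closed, g.closed, content)
```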

In [12]:
# using with and write inside file
with open("sample1.txt", "w") as f:
	f.write("selmon bhai")

In [13]:
f.write("hello") # throws error

ValueError: I/O operation on closed file.

In [14]:
# To read file using with

with open("sample.txt", "r") as f:
	print(f.read())

hello
hi
how are you
I am fine


In [15]:
# moving inside the file ==> read 10 characters, then the next 10

with open("sample.txt", "r") as f:
	print(f.read(10))
	print(f.read(10))

hello
hi
h
ow are you


In [16]:
# Create a big file so that we can practice reading it in chunks

L = ["hello world" for i in range(1,10001)]
with open("big.txt", "w") as f:
	f.writelines(L)

In [17]:
with open("big.txt", "r") as f:
	chunk_size = 100
	# read each chunk exactly once; read() returns "" at the end of the file
	chunk = f.read(chunk_size)
	while len(chunk) > 0:
		print(chunk, end="***")
		chunk = f.read(chunk_size)

hello worldhello worldhello worldhello worldhello worldhello worldhello worldhello worldhello worldh***ello worldhello worldhello worldhello worldhello worldhello worldhello worldhello worldhello worldhe***llo worldhello worldhello worldhello worldhello worldhello worldhello worldhello worldhello worldhel***...
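Reading a file in fixed-size chunks can also be wrapped in a small generator so that each chunk is read exactly once. This is a sketch, not the only way to do it: `read_in_chunks` is a hypothetical helper name, and the file is recreated in a temporary directory.

```python
import os
import tempfile

def read_in_chunks(path, chunk_size=100):
    """Yield successive chunks of at most chunk_size characters."""
    with open(path, "r") as f:
        # iter(callable, sentinel) keeps calling f.read until it returns "".
        for chunk in iter(lambda: f.read(chunk_size), ""):
            yield chunk

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "big.txt")      # hypothetical file name
    with open(path, "w") as f:
        f.writelines("hello world" for _ in range(10000))

    chunks = list(read_in_chunks(path))

print(len(chunks))        # number of chunks read
print(chunks[0])          # first 100 characters
```

Only one chunk is held in memory at a time while iterating, which is the point of chunked reading for large files.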

```
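tell() ==> It gives the current position of the cursor, counted from the start of the file
seek() ==> It moves the cursor to the given position (by default counted from the start of the file)

==> In text mode the value from tell() need not equal the number of characters read: in the cell below it is 12 after reading 10 characters, most likely because each newline is stored as two bytes (\r\n) on Windows.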
```
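In binary mode the positions used by `tell()` and `seek()` are plain byte offsets, which makes them easier to reason about. The sketch below assumes a temporary file with the same content as sample.txt, written with "\n" newlines only.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")   # hypothetical file name

    with open(path, "wb") as f:
        f.write(b"hello\nhi\nhow are you\nI am fine")

    with open(path, "rb") as f:
        first = f.read(10)
        pos_after_read = f.tell()    # 10 bytes consumed so far

        f.seek(0)                    # back to the start of the file
        pos_after_seek = f.tell()

        f.seek(-4, os.SEEK_END)      # 4 bytes before the end of the file
        tail = f.read()

print(first, pos_after_read, pos_after_seek, tail)
```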

In [18]:
# read the first 10 characters, go back to the start, and read them again
with open("sample.txt", "r") as f:
	print(f.read(10))
	print(f.tell())
	f.seek(0)
	print(f.read(10))
	print(f.tell())

hello
hi
h
12
hello
hi
h
12


In [20]:
# seek during write: writing after seek(0) overwrites from the start, so the file ends up containing "Xallo"

with open("sample.txt", "w") as f:
	f.write("Hello")
	# f.seek(0)
	# f.write("X")
	f.seek(0)
	f.write("Xa")

```
Problems of working with text mode:
-----------------------------------

==> can't work with binary files like images
==> not convenient for other data types like int/float/list/tuple, since everything must be converted to and from strings
```

In [21]:
# opening a binary file in text read mode throws an error

with open("Human.jpg", "r") as f:
	print(f.read())

UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 371: character maps to <undefined>

In [24]:
# working with binary files using rb & wb mode
with open("Human.jpg", "rb") as f:
    #f.read()
    print(f.read())

b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x01\x00H\x00H\x00\x00\xff\xdb\x00C\x00\x06\x04\x05\x06\x05\x04\x06\x06\x05\x06\x07\x07\x06\x08\n\x10\n\n\t\t\n\x14\x0e\x0f\x0c\x10\x17\x14\x18\x18\x17\x14\x16\x16\x1a\x1d%\x1f\x1a\x1b#\x1c\x16\x16 , #&\')*)\x19\x1f-0-(0%()(\xff\xdb\x00C\x01\x07\x07\x07\n\x08\n\x13\n\n\x13(\x1a\x16\x1a((((((((((((((((((((((((((((((((((((((((((((((((((\xff\xc2\x00\x11\x08\x078\x047\x03\x01"\x00\x02\x11\x01\x03\x11\x01\xff\xc4\x00\x1c\x00\x00\x02\x03\x01\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x02\x03\x04\x05\x06\x07\x08\xff\xc4\x00\x18\x01\x01\x01\x01\x01\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01\x02\x03\x04\xff\xda\x00\x0c\x03\x01\x00\x02\x10\x03\x10\x00\x00\x01\xf3\x1d\xae/j\xe50@\nC@\x0cC)\r\x02`\x00\x00\x00\x00\x00\x00\x0c\x00\x00\x13\x00\x010C\x05$@\xd0\x00\x00\x10\xc4\xc0@\xc4\xc4\xc0\x00\x00\x00\x00\x00\x13JQ~j\xd0\xea\xb8\x04\xc0\x08\x13H\xc0P\x00\x00\x00\x00\x00\x00\x00\x00\x18\x80h\x00\x00\x05\x00F\x00\x00\x00\xc44\x00\x0c

In [25]:
with open("Human.jpg", "rb") as f:
    with open("Human_copy.jpg", "wb") as wf:
        wf.write(f.read())
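For large binary files, reading everything with a single `read()` holds the whole file in memory. As a sketch of an alternative, the standard library's `shutil.copyfileobj` copies between two open file objects in chunks. The file names and the payload are assumptions standing in for a real image.

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "source.bin")    # hypothetical stand-in for an image
    dst = os.path.join(tmp, "copy.bin")

    payload = bytes(range(256)) * 1000       # arbitrary binary data
    with open(src, "wb") as f:
        f.write(payload)

    # Copy in chunks of 64 KiB instead of loading the whole file at once.
    with open(src, "rb") as rf, open(dst, "wb") as wf:
        shutil.copyfileobj(rf, wf, 64 * 1024)

    with open(dst, "rb") as f:
        copied = f.read()

print(len(copied), copied == payload)
```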

In [26]:
# working with other data types
with open("sample.txt", "w") as f:
	f.write(5)

TypeError: write() argument must be str, not int

In [27]:
with open("sample.txt", "w") as f:
	f.write('5')

In [28]:
with open("sample.txt", "r") as f:
	print(f.read()+5)

TypeError: can only concatenate str (not "int") to str

In [29]:
with open("sample.txt", "r") as f:
	print(int(f.read())+5)

10


In [30]:
# more complex data

d = {
"name": "nitish",
"age": 33,
"gender": "male"
}

with open("sample.txt", "w") as f:
	f.write(d)

TypeError: write() argument must be str, not dict

In [31]:
with open("sample.txt", "w") as f:
	f.write(str(d))

In [32]:
with open("sample.txt", "r") as f:
	data = f.read()   # read once; a second read() would return an empty string
	print(data)
	print(type(data))

{'name': 'nitish', 'age': 33, 'gender': 'male'}
<class 'str'>


In [33]:
with open("sample.txt", "r") as f:
	print(dict(f.read()))

ValueError: dictionary update sequence element #0 has length 1; 2 is required

----
```
==> the solution for all the above problems is serialization and deserialization

What is serialization and deserialization?

serialization: converting Python data types into a format that can be stored or transmitted (here, JSON)
deserialization: converting that format (here, JSON) back into Python data types


What is JSON?

==> JSON stands for JavaScript Object Notation. It is a widely used text format that most programming languages can read and write.
```
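Serialization does not have to involve a file. As a minimal sketch, `json.dumps` and `json.loads` do the same conversion to and from an in-memory string; the sample record is an assumption for illustration.

```python
import json

record = {"name": "nitish", "age": 33, "marks": (23, 14)}

# Serialization: Python object -> JSON text
text = json.dumps(record)
print(text)
print(type(text))

# Deserialization: JSON text -> Python object
restored = json.loads(text)
print(restored)
print(type(restored["marks"]))   # the tuple comes back as a list
```

JSON has no tuple type, so the tuple is written as a JSON array and loaded back as a list.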

In [34]:
# serialization with json module: list

import json

l = [10, 20, 30, 40]
with open("demo.json", "w") as f:
	json.dump(l, f)

In [35]:
# deserialization with json module: list
import json
with open("demo.json", "r") as f:
    data = json.load(f)
    print(data)
    print(type(data))

[10, 20, 30, 40]
<class 'list'>


In [36]:
# serialization with json module: dict
import json

d = {
"name": "nitish",
"age": 33,
"gender": "male"
}

with open("demo.json", "w") as f:
	json.dump(d, f)


In [37]:
# deserialization with json module: dict
with open("demo.json", "r") as f:
	d = json.load(f)
	print(d)
	print(type(d))

{'name': 'nitish', 'age': 33, 'gender': 'male'}
<class 'dict'>


In [38]:
# serialization with json module: tuple

import json

t = (10, 20, 30, 40)
with open("demo.json", "w") as f:
	json.dump(t, f)

In [39]:
# deserialization with json module: tuple (JSON has no tuple type, so it comes back as a list)
import json
with open("demo.json", "r") as f:
	data = json.load(f)
	print(data)
	print(type(data))

[10, 20, 30, 40]
<class 'list'>


In [40]:
# serialization with json module: nested dictionary
import json
d = {
"name": "nitish",
"marks": [23, 14, 34, 45, 56]
}
with open("demo.json", "w") as f:
	json.dump(d, f)


In [41]:
# deserialization with json module: nested dictionary
with open("demo.json", "r") as f:
	data = json.load(f)
	print(data)
	print(type(data))

{'name': 'nitish', 'marks': [23, 14, 34, 45, 56]}
<class 'dict'>


In [42]:
# serializing and deserializing custom objects

class Person:

	def __init__(self, fname, lname, age, gender):
		self.fname = fname
		self.lname = lname
		self.age   = age
		self.gender= gender

# format to be printed in
# Nitish singh
# age -> 33
# gender -> male

person = Person("Nitish", "singh", 33, "male")

In [43]:
import json

with open("demo.json", "w") as f:
	json.dump(person, f)

TypeError: Object of type Person is not JSON serializable

In [44]:
import json

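def show_object(person):
	if isinstance(person, Person):
		return "{} {} age->{} gender->{}".format(person.fname, person.lname, person.age, person.gender)
	# any other unsupported type should still raise instead of silently becoming null
	raise TypeError("Object of type {} is not JSON serializable".format(type(person).__name__))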

with open("demo.json", "w") as f:
	json.dump(person, f, default=show_object)

In [45]:
with open("demo.json","r") as f:
    data = json.load(f)
    print(data)
    print(type(data))

Nitish singh age->33 gender->male
<class 'str'>


In [46]:
# As a dict

import json

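def show_object(person):
	if isinstance(person, Person):
		return {"name": person.fname+" "+person.lname, "age": person.age, "gender": person.gender}
	# any other unsupported type should still raise instead of silently becoming null
	raise TypeError("Object of type {} is not JSON serializable".format(type(person).__name__))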

with open("demo.json", "w") as f:
	json.dump(person, f, default=show_object)

In [47]:
# deserializing

import json
with open("demo.json", "r") as f:
	print(json.load(f))

{'name': 'Nitish singh', 'age': 33, 'gender': 'male'}


In [48]:
import json
with open("demo.json", "r") as f:
	d = json.load(f)
	print(d)
	print(type(d))

{'name': 'Nitish singh', 'age': 33, 'gender': 'male'}
<class 'dict'>
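What comes back from `json.load` here is a plain dict, not a `Person`. With JSON, rebuilding the object has to be done by hand, for example through the `object_hook` argument of `json.loads`. The sketch below is one possible approach; the simplified `Person` class and the helper names `person_to_dict` and `dict_to_person` are assumptions made for illustration.

```python
import json

class Person:
    def __init__(self, name, age, gender):
        self.name = name
        self.age = age
        self.gender = gender

    def greet(self):
        return "Hello, I am " + self.name

def person_to_dict(obj):
    """Hypothetical 'default' hook: describe a Person as a dict."""
    if isinstance(obj, Person):
        return {"name": obj.name, "age": obj.age, "gender": obj.gender}
    raise TypeError("Object of type {} is not JSON serializable".format(type(obj).__name__))

def dict_to_person(d):
    """Hypothetical 'object_hook': turn a matching dict back into a Person."""
    if set(d) == {"name", "age", "gender"}:
        return Person(d["name"], d["age"], d["gender"])
    return d

text = json.dumps(Person("Nitish singh", 33, "male"), default=person_to_dict)
restored = json.loads(text, object_hook=dict_to_person)

print(type(restored).__name__)
print(restored.greet())
```

Every class needs its own pair of conversion functions in this style, which is the manual work the JSON approach requires.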


----
```
==> JSON cannot serialize our custom objects directly, but we can tell json how to do it through the "default" argument of the dump function.

==> This lets us write and read the object's data in a JSON file, but what we load back is a plain str or dict: the object loses its class and behaviour (methods).

==> Is there any way to keep the properties and behaviour of our object and still serialize and deserialize it?

pickling and unpickling:
------------------------

==> pickling: the process of converting a Python object hierarchy into a byte stream
==> unpickling: the inverse operation, converting a byte stream back into a Python object hierarchy
```

In [50]:
class Person:

	def __init__(self, name, age):
		self.name = name
		self.age  = age
	def display_info(self):
		print("Hello my name is", self.name, "and I am", self.age, "years old")

p = Person("Nitish", 33)

In [51]:
# pickle dump
import pickle

with open("person.pkl", "wb") as f:
	pickle.dump(p, f)

In [52]:
# pickle load
import pickle

with open("person.pkl", "rb") as f:
	p=pickle.load(f)
p.display_info()

Hello my name is Nitish and I am 33 years old


<hr>

```
==> What is the difference between pickle and JSON?

pickle vs JSON:
---------------

==> pickle: stores data in a binary, Python-specific format; it can rebuild Python objects, but should only be loaded from trusted sources

==> JSON: stores data in a human-readable text format that other languages can also read
```
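The difference shows up even with built-in types. In the sketch below (the sample data is an assumption), pickle round-trips a tuple and a set unchanged, while json refuses the set because JSON has no such type.

```python
import json
import pickle

data = {"point": (1, 2), "tags": {"a", "b"}}

# pickle: binary output, Python types are preserved on load
blob = pickle.dumps(data)
restored = pickle.loads(blob)
print(type(blob))
print(restored == data, type(restored["point"]), type(restored["tags"]))

# json: text output, but a set is not a JSON type
try:
    json.dumps(data)
except TypeError as err:
    json_error = str(err)
    print("TypeError:", json_error)
```

Unpickling can execute arbitrary code, so `pickle.loads` and `pickle.load` should only be used on data from a source you trust.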