# File Pickling
***
According to the __[Python documentation on the pickle module](https://docs.python.org/3/library/pickle.html)__, *pickling* is the process whereby a Python object hierarchy is converted into a byte stream, and *unpickling* is the inverse operation, whereby a byte stream (from a binary file or bytes-like object) is converted back into an object hierarchy.

<div class="alert alert-block alert-danger">
    <b>WARNING:</b> The pickle module is not secure. Only unpickle data you trust.
</div>

Consider signing data with __[Keyed-Hashing for Message Authentication](https://docs.python.org/3/library/hmac.html#module-hmac)__, abbreviated as **hmac**, if you need to ensure that it has not been tampered with. HMAC provides a way to check the integrity of information transmitted over or stored in an unreliable medium: a secret key is combined with the data to produce a *message authentication code (MAC)*, and anyone holding the same key can recompute the code to verify that the data is unchanged.
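A minimal sketch of this idea is shown here, assuming the pickled bytes and their digest are stored or transmitted together. The names `SECRET_KEY`, `sign_and_pickle`, and `verify_and_unpickle` are hypothetical helpers introduced for illustration; only `pickle.dumps`, `pickle.loads`, `hmac.new`, and `hmac.compare_digest` come from the standard library.

```python
import hashlib
import hmac
import pickle

# Hypothetical secret for illustration; load a real key from a secure location
SECRET_KEY = b"replace-with-a-real-secret"


def sign_and_pickle(obj):
    """Pickle obj and return (digest, payload), where digest is an HMAC-SHA256."""
    payload = pickle.dumps(obj)
    digest = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return digest, payload


def verify_and_unpickle(digest, payload):
    """Unpickle payload only if its HMAC matches; raise ValueError otherwise."""
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # compare_digest compares in constant time, which guards against timing attacks
    if not hmac.compare_digest(digest, expected):
        raise ValueError("signature mismatch: refusing to unpickle")
    return pickle.loads(payload)


digest, payload = sign_and_pickle({"name": "Alice", "age": 32})
restored = verify_and_unpickle(digest, payload)
print(restored)
```

The check happens before `pickle.loads` is called, so tampered bytes are rejected without ever being unpickled. This only helps if the key stays secret.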

## Why Use Object Serialization?

To understand the importance of serialization, we will demonstrate it with an example using the following steps:
1. Create a nested dictionary, a dictionary of dictionaries.
2. Write the dictionary data to a .txt file without **serialization**.
3. Load the .txt file.
4. Try accessing elements of the dictionary from the loaded .txt file.

In [3]:
# STEP 1: Creating the nested dictionary of domestic employees to XY
employees = {
    'employee_1' : {
        'name': 'Alice', 'age':32, 'role':'Chef'
    },
    'employee_2' : {
        'name': 'Liza', 'age':37, 'role':'Nanny'
    },
    'employee_3' : {
        'name': 'John', 'age':35, 'role':'Gardener'
    },
    'employee_4' : {
        'name': 'Bobby', 'age':28, 'role':'Security'
    },
    'employee_5' : {
        'name': 'Akello', 'age':29, 'role':'Teacher'
    },
    
}

employees

{'employee_1': {'name': 'Alice', 'age': 32, 'role': 'Chef'},
 'employee_2': {'name': 'Liza', 'age': 37, 'role': 'Nanny'},
 'employee_3': {'name': 'John', 'age': 35, 'role': 'Gardener'},
 'employee_4': {'name': 'Bobby', 'age': 28, 'role': 'Security'},
 'employee_5': {'name': 'Akello', 'age': 29, 'role': 'Teacher'}}

In [4]:
# Checking the object type
type(employees)

dict

In [5]:
# STEP 2: Writing the file into a .txt file without serialization
with open('employees.txt','w') as data:
    data.write(str(employees))

<div class="alert alert-block alert-info">
<b>NOTE:</b> The str() function has been used to convert the employees dictionary into text because the write() method of a text file can only write strings.
</div>

In [6]:
# STEP 3: Loading the employees.txt file
with open('employees.txt','r') as f:
    # Printing the content of the file
    for employee in f:
        print(employee)

{'employee_1': {'name': 'Alice', 'age': 32, 'role': 'Chef'}, 'employee_2': {'name': 'Liza', 'age': 37, 'role': 'Nanny'}, 'employee_3': {'name': 'John', 'age': 35, 'role': 'Gardener'}, 'employee_4': {'name': 'Bobby', 'age': 28, 'role': 'Security'}, 'employee_5': {'name': 'Akello', 'age': 29, 'role': 'Teacher'}}


In [7]:
print(f)

<_io.TextIOWrapper name='employees.txt' mode='r' encoding='cp1252'>


In [8]:
# Trying to access a dictionary from the main container dictionary, 
# i.e. accessing employee_1 dictionary from the employees dictionary
f['employee_1']

TypeError: '_io.TextIOWrapper' object is not subscriptable

<div class="alert alert-block alert-danger">
    <b>TypeError:</b> '_io.TextIOWrapper' object is not subscriptable:
</div>

This error is raised when we try to index or slice an object that does not support those operations. Here, `f` is a file object rather than a dictionary, so `f['employee_1']` fails.

In [9]:
# Confirming the type
type(f)

_io.TextIOWrapper

**_io.TextIOWrapper** is the type of the text file object returned by `open()`; it is a handle for reading the file, not the data itself. `f` cannot be accessed as a dictionary because it is not one. Even the content read from it is a plain string, not a dictionary.
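To make the distinction concrete, here is a small sketch that reads the file content into a variable and inspects it. It uses a shortened copy of the dictionary and writes to a temporary directory so that it runs on its own; the names `record`, `path`, `text`, and `lookup_error` are assumptions made for this illustration.

```python
import os
import tempfile

# Shortened, hypothetical copy of the employees dictionary
record = {'employee_1': {'name': 'Alice', 'age': 32, 'role': 'Chef'}}

# A temporary directory keeps the sketch self-contained
path = os.path.join(tempfile.mkdtemp(), 'employees.txt')

with open(path, 'w') as data:
    data.write(str(record))

with open(path, 'r') as f:
    text = f.read()  # the whole file content as one string

print(type(text))

# Looking up a key on the string fails, because a string is not a dictionary
try:
    text['employee_1']
    lookup_error = None
except TypeError as exc:
    lookup_error = exc

print(type(lookup_error).__name__)
```

The text on disk looks like a dictionary, but what comes back is only its string representation, so key lookups are no longer possible.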

In our case above, the nested dictionary comes back only as a string. This is where the importance of file pickling comes in! **How do we preserve the state of a file/object?**

<div class="alert alert-block alert-success">
<b>Importance of Serialization:</b> Serialization allows objects to be preserved in their original state without losing any information. In Python, we use the pickle module to serialize and deserialize objects. Note that this format is specific to Python, so it is not meant to be loaded from other languages.
</div>
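As a minimal sketch of what this looks like in practice, the example below pickles a shortened copy of the employees dictionary to a binary file and loads it back. The file name `employees.pkl` and the temporary directory are assumptions made so the example is self-contained.

```python
import os
import pickle
import tempfile

# Shortened copy of the employees dictionary used above
employees = {
    'employee_1': {'name': 'Alice', 'age': 32, 'role': 'Chef'},
    'employee_2': {'name': 'Liza', 'age': 37, 'role': 'Nanny'},
}

# Hypothetical file name inside a temporary directory
path = os.path.join(tempfile.mkdtemp(), 'employees.pkl')

# Pickle files must be opened in binary mode ('wb' / 'rb')
with open(path, 'wb') as out_file:
    pickle.dump(employees, out_file)

with open(path, 'rb') as in_file:
    loaded = pickle.load(in_file)

print(type(loaded))           # <class 'dict'>
print(loaded['employee_1'])   # {'name': 'Alice', 'age': 32, 'role': 'Chef'}
```

Unlike the text file, the loaded object is a dictionary again, so nested elements can be accessed by key.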

### Comparison of Pickle and JSON
The **comparisons** between the pickle protocol and JSON __[are](https://docs.python.org/3/library/pickle.html)__:

|Pickle|JSON|
|:--|:--|
|Binary serialization format|Text Serialization _(usually utf-8)_|
|Not human readable|Human readable|
|Python specific|Interoperable with other languages|
|Can represent a very large number of Python types, including many custom classes|By default, represents only a subset of Python built-in types and no custom classes|
|Deserializing untrusted pickle data creates an arbitrary code execution vulnerability|Deserializing untrusted JSON does not, in itself, create an arbitrary code execution vulnerability|
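
Some of these differences can be seen in a short sketch. The dictionary below is a made-up example containing a `set`, a type that pickle handles but that the `json` module does not serialize by default; the variable names are assumptions for illustration.

```python
import json
import pickle

# Hypothetical record containing a set, which JSON cannot represent by default
data = {'name': 'Alice', 'skills': {'baking', 'grilling'}}

# pickle produces bytes and restores the set unchanged
pickled = pickle.dumps(data)
round_tripped = pickle.loads(pickled)
print(type(pickled))

# json.dumps raises TypeError for the set
try:
    json.dumps(data)
    json_error = None
except TypeError as exc:
    json_error = exc
print(type(json_error).__name__)

# For plain built-in types, JSON yields human-readable text
json_text = json.dumps({'name': 'Alice', 'age': 32})
print(json_text)  # {"name": "Alice", "age": 32}
```

The JSON output can be read by people and by other languages, while the pickled bytes are meaningful only to Python.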

