# Lecture 9 (Part b)

## Public, Protected, Private

In most object-oriented languages, data encapsulation is achieved by enabling classes to declare protected and private data members in addition to the public ones. Quick recap:

* Public members are accessible to everyone
* Private members are only accessible to the class itself
* Protected members are accessible to the class itself and to its child classes

Python's implementation of data encapsulation is not strictly enforced by the language and is mostly a convention. Data members starting with one underscore (`_`) are treated as protected. Data members starting with two underscores (`__`) are treated as private.

Consider the following example:

In [None]:
class parent:
    def __init__(self):
        self.public="I'm Public"
        self._protected="I'm Protected"
        self.__private="I'm Private"
        
    def set_private_parent(self,v):
        self.__private=v
        
    def get_private_parent(self):
        return self.__private
        
class child(parent):
    def __init__(self, init_parent=True):
        if init_parent:
            super(child,self).__init__()
    
    def set_public(self,v):
        self.public=v
        
    def set_protected(self,v):
        self._protected=v

    def set_private(self,v):
        self.__private=v
        
    def get_public(self):
        return self.public
        
    def get_protected(self):
        return self._protected

    def get_private(self):
        return self.__private


First note that because we declared the data members in the parent constructor, we have to make sure that the child calls the parent constructor.

In [None]:
child_instance = child(init_parent=False)
child_instance.public

Even though we wrote accessors (setters and getters), we can directly access and set the data:

In [None]:
child_instance = child()
print(child_instance.public)
child_instance.public="Changed Public"
print(child_instance.public)

Of course the accessors also work:

In [None]:
child_instance = child()
print(child_instance.get_public())
child_instance.set_public("Changed Public")
print(child_instance.get_public())

How about the protected?

In [None]:
print(child_instance._protected)
child_instance._protected="Changed Protected"
print(child_instance._protected)

In [None]:
child_instance = child()
print(child_instance.get_protected())
child_instance.set_protected("Changed Protected")
print(child_instance.get_protected())

So there isn't any difference between public and protected in Python. We can just adopt the convention that data members starting with a single underscore are accessed only through accessors.
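
One common way to follow this convention in Python is the built-in `property` decorator, which routes attribute reads and writes through accessor methods. The sketch below is only an illustration: the class `Thermostat` and its attribute names are hypothetical and not part of the lecture example.

```python
class Thermostat:
    """Hypothetical class used only to illustrate the convention."""

    def __init__(self, celsius):
        # Single underscore: "protected" by convention only
        self._celsius = celsius

    @property
    def celsius(self):
        # Getter: reads of obj.celsius go through this method
        return self._celsius

    @celsius.setter
    def celsius(self, value):
        # Setter: a natural place for validation
        if value < -273.15:
            raise ValueError("temperature below absolute zero")
        self._celsius = value


t = Thermostat(20.0)
t.celsius = 25.0        # calls the setter
print(t.celsius)        # calls the getter
t._celsius = -1000.0    # still possible: Python does not enforce the convention
print(t._celsius)
```

The last two lines show that the underscore attribute remains reachable from outside; the accessors only help callers who choose to respect the convention.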

How about private?

In [None]:
print(child_instance.__private)
child_instance.__private="Changed Private"
print(child_instance.__private)

It appears that we finally have some protection. A closer look shows how it's done:

In [None]:
dir(child_instance)

Note that instead of a data member `__private` we have a data member `_parent__private`. Inside a class definition, Python rewrites any identifier of the form `__<data_name>` (two leading underscores, and not ending in two underscores) to `_<class_name>__<data_name>`, where `<class_name>` is the class in whose body the code appears. So in fact, we can still change this data member and there is no real protection:

In [None]:
print(child_instance._parent__private)
child_instance._parent__private="Changed Private"
print(child_instance._parent__private)
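
The renaming can also be seen in isolation. The following is a minimal sketch using a hypothetical class `Vault` (not part of the example above); it shows that the rewrite only happens for code written inside the class body, and that the instance stores the attribute under the rewritten name.

```python
class Vault:
    """Hypothetical class; the names here are for illustration only."""

    def __init__(self):
        # Written inside the class body, so it is stored as _Vault__secret
        self.__secret = 42

    def reveal(self):
        # Also inside the class body, so this reads self._Vault__secret
        return self.__secret


v = Vault()
print(sorted(vars(v)))          # ['_Vault__secret']
print(v.reveal())               # 42
print(hasattr(v, "__secret"))   # False: no attribute with the unmangled name
print(v._Vault__secret)         # 42: the rewritten name is reachable from outside
```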

Just to make sure we understand, here is what the parent class looks like:

In [None]:
parent_instance=parent()
dir(parent_instance)

Note that a child cannot change a parent's private data:

In [None]:
child_instance = child()
print(child_instance.get_private())
child_instance.set_private("Changed Private")
print(child_instance.get_private())

In [None]:
child_instance = child()
print(child_instance.get_private_parent())
child_instance.set_private_parent("Changed Private")
print(child_instance.get_private_parent())

Why don't I get an error in the following case? 

In [None]:
child_instance = child()
print(child_instance.get_private_parent())
child_instance.set_private("Changed Private")
print(child_instance.get_private_parent())

Note that the code doesn't work as intended... why? 

See if you can figure it out by looking at:

In [None]:
dir(child_instance)

## Data Serialization

Data serialization refers to the process of converting data (usually in memory) that may have a complex structure (e.g. a tree) into a linear sequence that can be used to reconstitute the original data structure. Such a sequence can be stored in a file or transmitted over a network.
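
To make the idea concrete, here is a hand-written sketch that turns a small binary tree into a flat list and back. The tuple encoding `(value, left, right)`, the use of `None` as a marker for a missing child, and the function names `flatten` and `rebuild` are all assumptions made for this illustration; it also assumes that no node value is itself `None`.

```python
def flatten(node):
    """Serialize a binary tree (value, left, right) into a flat preorder list."""
    if node is None:
        return [None]  # marker for a missing child
    value, left, right = node
    return [value] + flatten(left) + flatten(right)


def rebuild(sequence):
    """Reconstitute the tree from the flat list produced by flatten."""
    items = iter(sequence)

    def build():
        value = next(items)
        if value is None:
            return None
        left = build()
        right = build()
        return (value, left, right)

    return build()


tree = ("root", ("left", None, None), ("right", ("leaf", None, None), None))
sequence = flatten(tree)
print(sequence)
print(rebuild(sequence) == tree)  # True
```

The standard formats discussed in this section do the same job in a general way, so that such code does not have to be written by hand for every data structure.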

For example consider the following "simple" data structure:

In [None]:
# Simple Data Type

data_dict = { "A": 1, 
              "B": "Foo"}

### Python `repr`

The Python `repr` function, applied to built-ins and to classes you implement, can be used as a means of serialization. Take any Python built-in and you can see its string representation, which is essentially a string of Python code that evaluates to the object:

In [None]:
repr(data_dict)

This representation can be easily written to a file:

In [None]:
with open('file.py',"w") as f: 
    f.write(repr(data_dict))

In [None]:
!cat file.py

And reconstituted by evaluating the contents of the file:

In [None]:
with open('file.py', 'r') as f: 
    data_dict_reloaded = eval(f.read())

data_dict_reloaded

Note that `eval` uses the python interpreter to execute python expressions stored in strings:

In [None]:
eval("print('Hello World')")

In [None]:
x=eval("1+1")
x
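
Because `eval` runs arbitrary expressions, evaluating a file you did not write yourself can execute unwanted code. For data made only of literals, the standard library's `ast.literal_eval` is a more restrictive alternative. A minimal sketch, rebuilding the same kind of dictionary from its `repr` string:

```python
import ast

text = repr({"A": 1, "B": "Foo"})

# literal_eval only accepts Python literals (dicts, lists, strings, numbers, ...)
restored = ast.literal_eval(text)
print(restored)

# Anything that is not a literal, such as a function call, is rejected
try:
    ast.literal_eval("print('Hello World')")
except ValueError as err:
    print("rejected:", err)
```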

### YAML

There are other standard formats for storing simple data types. For example YAML:

In [None]:
import yaml
yaml.dump(data_dict)

In [None]:
with open('file.yaml',"w") as f: 
    f.write(yaml.dump(data_dict))

In [None]:
!cat file.yaml

In [None]:
!ls 

In [None]:
with open('file.yaml', 'r') as f: 
    data_dict_reloaded = yaml.safe_load(f.read())

data_dict_reloaded

### JSON

[JSON](https://www.json.org/json-en.html) is commonly used to transmit data on the web:

In [None]:
import json
json.dumps(data_dict)

In [None]:
with open('file.json',"w") as f: 
    json.dump(data_dict,f)

In [None]:
!cat file.json

In [None]:
with open('file.json', 'r') as f: 
    data_dict_reloaded = json.load(f)

data_dict_reloaded
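
JSON only knows a handful of types, so a round trip does not always give back exactly what was stored. The sketch below uses a made-up dictionary with an integer key and a tuple value to show two such changes:

```python
import json

original = {1: (1, 2), "B": "Foo"}
restored = json.loads(json.dumps(original))

print(restored)              # {'1': [1, 2], 'B': 'Foo'}
print(restored == original)  # False: the key became a string, the tuple a list
```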

### XML

XML is another format commonly used for storing data. It allows a bit more structure and there are python tools for creating XML representations of data, but it's a bit more complicated than the example above, so we'll skip it for now.

### pickle

[pickle](https://docs.python.org/3/library/pickle.html) is Python's method of serializing objects. Some advantages are that it is a binary format, so it is more compact, and that it can store full Python objects, not just simple built-ins. Let's look at the [pickle documentation](https://docs.python.org/3/library/pickle.html) first.

Here is an example:

In [None]:
import pickle
pickle.dumps(data_dict,protocol=2)

In [None]:
with open('file.pickle',"wb") as f: 
    pickle.dump(data_dict,f)

In [None]:
!cat file.pickle

In [None]:
with open('file.pickle', 'rb') as f: 
    data_dict_reloaded = pickle.load(f)

data_dict_reloaded
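
The same round trip can be done in memory with `pickle.dumps` and `pickle.loads`. The sketch below uses a made-up dictionary with an integer key and a tuple value to show that pickle keeps Python-specific types intact:

```python
import pickle

original = {1: (1, 2), "B": "Foo"}

blob = pickle.dumps(original)   # a bytes object, not text
print(type(blob))

restored = pickle.loads(blob)
print(restored == original)     # True
print(type(restored[1]))        # <class 'tuple'>
```

As the pickle documentation warns, loading a pickle can execute code, so only unpickle data that comes from a source you trust.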

## Python classes

Imagine you have data stored in a python object:

In [None]:
# Instance of a python class with data

class data_class:
    def __init__(self):
        self._data = dict()
    
    def add(self,key,value):
        self._data[key]=value
        
    def get(self,key):
        return self._data[key]
    
    def __repr__(self):
        return self._data.__repr__()

data_class_instance = data_class()
data_class_instance.add("A",1)
data_class_instance.add("B","Foo")

print("Value of A:", data_class_instance.get("A"))
print("Value of B:", data_class_instance.get("B"))

Since we implemented `__repr__`, I should be able to store the data using `repr`:

In [None]:
with open('file.py',"w") as f: 
    f.write(repr(data_class_instance))

In [None]:
with open('file.py', 'r') as f: 
    data_class_instance_reloaded = eval(f.read())

data_class_instance_reloaded

But what I get back is not the original object reconstituted, but a dictionary holding the data:

In [None]:
type(data_class_instance_reloaded)

In [None]:
data_class_instance_reloaded.add("C",2)

In [None]:
data_class_instance_reloaded
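
One way to make `repr` and `eval` round-trip a whole object is to have `__repr__` return an expression that calls the constructor. The class `DataStore` below is a hypothetical variant written for this sketch; its constructor accepts the initial data, which the original `data_class` does not.

```python
class DataStore:
    """Hypothetical variant whose repr is a constructor call."""

    def __init__(self, data=None):
        self._data = dict(data) if data is not None else {}

    def add(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def __repr__(self):
        # Emit an expression that rebuilds the object, not just the inner dict
        return f"DataStore({self._data!r})"


store = DataStore()
store.add("A", 1)
store.add("B", "Foo")

text = repr(store)
print(text)  # DataStore({'A': 1, 'B': 'Foo'})

# The class must be known where eval runs; here it is passed in explicitly
restored = eval(text, {"DataStore": DataStore})
print(type(restored).__name__)  # DataStore
restored.add("C", 2)
print(restored.get("C"))        # 2
```

This only works when every stored value has a `repr` that is itself valid Python, and the reader of the file needs the class definition available.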

### pickle

Pickle allows me to store the object:

In [None]:
with open('file.pickle',"wb") as f: 
    pickle.dump(data_class_instance,f)

In [None]:
with open('file.pickle', 'rb') as f: 
    data_class_instance_reloaded = pickle.load(f)

data_class_instance_reloaded

In [None]:
type(data_class_instance_reloaded)

In [None]:
data_class_instance_reloaded.add("C",2)

## Storing Multiple Objects into Pickle

Use a dictionary.

In [None]:
data_class_instance_2 = data_class()
data_class_instance_2.add("C",2)
data_class_instance_2.add("D","Bar")

In [None]:
with open('file.pickle',"wb") as f: 
    pickle.dump({"my_class":data_class_instance,
                 "my_class_2":data_class_instance_2},
                f)

In [None]:
with open('file.pickle', 'rb') as f: 
    loaded_data = pickle.load(f)

data_class_instance_reloaded = loaded_data["my_class"]
data_class_instance_reloaded_2 = loaded_data["my_class_2"]
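
A dictionary is not the only option. Several `pickle.dump` calls can write to the same file, and matching `pickle.load` calls read the objects back in the same order. The sketch below uses an in-memory `io.BytesIO` buffer as a stand-in for a file, with made-up objects:

```python
import io
import pickle

buffer = io.BytesIO()           # in-memory stand-in for a file opened with "wb"
pickle.dump({"A": 1}, buffer)
pickle.dump(["C", 2], buffer)

buffer.seek(0)                  # rewind, as if reopening the file with "rb"
first = pickle.load(buffer)
second = pickle.load(buffer)
print(first, second)

# A further load finds no more data
try:
    pickle.load(buffer)
except EOFError:
    print("no more objects")
```

The dictionary approach has the advantage that each object is retrieved by name, so the reader does not need to know the order in which things were written.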

## Pickling Data

In [None]:
import numpy as np
M = np.random.random((1000,1000))

In [None]:
with open('M.pickle',"wb") as f: 
    pickle.dump(M, f)

In [None]:
np.save("M.npy",M)

In [None]:
!ls -lh

In [None]:
!ls -l

In [None]:
M_list=M.tolist()

In [None]:
with open('M_list.pickle',"wb") as f: 
    pickle.dump(M_list, f)

In [None]:
!ls -lh

In [None]:
!ls -l

In [None]:
!rm *.pickle *.yaml *.json file.py