# Lecture 10 (Part a)

## Public, Protected, Private

In most object-oriented languages, data encapsulation is achieved by enabling classes to declare protected and private data members in addition to the public ones. Quick recap:

* Public members are accessible to everyone
* Private members are only accessible to the class itself
* Protected members are accessible to the class itself and to its child classes

Python does not strictly enforce data encapsulation; it is mostly a convention. Data members starting with one underscore (`_`) are treated as protected. Data members starting with two underscores (`__`) are treated as private.

Consider the following example:

In [1]:
class parent:
    def __init__(self):
        self.public="I'm Public"
        self._protected="I'm Protected"
        self.__private="I'm Private"
        
    def set_private_parent(self,v):
        self.__private=v
        
    def get_private_parent(self):
        return self.__private
        
class child(parent):
    def __init__(self, init_parent=True):
        if init_parent:
            super(child,self).__init__()
    
    def set_public(self,v):
        self.public=v
        
    def set_protected(self,v):
        self._protected=v

    def set_private(self,v):
        self.__private=v
        
    def get_public(self):
        return self.public
        
    def get_protected(self):
        return self._protected

    def get_private(self):
        return self.__private


Let's see what we can do with the private attributes:

In [2]:
# Create an instance of parent:
my_parent = parent()
print(my_parent.__private)

AttributeError: 'parent' object has no attribute '__private'

We can't access the private attributes from outside of the class. It should be the same with the protected:

In [3]:
print(my_parent._protected)

I'm Protected


In [4]:
my_parent._protected = 'abc'
print(my_parent._protected)

abc


But it's not! We can access protected data!

Finally, let's check out public attributes:

In [5]:
print(my_parent.public)
my_parent.public = 5
print(my_parent.public)

I'm Public
5


Note that we can set attributes of the class instead of the instance. 

In [6]:
parent.public = 22
print(parent.public)
print(my_parent.public)

22
5


But it doesn't make sense to do so. 
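
The behaviour above follows from Python's attribute lookup order: an instance's own attributes are consulted before those of its class. A minimal sketch, using a hypothetical `Demo` class that is not part of the lecture code:

```python
class Demo:
    """Hypothetical class used only to illustrate attribute lookup."""
    def __init__(self, set_instance_attr=True):
        if set_instance_attr:
            self.public = "instance value"

# Setting an attribute on the class creates a class attribute.
Demo.public = "class value"

with_instance_attr = Demo()
without_instance_attr = Demo(set_instance_attr=False)

# The instance attribute shadows the class attribute.
print(with_instance_attr.public)     # instance value
# With no instance attribute, lookup falls back to the class.
print(without_instance_attr.public)  # class value
```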

The proper way to access/change a private attribute is using the getters and setters provided by the class:

In [7]:
my_parent.get_private_parent()
my_parent.set_private_parent(42)
my_parent.get_private_parent()

42

Note that because we declared the data members in the parent constructor, we have to make sure that the child calls the parent constructor.

In [8]:
child_instance = child(init_parent=False)
child_instance.public

22

We set this attribute on the class earlier, which is why its value is 22. It should have been "I'm Public", but we didn't call the constructor of the parent class. 

If everything is done correctly:

In [9]:
child_instance = child()
child_instance.public

"I'm Public"

Even though we wrote accessors (setters and getters), we can directly access and set the data:

In [10]:
child_instance = child()
print(child_instance.public)
child_instance.public="Changed Public"
print(child_instance.public)

I'm Public
Changed Public


Of course the accessors also work:

In [11]:
child_instance = child()
print(child_instance.get_public())
child_instance.set_public("Changed Public")
print(child_instance.get_public())

I'm Public
Changed Public


How about the protected?

In [12]:
print(child_instance._protected)
child_instance._protected="Changed Protected"
print(child_instance._protected)

I'm Protected
Changed Protected


In [13]:
child_instance = child()
print(child_instance.get_protected())
child_instance.set_protected("Changed Protected")
print(child_instance.get_protected())

I'm Protected
Changed Protected


So there isn't any difference between public and protected in Python. We can just adopt the convention that data members starting with a single underscore will only be accessed via accessors. 
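
One common way to follow that convention is Python's built-in `property`, which routes attribute access through a getter and a setter. The sketch below is only one possible approach; the `Account` class and its `balance` attribute are hypothetical and not part of the lecture code.

```python
class Account:
    """Hypothetical example: a protected attribute exposed via a property."""
    def __init__(self, balance=0):
        self._balance = balance  # protected by convention only

    @property
    def balance(self):
        # Getter: called when reading `account.balance`
        return self._balance

    @balance.setter
    def balance(self, value):
        # Setter: called on `account.balance = value`; validates first
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value

account = Account(10)
account.balance = 25
print(account.balance)  # 25
```

Note that `account._balance` is still reachable from outside; the property only makes the accessor route the convenient one.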

How about private?

In [14]:
print(child_instance.__private)
child_instance.__private="Changed Private"
print(child_instance.__private)

AttributeError: 'child' object has no attribute '__private'

It appears that we finally have some protection. A closer look shows how it's done:

In [15]:
dir(child_instance)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_parent__private',
 '_protected',
 'get_private',
 'get_private_parent',
 'get_protected',
 'get_public',
 'public',
 'set_private',
 'set_private_parent',
 'set_protected',
 'set_public']

Note that instead of a data member `__private` we have a data member `_parent__private`. Inside a class body, Python rewrites any identifier of the form `__<data_name>` (two leading underscores and at most one trailing underscore) to `_<class_name>__<data_name>`, a mechanism known as name mangling. So in fact, we can still change this data member and there is no real protection:

In [16]:
print(child_instance._parent__private)
child_instance._parent__private="Changed Private"
print(child_instance._parent__private)

I'm Private
Changed Private
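
The renaming only happens for code written inside a class body, which can be observed in an instance's `__dict__`. A minimal sketch, using a hypothetical `Base` class rather than the lecture's `parent`:

```python
class Base:
    def __init__(self):
        self.__secret = "inside"   # mangled to _Base__secret

obj = Base()
obj.__secret = "outside"           # not mangled: written outside a class body

# Two separate attributes now exist on the instance.
print(sorted(vars(obj)))  # ['_Base__secret', '__secret']
```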


Just to make sure we understand, here is what the parent class looks like:

In [17]:
parent_instance=parent()
dir(parent_instance)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_parent__private',
 '_protected',
 'get_private_parent',
 'public',
 'set_private_parent']

Note that a child cannot change a parent's private data:

In [18]:
child_instance = child()
print(child_instance.get_private())
child_instance.set_private("Changed Private")
print(child_instance.get_private())

AttributeError: 'child' object has no attribute '_child__private'

In [19]:
child_instance = child()
print(child_instance.get_private_parent())
child_instance.set_private_parent("Changed Private")
print(child_instance.get_private_parent())

I'm Private
Changed Private


Why don't I get an error in the following case? 

In [20]:
child_instance = child()
print(child_instance.get_private_parent())
child_instance.set_private("Changed Private")
print(child_instance.get_private_parent())

I'm Private
I'm Private


Note that the code doesn't work as intended... why? 

See if you can figure it out by looking at:

In [21]:
dir(child_instance)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_child__private',
 '_parent__private',
 '_protected',
 'get_private',
 'get_private_parent',
 'get_protected',
 'get_public',
 'public',
 'set_private',
 'set_private_parent',
 'set_protected',
 'set_public']
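
If you want to check your answer, the following sketch reproduces the situation with hypothetical classes `Base` and `Derived` (stand-ins for `parent` and `child`, not the lecture code itself):

```python
class Base:
    def __init__(self):
        self.__value = "base"        # stored as _Base__value

    def get_base_value(self):
        return self.__value          # reads _Base__value

class Derived(Base):
    def set_value(self, v):
        self.__value = v             # creates _Derived__value

obj = Derived()
obj.set_value("derived")

# The setter wrote to a different mangled name than the getter reads.
print(obj.get_base_value())          # base
print(sorted(vars(obj)))             # ['_Base__value', '_Derived__value']
```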

## Data Serialization

Data serialization refers to the process of converting data (usually in memory) that may have complex structure (e.g. a tree), into a linear sequence that can be used to reconstitute the original data structure. Such a sequence can be stored in a file or transmitted over a network. 

For example consider the following "simple" data structure:

In [22]:
# Simple Data Type

data_dict = { "A": 1, 
              "B": "Foo"}

### Python `repr`

The Python `repr` function, applied to built-ins and to classes you implement, can be used as a means of serialization. Take any Python built-in and you can see its string representation, which is essentially a string of Python code that evaluates to the object:

In [23]:
repr(data_dict)

"{'A': 1, 'B': 'Foo'}"

This representation can be easily written to a file:

In [24]:
with open('file.py',"w") as f: 
    f.write(repr(data_dict))

In [25]:
!cat file.py

{'A': 1, 'B': 'Foo'}

And reconstituted by evaluating the contents of the file:

In [26]:
with open('file.py', 'r') as f: 
    data_dict_reloaded = eval(f.read())

data_dict_reloaded

{'A': 1, 'B': 'Foo'}

Note that `eval` uses the python interpreter to execute python expressions stored in strings:

In [27]:
eval("print('Hello World')")

Hello World


In [28]:
x=eval("1+1")
x

2
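
Because `eval` runs arbitrary code, evaluating a file you did not write yourself is risky. For plain literals such as the dictionary above, the standard library's `ast.literal_eval` is a more restrictive alternative. A minimal sketch (the variable names are illustrative):

```python
import ast

text = repr({"A": 1, "B": "Foo"})

# literal_eval only accepts Python literals (dicts, lists, strings, numbers, ...)
restored = ast.literal_eval(text)
print(restored)  # {'A': 1, 'B': 'Foo'}

# Anything that is not a literal, such as a function call, is rejected
try:
    ast.literal_eval("print('Hello World')")
    rejected = False
except ValueError:
    rejected = True
print("rejected:", rejected)  # rejected: True
```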

### YAML

There are other standard formats for storing simple data types. For example YAML:

In [29]:
import yaml
yaml.dump(data_dict)

'A: 1\nB: Foo\n'

In [30]:
with open('file.yaml',"w") as f: 
    f.write(yaml.dump(data_dict))

In [31]:
!cat file.yaml

A: 1
B: Foo


In [32]:
!ls 

Lecture.10.a.ipynb M.pickle           file.json          file.yaml
Lecture.10.b.ipynb M_list.pickle      file.pickle
M.npy              Scores.csv         file.py


In [33]:
with open('file.yaml', 'r') as f: 
    data_dict_reloaded = yaml.safe_load(f.read())

data_dict_reloaded

{'A': 1, 'B': 'Foo'}

### JSON

[JSON](https://www.json.org/json-en.html) is commonly used to transmit data on the web:

In [34]:
import json
json.dumps(data_dict)

'{"A": 1, "B": "Foo"}'

In [35]:
with open('file.json',"w") as f: 
    json.dump(data_dict,f)

In [36]:
!cat file.json

{"A": 1, "B": "Foo"}

In [37]:
with open('file.json', 'r') as f: 
    data_dict_reloaded = json.load(f)

data_dict_reloaded

{'A': 1, 'B': 'Foo'}
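
JSON only knows a handful of types, so some Python values change on a round trip. A small sketch of two such cases (the `original` dictionary is made up for illustration):

```python
import json

original = {1: (2, 3)}          # integer key, tuple value

restored = json.loads(json.dumps(original))

# Keys become strings and tuples become lists
print(restored)                 # {'1': [2, 3]}
print(restored == original)     # False
```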

### XML

XML is another format commonly used for storing data. It allows a bit more structure and there are python tools for creating XML representations of data, but it's a bit more complicated than the example above, so we'll skip it for now.

### pickle

[pickle](https://docs.python.org/3/library/pickle.html) is Python's method of serializing objects. Some advantages are that it is a binary format, so it can be more compact, and that it can store full Python objects, not just simple built-ins. Let's look at the [pickle documentation](https://docs.python.org/3/library/pickle.html) first.

Here is an example:

In [38]:
import pickle
pickle.dumps(data_dict,protocol=2)

b'\x80\x02}q\x00(X\x01\x00\x00\x00Aq\x01K\x01X\x01\x00\x00\x00Bq\x02X\x03\x00\x00\x00Fooq\x03u.'

In [39]:
with open('file.pickle',"wb") as f: 
    pickle.dump(data_dict,f)

In [40]:
!cat file.pickle

��       }�(�A�K�B��Foo�u.

In [41]:
with open('file.pickle', 'rb') as f: 
    data_dict_reloaded = pickle.load(f)

data_dict_reloaded

{'A': 1, 'B': 'Foo'}
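
The `protocol` argument used above selects the pickle format version. A quick sketch comparing the encoded size for each protocol supported by the running interpreter; the exact sizes depend on the Python version, so none are shown here:

```python
import pickle

data_dict = {"A": 1, "B": "Foo"}

sizes = {}
for protocol in range(pickle.HIGHEST_PROTOCOL + 1):
    encoded = pickle.dumps(data_dict, protocol=protocol)
    # Every protocol must restore an equal object
    assert pickle.loads(encoded) == data_dict
    sizes[protocol] = len(encoded)
    print(protocol, len(encoded))
```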

## Python classes

Imagine you have data stored in a python object:

In [42]:
# Instance of a python class with data

class data_class:
    def __init__(self):
        self._data = dict()
    
    def add(self,key,value):
        self._data[key]=value
        
    def get(self,key):
        return self._data[key]
    
    def __repr__(self):
        return self._data.__repr__()

data_class_instance = data_class()
data_class_instance.add("A",1)
data_class_instance.add("B","Foo")

print("Value of A:", data_class_instance.get("A"))
print("Value of B:", data_class_instance.get("B"))

Value of A: 1
Value of B: Foo


Since we implemented `__repr__`, I should be able to store the data using `repr`:

In [43]:
with open('file.py',"w") as f: 
    f.write(repr(data_class_instance))

In [44]:
!cat file.py

{'A': 1, 'B': 'Foo'}

In [45]:
!ls -l

total 47272
-rwxr--r--@ 1 afarbin  staff    36258 Feb 17 13:40 Lecture.10.a.ipynb
-rwxr-xr-x@ 1 afarbin  staff   134721 Feb 17 11:36 Lecture.10.b.ipynb
-rw-r--r--@ 1 afarbin  staff  8000128 Feb  7 10:37 M.npy
-rw-r--r--@ 1 afarbin  staff  8000163 Feb  7 10:37 M.pickle
-rw-r--r--@ 1 afarbin  staff  8000163 Feb  7 10:37 M_list.pickle
-rwxr--r--@ 1 afarbin  staff     2428 Feb  7 10:37 Scores.csv
-rw-r--r--@ 1 afarbin  staff       20 Feb 17 13:32 file.json
-rw-r--r--@ 1 afarbin  staff       32 Feb 17 13:38 file.pickle
-rw-r--r--@ 1 afarbin  staff       20 Feb 17 13:41 file.py
-rw-r--r--@ 1 afarbin  staff       12 Feb 17 13:30 file.yaml


In [46]:
with open('file.py', 'r') as f: 
    data_class_instance_reloaded = eval(f.read())

data_class_instance_reloaded

{'A': 1, 'B': 'Foo'}

But what I get back is not the original object reconstituted, but a dictionary holding the data:

In [47]:
type(data_class_instance_reloaded)

dict

In [48]:
data_class_instance_reloaded.add("C",2)

AttributeError: 'dict' object has no attribute 'add'

In [49]:
data_class_instance_reloaded

{'A': 1, 'B': 'Foo'}

We can modify the class so that `__repr__` returns an expression that reconstructs the object:

In [50]:
# Instance of a python class with data

class data_class:
    def __init__(self,d=None):
        # Use the supplied dictionary if one was given, else start empty
        if d is not None:
            self._data=d
        else:
            self._data = dict()
    
    def add(self,key,value):
        self._data[key]=value
        
    def get(self,key):
        return self._data[key]
    
    def __repr__(self):
        return "data_class("+self._data.__repr__()+")"

data_class_instance = data_class()
data_class_instance.add("A",1)
data_class_instance.add("B","Foo")

print("Value of A:", data_class_instance.get("A"))
print("Value of B:", data_class_instance.get("B"))

Value of A: 1
Value of B: Foo


In [51]:
with open('file.py',"w") as f: 
    f.write(repr(data_class_instance))

with open('file.py', 'r') as f: 
    data_class_instance_reloaded = eval(f.read())

data_class_instance_reloaded

data_class({'A': 1, 'B': 'Foo'})

In [52]:
!cat file.py

data_class({'A': 1, 'B': 'Foo'})

In [53]:
data_class_instance_reloaded.add("C",2)
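
An alternative that avoids `eval` is to write the underlying dictionary as JSON and rebuild the object through its constructor. The sketch below assumes a class shaped like `data_class` above; the name `DataStore` and the methods `to_json` and `from_json` are hypothetical additions, not part of the lecture code.

```python
import json

class DataStore:
    """Hypothetical stand-in for data_class with JSON helpers."""
    def __init__(self, d=None):
        self._data = dict(d) if d is not None else {}

    def add(self, key, value):
        self._data[key] = value

    def get(self, key):
        return self._data[key]

    def to_json(self):
        # Serialize only the data, not the class itself
        return json.dumps(self._data)

    @classmethod
    def from_json(cls, text):
        # Rebuild an instance from the stored dictionary
        return cls(json.loads(text))

store = DataStore()
store.add("A", 1)
store.add("B", "Foo")

reloaded = DataStore.from_json(store.to_json())
reloaded.add("C", 2)  # works: reloaded is a DataStore, not a dict
print(type(reloaded).__name__, reloaded.get("C"))  # DataStore 2
```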

### pickle

Pickle allows me to store the object:

In [54]:
with open('file.pickle',"wb") as f: 
    pickle.dump(data_class_instance,f)

In [55]:
with open('file.pickle', 'rb') as f: 
    data_class_instance_reloaded = pickle.load(f)

data_class_instance_reloaded

data_class({'A': 1, 'B': 'Foo'})

In [56]:
type(data_class_instance_reloaded)

__main__.data_class

In [57]:
data_class_instance_reloaded.add("C",2)

## Storing Multiple Objects into Pickle

Use a dictionary.

In [58]:
data_class_instance_2 = data_class()
data_class_instance_2.add("C",2)
data_class_instance_2.add("D","Bar")

In [59]:
with open('file.pickle',"wb") as f: 
    pickle.dump({"my_class":data_class_instance,
                 "my_class_2":data_class_instance_2},
                f)

In [60]:
with open('file.pickle', 'rb') as f: 
    loaded_data = pickle.load(f)

data_class_instance_reloaded = loaded_data["my_class"]
data_class_instance_reloaded_2 = loaded_data["my_class_2"]
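
A dictionary is not the only option: `pickle.dump` can be called several times on the same file, and each `pickle.load` then returns the next object in the order written. A minimal sketch using an in-memory buffer (`io.BytesIO`) in place of a real file:

```python
import io
import pickle

buffer = io.BytesIO()            # stands in for open('file.pickle', 'wb')

# Write two objects back to back
pickle.dump({"A": 1}, buffer)
pickle.dump(["C", 2], buffer)

buffer.seek(0)                   # rewind, as if reopening the file for reading
first = pickle.load(buffer)
second = pickle.load(buffer)
print(first, second)             # {'A': 1} ['C', 2]
```

The dictionary approach has the advantage that objects are retrieved by name rather than by position.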

## Pickling Data

In [61]:
import numpy as np
M = np.random.random((1000,1000))

In [62]:
with open('M.pickle',"wb") as f: 
    pickle.dump(M, f)

In [63]:
np.save("M.npy",M)

In [64]:
!ls -lh

total 48040
-rwxr--r--@ 1 afarbin  staff    38K Feb 17 13:42 Lecture.10.a.ipynb
-rwxr-xr-x@ 1 afarbin  staff   132K Feb 17 11:36 Lecture.10.b.ipynb
-rw-r--r--@ 1 afarbin  staff   7.6M Feb 17 13:44 M.npy
-rw-r--r--@ 1 afarbin  staff   7.6M Feb 17 13:44 M.pickle
-rw-r--r--@ 1 afarbin  staff   7.6M Feb  7 10:37 M_list.pickle
-rwxr--r--@ 1 afarbin  staff   2.4K Feb  7 10:37 Scores.csv
-rw-r--r--@ 1 afarbin  staff    20B Feb 17 13:32 file.json
-rw-r--r--@ 1 afarbin  staff   132B Feb 17 13:43 file.pickle
-rw-r--r--@ 1 afarbin  staff    32B Feb 17 13:42 file.py
-rw-r--r--@ 1 afarbin  staff    12B Feb 17 13:30 file.yaml


In [65]:
!ls -l

total 48040
-rwxr--r--@ 1 afarbin  staff    40360 Feb 17 13:44 Lecture.10.a.ipynb
-rwxr-xr-x@ 1 afarbin  staff   134721 Feb 17 11:36 Lecture.10.b.ipynb
-rw-r--r--@ 1 afarbin  staff  8000128 Feb 17 13:44 M.npy
-rw-r--r--@ 1 afarbin  staff  8000163 Feb 17 13:44 M.pickle
-rw-r--r--@ 1 afarbin  staff  8000163 Feb  7 10:37 M_list.pickle
-rwxr--r--@ 1 afarbin  staff     2428 Feb  7 10:37 Scores.csv
-rw-r--r--@ 1 afarbin  staff       20 Feb 17 13:32 file.json
-rw-r--r--@ 1 afarbin  staff      132 Feb 17 13:43 file.pickle
-rw-r--r--@ 1 afarbin  staff       32 Feb 17 13:42 file.py
-rw-r--r--@ 1 afarbin  staff       12 Feb 17 13:30 file.yaml


In [66]:
M_list=M.tolist()

In [67]:
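with open('M_list.pickle',"wb") as f: 
    # Note: this pickles the array M, not M_list, which is why the file size
    # below matches M.pickle; pass M_list here to pickle the nested list.
    pickle.dump(M, f)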

In [68]:
!ls -lh

total 48040
-rwxr--r--@ 1 afarbin  staff    39K Feb 17 13:44 Lecture.10.a.ipynb
-rwxr-xr-x@ 1 afarbin  staff   132K Feb 17 11:36 Lecture.10.b.ipynb
-rw-r--r--@ 1 afarbin  staff   7.6M Feb 17 13:44 M.npy
-rw-r--r--@ 1 afarbin  staff   7.6M Feb 17 13:44 M.pickle
-rw-r--r--@ 1 afarbin  staff   7.6M Feb 17 13:45 M_list.pickle
-rwxr--r--@ 1 afarbin  staff   2.4K Feb  7 10:37 Scores.csv
-rw-r--r--@ 1 afarbin  staff    20B Feb 17 13:32 file.json
-rw-r--r--@ 1 afarbin  staff   132B Feb 17 13:43 file.pickle
-rw-r--r--@ 1 afarbin  staff    32B Feb 17 13:42 file.py
-rw-r--r--@ 1 afarbin  staff    12B Feb 17 13:30 file.yaml
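
To compare the formats without writing files, the sizes can also be measured in memory. The sketch below uses a smaller array than the lecture's and a hypothetical helper `npy_size`; exact byte counts depend on the NumPy and Python versions, so none are shown.

```python
import io
import pickle
import numpy as np

def npy_size(array):
    """Bytes np.save would write for this array (hypothetical helper)."""
    buffer = io.BytesIO()
    np.save(buffer, array)
    return buffer.getbuffer().nbytes

M_small = np.random.random((100, 100))

size_npy = npy_size(M_small)
size_pickle_array = len(pickle.dumps(M_small))
size_pickle_list = len(pickle.dumps(M_small.tolist()))

print("np.save:       ", size_npy)
print("pickle (array):", size_pickle_array)
print("pickle (list): ", size_pickle_list)
```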



In [None]:
!ls -l

In [None]:
!rm *.pickle *.yaml *.json file.py