## Data Storage

In [None]:
%matplotlib inline

In [None]:
# required imports
import numpy as np

There are many different data storage formats. We will cover only the basics here, then the formats most commonly used in AES.

### Basic Data Storage Formats

In [None]:
# Let's create a 50 x 50 array of floats:
test_arr = np.random.rand(50, 50)
test_arr

Let's try to save this 50 x 50 grid. 

In [None]:
# save as comma-separated text (passing a path lets NumPy open and close the file)
test_arr.tofile("./test_out_arr.txt", sep=',')

In [None]:
!ls -lh

Okay, okay, I know. 50 kB? Who cares? But that is about 20 bytes per value as text, versus 8 bytes per float64 in memory, and the gap grows with the array. Now, let's look into saving as pure binary.

In [None]:
# save as raw binary
test_arr.tofile("./test_out_arr_binary.binary", sep='')

In [None]:
!ls -lh
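
The sizes in the listing can also be checked from Python rather than read off `ls`. The following is a minimal sketch, assuming a fresh random array (`arr`) and temporary file names that are not the notebook's own files. The binary file should come out at exactly 50 x 50 x 8 bytes, since each float64 occupies 8 bytes.

```python
import os
import tempfile

import numpy as np

arr = np.random.rand(50, 50)  # stand-in for test_arr (assumption)

with tempfile.TemporaryDirectory() as tmp:
    txt_path = os.path.join(tmp, "arr.txt")
    bin_path = os.path.join(tmp, "arr.binary")

    arr.tofile(txt_path, sep=",")  # comma-separated text
    arr.tofile(bin_path)           # raw binary (default sep='')

    txt_size = os.path.getsize(txt_path)
    bin_size = os.path.getsize(bin_path)

print(f"in memory:   {arr.nbytes} bytes")
print(f"text file:   {txt_size} bytes")
print(f"binary file: {bin_size} bytes")
```

The text file is larger because every value is written out as a string of decimal digits plus a separator.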

Python also has a module called pickle that serializes Python objects to a byte stream. For a NumPy array, this stores metadata such as the dtype and shape alongside the raw data.

In [None]:
import pickle

In [None]:
with open("./test_out_arr.p", 'wb') as f:
    pickle.dump(test_arr, f)

In [None]:
!ls -lh
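
The practical difference between the raw binary file and the pickle shows up when reading the data back. The sketch below assumes a fresh random array and temporary file names (neither is from the notebook). It illustrates that `np.fromfile` returns a flat array and needs the dtype and shape supplied by hand, while `pickle.load` restores them from the file.

```python
import os
import pickle
import tempfile

import numpy as np

arr = np.random.rand(50, 50)  # stand-in for test_arr (assumption)

with tempfile.TemporaryDirectory() as tmp:
    bin_path = os.path.join(tmp, "arr.binary")
    pkl_path = os.path.join(tmp, "arr.p")

    arr.tofile(bin_path)
    with open(pkl_path, "wb") as f:
        pickle.dump(arr, f)

    # Raw binary carries no metadata: we must know the dtype and shape.
    flat = np.fromfile(bin_path, dtype=np.float64)
    from_binary = flat.reshape(50, 50)

    # The pickle stores dtype and shape along with the data.
    with open(pkl_path, "rb") as f:
        from_pickle = pickle.load(f)

print("fromfile shape:", flat.shape)
print("pickle shape:  ", from_pickle.shape)
```

Note that unpickling can execute arbitrary code, so pickle files should only be loaded from trusted sources.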

### Basic Compression

Let's try compressing the text and binary files from before.

In [None]:
import zipfile

In [None]:
with zipfile.ZipFile("test_out_arr_txt.zip", 'w', compression=zipfile.ZIP_DEFLATED) as txt_zip: 
    txt_zip.write("test_out_arr.txt")

In [None]:
!ls -lh

In [None]:
with zipfile.ZipFile("test_out_arr_bin.zip", 'w', compression=zipfile.ZIP_DEFLATED) as bin_zip:
    bin_zip.write("test_out_arr_binary.binary")

In [None]:
!ls -lh
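
The archives themselves record how well each file compressed. The following sketch assumes a fresh random array and temporary files rather than the notebook's own. It reads `file_size` and `compress_size` from each archive's `ZipInfo` entry and prints the ratio between them.

```python
import os
import tempfile
import zipfile

import numpy as np

arr = np.random.rand(50, 50)  # stand-in for test_arr (assumption)
ratios = {}

with tempfile.TemporaryDirectory() as tmp:
    paths = {
        "text": os.path.join(tmp, "arr.txt"),
        "binary": os.path.join(tmp, "arr.binary"),
    }
    arr.tofile(paths["text"], sep=",")
    arr.tofile(paths["binary"])

    for label, path in paths.items():
        zip_path = path + ".zip"
        with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
            zf.write(path, arcname=os.path.basename(path))

        # Read the sizes back from the archive's directory entry.
        with zipfile.ZipFile(zip_path) as zf:
            info = zf.infolist()[0]
        ratios[label] = info.compress_size / info.file_size
        print(f"{label:>6}: {info.file_size} -> {info.compress_size} bytes "
              f"(ratio {ratios[label]:.2f})")
```

A ratio close to 1 means the compressor found little redundancy to remove.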

Huh. Maybe we weren't expecting that result: the binary archive is hardly smaller than the file it contains. Let's try a different array.

In [None]:
test_arr = np.ones((50, 50))
test_arr

In [None]:
# save as comma separated, then zip
test_arr.tofile("./test_out_arr_ones.txt", sep=',')
with zipfile.ZipFile("test_out_arr_ones_txt.zip", 'w', compression=zipfile.ZIP_DEFLATED) as txt_zip: 
    txt_zip.write("test_out_arr_ones.txt")

In [None]:
!ls -lh

In [None]:
# save as binary, then zip
test_arr.tofile("./test_out_arr_ones.binary", sep='')
with zipfile.ZipFile("test_out_arr_ones_bin.zip", 'w', compression=zipfile.ZIP_DEFLATED) as bin_zip:
    bin_zip.write("test_out_arr_ones.binary")

In [None]:
!ls -lh

What does this tell us about the impact of compression?
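
One way to explore this question is to compress the raw bytes of both arrays in memory and compare the ratios. This is a sketch only: it uses `zlib.compress` directly (DEFLATE, the same algorithm behind `zipfile.ZIP_DEFLATED`) instead of writing zip files, and the arrays are freshly created stand-ins for the ones above.

```python
import zlib

import numpy as np

# Raw float64 bytes of each array, as tofile() would write them.
samples = {
    "random": np.random.rand(50, 50).tobytes(),
    "ones": np.ones((50, 50)).tobytes(),
}

ratios = {}
for label, raw in samples.items():
    packed = zlib.compress(raw)
    ratios[label] = len(packed) / len(raw)
    print(f"{label:>6}: {len(raw)} -> {len(packed)} bytes "
          f"(ratio {ratios[label]:.3f})")
```

The array of ones is a single 8-byte pattern repeated 2500 times, which is the kind of redundancy DEFLATE is built to remove; the random array offers very little of it.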