### Learn to use HDFStore to save and read data in Pandas

- `to_hdf` requires a `key` that names the object inside the file, much like a dictionary key.
- `complib='blosc:lz4'` with `complevel=9` produces a 700MB file; no compression produces an 808MB file.
- When `complevel` is set, the default `complib` is `zlib`. It is a bit slower than `blosc:lz4`, but with `complevel=9` it produces a 675MB file.
- `complib='bzip2'` with `complevel=9` takes much longer and produces a 696MB file.
- `format='fixed'` uses less space than `format='table'`.
- Overall, `complib='zlib'` with `complevel=9` gives the smallest file of the options tried.
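
The size comparison above can be reproduced along these lines. This is only a minimal sketch, assuming PyTables is installed (pandas needs it for HDF5) and that the PyTables build includes the `blosc:lz4` compressor. The helper name `hdf_size_mb`, the smaller frame, and the temporary file path are illustrative choices, not part of the original notebook.

```python
import os
import tempfile

import numpy as np
import pandas as pd


def hdf_size_mb(df, path, **to_hdf_kwargs):
    """Write df to a fresh HDF5 file and return its size in MB (hypothetical helper)."""
    # mode='w' starts from an empty file, so earlier writes cannot inflate the size
    df.to_hdf(path, key='df', mode='w', **to_hdf_kwargs)
    return os.path.getsize(path) / 1e6


if __name__ == '__main__':
    # Much smaller than the notebook's 1,000,000 x 100 frame so the sketch runs quickly
    df = pd.DataFrame(np.random.rand(20_000, 100))
    options = [
        ('no compression', {}),
        ('zlib, level 9', {'complib': 'zlib', 'complevel': 9}),
        ('blosc:lz4, level 9', {'complib': 'blosc:lz4', 'complevel': 9}),
    ]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'test.hdf')
        for label, kwargs in options:
            print(f'{label}: {hdf_size_mb(df, path, **kwargs):.1f} MB')
```

Uniform random floats compress poorly, which probably explains why even level 9 only trims the file modestly in the numbers above; real data with repeated values may behave quite differently.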

In [1]:
import pandas as pd
import numpy as np

In [2]:
df = pd.DataFrame(np.random.rand(1000000, 100))

In [3]:
df.to_hdf('test.hdf', key='df')

In [4]:
df = pd.read_hdf('test.hdf', key='df')

In [5]:
df.shape

(1000000, 100)
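
The same round trip can also go through `pd.HDFStore` directly, which behaves like a dictionary keyed by name. A minimal sketch, assuming PyTables is installed; the small frame and the temporary file name `store.hdf` are illustrative assumptions.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(1000, 5))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'store.hdf')
    # The context manager closes the file even if a write fails
    with pd.HDFStore(path, mode='w') as store:
        store['df'] = df        # equivalent to store.put('df', df)
        keys = store.keys()     # keys are reported with a leading slash
        loaded = store['df']    # equivalent to store.get('df')

print(keys)
print(loaded.shape)
```

`df.to_hdf` and `pd.read_hdf` open and close a store like this behind the scenes, so using `HDFStore` explicitly is mostly useful when several objects go into one file.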

In [15]:
df.to_hdf('test_compression.hdf', key='df', complib='blosc:lz4', complevel=9, format='fixed')

In [16]:
df.to_hdf('test_compression.hdf', key='df', complevel=9, format='table', complib='zlib')

739MB
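
The extra space that `format='table'` costs buys the ability to read part of the data without loading the whole file. A minimal sketch of that trade-off, assuming PyTables is installed; the frame size and the file name `table.hdf` are illustrative assumptions.

```python
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(10_000, 5))

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'table.hdf')
    df.to_hdf(path, key='df', mode='w', format='table',
              complib='zlib', complevel=9)
    # Row selection by query only works with format='table'
    head = pd.read_hdf(path, key='df', where='index < 100')
    # Positional slicing by row number
    middle = pd.read_hdf(path, key='df', start=500, stop=600)

print(head.shape, middle.shape)
```

If the file is always read back in full, `format='fixed'` is likely the simpler and smaller choice.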

In [10]:
# df.to_hdf('test_compression_bzip2.hdf', key='df', complib='bzip2', complevel=9)