Logs as append-only source: write your ML training results in Python without having to worry about crashes. Loading is a breeze: the logs are native Python code. The package supports unstructured data. The data can easily be imported into Jupyter Notebooks or elsewhere.
To install using pip, use:
pip install laaos
To run the tests, use:
python setup.py test
Storing training results as Python dictionaries or JSON files is problematic because the formats are not append-only, which means that you have to rewrite the file every time something changes. (Or you only write results at the end, which does not play well with interruptions or intermediate failures.)
Alternatively, we can write the operations that create a structure to a file in an append-only fashion. If the data structure only grows and is never mutated, this increases the file size by only a constant factor.
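To make the idea concrete, here is a minimal sketch of operation logging in plain Python. It does not use laaos and is not how the library is implemented; the AppendOnlyList class, the replay function, and the file name losses_log.py are hypothetical names chosen for illustration.

```python
import os
import tempfile


class AppendOnlyList:
    """Hypothetical helper (not part of laaos): logs each append as a line of Python."""

    def __init__(self, path, name):
        self.path = path
        self.name = name
        self.items = []
        self._write(f"{name} = []")

    def _write(self, statement):
        # Open in append mode so earlier lines are never rewritten, then
        # flush and fsync so the entry is on disk before we continue.
        with open(self.path, "a") as f:
            f.write(statement + "\n")
            f.flush()
            os.fsync(f.fileno())

    def append(self, value):
        self.items.append(value)
        # repr() round-trips primitive values such as floats, ints and strings.
        self._write(f"{self.name}.append({value!r})")


def replay(path):
    """Rebuild the logged structures by executing the file (trusted input only)."""
    namespace = {}
    with open(path) as f:
        exec(f.read(), namespace)
    return namespace


log_path = os.path.join(tempfile.mkdtemp(), "losses_log.py")
losses = AppendOnlyList(log_path, "losses")
for i in range(3):
    losses.append(1 / (i + 1))

recovered = replay(log_path)["losses"]
print(recovered)
```

Each append costs one extra line in the file, so the log stays proportional to the size of the data, and a crash between two appends leaves every earlier line intact.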
The advantage of this library is that the file format is very simple: it's valid Python code.
The only requirement is that you only store primitive types, lists, sets, dicts and immutable types.
Custom wrappers can be added by registering TypeHandlers when creating a Store. See WeakEnumHandler and StrEnumHandler.
from laaos import open_file_store, safe_load

store = open_file_store("test", suffix="", truncate=True)
print("Output file: ", store.uri)

store['losses'] = []
losses = store["losses"]

# Each append is recorded in the output file as a line of Python code.
for i in range(10):
    losses.append(1/(i+1))

store.close()
The resulting file laaos/test.py contains valid Python code:
store = {}
store['losses']=[]
store['losses'].append(1.0)
store['losses'].append(0.5)
store['losses'].append(0.3333333333333333)
store['losses'].append(0.25)
store['losses'].append(0.2)
store['losses'].append(0.16666666666666666)
store['losses'].append(0.14285714285714285)
store['losses'].append(0.125)
store['losses'].append(0.1111111111111111)
store['losses'].append(0.1)
It can be loaded either with:
from laaos.test import store
or with the more secure:
safe_load('laaos/test.py')
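Importing the file executes whatever code it contains, which is why a parser-based loader is the safer choice for files you did not write yourself. The sketch below illustrates that general idea with the standard ast module. It is not the implementation of safe_load; restricted_replay and _is_store_item are hypothetical names, and the sketch accepts only the three statement forms shown in the file above. It assumes Python 3.9 or later.

```python
import ast

ALLOWED_NAME = "store"  # the only variable the sketch lets a file touch


def _is_store_item(node):
    """True for expressions of the form store[<key>]."""
    return (
        isinstance(node, ast.Subscript)
        and isinstance(node.value, ast.Name)
        and node.value.id == ALLOWED_NAME
    )


def restricted_replay(source):
    """Replay `store = ...`, `store[k] = ...` and `store[k].append(...)` without exec()."""
    store = None
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign) and len(node.targets) == 1:
            target = node.targets[0]
            # literal_eval accepts only literals, so function calls are rejected here.
            value = ast.literal_eval(node.value)
            if isinstance(target, ast.Name) and target.id == ALLOWED_NAME:
                store = value
                continue
            if _is_store_item(target) and store is not None:
                # Python 3.9+: Subscript.slice is the key expression itself.
                store[ast.literal_eval(target.slice)] = value
                continue
        elif isinstance(node, ast.Expr) and isinstance(node.value, ast.Call):
            call = node.value
            func = call.func
            if (
                isinstance(func, ast.Attribute)
                and func.attr == "append"
                and _is_store_item(func.value)
                and store is not None
                and len(call.args) == 1
                and not call.keywords
            ):
                key = ast.literal_eval(func.value.slice)
                store[key].append(ast.literal_eval(call.args[0]))
                continue
        # Anything else (imports, arbitrary calls, other names) is refused.
        raise ValueError(f"unsupported statement on line {node.lineno}")
    return store


sample = (
    "store = {}\n"
    "store['losses']=[]\n"
    "store['losses'].append(1.0)\n"
    "store['losses'].append(0.5)\n"
)
loaded = restricted_replay(sample)
print(loaded)
```

A file containing a statement such as import os raises ValueError instead of being executed.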
# Example: resuming an interrupted experiment from its existing log.
from laaos import open_file_store

initial_data = dict(config=dict(dataset="MNIST", learning_rate=1e-4, seed=1337), losses=[])
store = open_file_store("experiment_result", suffix="", initial_data=initial_data)

# Guard against resuming a run that was started with a different configuration.
if store["config"] != initial_data["config"]:
    raise ValueError("Experiment mismatch!")

print("Output file: ", store.uri)

losses = store["losses"]
# Continue from the number of losses already recorded.
for i in range(len(losses), 10):
    print("Epoch ", i)
    losses.append(1 / (i + 1))
    # Simulate a preemption every third epoch.
    if i % 3 == 0:
        raise SystemError("Preemption!")

store.close()