Feature: Storing associated blobs and data with Entries #23

gavento · 2020-02-27T19:05:47Z

Some computations have multiple outputs, and some of those are naturally files. E.g. training a neural net outputs: the model parameters (data or file), resulting stats (data), TF SummaryWriter logs (file), sometimes graphs or images (files), and stdout/stderr captures (data). It would be great if some of those types could also be displayed in the browser (images, text files, logs, ...).

Table / properties

Add a table for storing blobs, so that every currently valid Entry can have associated blobs. It would make sense to also store the serialized output value there (for consistency, or e.g. to allow external blob storage).

Every blob has:

id
entry - reference to entry, M:1 (TODO: update to match the current schema)
data (blob)
name - filename (relative to the workdir), or empty (for the pickled return value), or any name without a slash (just a blob, which may still be instantiated as a file)
Some notion of kind/type/intent - which should be displayed in browser, which are images, which are (viewable) text files, how to highlight the text, etc. Plugins may define more (e.g. tensorboard).
- MIME types seem to be both too much and insufficient (e.g. for TF logs)? (But they are good for browser open/download.)
- We can just have tag field mixing role (full, thumbnail, ...) and type (text, json, png, jpg)
- Or we can have both mime for type and tags for role/intent/plugin (for distinguishing e.g. TF logs ..).
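
To make the proposal concrete, here is a minimal sketch of such a table in SQLite, following the last option above (a MIME type for the type plus tags for role/intent). All table and column names, and the comma-separated encoding of tags, are assumptions made for illustration; they are not the project's actual schema.

```python
import sqlite3

# Hypothetical schema: names and the tag encoding are assumptions for illustration.
SCHEMA = """
CREATE TABLE entries (
    id INTEGER PRIMARY KEY
);
CREATE TABLE blobs (
    id       INTEGER PRIMARY KEY,
    entry    INTEGER NOT NULL REFERENCES entries(id),  -- M:1 reference to the entry
    data     BLOB NOT NULL,
    name     TEXT NOT NULL DEFAULT '',  -- '' marks the pickled return value
    mimetype TEXT,                      -- type, useful for browser open/download
    tags     TEXT NOT NULL DEFAULT ''   -- comma-separated role/intent/plugin tags
);
"""


def demo():
    """Store two blobs for one entry and list the ones a browser could show as images."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO entries (id) VALUES (1)")
    insert = "INSERT INTO blobs (entry, data, name, mimetype, tags) VALUES (?, ?, ?, ?, ?)"
    conn.execute(insert, (1, b"fake image bytes", "plots/loss.png", "image/png", "thumbnail"))
    # A MIME type alone cannot describe TF logs, so a tag carries the intent.
    conn.execute(insert, (1, b"fake log bytes", "events.out", None, "tensorboard"))
    rows = conn.execute(
        "SELECT name FROM blobs WHERE entry = ? AND mimetype LIKE 'image/%'", (1,)
    ).fetchall()
    conn.close()
    return [name for (name,) in rows]
```

Here the MIME type answers "how can the browser open this?", while the tag answers "what is this for?".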

API

Managed through context for creation (see #22):

ctx.add_blob(data, name, mimetype, tags=()) - add data blob
ctx.add_file(path, name=None, mimetype=None, tags=()) - add an existing file (name defaults to the file's name)
And some type-specific functions (more for text/logs, etc.)
ctx.add_figure(fig, name, tags=('thumbnail', )) - render and insert Matplotlib/plotly/bokeh/... image
ctx.add_pickled(obj, name, tags=('pickled', )) - pickle and add object
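
The creation side of the API could be sketched roughly as follows. This is an in-memory stand-in, not the context object from #22: the `Blob` and `BlobContext` classes, the optional `mimetype`, the MIME type guessing, and the duplicate-name check are all assumptions made for illustration.

```python
import mimetypes
import pickle
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Optional, Tuple


@dataclass
class Blob:
    """Hypothetical in-memory record mirroring the proposed table columns."""
    data: bytes
    name: str
    mimetype: Optional[str] = None
    tags: Tuple[str, ...] = ()


class BlobContext:
    """Hypothetical stand-in for the computation context (not the real one from #22)."""

    def __init__(self) -> None:
        self.blobs: Dict[str, Blob] = {}

    def add_blob(self, data, name, mimetype=None, tags=()):
        # Assumption: names are unique within one entry.
        if name in self.blobs:
            raise ValueError(f"blob {name!r} already exists")
        blob = Blob(bytes(data), name, mimetype, tuple(tags))
        self.blobs[name] = blob
        return blob

    def add_file(self, path, name=None, mimetype=None, tags=()):
        path = Path(path)
        if name is None:
            name = path.name
        if mimetype is None:
            # Assumption: fall back to guessing from the file extension.
            mimetype = mimetypes.guess_type(path.name)[0]
        return self.add_blob(path.read_bytes(), name, mimetype, tags)

    def add_pickled(self, obj, name, tags=("pickled",)):
        return self.add_blob(pickle.dumps(obj), name, "application/octet-stream", tags)
```

A computation would then call, for example, `ctx.add_pickled(stats, "stats")` and `ctx.add_file("train.log")` while it runs, and the blobs would be stored together with the resulting Entry.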

Properties and methods on Entry:

Entry.files - dictionary name: EntryFile

EntryFile (bikesheddable) has similar properties to the table above. In addition, it has methods:

EntryFile.write_file(filename=None) - write as real file, returns Path object
EntryFile.as_file() - return a readable file-like object (SQLite supports this)
EntryFile.data() - return binary data
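
A minimal sketch of the reading side, under the assumption that the blob contents are already loaded into memory. The constructor arguments are hypothetical, and `io.BytesIO` stands in for a database-backed file object (Python 3.11+ offers `sqlite3.Connection.blobopen` for incremental blob I/O, which a real implementation might use instead).

```python
import io
from pathlib import Path


class EntryFile:
    """Hypothetical sketch of the proposed EntryFile; contents are held in memory."""

    def __init__(self, name, data, mimetype=None, tags=()):
        self.name = name
        self.mimetype = mimetype
        self.tags = tuple(tags)
        self._data = bytes(data)

    def write_file(self, filename=None):
        """Write the blob as a real file and return its Path."""
        target = filename if filename is not None else self.name
        if not target:
            # The pickled return value has an empty name, so it needs an explicit filename.
            raise ValueError("a filename is required for unnamed blobs")
        target = Path(target)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(self._data)
        return target

    def as_file(self):
        """Return a readable binary file-like object."""
        return io.BytesIO(self._data)

    def data(self):
        """Return the raw binary data."""
        return self._data
```

With `Entry.files` as a dictionary, usage would look like `entry.files["plots/loss.png"].write_file()`, which recreates the file relative to the current directory.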


spirali · 2020-02-27T23:32:23Z

I agree with the idea and have no objections. If add_figure is implemented in a way that does not enforce a strong dependency on matplotlib, etc., it is OK with me.
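
One way to avoid a hard dependency is duck typing: never import a plotting library, and only call methods on the figure object that the caller passes in. The sketch below is an assumption about how this could look, not the project's implementation. The helper name `render_figure` is hypothetical; it relies only on the figure having a `savefig(file, format=...)` method, which Matplotlib figures provide. Other libraries would need their own branches.

```python
import io


def render_figure(fig, fmt="png"):
    """Render a figure to bytes without importing any plotting library.

    Hypothetical helper: works with any object exposing a Matplotlib-style
    savefig(file_like, format=...) method.
    """
    savefig = getattr(fig, "savefig", None)
    if callable(savefig):
        buf = io.BytesIO()
        savefig(buf, format=fmt)
        return buf.getvalue()
    raise TypeError(f"don't know how to render {type(fig).__name__}")
```

add_figure could then pass the returned bytes on to add_blob, so the plotting library is needed only by users who actually create figures.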

FYI, Entry has a composite primary key (builder_name, key).
Is "id" necessary in the table? It seems that (entry, name) should be unique.
I like the idea of tags, but I would rather separate it from content type.
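
These three points could be combined as in the following sketch: the blob table drops the separate id, uses (builder_name, key, name) as its primary key, and keeps the content type apart from the tags. Apart from builder_name and key, the table and column names are assumptions for illustration.

```python
import sqlite3

# Hypothetical schema; only (builder_name, key) is taken from the discussion.
SCHEMA = """
CREATE TABLE entries (
    builder_name TEXT NOT NULL,
    "key"        TEXT NOT NULL,
    PRIMARY KEY (builder_name, "key")
);
CREATE TABLE blobs (
    builder_name TEXT NOT NULL,
    "key"        TEXT NOT NULL,
    name         TEXT NOT NULL,
    data         BLOB NOT NULL,
    content_type TEXT,                      -- what the data is
    tags         TEXT NOT NULL DEFAULT '',  -- role/intent, kept separate from the type
    PRIMARY KEY (builder_name, "key", name),
    FOREIGN KEY (builder_name, "key") REFERENCES entries (builder_name, "key")
);
"""


def duplicate_is_rejected():
    """Return True if inserting the same (entry, name) twice is refused."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO entries VALUES ('train', 'abc')")
    row = ("train", "abc", "loss.png", b"fake image bytes", "image/png", "thumbnail")
    conn.execute("INSERT INTO blobs VALUES (?, ?, ?, ?, ?, ?)", row)
    try:
        conn.execute("INSERT INTO blobs VALUES (?, ?, ?, ?, ?, ?)", row)
    except sqlite3.IntegrityError:
        return True
    finally:
        conn.close()
    return False
```

With this key, the database itself guarantees that a name identifies at most one blob per entry.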
