Proposal implementation for handling model attachments #119

dnouri · 2019-10-21T09:36:31Z

The idea is that we sometimes want to attach files to models, such as
HTML reports or the like, and in the backend, these files should be
stored separately, to allow easy access.

Here we're implemeting this idea for the 'FileLike' model persister
and testing it for the 'File' subclass. This should work for 'Rest'
and 'S3' as well, but I thought it's best to add tests when we all
agreed on the idea.

Usage is demonstrated in 'TestFileAttachments'. The contract is as
follows: Use 'palladium.util.annotate' to add an arbitrary number of
attachments to the model, like so:

annotate(model1, {'attachments/myatt.txt': 'aGV5',
                  'attachments/my2ndatt.txt': 'aG8='})

Note that the keys of such attachments must start with 'attachments/',
with the rest indicating a filename. The values must be base64
encoded but converted from bytes to strings. This is arguably a bit
awkward, but we do this because the attachments dictionary must in
general be JSON serializable, and using bytes would violate this.

When 'model1' is persisted, 'FileLike' will create one file for each
attachment and call them 'model-1-myatt.txt' and
'model-1-my2ndatt.txt'. The implementation chooses to use flat files
rather than a folder to hold all attachments for a given model. This
is done so that we do not need to add extra methods to
'FileLikeIO' (such as mkdir), which means we should get support for
other 'FileLike' implementations such as 'Rest' and 'S3' for free.

Moreover, the attachments will be removed from the model's pickle and
from the metadata files, in order not to blow up the size of those.
When the model is loaded back through the model persister, the
attachments are loaded and put back into the model's metadata
dictionary.

What's a good time to add the attachments to the model? Use the
'write_model_decorators' pluggable decorator hook to add a decorator
that adds your attachment just before it's persisted. A toy example:

def my_write_model_decorator(self, model):
    report = my_make_report(model)  # assume returns an HTML string
    report_encoded = b64encode(report.encode('utf-8')).decode('ascii')
    annotate(model, {'attachments/report.html': report_encoded})

Let me know what you think. Once we've settled on the right way to do
this, we'll put this into proper docs and examples.

The idea is that we sometimes want to attach files to models, such as HTML reports or the like, and in the backend, these files should be stored separately, to allow easy access. Here we're implemeting this idea for the 'FileLike' model persister and testing it for the 'File' subclass. This should work for 'Rest' and 'S3' as well, but I thought it's best to add tests when we all agreed on the idea. Usage is demonstrated in 'TestFileAttachments'. The contract is as follows: Use 'palladium.util.annotate' to add an arbitrary number of attachments to the model, like so: ``` annotate(model1, {'attachments/myatt.txt': 'aGV5', 'attachments/my2ndatt.txt': 'aG8='}) ``` Note that the keys of such attachments must start with 'attachments/', with the rest indicating a filename. The values must be base64 encoded but converted from bytes to strings. This is arguably a bit awkward, but we do this because the attachments dictionary must in general be JSON serializable, and using bytes would violate this. When 'model1' is persisted, 'FileLike' will create one file for each attachment and call them 'model-1-myatt.txt' and 'model-1-my2ndatt.txt'. The implementation chooses to use flat files rather than a folder to hold all attachments for a given model. This is done so that we do not need to add extra methods to 'FileLikeIO' (such as mkdir), which means we should get support for other 'FileLike' implementations such as 'Rest' and 'S3' for free. Moreover, the attachments will be removed from the model's pickle and from the metadata files, in order not to blow up the size of those. When the model is loaded back through the model persister, the attachments are loaded and put back into the model's metadata dictionary. What's a good time to add the attachments to the model? Use the 'write_model_decorators' pluggable decorator hook to add a decorator that adds your attachment just before it's persisted. A toy example: ``` def my_write_model_decorator(self, model): report = my_make_report(model) # assume returns an HTML string report_encoded = b64encode(report.encode('utf-8')).decode('ascii') annotate(model, {'attachments/report.html': report_encoded}) ``` Let me know what you think. Once we've settled on the right way to do this, we'll put this into proper docs and examples.

coveralls · 2019-11-05T09:46:36Z

Coverage increased (+0.2%) to 89.754% when pulling 60a7ded on feature/model-attachments into 28fa35d on develop.

BenjaminBossan · 2019-11-09T11:48:44Z

I have looked at this only superficially. Generally, I think it's very helpful for palladium to offer this. I assume Python objects can be pickle.dumps-ed into the attachments.

Two questions:

Sometimes it would make more sense to attach to another object, e.g. the data loader. Is that supported?
How difficult would it be to support other/change the file format for the user?

dnouri · 2019-11-18T13:22:32Z

I have looked at this only superficially. Generally, I think it's very helpful for palladium to offer this. I assume Python objects can be pickle.dumps-ed into the attachments.

Well, you'll have to dumps it and encodes it with b64. The reason is that the metadata is supposed to be json serializable... 🙄

Two questions:

Sometimes it would make more sense to attach to another object, e.g. the data loader. Is that supported?

Not really. It's only supported to add these attachments to the model, because that's what's persisted. Let me know if you have any ideas.

How difficult would it be to support other/change the file format for the user?

Not sure I understand. The file format is pretty much open to the user. The contents of those special metadata entries are decoded and saved to file as binary, so they can contain anything you want, and the filename and extension can be anything you want.

BenjaminBossan · 2019-11-21T21:45:36Z

Okay, thanks for clarifying.

dnouri force-pushed the feature/model-attachments branch from 3c2be64 to 60a7ded Compare November 5, 2019 09:34

alattner merged commit a4efa8a into develop May 6, 2020

alattner deleted the feature/model-attachments branch May 31, 2022 14:32

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal implementation for handling model attachments #119

Proposal implementation for handling model attachments #119

dnouri commented Oct 21, 2019

coveralls commented Nov 5, 2019

BenjaminBossan commented Nov 9, 2019

dnouri commented Nov 18, 2019

BenjaminBossan commented Nov 21, 2019

Proposal implementation for handling model attachments #119

Proposal implementation for handling model attachments #119

Conversation

dnouri commented Oct 21, 2019

coveralls commented Nov 5, 2019

BenjaminBossan commented Nov 9, 2019

dnouri commented Nov 18, 2019

BenjaminBossan commented Nov 21, 2019