Develop #127
Fix Dask hyperparam search example to work with latest sklearn&dask
Update example on how to use Dask to do grid search
An attempt to fix an issue on Travis where rpy2 is installed with pip (in addition to conda) and appears to be buggy.
Remove julia and rpy2 from docs extra requirements
The idea is that we sometimes want to attach files to models, such as HTML reports, and that the backend should store these files separately to allow easy access. Here we're implementing this idea for the 'FileLike' model persister and testing it for the 'File' subclass. This should work for 'Rest' and 'S3' as well, but I thought it best to add tests once we have all agreed on the idea. Usage is demonstrated in 'TestFileAttachments'.

The contract is as follows: Use 'palladium.util.annotate' to add an arbitrary number of attachments to the model, like so:

```
annotate(model1, {'attachments/myatt.txt': 'aGV5', 'attachments/my2ndatt.txt': 'aG8='})
```

Note that the keys of such attachments must start with 'attachments/', with the rest indicating a filename. The values must be base64 encoded and then converted from bytes to strings. This is arguably a bit awkward, but we do it because the attachments dictionary must in general be JSON serializable, and using bytes would violate this.

When 'model1' is persisted, 'FileLike' will create one file for each attachment and call them 'model-1-myatt.txt' and 'model-1-my2ndatt.txt'. The implementation uses flat files rather than a folder to hold all attachments for a given model. This is done so that we do not need to add extra methods to 'FileLikeIO' (such as mkdir), which means we should get support for other 'FileLike' implementations such as 'Rest' and 'S3' for free.

Moreover, the attachments are removed from the model's pickle and from the metadata files, so that they do not blow up the size of those. When the model is loaded back through the model persister, the attachments are loaded and put back into the model's metadata dictionary.

What's a good time to add the attachments to the model? Use the 'write_model_decorators' pluggable decorator hook to add a decorator that adds your attachment just before the model is persisted.

A toy example:

```
from base64 import b64encode
from palladium.util import annotate

def my_write_model_decorator(self, model):
    report = my_make_report(model)  # assume returns an HTML string
    # base64-encode the report, then turn the bytes into a JSON-friendly string
    report_encoded = b64encode(report.encode('utf-8')).decode('ascii')
    annotate(model, {'attachments/report.html': report_encoded})
```

Let me know what you think. Once we've settled on the right way to do this, we'll put this into proper docs and examples.
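To make the encoding contract described above concrete, here is a minimal standalone sketch that does not import Palladium. The helper names `encode_attachment`, `decode_attachment` and `attachment_filename` are hypothetical and only illustrate the rules stated in this description (the 'attachments/' key prefix, base64 values stored as strings, flat file names); the actual naming logic lives inside 'FileLike' and may differ in detail.

```python
import base64
import json

ATTACHMENT_PREFIX = 'attachments/'


def encode_attachment(data: bytes) -> str:
    """Base64-encode raw bytes and return a str, so the value is JSON serializable."""
    return base64.b64encode(data).decode('ascii')


def decode_attachment(value: str) -> bytes:
    """Reverse of encode_attachment: recover the raw bytes."""
    return base64.b64decode(value.encode('ascii'))


def attachment_filename(version: int, key: str) -> str:
    """Hypothetical mapping from an attachment key to a flat file name.

    Assumption: mirrors the 'model-1-myatt.txt' pattern from the description.
    """
    if not key.startswith(ATTACHMENT_PREFIX):
        raise ValueError("attachment keys must start with 'attachments/'")
    return 'model-{}-{}'.format(version, key[len(ATTACHMENT_PREFIX):])


# The same two attachments used in the annotate() call in the description.
attachments = {
    'attachments/myatt.txt': encode_attachment(b'hey'),
    'attachments/my2ndatt.txt': encode_attachment(b'ho'),
}

# Strings survive a JSON round trip, which raw bytes would not.
serialized = json.dumps(attachments)
filenames = [attachment_filename(1, key) for key in attachments]
```

Passing the resulting `attachments` dictionary to `annotate` would then be equivalent to the hand-written call shown in the contract.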
…k-as-factory In configuration, use exclamation mark '!' instead of '__factory__'
…ples Examples on how to use Keras and XGBoost with Palladium
Proposal implementation for handling model attachments
avoid loading stale metadata in S3 persister
As suggested by yv in #124
Used https://pypi.org/project/pur/ to update the requirements.
Feature/update dependencies
No description provided.