Capture lineage / sourcing of data so that repeated calculations can be avoided #21

Open
MSeal opened this issue Feb 4, 2019 · 2 comments
Labels: enhancement (New feature or request), idea

Comments

@MSeal
Member

MSeal commented Feb 4, 2019

Capturing the source SHA, or whatever else is required to decide whether scraps need to be recomputed or can simply be read, would be helpful when calculating data.

@betatim
Member

betatim commented Feb 5, 2019

Could you explain a bit what use case you have in mind? Something like telling the user the scrap they just retrieved from a notebook needs recomputing?

For caching results during computations we should check out https://joblib.readthedocs.io/en/latest/memory.html, which is widely used and maintained by someone else (yay!).
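For context, a minimal sketch of the kind of caching joblib's `Memory` provides. The temporary cache directory, the `expensive_square` function, and the `calls` list are illustrative assumptions, not anything from this project:

```python
import tempfile

from joblib import Memory

# The cache location is an arbitrary choice for this sketch.
cache_dir = tempfile.mkdtemp()
memory = Memory(cache_dir, verbose=0)

calls = []  # records real executions so that cache hits are visible


@memory.cache
def expensive_square(x):
    # Stand-in for an expensive computation.
    calls.append(x)
    return x * x


first = expensive_square(12)   # runs the function and stores the result on disk
second = expensive_square(12)  # same argument: result is read back from the cache
```

After both calls, `calls` holds a single entry because the second call never runs the function body.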

MSeal modified the milestone: beta-complete Feb 11, 2019
MSeal added the enhancement (New feature or request) and idea labels Feb 11, 2019
@MSeal
Member Author

MSeal commented Feb 11, 2019

The core intention here is to let a glue action against a particular ref skip pushing data when the contents are identical. I don't think it's necessary at first, but giving users a path to avoid expensive computation / pushes might be helpful. Another pattern would be an additional wrapper, something like compute_and_glue, that glues a reference without recomputing when the source data is considered equivalent by some registered function.
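A rough sketch of how the compute_and_glue idea could look, to make the discussion concrete. Everything here is hypothetical: the plain `dict` stands in for wherever glued scraps would live, and `sha256_fingerprint`, the `source_fingerprint` field, and the function signature are assumptions for illustration rather than an existing or planned API.

```python
import hashlib
import json


def sha256_fingerprint(source):
    """Hypothetical default equivalence function: hash of the JSON-serialized source."""
    payload = json.dumps(source, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def compute_and_glue(store, name, source, compute, fingerprint=sha256_fingerprint):
    """Glue compute(source) under `name` unless the source is considered unchanged.

    Returns True if the value was (re)computed, False if the existing scrap was reused.
    """
    key = fingerprint(source)
    existing = store.get(name)
    if existing is not None and existing["source_fingerprint"] == key:
        return False  # equivalent source: skip both the computation and the push
    store[name] = {"source_fingerprint": key, "data": compute(source)}
    return True


store = {}   # stand-in for the glued scraps of a notebook
calls = []   # records real executions of the expensive function


def total(values):
    calls.append(list(values))
    return sum(values)


r1 = compute_and_glue(store, "total", [1, 2, 3], total)  # computes and glues
r2 = compute_and_glue(store, "total", [1, 2, 3], total)  # same source: skipped
r3 = compute_and_glue(store, "total", [1, 2, 4], total)  # source changed: recomputed
```

Passing a different `fingerprint` callable is where a user-registered equivalence function would plug in, for example one that compares a source SHA instead of hashing the data itself.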


2 participants