Application context objects should not be kept by default + add way to supply fit context #86
Comments
@carbonleakage does this sound interesting?
@shaypal5 sounds pretty interesting, I'll start working on this!
Great. I'll share code with the latest use case I had for application context, so you have some context yourself. :)
See the pipeline stage here: It gets a product-review dataframe as context, used to enrich some of the rows in the input dataframe with product sentiment features.
In another file, on fit-transform the pipeline is provided with the train review dataframe:
And on transform it is provided with the rollout/holdout review dataframe:
Now here I did something sensible and provided it as a path, but a very intuitive thing to do is to provide the dataframe object itself, in which case it will be kept(!) inside the pipeline object (because the context parameter currently also updates the
Hey @carbonleakage :) Had a chance to take a look at this yet?
Hey @shaypal5 I started looking into this last week. I have not made much progress due to other commitments. I plan to do some pull requests in the coming weeks.
Great. :) Just wanted to touch base.
Released in |
pdpipe uses `PdpApplicationContext` objects in two ways:

- `fit_context`, which should be kept as-is after a fit, and is used by stages to pass one another parameters that should also be used at transform time.
- `application_context`, which should be discarded after a specific application is done, and is used by stages to feed consecutive stages with context. It can be extended by supplying `apply(context={})`, `fit_transform(context={})` or `transform(context={})` with a `dict` that will be used to update the application context.

Two changes are required:

- The `context` parameter to application functions is used to update both the fit and the application context. I think there should be two parameters, one for each type of context.
- `application_context` is not discarded when the application is done. Fixing this is as simple as adding a `self.application_context = None` expression at the `PdPipeline` level in the couple of right cases.
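
To make the two requested changes concrete, here is a minimal sketch of the idea on a toy pipeline. It is not pdpipe's code: `SketchPipeline`, the stage signature, and the separate `fit_context=` / `application_context=` keyword arguments are all hypothetical names chosen for illustration, and the real change in `PdPipeline` may look quite different.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical stage signature: (data, fit_context, application_context, fitting).
Stage = Callable[[List[float], Dict[str, Any], Dict[str, Any], bool], List[float]]


class SketchPipeline:
    """Toy stand-in for a pipeline; NOT pdpipe's PdPipeline."""

    def __init__(self, stages: List[Stage]) -> None:
        self.stages = stages
        self.fit_context: Dict[str, Any] = {}  # kept after a fit
        self.application_context: Optional[Dict[str, Any]] = None  # per call

    def _run(self, data, application_context, fitting):
        # Start every application from a fresh context seeded by the caller.
        self.application_context = dict(application_context or {})
        try:
            for stage in self.stages:
                data = stage(data, self.fit_context,
                             self.application_context, fitting)
            return data
        finally:
            # Discard per-application state, even if a stage raised,
            # so that large objects are not retained by the pipeline.
            self.application_context = None

    def fit_transform(self, data, fit_context=None, application_context=None):
        # Two separate parameters, one for each type of context.
        self.fit_context = dict(fit_context or {})
        return self._run(data, application_context, fitting=True)

    def transform(self, data, application_context=None):
        return self._run(data, application_context, fitting=False)


def center(data, fit_ctx, app_ctx, fitting):
    # Learns the mean on fit and reuses it on transform via the fit context.
    if fitting:
        fit_ctx["mean"] = sum(data) / len(data)
    return [x - fit_ctx["mean"] for x in data]


def add_offset(data, fit_ctx, app_ctx, fitting):
    # Reads a value that is only meaningful for the current application.
    return [x + app_ctx.get("offset", 0.0) for x in data]


pipeline = SketchPipeline([center, add_offset])
train_out = pipeline.fit_transform([1.0, 2.0, 3.0],
                                   application_context={"offset": 10.0})
test_out = pipeline.transform([4.0], application_context={"offset": 1.0})
```

After each call `pipeline.application_context` is `None` again, while `pipeline.fit_context` still holds the mean learned during the fit. Resetting inside `finally` is one possible design choice; it keeps the context from lingering when a stage fails partway through.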