Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Predictor should be able to return predictions from DB #112

Closed
thcrock opened this issue Apr 18, 2017 · 0 comments
Closed

Predictor should be able to return predictions from DB #112

thcrock opened this issue Apr 18, 2017 · 0 comments
Assignees
Milestone

Comments

@thcrock
Copy link
Contributor

thcrock commented Apr 18, 2017

Being able to reuse results comes in handy. But it should convert from the DB to the matrix order for consistency.

@thcrock thcrock added this to the v0.4 milestone Apr 18, 2017
@thcrock thcrock self-assigned this Apr 18, 2017
thcrock added a commit that referenced this issue Apr 18, 2017
…113, #114]

The original goal here was to implement a consistent 'replace' kwarg throughout the pipeline, so it can be restarted later more easily. A few other things made it in here. In general they made this change easier, so they aren't totally unrelated, but are different enough that they are helped by being called out. And some are simple performance updates.

Directly related [#99]:
- Create pipeline test that makes sure skip-if-present functionality works, rename existing generic pipeline test for clarity
- Add retrieve from database functionality to Predictor class that ensures order is the same as the passed-in matrix, call if replace kwarg is False [#112]
- Add replace kwarg to FeatureGenerator constructor that will skip creating table tasks if the feature table exists
- Flip ModelTrainer replace default kwarg to True for conformity
- Add replace kwarg to PipelineBase to pass through to all components

Tangentially or not related:
- Add sqlalchemy-postgres-copy, use in predictions writing for speed upgrade [#111]
- Remove model.predict() [#45]
- Move much functionality in pipelines to smaller methods, to allow both code reuse between them and inspection before running lengthy tasks [Resolves #93]
- Use new matrix_store.empty instead of matrix_store.matrix.empty to allow the empty check without loading the entire matrix into memory [#113]
- Use matrix_store internally in Predictor to reduce number of items that need to be passed around
- Create trained models directory if it isn't there
- Close DB sessions properly [#114]
ecsalomon added a commit that referenced this issue Apr 18, 2017
Workflow and performance updates [Resolves #45, #93, #99, #111, #112,…
@thcrock thcrock closed this as completed Apr 18, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant