DL: Remove fit final multiple and improve serialization of image_count+weights #485

kaknikhil · 2020-03-03T23:46:00Z

Based on our experiments, we noticed that at the end of each hop, fit
multiple would start using a lot of memory (from 80 GB to ~ 140 GB on
the master host of our gcp cluster) and we attributed that memory spike
to the final function.
Removing the final function brought memory down to around 30-40 GB
and also made the places10 query 3 times faster.
This PR also improves how the weights are serialized together with the image count.

orhankislal · 2020-03-04T20:04:32Z

src/ports/postgres/modules/deep_learning/madlib_keras_fit_multiple_model.sql_in

 )(
    STYPE=BYTEA,
-    SFUNC=MADLIB_SCHEMA.fit_transition_multiple_model,
-    FINALFUNC=MADLIB_SCHEMA.fit_final_multiple_model


It seems the SQL and Python functions for fit_final_multiple_model are still in the code even though we don't call them anymore. Is there a particular reason for keeping them?

kaknikhil

No reason, I will remove them.

orhankislal

LGTM

JIRA: MADLIB-1416

Based on our experiments, we noticed that at the end of each hop, fit multiple would start using a lot of memory (from 80 GB to ~ 140 GB on the master host of our gcp cluster) and we attributed that memory spike to the final function.

Removing the final function brought down the memory to around 30-40 GB and also made the places10 query 3 times faster.

places10 with 20 msts on a gpdb6 cluster with 20 segments
```
With fit multiple final function
~ 2 hours per iteration with memory hovering around 70-80 GB and peaking at around 130-140 GB at the end of each hop.

Without fit multiple final function
~ 35-40 mins per iteration with memory hovering around 30-40 GB and peaking at around 40-45 GB.
```

Co-authored-by: Ekta Khanna <ekhanna@pivotal.io>

JIRA: MADLIB-1416

We need to add the image count to the model weights to get the final state for fit_transition. Previously we would use np.concatenate to join the image count and model weights together. A cleaner and faster way is to append image count to the weights list and get rid of the np.concatenate call.

Ran unit test `test_serialize_image_nd_weights_valid_output` with large model_weights `[np.array([1]*100000000), np.array([1]*100000000)]` to confirm the speed improvements.

test_serialize_image_nd_weights_valid_output with old code and large weights: 17.8 s
test_serialize_image_nd_weights_valid_output with new code and large weights: 16.4 s
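For readers who want to see the difference between the two approaches, here is a minimal sketch. It is not the code from this PR: the function names, the float32 dtype, and placing the image count first in the state are all assumptions made for illustration. It only shows that serializing each list entry separately yields the same bytes as first building one combined array with np.concatenate.

```python
import numpy as np


def serialize_with_concatenate(image_count, model_weights):
    # Hypothetical "old style": flatten every layer, then build one large
    # intermediate array with np.concatenate before converting it to bytes.
    flat = [np.asarray(w, dtype=np.float32).ravel() for w in model_weights]
    state = np.concatenate([np.array([image_count], dtype=np.float32)] + flat)
    return state.tobytes()


def serialize_with_list(image_count, model_weights):
    # Hypothetical "new style": treat the image count as one more list entry
    # and serialize each entry on its own, so no combined array is built.
    # Assumption: the count comes first and everything is stored as float32.
    chunks = [np.float32(image_count).tobytes()]
    chunks.extend(np.asarray(w, dtype=np.float32).tobytes() for w in model_weights)
    return b"".join(chunks)


def deserialize(state):
    # Split the flat float32 buffer back into (image_count, flat weights).
    values = np.frombuffer(state, dtype=np.float32)
    return float(values[0]), values[1:]


weights = [np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([5.0, 6.0])]
old_state = serialize_with_concatenate(3, weights)
new_state = serialize_with_list(3, weights)
count, flat_weights = deserialize(new_state)
```

Both functions produce identical byte strings for the same input, so a consumer of the state would not need to change. Any speed difference depends on the size of the weights, as in the timings reported above.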

kaknikhil changed the title ~~Dl/remove fit final serializer~~ DL: Remove fit final multiple and improve serialization of image_count+weights Mar 3, 2020

kaknikhil force-pushed the dl/remove_fit_final_serializer branch from 2dcab61 to 1d5528e Compare March 3, 2020 23:52

orhankislal reviewed Mar 4, 2020

orhankislal approved these changes Mar 5, 2020

kaknikhil and others added 2 commits March 6, 2020 10:17

kaknikhil force-pushed the dl/remove_fit_final_serializer branch from 23f8342 to 5dc38e3 Compare March 6, 2020 18:17

kaknikhil merged commit ebb6b3c into apache:master Mar 6, 2020

kaknikhil deleted the dl/remove_fit_final_serializer branch March 6, 2020 18:18
