This repository has been archived by the owner on Nov 3, 2022. It is now read-only.

Keras inference time optimizer #264

Open
ZFTurbo wants to merge 12 commits into master

Conversation


@ZFTurbo ZFTurbo commented May 30, 2018

Reduce neural net structure (Conv + BN -> Conv)
Also works:
DepthwiseConv2D + BN -> DepthwiseConv2D
SeparableConv2D + BN -> SeparableConv2D

This code takes a trained Keras model as input and optimizes the layer structure and weights so that the model becomes much faster (~30%) while producing output identical to the initial model. It can be extremely useful when you need to process a large number of images with a trained model. The reduce operation was tested on the whole Keras model zoo. See the comparison table and full description at:
https://github.com/ZFTurbo/Keras-inference-time-optimizer
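
For context, the speedup comes from folding each BatchNormalization layer into the preceding convolution's weights; a minimal numpy sketch of the transform (illustrative only, not the PR's actual kito.py code):

import numpy as np

def fold_bn_into_conv(W, b, gamma, beta, mean, var, eps=1e-3):
    # Fold BatchNorm statistics into a conv kernel W of shape
    # (h, w, c_in, c_out) and a bias b of shape (c_out,).
    scale = gamma / np.sqrt(var + eps)    # per-output-channel BN scale
    W_folded = W * scale                  # broadcasts over the c_out axis
    b_folded = (b - mean) * scale + beta  # BN shift is absorbed into the bias
    return W_folded, b_folded

The folded convolution computes scale * (W x + b - mean) + beta exactly, so the BN layer can be dropped without changing the output.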

@ahundt
Collaborator

ahundt commented Jun 20, 2018

The style here doesn't seem to match that of other keras code. Could you fix that?

@titu1994 titu1994 closed this Jun 20, 2018
@titu1994 titu1994 reopened this Jun 20, 2018
@titu1994
Contributor

Are there no tests other than the benchmark test from your repo that can be performed? A benchmark test wouldn't be feasible inside Travis, especially with such large models.

Once the style fixes are finished, this can be merged if it includes at least some form of test and all tests pass.

@ZFTurbo
Author

ZFTurbo commented Jun 20, 2018

@titu1994 I added a "mobilenet_small" benchmark. It runs in a few seconds on my machine. I think it can be used as a test for the kito.py code.

Should I include tests directly in kito.py file?

@ahundt: Regarding style, I looked at the other files in the "utils" folder and didn't see any specific style pattern. Could you please clarify what exactly I should change? Do you mean comments in functions, or something else?
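
For example, a test along those lines might look like the following (a sketch: reduce_keras_model is the entry point named in the linked repo, and the tolerance is illustrative):

import numpy as np
from keras.applications.mobilenet import MobileNet
from kito import reduce_keras_model

def test_reduce_mobilenet_small():
    # A small MobileNet with random weights keeps the test fast
    model = MobileNet(input_shape=(128, 128, 3), alpha=0.25, weights=None)
    model_reduced = reduce_keras_model(model)
    x = np.random.uniform(size=(4, 128, 128, 3)).astype(np.float32)
    # The folded model must reproduce the original outputs
    assert np.allclose(model.predict(x), model_reduced.predict(x), atol=1e-4)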

@ahundt
Collaborator

ahundt commented Jun 20, 2018

Tests should go in the tests folder; take a look at how those are written.

@titu1994
Contributor

titu1994 commented Jun 22, 2018

@ZFTurbo The test is short on TensorFlow, but I doubt it's fast on Theano. In any case, it's better than no test at all.

As @ahundt mentioned, tests should go into the tests directory, specifically tests/keras_contrib/utils/*_test.py, where you can replace * with some name (I suggest kito, to match this file). For the format, please see other tests or the tests from core Keras.

@ZFTurbo
Author

ZFTurbo commented Jun 26, 2018

I added a test.

@ahundt
Collaborator

ahundt commented Sep 30, 2018

@titu1994 @ZFTurbo this looks pretty good to me now, think it should be merged?

@ZFTurbo
Author

ZFTurbo commented Oct 2, 2018

@ahundt I'm totally fine with merging. )


@ahundt ahundt left a comment

I looked through this again and had a couple additional questions


# DeepLabV3+ non-standard layer
if layer.__class__.__name__ == 'BilinearUpsampling':
    from neural_nets.deeplab_v3_plus_model import BilinearUpsampling
Collaborator


Just noticed this, there is now a bilinear upsampling layer in keras itself.
keras-team/keras#10994

Author


It has a different name in Keras. This code was made specifically for DeepLab V3+, which still uses its own implementation.



/cc @bonlime

Collaborator


We can't have specialized code for things not included in keras-contrib. You'll need to come up with a way to parameterize this, such as by passing functions or a dictionary.

print('Reduced model number layers: {}'.format(len(model_reduced.layers)))
print('Compare models...')
if model_name in ['nasnetlarge', 'deeplab_v3plus_mobile', 'deeplab_v3plus_xception']:
    max_error = compare_two_models_results(model, model_reduced, test_number=10000, max_batch=128)
Collaborator


how long does this test take to run?

Author


Several seconds

    return outbound_layers


def get_copy_of_layer(layer, verbose=False):
Collaborator


how do users do this for their own non-standard layers?

Author


They need to change the code, like here:

# DeepLabV3+ non-standard layer
if layer.__class__.__name__ == 'BilinearUpsampling':
    from neural_nets.deeplab_v3_plus_model import BilinearUpsampling

I'm not sure how to make it universal, since parameters are different.

Collaborator

@ahundt ahundt Oct 12, 2018


They could pass a dict mapping from names to functions/object, so if it is in the dictionary you call that, otherwise you do some default behavior, and else raise an error saying what they need to do to fix the problem.
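
A minimal sketch of that suggestion (the custom_objects parameter and the error message are hypothetical, not part of this PR):

def get_copy_of_layer(layer, custom_objects=None):
    # Hypothetical registry: map layer class names to user-supplied classes
    custom_objects = custom_objects or {}
    name = layer.__class__.__name__
    if name in custom_objects:
        # Rebuild the non-standard layer from its config via the user's class
        return custom_objects[name].from_config(layer.get_config())
    try:
        # Default behavior: standard Keras layers can clone themselves
        return layer.__class__.from_config(layer.get_config())
    except Exception:
        raise ValueError('Unknown layer {}; pass its class via custom_objects'.format(name))

Callers would then write something like reduce_keras_model(model, custom_objects={'BilinearUpsampling': BilinearUpsampling}) instead of editing kito.py.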

Collaborator


You can also have a look at how keras itself registers functions. If you look at the python file for losses, at the very bottom there is some registration code.
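
For reference, Keras also exposes a registry of this kind through get_custom_objects; a short sketch using the BilinearUpsampling import that appears in this PR:

from keras.utils.generic_utils import get_custom_objects
from neural_nets.deeplab_v3_plus_model import BilinearUpsampling

# Register the custom layer by name once; any code that rebuilds layers
# from their config (e.g. keras.models.load_model) can then resolve it
get_custom_objects().update({'BilinearUpsampling': BilinearUpsampling})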

@ZFTurbo
Author

ZFTurbo commented Oct 12, 2018

  1. Added more tests
  2. Removed non-standard layers. We probably need to add a custom_objects parameter; I will do it later.
  3. Added support for layers of type "Model".

@gabrieldemarmiesse
Contributor

Thank you @ZFTurbo for your work.

I've looked at the results presented on your page and the numbers are impressive.
Since you're using tensorflow, I might add that there is XLA (maybe you already know about it), which does a similar job.

If it's not too much trouble, can I ask you to run your benchmarks with XLA enabled?

Also, as a side note: I started using keras for the first time with your scripts on kaggle a few years ago. Thanks for sharing! (remember the fish competition?) :)

@ZFTurbo
Author

ZFTurbo commented Dec 17, 2018

> Thank you @ZFTurbo for your work.
>
> I've looked at the results presented on your page and the numbers are impressive.
> Since you're using tensorflow, I might add that there is XLA (maybe you already know about it), which does a similar job.
>
> If it's not too much trouble, can I ask you to run your benchmarks with XLA enabled?

I've heard about XLA but haven't tried it yet. XLA will probably make KITO useless. I'll try it and post the results here.
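
For anyone who wants to try the comparison, enabling XLA JIT for a Keras session on the TF 1.x backend looked roughly like this at the time (a sketch under TF 1.x assumptions, not benchmarked here):

import tensorflow as tf
import keras.backend as K

# Turn on XLA JIT compilation for the whole TF 1.x session
config = tf.ConfigProto()
config.graph_options.optimizer_options.global_jit_level = tf.OptimizerOptions.ON_1
K.set_session(tf.Session(config=config))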

> Also, as a side note: I started using keras for the first time with your scripts on kaggle a few years ago. Thanks for sharing! (remember the fish competition?) :)

Good to know )) It's too bad, but it looks like PyTorch is becoming mainstream now, instead of Keras.

@gabrieldemarmiesse
Contributor

gabrieldemarmiesse commented Dec 17, 2018

> I've heard about XLA but haven't tried it yet. XLA will probably make KITO useless. I'll try it and post the results here.

Every feature has a maintenance cost. The more complicated the feature, the higher the cost. We have to choose wisely depending on the manpower that we have (currently not much). I can see how this tool can be useful, but since it works on keras models with a known structure, it can be the job of the backend to optimise it (and it should, even if this is not currently the case with many backends).
So if XLA does a similar job, I'm not sure we should merge this. The maintenance cost would be high compared to, let's say, a layer or a callback.

> It's too bad, but it looks like PyTorch is becoming mainstream now, instead of Keras.

That's not bad, diversity is cool; I'm pretty sure we wouldn't have tf eager without pytorch. Furthermore, maybe there will be a pytorch backend for keras popping up somewhere on the internet. Pytorch and keras are not necessarily competitors.
