
Conversation

@benkli01 (Contributor) commented Oct 1, 2020

Refactor the clustering example for better readability and overview. These changes were originally meant for #508, which had to be abandoned because of technical issues.

@googlebot added the cla: yes (PR contributor has signed CLA) label Oct 1, 2020
@github-actions bot added the technique:clustering (Regarding tfmot.clustering.keras APIs and docs) label Oct 1, 2020
# Train model
model = train_model(model, x_train, y_train, x_test, y_test)
# Cluster and fine-tune model
clustered_model = cluster_model(model, x_train, y_train, x_test, y_test)
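
For readers following along, a minimal sketch of what a cluster_model helper of this shape might contain is shown below, using the public tfmot.clustering.keras API. The cluster count, optimizer settings, and epoch count are illustrative assumptions, not values taken from this PR.

import tensorflow as tf
import tensorflow_model_optimization as tfmot

def cluster_model(model, x_train, y_train, x_test, y_test):
    # Wrap the trained model so each layer's weights are constrained to a
    # small set of shared centroid values.
    clustering_params = {
        'number_of_clusters': 16,  # illustrative choice, not from the PR
        'cluster_centroids_init':
            tfmot.clustering.keras.CentroidInitialization.DENSITY_BASED,
    }
    clustered = tfmot.clustering.keras.cluster_weights(model, **clustering_params)

    # Fine-tune with a low learning rate so the centroids adapt without
    # destroying the pre-trained accuracy.
    clustered.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy'])
    clustered.fit(x_train, y_train, epochs=1,
                  validation_data=(x_test, y_test))
    return clustered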
@Ruomei (Contributor) commented Oct 1, 2020

More of a question than a comment:
Have you tried applying the clustering wrapper to the inference graph rather than the training graph of a small example like mnist? If so, what does the training-loss curve look like?

@benkli01 (Contributor, Author) replied:

I'm not sure what you mean by "inference graph" and "training graph". Could you explain this a bit further?

@Ruomei (Contributor) replied:

Hi, yes. An inference graph usually refers to a training graph after modifications targeted at inference. For instance, you can take a look at a legacy TensorFlow tool that documents some typical graph transforms (e.g. fold_batch_norms, merge_duplicate_nodes, remove_control_dependencies). I will write a summary on the internal site soon.

In our application context, I was just curious whether clustering on a toy model behaves differently with an inference graph than with a training graph, similar to what we see with a realistic model. If so, it could help future debugging of upcoming features, together with all the nice visualization you have done. We can check this later; it is not essential for this PR.
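
For context, the legacy transform tool referred to above is typically driven roughly as sketched below. This is a hedged illustration of the TF 1.x-era graph_transforms Python wrapper; the file path and input/output node names are placeholders, and the module may not be available in recent TensorFlow releases.

import tensorflow as tf
# Legacy TF 1.x graph transform tool; not part of the TF 2.x public API.
from tensorflow.tools.graph_transforms import TransformGraph

# Load a frozen GraphDef of the trained model (path is a placeholder).
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile('frozen_model.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

# Apply the inference-oriented transforms mentioned above.
inference_graph_def = TransformGraph(
    graph_def,
    ['input'],    # placeholder input node name
    ['output'],   # placeholder output node name
    ['fold_batch_norms',
     'merge_duplicate_nodes',
     'remove_control_dependencies'])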

@akarmi (Contributor) commented Oct 2, 2020

Just a status note: while testing this, I discovered that clustering_callbacks.py is not included in the pip package. I will review this again after the fix for that problem is available.

@akarmi self-requested a review October 2, 2020 08:40
@wwwind (Contributor) commented Oct 2, 2020

This example doesn't demonstrate the main benefit of clustering: that the model size is reduced.

@benkli01 (Contributor, Author) commented Oct 2, 2020

> This example doesn't demonstrate the main benefit of clustering: that the model size is reduced.

Unless the model is extremely small, clustering should still be beneficial.

I just ran a small test with the example. It looks like we still get a decent compression ratio despite the reduced model size. Here are some model sizes, in bytes, at different stages of the example (note: clustered_model is the unstripped clustered model):

362832 clustered_model.h5
247566 clustered_model.zip
271400 model.h5
233869 model.zip
98144 stripped_model.h5
13804 stripped_model.zip
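
These numbers can be reproduced with the measurement pattern used throughout the tfmot tutorials: save each model to an .h5 file, zip it, and compare sizes. Below is a minimal sketch of that pattern, assuming the model and clustered_model objects from the example above; the exact script behind the figures in this comment is not shown in the thread.

import os
import tempfile
import zipfile

import tensorflow as tf
import tensorflow_model_optimization as tfmot

def zipped_size(path):
    # Size in bytes of a zip archive containing the given file.
    _, zip_path = tempfile.mkstemp('.zip')
    with zipfile.ZipFile(zip_path, 'w', compression=zipfile.ZIP_DEFLATED) as z:
        z.write(path)
    return os.path.getsize(zip_path)

# Stripping the clustering wrappers is what exposes the repeated centroid
# values to a standard compressor.
stripped_model = tfmot.clustering.keras.strip_clustering(clustered_model)

for name, m in [('model', model),
                ('clustered_model', clustered_model),
                ('stripped_model', stripped_model)]:
    h5_file = name + '.h5'
    tf.keras.models.save_model(m, h5_file, include_optimizer=False)
    print(name, os.path.getsize(h5_file), 'bytes (.h5),',
          zipped_size(h5_file), 'bytes (.zip)')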

@wwwind (Contributor) commented Oct 2, 2020

@benkli01 Thanks!
Sorry, I didn't explain myself clearly in my previous comment. I think it is worth adding model size metrics to the example, because they show the main purpose of clustering: model vs. stripped_model.
The numbers mentioned above are very good!

@benkli01 (Contributor, Author) commented Oct 2, 2020

@wwwind Ah, OK, sorry, I misunderstood your message. Presenting the benefits in the example makes sense to me. I would add this in a separate PR, though. What do you think, @akarmi?

@akarmi (Contributor) left a comment

Agree. Showing the compression benefits of clustering would add a new "feature" to this sample, and is not the purpose of this PR, if I understand it correctly.

On the other hand, I think the pruning samples use a dedicated end-to-end example to show off the effect on model size. However, if I recall correctly, it has been replaced by the end-to-end tutorial and should probably be deprecated/removed. @alanchiao, what are your thoughts: should we remove the end-to-end example from pruning, or add a similar one to clustering?

@akarmi added the ready to pull (Working to get PR submitted to internal repository, after which merging to GitHub happens) label Oct 2, 2020
@alanchiao commented

I'd agree that the end-to-end tutorials and comprehensive guides (for whatever model optimization technique) mostly replace the need for the Python scripts.

  • If you're an end user trying to play with it, it's much easier to use the Colab / Jupyter environment (which also provides a way of downloading the resulting models).
  • If you're a developer, creating or modifying unit tests can also give a sense of how changes to the library affect behavior.

There are cases when a script may be easier for a developer (to share early-stage examples such as the current pruning mnist_e2e.py for sparse TFLite kernels, or to iterate on a feature that is not easily testable, such as a TensorBoard visualization).

It would make sense to remove certain aspects of the examples (e.g. there isn't much benefit in pruning to enumerating all model types) and perhaps rename examples/ to scripts/, deprecating them as something for end users and keeping them mostly for the convenience of developers in some cases.

What do you all think? It's been a while since I've personally used anything under examples/ but I know there has been a minor amount of usage elsewhere.

@copybara-service bot merged commit b894380 into tensorflow:master Oct 5, 2020
@wwwind (Contributor) commented Oct 6, 2020

Hi @alanchiao,

Thanks for your comment.

It would be nice to have a uniform experience across the techniques: currently, if I install tfmot, I get an examples/ directory containing only a quantization/ subdirectory with scripts. It would be better to ship either no scripts or scripts for all techniques.
If we include all these scripts in the package, then they should also be covered by CI so that they do not become outdated; this increases their maintenance cost.

As a developer, for debugging purposes I personally use integration tests, as they are much faster.
For experiments, as you pointed out, Jupyter notebooks are more convenient.

Re: discussion regarding examples in the tfmot package:
@akarmi @Ruomei @psunn @benkli01

@alanchiao commented Oct 6, 2020

@wwwind: yes, the experience should be uniform. Including examples/quantization was a mistake; people are looking to install the library, not the examples themselves. core/api doesn't seem to import any of the symbols from there, so there is probably an accidental extra BUILD line (or spurious dependency) on examples/quantization.
