
Conversation

@Ruomei (Contributor) commented Apr 30, 2021

This PR adds support for Pruning-Clustering-preserving Quantization Aware Training (PCQAT). To preserve the sparsity and the unique clustered weights in the optimized output model, the pruning masks are kept fixed and stochastic updates of the clustering training variables are enabled during quantization-aware training.

User API:

  # Annotate the pruned-and-clustered Keras model for quantization-aware training,
  # then apply PCQAT via the cluster-preserving scheme with sparsity preservation on.
  preserve_sparsity = True
  quant_aware_annotate_model = quantize.quantize_annotate_model(pruned_clustered_model)
  pcqat_model = quantize.quantize_apply(
      quant_aware_annotate_model,
      scheme=default_8bit_cluster_preserve_quantize_scheme
      .Default8BitClusterPreserveQuantizeScheme(preserve_sparsity))
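
After quantize_apply, the returned model is fine-tuned like a regular Keras model. Below is a minimal sketch of that step, assuming import tensorflow as tf; the optimizer, loss, and training data are placeholders rather than part of this PR:

  # PCQAT fine-tuning: compile and train the quantization-aware model as usual.
  pcqat_model.compile(
      optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics=['accuracy'])
  pcqat_model.fit(train_images, train_labels, epochs=1, validation_split=0.1)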

Main changes:

  • cluster_preserve_integration_test: covers edge cases (e.g. passing non-pruned models or models with uniform weights) and checks that trainable variables are updated between epochs
  • mnist_prune_cluster_preserve_qat_test: a minimal end-to-end MNIST example showing the benefits of PCQAT with the two most common configurations
  • cluster_preserve_quantize_registry: the main implementation of the PCQAT algorithm

Initial results (pruning sparsity: 50%, number of clusters: 8 (DS-CNN-L), 16 (Mobilenet_v1)):

| Model | Items | Baseline | Pruned Model | QATed | Pruned_Clustered Model | PCQATed Model |
| --- | --- | --- | --- | --- | --- | --- |
| DS-CNN-L | FP32 Top1 Accuracy | 95.06% | 94.07% | (Fake INT8) 94.85% | 93.76% | (Fake INT8) 94.28% |
| DS-CNN-L | INT8 full integer quantization | 94.35% | 93.80% | 94.82% | 93.21% | 94.06% |
| DS-CNN-L | INT8 .tflite gzip compression (bytes) | 506400 -> 425006 | 506400 -> 317937 | 507296 -> 424368 | 506400 -> 205333 | 507296 -> 201744 |
| Mobilenet_v1 (ImageNet) | FP32 Top1 Accuracy | 70.98% | 70.49% | (Fake INT8) 70.88% | 67.64% | (Fake INT8) 67.80% |
| Mobilenet_v1 (ImageNet) | INT8 full integer quantization | 70.37% | 69.85% | 70.87% | 66.89% | 68.63% |
| Mobilenet_v1 (ImageNet) | INT8 .tflite gzip compression (bytes) | 4665552 -> 3886236 | 4665552 -> 2909148 | 4569416 -> 3808781 | 4665552 -> 2013010 | 4569472 -> 1943957 |
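
For reference, below is a minimal sketch of how the INT8 .tflite gzip numbers above can be measured; it is not the exact script used for these results, and the pcqat_model variable and temp-file handling are assumptions:

  import gzip
  import os
  import tempfile

  import tensorflow as tf

  # Convert the PCQAT-trained Keras model to an INT8 TFLite model; with a QAT
  # model, Optimize.DEFAULT uses the quantization parameters learned in training.
  converter = tf.lite.TFLiteConverter.from_keras_model(pcqat_model)
  converter.optimizations = [tf.lite.Optimize.DEFAULT]
  tflite_model = converter.convert()

  # Report the gzip-compressed size of the .tflite flatbuffer, as in the table.
  _, tflite_file = tempfile.mkstemp('.tflite')
  with open(tflite_file, 'wb') as f:
    f.write(tflite_model)
  with gzip.open(tflite_file + '.gz', 'wb') as f:
    f.write(tflite_model)
  print('gzipped .tflite size (bytes):', os.path.getsize(tflite_file + '.gz'))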

@google-cla bot added the cla: yes label Apr 30, 2021
@github-actions bot added the technique:clustering and technique:qat labels Apr 30, 2021
@akarmi (Contributor) left a comment

Thanks, a minor request below.

@Ruomei force-pushed the toupstream/pcqat branch from 3268faf to ea49b27 on May 24, 2021
@akarmi self-requested a review May 25, 2021
@akarmi (Contributor) left a comment

Thank you.

@akarmi requested a review from daverim May 27, 2021
@Ruomei (Contributor, Author) commented Jun 3, 2021

Hi @daverim, we have recently removed this PR's dependency, so it is ready for review at any time. Could you please take a look when you have a moment?
Thanks!

@daverim (Collaborator) left a comment

Some small linting issues.

@Ruomei force-pushed the toupstream/pcqat branch from ea49b27 to 0cbc0c2 on June 4, 2021
@daverim added the ready to pull label Jun 7, 2021
@Xhark (Member) commented Jun 10, 2021

Hi, just curious: why is the INT8 .tflite gzip compression (bytes) for the Pruned_Clustered Mobilenet_v1 (ImageNet) model so small? Is the PCQAT model really larger than the pruned-clustered model, or is a digit missing in this case?

@wwwind (Contributor) commented Jun 10, 2021

Hi @Xhark, yes, this is a mistake: the last digit is missing. The compression ratio was around 2.3 in our experiments.
Thanks for noticing.
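
(For context: with the corrected value now in the table above, 4665552 -> 2013010 bytes, the ratio works out to 4665552 / 2013010 ≈ 2.3, consistent with the figure quoted here.)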

@Ruomei (Contributor, Author) commented Jun 10, 2021

Thanks, @Xhark, the number is now updated.

@Ruomei (Contributor, Author) commented Jun 10, 2021

Hi, @daverim and @Xhark, could you please also let us know whether there is anything we can do to help with the failed internal checks shown in this PR?
@akarmi @wwwind for visibility
Thanks all!

@daverim (Collaborator) commented Jun 11, 2021 via email

@Ruomei (Contributor, Author) commented Jun 11, 2021

> Merging was blocked by build file strict dependencies -- resubmitting with it fixed myself now, should be merged today.

Brill, thanks a lot, David.

@copybara-service bot merged commit 812ea04 into tensorflow:master Jun 13, 2021