Regarding paper and codes #7

Open

yzhang93 opened this issue Sep 23, 2019 · 7 comments

@yzhang93

After diving deep into the code and the paper, I have two questions.

  1. I've read in the paper that "If the current policy exceeds our resource budget (on latency, energy or model size), we will sequentially decrease the bitwidth of each layer until the constraint is finally satisfied." Where in the code is this step implemented, i.e., decreasing the bitwidth of each layer when the current policy exceeds the budget? (See the sketch after this list for what I understand this step to mean.)

  2. Why don't you use k-means quantization for the latency/energy-constrained experiments? Will you release code for linear quantization?
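
To make question 1 concrete, here is a minimal sketch of what I understand that constraint step to be; `policy`, `layer_costs`, and `budget` are illustrative names I made up, not the actual API in this repo.

```python
# Hypothetical illustration of the paper's description: if the proposed
# bitwidth policy exceeds the resource budget, lower the bitwidths layer
# by layer until the constraint is met. All names are assumptions.

def enforce_budget(policy, layer_costs, budget, min_bit=2):
    """policy: list of per-layer bitwidths.
    layer_costs(bit, idx): resource cost (e.g. size or latency) of layer idx at `bit` bits."""
    def total_cost(p):
        return sum(layer_costs(b, i) for i, b in enumerate(p))

    policy = list(policy)
    while total_cost(policy) > budget:
        changed = False
        # Sequentially walk the layers, decreasing one bit at a time.
        for i in range(len(policy)):
            if policy[i] > min_bit:
                policy[i] -= 1
                changed = True
                if total_cost(policy) <= budget:
                    return policy
        if not changed:  # every layer already at the minimum bitwidth
            break
    return policy
```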

@haibao-yu

Hi, I also have the second question.
Also, did you reproduce the quantization method? I reproduced it on CIFAR-10 + ResNet-20 as described in Section 3.4 of the paper; however, this linear quantization method didn't work.

@lydiaji

lydiaji commented Dec 23, 2019

I find that the code uses k-means quantization, while the paper says to find the optimal clip value that minimizes the KL divergence between the non-quantized and quantized weights/activations. The paper means linear quantization, which is different from what is shown in the code.
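
For reference, here is a rough sketch of the clip-value search the paper describes (symmetric linear quantization, with the clip chosen to minimize the KL divergence between the original and quantized distributions). This is only my illustration of the idea; the function names and histogram settings are assumptions, not the repo's code.

```python
import numpy as np
from scipy.stats import entropy  # entropy(p, q) computes KL(p || q)

def linear_quantize(x, clip, n_bits):
    """Symmetric linear quantization of x onto a uniform grid within [-clip, clip]."""
    max_level = 2 ** (n_bits - 1) - 1
    scale = clip / max_level
    q = np.clip(np.round(x / scale), -max_level, max_level)
    return q * scale

def search_clip(x, n_bits, n_candidates=50, n_hist_bins=2048):
    """Pick the clip value whose quantized histogram is closest (in KL divergence)
    to the histogram of the original values."""
    max_abs = np.abs(x).max()
    bins = np.linspace(-max_abs, max_abs, n_hist_bins + 1)
    p, _ = np.histogram(x, bins=bins, density=True)
    best_clip, best_kl = max_abs, np.inf
    for clip in np.linspace(0.2 * max_abs, max_abs, n_candidates):
        xq = linear_quantize(x, clip, n_bits)
        q, _ = np.histogram(xq, bins=bins, density=True)
        kl = entropy(p + 1e-10, q + 1e-10)  # small epsilon avoids zero bins
        if kl < best_kl:
            best_clip, best_kl = clip, kl
    return best_clip
```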

@mepeichun

I find that the code uses k-means quantization, while the paper says to find the optimal clip value that minimizes the KL divergence between the non-quantized and quantized weights/activations. The paper means linear quantization, which is different from what is shown in the code.

This confuses me as well. The paper uses linear quantization, but the code provides k-means quantization (similar to Deep Compression). After k-means quantization, we cannot guarantee that the weights can be represented as fixed-point values.
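
To make the contrast concrete, here is a toy k-means weight quantization in the spirit of Deep Compression. The centroids it produces are unconstrained floats, which is exactly why the result cannot be assumed to be fixed-point. This is purely illustrative, not the implementation in this repo.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_quantize(weights, n_bits):
    """Cluster weights into 2**n_bits shared centroids (Deep Compression style).
    Each weight is replaced by its centroid; the centroids are unconstrained
    floats, so the result is generally NOT representable as fixed-point values."""
    w = weights.reshape(-1, 1)
    km = KMeans(n_clusters=2 ** n_bits, n_init=10).fit(w)
    centroids = km.cluster_centers_.flatten()
    quantized = centroids[km.labels_].reshape(weights.shape)
    return quantized, centroids

# Example: with n_bits=2 the centroids might come out as
# [-0.31, -0.02, 0.15, 0.48] -- not multiples of a common scale,
# unlike the uniform grid produced by linear quantization.
```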

@lcmeng

lcmeng commented Mar 3, 2020

It's quite unfortunate that the main novelty claimed by the paper, i.e., the use of direct hardware feedback, is conveniently missing from this repo. In fact, even the paper fails to provide a clear explanation of that claim.

@kuan-wang
Collaborator

We have updated the linear quantization as well as the hardware resource-constrained part in this repo. Please let us know if you have any questions.

@lcmeng

lcmeng commented May 5, 2020

Can you please point to the part where the direct HW feedback is used? Thanks. Without that, the repo is still quite limited in significance.

@kuan-wang
Collaborator

Thanks for your feedback! You can refer to the related code here:
def _get_lookuptable(self):
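
For anyone else reading: the name `_get_lookuptable` suggests the hardware feedback is served from a precomputed latency lookup table rather than live measurements. Below is a minimal sketch of that idea; the file name and dictionary layout are my assumptions, not the repo's actual format.

```python
import pickle

# Hypothetical sketch: for each layer, the latency at every candidate
# (weight_bit, activation_bit) pair is measured once offline and stored;
# the agent then queries the table as its "hardware feedback".
# File name and dict layout are assumptions, not the repo's format.

def load_lookup_table(path="latency_table.pkl"):
    with open(path, "rb") as f:
        # table[layer_idx][(w_bit, a_bit)] -> measured latency in ms
        return pickle.load(f)

def policy_latency(table, policy):
    """Total latency of a bitwidth policy: one (w_bit, a_bit) pair per layer."""
    return sum(table[i][(w_bit, a_bit)] for i, (w_bit, a_bit) in enumerate(policy))
```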
