What is AmoebaNet-D? #130

Closed
mrteera opened this issue May 24, 2018 · 10 comments

@mrteera

mrteera commented May 24, 2018

I checked the paper and searched Google, but I could not find any information about AmoebaNet-D. The paper only describes AmoebaNet-A, AmoebaNet-B, and AmoebaNet-C. What is AmoebaNet-D?

@sb2nov
Contributor

sb2nov commented May 24, 2018

From one of the authors:

"AmoebaNet-D was obtained by evolving on the ImageNet training set, starting with AmoebaNet-B, and then manually extrapolating the evolutionary process and tuning it for low training cost. It was the version submitted to the Stanford DAWNBench competition. More details may follow in subsequent publications."

@sb2nov sb2nov closed this as completed May 24, 2018
@xmfbit

xmfbit commented Jun 7, 2018

@sb2nov Sorry for bothering you. I am confused by one line in the AmoebaNet implementation (see https://github.com/tensorflow/tpu/blob/master/models/experimental/amoeba_net/network_utils.py#L321):

net.append(prev_layer)

Why not insert prev_layer at the beginning of the list? It is the h_0 in the paper, right?

I would also like to know why the used_hiddenstates of AmoebaNet-A is [1, 0, 1, 0, 0, 1, 0]. See https://github.com/tensorflow/tpu/blob/master/models/experimental/amoeba_net/model_specs.py#L32
According to the paper, only 3 states are concatenated. So why are there 4 zeros in the list?

@sb2nov
Contributor

sb2nov commented Jun 7, 2018

@bignamehyp could you answer the above?

@bignamehyp
Member

Normal cell diagram of amoeba_net_d: https://goo.gl/gKt4fL
reduction cell diagram: https://goo.gl/a57tsX

@xmfbit

xmfbit commented Jun 8, 2018

@bignamehyp Thanks for the reply. I am confused about why used_hiddenstates = [0, 1, 1, 0, 0, 1, 0] for the normal cell of amoeba_net_d (see https://github.com/tensorflow/tpu/blob/master/models/experimental/amoeba_net/model_specs.py#L48).

In the diagram, the output of the cell is the concatenation of 3 hidden states. But used_hiddenstates contains 4 zeros, and a zero means the state is not used and will therefore be concatenated into the output of the cell.
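
To make the mismatch concrete, here is a minimal sketch of the reading described above, where a 0 flag marks a hidden state that gets concatenated into the cell output. The helper name `indices_to_concat` is hypothetical and is not taken from network_utils.py; only the mask itself comes from this thread.

```python
# Mask quoted in this thread for the amoeba_net_d normal cell.
used_hiddenstates = [0, 1, 1, 0, 0, 1, 0]


def indices_to_concat(mask):
    """Return the positions whose flag is 0.

    Assumption for illustration: a 0 flag means the hidden state at that
    position is concatenated into the cell output.
    """
    return [i for i, used in enumerate(mask) if not used]


to_concat = indices_to_concat(used_hiddenstates)
print(to_concat)       # positions selected for concatenation
print(len(to_concat))  # number of concatenated states
```

Under this reading the mask selects four positions, including position 0, while the diagram shows only three concatenated states.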

@bignamehyp
Member

There are 7 elements in used_hiddenstates. The first two elements are the input hidden states h0 and h1, which will not be used for concat. The last 5 elements indicate which outputs are used for concat.

@xmfbit

xmfbit commented Jun 8, 2018

Thanks for your patience! Yes, I understand what you said above, but now I am even more confused: I think the first two hidden states are not skipped. See https://github.com/tensorflow/tpu/blob/master/models/experimental/amoeba_net/network_utils.py#L490

Another question: h0 should be the first element in the list of the hidden state, right? But in the code, net (which is h1 I think) is the first element. Refer to https://github.com/tensorflow/tpu/blob/master/models/experimental/amoeba_net/network_utils.py#L321 Why not net.insert(0, prev_layer)?
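
The ordering question can be illustrated with plain Python lists. This is only a sketch: the strings "h0" and "h1" stand in for the real tensors, and both function names are hypothetical rather than taken from the repository.

```python
def build_states_append(net, prev_layer):
    """Ordering produced by net.append(prev_layer): prev_layer lands last."""
    states = [net]
    states.append(prev_layer)
    return states


def build_states_insert(net, prev_layer):
    """Ordering produced by net.insert(0, prev_layer): prev_layer lands first."""
    states = [net]
    states.insert(0, prev_layer)
    return states


# Placeholders: "h1" plays the role of net, "h0" the role of prev_layer.
print(build_states_append("h1", "h0"))  # ['h1', 'h0']
print(build_states_insert("h1", "h0"))  # ['h0', 'h1']
```

Any later step that indexes into this list (for example, a per-position mask such as used_hiddenstates) sees h0 and h1 swapped depending on which call is used.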

@karandwivedi42

I agree with @xmfbit. The used_hiddenstates don't match the description. @bignamehyp can you please answer the questions raised above :)

@xmfbit

xmfbit commented Jun 21, 2018

@karandwivedi42 Actually, I set the first two elements of used_hiddenstates to 1.

@bignamehyp
Member

Thank you very much for digging into the code. There are actually bugs in the model builder:

  1. prev_layer should be in position 0 instead of 1.
  2. used_hiddenstates were not computed correctly. The first two elements should be 1.
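
As a rough check of item 2, the sketch below sets the first two flags to 1 in the two masks quoted earlier in this thread and counts the zeros that remain. The helper `mark_inputs_used` is hypothetical; how the repository computes or stores the corrected masks is not stated here.

```python
def mark_inputs_used(mask):
    """Return a copy of the mask with the first two flags forced to 1."""
    return [1, 1] + list(mask[2:])


# Masks as quoted in this thread (model_specs.py lines 32 and 48).
quoted_masks = {
    "amoeba_net_a": [1, 0, 1, 0, 0, 1, 0],
    "amoeba_net_d": [0, 1, 1, 0, 0, 1, 0],
}

for name, mask in quoted_masks.items():
    fixed = mark_inputs_used(mask)
    # A 0 flag is read here as "concatenated into the cell output".
    print(name, fixed, "zeros:", fixed.count(0))
```

With the first two flags set to 1, both quoted masks leave three zeros, which agrees with the three concatenated states reported from the paper and the diagram.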

Our visualization of the cell architecture was based on what we believed the code did rather than on what it actually produced. We have updated the cell architecture in the paper to match the code output. Please see Figure 2 at https://arxiv.org/pdf/1802.01548.pdf for the latest diagrams.
