Difference between _fixed_layer and _enas_layer in cifar10/micro_child.py #8

bkj opened this issue Apr 3, 2018 · 4 comments

@bkj

bkj commented Apr 3, 2018

There are a number of differences between _fixed_layer and _enas_layer in cifar10/micro_child.py.

  1. layer_base variable scope
  2. strided pooling layers and convolutions
  3. possible _factorized_reduction for output

Are you able to give some insight into why the code works like this? It seems that when a fixed architecture is specified, the resulting model is not necessarily exactly the same as the one used during RL training. To me, the easiest way to fix the child architecture would be an alternate "dummy controller" that just keeps normal_arc and reduce_arc fixed at the desired architecture, roughly as sketched below.
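
For concreteness, something like the following is what I have in mind. This is just a sketch, and I'm assuming the arcs are the flat integer sequences the controller samples; the numbers below are arbitrary placeholders, not a real architecture.

```python
import tensorflow as tf

# Hypothetical "dummy controller": never samples, just hands the child model
# the same architectures at every step (the encodings below are arbitrary
# placeholders).
fixed_normal_arc = tf.constant([0, 2, 0, 0, 1, 4, 0, 1], dtype=tf.int32)
fixed_reduce_arc = tf.constant([1, 0, 1, 0, 0, 3, 0, 2], dtype=tf.int32)
```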

Thanks
Ben

@hyhieu
Collaborator

hyhieu commented Apr 3, 2018

Hi Ben,

Thanks for the questions. I'll try.

  1. The point of layer_base, which is just a 1x1 convolution, is to standardize the number of output channels to out_filters before performing the main operations in a convolutional cell or a reduction cell (see the sketch after point 3). In _enas_layer, we do this in final_conv. The effect is almost the same, but we found it easier to implement this way.

  2. I am not sure I understand this point. Both _fixed_layer and _enas_layer use both convolutions and pooling. For _fixed_layer, I hope the code is quite straightforward. For _enas_layer, since we need to implement a somewhat dynamic graph, we separate the process into the function _enas_cell.

  3. The purpose of _factorized_reduction is to reduce both spatial dimensions (width and height) by a factor of 2, and potentially to change the number of output filters. At the place you mention, it is used to make sure that the outputs of all operations in a convolutional cell or a reduction cell have the same spatial dimensions, so that they can be concatenated along the depth dimension (see the sketch below).
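
To make 1 and 3 more concrete, here is a simplified sketch of the two pieces in TF 1.x style. It is not the exact code from micro_child.py (batch norm and nonlinearities are omitted, and the names are illustrative), but it shows the intent: the 1x1 convolution standardizes the channel count, and the factorized reduction halves the spatial dimensions while setting the channel count.

```python
import tensorflow as tf  # TF 1.x style, as in this repo

def conv_1x1(x, out_filters, name):
  """1x1 convolution that maps the incoming channels to out_filters."""
  with tf.variable_scope(name):
    in_filters = x.get_shape()[-1].value
    w = tf.get_variable("w", [1, 1, in_filters, out_filters])
    return tf.nn.conv2d(x, w, strides=[1, 1, 1, 1], padding="SAME")

def factorized_reduction(x, out_filters):
  """Halve height and width and set the channel count to out_filters.

  Two parallel stride-2 average pools (the second on an input shifted by
  one pixel), each followed by a 1x1 conv, concatenated along depth.
  """
  with tf.variable_scope("factorized_reduction"):
    path1 = tf.nn.avg_pool(x, [1, 1, 1, 1], [1, 2, 2, 1], "VALID")
    path1 = conv_1x1(path1, out_filters // 2, "path1_conv")

    path2 = tf.pad(x, [[0, 0], [0, 1], [0, 1], [0, 0]])[:, 1:, 1:, :]
    path2 = tf.nn.avg_pool(path2, [1, 1, 1, 1], [1, 2, 2, 1], "VALID")
    path2 = conv_1x1(path2, out_filters - out_filters // 2, "path2_conv")

    return tf.concat([path1, path2], axis=3)
```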

The reason why we cannot just fix normal_arc and reduce_arc and use the same code for both the search process and the fixed-architecture process is efficiency. Dynamic graphs in TF, at least the way we implement them, are slow and very memory inefficient.
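
As a toy illustration (not the actual ops in the repo) of why the two code paths differ: when the op choice is a tensor sampled by the controller, every candidate branch has to be built into the graph and selected at run time, e.g. with tf.case; when the architecture is fixed, the choice is an ordinary Python value and only the chosen branch is built.

```python
import tensorflow as tf  # TF 1.x style

def dynamic_op(x, op_id):
  """op_id is a tensor sampled by the controller, so every candidate branch
  must exist in the graph; tf.case merely selects one at run time."""
  candidates = [
      lambda: tf.nn.relu(x),
      lambda: tf.nn.avg_pool(x, [1, 3, 3, 1], [1, 1, 1, 1], "SAME"),
      lambda: tf.nn.max_pool(x, [1, 3, 3, 1], [1, 1, 1, 1], "SAME"),
  ]
  preds = [tf.equal(op_id, i) for i in range(len(candidates))]
  return tf.case(list(zip(preds, candidates)), default=candidates[0])

def fixed_op(x, op_id):
  """op_id is a plain Python int known at graph-construction time, so only
  the chosen branch is ever built."""
  if op_id == 1:
    return tf.nn.avg_pool(x, [1, 3, 3, 1], [1, 1, 1, 1], "SAME")
  if op_id == 2:
    return tf.nn.max_pool(x, [1, 3, 3, 1], [1, 1, 1, 1], "SAME")
  return tf.nn.relu(x)
```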

Let us know if you still have more questions 😃

@bkj
Author

bkj commented Apr 3, 2018

For number 2, the point was that you're using pooling w/ stride > 1 in the fixed architecture, but a combination of _factorized_reduction and pooling w/ stride = 1 in the ENAS cells.

Makes sense about the dynamic graphs being slow.

Thanks for the quick response. (And thanks for releasing the code! I've been working on a similar project for a little while, so am very excited to compare what I've done to your code.)

~ Ben

@hyhieu
Collaborator

hyhieu commented Apr 3, 2018

> For number 2, the point was that you're using pooling w/ stride > 1 in the fixed architecture, but a combination of _factorized_reduction and pooling w/ stride = 1 in the ENAS cells.

I think it's just because we couldn't figure out how to syntactically make _factorized_reduction run with the output of a dynamic operation, such as tf.case.

@stanstarks

@hyhieu I am wondering whether the reduction cells in _fixed_layer and _enas_layer have the same previous layers. The result of _factorized_reduction is appended to the layers list.

If I understand it correctly, to make the previous layers consistent, this line should be

layers = [layers[0], x]
