gluon feature request: proper registration/initialization of layers inside a list (container) for custom (Hybrid)Blocks #10101

Closed
feevos opened this issue Mar 14, 2018 · 8 comments

@feevos
Contributor

commented Mar 14, 2018

Dear all, it would be very useful if one could add NN layers of a gluon custom model inside a list, similar to torch.nn.ModuleList, something like:

class CustomNet(HybridBlock):

    def __init__(self, **kwargs):
        HybridBlock.__init__(self, **kwargs)
        with self.name_scope():
            # Desired behaviour: layers kept in a plain Python list get registered
            self.layers_list = []
            for i in range(5):
                self.layers_list += [gluon.nn.Conv2D( SomeArguments )]

    def hybrid_forward(self, F, _x):
        # Some manipulation of self.layers_list elements
        out = ...

        return out

I can think of many use cases, but one important one is indexing for neuroevolution problems, i.e. using a variable architecture of a specified set of layers.

Thank you very much for the great work you put into gluon/mxnet.

@szha

Member

commented Mar 14, 2018

How do you intend to use layers_list in your example? It is possible to use Sequential/HybridSequential purely as containers, without using their forward functionality.

@feevos

Contributor Author

commented Mar 14, 2018

Hi @szha, thank you for your reply. I've done so in simpler architectures as you describe but now I want to try something more advanced.

The basic idea is that one can have a set of layers that live in a list, layers_list. Then one can form a sparse connectivity matrix, Sij, where a nonzero entry in row i, column j means that layer_i feeds into layer_j. The connectivity matrix will be an individual inside an evolutionary algorithm, so the architecture of the network is defined by Sij. For example, a simple Sequential module, where one stacks 4 layers

net = Sequential()
for i in range(4):
    net.add(Dense(5))

can be represented with the following connectivity matrix:

   | 1  2  3  4
----------------
1  | 0  1  0  0
2  | 0  0  1  0
3  | 0  0  0  1
4  | 0  0  0  0

Reading row by row, layer_1 connects to layer_2, layer_2 to layer_3, and so on. Layer 4 has no outgoing connections (it is the last layer). But if we want a more advanced network topology (like layer_1 connecting to both layer_2 and layer_3)

   | 1  2  3  4
----------------
1  | 0  1  1  0
2  | 0  0  1  0
3  | 0  0  0  1
4  | 0  0  0  0

the Sequential breaks down. It is still possible to formulate this with Sequential, but I think it lacks the flexibility of indexing.
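To make the two matrices above concrete, here is a minimal sketch in plain Python (no MXNet needed) that lists the connections each matrix encodes and checks whether they form a simple chain. The names `S_CHAIN`, `S_BRANCH`, `edges` and `is_simple_chain` are made up for this illustration, and indices are 0-based in the code while the tables are 1-based.

```python
# Connectivity matrices from the discussion, as nested lists.
# S[i][j] == 1 means "layer i feeds layer j" (0-based here, 1-based in the tables).
S_CHAIN = [
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
S_BRANCH = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]


def edges(S):
    """Return the (source, target) pairs of all nonzero entries, row by row."""
    return [(i, j) for i, row in enumerate(S) for j, v in enumerate(row) if v]


def is_simple_chain(S):
    """True when the only connections are i -> i+1, i.e. a plain stacked network."""
    return edges(S) == [(i, i + 1) for i in range(len(S) - 1)]


print(edges(S_CHAIN), is_simple_chain(S_CHAIN))
print(edges(S_BRANCH), is_simple_chain(S_BRANCH))
```

Only the first matrix passes the chain check; the second has an extra edge from layer 1 to layer 3, which is the case a purely sequential forward pass cannot express.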

Now assuming one has the layers in a container (a list in this example, I can think of dictionary usage as well), layers_list, and Sij is a sparse matrix, one can formulate a forward function (design prototype, not the true solution, here is an example of iterating over sparse matrix):

def hybrid_forward(self, F, x):
    out = self.first_layer(x)
    cx = Sij.tocoo()  # Sij is assumed to be a scipy.sparse matrix
    # This for loop iterates over the nonzero elements.
    for i, j in zip(cx.row, cx.col):
        out = self.layers_list[j](self.layers_list[i](out))
        out = F.relu(out)
    return out

The basic idea is to create a DAG on the fly. Using lists and a connectivity matrix is the first way of implementing this that comes to my mind (I may be wrong; I am pretty sure there are better ways of doing so, but I don't know any). I think this functionality, in combination with the flexibility of gluon's imperative style, can help a lot of people play with variable architectures.
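One possible way to evaluate such a DAG is sketched below in plain Python, with ordinary callables standing in for layers. This is only an illustration under stated assumptions, not the approach proposed in this thread: each layer runs exactly once in topological order, a layer with several incoming edges receives the sum of its predecessors' outputs, layers without incoming edges receive the network input, and the output of the last layer in topological order is returned. The names `topological_order` and `dag_forward` are hypothetical.

```python
def topological_order(S):
    """Kahn's algorithm on an adjacency matrix; raises if S contains a cycle."""
    n = len(S)
    indegree = [sum(S[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    order = []
    while ready:
        i = ready.pop(0)
        order.append(i)
        for j in range(n):
            if S[i][j]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    if len(order) != n:
        raise ValueError("connectivity matrix is not a DAG")
    return order


def dag_forward(layers, S, x):
    """Run every layer once; inputs from several predecessors are summed (assumption)."""
    n = len(S)
    order = topological_order(S)
    outputs = {}
    for j in order:
        preds = [i for i in range(n) if S[i][j]]
        # Source layers (no incoming edge) see the network input directly.
        inp = sum(outputs[i] for i in preds) if preds else x
        outputs[j] = layers[j](inp)
    return outputs[order[-1]]


# Toy scalar "layers" so the data flow is easy to follow by hand.
toy_layers = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3, lambda v: v * 10]
S_BRANCH = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
]
result = dag_forward(toy_layers, S_BRANCH, 1)
print(result)
```

With real layers the merge step would have to respect tensor shapes (summation needs matching shapes, concatenation changes the channel count), so the merge rule is a design choice that the connectivity matrix alone does not determine.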

@feevos feevos changed the title gluon feature request: proper registration/initialization of layers inside a list for custom (Hybrid)Blocks gluon feature request: proper registration/initialization of layers inside a list (container) for custom (Hybrid)Blocks Mar 14, 2018

@szha

Member

commented Mar 14, 2018

In [1]: import mxnet as mx

In [2]: net = mx.gluon.model_zoo.vision.alexnet()

In [3]: net
Out[3]:
AlexNet(
  (features): HybridSequential(
    (0): Conv2D(None -> 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
    (1): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
    (2): Conv2D(None -> 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
    (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
    (4): Conv2D(None -> 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (5): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (6): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (7): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
    (8): Flatten
    (9): Dense(None -> 4096, Activation(relu))
    (10): Dropout(p = 0.5, axes=())
    (11): Dense(None -> 4096, Activation(relu))
    (12): Dropout(p = 0.5, axes=())
  )
  (output): Dense(None -> 1000, linear)
)

In [4]: net.features
Out[4]:
HybridSequential(
  (0): Conv2D(None -> 64, kernel_size=(11, 11), stride=(4, 4), padding=(2, 2))
  (1): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
  (2): Conv2D(None -> 192, kernel_size=(5, 5), stride=(1, 1), padding=(2, 2))
  (3): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
  (4): Conv2D(None -> 384, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (5): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (6): Conv2D(None -> 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (7): MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
  (8): Flatten
  (9): Dense(None -> 4096, Activation(relu))
  (10): Dropout(p = 0.5, axes=())
  (11): Dense(None -> 4096, Activation(relu))
  (12): Dropout(p = 0.5, axes=())
)

In [5]: net.features[3]
Out[5]: MaxPool2D(size=(3, 3), stride=(2, 2), padding=(0, 0), ceil_mode=False)
@sxjscience

Member

commented Mar 14, 2018

Refer to #9264.

@sxjscience

Member

commented Mar 14, 2018

@feevos Currently HybridSequential and Sequential have the same functionality as ModuleList. Thus we previously decided not to add an additional ModuleList. We can bring it to the table again.
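The reason a container block is needed at all can be sketched with a toy model in plain Python. It is not MXNet's actual implementation, and every class name here is hypothetical. The assumption being illustrated is that a parent block registers a child only when the child is assigned directly as an attribute, so blocks hidden inside a plain list stay invisible to anything that walks the children (parameter collection, initialization), while a container block registers each element and still supports indexing.

```python
class ToyBlock:
    """Toy stand-in for a block: tracks children assigned as attributes."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the registry itself.
        object.__setattr__(self, "_children", {})

    def __setattr__(self, name, value):
        # Only direct ToyBlock attributes are registered; a list is stored as-is.
        if isinstance(value, ToyBlock):
            self._children[name] = value
        object.__setattr__(self, name, value)

    def collect(self):
        """Return this block and every registered descendant, depth first."""
        found = [self]
        for child in self._children.values():
            found.extend(child.collect())
        return found


class ToyContainer(ToyBlock):
    """Container that registers each added block and supports indexing."""

    def add(self, block):
        self._children[str(len(self._children))] = block

    def __getitem__(self, index):
        return self._children[str(index)]

    def __len__(self):
        return len(self._children)


class WithList(ToyBlock):
    def __init__(self):
        super().__init__()
        self.layers = [ToyBlock() for _ in range(3)]  # invisible to collect()


class WithContainer(ToyBlock):
    def __init__(self):
        super().__init__()
        self.layers = ToyContainer()
        for _ in range(3):
            self.layers.add(ToyBlock())  # registered, still indexable


print(len(WithList().collect()), len(WithContainer().collect()))
```

In this toy model the list-based parent finds only itself, whereas the container-based parent finds itself, the container, and the three layers.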

@feevos

Contributor Author

commented Mar 15, 2018

Hi @szha and @sxjscience thank you very much for your reply. So if I understand correctly, (Hybrid)Sequential can also be used as a container of the various layers and indexed just like a list so I can use the contained layers in any order I want (without the stacked sequential forward functionality). If I understand correctly, I can use something like:

class CustomNet(HybridBlock):

    def __init__(self, **kwargs):
        HybridBlock.__init__(self, **kwargs)

        with self.name_scope():
            self.net = HybridSequential()
            # Add some convolution operators
            for i in range(3):
                self.net.add(Conv2D(....))

    # Change the order of the layers in self.net.
    # This is not equivalent to self.net(x)
    def hybrid_forward(self, F, x):
        out = self.net[2](x)
        out = self.net[0](out)
        out = self.net[1](out)

        return out

If my understanding is correct then yes, there is no need for something similar to ModuleList. I haven't seen anything like what you just described in the documentation (it would be nice to add it to the gluon book and the API docs).

Thank you very much!

@feevos feevos closed this Mar 15, 2018

@szha

Member

commented Mar 15, 2018

Good point on making the feature known. cc'd @zackchase, @astonzhang, @mli

@jacksonloper


commented May 8, 2018

This solution is kind of weird. Sequential feels like it ought to be composed of things that can feed into one another. But if you are just using it as a list, the shapes might not even be right for that.

I admit it isn't a high priority, but just for sugar it might be nice to implement a separate blocklist class.

@sxjscience sxjscience referenced this issue Jun 13, 2018

Closed

add blocklist #11254

0 of 7 tasks complete