
Missing some layers #381

Open
guoxuesong opened this issue Jul 17, 2017 · 5 comments

guoxuesong commented Jul 17, 2017

I'm porting my model from Lasagne/Theano.

I found that some layers I need do not exist in neon.

The first question I want to ask is: "Are there any tutorials about creating custom layers?"

I looked through the issue history; Create custom layer #100 was closed, but I think there should still be some kind of tutorial, at least to explain what kinds of requests are doable and what are not.

Some layers look simple but are missing, for example something like SliceLayer to slice the input along a particular axis, or something that was called simple in past issues, like DimshuffleLayer to transpose the input.

If these are really simple and NervanaSystems does not want to support them officially, maybe you could teach us how to do it ourselves.

But the real challenge for me is implementing Goroshin's argmax, as referenced in "Stacked What-Where Auto-Encoders". I implemented it in Theano like this:

import theano
import theano.tensor as T

floatX = theano.config.floatX

def floatXconst(x):
    # cast a Python scalar to the configured float dtype
    return T.constant(x, dtype=floatX)

def goroshin_argmax(z, shape, axis=(1,), beta=3, epsilon=0.0001):
    # soft argmax: scale z to a bounded range, softmax it over the selected
    # axes, then take the expectation of the coordinate grids under that softmax
    z = z / (abs(T.max(z)) + floatXconst(epsilon))
    a = ()
    for t in axis:
        a += (slice(0, shape[t]),)
    # grid shape broadcastable against z: 1 everywhere except the argmax axes
    xyshape = list(shape)
    for i in range(len(shape)):
        if i not in axis:
            xyshape[i] = 1
    xy = T.mgrid[a]
    b = T.exp(beta * z) / T.exp(beta * z).sum(axis, keepdims=True)
    res = []
    for i in range(len(axis)):
        x = (xy[i].astype(floatX).reshape(xyshape) * b).sum(axis=axis)
        res += [x]
    return T.stack(res, axis=1)
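
For context, this is how I call it, with a 4D feature map and the soft argmax taken over the spatial axes (the concrete shape here is just an example):

# example usage: soft (row, col) positions per channel of a 4D feature map
import numpy as np
import theano
import theano.tensor as T

z = T.tensor4('z')
shape = (8, 16, 32, 32)              # (batch, channels, rows, cols), known at graph-build time
soft_xy = goroshin_argmax(z, shape, axis=(2, 3), beta=3)
f = theano.function([z], soft_xy)

out = f(np.random.rand(*shape).astype(floatX))
print(out.shape)                     # (8, 2, 16): one soft (row, col) pair per channel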

It seems that neon has something named Autodiff; can I use it to calculate the gradient, or do I have to work out the math myself?
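
For reference, this is the kind of thing I imagine Autodiff lets me do. It is only a rough sketch from my reading of the neon backend docs; the exact Autodiff constructor and the get_grad_asnumpyarray method name are guesses I haven't verified:

# rough, untested sketch: build an op-tree from backend ops and ask Autodiff
# for its gradient; the Autodiff API used here is an unverified assumption
from neon.backends import gen_backend
from neon.backends.autodiff import Autodiff

be = gen_backend(backend='cpu', batch_size=32)

x = be.ones((4, 4))
f = be.sig(x) * x + 2.0          # an op-tree built from backend operations

ad = Autodiff(op_tree=f, be=be)
(x_grad,) = ad.get_grad_asnumpyarray([x])
print(x_grad)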

I know you have another project named ngraph; can I use ngraph to write custom layers for neon? I don't want to use ngraph for the whole job.

@guoxuesong

I just made my project public: deepstacks

deepstacks is a build_network() for Lasagne and neon. You define your network model in a datasheet with stack-machine mechanisms. It supports reusing part of a model as a function, and sharing parameters.

Please have a look at deepstacks/deepstacks/neon/implement.py

To complete the neon implementation, I need (in Lasagne's terms): ElemwiseMergeLayer, SliceLayer, Upscale[123]DLayer, LocallyConnected[123]DLayer, DimshuffleLayer, GaussianNoiseLayer, ExpressionLayer. Leaving them unimplemented is OK, but I want to complete it if possible.
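
For example, here is a rough sketch of what a SliceLayer over channels might look like as a neon custom layer. I'm guessing at the Layer interface and at the flattened (C*H*W, batch) buffer layout, so this may well be wrong:

# rough sketch of a SliceLayer-like custom layer that keeps channels [start, end)
# of a (C, H, W) input; the buffer layout and slicing behaviour are assumptions
from neon.layers.layer import Layer

class ChannelSliceLayer(Layer):
    def __init__(self, start, end, name=None):
        super(ChannelSliceLayer, self).__init__(name)
        self.start = start
        self.end = end
        self.owns_delta = True

    def configure(self, in_obj):
        super(ChannelSliceLayer, self).configure(in_obj)
        c, h, w = self.in_shape
        self.hw = h * w
        self.out_shape = (self.end - self.start, h, w)
        return self

    def fprop(self, inputs, inference=False):
        # rows of the flattened (C*H*W, N) buffer belonging to the kept channels
        self.outputs[:] = inputs[self.start * self.hw:self.end * self.hw]
        return self.outputs

    def bprop(self, error, alpha=1.0, beta=0.0):
        if self.deltas is not None:
            # error flows back only into the kept channels, the rest stays zero
            self.deltas[:] = 0
            self.deltas[self.start * self.hw:self.end * self.hw] = error
        return self.deltas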

I hope my project can help more people take advantage of neon. I'm new to neon myself, though, so if any part of my code is wrong, just let me know.

@chengchingwen

There is a neon tutorial;
maybe you can take a look at this one.

@guoxuesong

@chengchingwen would you please explain these lines for me? In bprop:

    if self.deltas:
        self.be.compound_dot(A=self.W.T, B=error, C=self.deltas, alpha=alpha, beta=beta)
    self.be.compound_dot(A=error, B=self.inputs.T, C=self.dW)

@guoxuesong

I tried to implement a GaussianNoiseLayer; my code is below. I'm not sure whether it is correct. I don't really understand the alpha and beta things; I just copied them from SkipNode:


import numpy as np
from neon.layers.layer import Layer


class GaussianNoiseLayer(Layer):
    def __init__(self, sigma=0.1, name=None):
        super(GaussianNoiseLayer, self).__init__(name)
        self.sigma = sigma
        self.owns_delta = True
        self.is_mklop = True

    def fprop(self, inputs=None, inference=False, beta=0):
        # fill the noise buffer with N(0, sigma) samples and add it to the input
        self.be.fill_normal(self.noisebuf, stdv=self.sigma)
        self.be.fprop_skipnode(inputs, self.outputs, beta)
        self.outputs[:] = self.outputs + self.noisebuf
        return self.outputs

    def configure(self, in_obj):
        super(GaussianNoiseLayer, self).configure(in_obj)
        # output shape is identical to the input shape
        self.out_shape = self.in_shape
        self.noisebuf = self.be.iobuf(self.in_shape, dtype=np.float32)
        return self

    def bprop(self, error, alpha=1.0, beta=0.0):
        # additive noise has gradient 1, so just pass the error through like
        # SkipNode: for better performance, mkl does nothing; otherwise,
        # convert back and deal with alpha and beta.
        self.be.bprop_skipnode(error, self.deltas, alpha, beta)
        return self.deltas
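
For what it's worth, this is how I intend to drop it into a model. The surrounding Affine layers, initializer and activations are just placeholders, not part of the layer itself:

# sketch: plugging the custom noise layer into an ordinary neon model
from neon.backends import gen_backend
from neon.initializers import Gaussian
from neon.layers import Affine
from neon.models import Model
from neon.transforms import Rectlin, Softmax

be = gen_backend(backend='cpu', batch_size=128)
init = Gaussian(scale=0.01)

layers = [
    Affine(nout=100, init=init, activation=Rectlin()),
    GaussianNoiseLayer(sigma=0.1),
    Affine(nout=10, init=init, activation=Softmax()),
]
model = Model(layers=layers)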

@chengchingwen

@guoxuesong
In bprop, self.deltas is the error that needs to be back-propagated to the previous layer; alpha and beta are just parameters of self.be.compound_dot, so you may want to take a look at the doc.
I guess it just says: take the dot product of self.W.T and error and assign it to self.deltas if self.deltas is set, and compute dW every time bprop is called.
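
In other words, assuming compound_dot follows the usual GEMM convention C = alpha * A.dot(B) + beta * C (check the backend doc to be sure), those lines read as:

    # gradient w.r.t. the layer input, passed back to the previous layer:
    # deltas = alpha * W.T.dot(error) + beta * deltas
    if self.deltas:
        self.be.compound_dot(A=self.W.T, B=error, C=self.deltas, alpha=alpha, beta=beta)

    # gradient w.r.t. the weights, recomputed on every bprop call:
    # dW = error.dot(inputs.T)
    self.be.compound_dot(A=error, B=self.inputs.T, C=self.dW)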
