
Did you try your net on imagenet? #1

Open
qingzhouzhen opened this issue Sep 21, 2017 · 7 comments

Comments

@qingzhouzhen

Did you try your net on ImageNet? I tried your ShuffleNet on MXNet with several learning rates, but it did not converge. I want to implement ShuffleNet with Gluon.

@ZiyueHuang
Owner

ZiyueHuang commented Sep 28, 2017

@qingzhouzhen Sorry for the slow response, I didn't notice the issues on my repo. I don't have the ImageNet dataset right now, but I trained it on MNIST and CIFAR, and the accuracy seems to increase normally. I noticed that with the Xavier initializer it cannot converge on CIFAR; it can converge with a uniform initializer.
You can verify it on CIFAR now by

git clone https://github.com/ZiyueHuang/MXShuffleNet.git
cd MXShuffleNet/image-classification
python train_cifar10.py --network shufflenet --batch-size 128 --gpus 0 --lr 0.01

Or on mnist by

python test_mnist.py

I am not sure this implementation is entirely correct.
Could you please train it on ImageNet for a few epochs again? If it cannot converge, please let me know.
I would really appreciate your response.

Here is the commit which adds ShuffleNet: fdb1e77
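The Xavier-vs-uniform gap mentioned above can be seen in the sampling bounds. A sketch, where the scale values are assumptions based on MXNet's documented defaults and not taken from this thread: Uniform draws weights from [-0.07, 0.07], while Xavier (factor_type='avg', magnitude=3) draws from [-b, b] with b = sqrt(3 / ((fan_in + fan_out) / 2)):

```python
import math

def uniform_bound(scale=0.07):
    # Assumed MXNet default: fixed bound regardless of layer width
    return scale

def xavier_bound(fan_in, fan_out, magnitude=3.0):
    # Assumed MXNet Xavier 'avg' rule: bound shrinks as the layer widens
    return math.sqrt(magnitude / ((fan_in + fan_out) / 2.0))

# For narrow layers the Xavier bound can be several times larger than
# the fixed uniform bound, which may behave very differently early in training.
for fan in (24, 240, 960):
    print(fan, round(xavier_bound(fan, fan), 4), uniform_bound())
```

This is only a sketch of the two sampling rules, not an explanation verified on this network.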

@ZiyueHuang
Owner

@qingzhouzhen By the way, how is your ShuffleNet in Gluon going? Did you reproduce the paper's results?

@qingzhouzhen
Author

OK, but I am doing some other work right now; I will tell you the result once I test your net.
No, actually, I do not know how to use Gluon yet; for example, the 'concat' operation is not defined in gluon.nn, so I need some time to understand how to use it.

@ZiyueHuang
Owner

Thanks. Feel free to contact me if you run into any problems.
To use operators like concat which are not in gluon.nn, there are two ways:
Use Block, which internally uses ndarray:

from mxnet.gluon import Block
from mxnet import ndarray as F

class Net(Block):
    def __init__(self, **kwargs):
        super(Net, self).__init__(**kwargs)
        ...

    def forward(self, x):
        # x is an ndarray
        return F.concat(x, ...)

Use HybridBlock, which internally uses ndarray or symbol (note the method is hybrid_forward, which receives the backend module F):

from mxnet.gluon import HybridBlock

class Net(HybridBlock):
    def __init__(self, **kwargs):
        super(Net, self).__init__(**kwargs)
        ...

    def hybrid_forward(self, F, x):
        # F is mxnet.ndarray or mxnet.symbol, depending on hybridization
        return F.concat(x, ...)

@qingzhouzhen
Author

Thanks a lot. Could you show me a detailed example of how to use operators which are not in gluon.nn, if you know one?

@ZiyueHuang
Copy link
Owner

For example, transpose is not in gluon.nn:

import mxnet as mx
from mxnet import ndarray as F
from mxnet.gluon import Block, nn

class Net(Block):
    def __init__(self, num_class, **kwargs):
        super(Net, self).__init__(**kwargs)
        self.dense = nn.Dense(num_class, flatten=False)

    def forward(self, x):
        out = F.transpose(x, axes=(1, 0, 2))
        out = self.dense(out)
        return out

net = Net(num_class=11)
net.initialize()
out = net(F.zeros((2, 3, 4), ctx=mx.cpu(0)))
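Incidentally, transpose is also the core of ShuffleNet's channel shuffle, which is a reshape / transpose / reshape composition. A NumPy sketch of the operation, where the function name and shapes are illustrative and not taken from the repo's code:

```python
import numpy as np

def channel_shuffle(x, groups):
    # x: (batch, channels, height, width); channels must be divisible by groups
    b, c, h, w = x.shape
    x = x.reshape(b, groups, c // groups, h, w)
    x = x.transpose(0, 2, 1, 3, 4)  # swap the group and per-group channel axes
    return x.reshape(b, c, h, w)

x = np.arange(2 * 6 * 1 * 1).reshape(2, 6, 1, 1)
y = channel_shuffle(x, groups=2)
print(y[0, :, 0, 0])  # channels [0..5] interleaved across the 2 groups
```

With 6 channels in 2 groups, channels [0, 1, 2, 3, 4, 5] come out as [0, 3, 1, 4, 2, 5], so each output group mixes channels from every input group.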

@HaoLiuHust

@qingzhouzhen Have you reproduced the result?
