
Alexnet only get 48.5% Top 1 accuracy #4

Open
yiqianglee opened this issue Sep 28, 2017 · 11 comments

Comments

@yiqianglee

Hi,
After downloading the chainer AlexNet pre-trained model, we only get 48.5% top-1 accuracy. Is this expected?

@knorth55
Collaborator

> After downloading the chainer AlexNet pre-trained model, we only get 48.5% top-1 accuracy. Is this expected?

What do you mean by top-1 accuracy?
Can you post code that reproduces your issue?

@yiqianglee
Author

We just use ImageNet's validation set to compute the accuracy. In total it is 50,000 images, which is probably too large for me to upload, but I think you can download it from the official ImageNet website.

One more question: is this chainer model trained with chainer directly? What hyper-parameters did you use, like learning rate, momentum, batch size, and number of epochs?

@knorth55
Collaborator

This chainer model is converted from the AlexNet model trained with Caffe,
and the original caffemodel was downloaded from http://dl.caffe.berkeleyvision.org/bvlc_alexnet.caffemodel
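For reference, such a conversion can be sketched with chainer's built-in caffemodel loader; this is a minimal sketch assuming local file paths, not the exact script used for this repository:

```python
import chainer
from chainer.links.caffe import CaffeFunction

# parse the BVLC caffemodel into a chainer link (slow on the first load)
caffemodel = CaffeFunction('bvlc_alexnet.caffemodel')

# serialize the loaded parameters to HDF5 so chainer can reload them quickly
chainer.serializers.save_hdf5('bvlc_alexnet.chainermodel', caffemodel)
```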

@knorth55
Collaborator

knorth55 commented Sep 28, 2017

> We just use ImageNet's validation set to compute the accuracy. In total it is 50,000 images, which is probably too large for me to upload, but I think you can download it from the official ImageNet website.

This is weird. Let me have a look and check whether the chainermodel was correctly converted from the original caffemodel.
Can you post your validation code for ImageNet?
You don't need to upload the ImageNet dataset, just the code you wrote.

@wkentaro
Owner

Maybe we need an evaluate.py to reproduce the evaluation in the original work.
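The core of such an evaluate.py would just count how often the highest-scoring class matches the ground-truth label. A minimal numpy-only sketch with toy scores (not the actual evaluation script):

```python
import numpy as np

def top1_accuracy(logits, labels):
    """logits: (N, n_classes) class scores; labels: (N,) ground-truth indices."""
    pred = np.argmax(logits, axis=1)
    return float(np.mean(pred == labels))

# toy example: 4 samples, 3 classes; argmax predictions are [1, 0, 2, 0]
logits = np.array([[0.1, 0.9, 0.0],
                   [0.8, 0.1, 0.1],
                   [0.2, 0.3, 0.5],
                   [0.6, 0.2, 0.2]])
labels = np.array([1, 0, 2, 1])
print(top1_accuracy(logits, labels))  # 0.75
```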

@yiqianglee
Author

Thanks for your reply; I look forward to your result.

Actually, I'm more interested in finding a chainer model trained with chainer directly, not converted from Caffe, but I still haven't found any model zoo. Do you have any comments? Thanks!

@knorth55
Collaborator

> Actually, I'm more interested in finding a chainer model trained with chainer directly, not converted from Caffe, but I still haven't found any model zoo. Do you have any comments? Thanks!

I've never seen an AlexNet model trained with chainer, because chainer can load a caffemodel via chainer.links.caffe.CaffeFunction.
But if there were a big difference between the original caffemodel and a model trained with chainer, that would be quite interesting.


@knorth55
Collaborator

knorth55 commented Sep 29, 2017

> This is weird. Let me have a look and check whether the chainermodel was correctly converted from the original caffemodel.

I compared the weights of each layer and calculated the diff between the original caffemodel and the converted chainermodel.
The result is below, and there seems to be no difference between the two models.

linkname: conv1
    diff norm: 0.0
linkname: conv2
    diff norm: 0.0
linkname: conv3
    diff norm: 0.0
linkname: conv4
    diff norm: 0.0
linkname: conv5
    diff norm: 0.0
linkname: fc6
    diff norm: 0.0
linkname: fc7
    diff norm: 0.0
linkname: fc8
    diff norm: 0.0

The checking script is as follows:

import chainer
from chainer.links import caffe
from model import AlexNet
import numpy as np
import os.path as osp

filepath = osp.abspath(osp.dirname(__file__))

caffemodel_path = osp.join(filepath, 'data/bvlc_alexnet.caffemodel')
chainermodel_path = osp.join(filepath, 'data/bvlc_alexnet.chainermodel')

# converted chainer model
chainermodel = AlexNet()
chainer.serializers.load_hdf5(chainermodel_path, chainermodel)

# original caffe model
caffemodel = caffe.CaffeFunction(caffemodel_path)

linknames = sorted([x.name for x in caffemodel.children()])

for linkname in linknames:
    print('linkname: {}'.format(linkname))
    caffe_layer = getattr(caffemodel, linkname)
    chainer_layer = getattr(chainermodel, linkname)
    # Frobenius norm of the weight difference; 0.0 means identical weights
    diff_norm = np.linalg.norm(caffe_layer.W.data - chainer_layer.W.data)
    print('    diff norm: {}'.format(diff_norm))

@yiqianglee
Author

Thanks for your kind reply.

We use a similar script to load the chainer model. Have you tried the whole validation set to get the overall accuracy?

@knorth55
Collaborator

> We use a similar script to load the chainer model. Have you tried the whole validation set to get the overall accuracy?

We've never done that before.
Please check that you don't forget to preprocess the images before running inference.
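For the record, the BVLC Caffe AlexNet expects mean-subtracted 227x227 BGR input in CHW order. A minimal numpy sketch of that preprocessing, assuming the image is already resized to 256x256 (the mean-pixel values below are the commonly used BGR approximation; the original training used a per-pixel mean file):

```python
import numpy as np

# approximate ILSVRC-2012 mean pixel in BGR order (assumption, see note above)
MEAN_BGR = np.array([104.0, 117.0, 123.0], dtype=np.float32)

def preprocess(img_rgb):
    """img_rgb: 256x256x3 uint8 RGB image -> (3, 227, 227) float32 CHW."""
    h, w, _ = img_rgb.shape
    # center crop to the network's 227x227 input size
    top, left = (h - 227) // 2, (w - 227) // 2
    crop = img_rgb[top:top + 227, left:left + 227].astype(np.float32)
    crop = crop[:, :, ::-1]          # RGB -> BGR (Caffe convention)
    crop -= MEAN_BGR                 # subtract mean pixel
    return crop.transpose(2, 0, 1)   # HWC -> CHW

x = preprocess(np.zeros((256, 256, 3), dtype=np.uint8))
print(x.shape)  # (3, 227, 227)
```

Skipping the RGB-to-BGR swap or the mean subtraction is a common cause of a large accuracy drop with converted Caffe models.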


3 participants