Bounding box sizes too large #9

Open
simon-rob opened this issue May 4, 2018 · 9 comments

Comments

@simon-rob

Robert, many thanks for your great work!

I am having trouble understanding why I am getting larger-than-expected bounding boxes for Pelee detections.

The heights and widths are not as closely cropped as in MobileNet-SSD implementations. I have read that you trained the model with PyTorch; could the conv padding be a problem? Or is there something else I have missed?

Many Thanks,

Simon

I am using the following Python script for my test:

import os
import cv2
import numpy as np
import caffe

net_file = 'pelee.prototxt'
caffe_model = 'pelee_304x304_acc7637.caffemodel'
test_dir = "images"

if not os.path.exists(caffe_model):
    print("caffemodel does not exist")
    exit()
net = caffe.Net(net_file, caffe_model, caffe.TEST)

CLASSES = ('background',
           'aeroplane', 'bicycle', 'bird', 'boat',
           'bottle', 'bus', 'car', 'cat', 'chair',
           'cow', 'diningtable', 'dog', 'horse',
           'motorbike', 'person', 'pottedplant',
           'sheep', 'sofa', 'train', 'tvmonitor')

def preprocess(src):
    img = cv2.resize(src, (304,304))
    img_mean = np.array([103.94, 116.78, 123.68], dtype=np.float32)
    img = img.astype(np.float32, copy=True) - img_mean
    img = img * 0.017
    return img

def postprocess(img, out):   
    h = img.shape[0]
    w = img.shape[1]
    box = out['detection_out'][0,0,:,3:7] * np.array([w, h, w, h])
    cls = out['detection_out'][0,0,:,1]
    conf = out['detection_out'][0,0,:,2]
    return (box.astype(np.int32), conf, cls)

def detect(imgfile, thresh):
    origimg = cv2.imread(imgfile)
    img = preprocess(origimg)
    img = img.astype(np.float32)
    img = img.transpose((2, 0, 1))

    net.blobs['data'].data[...] = img
    out = net.forward()  
    box, conf, cls = postprocess(origimg, out)
    for i in range(len(box)):
        if conf[i] > thresh:
            p1 = (box[i][0], box[i][1])
            p2 = (box[i][2], box[i][3])
            cv2.rectangle(origimg, p1, p2, (0, 255, 0))
            p3 = (max(p1[0], 15), max(p1[1], 15))
            title = "%s:%.2f" % (CLASSES[int(cls[i])], conf[i])
            cv2.putText(origimg, title, p3, cv2.FONT_ITALIC, 0.6, (0, 255, 0), 1)
    cv2.imshow("Pelee", origimg)
 
    k = cv2.waitKey(0) & 0xff
    if k == 27 : return False
    return True

for f in os.listdir(test_dir):
    if not detect(test_dir + "/" + f, 0.2):
        break
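For reference, postprocess above just rescales the SSD detection_out coordinates, which are normalized to [0, 1], back to pixel coordinates. A minimal standalone sketch of that step; the detection_out blob here is fabricated for illustration:

```python
import numpy as np

# Fabricated detection_out blob, shaped like Caffe's SSD output:
# each row is [image_id, class_id, confidence, xmin, ymin, xmax, ymax],
# with the box coordinates normalized to [0, 1].
out = {'detection_out': np.array([[[[0, 15, 0.92, 0.10, 0.20, 0.50, 0.80]]]],
                                 dtype=np.float32)}
h, w = 480, 640  # original image height and width

# Same arithmetic as postprocess(): scale normalized coords to pixels.
box = out['detection_out'][0, 0, :, 3:7] * np.array([w, h, w, h])
print(box.astype(np.int32))  # [[ 64  96 320 384]]
```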

[Screenshot: ssd_screenshot_04 05 2018]

@simon-rob (Author)

I have just downloaded your latest prototxt and it's fixed!

[Screenshot: ssd_screenshot_04 05 2018-fixed]

@MrWhiteHomeman

@simon-rob, thank you for your Python script. Does it use the CPU or the GPU to test a picture?
It takes 0.3 seconds to test an image, so I don't know whether the CPU or GPU was used.
Can you give me some advice?

@simon-rob (Author)

@MrWhiteHomeman,

It depends on whether you successfully compiled the GPU version of Caffe and whether you disabled the GPU by uncommenting CPU_ONLY := 1 in Makefile.config.

If you do have a GPU version installed, you can switch between CPU and GPU by using:

caffe.set_mode_gpu() or caffe.set_mode_cpu()

Otherwise it should default to using the GPU.

You could try putting caffe.set_mode_cpu() in the python code to see if the performance differs.

@MrWhiteHomeman

@simon-rob
Thank you for your excellent advice! Now I know how to switch between CPU and GPU.
It takes 0.1 seconds to detect an image on the GPU and 0.3 seconds on the CPU. How long does it take you to detect an image on the CPU or GPU? And I have another question: in the first line of your Python script, net_file is pelee.prototxt. Do you mean that pelee.prototxt is the deploy.prototxt?

@simon-rob (Author)

simon-rob commented May 22, 2018

@MrWhiteHomeman

I haven't benchmarked the speed yet, as I am not interested in PC GPU speed; I am interested in CPU/GPU inference on mobile/embedded. I have got 45-50 ms on a Snapdragon 820 for MobileNet-SSD v1, so I am hoping PeleeNet will be about the same or faster.

So 0.1 s (100 ms) seems a bit slow, but it depends on where you start and stop measuring. I normally measure just the inference time, not the image load or pre-processing, since those are the same for whatever network you use and will vary with CPU type and original image size.
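A minimal sketch of timing just the forward pass; time_inference and forward_fn are names made up for illustration, and you would substitute the real net.forward call when timing Caffe:

```python
import time

def time_inference(forward_fn, warmup=3, runs=10):
    """Average the time of forward_fn alone, excluding image load and
    pre-processing. Warmup runs are discarded to avoid one-off setup costs."""
    for _ in range(warmup):
        forward_fn()
    start = time.perf_counter()
    for _ in range(runs):
        forward_fn()
    return (time.perf_counter() - start) / runs

# Stand-in workload; replace with e.g. lambda: net.forward() for Caffe.
avg = time_inference(lambda: sum(range(100000)))
print("avg inference: %.2f ms" % (avg * 1e3))
```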

As for pelee.prototxt, yes, it is the same as deploy.prototxt. "deploy.prototxt" is too generic and I get confused too easily!

@shudct

shudct commented May 23, 2018

@simon-rob
I notice that when testing, img = img * 0.017. Can you explain why?

@simon-rob (Author)

@shudct

That code is normalising the inputs in the same way that the author trained the network.

See https://www.coursera.org/learn/deep-neural-network/lecture/lXv6U/normalizing-inputs for a mathematical explanation.

Have you tried taking the code out to see what happens?

@shudct

shudct commented Jun 21, 2018

@simon-rob I have tested without img = img * 0.017, and the result is completely wrong. Usually, code normalizes the input by 1/255, so I'm confused about the exact meaning of 0.017.

@simon-rob (Author)

It is scaling the input, as described in the video, with the same scaling that Robert used during training. Have a look at the scale parameter in train_merged.prototxt:

transform_param {
  scale: 0.0170000009239
  mirror: true
  mean_value: 103.940002441
  mean_value: 116.779998779
  mean_value: 123.680000305
  resize_param {
    prob: 1.0
    resize_mode: WARP
    height: 304
    width: 304
    interp_mode: LINEAR
    interp_mode: AREA
    interp_mode: NEAREST
    interp_mode: CUBIC
    interp_mode: LANCZOS4
  }
}
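To make the arithmetic concrete: the transform subtracts the per-channel BGR mean and then multiplies by scale. Note that 1/0.017 is about 58.8, which is close to the commonly quoted standard deviation of ImageNet pixel values, so the whole transform behaves roughly like (x - mean) / std. That reading of why 0.017 was chosen is an assumption on my part, not something stated in the prototxt:

```python
import numpy as np

# Per-channel BGR means and scale taken from transform_param above.
mean = np.array([103.94, 116.78, 123.68], dtype=np.float32)
scale = 0.017

# Extreme pixel values show the range the network actually sees.
pixels = np.array([[0.0, 0.0, 0.0],
                   [255.0, 255.0, 255.0]], dtype=np.float32)
normalized = (pixels - mean) * scale

print(round(1.0 / scale, 1))                # 58.8
print(normalized.min(), normalized.max())   # roughly -2.10 and 2.57
```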
