
GSoC'22 Multi-tasking computer vision model: object detection, object segmentation and human pose detection #76

Closed
wants to merge 20 commits

Conversation


@obtx obtx commented Jul 29, 2022

No description provided.

@fengyuentau fengyuentau changed the title from "New branch" to "GSoC'22 Multi-tasking computer vision model: object detection, object segmentation and human pose detection" on Aug 2, 2022
@fengyuentau

Please read the comments below carefully:

  1. The pull request multitask-centernet #74 will be closed. Update your code in this pull request instead.
  2. Please use Git properly. You need to use git-lfs to push your model to this pull request.
  3. Please read the Contribution Guidelines. Based on what you have in this pull request, several things are done the wrong way: the model directory is in the wrong location, the model is not named properly, and there is a lot of unrelated code. Please read the guidelines and learn from previous pull requests.

@fengyuentau fengyuentau self-assigned this Aug 8, 2022
@fengyuentau fengyuentau mentioned this pull request Aug 8, 2022

@fengyuentau fengyuentau left a comment


You need to move 模型/multitask_center/README.md to models/multitask_centernet and remove 模型/multitask_center.

By the way, please use git in a terminal instead of the web interface.

Also, please see my comments below.

models/multitask_centernet/demo.py (outdated, resolved)
@fengyuentau fengyuentau added the GSoC (Google Summer of Code project related) label Aug 8, 2022

@fengyuentau fengyuentau left a comment


Could you invite me as a collaborator in your fork of opencv_zoo? I will try to upload the model for you on my side.

models/multitask_centernet/LICENSE (outdated, resolved)
models/multitask_centernet/demo.py (outdated, resolved)
Comment on lines 14 to 30
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument('--imgpath', type=str, default='images/d2645891.jpg', help="image path")
parser.add_argument('--modelpath', type=str, default='MCN.onnx')
args = parser.parse_args()

mcn = MCN(args.modelpath)
srcimg = cv2.imread(args.imgpath)
srcimg = mcn.detect(srcimg)
cv2.imwrite('result.png', srcimg)


# winName = 'using MCN in OpenCV'
# cv2.namedWindow(winName, 0)
# cv2.imshow(winName, srcimg)
# cv2.waitKey(0)
# cv2.destroyAllWindows()

A demo that takes a webcam stream as input should be provided as well. You can find an example here.
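For reference, a minimal sketch of what such a webcam demo could look like, assuming the MCN class and its detect method from this pull request can be imported from multitask_centernet.py (the argument names and window title below are illustrative only):

import argparse

import cv2

from multitask_centernet import MCN

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument('--modelpath', type=str, default='MCN.onnx')
    args = parser.parse_args()

    mcn = MCN(args.modelpath)

    cap = cv2.VideoCapture(0)  # default webcam
    while cv2.waitKey(1) < 0:
        has_frame, frame = cap.read()
        if not has_frame:
            break
        frame = mcn.detect(frame)
        cv2.imshow('MCN demo', frame)
    cap.release()
    cv2.destroyAllWindows()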

models/multitask_centernet/multitask_centernet.py (outdated, resolved)
Comment on lines +176 to +179
img, newh, neww, padh, padw = self.resize_image(srcimg)
blob = cv2.dnn.blobFromImage(img, scalefactor=1 / 255.0, swapRB=True)
# blob = cv2.dnn.blobFromImage(self.preprocess(img))
# Sets the input to the network

Move these lines into preprocess and call preprocess.
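For example, a preprocess method could look roughly like this, reusing the two lines quoted above (the return values are only a suggestion, not a required signature):

def preprocess(self, srcimg):
    # Resize and pad the input image, then convert it to a network blob.
    img, newh, neww, padh, padw = self.resize_image(srcimg)
    blob = cv2.dnn.blobFromImage(img, scalefactor=1 / 255.0, swapRB=True)
    # Return the padding information as well, since postprocess needs it
    # to map boxes back to the original image.
    return blob, (newh, neww, padh, padw)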

Comment on lines +185 to +202
# inference output
row_ind = 0
for i in range(self.nl):
    h, w = int(self.inpHeight / self.stride[i]), int(self.inpWidth / self.stride[i])
    length = int(self.na * h * w)
    if self.grid[i].shape[2:4] != (h, w):
        self.grid[i] = self._make_grid(w, h)

    outs[row_ind:row_ind + length, 0:2] = (outs[row_ind:row_ind + length, 0:2] * 2. - 0.5 + np.tile(
        self.grid[i], (self.na, 1))) * int(self.stride[i])
    outs[row_ind:row_ind + length, 2:4] = (outs[row_ind:row_ind + length, 2:4] * 2) ** 2 * np.repeat(
        self.anchor_grid[i], h * w, axis=0)

    self.num_coords = outs.shape[1] - self.last_ind
    outs[row_ind:row_ind + length, self.last_ind:] = outs[row_ind:row_ind + length, self.last_ind:] * 4. - 2.
    outs[row_ind:row_ind + length, self.last_ind:] *= np.tile(np.repeat(self.anchor_grid[i], h * w, axis=0), (1, self.num_coords // 2))
    outs[row_ind:row_ind + length, self.last_ind:] += np.tile(np.tile(self.grid[i], (self.na, 1)) * int(self.stride[i]), (1, self.num_coords // 2))
    row_ind += length

Move these lines into postprocess.
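A possible shape for that refactor, as a sketch only (`_decode_grid_outputs` is a hypothetical helper name; its body would be the loop quoted above, moved verbatim out of detect):

def _decode_grid_outputs(self, outs):
    # The stride/anchor decoding loop from detect(), moved here unchanged.
    ...
    return outs

def postprocess(self, srcimg, outs, padsize):
    outs = self._decode_grid_outputs(outs)
    # The existing NMS, box scaling and drawing steps follow here.
    ...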

        cv2.putText(frame, label, (left, top - 10), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), thickness=1)
        return frame

    def detect(self, srcimg):

Rename this function to infer and keep it simple like the following:

def infer(self, image):
    input_blob = self.preprocess(image)

    self.model.setInput(input_blob)
    output_blob = self.model.forward(self.model.getUnconnectedOutLayersNames())

    results = self.postprocess(output_blob)
    return results

Here is another example for reference.


@fengyuentau fengyuentau left a comment


Please make the demo work as soon as possible. I got the following file-missing error:

Traceback (most recent call last):
  File "/Some/Path/opencv_zoo/models/multitask_centernet/demo.py", line 20, in <module>
    mcn = MCN(args.modelpath)
  File "/Some/Path/opencv_zoo/models/multitask_centernet/multitask_centernet.py", line 13, in __init__
    with open('crowd_class.names', 'rt') as f:
FileNotFoundError: [Errno 2] No such file or directory: 'crowd_class.names'
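Two things seem needed here: crowd_class.names has to be committed together with the model files, and it should be opened relative to the module rather than the current working directory so the demo runs from any location. A sketch of the latter (the path handling and the attribute name self.classes below are suggestions, not the existing code):

import os

class MCN:
    def __init__(self, modelpath):
        # Resolve the class-names file next to this module instead of
        # relying on the current working directory.
        names_path = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                                  'crowd_class.names')
        with open(names_path, 'rt') as f:
            self.classes = f.read().rstrip('\n').split('\n')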

Comment on lines +109 to +112
person_indices = cv2.dnn.NMSBoxes(person_boxes, person_confidences, config['person_conf_thres'],
                                  config['person_iou_thres']).flatten()
kp_indices = cv2.dnn.NMSBoxes(kp_boxes, kp_confidences, config['kp_conf_thres'],
                              config['kp_iou_thres']).flatten()

NMSBoxes returns an empty tuple if no person is detected. Calling flatten on an empty tuple triggers an error:

Traceback (most recent call last):
  File "/path/opencv_zoo/models/multitask_centernet/demo.py", line 14, in <module>
    srcimg = mcn.detect(srcimg)
  File "/path/opencv_zoo/models/multitask_centernet/multitask_centernet.py", line 207, in detect
    srcimg = self.postprocess(srcimg, outs, padsize=(newh, neww, padh, padw))
  File "/path/opencv_zoo/models/multitask_centernet/multitask_centernet.py", line 112, in postprocess
    kp_indices = cv2.dnn.NMSBoxes(kp_boxes, kp_confidences, config['kp_conf_thres'],
AttributeError: 'tuple' object has no attribute 'flatten'

Please double-check with images that contain no person, and even with images that contain no objects at all.
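A minimal way to guard against that, as a sketch (the helper name _nms_flatten is hypothetical; np is NumPy, which the module already uses):

def _nms_flatten(boxes, confidences, conf_thres, iou_thres):
    # cv2.dnn.NMSBoxes returns an empty tuple when no box survives, so
    # normalize the result to a flat integer index array in both cases.
    indices = cv2.dnn.NMSBoxes(boxes, confidences, conf_thres, iou_thres)
    if len(indices) == 0:
        return np.array([], dtype=int)
    return np.asarray(indices).flatten()

person_indices = _nms_flatten(person_boxes, person_confidences,
                              config['person_conf_thres'], config['person_iou_thres'])
kp_indices = _nms_flatten(kp_boxes, kp_confidences,
                          config['kp_conf_thres'], config['kp_iou_thres'])

With this, the downstream loops over person_indices and kp_indices simply do nothing when the arrays are empty.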

@fengyuentau

You need to properly handle the case where there is no person in the image. Currently your script does not produce any boxes if there is no person in the image, such as the one below.

[attached example image: test5]
