Confusion about input and label #12
The input (weights) and the output are not the same. The output considers only one person, so every channel has at most one peak. The input covers the overlap of several people, so each channel may have several peaks.
@xiongzihua Thank you for the clarification. @xiaoxiaoSummer In addition to what is mentioned above:
This repo includes the last part of MultiPoseNet, in other words the Pose Residual Network. For more info, have a look at the #4 comment. A sample of input-output pairs: the input has 17 channels, and each channel represents one keypoint. The background image is shown for better understanding. The first channel (nose) has three different Gaussian peaks because there are three human noses inside the bounding box. The output has 17 channels, the same as the input. In the example, there is only one Gaussian peak in the first channel, and it belongs to the bounding-box owner. Feel free to ask further questions; otherwise, please close this issue.
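The input/output contrast described above can be sketched in a few lines. This is a minimal toy construction, not the repo's code: the crop size, sigma, and the `gaussian_peak` helper are assumptions made for illustration.

```python
import numpy as np

def gaussian_peak(heatmap, cx, cy, sigma=2.0):
    """Stamp a 2D Gaussian centered at (cx, cy) onto the heatmap (hypothetical helper)."""
    h, w = heatmap.shape
    ys, xs = np.mgrid[0:h, 0:w]
    g = np.exp(-((xs - cx) ** 2 + (ys - cy) ** 2) / (2 * sigma ** 2))
    np.maximum(heatmap, g, out=heatmap)  # keep the stronger value where peaks overlap
    return heatmap

H, W, K = 56, 36, 17            # assumed crop size; 17 COCO keypoint channels
x_input = np.zeros((K, H, W))   # multi-person input: several peaks per channel
y_target = np.zeros((K, H, W))  # single-person target: at most one peak per channel

# Channel 0 (nose): three people's noses fall inside this bounding box ...
for cx, cy in [(10, 12), (20, 30), (28, 45)]:
    gaussian_peak(x_input[0], cx, cy)

# ... but the target keeps only the box owner's nose.
gaussian_peak(y_target[0], 20, 30)

print(x_input[0].max(), y_target[0].max())  # → 1.0 1.0
```

The network's job is then exactly the selection step discussed in this thread: suppress the two peaks that belong to other people.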
@salihkaragoz So this network's only role is to match one person's pose out of all the pose components produced at the previous stages, i.e. keypoint detection and human detection?
Yes, exactly. PRN takes the bounding-box results and the bottom-up keypoint estimator results as input. Right now, it works with COCO ground-truth data. Theoretically, it can work with the output of any bottom-up estimator plus bounding-box results, but that doesn't work with the current repo setup; we would need to configure it.
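The matching role described above can be sketched as a residual MLP over the flattened cropped heatmaps, which is the general shape of a Pose Residual Network. This is a hedged sketch with randomly initialized weights standing in for trained parameters; the hidden size and crop dimensions are assumptions, not the repo's exact values.

```python
import numpy as np

rng = np.random.default_rng(0)
H, W, K, HIDDEN = 56, 36, 17, 64   # assumed crop size, channels, hidden width
N = K * H * W

# Random weights stand in for trained parameters (assumption, for shape only).
W1 = rng.normal(0.0, 0.01, (N, HIDDEN))
W2 = rng.normal(0.0, 0.01, (HIDDEN, N))

def prn_forward(heatmaps):
    """Residual-MLP sketch of PRN: flatten the cropped multi-person heatmaps,
    pass them through one hidden layer, and add the input back (residual),
    so the network only has to learn which peaks to suppress."""
    flat = heatmaps.reshape(-1)
    hidden = np.maximum(W1.T @ flat, 0.0)        # ReLU
    return (flat + W2.T @ hidden).reshape(K, H, W)

x = rng.random((K, H, W))   # cropped multi-person keypoint heatmaps for one box
y = prn_forward(x)
print(y.shape)              # → (17, 56, 36)
```

The residual connection is the key design choice: the single-person output is mostly a copy of the input heatmaps, so the network learns a correction rather than regenerating the peaks from scratch.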
The COCO dataloader shows the details of how the data is used in network training. But when I check the dataloader function, the weights and the output are actually the same (dataloader.py, lines 43-64 and lines 75-95). The input is the weights variable from the gendata function. Why does the input come from the known keypoint position information rather than from the actual image data? That is definitely different from what you said in the paper. Does it mean that you use the known label to predict the known label? Does that make sense? If I have some misunderstanding of the code, please let me know.
for j in range(17):                        # one channel per COCO keypoint
    if kpv[j] > 0:                         # keypoint is annotated/visible
        x0 = int((kpx[j] - x) * x_scale)   # shift into the box, rescale to crop width
        y0 = int((kpy[j] - y) * y_scale)   # shift into the box, rescale to crop height
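For context, the quoted loop can be completed into a full label-rendering step: scale each visible keypoint into the bounding-box crop, then stamp a Gaussian peak on its channel. This is a hedged reconstruction; the function name, crop size, sigma, and the toy call at the bottom are assumptions, not the repo's exact code.

```python
import numpy as np

def render_single_person_label(kpx, kpy, kpv, box, out_hw=(56, 36), sigma=2.0):
    """Hypothetical sketch of what the quoted loop feeds into: one 17-channel
    heatmap for a single person, cropped and rescaled to the box."""
    x, y, w, h = box
    out_h, out_w = out_hw
    x_scale, y_scale = out_w / w, out_h / h
    label = np.zeros((17, out_h, out_w))
    ys, xs = np.mgrid[0:out_h, 0:out_w]
    for j in range(17):
        if kpv[j] > 0:                        # keypoint annotated/visible
            x0 = int((kpx[j] - x) * x_scale)  # box-relative, rescaled
            y0 = int((kpy[j] - y) * y_scale)
            if 0 <= x0 < out_w and 0 <= y0 < out_h:
                g = np.exp(-((xs - x0) ** 2 + (ys - y0) ** 2) / (2 * sigma ** 2))
                np.maximum(label[j], g, out=label[j])
    return label

# Toy example: one visible nose at image coords (120, 210) inside box
# (x=100, y=200, w=72, h=112); it lands at crop coords (10, 5).
lab = render_single_person_label(
    kpx=[120.0] + [0.0] * 16, kpy=[210.0] + [0.0] * 16,
    kpv=[2] + [0] * 16, box=(100.0, 200.0, 72.0, 112.0))
print(lab[0].max())  # → 1.0
```

This is why the loader works from keypoint annotations rather than image pixels: PRN's input and target are both keypoint heatmaps, with the input additionally containing every other person's peaks that fall inside the box.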