Extending towards HRNet as 2D joint detector #4
Hello! Any source of 2D joints can be used "out of the box" as long as it provides the required joints (the full list is given further down in this thread). Right now the target use case is a real-time demo of a single person, but thanks to the fast-enough evaluation speed of the 2D-to-3D part, multiple persons could be handled by running it iteratively for every detected skeleton (the framerate should be fine for 1-3 persons, degrading gradually after that).

The easiest way to do a conversion from an arbitrary 2D joint estimator is, I think, the CSV file format -> https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/master/dataset/sample.csv
If the output is dumped to a CSV file using this format, it can be tested very quickly through MocapNET using:

./MocapNETJSON --from YourDataset.csv --visualize

The CSV file format is very easy to write and parse (especially from Python). The only caveat and possible pitfall is that the CSV file holds normalized coordinates that are expected to have a 1.777 (16:9) aspect ratio, since the original cameras I am targeting are GoPro cameras configured for 1920x1080@120fps+. If you have a different video input resolution, the normalization step will have to respect this aspect ratio. Of course, the code I use to preserve the aspect ratio regardless of input is included in the repository and can be used for reference: https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/master/MocapNETLib/jsonMocapNETHelpers.cpp#L498 together with the normalizeWhileAlsoMatchingTrainingAspectRatio call: https://github.com/FORTH-ModelBasedTracker/MocapNET/blob/master/MocapNETLib/jsonMocapNETHelpers.cpp#L174

That being said, I will clone HRNET and try it out :)
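For reference, here is a minimal Python sketch of what such a conversion step could look like. Everything in it is an assumption on my part: the letterbox-style normalization is only a guess at what normalizeWhileAlsoMatchingTrainingAspectRatio does (the linked C++ code is authoritative), and the joint order and header row must be copied from dataset/sample.csv rather than invented.

```python
import csv

TRAIN_ASPECT = 1920.0 / 1080.0  # ~1.777, the aspect ratio the network was trained on

def normalize_keypoint(x, y, width, height):
    """Embed the frame in a virtual 16:9 canvas, then normalize to [0, 1].

    NOTE: this is a guess at an aspect-preserving normalization; the repo's
    normalizeWhileAlsoMatchingTrainingAspectRatio is the authoritative version.
    """
    if width / height >= TRAIN_ASPECT:
        canvas_w, canvas_h = width, width / TRAIN_ASPECT    # pad height
    else:
        canvas_w, canvas_h = height * TRAIN_ASPECT, height  # pad width
    # Center the original frame inside the virtual canvas
    off_x = (canvas_w - width) / 2.0
    off_y = (canvas_h - height) / 2.0
    return (x + off_x) / canvas_w, (y + off_y) / canvas_h

def dump_csv(path, header, frames, width, height):
    """frames: one list per video frame of (x_px, y_px, visibility) triplets,
    already arranged in the joint order of dataset/sample.csv."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)  # copy the header row verbatim from sample.csv
        for triplets in frames:
            row = []
            for (x, y, v) in triplets:
                nx, ny = normalize_keypoint(x, y, width, height)
                row += [nx, ny, v]
            writer.writerow(row)
```

The key point is that coordinates are divided by the dimensions of a virtual 16:9 canvas rather than the raw frame size, so a 4:3 or vertical video still produces coordinates consistent with the 1.777 training aspect ratio.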
Thank you for the reply. I am trying out different permutations as well. Not really a pro, learning by trying. I must mention, with the …
The last version of yolo I had checked out was yolov2, and only for detection of objects, not persons. In any case, testing with hrnet would initially be more of an offline experiment, especially since hrnet is Python/PyTorch while this repo is C++/TensorFlow.
Yes, I totally agree; as a start it should be done on locally saved videos. If I understand correctly (I might be wrong), you need the joints in a specific order?
Yes, you need at least the hip, neck, head, rshoulder, relbow, rhand, lshoulder, lelbow, lhand, rhip, rknee, rfoot, lhip, lknee and lfoot joint 2D positions, organized as 2DXhip, 2DYhip, Vhip, ... where V is a visibility flag that is 1 when the joint is visible and 0 when the joint is invisible. The sample CSV file shows the full joint list received from the OpenPose Body+Hands 2D output. The full input list has 171 elements (57 triplets of X2D, Y2D, VisibilityFlag). By populating an std::vector with these 171 values in the correct order and running the runMocapNET call, you get back another vector with the full-body BVH configuration, which needs no inverse kinematics and can be directly used to animate a model. This can also be visualized from the main application, of course.
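To make the joint bookkeeping concrete, here is a rough Python sketch mapping HRNet's COCO-17 keypoints onto the minimal joint list above. This is a hypothetical mapping, not code from this repository: COCO has no neck, hip-center, or head joint, so the neck and hip are synthesized as midpoints, the head is approximated by the nose, and wrists/ankles stand in for hands/feet. The exact joint order for the full 171-element vector must still be taken from the sample CSV.

```python
import numpy as np

# Standard COCO-17 keypoint indices (the ordering HRNet models are trained on)
NOSE, LEYE, REYE, LEAR, REAR = 0, 1, 2, 3, 4
LSHO, RSHO, LELB, RELB, LWRI, RWRI = 5, 6, 7, 8, 9, 10
LHIP, RHIP, LKNE, RKNE, LANK, RANK = 11, 12, 13, 14, 15, 16

def coco_to_mocapnet_joints(kp, conf_thresh=0.3):
    """kp: (17, 3) NumPy array of (x, y, confidence) for one person.

    Returns joint name -> (x, y, visibility) for the minimal joint list
    above. Neck and hip are synthesized as midpoints and the head is taken
    from the nose -- approximations, since COCO lacks those joints."""
    def vis(*ids):
        # Treat a joint as visible only if all source keypoints are confident
        return 1 if all(kp[i, 2] > conf_thresh for i in ids) else 0

    def mid(a, b):
        return (kp[a, :2] + kp[b, :2]) / 2.0

    return {
        "hip":       (*mid(LHIP, RHIP), vis(LHIP, RHIP)),
        "neck":      (*mid(LSHO, RSHO), vis(LSHO, RSHO)),
        "head":      (*kp[NOSE, :2], vis(NOSE)),
        "rshoulder": (*kp[RSHO, :2], vis(RSHO)),
        "relbow":    (*kp[RELB, :2], vis(RELB)),
        "rhand":     (*kp[RWRI, :2], vis(RWRI)),  # wrist stands in for hand
        "lshoulder": (*kp[LSHO, :2], vis(LSHO)),
        "lelbow":    (*kp[LELB, :2], vis(LELB)),
        "lhand":     (*kp[LWRI, :2], vis(LWRI)),  # wrist stands in for hand
        "rhip":      (*kp[RHIP, :2], vis(RHIP)),
        "rknee":     (*kp[RKNE, :2], vis(RKNE)),
        "rfoot":     (*kp[RANK, :2], vis(RANK)),  # ankle stands in for foot
        "lhip":      (*kp[LHIP, :2], vis(LHIP)),
        "lknee":     (*kp[LKNE, :2], vis(LKNE)),
        "lfoot":     (*kp[LANK, :2], vis(LANK)),  # ankle stands in for foot
    }

if __name__ == "__main__":
    kp = np.zeros((17, 3))  # stand-in for one person's 2D detector output
    print(coco_to_mocapnet_joints(kp))
```

For the triplets of the 171-element input that a COCO-based detector cannot supply (for example the OpenPose hand keypoints), filling in 0, 0 with a visibility flag of 0 would seem consistent with the format described above, though I have not verified that against the code.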
Thanks a lot for the information. I still couldn't manage to extend it to simpleHRNET as the 2D detector. I am doing everything offline at the moment.
Hello, if you have a small sample CSV file you generated (like this one), I can take a look at it and maybe help you resolve the problem.
I have given a CSV file example that can be used to package any 2D estimator's output and enable its processing by MocapNET; adding native support for multiple 2D estimators is beyond the scope of this repository, so I am closing this issue! :)
Hi, first of all great work. I was wondering if it could be extended to HRNet, as it is supposed to be highly accurate? Here is an implementation of it. I think it is possible to dump the JSON file per frame for the keypoints. It is based on COCO keypoints. Link to the repo: simpleHRNET. There is a demo script here: demo_script. The keypoints are output here: keypoints. The keypoint array is of type Nx17x3, where N is the number of persons. Please let me know what you think about it?
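Since the demo script already yields an Nx17x3 array per frame, a per-frame JSON dump like the one suggested could look roughly like the sketch below. Everything here is an assumption: the file layout and key names are made up, and the (x, y, confidence) ordering of the last axis should be verified against the simpleHRNET code, since some implementations emit (y, x, confidence) instead.

```python
import json
import os

def dump_frame_json(frame_idx, keypoints, out_dir="keypoints_json"):
    """Write one JSON file per video frame.

    keypoints: an (N, 17, 3) array-like, N = number of detected persons.
    Each keypoint row is assumed to be (x, y, confidence); verify the axis
    order against the 2D detector's output before relying on it."""
    frame = {
        "frame": frame_idx,
        "people": [
            [{"x": float(x), "y": float(y), "c": float(c)} for (x, y, c) in person]
            for person in keypoints
        ],
    }
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, "frame_%06d.json" % frame_idx), "w") as f:
        json.dump(frame, f)
```

Alternatively, the same per-frame arrays could be funneled straight into the CSV format discussed in the replies above, which avoids a JSON parsing step on the MocapNET side.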