
How to use my own dataset? #25

Closed
zqq-judy opened this issue Oct 17, 2020 · 18 comments

Comments

@zqq-judy

Hello!
I want to use other datasets for some of my own tasks (not tasks related to hands or bodies). How do I use my own dataset? What preparation is needed? Can you give specific guidance?
Thank you in advance for your reply.

@mks0601
Owner

mks0601 commented Oct 17, 2020

What you should do is just add another data loading module.
Add data/YOUR_DB/YOUR_DB.py and write code similar to data/any_db/any_db.py.
Then, open main/config.py and set trainset_3d=['YOUR_DB'] or trainset_2d=['YOUR_DB'].
I can't give you more detailed guidance because I know nothing about your DB :(
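
For concreteness, here is a minimal sketch of what such a loader could look like, assuming the same general interface as the existing data/*/ modules (a torch Dataset whose load_data() reads a COCO-format annotation file). The paths, file names, and annotation fields below are illustrative assumptions, not the repo's exact ones.

# data/YOUR_DB/YOUR_DB.py -- minimal sketch modeled on the existing loaders;
# paths, file names, and annotation fields are illustrative assumptions.
import os.path as osp
import numpy as np
import torch
from pycocotools.coco import COCO

class YOUR_DB(torch.utils.data.Dataset):
    def __init__(self, transform, data_split):
        self.transform = transform
        self.data_split = data_split  # 'train' or 'test'
        self.img_dir = osp.join('..', 'data', 'YOUR_DB', 'images')
        self.annot_path = osp.join('..', 'data', 'YOUR_DB', 'annotations',
                                   'YOUR_DB_' + data_split + '.json')  # COCO-format file
        self.datalist = self.load_data()

    def load_data(self):
        db = COCO(self.annot_path)
        datalist = []
        for aid in db.anns.keys():
            ann = db.anns[aid]
            img = db.loadImgs(ann['image_id'])[0]
            datalist.append({
                'img_path': osp.join(self.img_dir, img['file_name']),
                'bbox': np.array(ann['bbox'], dtype=np.float32),  # x, y, w, h
                'joint_img': np.array(ann['keypoints'], dtype=np.float32).reshape(-1, 3),
            })
        return datalist

    def __len__(self):
        return len(self.datalist)

    def __getitem__(self, idx):
        data = self.datalist[idx]
        # Crop to the bbox, augment, and build the inputs/targets/meta_info dicts here,
        # mirroring what the existing loaders (e.g. data/MSCOCO/MSCOCO.py) return.
        return data

After that, main/config.py just needs to point at the new loader, e.g. trainset_2d = ['YOUR_DB'] (or trainset_3d = ['YOUR_DB'] if your DB has 3D annotations).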

@zqq-judy
Author

  • Can you tell me how you dealt with inconsistent dataset formats? In other words, what preprocessing did you do on the datasets?
  • For example, the directory data/xx/xx contains files such as bbox_root_xx_output.json and xx_train.json. How should I produce these files from my own dataset? Based on the general steps you followed when preparing the datasets for this project, could you describe the overall process?
  • Maybe you could use one of the datasets in this project as an example?

@mks0601
Owner

mks0601 commented Oct 17, 2020

I preprocessed all datasets into the MSCOCO format. You can refer to the MSCOCO site and the annotation format of the MSCOCO dataset.
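
As a rough illustration (not the exact schema this repo expects for every dataset), a COCO-style keypoint annotation file could be written like this; the file name and the 17-joint layout are assumptions borrowed from MSCOCO:

# Minimal sketch of writing annotations in MSCOCO keypoint format.
import json

coco_format = {
    'images': [
        {'id': 1, 'file_name': '000001.jpg', 'width': 640, 'height': 480},
    ],
    'annotations': [
        {'id': 1, 'image_id': 1, 'category_id': 1,
         'bbox': [100.0, 120.0, 200.0, 300.0],   # x, y, w, h
         'keypoints': [150.0, 200.0, 2] * 17,    # flattened (x, y, visibility) per joint
         'num_keypoints': 17},
    ],
    'categories': [
        {'id': 1, 'name': 'person'},
    ],
}

with open('YOUR_DB_train.json', 'w') as f:  # hypothetical file name
    json.dump(coco_format, f)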

@zqq-judy
Author

Hello, I have one more question.
How did you get these files, such as J_regressor_h36m_correct.npy and J_regressor_coco_hip_smpl.npy?

@mks0601
Owner

mks0601 commented Oct 31, 2020

The h36m one is from here, and the coco one is derived from the SMPL joint regressor.

@wangzheallen

wangzheallen commented Dec 6, 2020

  1. Do we need a dataset-specific regressor like the H36M one ('J_regressor_h36m_correct.npy') for our own dataset?
  2. I saw COCO has 'J_regressor_coco_hip_smpl.npy' and 'coco_smplifyx_train.json'. Does the pseudo GT on COCO help performance and convergence? What happens if we disable the loss for a 2D dataset like COCO?
  3. How did you get the J-regressor for the COCO dataset? Would it be the same if we want to fit the MPII 2D dataset?

@mks0601
Owner

mks0601 commented Dec 7, 2020

  1. The dataset-specific regressor is used at the evaluation stage. The most commonly used evaluation metric is the 3D distance between the GT 3D joint coordinates and the 3D joint coordinates derived from the predicted mesh. The latter are obtained by multiplying the predicted mesh vertices by the dataset-specific joint regressor (see the sketch after this list).

  2. A 2D dataset like COCO is necessary because images in multi-view datasets (e.g., H36M) have a very different appearance from in-the-wild images. Your model may fail to generalize if you train it only on multi-view datasets.

  3. The definition of some joints, especially the hips, can differ between datasets. You can somehow interpolate the joint regressor provided with SMPL.
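
As a sketch of point 1, assuming the usual SMPL mesh with 6890 vertices and a 17-joint H36M regressor, the evaluation-time joint regression is just a matrix product:

# Minimal sketch: regressing dataset-specific joints from a predicted mesh.
import numpy as np

J_regressor = np.load('J_regressor_h36m_correct.npy')  # assumed shape (17, 6890)
mesh_vertices = np.random.rand(6890, 3)                 # placeholder for the predicted mesh

joints_3d = J_regressor @ mesh_vertices                 # (17, 3) H36M-style joints
# The evaluation metric is then the 3D distance (e.g. MPJPE) between joints_3d
# and the GT 3D joints, typically after root-joint alignment.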

@zqq-judy
Author

zqq-judy commented Dec 7, 2020

Hello!
In config.py, what does this parameter (bbox_3d_size) mean?
bbox_3d_size = 2

smpl_coord_img[:, 2] = (smpl_coord_img[:, 2] / (cfg.bbox_3d_size / 2) + 1) / 2. * cfg.output_hm_shape[0]
Thanks!

@mks0601
Owner

mks0601 commented Dec 7, 2020

smpl_coord_img[:,2] is a root joint-relative depth value in meters.
To convert it to the 0~64 range (heatmap coordinates), I need to divide the depth value by its pre-defined max value (bbox_3d_size).
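
A small worked example of that line, assuming cfg.bbox_3d_size = 2 (meters) and cfg.output_hm_shape[0] = 64:

bbox_3d_size = 2      # pre-defined max depth range in meters
output_hm_depth = 64  # depth resolution of the heatmap

def depth_to_heatmap(z_meter):
    # z_meter: root joint-relative depth in [-bbox_3d_size/2, +bbox_3d_size/2]
    return (z_meter / (bbox_3d_size / 2) + 1) / 2. * output_hm_depth

print(depth_to_heatmap(-1.0))  # 0.0  -> nearest allowed depth
print(depth_to_heatmap(0.0))   # 32.0 -> the root joint's depth maps to the heatmap center
print(depth_to_heatmap(1.0))   # 64.0 -> farthest allowed depth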

@zqq-judy
Author

Hello, I still have some questions.

Training produces 7 losses: loss['joint_fit'], loss['joint_orig'], loss['mesh_fit'], loss['mesh_joint_orig'], loss['mesh_joint_fit'], loss['mesh_normal'], loss['mesh_edge'].

I want to calculate the total loss:

  • L = L_poseNet + L_meshNet + L_vertex + λ·L_normal + L_edge

I think:

  • L_normal == loss['mesh_normal']
  • L_edge == loss['mesh_edge']

That leaves 5 losses. How should L_poseNet, L_meshNet, and L_vertex be calculated?

Thank you very much for your reply!

@mks0601
Owner

mks0601 commented Dec 11, 2020

fit means the prediction targets come from the fitted mesh (pseudo GTs obtained by running SMPLify-X on the GT 2D/3D pose of the dataset).
orig means the prediction targets are the GT 2D/3D pose of the dataset.

L_pose^posenet == loss['joint_fit'] + loss['joint_orig']
L_pose^meshnet == loss['mesh_joint_fit'] + loss['mesh_joint_orig']
L_vertex == loss['mesh_fit']
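
Putting that mapping together, a minimal sketch of the total loss (with dummy tensors standing in for the model's outputs; whether the normal-loss weight λ is already folded into loss['mesh_normal'] inside the repo is something to verify):

import torch

# Dummy per-term losses standing in for the dict returned by the model's forward pass.
loss = {k: torch.rand(1, requires_grad=True) for k in
        ['joint_fit', 'joint_orig', 'mesh_fit', 'mesh_joint_orig',
         'mesh_joint_fit', 'mesh_normal', 'mesh_edge']}

lambda_normal = 0.1  # hypothetical weight; check the paper/config for the value actually used

total_loss = (loss['joint_fit'] + loss['joint_orig']               # L_pose^PoseNet
              + loss['mesh_joint_fit'] + loss['mesh_joint_orig']   # L_pose^MeshNet
              + loss['mesh_fit']                                    # L_vertex
              + lambda_normal * loss['mesh_normal']                 # λ · L_normal
              + loss['mesh_edge'])                                  # L_edge
total_loss.backward()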

@wangzheallen

The rotation loss is defined on the 6D representation: https://github.com/mks0601/I2L-MeshNet_RELEASE/blob/master/common/nets/module.py#L125
If we try to work on our own dataset, how do we get the 6D representation? Can you provide the code for the conversion?
Is the 6D representation the one from https://arxiv.org/abs/1812.07035?

@mks0601
Owner

mks0601 commented Dec 14, 2020

Q. If we try to work on our own dataset, how do we get the 6D representation? Can you provide the code for converting?
A. The predicted 6D rotations are converted to 3D axis-angle, and the loss is calculated on the 3D axis-angle.

Q. Is the 6D representation https://arxiv.org/abs/1812.07035?
A. Yes
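
For reference, a minimal numpy sketch of that conversion (6D -> rotation matrix via Gram-Schmidt, as in Zhou et al., then rotation matrix -> axis-angle); this mirrors the idea rather than the repo's exact implementation, and ignores the angle ≈ π edge case:

import numpy as np

def rot6d_to_rotmat(x6d):
    # First two columns of a rotation matrix, orthonormalized by Gram-Schmidt.
    a1, a2 = x6d[:3], x6d[3:]
    b1 = a1 / np.linalg.norm(a1)
    b2 = a2 - np.dot(b1, a2) * b1
    b2 = b2 / np.linalg.norm(b2)
    b3 = np.cross(b1, b2)
    return np.stack([b1, b2, b3], axis=1)  # columns form the orthonormal basis

def rotmat_to_axis_angle(R):
    angle = np.arccos(np.clip((np.trace(R) - 1) / 2, -1.0, 1.0))
    if np.isclose(angle, 0.0):
        return np.zeros(3)
    axis = np.array([R[2, 1] - R[1, 2], R[0, 2] - R[2, 0], R[1, 0] - R[0, 1]]) / (2 * np.sin(angle))
    return axis * angle

print(rotmat_to_axis_angle(rot6d_to_rotmat(np.array([1., 0., 0., 0., 1., 0.]))))  # identity -> [0. 0. 0.]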

@zqq-judy
Author

I used nine photos from the MSCOCO dataset for training, but the final output out['mesh_coord_img'], after visualization, is just a jumble of 3D points. After vis_mesh(), there is only a single red point on the photo, not a human body mesh. The training results on the 9 photos are incorrect. Why?

Using input.jpg from the demo, the test result is as follows:
[screenshot of the rendered output]

@mks0601
Owner

mks0601 commented Dec 14, 2020

I cannot understand your question. Did you use only 9 images for training? Why? How did you use only those images?

@zqq-judy
Author

Yes, I selected nine pictures from MSCOCO train2017 for training:
000000391895.jpg,
000000522418.jpg,
000000184613.jpg,
000000318219.jpg,
000000554625.jpg,
000000574769.jpg,
000000060623.jpg,
000000005802.jpg,
000000222564.jpg
I did this because when I trained on my own dataset with 10 photos, out['mesh_coord_img'] came out as a jumble of mesh points; so I also trained on these 9 photos from your original MSCOCO dataset, and the results are likewise messy points.

With the model trained on the 9 photos, I tested input.jpg; the visualization of out['mesh_coord_img'] is as follows:
[screenshot of the visualization]

@mks0601
Owner

mks0601 commented Dec 14, 2020

First of all, using only 9 images with the provided learning schedule (13 epochs) will absolutely leave the model unconverged.
In general, at least thousands of images are required to train deep neural networks.

@mks0601 mks0601 closed this as completed Mar 29, 2021
@MhmdAliChamass99

@zqq-judy do you remember how you got these files (bbox_root_xx_output.json, xx_train.json)? I need them for my dataset.
