Question on coordinate frames for pose data #11
Yes, it is in the camera frame. In this particular case, the transformation is the relative pose between the two cameras (as opposed to the absolute, world-to-camera pose).
Oh yeah, of course. Thank you!
@alexklwong is the absolute pose stored in the VOID dataset the pose of the world frame in the camera frame, or the pose of the camera in the world frame? That is, in the first case, Pose * X_world = X_camera, taking homogeneous points and the pose as an affine transform. I apologize for the repetitive nature of the question, but the naming convention always confuses me, especially when it says 'absolute'.
So absolute pose in our case refers to the transformation from camera-frame to world-frame coordinates. Following the notation in the above screenshot, g_{\tau t} refers to the transformation from time t to time \tau in the camera frame, but in this context it would be written as g_{world t}, where g takes us from the camera frame at time t to the world frame.
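Concretely, under this camera-to-world convention, the relative pose g_{\tau t} falls out by composing two absolute poses. A minimal numpy sketch with made-up translations (not actual VOID data):

```python
import numpy as np

def make_pose(R, t):
    """Build a 4x4 homogeneous camera-to-world pose g_{world, t}."""
    g = np.eye(4)
    g[:3, :3] = R
    g[:3, 3] = t
    return g

# Hypothetical absolute (camera-to-world) poses at times t and tau:
# camera moves 1 unit along x between the two frames.
g_w_t = make_pose(np.eye(3), np.array([1.0, 0.0, 0.0]))
g_w_tau = make_pose(np.eye(3), np.array([2.0, 0.0, 0.0]))

# Relative pose g_{tau, t}: camera frame at t -> camera frame at tau.
g_tau_t = np.linalg.inv(g_w_tau) @ g_w_t

# The origin of camera t, expressed in camera tau's frame:
x_t = np.array([0.0, 0.0, 0.0, 1.0])
x_tau = g_tau_t @ x_t  # -> (-1, 0, 0): camera t sits 1 unit behind camera tau in x
```

This is the usual relation g_{\tau t} = g_{world \tau}^{-1} g_{world t}; plugging VOID's absolute poses into it should reproduce the relative poses the training code expects.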
Thank you for the confirmation. I assumed the above and trained the network with the absolute poses in the VOID dataset, but the evaluation results were actually worse than with PoseNet. Attaching the results.txt file from the training in case it is of any interest to you.
Looks like the opposite, right? Using absolute pose is better than using PoseNet (#12 (comment)): PoseNet on VOID is worse by about 5 mm (also around 10%) than using pose from VIO. These are the results from your results.txt using absolute pose,
whereas I got the following using PoseNet:
We do have internal code to test it out on VIO, but did not release that version since it is coupled to a few internal tools. Thanks,
So, I ran the validation script with:
1. The pre-trained model (with PoseNet), and
2. The best-result model from training with absolute pose data.
Evaluation results:
With the best model trained on absolute pose data:
This is what I meant when I said
Alright, I'm a bit busy this week, but I'll clean up my code a bit and submit a PR sometime next week.
Regarding your results for the pretrained model: was that the one released on the repo, or did you re-train your own? You got these numbers (incredibly good, on par with the top supervised methods: https://github.com/alexklwong/awesome-state-of-depth-completion),
but I recall that the pretrained model released should give
so if it is the same pretrained weights, then that suggests you may not have evaluated on the same data, or something was off in the evaluation script. Also, does the number you get when running https://github.com/alexklwong/calibrated-backprojection-network/blob/master/bash/void/run_kbnet_void1500.sh match the number shown during validation?
I downloaded the pre-trained model weights and ran the run_kbnet_void1500.sh script with them.
For the network trained on absolute poses, it seems the same (or very similar). Isn't the validation shown during training and the evaluation in run_kbnet_void1500.sh run on the same set of data? The split for me is 35917 training images and 534 testing images.
Ah, it looks like you are missing part of the dataset; this might be because gdown intermittently fails. The training set for VOID1500 should contain 44888 samples. You may want to download it manually from the links in
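A quick way to catch an incomplete download is to count the entries in the split files before training. A small sketch (the split-file path below is hypothetical; point it at wherever your VOID1500 split lists live):

```python
from pathlib import Path

def count_samples(split_file):
    """Count non-empty lines in a split file (one sample path per line)."""
    return sum(1 for line in Path(split_file).read_text().splitlines()
               if line.strip())

# Hypothetical usage; a complete VOID1500 training split lists 44888 samples:
# n = count_samples('void_release/void_1500/train_image.txt')
# assert n == 44888, f'incomplete download: only {n} samples'
```

If the count comes up short (e.g. the 35917 reported above), re-fetching the missing archives should bring it back to 44888.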
Hello,
In the above, the relative pose g_{\tau t} belonging to SE(3) refers to the transformation from the world frame to the camera frame, right? That is, the pose is with respect to the camera frame.
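To illustrate the convention I am asking about, here is a hypothetical numpy sketch of a world-to-camera pose (camera placed at (0, 0, 5) in the world, no rotation; all numbers made up):

```python
import numpy as np

# World-to-camera convention: pose @ X_world = X_camera.
# For a camera at world position (0, 0, 5) with identity rotation,
# the world-to-camera transform subtracts the camera position.
pose_wc = np.eye(4)
pose_wc[:3, 3] = [0.0, 0.0, -5.0]

X_world = np.array([0.0, 0.0, 5.0, 1.0])   # a point at the camera's location
X_camera = pose_wc @ X_world               # maps to the camera-frame origin
```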