How to generate scale.txt and tuples_dso_optimization_windows.txt files for custom data? #58
Excuse me, have you resolved these problems yet? I've encountered the same problems as you. I've been trying to train on other datasets, but these two problems really confuse me a lot.
Hi @louie-luo,

I haven't fully resolved these problems yet, but I believe I've made some progress, and maybe sharing that progress here will 1) help others with their own efforts or 2) help others point out mistakes in mine. I should also note that a lot of what I describe below at times required some random, usually small change in the repo's code somewhere, and I may not explicitly include that here just because I've forgotten at this point (this past week has been a blur for me). Regardless, if you run into something like that and post the error message, I should be able to help.

**The tuples_dso_optimization_windows.txt and similarly named files**

This one is a bit tricky and I definitely haven't fully resolved it yet. It should be noted that providing this file is effectively done in place of enabling and using the lines below in `tandem/cva_mvsnet/configs/default.yaml` (lines 27 to 30 at `f8816c7`).
If you didn't have the tuples file, you would enable and use those config lines instead.
On your sequences, run something like [...] from the [...]. This will generate a [...]. Your generated [...].
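In case it helps with debugging, here's how I've been reading the generated tuples files. This is a sketch only: my assumption is that each line is a whitespace-separated list of frame indices belonging to one DSO optimization window, which is just my reading of the files in the tandem_replica dataset, not a documented format.

```python
# Load a tuples file, assuming each non-empty line is a whitespace-separated
# list of integer frame indices belonging to one optimization window.
# (Assumed format -- verify against the tandem_replica files before relying on it.)
def load_tuples(path):
    tuples = []
    with open(path) as f:
        for line in f:
            if line.strip():
                tuples.append([int(tok) for tok in line.split()])
    return tuples

# e.g. windows = load_tuples("scene/tuples_dso_optimization_windows.txt")
# print(len(windows), windows[0])
```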
Follow the same steps as before, except note the last few lines of the gist I last pointed out here. You'll want [...].

**The scale.txt file**

This is a lot simpler, and I'm reasonably confident about having done it the correct way at this point, at least if you know the range of depth. Let's say you know the range of depth in your depth data (usually available somewhere in the corresponding public dataset's documentation) and it's 0 to 100mm (I'm using this as an example from the dataset I used, C3VD). In your grayscale depth map, the maximum value is 255 and is loaded as such into TANDEM, so that maximum value should correspond to 100mm (or 0.1 meters). 0.1/255 will then yield the global depth scale value that should go into `scale.txt`. Feel free to use the code below to easily do this for the first time or to change the scale value in an existing `scale.txt` file.
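A minimal sketch along those lines (the depth range is from the C3VD example above, and the sequence path is hypothetical):

```python
# Compute the global depth scale and write it to scale.txt.
# Assumes an 8-bit grayscale depth map (max value 255) whose maximum
# pixel value corresponds to the known real-world range (here 100mm = 0.1m).
import os

depth_range_m = 0.1    # known maximum depth in meters (C3VD example)
max_pixel_value = 255  # 255 for 8-bit depth maps, 65535 for 16-bit

scale = depth_range_m / max_pixel_value

sequence_dir = "path/to/your/sequence"  # hypothetical path
with open(os.path.join(sequence_dir, "scale.txt"), "w") as f:
    f.write(f"{scale}\n")

print(f"Wrote global depth scale: {scale}")
```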
If you don't know your real-world range of depth (even if it's clamped, as it is in some medical datasets), I'm not sure of the best way to calculate the global depth scale, or how the authors even calculated it. You could try searching around for some method that gives you a real-world range of depth on self-captured data. In the past I've calculated the global depth scale using ground truth camera poses (sketched below), which got me somewhat close to the depth-range method I mentioned above, but it ended up being too inaccurate.
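For reference, this is roughly the kind of pose-based estimate I mean. It's a sketch only, assuming you have ground-truth camera centers and up-to-scale estimated ones (e.g. from DSO) in matching frame order; I don't know the exact computation the authors used.

```python
import numpy as np

def global_scale_from_poses(gt_positions, est_positions):
    """Mean ratio of consecutive ground-truth baselines to the corresponding
    up-to-scale estimated baselines. Both inputs: (N, 3) arrays of camera
    centers in the same frame order."""
    gt_steps = np.diff(np.asarray(gt_positions), axis=0)
    est_steps = np.diff(np.asarray(est_positions), axis=0)
    gt_norms = np.linalg.norm(gt_steps, axis=1)
    est_norms = np.linalg.norm(est_steps, axis=1)
    mask = est_norms > 1e-8  # skip near-stationary frame pairs
    return float(np.mean(gt_norms[mask] / est_norms[mask]))
```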
Also, please take everything in my previous reply with a grain of salt, especially since I haven't quite gotten an optimal baseline (as far as I know) with the data I'm working with (colonoscopy data). I will try to do a better job of documenting my final approach and distilling the information above once I achieve that baseline.
Hi @louie-luo,

Happy to try to help. A few things:
1. See `tandem/cva_mvsnet/models/datasets.py`, line 402 at `f8816c7`.
It's possible that the range of your depth map isn't actually 0 to 255, and that you may be reading the depth incorrectly in your own code when establishing the global depth scale value. I would double-check that against the depth images in the tandem_replica dataset as well. It's possible my reasoning isn't sound and there's something more to this that isn't documented, but I've been able to get reasonable (albeit less consistent from frame to frame) depth map results thus far with a much smaller range of depth (0 to 100mm) and 16-bit depth images (maximum possible value is 65535).
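A quick way to sanity-check this (a minimal sketch using OpenCV; the file path is hypothetical):

```python
# Check the true bit depth and value range of a depth map.
# cv2.IMREAD_UNCHANGED preserves 16-bit data instead of converting it to 8-bit.
import cv2

depth = cv2.imread("path/to/depth_000000.png", cv2.IMREAD_UNCHANGED)
print("dtype:", depth.dtype)             # uint8 -> max 255, uint16 -> max 65535
print("min/max:", depth.min(), depth.max())

# The global depth scale should divide the real-world range by the maximum
# representable value, e.g. 0.1 / 65535 for 16-bit, 0-100mm data.
```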
2. See also `tandem/cva_mvsnet/models/module.py`, lines 857 to 859 at `f8816c7`.
Double-checking for any hard-coded values of importance gets even more complicated with the full TANDEM pipeline (which involves the C++ implementation), and it's something I haven't done exhaustively for that matter; a quick way to hunt for suspicious constants on the Python side is sketched below. Sorry that I can't be of much help aside from the above. I'll check back into this thread if I have any more ideas or happen to revisit aspects of the TANDEM pipeline. Personally, I was interested in consistent depth prediction with this pipeline and ended up finding some monocular methods (U-Net, DPT-Hybrid) that performed much more consistently given my data (which has fairly few frames compared to datasets like Replica, as well as significantly less camera motion).
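Here's that quick-and-dirty scan for candidate hard-coded depth constants (a sketch only; which literals actually matter is a guess based on this thread):

```python
# Flag lines containing numeric literals that are plausibly hard-coded for the
# original datasets' depth encoding (e.g. 255 for 8-bit, 65535 for 16-bit).
import re
from pathlib import Path

pattern = re.compile(r"\b(255|65535)\b")

for path in sorted(Path("tandem/cva_mvsnet").rglob("*.py")):
    for lineno, line in enumerate(path.read_text().splitlines(), start=1):
        if pattern.search(line):
            print(f"{path}:{lineno}: {line.strip()}")
```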
Hi @yahskapar,

Really, thanks for your explicit and useful comments! Firstly, I rechecked my depth map and found that it's indeed not 8-bit but 16-bit, so the range should go from 0 to 65535 (the tandem_replica dataset is the same, 16-bit). Secondly, the values that have to be changed are mainly in module.py. Besides the ones you mention above, I found three more: lines 1184, 1205, and 1221, all in module.py. If you find other hard-coded values that need changing, please tell me. Now my training is much better; the loss has dropped to a normal value. Thanks again for your help, I couldn't have trained on my data successfully without your comments!
No problem, happy to help! I'll let you know if I find anything else that needs to be changed. I'm not working with this code at the moment, but I previously had trouble with TSDF volume initialization in this project's code on my 0 to 100mm depth range data, so I will have to revisit that sometime soon.
Hi folks,
I've been able to get some interesting results using this pipeline, which I'm grateful the authors made publicly available. I have two questions that I've been scratching my head about, however:
1. How does one generate a tuples_dso_optimization_windows.txt file per sequence for a given custom dataset, similar to this file's appearance in the provided TANDEM-format Replica dataset? This file, which appears to have to do with MVS configuration, seems important: even with the ScanNet pre-trained model evaluated on the Replica dataset, there's a significant drop in performance when generating tuples without providing the aforementioned .txt file (absolute relative error goes from 0.0384 to 0.0706).
2. How were the scale.txt files generated per set of depth images for a sequence? These scale.txt files seem to contain only a single floating-point value, which is a bit confusing since I'd expect the depth scale to vary at least slightly from sequence to sequence. Currently, I've been computing the depth scale using the camera pose information and ultimately taking the mean to get a single floating-point value, but I'm not sure this is a reasonable approach given how bad my results ended up being after training on my custom dataset and testing on an eval split of it.
I'd appreciate any and all help regarding these two files if anyone has any pointers. Thanks!