Hloc fix for image_ids (camera per image) #3081

Merged
merged 10 commits into from
Apr 24, 2024

Conversation

dberga
Contributor

@dberga dberga commented Apr 16, 2024

Hloc gets only a single image_ids entry here, so it is unable to match between images and fails with a "frame_XXX not found" error. This small fix in hloc_utils.py resolves issue #3059.

I found this error when using --no_same_dimensions with disk features and disk+lightglue matching.
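
For readers following along, here is a minimal sketch of the idea behind the fix, not the actual code in hloc_utils.py: build one list of image names relative to the image directory and hand that same list to every stage, so the reconstruction step looks up exactly the images that were extracted and matched. `collect_image_names` and `IMAGE_SUFFIXES` are hypothetical names, and the hloc call is shown only as a comment; its `image_list` and `camera_mode` keywords are taken from the commit message, so check them against your installed hloc version.

```python
from pathlib import Path
from typing import List

# Assumption: the file extensions we treat as images.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def collect_image_names(image_dir: Path) -> List[str]:
    """Return image paths relative to image_dir, sorted, with POSIX separators."""
    return sorted(
        p.relative_to(image_dir).as_posix()
        for p in image_dir.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Intended use (not executed here, requires hloc and pycolmap):
#   references = collect_image_names(image_dir)
#   reconstruction.main(..., image_list=references,
#                       camera_mode=pycolmap.CameraMode.PER_IMAGE)
```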

@dberga
Contributor Author

dberga commented Apr 16, 2024

This was a dependency problem

error: Failed to download: protobuf==3.12.1
Caused by: HTTP status server error (503 Service Unavailable) for url (https://files.pythonhosted.org/packages/e0/6f/076967445a5c45d01b75b1a873e8e8c99a26fc54c279b7536eb3be726196/protobuf-3.12.1-cp38-cp38-manylinux1_x86_64.whl.metadata)

Not related to the commit.

Can you re-run the Build Core test?

@jb-ye
Collaborator

jb-ye commented Apr 16, 2024

Let me recap my understanding of the issue. Are you capturing landscape pictures and portrait pictures with the same resolution? Originally, --no_same_dimensions was intended for rotating portrait mode images by 90 degrees such that all images are of the same resolution for colmap/hloc.

To my knowledge, the current preprocessing in nerfstudio doesn't support distinct image resolutions.

@dberga
Contributor Author

dberga commented Apr 17, 2024

> Let me recap my understanding of the issue. Are you capturing landscape pictures and portrait pictures with the same resolution? Originally, --no_same_dimensions was intended for rotating portrait mode images by 90 degrees such that all images are of the same resolution for colmap/hloc.
>
> To my knowledge, the current preprocessing in nerfstudio doesn't support distinct image resolutions.

I see you are referring to this code:
5b4abc4#diff-8a4850f0b967abd3438e57444fe498b2e0b98f64800b02f1bdb76f3f0ba5b5bb. The ffmpeg implementation only rotates. Does --no-same-dimensions also work for preprocessing images at different scales? One possible way to avoid the dimensions problem would be to pad the borders so that all images keep the same resolution, right?

If the camera parameters are defined for every image (frame), the nerfstudio parser can read each camera's params independently, as in (https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/data/dataparsers/nerfstudio_dataparser.py#L83), where the bool distort_fixed detects this. The transforms.json should then look like https://github.com/InternLandMark/LandMark?tab=readme-ov-file#prepare-dataset for the multifocal case.
In colmap_to_json.py I am working on making https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/process_data/colmap_utils.py#L390 export multifocal data when there are distinct cameras. The current official implementation only accepts a single camera and raises an error otherwise: https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/process_data/colmap_utils.py#L461.
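
To make the multifocal layout concrete, here is a rough sketch of how a reader could resolve intrinsics for one frame: use the per-frame values when present and fall back to the top-level ones otherwise. It assumes the usual nerfstudio transforms.json key names (fl_x, fl_y, cx, cy, w, h); `frame_intrinsics` is a hypothetical helper, not the dataparser's actual code.

```python
from typing import Any, Dict

# Assumption: intrinsics keys as used in nerfstudio-style transforms.json files.
INTRINSIC_KEYS = ("fl_x", "fl_y", "cx", "cy", "w", "h")


def frame_intrinsics(meta: Dict[str, Any], frame: Dict[str, Any]) -> Dict[str, Any]:
    """Per-frame intrinsics win; otherwise fall back to the shared top-level values."""
    return {k: frame[k] if k in frame else meta[k] for k in INTRINSIC_KEYS}


# Single-camera file: intrinsics live at the top level.
single = {
    "fl_x": 1000.0, "fl_y": 1000.0, "cx": 320.0, "cy": 240.0, "w": 640, "h": 480,
    "frames": [{"file_path": "images/frame_00001.jpg"}],
}
# Multifocal file: every frame carries its own intrinsics.
multi = {
    "frames": [
        {"file_path": "images/frame_00001.jpg",
         "fl_x": 900.0, "fl_y": 905.0, "cx": 240.0, "cy": 320.0, "w": 480, "h": 640},
    ],
}
print(frame_intrinsics(single, single["frames"][0]))
print(frame_intrinsics(multi, multi["frames"][0]))
```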

In hloc_utils.py (2a9b86f#diff-f6c061630b26a6080b7a03cb99f1dfcd3831b10ceaed217afd958a4a0bbcff26R150) the image_list is not defined by default, so it fails when looking for the images. Fixing that is the main purpose of the commit.

How can the Core build failure be fixed? The error comes from a network/pip problem on the GitHub side and is not related to the pull request :(

@dberga
Contributor Author

dberga commented Apr 17, 2024

Can the Core build be re-run?

Now pip couldn't fetch another wheel: botocore==1.34.85 ...

@jb-ye
Collaborator

jb-ye commented Apr 17, 2024

Okay, I think my main concern here is: if the user actually has images from the same camera model, would it make sense to use camera_mode=pycolmap.CameraMode.SINGLE? Could you add a parameter to run_hloc() as well as class ColmapConverterToNerfstudioDataset so that pycolmap.CameraMode.PER_IMAGE is an option rather than the default choice?

@jb-ye
Collaborator

jb-ye commented Apr 17, 2024

> Can the Core build be re-run?

CI is not super stable, I guess. Don't worry about it.

@dberga
Contributor Author

dberga commented Apr 17, 2024

> Okay, I think my main concern here is: if the user actually has images from the same camera model, would it make sense to use camera_mode=pycolmap.CameraMode.SINGLE? Could you add a parameter to run_hloc() as well as class ColmapConverterToNerfstudioDataset so that pycolmap.CameraMode.PER_IMAGE is an option rather than the default choice?

Yes, it should be added as a parameter in hloc_utils.py and in https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/process_data/colmap_converter_to_nerfstudio_dataset.py#L29
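
A small sketch of what such a parameter could look like, assuming a boolean that defaults to the single-camera behaviour. The name use_single_camera_mode is the one adopted later in this thread; `CameraMode` below is only a stand-in for pycolmap.CameraMode so the snippet runs without pycolmap, and `select_camera_mode` is a hypothetical helper.

```python
from enum import Enum


class CameraMode(Enum):
    """Stand-in for pycolmap.CameraMode (only the two members discussed here)."""

    SINGLE = "SINGLE"
    PER_IMAGE = "PER_IMAGE"


def select_camera_mode(use_single_camera_mode: bool = True) -> CameraMode:
    """Keep one shared camera by default; use one camera per image only on request."""
    return CameraMode.SINGLE if use_single_camera_mode else CameraMode.PER_IMAGE


print(select_camera_mode())       # default: shared camera
print(select_camera_mode(False))  # opt-in: one camera per image
```

Keeping SINGLE as the default means existing single-camera captures behave as before, and PER_IMAGE is only used when the caller asks for it.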

…n hloc. Also updated colmap_to_json to detect if there is one or several cameras per frame while keeping the camera type specifications
@dberga
Contributor Author

dberga commented Apr 19, 2024

@jb-ye I just pushed a commit with the required argument. I also added the code needed to export the multifocal case in colmap_to_json (similar to the distort_fixed bool in the nerfstudio dataparser). Check this: 1f73743

image

@@ -418,6 +418,14 @@ def colmap_to_json(
# im_id_to_image = recon.images
cam_id_to_camera = read_cameras_binary(recon_dir / "cameras.bin")
im_id_to_image = read_images_binary(recon_dir / "images.bin")
multifocal = False # one camera for all frames (distort_fixed=True)
Collaborator

Maybe use use_single_camera_mode = True

@jb-ye
Collaborator

jb-ye commented Apr 19, 2024

@dberga LGTM, I just have some minor comments about the change.

@dberga
Contributor Author

dberga commented Apr 20, 2024

> @dberga LGTM, I just have some minor comments about the change.

Right, will do the corrections asap during the next week :)

@dberga
Contributor Author

dberga commented Apr 23, 2024

> @dberga LGTM, I just have some minor comments about the change.

Just made all the changes, check
image

@dberga dberga requested a review from jb-ye April 23, 2024 16:18
@jb-ye
Collaborator

jb-ye commented Apr 23, 2024

@dberga I couldn't push my changes to your feature branch. Could you either change the permissions to allow upstream authors (or just me) to push, or make the suggested change yourself?

-        if use_single_camera_mode is False:  # add the camera for this frame
-            frame.update(parse_colmap_camera_params(cam_id_to_camera[im_id]))
+        if not use_single_camera_mode:  # add the camera parameters for this frame
+            frame.update(parse_colmap_camera_params(cam_id_to_camera[im_data.camera_id]))
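
To illustrate why the lookup key matters, here is a self-contained sketch with stand-in records (`ImageRecord`, `CameraRecord` and `camera_for_each_image` are hypothetical, not the colmap_utils.py types): image IDs and camera IDs are separate key spaces, so the camera table has to be indexed with the image's camera_id rather than with the image ID.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ImageRecord:
    """Stand-in for one entry of a COLMAP images table."""

    name: str
    camera_id: int


@dataclass
class CameraRecord:
    """Stand-in for one entry of a COLMAP cameras table."""

    width: int
    height: int


def camera_for_each_image(
    images: Dict[int, ImageRecord], cameras: Dict[int, CameraRecord]
) -> Dict[str, CameraRecord]:
    """Map image name to its camera, going through image.camera_id."""
    return {im.name: cameras[im.camera_id] for im in images.values()}


# Image IDs (10, 11) deliberately differ from camera IDs (1, 2):
# cameras[10] would raise KeyError, cameras[images[10].camera_id] works.
images = {10: ImageRecord("frame_00001.jpg", 1), 11: ImageRecord("frame_00002.jpg", 2)}
cameras = {1: CameraRecord(4624, 3472), 2: CameraRecord(3472, 4624)}
print(camera_for_each_image(images, cameras))
```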

@dberga
Contributor Author

dberga commented Apr 24, 2024

> @dberga I couldn't push my changes to your feature branch. Could you either change the permissions to allow upstream authors (or just me) to push, or make the suggested change yourself?
>
> -        if use_single_camera_mode is False:  # add the camera for this frame
> -            frame.update(parse_colmap_camera_params(cam_id_to_camera[im_id]))
> +        if not use_single_camera_mode:  # add the camera parameters for this frame
> +            frame.update(parse_colmap_camera_params(cam_id_to_camera[im_data.camera_id]))

Done in b4f0312

Collaborator

@jb-ye jb-ye left a comment

I manually tested this PR with a local colmap project that has multiple cameras; it works as expected.

@jb-ye jb-ye merged commit db93476 into nerfstudio-project:main Apr 24, 2024
2 checks passed
@ichsan2895

ichsan2895 commented Apr 25, 2024

Hello, I succeeded in running hloc with this command:

ns-process-data images --data IMAGES \
    --output-dir IMAGES_MULTICAM \
    --sfm-tool hloc --feature-type disk --matcher-type disk+lightglue \
    --skip-images-processing --no-same-dimensions --no-use_single_camera_mode

It detected all images (just 8 images).
image
image
image

But when I run splatfacto, the ns-viewer only detected 7 poses. There are no eval images either (AFAIK, the 8th image should be eval).
image

Please give me some insight @jb-ye @dberga

@dberga
Contributor Author

dberga commented Apr 25, 2024

> Hello, I succeeded in running hloc with this command:
>
> ns-process-data images --data IMAGES \
>     --output-dir IMAGES_MULTICAM \
>     --sfm-tool hloc --feature-type disk --matcher-type disk+lightglue \
>     --no-same-dimensions --no-use_single_camera_mode
>
> It detected all images (just 8 images).
>
> But when I run splatfacto, the ns-viewer only detected 7 poses. There are no eval images either (AFAIK, the 8th image should be eval).
>
> Please give me some insight @jb-ye @dberga

It depends on the number of matches found between the features detected by the colmap/hloc method.
With 8 images, disk+lightglue could register 7 cameras, but one of the images may not have matched any other and was therefore discarded. Since it was discarded, the viewer can only show the remaining cameras.

Can you share the transforms.json?

To address the lack of matches, please try other ns-process-data parameters (e.g. other feature extractors and matchers, exhaustive feature matching, perspective mode, etc.). You can list all parameters with ns-process-data --help, or see them here: https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/process_data/colmap_converter_to_nerfstudio_dataset.py#L29-106
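
To find out which image was dropped, one option is to compare the frames listed in transforms.json with the files in the images folder. This is only a sketch: it assumes the usual "frames" / "file_path" layout of a nerfstudio transforms.json and that file names in the folder match those in the JSON; `registered_frames` and `missing_frames` are hypothetical helpers.

```python
import json
from pathlib import Path
from typing import List


def registered_frames(transforms_path: Path) -> List[str]:
    """File names of all frames that made it into transforms.json."""
    meta = json.loads(transforms_path.read_text())
    return sorted(Path(frame["file_path"]).name for frame in meta["frames"])


def missing_frames(transforms_path: Path, image_dir: Path) -> List[str]:
    """Images present on disk but absent from transforms.json."""
    on_disk = {p.name for p in image_dir.iterdir() if p.is_file()}
    return sorted(on_disk - set(registered_frames(transforms_path)))
```

For example, `missing_frames(Path("IMAGES_MULTICAM/transforms.json"), Path("IMAGES_MULTICAM/images"))` would list the images that were not registered, if the output folder follows that layout.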

@ichsan2895

ichsan2895 commented Apr 25, 2024

Hello @dberga, here is my transforms.json.
Additional info:
First attempt:

ns-process-data images --data IMAGES \
    --output-dir IMAGES_MULTICAM \
    --sfm-tool hloc --feature-type disk --matcher-type disk+lightglue \
    --skip-images-processing --no-same-dimensions --no-use_single_camera_mode

cp -r IMAGES/* IMAGES_MULTICAM/images

detected 8 of 8 images (100%).
transforms.json

Second attempt:

ns-process-data images --data IMAGES \
    --output-dir IMAGES_ORI-MULTICAM \
    --sfm-tool hloc --feature-type disk --matcher-type disk+lightglue \
    --no-same-dimensions --no-use_single_camera_mode

detected 7 of 8 images (87.5%).
transforms (1).json

colmap 3.9.1 with CUDA, built from source
hloc-1.4
installed latest commits of nerfstudio and gsplat

@jb-ye
Collaborator

jb-ye commented Apr 25, 2024

> the ns-viewer only detected 7 poses. There are no eval images either (AFAIK, the 8th image should be eval).

Could you share your dataset? @ichsan2895

@ichsan2895

> > the ns-viewer only detected 7 poses. There are no eval images either (AFAIK, the 8th image should be eval).
>
> Could you share your dataset? @ichsan2895

Here it is:
https://drive.usercontent.google.com/download?id=1ChzTC_Vokc5ms0zIW23VH76maHUSq44p&authuser=0

@dberga
Contributor Author

dberga commented Apr 25, 2024

> Hello @dberga, here is my transforms.json.
> Additional info:
> First attempt:
>
> ns-process-data images --data IMAGES \
>     --output-dir IMAGES_MULTICAM \
>     --sfm-tool hloc --feature-type disk --matcher-type disk+lightglue \
>     --skip-images-processing --no-same-dimensions --no-use_single_camera_mode
>
> cp -r IMAGES/* IMAGES_MULTICAM/images
>
> detected 8 of 8 images (100%).
> transforms.json
>
> Second attempt:
>
> ns-process-data images --data IMAGES \
>     --output-dir IMAGES_ORI-MULTICAM \
>     --sfm-tool hloc --feature-type disk --matcher-type disk+lightglue \
>     --no-same-dimensions --no-use_single_camera_mode
>
> detected 7 of 8 images (87.5%).
> transforms (1).json
>
> colmap 3.9.1 with CUDA, built from source
> hloc-1.4
> installed latest commits of nerfstudio and gsplat

It is very good that you obtained 100% matches (first case) with so few images.

I see that in the first case you added --skip-images-processing. When image processing is skipped there is no downscaling, so the "images_X" folders with "frame_X" files are not created. Without this parameter the images are downscaled by x2, x4 and x8. In principle the preprocessing (in this case hloc + disk features + lightglue) works best with higher resolutions and minimal distortion.

Can you check the database (with the COLMAP GUI) to confirm that all cameras are found? Did hloc find matches in all cases? In the second transforms file, frame_0001.jpg is missing.

@ichsan2895

ichsan2895 commented Apr 25, 2024

Yes @dberga, I intended to use --skip-images-processing and manually copy the 8 images to IMAGES_MULTICAM/images. I use splatfacto with --downscale-factor 1, so it does not need images_2, images_4, and images_8.

################

Hey, sorry for my carelessness. In the first attempt that I mentioned before:

ns-process-data images --data IMAGES \
    --output-dir IMAGES_MULTICAM \
    --sfm-tool hloc --feature-type disk --matcher-type disk+lightglue \
    --skip-images-processing --no-same-dimensions --no-use_single_camera_mode

cp -r IMAGES/* IMAGES_MULTICAM/images

I found the 8th image in ns-viewer with viser, but it has the wrong orientation.
image

Take a look: the image is portrait, but in viser it was rotated automatically.
image

The transforms.json also does not translate it to portrait.
image

Michael-Spleenlab pushed a commit to Michael-Spleenlab/nerfstudio that referenced this pull request Apr 26, 2024
* changed run_hloc args to fill image_ids (added image_list and changed camera_mode to PER_IMAGE)

* added parameter same_camera on ns-process-data for multifocal cases in hloc. Also updated colmap_to_json to detect if there is one or several cameras per frame while keeping the camera type specifications

* ruff-formatted changes in colmap_utils.py

* use_single_camera_mode update args, including colmap_utils and hloc_utils

* changed multicamera warning's print to CONSOLE.print

* fix path typo console print

* rewritten use-single-camera-mode comment in ns-process-data

* (fix) inverted bool for colmap_utils

* change im_id to im_data.camera_id

---------

Co-authored-by: Jianbo Ye <jianboye@amazon.com>
@dberga
Contributor Author

dberga commented Apr 26, 2024

> Yes @dberga, I intended to use --skip-images-processing and manually copy the 8 images to IMAGES_MULTICAM/images. I use splatfacto with --downscale-factor 1, so it does not need images_2, images_4, and images_8.
>
> ################
>
> Hey, sorry for my carelessness. In the first attempt that I mentioned before:
>
> ns-process-data images --data IMAGES \
>     --output-dir IMAGES_MULTICAM \
>     --sfm-tool hloc --feature-type disk --matcher-type disk+lightglue \
>     --skip-images-processing --no-same-dimensions --no-use_single_camera_mode
>
> cp -r IMAGES/* IMAGES_MULTICAM/images
>
> I found the 8th image in ns-viewer with viser, but it has the wrong orientation.
>
> Take a look: the image is portrait, but in viser it was rotated automatically.
>
> The transforms.json also does not translate it to portrait.

The --no_same_dimensions parameter is probably doing the rotation; check this: https://github.com/nerfstudio-project/nerfstudio/blob/main/nerfstudio/process_data/process_data_utils.py#L263.

Maybe try adding an elif that checks:

  • when use_single_camera is False and same_dimensions is False, do not autorotate.
  • when use_single_camera is True and same_dimensions is False, rotate.

Let us know so it can be updated.
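
The two rules listed above could be written as a small predicate. This is a sketch of the suggestion only, not the code in process_data_utils.py; `should_autorotate` is a hypothetical name, and the same_dimensions=True branch is an assumption (the rules do not cover it, so it is treated as "nothing to rotate").

```python
def should_autorotate(use_single_camera_mode: bool, same_dimensions: bool) -> bool:
    """Decide whether portrait images should be rotated to match landscape ones."""
    if same_dimensions:
        # Assumption: all images already share one resolution, nothing to rotate.
        return False
    # Different dimensions: rotate only when a single shared camera is required;
    # with one camera per image, each frame can keep its own orientation.
    return use_single_camera_mode


for single in (True, False):
    for same in (True, False):
        print(single, same, should_autorotate(single, same))
```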

@ichsan2895

  1. I think --no-same-dimensions is intended for images with different resolutions, so that transforms.json gets resolution attributes for each image (instead of a single general attribute, as in colmap). But I only just learned that it possibly does rotation too.

My recommendation @dberga: auto rotation should have its own flag, like --auto-rotation & --no-auto-rotation, leaving --no-same-dimensions for images with different resolutions.

  2. Run this command:
ns-process-data images --data IMAGES \
    --output-dir IMAGES_MULTICAM \
    --sfm-tool hloc --feature-type disk --matcher-type disk+lightglue \
    --skip-images-processing --same-dimensions --no-use_single_camera_mode

cp -r IMAGES/* IMAGES_MULTICAM/images

The images are still rotated.
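
The recommendation of a separate rotation flag could look roughly like the sketch below. It uses the standard library argparse (Python 3.9+ for BooleanOptionalAction) purely for illustration; it is not how ns-process-data defines its options, the --auto-rotation flag does not exist today, and the default values are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI where rotation and resolution handling are independent."""
    parser = argparse.ArgumentParser(prog="process-images-sketch")
    # Creates both --same-dimensions and --no-same-dimensions.
    parser.add_argument(
        "--same-dimensions",
        action=argparse.BooleanOptionalAction,
        default=True,
        help="Whether all input images share one resolution.",
    )
    # Creates both --auto-rotation and --no-auto-rotation (proposed flag).
    parser.add_argument(
        "--auto-rotation",
        action=argparse.BooleanOptionalAction,
        default=False,
        help="Whether portrait images are rotated to landscape.",
    )
    return parser


args = build_parser().parse_args(["--no-same-dimensions"])
print(args.same_dimensions, args.auto_rotation)
```

With two flags, mixed-resolution input no longer implies rotation; each behaviour is requested on its own.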

@ichsan2895

ichsan2895 commented Apr 26, 2024

INTERESTING FINDING:

  1. Run this command:
colmap model_converter \
    --input_path IMAGES_MULTICAM/colmap/sparse/0 \
    --output_path IMAGES_MULTICAM/colmap/sparse/0 \
    --output_type TXT
  2. Open IMAGES_MULTICAM/colmap/sparse/0/cameras.txt
  3. All of the images have the same resolution (even though the input has 1 portrait and 7 landscape images), so I think the autorotation is caused by hloc.
    --no-same-dimensions and --same-dimensions have no effect after all, since every camera in cameras.txt has the same resolution.
# Camera list with one line of data per camera:
#   CAMERA_ID, MODEL, WIDTH, HEIGHT, PARAMS[]
# Number of cameras: 8
8 OPENCV 4624 3472 3422.1921813355152 3420.2523863833289 2312 1736 0.072982410485304544 -0.11388968316811775 0.0014774746720753768 -0.0015804304081276507
7 OPENCV 4624 3472 3402.8378682136849 3397.5772595708336 2312 1736 0.082576038894206491 -0.15068003202440697 0.0013543233908796246 -0.00059204086450159483
6 OPENCV 4624 3472 3418.6384711506989 3412.8018492685578 2312 1736 0.055607240161820254 -0.061799060016399641 0.0010211725425041155 -0.00068701512432646462
5 OPENCV 4624 3472 3409.6767612744511 3419.3946673559321 2312 1736 0.07269686822641859 -0.091214819120700077 0.0021359424701642396 -0.002934762972419369
4 OPENCV 4624 3472 3398.1794370163429 3403.0184384187055 2312 1736 0.060456096642486239 -0.062672484962208674 0.00038078816854184716 -0.0023915384411823063
3 OPENCV 4624 3472 3373.4672113436777 3393.7038187398966 2312 1736 0.070667709654399016 -0.089770482952795674 0.00017934005012779037 -0.0030718905642279325
2 OPENCV 4624 3472 3390.3063822725585 3399.2824954832113 2312 1736 0.077084670017596793 -0.10620337151668688 0.0013100723624426642 -0.004853945126040737
1 OPENCV 4624 3472 3375.7035440085774 3104.9692601554093 2312 1736 -0.17221875899859262 -0.23014876258465428 -0.0038754464455599685 -0.0051693569426345893
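
The same check can be scripted. Below is a small sketch that parses a cameras.txt in the layout shown by the header above (CAMERA_ID, MODEL, WIDTH, HEIGHT, PARAMS[]) and collects the distinct resolutions; `CameraRow`, `parse_cameras_txt` and `distinct_resolutions` are hypothetical helpers, and the sample parameters are shortened for readability.

```python
from typing import List, NamedTuple, Set, Tuple


class CameraRow(NamedTuple):
    camera_id: int
    model: str
    width: int
    height: int
    params: Tuple[float, ...]


def parse_cameras_txt(text: str) -> List[CameraRow]:
    """Parse the text form of a COLMAP cameras file, skipping comments and blanks."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        camera_id, model, width, height, *params = line.split()
        rows.append(
            CameraRow(int(camera_id), model, int(width), int(height),
                      tuple(float(p) for p in params))
        )
    return rows


def distinct_resolutions(rows: List[CameraRow]) -> Set[Tuple[int, int]]:
    """Set of (width, height) pairs across all cameras."""
    return {(row.width, row.height) for row in rows}


sample = """# Camera list with one line of data per camera:
#   CAMERA_ID, MODEL, WIDTH, HEIGHT, PARAMS[]
8 OPENCV 4624 3472 3422.19 3420.25 2312 1736 0.0730 -0.1139 0.0015 -0.0016
1 OPENCV 4624 3472 3375.70 3104.97 2312 1736 -0.1722 -0.2301 -0.0039 -0.0052
"""
print(distinct_resolutions(parse_cameras_txt(sample)))
```

A single (width, height) pair in the result means every camera was registered at the same resolution, which is what the listing above shows.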

@dberga
Contributor Author

dberga commented Apr 26, 2024

Are the images in the IMAGES_MULTICAM output rotated?


3 participants