
Implement nerfacto alpha transparency training #2165

Merged (24 commits, Jul 18, 2023)

Conversation

@nepfaff (Contributor) commented Jul 3, 2023

This PR cleans up and replaces #2025.

Test data

Mustard data set with alpha masks.
https://drive.google.com/file/d/1XX4ioj9NgaRoMIA00Negp5x9XM8gWxjD/view?usp=sharing

ns-train nerfacto --pipeline.model.background_color "random" --pipeline.model.disable-scene-contraction True  instant-ngp-data --data data/mustard_alpha_images/

Results

Accumulation without alpha transparent training:
[image]

Accumulation with alpha transparent training:
[image]

Closes #1498
Closes #2025

@nepfaff (Contributor Author) commented Jul 3, 2023

@SamDM, @jkulhanek, @f-dy let me know if you have any comments :)

I tried to address all mentioned suggestions in #2025 while keeping the functionality the same.

@nepfaff force-pushed the alpha_carving_v2 branch 2 times, most recently from a4bf22a to a28c90a, on July 3, 2023 15:15
def blend_background(
    cls,
    image: Tensor,
    outputs: Dict[str, Union[Tensor, list]],
Contributor:

Can you please pass the optional background_color directly instead of passing the outputs?

Contributor:

Also, can you please pass RGB image and "opacity" as two separate arguments?

Contributor Author (nepfaff):

Passed them as separate arguments. Also, it is probably better not to allow optional arguments and instead only call this function if we actually want to blend the background. We can resolve this if you agree with removing the optionality of the arguments.

Contributor:

But shouldn't blending always happen? When background_color=random, it would use the provided background_color; otherwise it would use the renderer's background colour. Perhaps I am missing something...

Contributor Author (nepfaff):

It should only happen if the input data includes an alpha channel. If this is the case, we would also have a background_color to pass to this method.
So you are right, the if "background_color" in outputs check is not needed.
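
For reference, the alpha blending being discussed conventionally composites the ground-truth RGB over a chosen background colour as rgb * opacity + background * (1 - opacity). A minimal sketch with illustrative names, not the actual code added by this PR:

import torch
from torch import Tensor

def blend_background(image: Tensor, opacity: Tensor, background_color: Tensor) -> Tensor:
    # image: (..., 3) RGB in [0, 1]; opacity: (..., 1) in [0, 1];
    # background_color: broadcastable to (..., 3), e.g. a random colour per ray.
    return image * opacity + background_color * (1.0 - opacity)

# Example: blend a batch of ground-truth RGBA pixels against random background colours.
rgba = torch.rand(8, 4)  # hypothetical RGBA batch
gt_rgb = blend_background(rgba[:, :3], rgba[:, 3:4], torch.rand(8, 3))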

Contributor:

I remember looking at this a good while ago and thinking that a scoreboard of transient values, like this run's background color selection, could be more generally useful.

The other use case I had in my head was some way to prevent single bad iterations from having a catastrophic impact, e.g. pausing/stopping training if metrics plummet instead of saving a checkpoint.

Contributor Author (nepfaff):

What are you thinking in terms of scoreboard? Do you mean having a dedicated object/place to write values like background_color to instead of storing them in outputs?

The other use case I had in my head was some way to prevent single bad iterations from having a catastrophic impact, e.g. pausing/stopping training if metrics plummet instead of saving a checkpoint.

I'm sorry, but I'm not quite following how this relates to always blending the background color if an alpha channel is present. Or do you mean as a use case for the scoreboard idea?

Contributor:

Do you mean having a dedicated object/place to write values like background_color to instead of storing them in outputs?

That was the thought, although not sure whether there is need.

Or do you mean as a use case for the scoreboard idea?

Yes, this (sorry)

Contributor Author (nepfaff):

This seems like something that would warrant a separate pull request that then also implements one of your suggested functionalities.
However, I do see one reason for implementing something like this as part of this PR: all the outputs are currently passed through this resizing operation, which doesn't make much sense for background_color. It does not crash, as the background color is a per-image pixel color, but it still isn't great.

Any opinions on this, or should it go in a separate PR?

Contributor:

Separate PR probably - unless it would fit neatly into this one

Resolved (outdated) review threads on: nerfstudio/data/datasets/base_dataset.py, nerfstudio/models/base_surface_model.py, nerfstudio/models/nerfacto.py (×4)
@jkulhanek requested a review from tancik on July 4, 2023
@machenmusik (Contributor):
Do we think this is in condition to be merged?

@yonghoonkwon commented Jul 13, 2023

If I set the background to random, rendering works fine, but there seems to be a problem with exporting a mesh.
I started training with the following command, and exporting a point cloud or a (Poisson) mesh does not work properly.

CUDA_VISIBLE_DEVICES=0 ns-train nerfacto --data $DATA_SET --max-num-iterations 300001 --vis viewer+tensorboard --experiment-name $EXP_NAME --pipeline.model.background_color random --pipeline.datamanager.camera-optimizer.mode off --pipeline.model.near-plane 0.5 --pipeline.model.far-plane 10.0 --machine.num-devices 0 --pipeline.datamanager.camera-res-scale-factor 0.5 --pipeline.model.disable-scene-contraction True nerfstudio-data --train-split-fraction 1

@jkulhanek (Contributor):
What does it mean that the function is not working properly? Can you please post the error message or something?

@nepfaff (Contributor Author) commented Jul 13, 2023

I didn't get amazing results with Poisson on that dataset. This isn't alpha transparent training dependent though...

TSDF fusion works great and is significantly improved by alpha transparent training. This is actually my sole use case of this feature. Currently on the way to an airport, but will post some mesh export results once I'm there :)

@jkulhanek (Contributor):
@nepfaff , Can you please add me as a contributor to your fork so I can try implementing some changes? If you are not comfortable with this idea, I can alternatively create a new branch...

@yonghoonkwon:
What does it mean that the function is not working properly? Can you please post the error message or something?

If you export a point cloud from a trained model, it does not proceed past 0%, but it keeps consuming hardware resources (GPU). I don't know the exact cause, but I think it's because of the random background values. If pipeline.model.background_color is black, ghost pixels are included, but the export succeeds. A point cloud needs to be extracted for Poisson mesh extraction, and I think there is a problem in this step.

@nepfaff (Contributor Author) commented Jul 13, 2023

I did manage to run Poisson but I can try with point cloud. Will report the results soon

@nepfaff (Contributor Author) commented Jul 13, 2023

@nepfaff , Can you please add me as a contributor to your fork so I can try implementing some changes? If you are not comfortable with this idea, I can alternatively create a new branch...

Done :)

@yonghoonkwon commented Jul 13, 2023

I didn't get amazing results with Poisson on that dataset. This isn't alpha transparent training dependent though...

TSDF fusion works great and is significantly improved by alpha transparent training. This is actually my sole use case of this feature. Currently on the way to an airport, but will post some mesh export results once I'm there :)

What I tested was my custom data, but I will train and test again with the mustard data and share the results.

@yonghoonkwon commented Jul 13, 2023

I did manage to run Poisson but I can try with point cloud. Will report the results soon

Can you share the script you used to train the mustard data?

@nepfaff (Contributor Author) commented Jul 13, 2023

I did manage to run Poisson but I can try with point cloud. Will report the results soon

Can you share the script you used to train the mustard data?

Does the command in the description not work?

@jkulhanek (Contributor):
@nepfaff , Can you please add me as a contributor to your fork so I can try implementing some changes? If you are not comfortable with this idea, I can alternatively create a new branch...

Thanks! I will try to implement some changes; let me know whether you like them or not...

@nepfaff (Contributor Author) commented Jul 13, 2023

I didn't get amazing results with Poisson on that dataset. This isn't alpha transparent training dependent though...
TSDF fusion works great and is significantly improved by alpha transparent training. This is actually my sole use case of this feature. Currently on the way to an airport, but will post some mesh export results once I'm there :)

What I tested was my custom data, but I will train and test again with the mustard data and share the results.

It would also be great if you could share your data. The more diverse the test data, the better :)

@jkulhanek removed the request for review from tancik on July 17, 2023
@nepfaff (Contributor Author) left a review comment:
Looks great to me! Will share some of the testing results soon

Resolved review threads on: nerfstudio/models/base_model.py (outdated), nerfstudio/exporter/exporter_utils.py
@nepfaff (Contributor Author) commented Jul 18, 2023

It looks good to me now.

The export results are as desired and as shown in the previous images.

@jkulhanek (Contributor):
Great! @nepfaff, thank you very much for your work. This has been a pleasure.

@jkulhanek (Contributor):
I will now merge this into main, ok?

@jkulhanek merged commit 9f487a4 into nerfstudio-project:main on Jul 18, 2023
4 checks passed
@jkulhanek deleted the alpha_carving_v2 branch on July 18, 2023 10:55
@machenmusik (Contributor):
Nice! Just to confirm: this mode is automatic when the input images contain alpha, but separate mask images keep the previous non-carving behavior - is that right? Can both be used simultaneously?

@nepfaff (Contributor Author) commented Jul 18, 2023

Almost. It is only automatic if the images have an alpha channel and the background color is random (which is not the default). Hence, alpha images can be used with the previous behavior if the background color is not random.
I didn't test both simultaneously, but it should work.

@gradeeterna:
Thanks for working on this! Just wondering if this should work with equirectangular images with an alpha channel? The old workflow with separate mask images did not work with equirectangular for some reason.

If so, do I just need to add --pipeline.model.background_color "random" to my training command?

Also, do masks in the alpha channel use less VRAM than having separate masks? I was never able to use those with large datasets as I would get CUDA out of memory errors.

@hnj5247 commented Jul 21, 2023

Hello, I'm trying to follow up on this great work. Is there any way to transform nerfstudio's original image format (.png images and camera poses in transforms.json) into the format required by this nerfacto alpha transparency training model?

@nepfaff (Contributor Author) commented Jul 22, 2023

Thanks for working on this! Just wondering if this should work with equirectangular images with an alpha channel? The old workflow with separate mask images did not work with equirectangular for some reason.

If so, do I just need to add --pipeline.model.background_color "random" to my training command?

Also, do masks in the alpha channel use less VRAM than having separate masks? I was never able to use those with large datasets as I would get CUDA out of memory errors.

It should work. Just add an alpha channel and specify the random background color as you suggested.
I'd assume that they would use less VRAM but haven't tested this. It would be amazing if you could try this and post the results here :)

@nepfaff (Contributor Author) commented Jul 22, 2023

Hello, I'm trying to follow up on this great work. Is there any way to transform nerfstudio's original image format (.png images and camera poses in transforms.json) into the format required by this nerfacto alpha transparency training model?

You need to add an alpha channel to your images. This is a 4th channel with values between 0 and 1. For alpha-transparent training you would use binary values, where zero is transparent.
The easiest way to do this is probably with NumPy, concatenating the alpha channel with the RGB images (see the sketch after this comment).

Then additionally specify a random background color as specified in the PR description.
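
For illustration, here is a minimal sketch of adding a binary alpha channel to an RGB image with NumPy and Pillow; the file names and the mask source are placeholders, not part of the PR:

import numpy as np
from PIL import Image

rgb = np.asarray(Image.open("image.png").convert("RGB"))    # (H, W, 3) uint8
mask = np.asarray(Image.open("mask.png").convert("L")) > 0  # (H, W) bool, True = keep pixel
alpha = mask.astype(np.uint8)[..., None] * 255              # (H, W, 1), 0 = transparent
rgba = np.concatenate([rgb, alpha], axis=-1)                # (H, W, 4)
Image.fromarray(rgba, mode="RGBA").save("image_rgba.png")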

@hnj5247 commented Jul 24, 2023

Using rembg p <input.img.folder> <output.img.folder>, I was able to remove the background of real-world photos and add an alpha channel ([r, g, b, a]). Thank you!
However, it only worked after removing '--pipeline.model.disable-scene-contraction True' from the command.
Thank you for replying!

@gradeeterna:
Thanks for working on this! Just wondering if this should work with equirectangular images with an alpha channel? The old workflow with separate mask images did not work with equirectangular for some reason.
If so, do I just need to add --pipeline.model.background_color "random" to my training command?
Also, do masks in the alpha channel use less VRAM than having separate masks? I was never able to use those with large datasets as I would get CUDA out of memory errors.

It should work. Just add an alpha channel and specify the random background color as you suggested. I'd assume that they would use less VRAM but haven't tested this. It would be amazing if you could try this and post the results here :)

Hey, thanks for getting back to me. I tested a few scenes last night, but got strange results with equirectangular and fisheye. My use case is quite different to the mustard example - I'm masking myself out of 360 equirectangular images, and masking the black border on circular fisheye images.

I exported png images with masks in the alpha channel from Metashape (black pixels masked), used "ns-process metashape" to create the downscales, and trained nerfacto-huge with a random background color. The masks were definitely doing something, but the colours were strange and the masked objects were still somewhat visible in the NeRF.

It definitely seems to use less VRAM than separate masks though, as I didn't get CUDA out-of-memory errors for the first time! :)

@f-dy (Contributor) commented Aug 18, 2023

@gradeeterna to mask yourself out, you shouldn't use these masks ("alpha transparency training" / "alpha carving"), but the "ignore" masks.
Here's how I add a single mask to all my fisheye images. You have to add the mask to the transforms.json, but also downscale it. You can easily customize it to add one mask per image. See also the discussion in #1498 for the "ignore" masking strategy.

cat > add_mask_to_transforms_json.py <<EOF
#!/usr/bin/env python
import sys
import json

if len(sys.argv) != 3:
    print(f"Usage: {sys.argv[0]} input_transforms.json output_transforms.json")
    sys.exit(1)
with open(sys.argv[1]) as input_file:
    file_contents = input_file.read()
parsed_json = json.loads(file_contents)
for frame in parsed_json["frames"]:
    frame["mask_path"] = "masks/mask.png"
with open(sys.argv[2], "w") as output_file:
    json.dump(parsed_json, output_file, indent=4)
EOF
cat > downsize_mask.py <<EOF
#!/usr/bin/env python
import cv2
import sys
from pathlib import Path

if len(sys.argv) != 2:
    print(f"Usage: {sys.argv[0]} path_to/mask.png")
    print("Output is path_to/masks_<downscale>/mask.png")
    sys.exit(1)
mask_path = Path(sys.argv[1])
mask = cv2.imread(str(mask_path), cv2.IMREAD_GRAYSCALE)
height, width = mask.shape[:2]
processed_data_dir = mask_path.parent
downscale_factors = [2, 4, 8]
for downscale in downscale_factors:
    mask_path_i = processed_data_dir / f"masks_{downscale}"
    mask_path_i.mkdir(exist_ok=True)
    mask_path_i = mask_path_i / "mask.png"
    mask_i = cv2.resize(
        mask, (width // downscale, height // downscale), interpolation=cv2.INTER_NEAREST
    )
    cv2.imwrite(str(mask_path_i), mask_i)
    print(f"Wrote {mask_path_i}")
EOF
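
For reference, once written out, the two scripts above would be invoked as their usage messages describe (file names are illustrative):

python add_mask_to_transforms_json.py transforms.json transforms_with_mask.json
python downsize_mask.py mask.png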

@gradeeterna commented Aug 18, 2023

@f-dy Hey, using a single ignore mask for all my fisheye images does work, but it slows down training by about 5x. When I use one mask per image with a big dataset, I get out of memory errors on my 3090 24GB. I was trying this method in case it was faster and uses less memory.

The way masks work with NGP is great, and doesn't seem to slow down training or increase memory usage. One mask per image goes in the main images folder, and they don't need to be added to the transforms.json.

Thanks for sharing those scripts, very useful!

@machenmusik (Contributor):
(IIRC there is an option that can be set which puts the masks on the GPU, which significantly speeds up training.)

@gradeeterna:
Oh yeah, just found it - --pipeline.datamanager.masks-on-gpu True

Training speed is almost the same as without masks, thanks a lot!
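
For reference, the flag mentioned above is simply appended to the training command, e.g. (dataset path and other options are placeholders):

ns-train nerfacto --data data/my_scene --pipeline.datamanager.masks-on-gpu True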

Successfully merging this pull request may close these issues:
Training with masks results in artefacts outside the mask regions