Hello, thanks for the great work!
I am running your model on my custom dataset. However, it seems that the depth saved from the NYUv2 model is wrong. I think this might be due to my misuse of the model's output. My script looks like this:
```python
import os
import shutil
import sys, json
from pathlib import Path

import cv2
import numpy as np
import torch
import torchvision.transforms.functional as TF
from PIL import Image
from tqdm import tqdm

# I cloned your repo and put it where I can import it directly
sys.path.insert(0, str(Path(__file__).parent.resolve() / "idisc"))

from idisc.models.idisc import IDisc
from idisc.utils import (DICT_METRICS_DEPTH, DICT_METRICS_NORMALS,
                         RunningMetric, validate)

model = IDisc.build(json.load(open("idisc/configs/nyu/nyu_swinl.json")))
model.load_pretrained("idisc/nyu_swinlarge.pt")
model = model.to("cuda")
model.eval()

# read in image
image = np.asarray(Image.open(image_path))
image = TF.normalize(TF.to_tensor(image), **{"mean": [0.5, 0.5, 0.5], "std": [0.5, 0.5, 0.5]})
image = image.unsqueeze(0).to("cuda")

with torch.inference_mode():
    depth, *_ = model(image)

TF.to_pil_image(depth[0].cpu()).save(save_path)
```
I am using the Swin-Large model. `image_path` points to an image of size 224x224 (I uploaded the exact image in case you need it to debug). DPT generates a reasonable depth map for it, but the output of iDisc Swin-Large looks wrong.
I believe I made a mistake somewhere. I wonder if you can help me debug this.
Thanks!
Thank you for using our model.
I believe the effect comes from the saving step: the output of our model is metric depth, i.e., floating-point values in [0.0, +inf), which can cause PIL to save the image in the wrong format.
I suggest you first convert the scalar float values to RGB with a colormap transformation and then save the result. You can look into idisc/utils/visualization.py, specifically the colorize function. It accepts a 2D input (e.g., an (H, W)-shaped numpy array), min and max values (for NYU, 0.01 and 10.0 meters), and a colormap name. For instance, "magma" is a good choice since it is perceptually uniform and does not introduce spurious contrasts.
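In case it helps, here is a minimal sketch of that idea using matplotlib's "magma" colormap rather than the repo's own colorize function (the helper name `colorize_depth` and the NYU range 0.01–10.0 m are taken from the discussion above; the rest is my own glue code):

```python
import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

def colorize_depth(depth, vmin=0.01, vmax=10.0, cmap="magma"):
    """Map an (H, W) float metric-depth array to an RGB PIL image."""
    # Clip metric depth to the expected range and rescale to [0, 1]
    depth = np.clip(depth, vmin, vmax)
    norm = (depth - vmin) / (vmax - vmin)
    # Apply the colormap; result is (H, W, 4) RGBA floats in [0, 1]
    rgba = plt.get_cmap(cmap)(norm)
    rgb = (rgba[..., :3] * 255).astype(np.uint8)
    return Image.fromarray(rgb)

# Example: colorize and save a dummy depth map
depth = np.random.uniform(0.01, 10.0, size=(224, 224)).astype(np.float32)
colorize_depth(depth).save("depth_vis.png")
```

This avoids the original failure mode entirely, since the saved image is plain 8-bit RGB rather than a float tensor coerced by PIL.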
One little nitpick: the model was trained with ImageNet normalization statistics, so it would be better to normalize the RGB image with those instead of the defaults you used, i.e., {"mean": [0.5, 0.5, 0.5], "std": [0.5, 0.5, 0.5]}.