Issue serving a model trained with the provided training code #28
Comments
I'm dealing with a similar issue right now. The infer code seems to be pretty much hard-coded to only work with models hosted on Hugging Face. I am trying to figure out how to do inference with a locally trained model as well. I'll keep you updated if I get anywhere with it; I think it's possible to load our custom models with the function
Hi, please refer to this issue: #24 (comment). Take a look and see if there's still any problem. Best
Hi @ZexinHe, @SamBahrami, I'm wondering what <YOUR_EXACT_TRAINING_CONFIG> refers to in `python scripts/convert_hf.py --config <YOUR_EXACT_TRAINING_CONFIG> convert.global_step=null`. Thank you in advance.
For me, the line I used was:
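A plausible concrete form of that command, assuming the run was trained with the sample config shipped in the repo (substitute the exact config file from your own training run):

```bash
# Illustrative only: --config must point at the exact config used for training.
python scripts/convert_hf.py --config ./configs/train-sample.yaml convert.global_step=null
```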
Awesome, @SamBahrami, thank you so much for your quick and kind response!
Thanks to you @SamBahrami, I've successfully converted my checkpoints to a huggingface-compatible model!

```bash
# Example usage
EXPORT_VIDEO=true
EXPORT_MESH=true
INFER_CONFIG="./configs/infer-b.yaml"
MODEL_NAME="./exps/releases/lrm-objaverse/small-dummyrun/step_000100"
IMAGE_INPUT="./assets/sample_input/owl.png"

python -m openlrm.launch infer.lrm --infer $INFER_CONFIG model_name=$MODEL_NAME image_input=$IMAGE_INPUT export_video=$EXPORT_VIDEO export_mesh=$EXPORT_MESH
```
Oh, when I tried it, I got a list index error:

```
root@b5f5ee77bf34:~/OpenLRM# # Example usage
EXPORT_VIDEO=true
EXPORT_MESH=true
INFER_CONFIG="./configs/infer-b.yaml"
MODEL_NAME="./exps/releases/lrm-objaverse/small-dummyrun/step_000100"
IMAGE_INPUT="./assets/sample_input/owl.png"
python -m openlrm.launch infer.lrm --infer $INFER_CONFIG model_name=$MODEL_NAME image_input=$IMAGE_INPUT export_video=$EXPORT_VIDEO export_mesh=$EXPORT_MESH
[2024-04-18 02:29:51,344] openlrm.models.modeling_lrm: [INFO] Using DINOv2 as the encoder
/root/OpenLRM/openlrm/models/encoders/dinov2/layers/swiglu_ffn.py:43: UserWarning: xFormers is available (SwiGLU)
  warnings.warn("xFormers is available (SwiGLU)")
/root/OpenLRM/openlrm/models/encoders/dinov2/layers/attention.py:27: UserWarning: xFormers is available (Attention)
  warnings.warn("xFormers is available (Attention)")
/root/OpenLRM/openlrm/models/encoders/dinov2/layers/block.py:39: UserWarning: xFormers is available (Block)
  warnings.warn("xFormers is available (Block)")
Loading weights from local directory
  0%|  | 0/1 [00:00<?, ?it/s]/root/OpenLRM/openlrm/datasets/cam_utils.py:153: UserWarning: Using torch.cross without specifying the dim arg is deprecated. Please either pass the dim explicitly or simply use torch.linalg.cross. The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at ../aten/src/ATen/native/Cross.cpp:63.)
  x_axis = torch.cross(up_world, z_axis)
  0%|  | 0/1 [00:14<?, ?it/s]Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/root/OpenLRM/openlrm/launch.py", line 36, in <module>
    main()
  File "/root/OpenLRM/openlrm/launch.py", line 32, in main
    runner.run()
  File "/root/OpenLRM/openlrm/runners/infer/base_inferrer.py", line 62, in run
    self.infer()
  File "/root/OpenLRM/openlrm/runners/infer/lrm.py", line 284, in infer
    self.infer_single(
  File "/root/OpenLRM/openlrm/runners/infer/lrm.py", line 244, in infer_single
    mesh = self.infer_mesh(planes, mesh_size=mesh_size, mesh_thres=mesh_thres, dump_mesh_path=dump_mesh_path)
  File "/root/OpenLRM/openlrm/runners/infer/lrm.py", line 207, in infer_mesh
    vtx_colors = self.model.synthesizer.forward_points(planes, vtx_tensor)['rgb'].squeeze(0).cpu().numpy()  # (0, 1)
  File "/root/OpenLRM/openlrm/models/rendering/synthesizer.py", line 206, in forward_points
    for k in outs[0].keys()
IndexError: list index out of range
```
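Judging from the traceback, `forward_points` appears to evaluate the query points in chunks and collect per-chunk outputs in a list; if `mcubes.marching_cubes` returns zero vertices, that list stays empty and `outs[0]` fails exactly like this. A minimal sketch of that failure mode, with an illustrative stand-in for the chunking loop (the chunk size and loop structure are assumptions, not the repo's actual code):

```python
import mcubes
import numpy as np

# A nearly flat density field, similar to the sigma grid printed later in
# this thread (all values around 0.29).
sigma = np.full((32, 32, 32), 0.29, dtype=np.float32)

# With a threshold the field never crosses, marching cubes finds no surface.
vtx, faces = mcubes.marching_cubes(sigma, 3.0)
print(vtx.shape, faces.shape)  # (0, 3) (0, 3): zero vertices, zero faces

# Illustrative stand-in for the synthesizer's chunked point evaluation:
chunk_size = 2 ** 20
outs = [{"rgb": vtx[i:i + chunk_size]} for i in range(0, vtx.shape[0], chunk_size)]
print(outs[0])  # IndexError: list index out of range -- same as the log above
```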
Not sure, I did it the same way. Perhaps try using a pretrained model for inference, and see if that works, before trying to use your own.
Sure, thanks!
@SamBahrami, could it be due to a lack of training data?
No, it should not give you an error even if you don't have enough training data.
Thank you for checking, @SamBahrami. Here is the command I ran:

```bash
EXPORT_VIDEO=true
EXPORT_MESH=true
INFER_CONFIG="./configs/infer-l.yaml"
MODEL_NAME="./exps/releases/lrm-objaverse/small-dummyrun/step_000100"
IMAGE_INPUT="./assets/sample_input/test.png"
python -m openlrm.launch infer.lrm --infer $INFER_CONFIG model_name=$MODEL_NAME image_input=$IMAGE_INPUT export_video=$EXPORT_VIDEO export_mesh=$EXPORT_MESH
```

Also, I modified the inference config:

```yaml
source_size: 448
source_cam_dist: 2.0
render_size: 384
render_views: 160
render_fps: 40
frame_size: 2
mesh_size: 384
mesh_thres: 0.28 # ONLY MODIFIED THIS VALUE
```

This is because the output sigma values are all within the range of 0.2 to 0.3, which I checked by adding a print statement to `infer_mesh`:

```python
def infer_mesh(self, planes: torch.Tensor, mesh_size: int, mesh_thres: float, dump_mesh_path: str):
    grid_out = self.model.synthesizer.forward_grid(
        planes=planes,
        grid_size=mesh_size,
    )
    print("Sigma values:", grid_out['sigma'])  # ADDED THIS LINE

    vtx, faces = mcubes.marching_cubes(grid_out['sigma'].squeeze(0).squeeze(-1).cpu().numpy(), mesh_thres)
    vtx = vtx / (mesh_size - 1) * 2 - 1
    vtx_tensor = torch.tensor(vtx, dtype=torch.float32, device=self.device).unsqueeze(0)
    vtx_colors = self.model.synthesizer.forward_points(planes, vtx_tensor)['rgb'].squeeze(0).cpu().numpy()  # (0, 1)
    vtx_colors = (vtx_colors * 255).astype(np.uint8)
    mesh = trimesh.Trimesh(vertices=vtx, faces=faces, vertex_colors=vtx_colors)

    # dump
    os.makedirs(os.path.dirname(dump_mesh_path), exist_ok=True)
    mesh.export(dump_mesh_path)
    ...
```
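One thing that might make the failure easier to diagnose is a guard right after marching cubes; this is a hypothetical addition, not code from the repo:

```python
# Hypothetical guard (not in OpenLRM): fail with a clear message when the
# density field never crosses the threshold, instead of an IndexError later.
if vtx.shape[0] == 0:
    raise RuntimeError(
        f"marching_cubes found no surface at mesh_thres={mesh_thres}; "
        "the sigma field may be nearly flat (e.g. an undertrained checkpoint)."
    )
```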
And here is the output log of the sigma values:

```
root@b5f5ee77bf34:~/OpenLRM# EXPORT_VIDEO=true
EXPORT_MESH=true
INFER_CONFIG="./configs/infer-l.yaml"
MODEL_NAME="./exps/releases/lrm-objaverse/small-dummyrun/step_000100"
IMAGE_INPUT="./assets/sample_input/test.png"
python -m openlrm.launch infer.lrm --infer $INFER_CONFIG model_name=$MODEL_NAME image_input=$IMAGE_INPUT export_video=$EXPORT_VIDEO export_mesh=$EXPORT_MESH
[2024-04-18 04:49:36,099] openlrm.models.modeling_lrm: [INFO] Using DINOv2 as the encoder
/root/OpenLRM/openlrm/models/encoders/dinov2/layers/swiglu_ffn.py:43: UserWarning: xFormers is available (SwiGLU)
  warnings.warn("xFormers is available (SwiGLU)")
/root/OpenLRM/openlrm/models/encoders/dinov2/layers/attention.py:27: UserWarning: xFormers is available (Attention)
  warnings.warn("xFormers is available (Attention)")
/root/OpenLRM/openlrm/models/encoders/dinov2/layers/block.py:39: UserWarning: xFormers is available (Block)
  warnings.warn("xFormers is available (Block)")
Loading weights from local directory
  0%|  | 0/1 [00:00<?, ?it/s]/root/OpenLRM/openlrm/datasets/cam_utils.py:153: UserWarning: Using torch.cross without specifying the dim arg is deprecated. Please either pass the dim explicitly or simply use torch.linalg.cross. The default value of dim will change to agree with that of linalg.cross in a future release. (Triggered internally at ../aten/src/ATen/native/Cross.cpp:63.)
  x_axis = torch.cross(up_world, z_axis)
Sigma values: tensor([[[[[0.2982],
                         [0.2950],
                         [0.2925],
                         ...,
                         [0.2959],
                         [0.2979],
                         [0.3002]]]]], device='cuda:0')
[tensor output abridged: every printed sigma value lies between roughly 0.27 and 0.30]
  0%|  | 0/1 [00:20<?, ?it/s]Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/root/OpenLRM/openlrm/launch.py", line 36, in <module>
    main()
  File "/root/OpenLRM/openlrm/launch.py", line 32, in main
    runner.run()
  File "/root/OpenLRM/openlrm/runners/infer/base_inferrer.py", line 62, in run
    self.infer()
  File "/root/OpenLRM/openlrm/runners/infer/lrm.py", line 285, in infer
    self.infer_single(
  File "/root/OpenLRM/openlrm/runners/infer/lrm.py", line 245, in infer_single
    mesh = self.infer_mesh(planes, mesh_size=mesh_size, mesh_thres=mesh_thres, dump_mesh_path=dump_mesh_path)
  File "/root/OpenLRM/openlrm/runners/infer/lrm.py", line 208, in infer_mesh
    vtx_colors = self.model.synthesizer.forward_points(planes, vtx_tensor)['rgb'].squeeze(0).cpu().numpy()  # (0, 1)
  File "/root/OpenLRM/openlrm/models/rendering/synthesizer.py", line 206, in forward_points
    for k in outs[0].keys()
IndexError: list index out of range
```

Could you please point me in the right direction? Thanks in advance for your help!
Hi @SamBahrami, thanks to you I've generated the checkpoint model and tried inference on my trained model.
Hi @SamBahrami, thank you for your comment! I appreciate your help. I'm currently running into a "list index out of range" error and I'm not sure how to resolve it. Could you please help me address this issue? Thank you in advance!
I'm trying to run inference on a custom model, trained with the provided code, but there seems to be a problem with building the model (see OpenLRM/openlrm/runners/infer/lrm.py, lines 121 to 127 at c2260e0).

The folder that is passed as the `model_name` argument contains a file named `model.safetensors`, as required by `huggingface_hub` when initialising from a path. From some tests, it seems that the method `hf_model_cls.from_pretrained` needs the "model" section of `configs/train-sample.yaml` as a dictionary. But even after passing this as a dictionary, the code breaks a bit further on.

Could anyone help here?
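For anyone experimenting before running the conversion script: a hedged sketch of the workaround described above, where the "model" section of the training config is passed as keyword arguments. Here `hf_model_cls` stands for whatever model class the inferrer resolves, and the checkpoint path is illustrative:

```python
# Hedged sketch, not the repo's exact API: load a local checkpoint folder
# (containing model.safetensors) with the training config's "model" section.
from omegaconf import OmegaConf

train_cfg = OmegaConf.load("configs/train-sample.yaml")
model_kwargs = OmegaConf.to_container(train_cfg.model)  # "model" section as a dict

model = hf_model_cls.from_pretrained(
    "./exps/checkpoints/my-run/step_001000",  # illustrative local path
    **model_kwargs,
)
```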