Training Issue #134
Hi @AymanMIbrahim, it seems the error means no dataset was found. Can you share your training command so I can check the paths to the images and cameras? |
Hello @SteveJunGao, I run the code on a cluster. |
Hi @song-wensong, I guess you might need to append one more directory name for the
|
I am having the same issue:
Training options:
Output directory: result/00031-stylegan2-04460130-gpus1-batch4-gamma40
Creating output directory...
The command I use is:
python train_3d.py --outdir=result --data=/mnt/c/AIData/img/04460130 --camera_path =/mnt/c/AIData/camera --gpus=1 --batch=4 --gamma=40 --data_camera_mode shapenet_car --dmtet_scale 1.0 --use_shapenet_split 1 --one_3d_generator 1 --fp32 0 --
I have checked the /mnt/c/AIData/img/04460130 folder, and I am pretty sure I have a bunch of training data there. |
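One thing worth checking in the command above: there is a space between `--camera_path` and `=`. This may only be a copy/paste artifact, but if it is present in the real invocation, the shell would pass `=/mnt/c/AIData/camera` as a separate word, and that path is unlikely to exist. A minimal sketch using Python's standard `shlex` module, which follows POSIX shell word splitting, shows the difference:

```python
import shlex

# Fragment copied from the command above, including the space before "=".
fragment = "--camera_path =/mnt/c/AIData/camera"

# With the space, the option name and the value become two separate words,
# and the value starts with a literal "=".
tokens = shlex.split(fragment)
print(tokens)

# Without the space, the option and its value stay together as one word.
fixed = shlex.split("--camera_path=/mnt/c/AIData/camera")
print(fixed)
```

Whether this explains the empty dataset in this particular case is not confirmed by the thread; it is simply cheap to rule out.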
On Windows with WSL, you may want to put the data in the WSL filesystem. If it is under /mnt/*, I don't know whether that will affect GET3D finding the data, but I wouldn't rule it out. Access will certainly be slow. |
After some trial and error, I think I got my copy to start training, but another problem emerged, so I am posting it here for future reference.
==> start
Training options:
Output directory: result/00060-stylegan2-04460130-gpus1-batch4-gamma40
Creating output directory...
Num images: 1680
Constructing networks...
tick 0 kimg 0.0 time 26s sec/tick 16.5 sec/kimg 4134.89 maintenance 9.2 |
Hi @EadmondDai, from the error message you posted, it seems the process was killed due to a lack of resources. What is your training platform (e.g. the GPU and CPU, their memory sizes, and the operating system)? |
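To gather the host-side numbers asked for above, a small standard-library sketch like the following can report CPU count and total RAM. The function name `total_ram_gib` is hypothetical, and the `os.sysconf` names used here are only available on Unix-like systems (including WSL), so the function returns `None` elsewhere. GPU model and memory are not covered; those are usually read from `nvidia-smi`.

```python
import os
import platform


def total_ram_gib():
    """Return total physical RAM in GiB, or None if it cannot be determined."""
    try:
        page_size = os.sysconf("SC_PAGE_SIZE")
        num_pages = os.sysconf("SC_PHYS_PAGES")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing (e.g. native Windows) or the name is unsupported.
        return None
    if page_size <= 0 or num_pages <= 0:
        return None
    return page_size * num_pages / 2**30


ram = total_ram_gib()
print("System:    ", platform.platform())
print("CPU count: ", os.cpu_count())
print("Total RAM: ", "unknown" if ram is None else f"{ram:.1f} GiB")
```

Under WSL the reported RAM is what the WSL environment sees, which can be less than the memory installed in the machine.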
Could I train a model using only an RTX 4090? |
Hi, I am having a similar issue.
==> start
Training options:
Output directory: /home/jovyan/results/00004-stylegan2-14bb2e591332db56b0be6ed024602be5-gpus1-batch32-gamma40
Creating output directory...
Could I check how the data should be structured? |
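The exact folder layout GET3D expects is not spelled out in this thread, so the following is only a diagnostic sketch, not a description of the loader. It walks a directory and reports how many image files sit at each depth below it, which makes it easier to see whether the path passed to `--data` is one level too high or too low. The function name `count_images_by_depth` and the extension list are assumptions made for illustration.

```python
import os
from collections import Counter

# Assumption: renders are stored with one of these common extensions.
IMAGE_EXTS = {".png", ".jpg", ".jpeg"}


def count_images_by_depth(root):
    """Return a Counter mapping directory depth below `root` to image count."""
    counts = Counter()
    root = os.path.abspath(root)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Depth 0 is `root` itself, depth 1 its direct subfolders, and so on.
        depth = 0 if rel == "." else rel.count(os.sep) + 1
        n = sum(
            1 for name in filenames
            if os.path.splitext(name)[1].lower() in IMAGE_EXTS
        )
        if n:
            counts[depth] += n
    return counts


# Example usage (replace the path with the value given to --data):
# for depth, n in sorted(count_images_by_depth("/path/to/img/folder").items()):
#     print(f"depth {depth}: {n} images")
```

If every depth reports zero, the directory itself is empty or unreadable from the training environment. If images only appear several levels down, the mismatch between that depth and what the loader scans is a likely place to look.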
Hi!
I think it may have failed to detect the training images. You should check them first.
Best regards,
Melinda
On July 1, 2024, at 11:19, ben14132 ***@***.***> wrote:
Hi, I am having a similar issue.
==> start
==> use shapenet dataset
==> use shapenet folder number 0
==> use image path: /home/jovyan/render_shapenet_data/content/GET3D/render_shapenet_data/save/img/14bb2e591332db56b0be6ed024602be5, num images: 0
==> launch training
Training options:
{
"G_kwargs": {
"class_name": "training.networks_get3d.GeneratorDMTETMesh",
"z_dim": 512,
"w_dim": 512,
"mapping_kwargs": {
"num_layers": 8
},
"iso_surface": "dmtet",
"one_3d_generator": true,
"n_implicit_layer": 1,
"deformation_multiplier": 1.0,
"use_style_mixing": true,
"dmtet_scale": 1.0,
"feat_channel": 16,
"mlp_latent_channel": 32,
"tri_plane_resolution": 256,
"n_views": 1,
"render_type": "neural_render",
"use_tri_plane": true,
"tet_res": 90,
"geometry_type": "conv3d",
"data_camera_mode": "shapenet_car",
"channel_base": 32768,
"channel_max": 512,
"fused_modconv_default": "inference_only"
},
"D_kwargs": {
"class_name": "training.networks_get3d.Discriminator",
"block_kwargs": {
"freeze_layers": 0
},
"mapping_kwargs": {},
"epilogue_kwargs": {
"mbstd_group_size": 4
},
"data_camera_mode": "shapenet_car",
"add_camera_cond": true,
"channel_base": 32768,
"channel_max": 512,
"architecture": "skip"
},
"G_opt_kwargs": {
"class_name": "torch.optim.Adam",
"betas": [
0,
0.99
],
"eps": 1e-08,
"lr": 0.002
},
"D_opt_kwargs": {
"class_name": "torch.optim.Adam",
"betas": [
0,
0.99
],
"eps": 1e-08,
"lr": 0.002
},
"loss_kwargs": {
"class_name": "training.loss.StyleGAN2Loss",
"gamma_mask": 40.0,
"r1_gamma": 40.0,
"lambda_flexicubes_surface_reg": 0.5,
"lambda_flexicubes_weights_reg": 0.1,
"style_mixing_prob": 0.9,
"pl_weight": 0.0
},
"data_loader_kwargs": {
"pin_memory": true,
"prefetch_factor": 2,
"num_workers": 3
},
"inference_vis": false,
"training_set_kwargs": {
"class_name": "training.dataset.ImageFolderDataset",
"path": "/home/jovyan/render_shapenet_data/content/GET3D/render_shapenet_data/save/img/14bb2e591332db56b0be6ed024602be5",
"use_labels": false,
"max_size": 0,
"xflip": false,
"resolution": 1024,
"data_camera_mode": "shapenet_car",
"add_camera_cond": true,
"camera_path": "/home/jovyan/render_shapenet_data/content/GET3D/render_shapenet_data/save/camera",
"split": "train",
"random_seed": 0
},
"resume_pretrain": null,
"D_reg_interval": 16,
"num_gpus": 1,
"batch_size": 32,
"batch_gpu": 4,
"metrics": [
"fid50k"
],
"total_kimg": 20000,
"kimg_per_tick": 1,
"image_snapshot_ticks": 50,
"network_snapshot_ticks": 200,
"random_seed": 0,
"ema_kimg": 10.0,
"G_reg_interval": 4,
"run_dir": "/home/jovyan/results/00004-stylegan2-14bb2e591332db56b0be6ed024602be5-gpus1-batch32-gamma40"
}
Output directory: /home/jovyan/results/00004-stylegan2-14bb2e591332db56b0be6ed024602be5-gpus1-batch32-gamma40
Number of GPUs: 1
Batch size: 32 images
Training duration: 20000 kimg
Dataset path: /home/jovyan/render_shapenet_data/content/GET3D/render_shapenet_data/save/img/14bb2e591332db56b0be6ed024602be5
Dataset size: 0 images
Dataset resolution: 1024
Dataset labels: False
Dataset x-flips: False
Creating output directory...
Launching processes...
Setting up PyTorch plugin "upfirdn2d_plugin"... /usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py:1967: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Done.
Setting up PyTorch plugin "bias_act_plugin"... /usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py:1967: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Done.
Setting up PyTorch plugin "filtered_lrelu_plugin"... /usr/local/lib/python3.8/dist-packages/torch/utils/cpp_extension.py:1967: UserWarning: TORCH_CUDA_ARCH_LIST is not set, all archs for visible cards are included for compilation.
If this is not desired, please set os.environ['TORCH_CUDA_ARCH_LIST'].
warnings.warn(
Done.
Loading training set...
==> use shapenet dataset
==> use shapenet folder number 0
==> use image path: /home/jovyan/render_shapenet_data/content/GET3D/render_shapenet_data/save/img/14bb2e591332db56b0be6ed024602be5, num images: 0
Traceback (most recent call last):
  File "train_3d.py", line 337, in <module>
    main() # pylint: disable=no-value-for-parameter
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.8/dist-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "train_3d.py", line 331, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train_3d.py", line 103, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "train_3d.py", line 49, in subprocess_fn
    training_loop_3d.training_loop(rank=rank, **c)
  File "/home/jovyan/GET3D/training/training_loop_3d.py", line 134, in training_loop
    training_set_sampler = misc.InfiniteSampler(
  File "/home/jovyan/GET3D/torch_utils/misc.py", line 120, in __init__
    assert len(dataset) > 0
AssertionError
Could I check how the data should be structured?
|
Hi @Remember12344 |
It doesn't read the image path or the camera path:
use image path: /home/paperspace/Get3d_Updated/GET3D/Render_Image/, num images: 0
Traceback (most recent call last):
  File "train_3d.py", line 330, in <module>
    main() # pylint: disable=no-value-for-parameter
  File "/home/paperspace/miniconda3/envs/get3d/lib/python3.8/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/paperspace/miniconda3/envs/get3d/lib/python3.8/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/paperspace/miniconda3/envs/get3d/lib/python3.8/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/paperspace/miniconda3/envs/get3d/lib/python3.8/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "train_3d.py", line 324, in main
    launch_training(c=c, desc=desc, outdir=opts.outdir, dry_run=opts.dry_run)
  File "train_3d.py", line 103, in launch_training
    subprocess_fn(rank=0, c=c, temp_dir=temp_dir)
  File "train_3d.py", line 49, in subprocess_fn
    training_loop_3d.training_loop(rank=rank, **c)
  File "/home/paperspace/Get3d_Updated/GET3D/training/training_loop_3d.py", line 134, in training_loop
    training_set_sampler = misc.InfiniteSampler(
  File "/home/paperspace/Get3d_Updated/GET3D/torch_utils/misc.py", line 120, in __init__
    assert len(dataset) > 0
AssertionError
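The traceback ends at `assert len(dataset) > 0`, so the bare `AssertionError` only says that the dataset object reported zero items, which matches the `num images: 0` line printed just before it. As an illustration of that check (this is not the code in `torch_utils/misc.py`; `infinite_indices` is a hypothetical, torch-free stand-in), a sampler that loops over dataset indices forever has nothing to yield for an empty dataset and therefore has to reject it up front:

```python
import random


def infinite_indices(dataset_len, shuffle=True, seed=0):
    """Return an endless iterator over dataset indices.

    Raises ValueError immediately if the dataset is empty, instead of a
    bare assertion, so the message points at the likely cause.
    """
    if dataset_len <= 0:
        raise ValueError(
            "Dataset is empty (0 images found); check the image and camera paths."
        )

    def _generate():
        rng = random.Random(seed)
        order = list(range(dataset_len))
        while True:
            if shuffle:
                rng.shuffle(order)  # reshuffle at the start of every pass
            yield from order

    return _generate()
```

In practice the fix is on the data side: once the image path resolves to a folder the loader can actually read images from, the length check passes and training proceeds.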