
Training the problem #6

Closed
SHOUshou0426 opened this issue Oct 24, 2022 · 13 comments

Comments

@SHOUshou0426

Hello, my training run is cut off at epoch 999, when "Evaluating k-NN accuracy" starts, with the error ValueError: range() arg 3 must not be zero. Training on the AFHQ dataset raises the same ValueError: range() arg 3 must not be zero.

Traceback (most recent call last):
  File "train.py", line 258, in <module>
  File "train.py", line 254, in main
  File "train.py", line 190, in training_loop
  File "C:\Users\yuanx\.conda\envs\style2\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\metrics\knn_evaluator.py", line 69, in evaluate
    top1, top5 = knn_classifier(
  File "C:\Users\yuanx\.conda\envs\style2\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\metrics\knn_evaluator.py", line 106, in knn_classifier
    for idx in range(0, num_test_images, imgs_per_chunk):
ValueError: range() arg 3 must not be zero

Here is what I printed while debugging:

num_test_images, num_chunks = test_labels.shape[0], 100
# prints num_test_images = 32

imgs_per_chunk = num_test_images // num_chunks
# prints imgs_per_chunk = 0

Environment: torch==1.11.0+cu113, CUDA 11.3
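
From the printed values, the failure mode seems clear: knn_classifier splits the test features into num_chunks = 100 chunks, so with only 32 validation images the integer division yields 0 and range(0, 32, 0) raises. Below is a minimal sketch of a possible guard; clamping the chunk size to at least 1 is my assumption, not the repository's actual fix.

```python
def chunk_ranges(num_test_images, num_chunks=100):
    """Yield (start, end) index ranges for chunked k-NN evaluation."""
    # Clamp to at least 1 so datasets smaller than num_chunks do not
    # produce a zero step for range() (the error reported above).
    imgs_per_chunk = max(num_test_images // num_chunks, 1)
    for idx in range(0, num_test_images, imgs_per_chunk):
        yield idx, min(idx + imgs_per_chunk, num_test_images)

# With 32 validation images: 32 // 100 == 0 in the original code,
# whereas the clamped version iterates one image per chunk.
print(len(list(chunk_ranges(32))))  # 32
```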

@SHOUshou0426
Author

I use the command like this:
python train.py --mod-type adain --total-nimg 1.6M --batch-size 4 --load-size 320 --crop-size 256 --image-size 256 --train-dataset datasets/l2l_cloth/train --eval-dataset datasets/l2l_cloth/val --out-dir runs --extra-desc some descriptions

@kunheek
Owner

kunheek commented Oct 24, 2022

Hi, can you tell me how many images are in your test set?
I guess this happens when your validation set has fewer than 100 images.

@SHOUshou0426
Author

The training set has 130 images and the validation set has 32 images, but I also get ValueError: range() arg 3 must not be zero when using the AFHQ dataset.

@SHOUshou0426
Author

When training on the AFHQ dataset, the error appears at epoch 19999.

@kunheek
Owner

kunheek commented Oct 24, 2022

I see. Can you share the command you used for the AFHQ dataset? I will reproduce it myself.
Until the problem is fixed, you can train your model without evaluation by adding --evaluation false to the command. You can evaluate it after training using saved checkpoints.

By the way, due to the use of SwAV, I recommend using a batch size larger than 4 (16 should be enough). Also, 130 images may not be enough if you are training a model from scratch.

@SHOUshou0426
Author

The command I use for the AFHQ dataset is:
python train.py --mod-type adain --total-nimg 1.6M --batch-size 16 --load-size 320 --crop-size 256 --image-size 256 --train-dataset datasets/afhq/train --eval-dataset datasets/afhq/val --out-dir runs --extra-desc some descriptions

Does the metrics command from the README run without this error for you?
The command I tried from the README is
python -m metrics fid reconstruction --seed 123 --checkpoint ./checkpoints/afhq-stylegan2-5M.pt --train-dataset ./datasets/afhq/train --eval-dataset ./datasets/afhq/val

and it fails with this error:

C:\Users\yuanx\.conda\envs\style2\lib\site-packages\torch\utils\cpp_extension.py:322: UserWarning: Error checking compiler version for cl: [WinError 2] The system cannot find the file specified.
  warnings.warn(f'Error checking compiler version for {compiler}: {error}')
INFO: Could not find files for the given pattern(s).
Traceback (most recent call last):
  File "C:\Users\yuanx\.conda\envs\style2\lib\runpy.py", line 192, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\yuanx\.conda\envs\style2\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\metrics\__main__.py", line 88, in <module>
    main()
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\metrics\__main__.py", line 70, in main
    model = StyleAwareDiscriminator(opts)
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\mylib\base_model.py", line 23, in __init__
    self._create_networks()
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\model\model.py", line 72, in _create_networks
    self.G = Generator(
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\model\networks\generator.py", line 42, in __init__
    from .stylegan2_layers import EncodeBlock, StyleBlock
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\model\networks\stylegan2_layers.py", line 5, in <module>
    import model.networks.stylegan2_op as ops
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\model\networks\stylegan2_op\__init__.py", line 1, in <module>
    from .fused_act import FusedLeakyReLU, fused_leaky_relu
  File "C:\Users\yuanx\Desktop\style\style-aware-discriminator\model\networks\stylegan2_op\fused_act.py", line 10, in <module>
    fused = load(
  File "C:\Users\yuanx\.conda\envs\style2\lib\site-packages\torch\utils\cpp_extension.py", line 1144, in load
    return _jit_compile(
  File "C:\Users\yuanx\.conda\envs\style2\lib\site-packages\torch\utils\cpp_extension.py", line 1357, in _jit_compile
    _write_ninja_file_and_build_library(
  File "C:\Users\yuanx\.conda\envs\style2\lib\site-packages\torch\utils\cpp_extension.py", line 1456, in _write_ninja_file_and_build_library
    _write_ninja_file_to_build_library(
  File "C:\Users\yuanx\.conda\envs\style2\lib\site-packages\torch\utils\cpp_extension.py", line 1898, in _write_ninja_file_to_build_library
    _write_ninja_file(
  File "C:\Users\yuanx\.conda\envs\style2\lib\site-packages\torch\utils\cpp_extension.py", line 2023, in _write_ninja_file
    cl_paths = subprocess.check_output(['where',
  File "C:\Users\yuanx\.conda\envs\style2\lib\subprocess.py", line 411, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "C:\Users\yuanx\.conda\envs\style2\lib\subprocess.py", line 512, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['where', 'cl']' returned non-zero exit status 1.

@kunheek
Owner

kunheek commented Oct 24, 2022

Note that the custom CUDA kernel only works on Linux. It seems that you are using Windows.
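
If you do stay on Windows, a common workaround in StyleGAN2-based code is to fall back to a pure-PyTorch implementation of the fused ops instead of JIT-compiling the CUDA extension. Below is a minimal sketch of such a fallback for fused_leaky_relu, assuming the usual StyleGAN2 signature; whether this repository ships an equivalent fallback is an assumption, not something confirmed here.

```python
import torch.nn.functional as F

def fused_leaky_relu(input, bias=None, negative_slope=0.2, scale=2 ** 0.5):
    # Pure-PyTorch stand-in for the fused CUDA op: add the per-channel bias,
    # apply leaky ReLU, then rescale to preserve the signal magnitude.
    if bias is not None:
        rest_dim = [1] * (input.ndim - bias.ndim - 1)
        input = input + bias.view(1, bias.shape[0], *rest_dim)
    return F.leaky_relu(input, negative_slope=negative_slope) * scale
```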

@SHOUshou0426
Author

Does the command need to be modified?

@SHOUshou0426
Author

My hardware is limited. Will using a batch size smaller than 16 hurt the results?

@kunheek
Owner

kunheek commented Oct 24, 2022

You can use --mod-type=adain, but I cannot guarantee that it will work, as I have never tested the code on Windows. I recommend running the code on Linux (you can use WSL if you are familiar with it).

In general, the larger the batch size, the better. I haven't tested the code with a batch size smaller than 16, so I can't say how smaller batch sizes affect the results.

@SHOUshou0426
Author

Training with --mod-type=adain works without problems; it is the metrics command from the README that fails.

@kunheek
Owner

kunheek commented Oct 24, 2022

It is not that something is broken; '--mod-type' is set automatically according to the checkpoint used. The checkpoint 'afhq-stylegan2-5M.pt' is a model trained with --mod-type=stylegan2.
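
For illustration only, this is roughly what "set according to the checkpoint" means in practice; the key names below are assumptions about the checkpoint layout, not the repository's actual code.

```python
import torch

# Hypothetical sketch: evaluation restores the training-time options saved in
# the checkpoint, so a --mod-type flag on the command line is not needed.
ckpt = torch.load("checkpoints/afhq-stylegan2-5M.pt", map_location="cpu")
train_opts = ckpt.get("options", {})  # key name is an assumption
print(train_opts.get("mod_type"))     # expected: "stylegan2" for this checkpoint
```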

@SHOUshou0426
Author

Thank you for the response. I will try WSL.

@kunheek kunheek closed this as completed Oct 28, 2022