about training steps #6
Comments
Any reply is really appreciated.
Sorry for the delayed response.
Thanks for your reply.
For ShapeNet (both cars and chairs), the models were trained for roughly 300k iterations without the GAN loss and 200k more iterations with the GAN loss.
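For anyone mapping this schedule onto the code: the traceback further down shows run_single_step being called with opt_gan=s > gan_start_step, which matches a two-phase schedule. Below is a minimal sketch in Python, assuming gan_start_step is about 300k and the total step count about 500k (the 300k + 200k from the reply above); the variable names and loop are illustrative, not copied from trainer.py.

# Hypothetical sketch of the two-phase schedule; only the opt_gan switch is
# taken from trainer.py (see the traceback below), the rest is assumed.
gan_start_step = 300000       # ~300k steps with pixel + flow losses only
max_training_steps = 500000   # ~200k additional steps that also optimize the GAN loss

for s in range(max_training_steps):
    opt_gan = s > gan_start_step  # enable the GAN objective only in the second phase
    # trainer.run_single_step(batch, step=s, opt_gan=opt_gan, is_train=True)
    # would then include or skip the discriminator/generator updates accordingly.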
I used the following training command:
and the following testing command:
python evaler.py --dataset chair --data_id_list ./testing_tuple_lists/id_chair_random_elevation.txt --loss --checkpoint /data/ehab/Multiview2NovelviewMaster/train_dir/chair-default-bs_8_lr_flow_0.0001_pixel_5e-05_d_0.0001-num_input-4-20190325-150046/model-335001 --write_summary --summary_file log_chair335.txt
The recorded results in the report are for only two views, not 4, as follows:
Checkpoint: /data/ehab/Multiview2NovelviewMaster/train_dir/chair-default-bs_8_lr_flow_0.0001_pixel_5e-05_d_0.0001-num_input-4-20190325-150046/model-335001
Do you have any idea why the report contains results for only two views?
By default, the evaler only feeds two source views to the model (which can be seen here). You need to specify the number of input views explicitly.
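Assuming evaler.py accepts the same --num_input flag that trainer.py uses (an assumption based on the training command quoted in the original post below and on the "num_input-4" tag in the checkpoint directory name, not something confirmed in this thread), the evaluation command for a checkpoint trained with four input views would presumably become:

python evaler.py --dataset chair --num_input 4 --data_id_list ./testing_tuple_lists/id_chair_random_elevation.txt --loss --checkpoint /data/ehab/Multiview2NovelviewMaster/train_dir/chair-default-bs_8_lr_flow_0.0001_pixel_5e-05_d_0.0001-num_input-4-20190325-150046/model-335001 --write_summary --summary_file log_chair335.txt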
Dear Author,
Thanks for sharing your interesting work, but I have the following questions:
My training has reached the following step:
[2019-03-24 17:43:27,001] [train step 145431] Loss: 4.35295 Pixel loss: 4.03669 Flow loss: 0.31627 (1.589 sec/batch, 5.036 instances/sec)
and it has not finished yet, so I am asking: what is the total number of training steps?
python trainer.py --batch_size 8 --dataset car --num_input 4
but it gives the following error after reaching training step 4261. Do you have any idea why?
[2019-03-23 05:36:24,763] [train step 4261] Loss: 2.96025 Pixel loss: 2.85450 Flow loss: 0.10575 (1.607 sec/batch, 2.489 instances/sec)
Traceback (most recent call last):
File "trainer.py", line 380, in
main()
File "trainer.py", line 377, in main
trainer.train()
File "trainer.py", line 193, in train
opt_gan=s > gan_start_step, is_train=True)
File "trainer.py", line 209, in run_single_step
batch_chunk = self.session.run(batch)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
run_metadata_ptr)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
feed_dict_tensor, options, run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
options, run_metadata)
File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: RandomShuffleQueue '_0_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 4, current size 3)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_STRING, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
Caused by op u'shuffle_batch', defined at:
File "trainer.py", line 380, in
main()
File "trainer.py", line 374, in main
trainer = Trainer(config, dataset_train, dataset_test)
File "trainer.py", line 48, in init
dataset, self.batch_size, is_training=True)
File "/data/ehab/Multiview2NovelviewMaster/input_ops.py", line 76, in create_input_ops
min_after_dequeue=min_capacity,
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 1220, in shuffle_batch
name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/training/input.py", line 791, in _shuffle_batch
dequeued = queue.dequeue_many(batch_size, name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/data_flow_ops.py", line 457, in dequeue_many
self._queue_ref, n=n, component_types=self._dtypes, name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/gen_data_flow_ops.py", line 1342, in _queue_dequeue_many_v2
timeout_ms=timeout_ms, name=name)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in init
self._traceback = self._graph._extract_stack() # pylint: disable=protected-access
OutOfRangeError (see above for traceback): RandomShuffleQueue '_0_shuffle_batch/random_shuffle_queue' is closed and has insufficient elements (requested 4, current size 3)
[[Node: shuffle_batch = QueueDequeueManyV2[component_types=[DT_FLOAT, DT_STRING, DT_FLOAT], timeout_ms=-1, _device="/job:localhost/replica:0/task:0/cpu:0"](shuffle_batch/random_shuffle_queue, shuffle_batch/n)]]
I really appreciate your time and your reply
Regards
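A side note on the OutOfRangeError in the traceback above: with the queue-based tf.train.shuffle_batch pipeline built in input_ops.py, this error appears when the batching queue is closed before it can deliver a full batch, which usually means the enqueue threads ran out of input data or stopped after failing to read it. The following is a minimal, self-contained sketch of that failure mode with hypothetical toy data, not code from this repository:

import tensorflow as tf  # TF 1.x queue-based input pipeline, matching the traceback

# Hypothetical tiny input: only 3 examples for one epoch, but a batch of 4 is
# requested, so the shuffle queue closes with fewer elements than a full batch
# and dequeue_many raises OutOfRangeError ("requested 4, current size 3").
input_queue = tf.train.input_producer(tf.constant([1.0, 2.0, 3.0]), num_epochs=1)
example = input_queue.dequeue()
batch = tf.train.shuffle_batch([example], batch_size=4,
                               capacity=32, min_after_dequeue=8)

with tf.Session() as sess:
    sess.run(tf.local_variables_initializer())  # num_epochs uses a local counter variable
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    try:
        sess.run(batch)  # asks for 4 elements, only 3 are ever enqueued
    except tf.errors.OutOfRangeError:
        print("RandomShuffleQueue closed with insufficient elements, as in the issue")
    coord.request_stop()
    coord.join(threads)

If the same error shows up mid-training as in the log above, a common first check is whether all dataset files referenced by the id list exist and are readable.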