session run is gray in tensorboard->graph, and unknown device #26

elulue · 2018-08-17T03:56:21Z

Hi,
The py3 branch works fine on my PC. I'm using Ubuntu 18.04, Python 3.5 and TensorFlow 1.10 with a single NVIDIA 1070 video card.
During training, GPU usage is around 30% while most of the video memory is occupied. I wanted to see whether there is room to improve performance, so I opened TensorBoard.

However, when I check the Graph tab in TensorBoard, the device is shown as unknown, and I can't see compute time either.
Could you please let me know if there is any way to fix this?
Thanks a lot.

elulue · 2018-08-17T06:40:00Z

By the way, when I use tfdbg I see this warning:
"WARNING:tensorflow:Failed to load partition graphs for device /job:localhost/replica:0/task:0/device:CPU:0 from disk. As a fallback, the client graphs will be used. This may cause mismatches in device names."

elulue · 2018-08-18T16:21:38Z

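I've found the root cause: TensorBoard doesn't show compute time and similar stats by default. They only appear when a step is run with tracing enabled and the resulting run metadata is written to the summary writer.
I made the change below, and it works for me: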


                if np.mod(global_step, show_every_n_step) == 1:
                    # Trace this step so TensorBoard gets compute time and device info.
                    run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
                    run_metadata = tf.RunMetadata()

                    train_loss, train_mse, _, train_merged_sum = self.sess.run(
                        [self.loss, self.mse, self.optim, self.summary], train_data_feed,
                        options=run_options, run_metadata=run_metadata)

                    # Each traced step is stored under its own tag, e.g. 'step101'.
                    self.writer.add_run_metadata(run_metadata, 'step{}'.format(global_step))
                    self.writer.add_summary(train_merged_sum, global_step=global_step)

                else:
                    # Regular step: no tracing, only the merged summaries are written.
                    train_loss, train_mse, _, train_merged_sum = self.sess.run(
                        [self.loss, self.mse, self.optim, self.summary], train_data_feed)
                    self.writer.add_summary(train_merged_sum, global_step=global_step)
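
For anyone who wants the same thing without duplicating the `sess.run` call, here is a rough sketch rather than the project's actual code. The names `should_trace` and `run_step` are made up for illustration, and it assumes the TF 1.x API (`tf.RunOptions`, `tf.RunMetadata`, `FileWriter.add_run_metadata`). It also handles one edge case as an assumption about the intended behavior: with `show_every_n_step == 1`, the `np.mod(...) == 1` check never fires, because any number modulo 1 is 0.

```python
def should_trace(global_step, every_n_steps):
    """Hypothetical helper: trace steps 1, 1 + n, 1 + 2n, ... like the snippet above."""
    if every_n_steps <= 0:
        return False  # assumption: a non-positive interval disables tracing
    if every_n_steps == 1:
        return True   # x % 1 is always 0, so an "== 1" check would never match
    return global_step % every_n_steps == 1


def run_step(sess, fetches, feed_dict, writer, global_step, every_n_steps):
    """Hypothetical wrapper: one sess.run call, traced only on selected steps."""
    run_kwargs = {}
    run_metadata = None
    if should_trace(global_step, every_n_steps):
        import tensorflow as tf  # TF 1.x API, imported only when tracing
        run_metadata = tf.RunMetadata()
        run_kwargs = {
            'options': tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE),
            'run_metadata': run_metadata,
        }

    results = sess.run(fetches, feed_dict, **run_kwargs)

    if run_metadata is not None:
        # The tag has to be unique per traced step, so the step number is included.
        writer.add_run_metadata(run_metadata, 'step{}'.format(global_step))
    return results
```

In the training loop this would be called as something like `train_loss, train_mse, _, train_merged_sum = run_step(self.sess, [self.loss, self.mse, self.optim, self.summary], train_data_feed, self.writer, global_step, show_every_n_step)`, followed by the usual `self.writer.add_summary(...)`. Traced steps run slower, so a fairly large interval is probably a sensible choice.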

elulue closed this as completed Aug 18, 2018
