If we use the `!` magic to execute the training and logging code, or if we run it in a terminal, everything behaves as expected:

In [1]:
for _ in range(2):
    !python training.py

[34m[1mwandb[0m: Tracking run with wandb version 0.9.5
[34m[1mwandb[0m: Run data is saved locally in wandb/run-20200825_012929-j9xk02w5
[34m[1mwandb[0m: Syncing run [33mlogical-frog-217[0m
[34m[1mwandb[0m: ⭐️ View project at [34m[4mhttps://app.wandb.ai/charlesfrye/uncategorized[0m
[34m[1mwandb[0m: 🚀 View run at [34m[4mhttps://app.wandb.ai/charlesfrye/uncategorized/runs/j9xk02w5[0m
[34m[1mwandb[0m: Run `wandb off` to turn off syncing.


[34m[1mwandb[0m: Waiting for W&B process to finish, PID 5422
[34m[1mwandb[0m: Program ended successfully.
[34m[1mwandb[0m: Run summary:
[34m[1mwandb[0m:   global_step 1
[34m[1mwandb[0m:       trn/bar 0.15508607029914856
[34m[1mwandb[0m:      _runtime 2.1646974086761475
[34m[1mwandb[0m:       trn/foo 0.9074437618255615
[34m[1mwandb[0m:         _step 1
[34m[1mwandb[0m:    _timestamp 1598344170.3939404
[34m[1mwandb[0m: Syncing 4 W&B file(s), 0 media file(s), 0 artifact file(s) and 2 other file(s)
[34m

But that's not the structure of the code I inherited.

Instead, it's intended that you import a class and execute its `.main` method.

In [2]:
import training

The first time we do this, everything goes fine -- the metrics are saved with the desired names, as can be seen in the charts of the linked run.

In [3]:
app = training.LunaTrainingApp(sys_argv=[])
app.main()

Failed to query for notebook name, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable


On the second run, however, the metric names are all prepended with the name of the `wandb.run.dir` directory.

This is bad because this directory changes on every run,
so it becomes very difficult to compare across runs.

Note that this still happens even though we only have one `SummaryWriter` and so only one `tfevents` file.

In [4]:
app = training.LunaTrainingApp(sys_argv=[])
app.main()

Failed to query for notebook name, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable
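
To make the comparison problem concrete, here is a minimal sketch of what a changing prefix does to metric keys, and of stripping it after the fact. The helper `strip_run_prefix`, the directory name `run-dir-two`, the exact `<dir>/<metric>` key layout, and the numeric values are all placeholders assumed for illustration; none of them come from wandb or from `training.py`.

```python
def strip_run_prefix(metric_name: str, run_dir_name: str) -> str:
    """Drop a leading '<run_dir_name>/' from a metric name, if present.

    Hypothetical helper: the '<dir>/<metric>' layout is an assumption.
    """
    prefix = run_dir_name.rstrip("/") + "/"
    if metric_name.startswith(prefix):
        return metric_name[len(prefix):]
    return metric_name


# Placeholder metrics: the first run has clean names, the second is prefixed.
first_run = {"trn/foo": 1.0, "trn/bar": 2.0}
second_run = {"run-dir-two/trn/foo": 3.0, "run-dir-two/trn/bar": 4.0}

# With the prefix in place, the two runs share no metric names at all.
shared_raw = set(first_run) & set(second_run)

# After stripping the (assumed) prefix, the names line up again.
normalized = {strip_run_prefix(k, "run-dir-two"): v for k, v in second_run.items()}
shared_clean = set(first_run) & set(normalized)

print(sorted(shared_raw))
print(sorted(shared_clean))
```

This only papers over the symptom on the analysis side. It does not explain why the prefix appears on the second call to `app.main()`.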


If we simply force all of the `tfevents` files to be written into some `new` directory, we can control the prefix for one iteration, but on the second iteration nothing gets logged at all (not even the `tfevents` file).

In [5]:
for _ in range(2):
    app = training.LunaTrainingApp(sys_argv=["--new_dir"])
    app.main()

Failed to query for notebook name, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable


Failed to query for notebook name, you can set it manually with the WANDB_NOTEBOOK_NAME environment variable
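
For readers without access to `training.py`, the `sys_argv` lists passed above suggest that the class parses its own command-line flags. The following is a minimal sketch of how a boolean `--new_dir` switch could be wired up with `argparse`. The class name `SketchApp` and everything inside it are assumptions for illustration, not the actual `LunaTrainingApp` code.

```python
import argparse
import sys
from typing import List, Optional


class SketchApp:
    """Hypothetical stand-in for the training app's argument handling."""

    def __init__(self, sys_argv: Optional[List[str]] = None):
        # Fall back to the real command line when no list is passed,
        # so the same class works from a terminal and from a notebook.
        if sys_argv is None:
            sys_argv = sys.argv[1:]

        parser = argparse.ArgumentParser()
        # Assumed semantics: a plain on/off switch with no value.
        parser.add_argument(
            "--new_dir",
            action="store_true",
            default=False,
            help="write tfevents files to a separate directory",
        )
        self.cli_args = parser.parse_args(sys_argv)


print(SketchApp(sys_argv=[]).cli_args.new_dir)
print(SketchApp(sys_argv=["--new_dir"]).cli_args.new_dir)
```

Passing `sys_argv=[]` explicitly matters in a notebook, because the kernel's own `sys.argv` contains arguments that a parser like this would reject.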


Weirdly enough, if you run the code in the cell above _before_ running the code that writes the `tfevents` to `wandb.run.dir`, the resulting metrics are logged with the desired names, with no prepending
(see [this run](https://app.wandb.ai/charlesfrye/uncategorized/runs/2hd3i6p0) and [this run](https://app.wandb.ai/charlesfrye/uncategorized/runs/36p7heao), and note the session history in the latter, as compared with the linked runs above).

And for one final bit of weirdness,
if we go back and run the code with the `!` shell magic again,
having run `app.main()` without the `new_dir` argument at least once,
it no longer works.

In [6]:
!python training.py

Traceback (most recent call last):
  File "training.py", line 65, in <module>
    LunaTrainingApp().main()
  File "training.py", line 39, in main
    self.logMetrics(epoch_ndx, 'trn')
  File "training.py", line 52, in logMetrics
    self.initTensorboardWriters(new_dir=self.cli_args.new_dir)
  File "training.py", line 27, in initTensorboardWriters
    tb_dir = wandb.run.dir
AttributeError: 'NoneType' object has no attribute 'dir'
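
The traceback shows that `wandb.run` is `None` at the point where `initTensorboardWriters` reads `wandb.run.dir`. As a stopgap while the root cause is unknown, the lookup could be guarded. This is a minimal sketch: the helper `resolve_tb_dir` and the fallback directory `"runs"` are hypothetical, and the `run` argument stands in for whatever `wandb.run` holds at call time.

```python
from types import SimpleNamespace
from typing import Any, Optional


def resolve_tb_dir(run: Optional[Any], fallback: str = "runs") -> str:
    """Return `run.dir` when a run is active, otherwise a fallback path.

    Hypothetical helper; `run` plays the role of `wandb.run`.
    """
    if run is None:
        # No active run: avoid the AttributeError and log locally instead.
        return fallback
    return run.dir


# Stand-in object with a `dir` attribute, mimicking an active run.
active = SimpleNamespace(dir="wandb/run-20200825_012929-j9xk02w5")

print(resolve_tb_dir(active))
print(resolve_tb_dir(None))
```

Falling back silently hides the fact that nothing is being synced, so raising a clearer error in the `None` branch may be the better choice, depending on how the script is used.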
