What is the exactly meaning of sync_cycle? #415

Closed
zchrissirhcz opened this issue Apr 20, 2018 · 2 comments

@zchrissirhcz

Hi, I'm using VisualDL to plot a loss-vs-iteration curve during my network's training (including validation) by writing the loss values to scalar objects. However, I don't know the exact meaning of sync_cycle, and I often have to wait a long time before the first points or lines appear on the web page.

Some details are: I set sync_cycle to 10:

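from visualdl import LogWriter
logw = LogWriter(log_dir, sync_cycle=10)
with logw.mode("train") as logger:  # assumed: the post omitted the mode() blocks that bind `logger`
    sc_train_loss = logger.scalar("loss")
with logw.mode("val") as logger:
    sc_val_loss = logger.scalar("loss")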

and I do write the loss value to sc_train_loss on each iteration of the training or validation phase. I use Caffe, and the terminal has already printed results for many iterations (about 1000), but the plot on the web page is still empty. The values in the terminal output are non-zero; they look like this:

I0420 13:36:23.006973 20468 solver.cpp:239] Iteration 1620 (4.86939 iter/s, 4.10729s/20 iters), loss = 12.0148
I0420 13:36:23.007010 20468 solver.cpp:258]     Train net output #0: loss = 12.0148 (* 1 = 12.0148 loss)
I0420 13:36:23.007019 20468 sgd_solver.cpp:112] Iteration 1620, lr = 0.0001
I0420 13:36:27.057492 20468 solver.cpp:239] Iteration 1640 (4.93785 iter/s, 4.05035s/20 iters), loss = 16.038
I0420 13:36:27.057543 20468 solver.cpp:258]     Train net output #0: loss = 16.038 (* 1 = 16.038 loss)
I0420 13:36:27.057552 20468 sgd_solver.cpp:112] Iteration 1640, lr = 0.0001
I0420 13:36:31.127528 20468 solver.cpp:239] Iteration 1660 (4.91419 iter/s, 4.06985s/20 iters), loss = 13.2756
I0420 13:36:31.127578 20468 solver.cpp:258]     Train net output #0: loss = 13.2756 (* 1 = 13.2756 loss)
I0420 13:36:31.127588 20468 sgd_solver.cpp:112] Iteration 1660, lr = 0.0001
I0420 13:36:35.280731 20468 solver.cpp:239] Iteration 1680 (4.8158 iter/s, 4.15299s/20 iters), loss = 13.6123
I0420 13:36:35.280764 20468 solver.cpp:258]     Train net output #0: loss = 13.6123 (* 1 = 13.6123 loss)
I0420 13:36:35.280771 20468 sgd_solver.cpp:112] Iteration 1680, lr = 0.0001
I0420 13:36:59.297955 20468 solver.cpp:239] Iteration 1700 (0.832762 iter/s, 24.0165s/20 iters), loss = 14.1602
I0420 13:36:59.297991 20468 solver.cpp:258]     Train net output #0: loss = 14.1602 (* 1 = 14.1602 loss)
I0420 13:36:59.298000 20468 sgd_solver.cpp:112] Iteration 1700, lr = 0.0001
I0420 13:37:03.349117 20468 solver.cpp:239] Iteration 1720 (4.93706 iter/s, 4.05099s/20 iters), loss = 14.279
I0420 13:37:03.349165 20468 solver.cpp:258]     Train net output #0: loss = 14.279 (* 1 = 14.279 loss)
I0420 13:37:03.349189 20468 sgd_solver.cpp:112] Iteration 1720, lr = 0.0001
I0420 13:37:07.519793 20468 solver.cpp:239] Iteration 1740 (4.79565 iter/s, 4.17045s/20 iters), loss = 13.2219
I0420 13:37:07.519840 20468 solver.cpp:258]     Train net output #0: loss = 13.2219 (* 1 = 13.2219 loss)

In my code, validation does not run until iteration 5250, i.e. sc_val_loss receives its first data point only after 5250 training iterations. Is this why I have to wait for 5250 training iterations and the first validation iteration before the plot shows up on the web page? Thanks.

@jetfuel
Collaborator

jetfuel commented Apr 24, 2018

@zchrissirhcz Hi, thank you for using VisualDL. sync_cycle is used to calculate when to sync the records to disk, and it is currently not optimized.

VisualDL keeps track of a counter. Each add_record call, or any other VisualDL-related call, increments the counter. Once the counter reaches sync_cycle, VisualDL writes the records to the file system.
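
The counter behaviour described above can be pictured with a small toy model. This is only a sketch under assumptions, not VisualDL's implementation: the class `CountingWriter` and its `pending` / `on_disk` lists are hypothetical names, and a plain list stands in for the log file.

```python
class CountingWriter:
    """Toy model of a counter-driven sync; not VisualDL's actual code."""

    def __init__(self, sync_cycle):
        self.sync_cycle = sync_cycle
        self.counter = 0
        self.pending = []   # records still held in memory
        self.on_disk = []   # stand-in for the log file

    def add_record(self, step, value):
        # Every record bumps the counter; a full cycle triggers a sync.
        self.pending.append((step, value))
        self.counter += 1
        if self.counter >= self.sync_cycle:
            self.save()

    def save(self):
        # Move everything buffered so far to "disk" and restart the count.
        self.on_disk.extend(self.pending)
        self.pending.clear()
        self.counter = 0


writer = CountingWriter(sync_cycle=10)
for step in range(25):
    writer.add_record(step, 12.0)
print(len(writer.on_disk), len(writer.pending))  # 20 5
```

In this model, the last 5 records stay invisible to a reader of the "file" until either 5 more records arrive or `save()` is called explicitly.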

Because I/O is very expensive, VisualDL currently adjusts sync_cycle automatically so that successive writes are roughly 30 seconds apart.
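
One way such an automatic adjustment could work is a simple proportional rescale. The function below is a hypothetical illustration (the name `rescale_sync_cycle` and the rescaling rule are assumptions; the thread does not say how VisualDL actually computes the new value). It shows why a user-supplied sync_cycle of 10 may not stay at 10 for long.

```python
def rescale_sync_cycle(sync_cycle, seconds_since_last_sync, target_seconds=30.0):
    """Return a cycle length that would have made the last interval hit the target."""
    if seconds_since_last_sync <= 0:
        return sync_cycle  # no timing information; leave the cycle unchanged
    scaled = sync_cycle * target_seconds / seconds_since_last_sync
    return max(1, round(scaled))  # never drop below one record per sync


# 10 records arriving in about 2 seconds (roughly 4.9 iterations/s, one record each)
new_cycle = rescale_sync_cycle(10, 2.0)
print(new_cycle)  # 150
```

Under this assumed rule, fast training loops end up with a much larger effective cycle than the one passed to the constructor, which fits the observation that the first points take a while to appear.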

We are actually hoping to remove sync_cycle and hide it from users, since it is an implementation detail that users don't need to know about.

By the way, there is a hidden save function. You can force a sync by calling save():

from visualdl import LogWriter

logw = LogWriter("./random_log", sync_cycle=10000)
logw.save()  # force pending records to be written to disk now
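
In a training loop, the forced sync could be issued every fixed number of steps. The helper below is a sketch: `log_losses` and `save_every` are hypothetical names, and it only assumes that the writer has a save() method and the scalar has an add_record(step, value) method, as mentioned in this thread.

```python
def log_losses(writer, scalar, losses, save_every=100):
    """Record each loss and force a sync every `save_every` steps.

    `writer` must provide save() and `scalar` must provide
    add_record(step, value); both are assumed to match the VisualDL
    objects discussed in this thread.
    Returns the number of forced syncs.
    """
    saves = 0
    for step, loss in enumerate(losses):
        scalar.add_record(step, loss)
        if (step + 1) % save_every == 0:
            writer.save()
            saves += 1
    writer.save()  # final flush so the tail of the run is visible too
    return saves + 1
```

With the objects from the question, a call might look like `log_losses(logw, sc_train_loss, losses, save_every=100)`. A smaller `save_every` shows points sooner at the cost of more frequent disk writes.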

Hopefully this will resolve your issue.

@zchrissirhcz
Author

@jetfuel Thanks for your reply. The save() function you mentioned will help me see the plot quickly.
