Ugly implementation of evaluation control #17

First, the evaluation worker should not sample data asynchronously, since that wastes computing resources. Instead, it should wait for the training manager to send a signal together with the policy parameters to be evaluated. It may also need to maintain a local parameter buffer to handle new signals that arrive while an evaluation is still running.
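
A minimal sketch of this idea, not the project's actual implementation: the worker blocks until a request arrives, and a small buffer absorbs requests that come in while an evaluation is running. The names `LatestParamBuffer`, `evaluation_worker`, and `evaluate_fn` are hypothetical, and the keep-only-the-newest policy is an assumption; a FIFO `queue.Queue` would be the alternative if every request must be evaluated.

```python
import threading


class LatestParamBuffer:
    """Hypothetical buffer that keeps only the newest pending request."""

    def __init__(self):
        self._cond = threading.Condition()
        self._pending = None
        self._closed = False

    def put(self, request):
        # Called by the training manager; overwrites any stale request.
        with self._cond:
            self._pending = request
            self._cond.notify()

    def close(self):
        # Signals the worker to stop once the pending request is consumed.
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get(self):
        # Blocks (no busy sampling) until a request arrives or the buffer closes.
        # Returns None only when closed with nothing pending.
        with self._cond:
            while self._pending is None and not self._closed:
                self._cond.wait()
            request, self._pending = self._pending, None
            return request


def evaluation_worker(buffer, evaluate_fn, results):
    """Evaluate only when signalled; idle otherwise."""
    while True:
        request = buffer.get()
        if request is None:
            break
        results.append(evaluate_fn(request))
```

In practice `evaluation_worker` would run in its own thread or process while the training manager calls `put`; requests that arrive during an evaluation simply replace the pending one under this policy.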

Second, the message received from the training manager should include the corresponding training epoch number, so that the evaluation worker can log evaluation metrics against the training epoch rather than the evaluator's local sampling epoch.
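
One way this could look, as a sketch under assumptions: the request carries the training epoch next to the parameters, and the evaluator copies that epoch into every logged record. `EvalRequest`, `run_evaluation`, and the field names are hypothetical and not taken from the codebase.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class EvalRequest:
    """Hypothetical message sent by the training manager."""
    train_epoch: int      # epoch at which the parameters were produced
    policy_params: Any    # parameters to evaluate


def run_evaluation(
    request: EvalRequest,
    evaluate_fn: Callable[[Any], Dict[str, float]],
    log: List[Dict[str, Any]],
) -> Dict[str, Any]:
    """Evaluate the parameters and log metrics keyed by the training epoch."""
    metrics = evaluate_fn(request.policy_params)
    # Use the trainer's epoch, not a counter local to the evaluator.
    record = {"train_epoch": request.train_epoch, **metrics}
    log.append(record)
    return record
```

With this shape, evaluation curves line up with training curves even when the evaluator skips or delays some requests.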

Thank you!

KornbergFresnel added the enhancement New feature or request label Jul 29, 2021

KornbergFresnel closed this as completed Nov 16, 2022

KornbergFresnel commented Jul 29, 2021 (edited)

zbzhu99 commented Jul 29, 2021 (edited)
