I noticed that the evaluate_or_sample function doesn't release memory between checkpoints.
For example, if I run ddsp_run in eval mode while another ddsp_run is training, the size of the evaluation process keeps growing as it loads and evaluates the checkpoints that are being generated by the training process. While killing and rerunning the eval process solves the issue, it is not an ideal solution.
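Until the leak itself is tracked down, the kill-and-rerun workaround could be automated by evaluating each new checkpoint in its own short-lived process, so the OS reclaims all memory when that process exits. Below is a minimal sketch of that idea, not anything ddsp provides. The helper names, the `ckpt-*.index` file pattern, and the convention of appending the checkpoint path to the command are all assumptions for illustration; the actual evaluation command has to be supplied by the caller.

```python
import subprocess
import sys
from pathlib import Path


def new_checkpoints(ckpt_dir, seen, pattern="ckpt-*.index"):
    """List checkpoint index files not yet in `seen`, oldest first.

    The "ckpt-*.index" pattern is an assumption about how the training
    job names its checkpoint files; adjust it to match your save dir.
    """
    found = sorted(Path(ckpt_dir).glob(pattern),
                   key=lambda p: (p.stat().st_mtime, p.name))
    return [p for p in found if p not in seen]


def evaluate_in_fresh_process(command, checkpoint_prefix):
    """Run `command` with the checkpoint path appended, in a child process.

    Whatever memory the child allocates is returned to the OS when it
    exits, so nothing accumulates in the long-running watcher.
    """
    result = subprocess.run(list(command) + [str(checkpoint_prefix)],
                            capture_output=True, text=True, check=True)
    return result.stdout


def evaluate_new(ckpt_dir, command, seen):
    """Evaluate every unseen checkpoint once; return the captured outputs."""
    outputs = []
    for index_file in new_checkpoints(ckpt_dir, seen):
        prefix = index_file.with_suffix("")  # ".../ckpt-1.index" -> ".../ckpt-1"
        outputs.append(evaluate_in_fresh_process(command, prefix))
        seen.add(index_file)
    return outputs
```

Calling `evaluate_new(save_dir, command, seen)` in a loop with a sleep between iterations would give a watcher whose own footprint stays flat. The trade-off is that each child pays the full startup and model-loading cost again.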
djtrip changed the title from "evaluation not releasing memmory" to "evaluation not releasing memory" on Apr 2, 2020
That's an interesting issue; we haven't run into it before. Are you running eval and training on a single machine? Our setups all run on separate instances reading from and writing to a shared filespace. It's fairly straightforward to set something like that up in the cloud, with two instances using the same bucket.