This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
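I tried a dynamic approach on my side; it converges faster and is more stable. At the start of training I use fixed values, namely the author's. Once training reaches a fixed number of batches (64*n), I switch to dynamic values. The dynamic design was inspired by focal loss. Early stopping could also be used here: when a given head has converged and its loss falls below a set threshold, start using the dynamic values. There is a problem, though: when training on my own data, the fluctuation is too large and training does not converge, so I still recommend switching after a fixed number of epochs.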
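The schedule described above can be sketched roughly as follows. This is a minimal illustration, not the commenter's actual code: it assumes the "values" are per-head loss weights, and the names `dynamic_weights`, `loss_weights`, `switch_step`, and `gamma` are hypothetical. The focal-loss-style rule used here (heads with a larger share of the loss get a larger weight) is also an assumption, since the comment does not give the formula.

```python
def dynamic_weights(head_losses, gamma=2.0):
    """Assumed focal-loss-style rule: weight each head by its share of
    the total loss raised to `gamma`, then normalize to sum to 1."""
    total = sum(head_losses)
    n = len(head_losses)
    if total <= 0:
        # All heads at zero loss: fall back to uniform weights.
        return [1.0 / n] * n
    raw = [(loss / total) ** gamma for loss in head_losses]
    norm = sum(raw)
    return [r / norm for r in raw]


def loss_weights(step, head_losses, fixed_weights, switch_step, gamma=2.0):
    """Use the fixed weights until `switch_step` batches have been seen,
    then switch to the dynamic weights."""
    if step < switch_step:
        return list(fixed_weights)
    return dynamic_weights(head_losses, gamma)


def total_loss(step, head_losses, fixed_weights, switch_step, gamma=2.0):
    """Weighted sum of the per-head losses under the current schedule."""
    weights = loss_weights(step, head_losses, fixed_weights, switch_step, gamma)
    return sum(w * loss for w, loss in zip(weights, head_losses))
```

When the per-head losses are tensors, the weights would normally be computed from detached scalar values so that no gradient flows through the weighting itself. As noted in the comment, this kind of schedule fluctuated too much on the commenter's own data, so the switch point is best treated as something to tune.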
Hello, have you trained on multiple GPUs? With multi-GPU training I always get this error: RuntimeError: Expected to have finished reduction in the prior iteration before starting a new one. This error indicates that your module has parameters that were not used in producing loss. You can enable unused parameter detection by (1) passing the keyword argument find_unused_parameters=True to torch.nn.parallel.DistributedDataParallel; (2) making sure all forward function outputs participate in calculating loss. If you already have done the above two steps, then the distributed data parallel module wasn't able to locate the output tensors in the return value of your module's forward function. Please include the loss function and the structure of the return value of forward of your module when reporting this issue (e.g. list, dict, iterable).
I have already set find_unused_parameters=True and still get the same error. The model's only return value is pred.
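One way to narrow this down, offered as a debugging sketch rather than a confirmed fix: run a single forward and backward pass without the DistributedDataParallel wrapper, then list the parameters whose `.grad` is still `None`. Those are the parameters that did not take part in producing the loss. The helper name `params_without_grad` is hypothetical, and the stand-in objects below exist only so the sketch runs without PyTorch installed.

```python
from types import SimpleNamespace


def params_without_grad(named_params):
    """Return names of trainable parameters that received no gradient.

    `named_params` is any iterable of (name, param) pairs where each
    param has `requires_grad` and `grad` attributes, for example the
    result of `model.named_parameters()` after `loss.backward()`.
    """
    return [
        name
        for name, param in named_params
        if param.requires_grad and param.grad is None
    ]


# Stand-in parameters (not real tensors) to show the expected shape
# of the input.
example_params = [
    ("backbone.weight", SimpleNamespace(requires_grad=True, grad=0.1)),
    ("unused_head.weight", SimpleNamespace(requires_grad=True, grad=None)),
    ("frozen.weight", SimpleNamespace(requires_grad=False, grad=None)),
]
unused = params_without_grad(example_params)
```

In a real run, the call would be `params_without_grad(model.named_parameters())` right after `loss.backward()` on the unwrapped model, in a single process. Any names it reports point to branches or heads that are skipped in `forward` or left out of the loss for that batch.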