The native size reward estimation is driven by an ML model (IR2Native) and thus isn't exactly ground truth data. Mircea can probably speak more to the exact reasons for switching over, but ground truth data (or something quite close to it) is almost always going to be better, assuming the training algorithm is able to work with it, which was the original concern that led to the IR2Native model.
Eventually the IR2Native model in upstream LLVM is going to get removed, as it's not used anywhere, but we're currently holding off because we're looking to use it for a comparison study against some new techniques.
Basically what @boomanaiden154 said. In addition, we initially used the IR2Size model to train with algorithms like DQN, which want dense rewards, i.e. a reward after each action (each inlined callsite). It turned out that didn't work as well as the final-reward training methods (at least the way we tried it).
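To make the dense-vs-final distinction concrete, here is a minimal, self-contained sketch (not LLVM code; all names and numbers are hypothetical), where the per-callsite size deltas stand in for whatever the size oracle reports, be it an IR2Native/IR2Size estimate or measured object size:

#include <cstdint>
#include <iostream>
#include <vector>

int main() {
  // Hypothetical per-decision size deltas: the estimated native-size change
  // caused by inlining each callsite (negative means the module shrank).
  std::vector<int64_t> SizeDeltaPerCallSite = {-12, 3, -7, -1};

  // Dense rewards, as DQN-style training wants: one reward per action,
  // here simply the size saved by that single decision.
  std::vector<int64_t> DenseRewards;
  for (int64_t Delta : SizeDeltaPerCallSite)
    DenseRewards.push_back(-Delta);

  // Final reward, as used by final-reward training: one scalar for the
  // whole episode, the total size saved across all decisions.
  int64_t FinalReward = 0;
  for (int64_t Delta : SizeDeltaPerCallSite)
    FinalReward -= Delta;

  for (int I = 0; I < static_cast<int>(DenseRewards.size()); ++I)
    std::cout << "callsite " << I << " reward: " << DenseRewards[I] << "\n";
  std::cout << "final reward: " << FinalReward << "\n";
}

With dense rewards, each decision gets its own credit signal immediately; with the final reward, the whole trajectory is scored once, which is what the current training setup relies on. The snippet below (from the development-mode inline advisor) shows where the corresponding estimate-based reward is computed per inlining decision: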
void recordInliningImpl() override {
  MLInlineAdvice::recordInliningImpl();
  getAdvisor()->resetNativeSize(Caller);
  // Sentinel reward, used when no size estimate is available.
  int Reward = std::numeric_limits<int>::max();
  if (InlineSizeEstimatorAnalysis::isEvaluatorRequested() &&
      !getAdvisor()->isForcedToStop()) {
    // Estimated native size after inlining: the caller's new estimate plus
    // the callee's estimate from before inlining.
    int NativeSizeAfter = *getAdvisor()->getNativeSizeEstimate(*Caller) +
                          *CalleeSizeEstimateBefore;
    // Reward is the size change relative to the pre-inlining caller + callee estimates.
    Reward = NativeSizeAfter -
             (*CallerSizeEstimateBefore + *CalleeSizeEstimateBefore);
    getAdvisor()->updateNativeSizeEstimate(Reward);
  }
  log(Reward, /*Success=*/true);
}