[Question] Why use llvm-size to calculate rewards? Doesn't LLVM itself also calculate a size reward? #319

Closed
18liumin opened this issue Nov 23, 2023 · 2 comments

Comments

@18liumin

  void recordInliningImpl() override {
    MLInlineAdvice::recordInliningImpl();
    getAdvisor()->resetNativeSize(Caller);
    int Reward = std::numeric_limits<int>::max();
    if (InlineSizeEstimatorAnalysis::isEvaluatorRequested() &&
        !getAdvisor()->isForcedToStop()) {
      // The reward here is the change in *estimated* native size, produced by
      // the InlineSizeEstimatorAnalysis (IR2Native) model, not a measured size.
      int NativeSizeAfter = *getAdvisor()->getNativeSizeEstimate(*Caller) +
                            *CalleeSizeEstimateBefore;
      Reward = NativeSizeAfter -
               (*CallerSizeEstimateBefore + *CalleeSizeEstimateBefore);
      getAdvisor()->updateNativeSizeEstimate(Reward);
    }
    log(Reward, /*Success=*/true);
  }

////////////////////////////////////////////////////////////////////////

cmdline = [self._llvm_size_path, output_native_path]
output_bytes = compilation_runner.start_cancellable_process(
    cmdline,
    timeout=self._compilation_timeout,
    cancellation_manager=self._cancellation_manager,
    want_output=True)
if not output_bytes:
  raise RuntimeError(f'Empty llvm-size output: {" ".join(cmdline)}')
output = output_bytes.decode('utf-8')
# llvm-size prints a header line, a line of numbers, and a trailing newline;
# the first tab-separated field of the numbers line is the text-section size.
tmp = output.split('\n')
if len(tmp) != 3:
  raise RuntimeError(f'Wrong llvm-size output {output}')
tmp = tmp[1].split('\t')
native_size = int(tmp[0])

if native_size == 0:
  return {}

if reward_only:
  return {_DEFAULT_IDENTIFIER: (None, native_size)}

result = log_reader.read_log_as_sequence_examples(log_path)
if len(result) != 1:
  return {}

sequence_example = next(iter(result.values()))

if not sequence_example.HasField('feature_lists'):
  return {}

return {_DEFAULT_IDENTIFIER: (sequence_example, native_size)}
@boomanaiden154
Collaborator

The native size reward estimation in LLVM is driven by an ML model (IR2Native) and thus isn't exactly ground truth data. Mircea can probably speak more to the exact reasons for switching over, but ground truth data (or something quite close to it) is almost always going to be better, assuming the training algorithm can work with it; whether it could was the original concern that led to the IR2Native model.

Eventually the IR2Native model in upstream LLVM is going to be removed, as it's not used anywhere, but we're currently holding off because we're looking to use it for a comparison study against some new techniques.
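
For context, a minimal sketch (not the repo's actual code) of what "ground truth" means here: llvm-size reports the measured text-section size of the compiled object, so a reward can be computed from real sizes rather than from an IR2Native estimate. The paths and the baseline-vs-policy delta below are illustrative assumptions.

import subprocess

def get_text_size(object_path, llvm_size_path='llvm-size'):
  # Run llvm-size and parse its Berkeley-format output: the second line holds
  # the numbers, and its first tab-separated field is the text-section size.
  output = subprocess.run([llvm_size_path, object_path],
                          check=True, capture_output=True, text=True).stdout
  return int(output.split('\n')[1].split('\t')[0])

# Hypothetical reward: size saved relative to a baseline compilation of the
# same module (a smaller policy-compiled object means a larger reward).
baseline_size = get_text_size('/tmp/module.baseline.o')
policy_size = get_text_size('/tmp/module.policy.o')
reward = baseline_size - policy_size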

@mtrofin
Collaborator

mtrofin commented Nov 27, 2023

Basically what @boomanaiden154 said. In addition, we initially used the IR2Size model to train with algorithms like DQN, which want dense rewards, i.e. a reward after each action (each inlined callsite). It turned out that didn't work as well as the final-reward training methods (at least the way we tried it).
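
To make the dense-versus-final distinction concrete, here is an illustrative sketch (the numbers and variable names are made up, not from the training code): DQN-style training wants a reward after every inlining decision, whereas final-reward training scores the whole compilation once.

# Dense rewards (what DQN-style algorithms expect): one value per inlining
# decision, e.g. an IR2Native-estimated size delta after each callsite.
dense_rewards = [-12, 4, -30, 0, -7]   # one entry per action

# Final reward: a single scalar per module, e.g. the llvm-size-measured
# text-size delta versus a baseline, available only after compilation ends.
final_reward = -45                     # one entry per trajectory

# Feeding a final reward to a per-step algorithm means attributing it to the
# last step (or spreading it out), which makes the learning signal much sparser.
per_step = [0] * (len(dense_rewards) - 1) + [final_reward]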

@mtrofin mtrofin closed this as completed Nov 27, 2023