C++ optimization problem #1653

Open
@gitoleg

I would like to discuss a problem related to C++ optimizations in CIR, and to CIR in general.

We started developing an optimization for the following simple case, which is common in large C++ codebases: when a vector instance is created on every loop iteration, we may want to hoist it out of the loop and call its clear method at the start of each iteration instead.

Example:

for (...) {
    vector<int> v;
    ...
}

Replace with:

{ // add extra scope
  vector<int> v;
  for (...) {
    v.clear();
    ...
  }
}

Instead of N allocations we would ideally have only one, because the buffer is reused across iterations. The extra scope is added to preserve the semantics: ~vector() is called as soon as the loop finishes, so the code that follows is not affected. It looks pretty clear and easy, right?
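To make the expected gain concrete, here is a minimal sketch that counts allocations in both forms. Everything in it is an assumption made for illustration: `CountingAlloc`, `g_allocs`, `count_original`, `count_hoisted`, and the loop body (`reserve` plus one `push_back`) are hypothetical stand-ins for the elided `...`, not code from the actual pass.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical counter: number of allocations made through CountingAlloc.
static std::size_t g_allocs = 0;

// Minimal allocator that only counts calls to allocate().
template <class T>
struct CountingAlloc {
    using value_type = T;
    CountingAlloc() = default;
    template <class U>
    CountingAlloc(const CountingAlloc<U>&) noexcept {}
    T* allocate(std::size_t n) {
        ++g_allocs;
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};
template <class T, class U>
bool operator==(const CountingAlloc<T>&, const CountingAlloc<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const CountingAlloc<T>&, const CountingAlloc<U>&) noexcept { return false; }

using Vec = std::vector<int, CountingAlloc<int>>;

// Original form: a fresh vector is constructed and destroyed on every iteration.
std::size_t count_original(int iterations) {
    g_allocs = 0;
    for (int i = 0; i < iterations; ++i) {
        Vec v;
        v.reserve(16);   // stand-in for the elided loop body
        v.push_back(i);
    }
    return g_allocs;
}

// Hoisted form: one vector in an extra scope, cleared on every iteration.
std::size_t count_hoisted(int iterations) {
    g_allocs = 0;
    {
        Vec v;
        for (int i = 0; i < iterations; ++i) {
            v.clear();
            assert(v.empty());  // each iteration still starts from an empty vector
            v.reserve(16);      // no reallocation once the capacity is already there
            v.push_back(i);
        }
    }  // ~vector() runs here, right after the loop
    return g_allocs;
}
```

Calling `count_original(n)` and `count_hoisted(n)` with the same `n` should show the hoisted form allocating fewer times, since `clear` destroys the elements but keeps the buffer. The sketch ignores cases where the loop body moves from the vector or relies on its capacity being released, which a real pass would have to rule out.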

There are several problems, though. We need to attach attributes to the vector ctor and dtor in order to identify them later (maybe we start mangling too early?).

But the most significant problem we faced is that the clear method is not instantiated in the toy example, and the same may well be true for real code. In other words, I cannot just insert a call to it into the loop body region.
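The reason is how C++ handles templates: member functions of a class template are instantiated only when they are used, so a method that is never called has no definition in the translation unit. A small sketch of that behaviour, using a hypothetical `Box` template rather than `std::vector` (the names `Box`, `reset`, `broken`, and `use_reset` are made up for illustration):

```cpp
#include <cassert>
#include <cstddef>

template <class T>
struct Box {
    T value;

    void reset() { value = T(); }

    // For T = int this body is ill-formed (int has no size()), yet the
    // program compiles as long as nobody uses Box<int>::broken().
    std::size_t broken() { return value.size(); }
};

// An explicit instantiation definition forces one specific member to be
// emitted, whether or not the rest of the translation unit calls it.
template void Box<int>::reset();

int use_reset() {
    Box<int> b{42};
    assert(b.value == 42);
    b.reset();          // reset() is used here; broken() is never instantiated
    return b.value;
}
```

By contrast, `template struct Box<int>;` would try to instantiate every member and fail on `broken()`. Whether something like an explicit instantiation can be triggered from a CIR pass, after the AST has already been lowered, is exactly the open question here.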

Here are some approaches I can think of:

  1. The easiest one: apply the optimization only when the method already exists in the translation unit. This works only if a vector with the same element type is instantiated somewhere and its clear method is called. I would not rely on that.
  2. Use two-stage compilation: in the first stage, find out whether the optimization makes sense and dump some info to a file (e.g. json: "vector" : ["clear"]); in the second stage, read it back, force the method instantiation, and apply the code transformation. But we would need to patch AST generation somehow so that the method appears in CIR afterwards. I have some doubts about this one as well.

I believe this is not a special case, and sooner or later someone else will face a similar problem. Many C++-related CIR optimizations will differ from the common LLVM IR ones: we will probably add, remove, and replace things based on our knowledge of the stdlib.

So what do you think about how to do it? Are there any better approaches, or steps we should take? Any thoughts would be helpful :)
