Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Experimental improvement in compile time #1460

Closed

Conversation

gassmoeller
Copy link
Member

This resulted out of a discussion in #1458.

GCC reports that 80% of our compile time are spend parsing included header files. This of course would not be necessary, because our headers usually do not change between includes from different source files. I found a nice cmake project called 'cotire' (https://github.com/sakra/cotire) that automatically precompiles headers used in cmake projects. Before discussing drawbacks and limitations, here are some results from compiling aspect in an fresh build directory on my laptop with make -j8 and a fixed medium CPU frequency (to avoid throttling):

master: gcc spends 80% in parsing phase

real    8m43.378s
user    62m49.015s
sys     2m15.041s

branch: gcc spends <10% in parsing phase

real    4m43.297s
user    27m32.818s
sys     2m3.505s

I wish I had found this a few years ago, when I had a slower laptop 😄. Now there are some limitations:

  • Precompiling headers can fail (in fact I had to exclude one system header from the precompiler on my laptop, which is pretty easy to do). I have no feeling for how frequently this would cause problems, so far it is 1 out of 2 machines I tested on (our workstation compiled flawlessly). We would definitely need to test this on many machines before activating it by default, but I will not remove this from my laptop even if we do not switch it on by default on master! In particular we should check with clang, and on MacOS. I do not see compatibility with clusters as a major issue, because there you recompile less often, and they usually have many cores available.
  • This does not improve compile time on workstations with 40 cores (I tested), because we add a small serial process (~10-15 seconds) at the beginning of the compile process, and this offsets the gains in the parallel compilation later on. Of course this only holds if you compile the first time, later on the precompiled header can be reused.
  • We need to check for license compatibility to check if we can simply include the cmake project. So far I have read that it is possible to include MIT licensed code into GPL projects without problems, but I am no expert.
  • We should probably create a cmake variable ASPECT_PRECOMPILE_HEADERS to allow users to switch this on and off in a simple way.

Opinions? I was pretty proud of myself when I found this. I have certainly waited for compiling ASPECT way longer than the few hours it took me to put this together 😄.

@gassmoeller
Copy link
Member Author

It seems the tester has the same issue as my laptop (maybe a ubuntu specific setup that confuses cotire). Anyway, for now I have excluded headers in /usr/include from getting precompiled, which fixes the issue on my laptop without much slowdown.

But the madness continues 😄. Cotire also offers a feature called unity builds, which essentially dumps all C++ code into a fixed number of compilation units (e.g. number of processors available), and therefore reduces a lot of redundant work for the compiler. As far as I understand they are not standard conform, and require a bit of code adjustments on our side (see 2nd commit), but the results are impressive. Same setup as described in my post above now compiles the target aspect_unity in 2:30 min, i.e. more than 3 times faster than before. Full timings:

real    2m29.815s
user    7m53.449s
sys     0m11.654s

On machines with less CPUs this difference will become even larger, because the user time is now only 1/8th of the time before. There is one translation unit that takes longer than the others, therefore less parallel speedup. Probably Simulator.

Anyway this is not really clean, but before spending much time doing polishing and documenting everything, is this something we want to pursue?

CMakeLists.txt Outdated
# and therefore the whole precompilation
cotire(aspect)
ELSE()
MESSAGE(STATUS "Disabling cotire. Do not precompile external header files.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just omit the second part of the sentence. It reads like a command to the reader ("do not do this" -- to which I would reply "do what?")

internal::Plugins::PluginList<Interface<2> >,
internal::Plugins::PluginList<Interface<3> > > registered_plugins;
aspect::internal::Plugins::PluginList<Interface<2> >,
aspect::internal::Plugins::PluginList<Interface<3> > > registered_plugins;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that all of the changes to the src files could also be merged independently -- they all seem like good ideas in principle.

@@ -1526,8 +1526,15 @@ namespace aspect
// explicit instantiation of the functions we implement in this file
namespace aspect
{
#ifndef aspect_simulator_defined
#define MAYBE_ADVECTION_FIELD(dim) \
template struct Simulator<dim>::AdvectionField;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this necessary? what is the error message you are trying to work around? this seems to me like fixing a legitimate problem the wrong way.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this I would need some help with (template problems).
When I do a unity build with an explicit instantiation of the Simulator class and this line in one translation unit, I get the following error:

In file included from /scratch/gassmoel/aspect-debug/cotire/aspect_CXX_unity.cxx:2118:0:
/raid/gassmoel/software/aspect/source/simulator/helper_functions.cc:1537:35: error: duplicate explicit instantiation of ‘struct aspect::Simulator<2>::AdvectionField’ [-fpermissive]
   template struct Simulator<dim>::AdvectionField; \
                                   ^
/raid/gassmoel/software/aspect/include/aspect/global.h:384:3: note: in expansion of macro ‘INSTANTIATE’
   INSTANTIATIONS(2) \
   ^
/raid/gassmoel/software/aspect/source/simulator/helper_functions.cc:1555:3: note: in expansion of macro ‘ASPECT_INSTANTIATE’
   ASPECT_INSTANTIATE(INSTANTIATE)
   ^
/raid/gassmoel/software/aspect/source/simulator/helper_functions.cc:1537:35: error: duplicate explicit instantiation of ‘struct aspect::Simulator<3>::AdvectionField’ [-fpermissive]
   template struct Simulator<dim>::AdvectionField; \
                                   ^
/raid/gassmoel/software/aspect/include/aspect/global.h:385:3: note: in expansion of macro ‘INSTANTIATE’
   INSTANTIATIONS(3)
   ^
/raid/gassmoel/software/aspect/source/simulator/helper_functions.cc:1555:3: note: in expansion of macro ‘ASPECT_INSTANTIATE’
   ASPECT_INSTANTIATE(INSTANTIATE)
   ^
make[3]: *** [CMakeFiles/aspect_unity.dir/cotire/aspect_CXX_unity.cxx.o] Error 1
make[2]: *** [CMakeFiles/aspect_unity.dir/all] Error 2
make[1]: *** [CMakeFiles/aspect_unity.dir/rule] Error 2

@gassmoeller
Copy link
Member Author

Just to illustrate the problem, a minimal example of the code we are trying to compile in the unity build looks like this:
content from core.cc:

template<int dim>
class Simulator
{
  public:
    struct AdvectionField
    {
      int get_dim();
    };

    int member[dim];
};

int main (int argc, char *argv[])
{
  Simulator<2> test;
  return 0;
}

template class Simulator<2>;

appended content from helper_functions.cc:

template<int dim>
int Simulator<dim>::AdvectionField::get_dim()
{
  return dim;
}

template struct Simulator<2>::AdvectionField;

This problem is more complicated than my everyday knowledge of C++. As far as I can see a definition and instantiation of AdvectionField in core.cc would solve the problem, but that is not pretty either.

@gassmoeller
Copy link
Member Author

I separated the less controversial parts of this PR into #1472 and #1473, so that we can proceed without worrying about the ugly workaround for the instantiation problem in unity builds.

@tjhei
Copy link
Member

tjhei commented Apr 30, 2017

A few comments:

  • This is a neat project and compile speed improvements are always great!
  • I would like to propose to test things during the hackathon but not enable anything before/during.
  • I am worried that we need to improve the testing infrastructure to detect problems that come up with changes like this (test different combinations of settings, unity on/off etc.)
  • Unity builds might increases the amount of RAM required by the compiler instances, right? I know we have been splitting up .cc files inside deal.II to keep RAM usage low. This is something we need to check.

@bangerth
Copy link
Contributor

I tried this, and putting the explicit instantiation of AdvectionField into core.cc doesn't work because the member functions of that class are defined in helper_functions.cc. The instantiation really has to be in helper_functions.cc.

The right solution would be to make sure that cotire puts helper_function.cc before core.cc. If that isn't possible, then your solution would work as well (though I would name the macro MAYBE_INSTANTIATE_ADVECTION_FIELD).

@gassmoeller
Copy link
Member Author

Ok, cotire uses the property 'SOURCES' of the target 'aspect' to determine the order in which files should go into the unity build.
I have now tried for two hours to tell cmake in which order it should add the files to this property, but without success. I succeeded in rearranging TARGET_SRC, but SOURCES is a read-only property so I can not overwrite it. Also adding a 'OBJECTS_DEPENDS' property with the path to helper functions (absolute or relative) to core.cc does not change anything. Does one of you have an idea how to do this? And who designed the cmake language? I do not even get error messages for wrong commands. I thought they wanted to replace make by something better, not worse?

@tjhei
Copy link
Member

tjhei commented Apr 30, 2017

Ok, cotire uses the property 'SOURCES' of the target 'aspect' to determine the order in which files should go into the unity build.

Can you do something like

LIST(REMOVE_ITEM TARGET_SRC source/helper_funtions.cc)
LIST(APPEND TARGET_SRC source/helper_funtions.cc)

?

@gassmoeller
Copy link
Member Author

Yes, that was what I did, although helper_functions.cc needs to be before simulator.cc, so I tried (and checked that it works as intended by outputting TARGET_SRC):

LIST(REMOVE_ITEM TARGET_SRC source/helper_funtions.cc)
LIST(INSERT TARGET_SRC 0 source/helper_funtions.cc)

I did this within the preparations for the cotire call. Would it be necessary to do this before some other command? I did not find any command like ADD_EXECUTABLE(aspect ${TARGET_SRC}).

@tjhei
Copy link
Member

tjhei commented Apr 30, 2017

I did not find any command like

this happens inside INVOKE_AUTO_PILOT or whatever it is called, so do it before that.

@gassmoeller
Copy link
Member Author

Thanks! That did the trick. #1473 now works without the ugly definitions in core.cc and helper_functions.cc. I had to split the cmake instructions in two parts (reordering source files before setting up the target, and calling cotire afterwards). I will close this PR. #1473 should be ready for a review / merge.

@gassmoeller gassmoeller deleted the speedup_compile_time branch January 26, 2018 07:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants