Optimize compilation units and object libraries: investigate sizes, minimize includes, minimize compilation times, enforce DAG #13949

Open
tamiko opened this issue Jun 9, 2022 · 3 comments

Comments

@tamiko
Member

tamiko commented Jun 9, 2022

Having investigated #13926 and the fix #13932, we learned a valuable lesson: in this instance, moving external library includes out of our own header and into a single compilation unit sped up compilation by 20%.

In general, after the 9.4 release we should quantify how large our compilation units are (meaning the preprocessed CC file prior to compilation) for two configurations: (a) no external dependencies and (b) all external dependencies enabled. We should then investigate whether we can bring these sizes down.
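One way to collect these numbers is to run only the preprocessor on each source file and record the size of its output. The following is a minimal sketch, not existing project tooling: it assumes a GCC- or Clang-compatible compiler driver reachable as `c++`, and the function names `preprocess_command`, `measure`, and `preprocessed_size` are made up for illustration.

```python
import subprocess


def preprocess_command(source, include_dirs=(), defines=(), compiler="c++"):
    """Build a command that stops after preprocessing and writes to stdout.

    -E stops after the preprocessor; -P drops linemarkers so that the
    line count reflects actual code rather than '# 1 "file.h"' markers.
    """
    cmd = [compiler, "-E", "-P"]
    cmd += [f"-I{d}" for d in include_dirs]
    cmd += [f"-D{d}" for d in defines]
    cmd.append(str(source))
    return cmd


def measure(preprocessed: bytes):
    """Summarize a preprocessed translation unit."""
    lines = preprocessed.splitlines()
    return {
        "bytes": len(preprocessed),
        "lines": len(lines),
        "nonblank_lines": sum(1 for line in lines if line.strip()),
    }


def preprocessed_size(source, include_dirs=(), defines=(), compiler="c++"):
    """Run the preprocessor on one source file and measure the result."""
    cmd = preprocess_command(source, include_dirs, defines, compiler)
    result = subprocess.run(cmd, check=True, capture_output=True)
    return measure(result.stdout)
```

Running this once per source file for each of the two configurations (with the include directories and defines taken from the respective build) would give a per-file size comparison.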

On a related note, I think it is time we start to discuss whether we want to go through the pain of

  • untangling our include files and clearly identifying public headers and internal headers,
  • enforcing a DAG of "module dependencies" on our object libraries.

All of this will be necessary if we want to make progress towards C++ modules (whether or not we ever deploy them).
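Checking that a set of module dependencies forms a DAG amounts to attempting a topological sort and reporting any cycle. A minimal sketch, assuming the dependencies have already been extracted into a dictionary; the module names and edges below are made up for illustration and do not describe the library's actual dependency graph:

```python
from graphlib import CycleError, TopologicalSorter


def check_dag(dependencies):
    """Return a valid build order, or raise ValueError naming a cycle.

    `dependencies` maps each module to the modules it depends on.
    """
    sorter = TopologicalSorter(dependencies)
    try:
        # static_order() yields every module after all of its dependencies.
        return list(sorter.static_order())
    except CycleError as err:
        # graphlib stores the offending cycle as the second exception argument.
        cycle = err.args[1]
        raise ValueError("dependency cycle: " + " -> ".join(cycle)) from err


# Hypothetical example graph (illustrative edges only).
example = {
    "base": [],
    "lac": ["base"],
    "grid": ["base"],
    "dofs": ["grid", "lac"],
}

if __name__ == "__main__":
    print(check_dag(example))
```

A check like this could run in continuous integration so that a newly introduced circular dependency fails early.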

@tamiko
Member Author

tamiko commented Jun 9, 2022

Just as an example that this is possible for a (really) large-scale project: the Linux kernel is moving towards merging the "fast kernel headers" patchset: https://lwn.net/ml/linux-kernel/YdIfz+LMewetSaEB@gmail.com/

@drwells
Member

drwells commented Aug 4, 2022

clang now has an option, -ftime-trace, which produces fairly detailed profiling information for each compilation unit. I went through a dozen or so files and found a few easy things we can fix:

  • remove utilities.h from index_set.h ✔️
  • Move the Trilinos headers out of utilities.h and into something else ✔️
  • remove sparsity_tools.h from graph_coloring.h ✔️
  • remove tria_base.h from dof_handler.h ✔️
  • put FiniteElementRelatedData and MappingRelatedData in their own headers (they are presently in fe_update_flags.h) ✔️
  • Forward declare LA::d::V in tria_description.h ✔️
  • Perhaps we can remove the kernel conversion functions in cgal/point_conversion.h
  • Move our Mutex wrapper into its own header ✔️
  • Move the packing/unpacking templates into their own header
  • put the RTree code and all functions that use it directly into its own header (better alternative: split up GridTools in the same way we split up VectorTools)
  • split out boost serialization: since this library depends on headers, we have to recompile a lot of it every time it's included
  • avoid including any boost headers in our own mpi.h - recompiling the signals header is quite expensive ✔️
  • reimplement iota_view without boost
  • move Mapping::InternalDataBase into an independent class
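Going through -ftime-trace output file by file can be partly automated by summing the time attributed to each included header. The sketch below is an assumption about the trace layout rather than a documented contract: it expects a JSON object with a `traceEvents` list in which header parsing appears as complete events (`"ph": "X"`) named `Source`, with the file path in `args.detail` and the duration in `dur` (microseconds). Other clang versions may emit these events differently, so verify against an actual trace first. The function names are hypothetical.

```python
import json
from collections import defaultdict


def load_trace(path):
    """Read one trace file written by the compiler."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def header_costs(trace):
    """Sum the duration of 'Source' events per file, most expensive first.

    Durations are inclusive: time spent in a header also counts the
    headers it includes, so the totals overlap and must not be added up.
    """
    totals = defaultdict(int)
    for event in trace.get("traceEvents", []):
        if event.get("name") != "Source" or event.get("ph") != "X":
            continue
        path = event.get("args", {}).get("detail", "<unknown>")
        totals[path] += event.get("dur", 0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Applied to several trace files, the per-header totals point at candidates like the ones listed above: headers that are both expensive and included from many places.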

@drwells
Member

drwells commented Aug 18, 2022

Another thing to check out is how expensive individual headers are. Here are the top 20 from parsing test output data:

 702/1232 Test  #364: all-headers/base/parsed_convergence_table.h.release .....................................   Passed   11.49 sec
 888/1232 Test  #740: all-headers/fe/mapping_q_internal.h.release .............................................   Passed   11.14 sec
 901/1232 Test  #764: all-headers/grid/grid_tools.h.release ...................................................   Passed   11.01 sec
1149/1232 Test #1266: all-headers/non_matching/coupling.h.release .............................................   Passed   10.77 sec
1214/1232 Test #1398: all-headers/particles/generators.h.release ..............................................   Passed   10.38 sec
1175/1232 Test #1316: all-headers/numerics/fe_field_function.h.release ........................................   Passed   10.32 sec
1224/1232 Test #1412: all-headers/particles/utilities.h.release ...............................................   Passed   10.26 sec
 900/1232 Test  #766: all-headers/grid/grid_tools_cache.h.release .............................................   Passed    9.69 sec
1218/1232 Test #1404: all-headers/particles/particle_handler.h.release ........................................   Passed    9.62 sec
1126/1232 Test #1218: all-headers/meshworker/simple.h.release .................................................   Passed    8.98 sec
1125/1232 Test #1216: all-headers/meshworker/scratch_data.h.release ...........................................   Passed    8.89 sec
1122/1232 Test #1212: all-headers/meshworker/mesh_loop.h.release ..............................................   Passed    8.85 sec
 639/1232 Test  #240: all-headers/base/bounding_box_data_out.h.release ........................................   Passed    8.77 sec
1121/1232 Test #1210: all-headers/meshworker/loop.h.release ...................................................   Passed    8.75 sec
1186/1232 Test #1340: all-headers/numerics/vector_tools.h.release .............................................   Passed    8.74 sec
1118/1232 Test #1202: all-headers/meshworker/integration_info.h.release .......................................   Passed    8.74 sec
1091/1232 Test #1148: all-headers/matrix_free/fe_evaluation.h.release .........................................   Passed    8.65 sec
1184/1232 Test #1334: all-headers/numerics/smoothness_estimator.h.release .....................................   Passed    8.64 sec
1123/1232 Test #1214: all-headers/meshworker/output.h.release .................................................   Passed    8.64 sec
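A ranking like the one above can be extracted from test output with a short script. This is a minimal sketch that assumes result lines have the shape shown in the listing (test number, test name, a run of dots, a status, and a time in seconds); the names `parse_line` and `slowest` are made up for illustration.

```python
import re

# Matches e.g.
# " 702/1232 Test  #364: all-headers/base/foo.h.release ....   Passed   11.49 sec"
RESULT_LINE = re.compile(
    r"Test\s+#(\d+):\s+(\S+)\s+\.+\s+(.*?)\s+([\d.]+)\s+sec"
)


def parse_line(line):
    """Return (seconds, test_name) for a result line, or None otherwise."""
    match = RESULT_LINE.search(line)
    if match is None:
        return None
    return float(match.group(4)), match.group(2)


def slowest(lines, count=20):
    """Return the `count` slowest tests as (seconds, name), slowest first."""
    results = [parsed for parsed in map(parse_line, lines) if parsed]
    return sorted(results, reverse=True)[:count]
```

Feeding it the saved test log, for example `slowest(open("ctest.log"))` with a hypothetical log file name, yields the same kind of ranking.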
