@@ -309,10 +309,6 @@ different. There are several differences which can be noticed:
309309 * rootcling -cxxmodule creates a single artifact *Name.pcm* after the library
310310 name. At a final stage, ROOT might be able to integrate the Name.pcm with the
311311 shared library itself.
312- * Preloads all \*pcm files at start up time -- this currently is the only
313- remaining bottleneck which introduces a relatively small performance overhead
314- at startup time and is described bellow. It will be negligible for third-
315- party code (dominated by header parsing).
316312 * Improved correctness in number of cases -- in a few cases ROOT is more
317313 correct. In particular, when resolving global variables and function
318314 declarations which are not part of the ROOT PCH.
@@ -323,6 +319,21 @@ different. There are several differences which can be noticed:
323319 the LD_LIBRARY_PATH descending to the system libraries. The algorithm is very
324320 efficient because it uses bloom filters[[5]]. This in turn allows ROOT symbol
325321 to be extended to system libraries.
322+
323+ ### Module Registration Approaches
324+
325+ The C++ modules system supports *preloading* of all modules at startup time.
326+ The current implementation of C++ module loading in clang has a memory
327+ overhead of 40-60 MB, depending on the ROOT configuration, and may cause a
328+ 2x slowdown, depending on the workflow. These issues are very likely to be
329+ addressed by the LLVM community in the medium term.
330+
331+ Preloading of all C++ modules is semantically the closest to C++ behavior.
332+ However, to achieve better performance, ROOT loads them on demand using
333+ a global module index file. It has sufficient information to map a looked-up
334+ identifier to the module which contains the corresponding definition. Switching
335+ back to preloading of all C++ modules is done by setting the `ROOT_USE_GMI`
336+ environment variable to false.
326337
327338### Supported Platforms
328339
@@ -349,14 +360,15 @@ different. There are several differences which can be noticed:
349360
350361## State of the union
351362
352- C++ Modules-aware ROOT preloads all modules at start up time. Our motivating
353- example:
363+ With preloading of all modules at startup time, our motivating example:
354364
355365```cpp
356366// ROOT prompt
357367root [] S *s; // #1: does not require a definition.
358368root [] foo::bar *baz1; // #2: does not require a definition.
359369root [] foo::bar baz2; // #3: requires a definition.
370+ root [] TCanvas* c = new TCanvas(); // #4: requires a definition.
371+
360372```
361373
362374becomes equivalent to
@@ -368,12 +380,29 @@ root [] import Foo.*;
368380root [] S *s; // #1: does not require a definition.
369381root [] foo::bar *baz1; // #2: does not require a definition.
370382root [] foo::bar baz2; // #3: requires a definition.
383+ root [] TCanvas* c = new TCanvas(); // #4: requires a definition.
371384```
372385
373386The implementation avoids recursive actions and relies on a well-defined (by
374387the C++ standard) behavior. Currently, this comes with a constant performance
375388overhead which we go in details bellow.
376389
390+ ROOT uses the global module index (GMI) to avoid the performance overhead. ROOT
391+ only preloads the set of C++ modules which are not present in the GMI. The
392+ example becomes equivalent to:
393+
394+ ```cpp
395+ // ROOT prompt
396+ root [] import Foo.*; // Preload Foo if it is not in the GMI.
397+ root [] S *s; // #1: does not require a definition.
398+ root [] foo::bar *baz1; // #2: does not require a definition.
399+ root [] foo::bar baz2; // #3: requires a definition.
400+ root [] TCanvas* c = new TCanvas(); // #4: requires a definition.
401+ ```
402+
403+ Line #4 forces cling to send ROOT a callback that TCanvas is unknown. ROOT
404+ uses the GMI to resolve it to module Gpad, loads the module, and returns control to cling.
405+
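The on-demand flow described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not ROOT's or cling's actual implementation: `GlobalModuleIndex`, `ToyInterpreter`, `onFailedLookup`, and `runExample` are hypothetical names invented for this illustration, and "loading" a module is reduced to recording its name.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for the global module index (GMI): it maps an
// identifier to the name of the module that contains its definition.
class GlobalModuleIndex {
public:
  void add(const std::string &identifier, const std::string &module) {
    index_[identifier] = module;
  }

  // Returns the owning module name, or nullptr if the identifier is unknown.
  const std::string *lookup(const std::string &identifier) const {
    auto it = index_.find(identifier);
    return it == index_.end() ? nullptr : &it->second;
  }

private:
  std::unordered_map<std::string, std::string> index_;
};

// Hypothetical stand-in for the ROOT side of the failed-lookup callback.
class ToyInterpreter {
public:
  explicit ToyInterpreter(const GlobalModuleIndex &gmi) : gmi_(gmi) {}

  // Invoked when an identifier lookup fails. Consults the index and, if the
  // identifier is known, "loads" the module by recording its name.
  // Returns true if a module was found for the identifier.
  bool onFailedLookup(const std::string &identifier) {
    const std::string *module = gmi_.lookup(identifier);
    if (!module)
      return false; // Not in the index: the lookup failure stands.
    loaded_.insert(*module);
    return true;
  }

  bool isLoaded(const std::string &module) const {
    return loaded_.count(module) != 0;
  }

private:
  const GlobalModuleIndex &gmi_;
  std::set<std::string> loaded_;
};

// Mirrors line #4 of the example: TCanvas is unknown until Gpad is loaded.
inline void runExample() {
  GlobalModuleIndex gmi;
  gmi.add("TCanvas", "Gpad");
  ToyInterpreter interp(gmi);
  assert(!interp.isLoaded("Gpad"));
  assert(interp.onFailedLookup("TCanvas"));
  assert(interp.isLoaded("Gpad"));
}
```

The point of the sketch is the trade-off: nothing is loaded until a lookup fails, at the cost of a callback on each miss.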
377406
378407### Performance
379408This section compares ROOT PCH technology with C++ Modules which is important but
@@ -385,16 +414,9 @@ is not available.
385414The comparisons are to give a good metric when we are ready to switch ROOT to use
386415C++ Modules by default. However, since it is essentially the same technology,
387416optimizations of C++ Modules also affect the PCH. We have a few tricks up in
388- the slaves to but they come with given trade-offs. For example, we can avoid
389- preloading of all modules at the cost of introducing recursive behavior in
390- loading. This requires to build a global module index which is an on-disk
391- hash table. It will contain information about the mapping between an
392- identifier and a module name. Upon failed identifier lookup we will use the
393- map to decide which set of modules should be loaded. Another optimization
394- includes building some of the modules without `-fmodules-local-submodule-visibility`.
395- In turn, this would flatten the C++ modules structure and give us performance
396- comparable to the ROOT PCH. The trade-off is that we will decrease the
397- encapsulation and leak information about implementation-specific header files.
417+ our sleeves, but they come with trade-offs.
418+
419+ #### Preloading of C++ Modules
398420
399421The main focus for the technology preview was not in performance until recently.
400422We have invested some resources in optimizations and we would like to show you
@@ -413,6 +435,14 @@ The performance is dependent on many factors such as configuration of ROOT and
413435workflow. You can read more at our Intel IPCC-ROOT Showcase presentation
414436here (pp 25-33)[[8]].
415437
438+ #### Loading C++ Modules on Demand
439+
440+ In the long term, we should optimize the preloading of modules to be a no-op and
441+ avoid recursive behavior based on identifier lookup callbacks. Unfortunately,
442+ at the moment, loading C++ modules on demand shows significantly better
443+ performance results.
444+
445+
416446You can visit our continuous performance monitoring tool where we compare
417447the performance of ROOT against ROOT with a PCH [[9]].
418448*Note: if you get error 400, clean your cache or open a private browser session.*