Skip to content

Commit

Permalink
[docs] [C++20] [Modules] Ideas for transitioning to modules (#80687)
Browse files Browse the repository at this point in the history
This patch tries to provide some ideas to transform an existing
libraries to modules. I feel this is helpful for users who is interested
in modules from my observation. While the syntax of modules look easy to
understand, the practice gets harder if the users want to make
compatible work with headers. Especially the `std` module can be
installed in clang18, I think such document may be helpful.

I tried to not be too wordy in this document and I don't want the users
to have impressions that they have to follow this or this is the best
practice. So I tried to use the term `idea` imply the users this is not
compulsory.

I add some regular reviewers for modules, but review opinions from users
are highly recommended.

I want to land this before the releasing of clang18 (in the early
March). We can and should improve this continuously.
  • Loading branch information
ChuanqiXu9 committed Feb 20, 2024
1 parent 0a6c74e commit c8fab88
Showing 1 changed file with 339 additions and 0 deletions.
339 changes: 339 additions & 0 deletions clang/docs/StandardCPlusPlusModules.rst
Original file line number Diff line number Diff line change
Expand Up @@ -610,6 +610,345 @@ the following style significantly:

The key part of the tip is to reduce the duplications from the text includes.

Ideas for converting to modules
-------------------------------

For new libraries, we encourage them to use modules completely from day one if possible.
This will be pretty helpful to make the whole ecosystems to get ready.

For many existing libraries, it may be a breaking change to refactor themselves
into modules completely. So that many existing libraries need to provide headers and module
interfaces for a while to not break existing users.
Here we provide some ideas to ease the transition process for existing libraries.
**Note that the this section is only about helping ideas instead of requirement from clang**.

Let's start with the case that there is no dependency or no dependent libraries providing
modules for your library.

ABI non-breaking styles
~~~~~~~~~~~~~~~~~~~~~~~

export-using style
^^^^^^^^^^^^^^^^^^

.. code-block:: c++

module;
#include "header_1.h"
#include "header_2.h"
...
#include "header_n.h"
export module your_library;
export namespace your_namespace {
using decl_1;
using decl_2;
...
using decl_n;
}

As the example shows, you need to include all the headers containing declarations needs
to be exported and `using` such declarations in an `export` block. Then, basically,
we're done.

export extern-C++ style
^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: c++

module;
#include "third_party/A/headers.h"
#include "third_party/B/headers.h"
...
#include "third_party/Z/headers.h"
export module your_library;
#define IN_MODULE_INTERFACE
extern "C++" {
#include "header_1.h"
#include "header_2.h"
...
#include "header_n.h"
}

Then in your headers (from ``header_1.h`` to ``header_n.h``), you need to define the macro:

.. code-block:: c++

#ifdef IN_MODULE_INTERFACE
#define EXPORT export
#else
#define EXPORT
#endif

And you should put ``EXPORT`` to the beginning of the declarations you want to export.

Also it is suggested to refactor your headers to include thirdparty headers conditionally:

.. code-block:: c++

#ifndef IN_MODULE_INTERFACE
#include "third_party/A/headers.h"
#endif

#include "header_x.h"

...

This may be helpful to get better diagnostic messages if you forgot to update your module
interface unit file during maintaining.

The reasoning for the practice is that the declarations in the language linkage are considered
to be attached to the global module. So the ABI of your library in the modular version
wouldn't change.

While this style looks not as convenient as the export-using style, it is easier to convert
to other styles.

ABI breaking style
~~~~~~~~~~~~~~~~~~

The term ``ABI breaking`` sounds terrifying generally. But you may want it here if you want
to force your users to introduce your library in a consistent way. E.g., they either include
your headers all the way or import your modules all the way.
The style prevents the users to include your headers and import your modules at the same time
in the same repo.

The pattern for ABI breaking style is similar with export extern-C++ style.

.. code-block:: c++

module;
#include "third_party/A/headers.h"
#include "third_party/B/headers.h"
...
#include "third_party/Z/headers.h"
export module your_library;
#define IN_MODULE_INTERFACE
#include "header_1.h"
#include "header_2.h"
...
#include "header_n.h"

#if the number of .cpp files in your project are small
module :private;
#include "source_1.cpp"
#include "source_2.cpp"
...
#include "source_n.cpp"
#else // the number of .cpp files in your project are a lot
// Using all the declarations from thirdparty libraries which are
// used in the .cpp files.
namespace third_party_namespace {
using third_party_decl_used_in_cpp_1;
using third_party_decl_used_in_cpp_2;
...
using third_party_decl_used_in_cpp_n;
}
#endif

(And add `EXPORT` and conditional include to the headers as suggested in the export
extern-C++ style section)

Remember that the ABI get changed and we need to compile our source files into the
new ABI format. This is the job of the additional part of the interface unit:

.. code-block:: c++

#if the number of .cpp files in your project are small
module :private;
#include "source_1.cpp"
#include "source_2.cpp"
...
#include "source_n.cpp"
#else // the number of .cpp files in your project are a lot
// Using all the declarations from thirdparty libraries which are
// used in the .cpp files.
namespace third_party_namespace {
using third_party_decl_used_in_cpp_1;
using third_party_decl_used_in_cpp_2;
...
using third_party_decl_used_in_cpp_n;
}
#endif

In case the number of your source files are small, we may put everything in the private
module fragment directly. (it is suggested to add conditional include to the source
files too). But it will make the compilation of the module interface unit to be slow
when the number of the source files are not small enough.

**Note that the private module fragment can only be in the primary module interface unit
and the primary module interface unit containing private module fragment should be the only
module unit of the corresponding module.**

In that case, you need to convert your source files (.cpp files) to module implementation units:

.. code-block:: c++

#ifndef IN_MODULE_INTERFACE
// List all the includes here.
#include "third_party/A/headers.h"
...
#include "header.h"
#endif

module your_library;

// Following off should be unchanged.
...

The module implementation unit will import the primary module implicitly.
We don't include any headers in the module implementation units
here since we want to avoid duplicated declarations between translation units.
This is the reason why we add non-exported using declarations from the third
party libraries in the primary module interface unit.

And if you provide your library as ``libyour_library.so``, you probably need to
provide a modular one ``libyour_library_modules.so`` since you changed the ABI.

What if there are headers only inclued by the source files
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The above practice may be problematic if there are headers only included by the source
files. If you're using private module fragment, you may solve the issue by including them
in the private module fragment. While it is OK to solve it by including the implementation
headers in the module purview if you're using implementation module units, it may be
suboptimal since the primary module interface units now containing entities not belongs
to the interface.

If you're a perfectionist, maybe you can improve it by introducing internal module partition unit.

The internal module partition unit is an importable module unit which is internal
to the module itself. The concept just meets the headers only included by the source files.

We don't show code snippet since it may be too verbose or not good or not general.
But it may not be too hard if you can understand the points of the section.

Providing a header to skip parsing redundant headers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It is a problem for clang to handle redeclarations between translation units.
Also there is a long standing issue in clang (`problematic include after import <https://github.com/llvm/llvm-project/issues/61465>`_).
But even if the issue get fixed in clang someday, the users may still get slower compilation speed
and larger BMI size. So it is suggested to not include headers after importing the corresponding
library.

However, it is not easy for users if your library are included by other dependencies.

So the users may have to write codes like:

.. code-block:: c++

#include "third_party/A.h" // #include "your_library/a_header.h"
import your_library;

or

.. code-block:: c++

import your_library;
#include "third_party/A.h" // #include "your_library/a_header.h"

For such cases, we suggest the libraries providing modules and the headers at the same time
to provide a header to skip parsing all the headers in your libraries. So the users can
import your library as the following style to skip redundant handling:

.. code-block:: c++

import your_library;
#include "your_library_imported.h"
#include "third_party/A.h" // #include "your_library/a_header.h" but got skipped

The implementation of ``your_library_imported.h`` can be a set of controlling macros or
an overall controlling macro if you're using `#pragma once`. So you can convert your
headers to:

.. code-block:: c++

#pragma once
#ifndef YOUR_LIBRARY_IMPORTED
...
#endif

Importing modules
~~~~~~~~~~~~~~~~~

When there are dependent libraries providing modules, we suggest you to import that in
your module.

Most of the existing libraries would fall into this catagory once the std module gets available.

All dependent libraries providing modules
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Life gets easier if all the dependent libraries providing modules.

You need to convert your headers to include thirdparty headers conditionally.

Then for export-using style:

.. code-block:: c++

module;
import modules_from_third_party;
#define IN_MODULE_INTERFACE
#include "header_1.h"
#include "header_2.h"
...
#include "header_n.h"
export module your_library;
export namespace your_namespace {
using decl_1;
using decl_2;
...
using decl_n;
}

For export extern-C++ style:

.. code-block:: c++

export module your_library;
import modules_from_third_party;
#define IN_MODULE_INTERFACE
extern "C++" {
#include "header_1.h"
#include "header_2.h"
...
#include "header_n.h"
}

For ABI breaking style,

.. code-block:: c++

export module your_library;
import modules_from_third_party;
#define IN_MODULE_INTERFACE
#include "header_1.h"
#include "header_2.h"
...
#include "header_n.h"

#if the number of .cpp files in your project are small
module :private;
#include "source_1.cpp"
#include "source_2.cpp"
...
#include "source_n.cpp"
#endif

We don't need the non-exported using declarations if we're using implementation module
units now. We can import thirdparty modules directly in the implementation module
units.

Partial dependent libraries providing modules
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

In this case, we have to mix the use of ``include`` and ``import`` in the module of our
library. The key point here is still to remove duplicated declarations in translation
units as much as possible. If the imported modules provide headers to skip parsing their
headers, we should include that after the including. If the imported modules don't provide
the headers, we can make it ourselves if we still want to optimize it.

Known Problems
--------------

Expand Down

0 comments on commit c8fab88

Please sign in to comment.