Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
231 lines (183 sloc) 9.3 KB
<pre class='metadata'>
Markup Shorthands: markdown yes
Group: WG21
Status: P
Shortname: P1302
Revision: 1
Audience: EWG
Title: Implicit Module Partition Lookup
Editor: Isabella Muerte, https://twitter.com/slurpsmadrips
Editor: Richard Smith, richard@metafoo.co.uk
Date: 2019-01-21
URL: https://wg21.link/p1302r1
Abstract: We seek to enforce a particular form of module partition lookup as a
Abstract: "fast path" opportunity for building modules, with minimal impact to
Abstract: existing code, while allowing for incremental modularization of
Abstract: user's code bases. This is not a convention, but a requirement for
Abstract: conformance for all vendors whose code will run on an abstract machine
!Current Render: <a href="https://api.csswg.org/bikeshed/?force=1&url=https://git.io/fxG98">P1302</a>
!Current Source: <a href="https://git.io/fxGS1">slurps-mad-rips/papers/proposals/implicit-module-partition.bs</a>
</pre>
<style>
ins {background-color: #CCFFCC; text-decoration: underline;}
del {background-color: #FFCACA; text-decoration: line-through;}
</style>
# Revision History # {#changelog}
## Revision 1 ## {#r1}
Added actionable language for inclusion into the standard. We now state that
splayed layouts are the minimum requirement and any hierarchy or module name
enforcement is solely the responsibility of the build system.
## Revision 0 ## {#r0}
Initial Release 🎉
# Motivation # {#motivation}
As the advent of modules approaches, build systems and developers are still
waiting on the final bits that will make it into C++20. This paper seeks to
provide a path for build system optimizations based on existing tooling (unity
builds, amalgamation headers) that has been reworded for modules. This reduces
work required by build systems, general effort from compilers (e.g., they will
not need to implement a local socket based client for information passing, and
build system developers don't have to worry about CVEs because of said socket
clients), and to make the lives of developers easier as they will get to
experience a fairly consistent development process when moving between or
across projects.
# Design # {#design}
The design for this behavior is as follows:
A so-called *module entrypoint* is placed inside of a directory. A *module
entrypoint*, as defined in [[#wording]] is a file. The name of this
*entrypoint* is user defined, with a fallback to a file with the base name
`module` and possibly some file extension. We do not enforce the requirement of
a file extension so that operating systems without them can do as they please.
This also allows the community to eventually select a "better" file extension.
That said, the author recommends `cxx` as that is the environment variable used
by plenty of build systems to represent the C++ compiler, where as `CPP`
represents the **C** **P**re**P**rocessor. Additionally, moving libraries from
non-module APIs to modularized APIs can have a path of least resistance instead
of requiring something that has nothing to do with the name C++, like `cppm` or
`mxx`.
Therefore, if a module's interface has `import :list`, the compiler will look for a file with a base name of `list`, and (optionally) the same file extension as `module`.
This behavior only occurs during the construction of a BMI. If dependent
modules are not yet compiled, or if users are expecting an object file to
magically pop out of the compiler, they will be surprised when the compiler
gives an error.
The building of object files is still left up to build systems so that existing
distributed build system workflows are not interrupted (such as in the case of
tools like Icecream, distcc, or sccache).
Where things get interesting is when a user desires to import *another module*
into a *module entrypoint*. Given the following directory layout:
```
.
└── src
└── core
├── module.cxx
├── list.cxx
└── io
├── module.cxx
└── file.cxx
```
We can assume that, perhaps, the source for `core/module.cxx` looks something
like:
```cpp
export module name;
export import core.io;
import :list;
```
In other languages, this would imply that the compiler has to now recurse into
the `core/io` directory. **We do not do this**. Instead, the build system is
required to have seen the `export import` and passed that directory along to
the compiler first. This is a combination of the various modules systems, but
has the following properties:
1. It does not make the compiler a build system
2. Existing work that has been done to handle dependency management does not
need to be thrown away.
3. The `core.io` module *does not* have to exist in the directory `core/io`.
Rather, it can exist techically anywhere.
4. Build systems are free to enforce the name of a module to its location on
disk, while also permitting others to ignore it entirely.
5. Build systems can have a guaranteed fallback location if developers don't
want to have to manually specify the location of each and every module.
6. This doesn't actually tie the compiler to a filesystem approach, as this
is just a general convention.
7. Build systems are free to implement, additional conventions, such as the
Pitchfork or Coven filesystem layout and enforce it for modules having
legacy non-module code in the same project layout.
8. It allows developers to view modules as hierarchical, even if they aren't.
This means that, if treating modules as a hierarchy becomes widespread
enough, the standard could possibly enforce modules as hierarchies in the
future.
9. Platforms where launching processes are expensive can take advantage of
improved throughput when reading from files.
10. Build systems and compilers are free to take an optimization where only
the modified times of a directory are checked before the contents of each
directory are checked. On every operating system (yes, every operating
system), directories change their timestamp if any of the *files*
contained within change, but do not update if child directories do as
well. While some operating systems permit mounting drives and locations
without modified times, doing so breaks nearly every other build system
in existence. Thus we can safely assume that a build system does not need
to reparse or rebuild a module if its containing directory has not
changed.
# Examples # {#examples}
The following two examples show how implicit module partition lookup can be
used for both hierarchical and "splayed" directory layouts.
## Hierarchical ## {#example-hierarchy}
This sample borrows from the above example. Effectively, to import `core.io`,
one must build it before building `core` simply because the build system
assumes that `core.io` refers to a directory named `core/io` from the project
root.
```
.
└── src
└── core
├── module.cxx
├── list.cxx
└── io
├── module.cxx
└── file.cxx
```
This behavior is *not* enforced by the compiler, but rather by the build
system. If a build system does not support a *hierarchical* implicit lookup, it
can at least support a *splayed* implicit lookup
## Splayed ## {#example-splayed}
This approach is one that might be more commonly seen as C++ developers move
from headers to modules.
```
.
├── core
│ ├── module.cxx
│ └── list.cxx
└── io
├── module.cxx
└── file.cxx
```
In the above layout, `core.io` is located in `./io`, rather than under the
`./core` directory. A sufficiently simple build system could be told that
`core.io` resides under `./io` and not to rely on some kind of hierarchical
directory layout.
# Wording
The following is to be placed into the current working draft at a location
within close proximity to, or adjacent to, the current merged modules wording:
<ins>
<center>**General**</center>
<sup>1</sup> This subclause describes operations to be supported regarding
modules, their *implicit* partitions, and interactions within the filesystem.
<sup>2</sup> A *module container* is a collection of files represented by a *directory*. [fs.general]
<sup>3</sup> A *module entrypoint* is a filesystem object within a *module container* that holds the purview of an *exported module*.
<sup>4</sup>An *implicit module partition* is a filesystem object adjacent to a *module entrypoint*.
<center>**Conformance**</center>
<sup>1</sup>Conformance is required for all vendors whose implementations
create a final program to be run on the abstract machine
<sup>2</sup>Implementations must provide a flag for users to declare a *module
container* as a filesystem directory.
<sup>3</sup>Implementations must provide a flag for users to declare the name
of a *module entrypoint*
<sup>4</sup>Implementations must use a predefined *module entrypoint* if none
is provided by a user. This entrypoint must be a file with a base name of
*module*.
(*Note: The extension of the file is currently left unspecified, but should
be one of the several file extensions that have been historically used. (e.g.,
`c++`, `cpp`, `cxx`, and `cc`, et. al. *)
<center>**Behavior**</center>
<sup>1</sup>In the *module entrypoint*'s *preamble*, when an implementation
encounters a *module partition* import, it shall look for an *implicit
module partition* with the same base name as the *module partition*
</ins>
You can’t perform that action at this time.