Dependency-Carrying Declarations

Field Value
DIP: 1005
Author: Andrei Alexandrescu (andrei@erdani.com)
Review Count: 1 Most Recent
Implementation: n/a
Status: Postponed

Abstract

A Dependency-Carrying Declaration is a D declaration that does not require any import declaration to be present outside of it. Such declarations encapsulate their own dependencies, which makes dependency relationships more fine-grained than traditional module- and package-level dependencies.

Currently D allows definitions to carry their own dependencies by means of the recently-added scoped import declarations feature. However, this is not possible with symbols that are present in the symbol declaration itself, for example as function parameter types or template constraints. The limitation reduces the applicability and power of scoped imports. This DIP proposes a language addition called "inline import", which allows any function and aggregate D declarations to be transformed into a Dependency-Carrying Declaration.

Rationale

Consider the following D code:

import std.datetime;
import std.stdio;
void log(string message)
{
    writeln(Clock.currTime, ' ', message);
}

Traditionally (though not required by the language), imports are placed at the top of the module and then implicitly used by the declarations in the module. This has two consequences. First, the setup establishes a dependency of the current module on two other modules or packages (and by transitivity, on the transitive closure of the modules/packages those depend on). Second, it defines a relationship at distance between the log function and the imports at the top. As an immediate practical consequence, log cannot be moved across the codebase without ensuring the appropriate import declarations are present in the target module.

Let us compare and contrast the setup above with the following:

void log(string message)
{
    import std.datetime;
    import std.stdio;
    writeln(Clock.currTime, ' ', message);
}

This layout still preserves the dependency of the current module on the two std entities because the compiler would need them in order to compile log. However, the relationship at distance disappears---log encapsulates its dependencies, which migrate together with it. We call a declaration that does not depend on imports outside of it a Dependency-Carrying Declaration.

Consider now the case when log is a generic function:

void log(T)(T message)
{
    import std.datetime;
    import std.stdio;
    writeln(Clock.currTime, ' ', message);
}

In this case, the current module depends on std.datetime and std.stdio only if it uses log directly from within a non-template function (including a unittest). Otherwise, the log generic function is only parsed to an AST (no symbol lookup) and not processed further. Should another module import this module and use log, the dependency is realized because log needs to be compiled. This makes the module that actually uses log---and only it---dependent on std.datetime and std.stdio, in addition of course to the module that defines log.
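The deferral described above can be observed in current D with a small sketch. The function names and the module name no_such_package.missing_module are hypothetical, chosen only for illustration. The first template imports a module that does not exist, yet the file should still compile as long as that template is never instantiated, because its body is only parsed.

```d
// Sketch: a template body is only parsed until the template is instantiated.
// `no_such_package.missing_module` is a hypothetical, nonexistent module.
void unusedLog(T)(T message)
{
    import no_such_package.missing_module : emit; // never resolved: no instantiation
    emit(message);
}

// Instantiating this template is what realizes its dependency on std.conv.
string describe(T)(T value)
{
    import std.conv : to;
    return "value: " ~ value.to!string;
}

// A non-template function that uses describe, forcing the instantiation.
string describeAnswer()
{
    return describe(42);
}
```

Adding a call such as unusedLog("x") anywhere in the module would turn the missing import into a compile error, which is the point at which the dependency is realized.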

The same reasoning applies to template struct, class, or interface definitions:

struct FileBuffer(Range)
{
    import std.stdio;
    private File output;
    ...
}

Such an entity only realizes the dependencies when actually instantiated, therefore moving the carried dependencies to the point of instantiation.

The analysis above reveals that Dependency-Carrying Declarations have multiple benefits:

  • Dependencies are specified at declaration level, not at module level. This allows reasoning about the dependency cost of each declaration in isolation instead of aggregated at module level.
  • If all declarations use Dependency-Carrying style and there is no top-level import, human reviewers and maintainers can immediately tell where each symbol in a given declaration comes from. This is a highly nontrivial exercise without specialized editor support in projects that pull several other modules and packages wholesale. Even a project newcomer could gather an understanding of a declaration without needing to absorb an arbitrary amount of implied context from the declarations at the top of the module.
  • Dependency-Carrying Declarations are easier to move around, making for simpler and faster refactorings.
  • Dependency-Carrying Declarations allow scalable template libraries. Large libraries (such as D's standard library itself) are customarily distributed in packages and modules grouped by functional areas, such that client code can use the library without needing to import many dozens of small modules, each for one specific declaration. Conversely, client code often imports a package or module to use just a small fraction of it. Distributing a template library in the form of Dependency-Carrying Declarations creates a scalable, pay-as-you-go setup: The upfront cost of importing such a module is only that of parsing the module source, which can reasonably be considered negligible in the economy of any build. Then, dependencies are pulled on a need basis depending on the declarations used by client code.

Dependency-Carrying Declarations also have drawbacks:

  • If most declarations in a module need the same imports, then factoring them outside the declarations at top level is simpler and better than repeating them.
  • Relatedly, renaming a module is likely to require more edits in a Dependency-Carrying Declarations setup.
  • Traditional dependency-tracking tools such as make and other build systems assume file-level dependencies and need special tooling (such as rdmd) in order to work efficiently.
  • Dependencies at the top of a module are easier to inspect quickly than dependencies spread through the module.

On the whole, experience with using Dependency-Carrying Declarations in the D standard library suggests that the advantages outweigh disadvantages considerably. Of all import declarations in the D standard library, only about 10% are top-level---all others are local. Using local imports is considered good style in D code.

There are, however, declarations that cannot be reformulated in Dependency-Carrying Declaration form. Consider a simple example of a non-template function declaration:

import std.stdio;
void process(File input);

It is not possible to declare process without importing std.stdio outside of it. Another situation is that of template constraints:

import std.range;
struct Buffered(Range) if (isInputRange!Range) { ... }

There are combinations as well:

import std.range, std.stdio;
void fun(Range)(Range r, File f) if (isInputRange!Range) { ... }

In all of these cases the only way to state the declarations is to make the symbols they use visible in the scope outside it, which in turn requires the use of import statements separately from the declarations that use them.
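The limitation can be checked in current D with a short sketch. The function name fileStartsClosed is hypothetical. The module deliberately has no top-level import of std.stdio, so File is unknown at module scope, where a signature would be resolved, while a function body can still import and use it.

```d
// Deliberately no top-level import of std.stdio in this module.

// At module scope, where function signatures are resolved, File is unknown.
static assert(!is(File));

bool fileStartsClosed()
{
    import std.stdio : File;  // visible only inside this function body
    static assert(is(File));  // the body can name File
    File f;                   // default-initialized, not attached to any file
    return !f.isOpen;
}
```

The same File could not appear as a parameter type of fileStartsClosed without moving the import outside the function.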

This, combined with the ubiquitous use of static introspection and constrained templates, has led to an unpleasant situation in the D standard library whereby it is practically impossible to eliminate imports at the top level. To date, in spite of a large effort to place imports locally, the dependency structure of the D standard library has not clarified visibly because of this limitation.

Workaround: Increasing Granularity of Modules

The obvious workaround to the problem that dependencies must be module-level is to simply define many small modules---in the extreme, one per declaration. Each such small module would import the modules on which that declaration depends. For convenience, package.d modules may be provided to aggregate several modules.

This approach has the following tradeoffs:

  • Reduces unnecessary parsing: if used appropriately, only code that is used actually gets parsed.
  • Increases I/O: more small files cause more I/O activity. This may cause problems with large projects on shared network drives.
  • Library authors face a tension between organizing code in logical units pertaining to the problem domain, and organizing code according to low-level dependency details. They will also be forced to routinely navigate large file hierarchies with many files, which may not be the preferred project organization.
  • Client code must choose between using detailed import lists or convenient package.d imports.
    • If convenient grouped imports are used, the advantage of fine-grained dependency control is lost.
    • If detailed import lists are used, they are verbose and must be updated often. Because it is not an error to leave an imported symbol unused, over time the import lists will grow into a superset of the imports actually needed, thus eroding the advantage of the approach in the first place. Special tooling and maintenance tasks would be needed to remove unneeded imports once in a while.

Such a project organization may be affordable for small and medium-sized projects and is not precluded by this proposal. An example of such an approach can be found in the mach.d library. It is organized as a small number of related declarations (such as canAdjoin, Adjoin, and AdjoinFlat) per module, along with documentation and unit tests. Currently mach.d has about 49 KLoC (as wc counts) distributed across 348 files. The average file length is 141 LoC and the median is 94 LoC. Each package offers collected package.d modules for convenience that import all small modules in the current package. Client code has the option of using more verbose single imports for precise dependencies, or these terse coarse-granular imports at the cost of less precision in dependency management.

Assuming module size is a project invariant, the number of files scales roughly with project size. This means mach.d would need 2000 files to scale up to the same size as the D standard library (about 6x larger) or about 7000 files to scale up to 1 MLoC. For comparison, Facebook's hhvm project includes about 1 MLoC of C++ code, distributed across 1235 headers and 1187 implementation files (median across all is 141 LoC/file, average 379 LoC/file without counting documentation or unittests). The prospect of tripling the number of files in the project would be tenuous, even if the payoff would be superior dependency management. (For another comparison point, the D Standard Library itself has about 282 KLoC distributed across 137 modules with median length 903 LoC and average length 2055 LoC; these include full documentation and unittests.)
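The file-count estimates above follow from simple proportionality, which can be rechecked with a back-of-the-envelope sketch. The inputs are the figures quoted in the text; the helper name filesAtMachDensity is hypothetical.

```d
/// Files needed at mach.d's density (348 files per 49 KLoC) for `kloc` KLoC.
long filesAtMachDensity(double kloc)
{
    import std.math : round;

    enum double machFiles = 348; // files in mach.d, per the text
    enum double machKloc = 49;   // size of mach.d in KLoC, per the text
    return cast(long) round(kloc * machFiles / machKloc);
}
```

For the 282 KLoC of the standard library this gives 2003 files, and for 1 MLoC it gives 7102 files, roughly three times the 2422 files (1235 headers plus 1187 implementation files) quoted for hhvm.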

We consider such a workaround nonscalable and undesirable for large-scale projects. It puts in tension the convenience of coarse-granular organization and the organizational advantage of fine-grained dependencies. The workaround also adds project management chores (refreshing the lists of imports, enforcing disciplined use). This proposal eliminates the tension between the two, making them affordable simultaneously.

Workaround: Are Local Imports Good Enough?

A legitimate question to ask is whether consistent use of local imports wherever possible would be an appropriate approximation of the Dependency-Carrying Declarations goal with no change in the language at all. The reasoning is that most dependencies are actually needed by implementations, not declarations; once all local imports are moved where needed by the implementation, only a small residue of imports would remain at top level.

To verify that hypothesis, we ran a test against the D Standard Library. Fortunately this is made easier by the fact that a concerted effort has already been spent on making most imports local; for a non-exhaustive list, refer to PR4361, PR4365, PR4370, PR4373, PR4379, PR4392, PR4467. This work has taken the standard library from all imports being top-level to the current setup whereby 4605 import statements are nested and only 426 (under 8.5% of total) have remained at top level. (Note that nested imports need to be duplicated across different scopes, so before the refactoring there were fewer than 4605 extra imports at top level.)

It would appear that the process has been successful: only a small fraction of all imports remained at top level; the vast majority have been pushed down into implementation. This ought to have radically improved the dependency structure of the standard library. However, to estimate the real cost of top-level imports, transitivity must be taken into account: any imported module triggers in turn its own imports, transitively. The entirety of imported modules is of importance for dependency management and build times.

To assess the real cost of the 8.5% top-level imports, we ran the following experiment. We compiled in separation each module in the D standard library, monitoring the number of imports. Only imports from the standard library itself were considered, not the core runtime. This is because the refactoring effort pushed in only standard library imports. Also, modules that simply wrap C APIs have been eliminated as skewing the statistics. The compilation included the -unittest flag, which instantiates virtually all templates defined in the current module and therefore executes all local imports. The number obtained is the number of imports that must be transitively acquired assuming everything in the imported module is used. (It should be noted that the numbers obtained this way are slightly higher because some code uses version(unittest) to conditionally import modules used only during unittesting.)

We then repeated the compilation without -unittest, which estimates the cost (in transitive imports) of building the respective module separately.

Finally, we compiled a separate small file that simply imports the measured module without using any of it. This estimates the overhead (in transitive imports) of importing the respective module without using any of it.

Numbers in the first column reflect the total dependencies of the module, if every artifact in it is used by client code. Numbers in the second column are smaller because they reflect dependency deferral---only non-template functions are compiled. If local imports are used successfully, the difference between the first and second column should be significant if the module in question defines many template functions and relatively few non-template functions. Most importantly for this experiment, the third column is the direct measure of the efficacy of using local imports because it shows the true overhead in imports for an imported module that is not used at all. If local imports are good enough, that number should be 0 or close to 0.
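The text does not say how the imports were monitored. One plausible way, shown here only as a sketch, is to count distinct std modules in the compiler's verbose output. The helper name countStdImports is hypothetical, and the assumed line format ("import", whitespace, module name, path) is that printed by dmd -v.

```d
/// Counts distinct standard-library modules named in verbose compiler output.
/// Assumes lines of the form "import <module> (<path>)", as printed by `dmd -v`.
size_t countStdImports(string verboseOutput)
{
    import std.algorithm.searching : startsWith;
    import std.array : split;
    import std.string : lineSplitter;

    bool[string] seen; // set of module names already counted
    foreach (line; verboseOutput.lineSplitter)
    {
        auto fields = line.split; // split on whitespace
        // Only standard-library modules count; core runtime modules are skipped.
        if (fields.length >= 2 && fields[0] == "import" && fields[1].startsWith("std."))
            seen[fields[1]] = true;
    }
    return seen.length;
}
```

Feeding it the output of the three builds described above (with -unittest, without it, and for a file that only imports the module) would yield one number per column.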

The table in Appendix A lists all of these results, sorted by the count of unittest imports in descending order. The table below shows the median and average of the imports count for the entire standard library.

Aggregate   Imports (unittest)   Imports (compile)   Imports (top)
Median      30                   10                  8
Average     28.0                 13.8                10.5

Another matter we investigated is how reducing top-level imports influences build times and the size of the object files produced. We do not have an experimental implementation of this DIP, so measuring impact directly was not possible. We did the converse experiment---adding top-level imports.

In a separate branch of the standard library code, for each module we added all nested imports back to the top level. Some hand-editing was needed after that because of clashes in symbol names. Also, some imports needed to be removed because of circular dependencies and related limitations in the language's design and implementation. The resulting setup can be seen in PR4992. Then for each module in the standard library we compiled one file that consists of exactly one import declaration, monitoring compile time and object file size. Appendix B displays build times and size of object files produced by this experiment.

Aggregate   Time (top-level)   Object size (top-level)   Time (nested)   Object size (nested)
Median      320ms              13788                     32ms            4296
Average     287.6ms            13437.2                   64.6ms          5734.1

As expected, the experiment shows that both build times and object file sizes were improved by moving imports away from the top level. We estimate that eliminating the 10.5x slack dependency fan-in will bring import costs down to negligible and also bring object file size down.

The numbers show that good engineering improves matters significantly, as expected. However, the resulting state of affairs is far from optimum. On average, for one standard library import, 10.5 more modules in the standard library are imported. If the import is not used, and assuming in first approximation that the total overhead of one import is proportional to the total number of modules transitively imported, we're looking at an order of magnitude overhead in dependency fan-in. As artifacts in the module do get used, dependencies are realized and the waste goes down. Further improvements would require a major project reorganization. We conclude that local imports are not enough to ensure an efficient dependency structure, even after discounting the other claimed advantages for maintainability, documentation, and code clarity.

Inline imports

We propose an addition to the D language that allows the use of the keyword import as part of any function and aggregate declaration. When that syntax is used, it instructs the compiler to execute the import before looking up any names in the declaration. To clarify by means of example, the previous declarations would be rewritten as:

with (import std.stdio) void process(File input) ;
with (import std.range) struct Buffered(Range) if (isInputRange!Range)
{
    ...
}

With this syntax, the import is executed only if the declared name (such as process) is actually looked up. Of course, simple caching will ensure that repeated imports of the same module together cost no more than the first. The following section motivates the use of the existing with statement as a declaration.

Refresher on the with Statement

The with statement is mainly used for manipulating multiple fields of an elaborate value. However, with is more general, accepting a type or a template instance (which is essentially a symbol table) as an argument. Consider:

enum EnumType { enumValue = 42 }
struct StructType { static structValue = 43; alias T = int; }
class ClassType { static classValue = 44; alias T = double; }
template TemplateType(X) { auto templateValue = 45; alias T = X; }
void main()
{
    with (EnumType) { void fun(int x = enumValue); }
    with (StructType) { void gun(T x = structValue); }
    with (ClassType) { void hun(T x = classValue); }
    with (TemplateType!int) { void iun(T x = templateValue); }
}

These declarations all work as expected and depend on names scoped within the type or template instance passed to with. This brings the with statement semantically close to the lookup rules needed for this DIP.
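For readers who want to try the existing behavior, here is a minimal runnable sketch of the with statement applied to types in current D. The names Level, Defaults, levelSpan, and defaultPort are hypothetical and are not taken from the proposal.

```d
enum Level { low = 1, high = 10 }
struct Defaults { static int port = 8080; alias Port = int; }

int levelSpan()
{
    with (Level)
        return high - low; // both names are resolved inside Level
}

int defaultPort()
{
    with (Defaults)
    {
        Port p = port; // both Port and port come from Defaults
        return p;
    }
}
```

In both functions the with statement makes the members of a type visible without qualification, which is the lookup behavior the proposal extends to modules.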

We propose that with (Type) and with (TemplateInstance) are allowed as declarations (not only statements). The language rules would be changed as follows:

  • Inside any function, all uses of with are statements and obey the current language rules.
  • Everywhere else, with (expression) is not allowed. with (Type) and with (TemplateInstance) are always declarations and do not introduce a new scope. Lookup of symbols inside the with declarations is similar to lookup inside the with statement: symbols in the scopes of Type or TemplateInstance have priority over (hide) symbols outside the with declaration.

In addition, we propose the statement and declaration with (import ImportList). ImportList is any syntactical construct currently accepted by the import declaration. The with (import ImportList) declaration obeys the following rules:

  • Inside any function, with (import ImportList) is a statement that introduces a scope. Inside the with, lookup considers the import local to the declaration (similar to the current handling of nested imports).
  • Everywhere else, with (import ImportList) is always a declaration and does not introduce a new scope. Lookup of symbols is the same as for the statement case.

This extension removes an unforced limitation of the current with syntax (allows it to occur at top level) and introduces a natural extension from symbol tables present in a type or template instance, to symbol tables imported from a module. The drawback of this choice is the potentially confusing handling of scopes: the with statement introduces a scope, whereas the with declaration does not.

The with Declaration

The usual grammar of the import ImportList; declaration applies inside the with (import ImportList) declaration, with the following consequences:

  • The usual lookup rules apply, for example either with (import std.range) or the more precise with (import std.range.primitives) may be used to look up isInputRange.
  • Specific imports can be present as in with (import std.range : isInputRange) or with (import std.range.primitives : isInputRange).
  • Renamed imports may be present as in with (import std.range : isInput = isInputRange). This specification precludes the use of isInputRange and requires the use of isInput instead.

The static import feature is also available with this syntax: with (static import ImportList).

Inline imports apply to all declarations (template or not) and may guard multiple declarations:

with (import module_a : A, B)
{
    struct Widget(T = A) { ... }
    alias C = B;
}
Widget!int g_widget;

As mentioned, with declarations do not introduce a scope so Widget above is visible outside the with declaration, but A and B are not.

Inline imports apply to all declarations. This includes the with declaration itself, having the consequence that multiple with import declarations may be applied in a cascading manner:

with (import module_a : A)
with (import module_b : B)
A fun(B) { ... }

Lookup rules

When the name of a Dependency-Carrying Declaration is found via lookup, its corresponding inline imports are executed. Then the name is resolved.

The visibility of the imported symbol(s) lasts through the end of the with declaration. That includes function contracts and bodies and also module constructors and inner functions.

The inline imports have priority over existing imports visible to the declaration. This prevents other names present in the scope from having equal footing with names immediately present in the declaration. The lookup is equivalent to placing the inline imports in a scope unique to the declaration, where they take precedence in name resolution just like scoped imports per the current language rules. Example:

import module_b;
with (import module_a) void fun(X value) { ... }

The name X is looked up as if the code was structured as follows:

import module_b;
{
    import module_a;
    void fun(X value) { ... }
}

This equivalent code, however, is not legal at top level. In that case we can artificially introduce an imaginary template to analyze lookup on compilable code:

import module_b;
template __unused()
{
    import module_a;
    void fun(X value) { ... }
}

The symbol X is looked up per the current language rules in the working code above.

If two or more imports within the same with declaration define the same name, name resolution across these is the same as if the imports were top-level.

If a module defines a symbol at top level and then imports a module, lookup proceeds as it does with local imports. Consider:

int writeln;
with (import std.stdio)
void main() { writeln("hello, world"); }

This code is in error because writeln has type int. However, the following code is correct because it specifies the symbol name explicitly:

int writeln;
with (import std.stdio : writeln)
void main() { writeln("hello, world"); }
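The second case has a close analogue in current D, where a selective local import names the symbol explicitly. The sketch below assumes the current lookup rules for scoped selective imports; the names max (as a module-level variable) and larger are hypothetical.

```d
int max = 7; // module-level symbol sharing its name with std.algorithm's max

int larger(int a, int b)
{
    // The selective import names `max` explicitly, so inside this body
    // `max` refers to the imported function, not the module-level int.
    import std.algorithm.comparison : max;
    return max(a, b);
}
```

Outside larger, the name max still refers to the module-level variable.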

Examples

Below are a few examples taken from the standard library:

with (import std.meta, std.range, std.traits)
auto uninitializedArray(T, I...)(I sizes) nothrow @system
if (isDynamicArray!T && allSatisfy!(isIntegral, I) &&
    hasIndirections!(ElementEncodingType!T))
{
    ...
}

Alternatively, the declaration may specify the exact symbols needed by using multiple imports:

with (import std.meta : allSatisfy)
with (import std.range : ElementEncodingType)
with (import std.traits : hasIndirections, isDynamicArray, isIntegral)
auto uninitializedArray(T, I...)(I sizes) nothrow @system
if (isDynamicArray!T && allSatisfy!(isIntegral, I) &&
    hasIndirections!(ElementEncodingType!T))
{
    ...
}

Alternative: Lazy Imports

Assume all imports are lazy without any change in the language. (This has already been implemented in the SDC compiler.) The way the scheme works, all imports seen are not yet executed but instead saved in a list of package/module names. Following that, the actual imports are triggered by one of two situations.

First, consider the current module looks up a fully specified name:

import module_a, module_b;
void fun(T)(T value) if (module_a.condition!T)
{
    return module_b.process(value);
}
void fun(T)(T value) if (is(T == int)) { ... }

In this situation:

  • If fun is never looked up, neither module_a nor module_b needs to be loaded.
  • If fun(42) is used, even though the second overload is a match, then module_a must be loaded in order to ensure that module_a.condition!int is false so as to avoid ambiguity.
  • If fun is called with a non-int value, module_a is loaded to evaluate the template constraint. If the constraint is true, then module_b is also loaded so as to look up process.

Let us note that full specification of symbols used may be enabled with ease by using the static import feature. We will henceforth refer to this setup as "the static import setup".

Second, consider the situation (arguably more frequent in today's D code) when the current module does not fully specify names used. Instead, it imports the appropriate modules and relies on lookup to resolve symbols appropriately:

import module_a, module_b;
void fun(T)(T value) if (condition!T)
{
    return process(value);
}
void fun(T)(T value) if (is(T == int)) { ... }

In this situation:

  • If fun is never used, neither module must be loaded.
  • If fun is looked up, it will trigger an unspecified lookup for condition. This will trigger loading of both module_a and module_b (and generally all imports in the current module) so as to look up condition and ensure no ambiguity.

The same applies to the setup in which condition is imported selectively from module_a but module_b is entirely imported:

import module_a : condition;
import module_b;
void fun(T)(T value) if (condition!T)
{
    return process(value);
}
void fun(T)(T value) if (is(T == int)) { ... }

In this case, module module_b still needs to be opened if fun is looked up to ensure no ambiguity exists for condition.

Finally, there is the case when all imports specify the list of symbols imported:

import module_a : condition;
import module_b : process;
void fun(T)(T value) if (condition!T)
{
    return process(value);
}
void fun(T)(T value) if (is(T == int)) { ... }

In this case, fine-grained loading of modules is possible: each module is loaded only if a symbol inside it is used. We refer to this setup as "the selective import setup".

To generalize the observations above, fine-grained loading of modules is possible under either (or a combination of) the following circumstances: (a) the static import setup; (b) the selective import setup.
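The two setups can be combined in one module, as in the following sketch in current D. It shows only the source-level shape: whether a module is actually loaded lazily depends on the compiler, and the function names total and totalOfThree are hypothetical.

```d
// (a) Static import setup: every use must be fully qualified.
static import std.range.primitives;

// (b) Selective import setup: the import lists the symbols it provides.
import std.algorithm.iteration : sum;

auto total(R)(R r)
if (std.range.primitives.isInputRange!R) // qualified name points at one module
{
    return sum(r); // `sum` can only come from std.algorithm.iteration
}

int totalOfThree()
{
    return total([1, 2, 3]);
}
```

Because every name in total is either fully qualified or selectively imported, a lazy implementation could tell which module to load for each symbol without opening the others.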

The advantages of such approaches are:

  • Fine-grained loading of imports is achieved with no changes in the language definition, only the implementation.
  • Project discipline may be enforced with relative ease, either manually or by means of simple tools. The rule is: "All private imports must be either static import or selective import".

The disadvantages are:

  • The fine-grained dependency structure is not attained by the selective import approach. A declaration using unspecified names does not clarify which imports it implicitly relies on. The relationship at distance remains between the import and the use thereof.
  • The static import setup does not share the issue above, at the cost of being cumbersome to use---all imported symbols must use full lookup everywhere. A reasonable engineering approach would be to define shorter names:
    static import std.range.primitives;
    alias isInputRange = std.range.primitives.isInputRange;
    alias isForwardRange = std.range.primitives.isForwardRange;
    ...

Such scaffolding is of course undesirable in the first place. Also, at least by the current language rules, such alias definitions would need to load the module anyway so as to ensure the name does exist. In order for this idiom to work, it would require subtle changes to the language that specify how certain alias declarations are exempt from early checking and delayed to the first actual use.

  • In either setup, imports are collapsed into their union, usually at the top of the module. Such lists grow out of sync with the actual code because during maintenance the programmer working on one declaration is not motivated to simultaneously alter a module-level import list shared by all declarations in the module. Over time, the imports grow into a superset of the actual dependencies used by the code, and do not reflect which declarations cause which imports even when accurate.
  • The "carrying" aspect is lost: any migration of a declaration to another module must be followed by awkwardly doing surgery on the import list of the receiving module. Again, the migration may leave unused imports in the module the declaration is taken from. The only recourse to keeping the import list in sync is special tooling or time-consuming discipline (search the module for uses, attempt recompilation).

Although we consider introducing lazy imports an improvement over the current state of affairs, our assessment is that such a feature would fall short of truly allowing a project to rein in its dependency structure.

We have experimented with converting the standard library module std.array to one of the two idioms. Conversion to either the "static import" form or the "selective import" form may be achieved by brute force through dedicated tooling: first generate code that enumerates all symbols in the module, then eliminate them one by one and attempt to rebuild. Such an approach is time-consuming and would be only used at rare intervals.

The manual conversion of std.array to the "static import" form is shown here. It leads to the expected lengthening of the symbols used in declarations, which appears to eliminate one disadvantage by introducing another. Also the manual conversion process turned out to be prohibitively difficult; we would only recommend this conversion using automated tooling.

The manual conversion of std.array to the "selective import" form is shown here. Conversion was successful but because it collapses all imports at the top, it does not make it much easier to identify e.g. what dependencies would be pulled if a given artifact in std.array were used. Again the manual process was highly nontrivial.

Syntactic Alternatives

There are a number of alternative approaches that have been (and some still are) considered.

  • Specify import in a manner reminiscent of attributes:

        void process(File input) import (std.stdio);
        struct Buffered(import std.range)(Range) if (isInputRange!Range)
        {
            ...
        }

    This form had significant differences from both the property syntax and the existing import syntax.

  • Add syntax to allow for an optional import declaration inside declarations:

        void process(import std.stdio)(File input);
        struct Buffered(import std.range)(Range) if (isInputRange!Range)
        {
            ...
        }

    This has the advantage of being less verbose in case the same module is looked up several times. The disadvantages are a heavier and more ambiguous syntax (two sets of parens for nontemplates, three for templates) and an unclear relationship between the imported entities and the symbols used in the declaration.

  • Use import as a pseudo-package such that symbols are written like this:

            void process(import.std.stdio.File input);
            struct Buffered(Range) if (import.std.range.isInputRange!Range)
            {
                ...
            }

    Such an option has an ambiguity problem shown by Timon Gehr: is import.std.range.isInputRange looking up symbol isInputRange in module/package std.range, or the symbol range.isInputRange (e.g. a struct member) in module/package std?

  • Stay as close to the existing import syntax as possible. This has the advantage of being instantly recognized, but the disadvantage of looking out of place within the declaration:

            void process(File input) import std.stdio;
            struct Buffered(Range) if (isInputRange!Range)
            import std.range;
            {
                ...
            }

    One syntactical issue is that in this case the semicolon ending the import may or may not end the declaration; the scanner (and human reader) would need to look ahead to determine whether a definition continues (by means of an open brace, the in keyword, or the out keyword) or the declaration ends there.

  • Alternatively, the semicolon might be omitted in the approach above. This causes no syntactical ambiguity but makes the hanging import declaration even more out of place:

            void process(File input) import std.stdio;
            struct Buffered(Range) if (isInputRange!Range)
            import std.range
            {
                ...
            }

  • Use property syntax:

        @deps!({import std.stdio; pragma(lib, "curl"); }):
        // the next @deps has no ':' and applies only to the one declaration below it
        @deps!({import std.range;})
        void fun(T)(isInputRange!T){} // depends on both deps
        void fun2(File file){}  // depends on the 1st deps, which ends with ':'

    This adds no syntax to the language, only semantics. Another advantage is it supports other artifacts aside from import such as pragma illustrated above. The compiler would recognize @deps as a special attribute and would only allow certain constructs inside the lambda inside the attribute. The disadvantage of this approach is that of any non-specialized syntax---it is relatively unstructured and more difficult to follow.

  • Add no syntax at all; allow top-level braces. The most Spartan of all syntaxes simply allows top-level scopes:

    {
        import std.stdio;
        void fun(File f) { ... }
    }
    void gun(); // no access to std.stdio here

    We believe this syntax has similar advantages and disadvantages to C++ namespaces, which force indentation of everything within the scope. This has long been considered a nuisance of C++ namespaces and is worked around in projects in one of two ways. One possibility is to define macros that enter and leave a namespace. Another possibility is to not indent the code and rely on special editor indentation rules. We expect similar issues and workarounds with such a feature in D.

Breaking changes / deprecation process

We do not anticipate any breaking changes brought by this language addition. The proposed syntactical construct is currently rejected by the compiler, so no existing valid code can change meaning.

The inline imports specified with a declaration do not affect its type (e.g. the function type for a function declaration).

The changes to declaration syntax will impact third-party documentation generators, so they would need to be updated. There is an advantage herein---documentation generators (including ddoc itself) can show the user the dependencies that each declaration would incur.

Future Possibilities and Directions

Inline and scoped imports offer the option of better handling of static module constructors. Currently, modules that mutually import one another (either directly or through a longer chain) cannot simultaneously define shared static this() constructors. The reason is that, again, dependencies are computed at module level.

If instead modules have no top-level dependencies, then the compiler is able to compute the narrow set of dependencies needed for executing the static module constructor. The static constructor may (a) be part of a with declaration, (b) use local imports within, and (c) call other functions within the module that have their own dependencies. For example:

// assume no top-level import
with (import module_a) void fun(T)()
{
    import module_b;
    return gun();
}
with (import module_c)
shared static this()
{
    import module_d;
    fun!int();
}

In this case, the module constructor depends (only) on module_a, module_b, module_c, and module_d. The full information is confined within the current module so it is inferrable during separate compilation.

Copyright & License

Copyright (c) 2016 by the D Language Foundation

Licensed under Creative Commons Zero 1.0

Reviews

Informal forum review

Preliminary Review - Round 1

The language authors have decided to postpone the implementation of this DIP until more experience can be accumulated with the self-important lookup idiom (which was discussed in the forums).
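For readers unfamiliar with it, the idiom referred to above is usually presented along the following lines. This is a minimal sketch that uses only existing language features. The template name from follows the common forum presentation as far as we are aware, and countItems is a hypothetical function added purely for illustration.

```d
// Eponymous template: from!"some.module" resolves to the renamed import
// declared inside it, so members of the module can be looked up through it.
template from(string moduleName)
{
    mixin("import from = " ~ moduleName ~ ";");
}

// Hypothetical example function. Its constraint names isInputRange without
// any import declaration outside the declaration itself.
size_t countItems(Range)(Range r)
    if (from!"std.range.primitives".isInputRange!Range)
{
    size_t n;
    foreach (_; r)
        ++n;
    return n;
}

void main()
{
    assert(countItems([1, 2, 3]) == 3);
    // The same lookup works for ordinary calls.
    from!"std.stdio".writeln("counted ", countItems([10, 20]), " items");
}
```

The module is imported only when a declaration that mentions it is instantiated or analyzed, which approximates a Dependency-Carrying Declaration without a language change, at the cost of more verbose symbol names.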

Appendix A: Imported Modules in the D Standard Library

The table below displays the total (transitive) imports needed for compiling separately each module in the D Standard Library. The first column is the module filename. The second column is the total number of imports needed to compile the module for unittesting (which instantiates virtually all templates and therefore pulls their local imports). The third column is the total number of imports needed to compile the module. The fourth column is the overhead in top-level imported modules needed to import the module without using it.

File Imports (unittest) Imports (compile) Imports (top)
std/net/curl.d 52 48 43
std/experimental/ndslice/slice.d 46 6 6
std/uuid.d 44 27 12
std/algorithm/mutation.d 44 5 5
std/experimental/allocator/package.d 43 17 14
std/experimental/ndslice/package.d 42 9 9
std/range/package.d 40 9 9
std/parallelism.d 40 30 20
std/experimental/logger/package.d 40 37 37
std/algorithm/iteration.d 40 5 5
std/experimental/logger/multilogger.d 39 35 35
std/datetime.d 39 36 25
std/regex/internal/generator.d 38 0 0
std/experimental/logger/nulllogger.d 38 35 35
std/experimental/logger/core.d 38 34 34
std/algorithm/sorting.d 38 11 11
std/algorithm/searching.d 38 6 6
std/typecons.d 37 3 3
std/regex/package.d 37 17 17
std/regex/internal/tests.d 37 20 20
std/regex/internal/tests3.d 37 26 26
std/regex/internal/tests2.d 37 33 33
std/net/isemail.d 37 20 5
std/experimental/logger/filelogger.d 37 34 34
std/algorithm/comparison.d 37 6 6
std/stdio.d 36 23 7
std/process.d 36 31 16
std/file.d 36 29 25
std/experimental/allocator/building_blocks/stats_collector.d 36 9 9
std/experimental/allocator/building_blocks/kernighan_ritchie.d 36 5 5
std/experimental/allocator/building_blocks/free_list.d 36 9 9
std/algorithm/setops.d 36 12 12
std/zip.d 35 32 28
std/regex/internal/shiftor.d 35 17 17
std/regex/internal/bitnfa.d 35 25 22
std/path.d 35 27 25
std/experimental/allocator/building_blocks/region.d 35 10 10
std/experimental/allocator/building_blocks/package.d 34 25 25
std/base64.d 34 5 5
std/uni.d 33 31 6
std/socket.d 33 24 20
std/regex/internal/parser.d 33 31 30
std/mmfile.d 33 28 26
std/experimental/ndslice/selection.d 33 7 7
std/experimental/allocator/building_blocks/affix_allocator.d 33 0 0
std/conv.d 33 13 5
std/xml.d 32 30 0
std/range/primitives.d 32 3 3
std/regex/internal/thompson.d 31 16 16
std/regex/internal/backtracking.d 31 29 25
std/random.d 31 11 4
std/numeric.d 31 12 12
std/experimental/allocator/typed.d 31 14 14
std/experimental/allocator/building_blocks/bitmapped_block.d 31 10 10
std/digest/digest.d 31 4 4
std/container/util.d 31 0 0
std/container/rbtree.d 31 8 8
std/container/array.d 31 5 5
std/regex/internal/ir.d 30 28 15
std/csv.d 30 10 7
std/bigint.d 30 20 15
std/utf.d 29 9 6
std/outbuffer.d 29 15 0
std/functional.d 29 4 3
std/format.d 29 5 5
std/experimental/allocator/showcase.d 29 20 13
std/experimental/allocator/building_blocks/allocator_list.d 29 11 11
std/bitmanip.d 29 19 12
std/array.d 29 6 6
std/experimental/typecons.d 28 4 4
std/digest/sha.d 28 19 5
std/string.d 27 21 14
std/meta.d 27 2 2
std/json.d 27 25 10
std/getopt.d 27 26 13
std/experimental/allocator/building_blocks/bucketizer.d 27 0 0
std/container/binaryheap.d 27 5 5
std/concurrency.d 27 23 22
std/variant.d 26 20 4
std/traits.d 26 2 2
std/experimental/ndslice/iteration.d 26 7 7
std/experimental/ndslice/internal.d 26 6 6
std/experimental/allocator/building_blocks/scoped_allocator.d 26 9 9
std/exception.d 26 4 4
std/encoding.d 26 22 22
std/uri.d 25 20 8
std/math.d 25 3 3
std/digest/ripemd.d 25 18 5
std/digest/crc.d 25 18 5
std/container/dlist.d 25 5 5
std/zlib.d 24 18 11
std/experimental/allocator/building_blocks/fallback_allocator.d 24 9 9
std/digest/md.d 24 18 5
std/container/slist.d 24 1 1
std/complex.d 24 21 6
std/experimental/allocator/mallocator.d 23 9 9
std/experimental/allocator/building_blocks/quantizer.d 23 9 9
std/experimental/allocator/building_blocks/free_tree.d 23 9 9
std/digest/hmac.d 23 5 5
std/experimental/allocator/gc_allocator.d 21 9 9
std/experimental/allocator/building_blocks/segregator.d 20 9 9
std/digest/murmurhash.d 19 5 5
std/ascii.d 16 0 0
std/range/interfaces.d 15 4 4
std/algorithm/package.d 15 15 15
std/container/package.d 13 13 13
std/experimental/allocator/common.d 9 9 8
std/signals.d 8 8 8
std/mathspecial.d 6 6 6
std/demangle.d 5 5 0
std/experimental/allocator/building_blocks/null_allocator.d 4 4 4
std/typetuple.d 2 2 2
std/system.d 0 0 0
std/stdiobase.d 0 0 0
std/stdint.d 0 0 0
std/experimental/allocator/mmap_allocator.d 0 0 0
std/concurrencybase.d 0 0 0
std/compiler.d 0 0 0
std/algorithm/internal.d 0 0 0

Appendix B: Module import times

The table below shows the import times and produced object file sizes that would result if most nested imports were hoisted to top level in the standard library, compared to current master. All times are in milliseconds and all sizes are in bytes.

File Time to import (top-level) Object size (top-level) Time to import (current) Object size (current)
std/array.d 315 13780 26 4296
std/ascii.d 308 13780 12 4296
std/base64.d 305 13780 19 4296
std/bigint.d 308 13780 63 4464
std/bitmanip.d 303 13784 53 4296
std/compiler.d 8 4296 7 4296
std/complex.d 306 13784 32 4296
std/concurrencybase.d 16 4472 16 4472
std/concurrency.d 396 16512 286 9672
std/conv.d 302 13780 21 4296
std/csv.d 302 13780 25 4296
std/datetime.d 301 13784 215 13332
std/demangle.d 300 13784 7 4296
std/encoding.d 384 16764 239 8396
std/exception.d 302 13784 21 4296
std/file.d 298 13780 214 13328
std/format.d 302 13780 24 4296
std/functional.d 299 13788 17 4296
std/getopt.d 307 13780 61 4296
std/json.d 300 13780 38 4296
std/math.d 299 13780 26 4296
std/mathspecial.d 303 13788 28 4296
std/meta.d 51 4460 15 4296
std/mmfile.d 311 13780 215 13332
std/numeric.d 295 13784 62 4296
std/outbuffer.d 308 13784 8 4296
std/parallelism.d 404 16512 99 4468
std/path.d 299 13780 217 13328
std/process.d 297 13784 75 4464
std/random.d 299 13780 22 4296
std/signals.d 304 13784 36 4464
std/socket.d 306 13780 178 4464
std/stdint.d 11 4296 12 4296
std/stdiobase.d 8 4468 7 4468
std/stdio.d 297 13780 33 4464
std/string.d 297 13780 137 4296
std/system.d 7 4296 7 4296
std/traits.d 49 4464 14 4296
std/typecons.d 46 4464 25 4296
std/typetuple.d 52 4468 14 4296
std/uni.d 300 13780 106 4296
std/uri.d 300 13780 35 4296
std/utf.d 296 13780 32 4296
std/uuid.d 296 13780 51 4748
std/variant.d 299 13784 28 4296
std/xml.d 380 16760 13 4296
std/zip.d 300 13780 231 13328
std/zlib.d 293 13780 43 4460
std/regex/package.d 318 13780 167 4296
std/algorithm/comparison.d 298 13796 29 4296
std/algorithm/internal.d 299 13792 8 4296
std/algorithm/iteration.d 297 13796 20 4296
std/algorithm/mutation.d 299 13792 29 4296
std/algorithm/package.d 299 13784 56 4296
std/algorithm/searching.d 300 13796 33 4296
std/algorithm/setops.d 299 13792 51 4296
std/algorithm/sorting.d 307 13792 53 4296
std/container/array.d 294 13792 21 4296
std/container/binaryheap.d 296 13796 18 4296
std/container/dlist.d 299 13792 16 4296
std/container/package.d 294 13784 35 4296
std/container/rbtree.d 300 13792 28 4296
std/container/slist.d 298 13792 10 4296
std/container/util.d 306 13788 6 4296
std/digest/crc.d 294 13784 16 4296
std/digest/digest.d 296 13788 15 4296
std/digest/hmac.d 301 13788 16 4296
std/digest/md.d 299 13784 16 4296
std/digest/murmurhash.d 294 13792 16 4296
std/digest/ripemd.d 310 13788 15 4296
std/digest/sha.d 292 13784 22 4296
std/experimental/allocator/building_blocks/affix_allocator.d 299 13832 7 4296
std/experimental/allocator/building_blocks/allocator_list.d 305 13832 33 4296
std/experimental/allocator/building_blocks/bitmapped_block.d 307 13832 33 4296
std/experimental/allocator/building_blocks/bucketizer.d 390 16552 12 4296
std/experimental/allocator/building_blocks/fallback_allocator.d 304 13836 33 4296
std/experimental/allocator/building_blocks/free_list.d 400 16548 35 4296
std/experimental/allocator/building_blocks/free_tree.d 306 13824 33 4296
std/experimental/allocator/building_blocks/kernighan_ritchie.d 308 13836 29 4296
std/experimental/allocator/building_blocks/null_allocator.d 54 4512 25 4296
std/experimental/allocator/building_blocks/package.d 394 16540 45 4296
std/experimental/allocator/building_blocks/quantizer.d 308 13824 33 4296
std/experimental/allocator/building_blocks/region.d 311 13824 41 4296
std/experimental/allocator/building_blocks/scoped_allocator.d 302 13832 31 4296
std/experimental/allocator/building_blocks/segregator.d 387 16552 32 4296
std/experimental/allocator/building_blocks/stats_collector.d 399 16556 34 4296
std/experimental/allocator/common.d 305 13804 31 4296
std/experimental/allocator/gc_allocator.d 300 13812 33 4296
std/experimental/allocator/mallocator.d 308 13812 34 4296
std/experimental/allocator/mmap_allocator.d 11 4296 8 4296
std/experimental/allocator/package.d 302 13800 52 4480
std/experimental/allocator/showcase.d 305 13808 40 4296
std/experimental/allocator/typed.d 307 13804 53 4488
std/experimental/logger/core.d 388 18716 351 19408
std/experimental/logger/filelogger.d 379 18724 359 19416
std/experimental/logger/multilogger.d 392 18724 355 19416
std/experimental/logger/nulllogger.d 391 18724 347 19416
std/experimental/logger/package.d 399 18712 351 19404
std/experimental/ndslice/internal.d 317 13804 31 4296
std/experimental/ndslice/iteration.d 312 13808 33 4296
std/experimental/ndslice/package.d 308 13796 34 4296
std/experimental/ndslice/selection.d 320 13808 33 4296
std/experimental/ndslice/slice.d 304 13804 32 4296
std/experimental/typecons.d 308 13796 25 4296
std/net/curl.d 459 28468 410 28468
std/net/isemail.d 303 13788 30 4296
std/range/interfaces.d 297 13792 16 4296
std/range/package.d 300 13780 48 4296
std/range/primitives.d 296 13792 17 4296