Skip to content

Latest commit

 

History

History
404 lines (306 loc) · 15.9 KB

DIP1007.md

File metadata and controls

404 lines (306 loc) · 15.9 KB

"future symbol" Compiler Concept

Field Value
DIP: 1007
Review Count: 2 Most Recent
Author: Mihails Strasuns (mihails.strasuns.contractor@sociomantic.com)
Implementation: PR 6494
Status: Accepted

Abstract

It is currently not possible to introduce new symbols to D libraries without at least some risk of breaking user code. The existing deprecated keyword is only applicable for removing symbols, not for adding new ones.

For many libraries these problems are of very low importance, as normally projects upgrading their dependencies to a new feature release are expected to undergo some minor maintenance effort.

However, there are also certain libraries that set extremely high stability expectations, with the most notable example being DRuntime, the D standard runtime library. Even very minor breakage coming from such a project can affect many users and is likely to be unexpected.

Links

Description

Problem 1: import clash

Consider the maintenance of a simple library, foo, which was released with the following code:

module foo;
void someFunction () {}

It is widely used, with a typical import pattern looking like this:

void main ()
{
    import foo;
    someFunction();
}

Now the maintainer of the library adds a new feature:

module foo;
void someFunction () {}
void someOtherFunction () {}

Surprisingly, the addition may break (render uncompilable) an existing project if it was already using another library that defines a function with the same name:

void main ()
{
    import foo;
    import bar; // defined `someOtherFunction` before it was present in foo
    someFunction();
    someOtherFunction(); // after upgrading foo, symbols will clash here
}

Problem 2: clash of overridden methods

Consider another simple library, baz, which defines a class:

class Base
{
}

At some point the maintainer of the library wants to add a new method to this class, void method(), but doing so risks breaking user code too:

class Derived : Base
{
    void method(bool) { }
}

If a method with the same name was already present in the derived class, it will not compile, either because of a missing override keyword (if the signatures are compatible) or an attempt on a non-covariant override (if the signatures are unrelated).

Existing solutions

There are a few existing approaches to reduce the negative impact of adding new symbols:

Using a more obscure name

The most common approach is to simply pick names that are less likely to be already in use in downstream projects. It is a simple and practical solution that doesn't require any language support but has obvious scalability issues:

  • Results in unnecessarily baroque names, harming the learning curve
  • Picking unused names becomes harder as the number of downstream projects grows
  • Adding a new symbol has a high chance of being blocked on naming debates

Emulating namespaces

Another form of name obfuscation is using "namespace" aggregates:

final abstract class Foo
{
    static:
    void foo() {};
}

It makes naming clashes even less likely and the resulting names more obvious, but it suffers from different important issues:

  • Can't be used for new symbols only, as it's a fundamental library API design decision
  • Does not leverage the D module system
  • Can't be used to solve Problem 2

Deprecations

Contrary to adding a new symbol, removing or renaming one is a much easier task because D supports the deprecated keyword, ensuring that library users will get early notifications about a to-be-removed symbol before it is actually gone.

Similar warnings should also be suitable for adding new symbols to critical libraries, but the language currently lacks a way to express such semantics.

Proposal

This document proposes the introduction of a new kind of symbol concept into the compiler, the future symbol. In terms of DMD compiler internals it can be defined simply as an additional Boolean field added to DSymbol indicating if the symbol is future or not.

To make examples more readable, an imaginary @future attribute may be used to mark such symbols, for example @future void foo(). However introducing such an attribute (or any other) is NOT part of the proposal and serves only for illustration purposes.

Symbols marked as future must have special treatment during semantic analysis:

  1. Whenever the compiler performs a symbol lookup, it has to check if future symbol is present in the result. If present, a deprecation message shall be printed and everything else should continue as if it wasn't present.

    import moda; // provides foo
    import modb; // provides @future foo
    
    foo(); // Deprecation: upcoming adddition of modb.foo will result in a symbol
           // clash, please disambiguate
  2. Whenever the compiler has to check for the presence of a symbol, if there is a match which is future symbol, a deprecation message has to be printed and everything else should continue as if the respective declaration wasn't present.

    class A
    {
        void foo() @future;
    }
    
    class B : A
    {
        void foo(); // Deprecation: upcoming addition of A.foo will result in a
                    // symbol clash, please adjust
    }
  3. References to future symbol within the module in which it is declared should not result in any deprecation messages because the module can be adjusted at the same time the symbol is actually added.

    void foo (long a);
    void foo (int a) @future;
    
    foo(42); // No point in printing deprecation here as it will keep compiling
             // even when @future is removed
  4. Any form of static reflection that provides a sequence of results, for example __traits(allMembers) or .tupleof, shall ignore any future symbol completely. Such code does not normally rely on the presence of any specific symbol and thus the addition of a new one is unlikely to cause problems. On the other hand, deprecation warnings issued from such contexts cannot be addressed by adjusting client code.

  5. Any form of static reflection that implies access to a specific symbol, for example, __traits(getMember), should still print the deprecation message.

  6. The override keyword must not issue an error if it is already applied to a symbol in a derived class which matches the @future symbol in a base class. This is important to enable the adjustment of user code so that the deprecation dissapears in case the user does intend to override it eventually:

    class Base {
        string name () @future;
    }
    
    class Derived1 : Base {
        string name (); // Deprecation: upcoming addition of Base.name will
                        // result in a symbol clash, please disambugate
    }
    
    class Derived2 : Base {
        override string name (); // OK
    }

    Note that such a "fake" override still follows the normal rules, i.e. it is not allowed to relax the attributes of the base method. In principle, the functionality of the @future symbol can be extended to warn about changes in attributes, too, but it requires more consideration about the final syntax and is not strictly necessary for the purpose of this DIP, thus investigating this topic is postponed to a separate proposal.

As a first implementation step, this functionality should only be available as a hard list of symbols built into the compiler itself and reserved for DRuntime usage. See the breaking changes section for a detailed rationale.

It may also be advised to start with an implementation addressing only the two specific problems described in this DIP and discussing generalization later before turning it into a language feature. It is possible that those two cases cover the vast majority of problems and fixing only those will greatly simplify the implementation.

Once the concept is proven with DRuntime, it should be proposed as an actual language feature - by either updating this DIP or proposing a new one. For example, one can simply expose it with a new attribute called @future.

Rationale

  1. Currently there is no way to warn users of a library about the planned addition of new symbols.
  2. Existing workarounds don't eliminate the risk of breakage completely and all have various practical drawbacks.
  3. The problems described make it difficult to add new symbols to object.d because it is implicitly imported in all D modules of all D projects, increasing the chance of name clashes as the D userbase grows.
  4. The proposed change is not deemed of high importance for many D projects. Instead, it is proposed on the basis of being very important for those few that are in heavy use, making most of its impact indirect.
  5. Having a simple and reliable way to add new symbols to the runtime will reduce the amount of time spent debating names and improve the readability of the resulting APIs.
  6. Making new functionality available only to compiler developers will initially make it possible to adjust semantics or even completely revert the feature based on practical experience - without any risk that it will affect other users of the feature.

As an extra benefit, such a facility can provide a deprecation path for introducing new language keywords if the need arises, though this use case is beyond the scope of this DIP and won't be evaluated in detail.

Comparison against other languages

Existing programming languages can be separated into three main groups with regard to this issue:

With a single global namespace

Examples: C, early PHP

Such languages tend to rely on using obscure names that makes symbol clashes less likely, but in general the problem is still present. When the problem manifests, the situation becomes much worse than in D, because such languages lack the tools to disambiguate name clashes explicitly.

With namespaces, allowing non-qualified access

Examples: C++, C#

This category is most similar to D, meaning that the language provides tools to disambiguate any names but at the same time allows imports with non-qualified access to imported symbols.

In D this is the default import behavior.

In C++ : using namespace Namespace.Subnamespace.

In C# : using Namespace.Subnamespace

The importance of the problem varies depending on the prevalent idioms used in the language and other available tools. For example, the C# documentation recommends using the dedicated namespace access operator :: instead of the uniform . to reduce the risk of clashes.

However, there are a few aspects specific to D that make the issues somewhat more important:

  • In all the mentioned languages but D, a common convention is to only use unqualified access for symbols in the standard library. In D, plain import is the most widespread way to use all libraries, including those in the DUB registry.
  • The D runtime features the module object.d, which is implicitly imported everywhere, greatly increasing the chance for name clashes with any symbol it contains.

With namespaces, restricted non-qualified access

Examples: Rust, Go

In both Rust and Go, normal imports still require using the module name for qualification. For example, in Rust use Namespace::Subnamespace::Module still requires explicitly writing Module::foo() to call the foo function.

Both languages provide a way to enable unqualified access (import . "Module" in Go, use Namespace::Module::* in Rust), but this feature is heavily discouraged from usage in production projects exactly because of the problems described in this DIP.

Using qualified lookups consistently solves the issue described in this DIP but such an approach is too different to the established idiomatic D coding style to be considered. It is also not applicable to the problem with overriding/inheritance, which the mentioned languages don't have because they don't support classical class hierarchy-based polymorphism.

Breaking changes / deprecation process

The proposed semantics are purely additive and do not affect existing code. However, this DIP is likely to cause bugs during its initial implementation because symbol resolution in D is non-trivial.

Because of that, the intended implementation approach is to initially introduce the proposed concept as a feature internal to the compiler and use it for DRuntime symbols only. Otherwise, there is a certain chance of releasing the feature with bugs in symbol resolution semantics, which will become relied upon and widely used as a feature. If that happens, it will be much more difficult to fix the implementation to act as intended.

Once the implementation is confirmed to be working, adding actual syntax support is a trivial task unlikely to cause subsequent bugs.

Alternatively, if it proves to be simpler to implement, the feature can be added as an attribute with the reserved double-underscore prefix (i.e. @__future) with compiler checks ensuring that it is not used outside of core.*/std.*/object modules. If that course of action is chosen, such an attribute should be defined in the core.attribute module of DRuntime.

Examples

This proposal comes from Sociomantic's attempt to enhance the definition of Throwable in DRuntime to match internal changes: dlang/druntime#1445

The change itself was trivial (adding one new method to the Throwable class) but has resulted in at least one reported regression because of the override clash problem.

With the change proposed in this document it will become possible to add such a method by initially marking it as future symbol in one DMD release and actually adding the new method in the release after.

Copyright & License

Copyright (c) 2017 by the D Language Foundation

Licensed under Creative Commons Zero 1.0

Reviews

This DIP was accepted by the language authors. Before reaching the final decision, they consulted with Sociomantic to get feedback on their experience with the feature and clarify their understanding of it. They accepted the proposal with one exception: rather than restrict it to internal usage in DRuntime, it will be fully available upon implementation. As this is a purely additive feature, it should be enough to warn users initially that it's in "beta".

Formal Review Feedback

Preliminary Review Round 1