Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SR-5506] Change name mangling of subscripts to no longer contain the string "subscript" #48078

Closed
ahoppen opened this issue Jul 19, 2017 · 12 comments
Assignees
Labels
affects ABI Flag: Affects ABI compiler The Swift compiler in itself improvement

Comments

@ahoppen
Copy link
Contributor

ahoppen commented Jul 19, 2017

Previous ID SR-5506
Radar rdar://problem/33421242
Original Reporter @ahoppen
Type Improvement
Status Closed
Resolution Done
Additional Detail from JIRA
Votes 0
Component/s Compiler
Labels Improvement, AffectsABI
Assignee @ahoppen
Priority Medium

md5: 07c3a49891c6f12b15b2df18829ec01f

relates to:

  • SR-5568 Crash for class with subscript and member named "subscript" that have the same signature

Issue Description:

Subscripts are no longer represented internally by the identifier "subscript" but by a special DeclBaseName. In name mangling, however, the string "subscript" still surfaces (e.g. _T04test3FooC9subscriptyycfg). We should use a special flag like "fS" here instead, similar to how "fC" is used for constructors or "fD" for destructors.

@ahoppen
Copy link
Contributor Author

ahoppen commented Jul 19, 2017

Discussion on swift-dev: 1, 2

@ahoppen
Copy link
Contributor Author

ahoppen commented Jul 19, 2017

Before I start off, here's my plan for the new mangling scheme, @eeckstein, could you maybe have a look through and see if you've got any objections to it so far?

In the grammar, extend entity-spec by the following two rules for the subscript getters and setters respectively:

entity-spec ::= type 'fSg'
entity-spec ::= type 'fSs'

type specifies the function signature of the subscript and fSg/fSs denotes the enitity as a subscript getter/setter.

Example

In the following file named test.swift:

struct Input {}

struct Output {}

class Foo {
  subscript(myarg myarg: Input) -> Output {
    fatalError()
  }
}

The subscript getter would be mangled as _T04test3FooCAA6OutputVAA5InputV5myarg_tfSg.
It is currently mangled as _T04test3FooC9subscriptAA6OutputVAA5InputV5myarg_tcfg

@belkadan
Copy link
Contributor

Subscripts themselves also need to be mangled for SourceKit purposes: they get a unique USR, and act as a DeclContext for any generic parameters.

@eeckstein
Copy link
Member

We need to include an optional file discriminator in case the same subscript is defined in two files (with fileprivate):

entity-spec ::= type file-discriminator? 'fSg'
entity-spec ::= type file-discriminator? 'fSs'

@eeckstein
Copy link
Member

A comment on the demangler implementation: I suggest to not add a new demangler node (in DemangleNodes.def). Instead create the same node tree as now, e.g:

kind=Global
  kind=Getter
    kind=Structure
      kind=Module, text="m"
      kind=Identifier, text="Mystruct"
    kind=Identifier, text="subscript"
    kind=Type
      kind=FunctionType
        kind=ArgumentTuple
          kind=Type
            kind=Structure
              kind=Module, text="Swift"
              kind=Identifier, text="Int"
        kind=ReturnType
          kind=Type
            kind=Structure
              kind=Module, text="Swift"
              kind=Identifier, text="Int"

The re-mangler can then just make a special case for an "subscript" identifier.
This has the advantage that the old (< swift 4) de/re-mangling still works, which we still have to support.

@bob-wilson
Copy link

@swift-ci create

@ahoppen
Copy link
Contributor Author

ahoppen commented Jul 28, 2017

Thanks for the feedback. I've just tried to implement it and had a new idea that I like a lot better: Introduce a new level of indirection in the definition of decl-name that allows direct representation of special DeclNames:

decl-name ::= decl-base-name
decl-name ::= decl-base-name 'L' INDEX         // locally-discriminated declaration
decl-name ::= decl-base-name identifier 'LL'   // file-discriminated declaration

decl-base-name ::= identifier
decl-base-name ::= 'nS' // Special *n*ame *s*ubscript

Implementation would be trivial by just writing the 'nS' operator when mangling of a subscript DeclBaseName and adding support for parsing the 'nS' operator in the demangler. The only disadvantage I see is that we burn another character ('n') for mangling operators and it looks like the supply is already running low.

I feel like I need to create a new demangler node, though, so that we can differentiate between computed properties named 'subscript' and proper subscripts with the same signature (see SR-5568), but creating a new node "SubscriptName" (or a similar name) that replaces kind=Identifier, text="subscript" in the demangle tree should be fairly straightforward as well, even for old mangling.

Any thoughts and/or objections, especially for the 'nS' operator?

@eeckstein
Copy link
Member

> The only disadvantage I see is that we burn another character ('n') for mangling operators and it looks like the supply is already running low.

That's true. Therefore I suggest to use 'fSg'/'fSs'.
But this would still allow to do your other proposed change (make it part of decl-name).

I think if we go that way, it only makes sense if we also change 'fC'/'fc' to get rid of file-discriminator at all (and this would not be a trivial change).
Otherwise I prefer to have the subscript mangling consistent with 'fC'/'fc'.

@belkadan
Copy link
Contributor

I don't care where the file discriminator is, but you definitely need it. Otherwise two files can end up defining the same subscript when using private, which will break in WMO builds.

@ahoppen
Copy link
Contributor Author

ahoppen commented Jul 28, 2017

My idea was to still keep the 'fg', 'fs', or 'i' suffix to distinguish between subscript getters, setters and the subscript itself, mangling e.g. the getter in the example above as "_T04test3FooC nS AA6OutputVAA5InputV5myarg_tcfg". I thought this had the advantage that it exactly mirrors the internal representation of the subscript's getter being a getter to an entity that just happens to hava special name (which is now no longer an identifier) and that things like addressors (e.g. 'flo') would work out of the box, though I don't completely understand if they are necessary or even make sense.
Thinking about it a bit more, maybe jumping through a few extra hoops to specify the subscript at the end is the right decision. I'll try that implementation and see how much sense it makes.
Are there any more mangled names that include subscripts though apart from: getter, setter, the subscript itself?

@belkadan
Copy link
Contributor

There are other accessors besides getters and setters, but if you go through the generalized accessor code that should be fine.

Functions, closures, or types nested within subscript accessors will contain the subscript mangled name as their "context".

@ahoppen
Copy link
Contributor Author

ahoppen commented Jul 29, 2017

Having tried multiple approaches now, I decided to keep this as a public notepad of my ideas and thoughts. Feel free to comment if you've got any thoughts about it or just ignore it.

For the mangling of the subscript itself, we can just drop the "subscript" string from the mangled name since 'i' uniquely identifies that the mangled entity is a subscript. No problem here.

The function entity mangling (i.e. everything behind a 'f') is currently used for two different things:

  • Describing function unnamed entities on a type (e.g. constructor)

  • Describing function entities for named members of a type (e.g. property getter)

Thus only the last element in the mangled name string describes an unnamed entity (for 1. it's the constructor and for 2. it's the getter/setter part).

Subscript accessors now have two unnamed elements, namely the subscript and the accessor kind.

As far as I can see, this can be represented in the following ways:

  • Give the subscripts a special name that can't collide with any identifier

    • This was my second idea

    • Advangages:

      • Would be only a minor change to the existing implementation

      • The internal representation could stay nearly the same SubscriptName demangler node representing the newly introduced name

    • Disadvantage: It requires a new operator in the name mangling schem and I couldn't find any two-character group where this operator would fit into, so I'd suggest opening up a new one character group

    • Example subscript getter mangling: _T04test3FooC nS AA6OutputVAA5InputV5myarg_tcfg

  • Use the nearly unchanged internal representation from 1. but depict the mangled name as a subscript in the end

    • This was my first proposal

    • Example subscript getter mangling: _T04test3FooCAA6OutputVAA5InputV5myarg_tf S g

    • Advantage: Similar to constructor mangling

    • Disadvantage: Fairly complex and dirty implementation

      • The demangler would need to rewrite the internal data structure created so far to inject the `SubscriptName` into the demangle tree and the remangler would need to peek into the tree, to see if it contains a `SubscriptName` to decide if the function entity should be mangled with an 'S'
  • Make function entities refer to enities instead of names of entities

    • Currently, the accessor node in the demangle tree has three children: Context, name, type

    • Make the accessor node have a single child that is either a variable or a subscript

      • By definition, the subscript node would not have to contain a name
    • Example mangling of a subscript getter: _T04test3FooCAA6OutputVAA5InputV5myarg_tc i fg

    • Example mangling of a computed property getter: _T04main3FooC3barAA6OutputV v fg

    • Advantage:

      • Property and subscript accessors mangled exactly the same

      • Accurate representation of the accessors in the AST, IMHO

    • Disadvantages:

      • Increases the mangled name size of instance variable accessors by 1 byte

      • The internal representation of the mangled name needs to change with non-trivial changes to the old remangler being required

I am currently pursuing the third approach.

@swift-ci swift-ci transferred this issue from apple/swift-issues Apr 25, 2022
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects ABI Flag: Affects ABI compiler The Swift compiler in itself improvement
Projects
None yet
Development

No branches or pull requests

4 participants