[flang] Add the proposal document and rationale for the internal naming module that was previously added.

Summary: This document describes how uniquing of internal names is done. This name uniquing is done to support the constraints and invariants of the FIR dialect of MLIR.

Reviewers: jeanPerier, mehdi_amini, DavidTruby, jdoerfert, sscalpone, kiranchandramohan

Reviewed By: jeanPerier, sscalpone, kiranchandramohan

Subscribers: tskeith, kiranchandramohan, rriddle, llvm-commits

Tags: #llvm, #flang

Differential Revision: https://reviews.llvm.org/D79089
1 parent
5d46e4b
commit 7875362
Showing 1 changed file with 118 additions and 0 deletions.
## Bijective Internal Name Uniquing

FIR has a flat namespace. No two objects may have the same name at
the module level. (These would be functions, globals, etc.)
This necessitates some sort of encoding scheme to unique
symbols from the front-end into FIR.

Another requirement is to be able to reverse these unique names and
recover the associated symbol in the symbol table.

Fortran is case-insensitive, which allows the compiler to convert the
user's identifiers to all lower case. Such a universal conversion implies
that all upper case letters are available for use in uniquing.
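
Because tags are upper case and user identifiers are lower case, a uniqued name can be split back into its components without length prefixes or escaping. The following Python sketch illustrates that idea only. `split_uniqued` is a hypothetical helper, not part of flang, and it assumes identifiers contain nothing but lower-case letters, digits, and underscores.

```python
import re


def split_uniqued(name):
    """Split a `_Q`-prefixed name into (tag, identifier) pairs.

    Hypothetical helper for illustration; not flang's implementation.
    """
    if not name.startswith("_Q"):
        raise ValueError(f"not a uniqued name: {name!r}")
    # A run of upper-case letters is a tag; the run of lower-case
    # letters, digits, and underscores after it is the identifier
    # (or the numeric value, for KIND parameters).
    return re.findall(r"([A-Z]+)([a-z0-9_]*)", name[2:])


print(split_uniqued("_QMmodSs1modSs2modFsubPfun"))
```

Note that this naive split merges adjacent tags (for example `D` followed by `T`) into a single upper-case run, so a real decoder would need to know the tag vocabulary described in the rest of this document.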

### Prefix `_Q`

All uniqued names have the prefix sequence `_Q` to indicate the name has
been uniqued. (Q is chosen because it is a
[low-frequency letter](http://pi.math.cornell.edu/~mec/2003-2004/cryptography/subs/frequencies.html)
in English.)

### Scope Building

Symbols can be scoped by the module, submodule, or procedure that contains
that symbol. After the `_Q` sigil, names are constructed from outermost to
innermost scope as:

* Module name prefixed with `M`
* Submodule name prefixed with `S`
* Procedure name prefixed with `F`

Given:
```
submodule (mod:s1mod) s2mod
...
subroutine sub
...
contains
function fun
```

The uniqued name of `fun` becomes:
```
_QMmodSs1modSs2modFsubPfun
```
Here `sub` is a containing scope and therefore takes `F`, while `fun` is the
procedure being named and takes `P` (see Procedures/Subprograms).
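
The scope-building rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not flang's implementation: `SCOPE_TAGS` and `unique_procedure` are hypothetical names, and the caller is assumed to supply the enclosing scopes from outermost to innermost.

```python
# Tags for enclosing scopes, as listed in this section.
SCOPE_TAGS = {"module": "M", "submodule": "S", "procedure": "F"}


def unique_procedure(scopes, name):
    """Build the uniqued name of procedure `name`.

    `scopes` is a list of (kind, identifier) pairs, outermost first.
    Hypothetical helper for illustration only.
    """
    parts = ["_Q"]
    for kind, ident in scopes:
        # Fortran is case-insensitive, so identifiers are lower-cased.
        parts.append(SCOPE_TAGS[kind] + ident.lower())
    # The procedure being named takes the `P` prefix.
    parts.append("P" + name.lower())
    return "".join(parts)


scopes = [
    ("module", "mod"),
    ("submodule", "s1mod"),
    ("submodule", "s2mod"),
    ("procedure", "sub"),
]
print(unique_procedure(scopes, "fun"))
```

With an empty scope list the same helper produces the top-level form, for example `_QPsub` for `subroutine sub`.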

### Common blocks

* A common block name will be prefixed with `B`

### Module scope global data

* A global data entity is prefixed with `E`
* A global entity that is constant (parameter) will be prefixed with `EC`
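
As a rough sketch of how these entity tags might combine with a scope prefix, the hypothetical Python helper below concatenates `_Q`, an optional scope string, the tag, and the lower-cased name. The resulting names for module variables and constants do not appear in this document; they are an assumption that follows from composing the scope-building rule with the tags above.

```python
# Entity tags from the two sections above.
ENTITY_TAGS = {"common_block": "B", "variable": "E", "constant": "EC"}


def unique_entity(kind, name, scope_prefix=""):
    """Hypothetical helper: `_Q` + scope prefix + tag + lower-cased name."""
    return "_Q" + scope_prefix + ENTITY_TAGS[kind] + name.lower()


# Assumed results, composed from the documented rules:
print(unique_entity("common_block", "blk"))
print(unique_entity("variable", "x", scope_prefix="Mm"))
print(unique_entity("constant", "PI", scope_prefix="Mm"))
```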

### Procedures/Subprograms

* A procedure/subprogram is prefixed with `P`

Given:
```
subroutine sub
```
The uniqued name of `sub` becomes:
```
_QPsub
```

### Derived types and related

* A derived type is prefixed with `T`
* If a derived type has KIND parameters, they are listed in a consistent
  canonical order where each takes the form `Ki` and where _i_ is the
  compile-time constant value. (All type parameters are integer.) If _i_
  is a negative value, the prefix `KN` will be used and _i_ will reflect
  the magnitude of the value.

Given:
```
module mymodule
type mytype
integer :: member
end type
...
```
The uniqued name of `mytype` becomes:
```
_QMmymoduleTmytype
```

Given:
```
type yourtype(k1,k2)
integer, kind :: k1, k2
real :: mem1
complex :: mem2
end type
```

The uniqued name of `yourtype`, where `k1=4` and `k2=-6` at compile time, becomes:
```
_QTyourtypeK4KN6
```
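
The KIND-parameter encoding can be sketched as follows. `encode_kinds` and `unique_derived_type` are hypothetical Python helpers, not flang code. The document does not spell out the canonical order, so the sketch assumes the caller already passes the values in that order.

```python
def encode_kinds(kinds):
    """Encode integer KIND values as `Ki`, or `KNi` for negative values."""
    out = []
    for k in kinds:
        # A negative value uses the `KN` prefix followed by its magnitude.
        out.append(f"KN{-k}" if k < 0 else f"K{k}")
    return "".join(out)


def unique_derived_type(name, kinds=(), scope_prefix=""):
    """Hypothetical helper: `_Q` + scope prefix + `T` + name + kinds."""
    return "_Q" + scope_prefix + "T" + name.lower() + encode_kinds(kinds)


print(unique_derived_type("mytype", scope_prefix="Mmymodule"))
print(unique_derived_type("yourtype", kinds=[4, -6]))
```

Using a separate `KN` prefix instead of a minus sign keeps the encoded name within letters and digits, which is consistent with the upper-case-tag scheme used everywhere else.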

* A derived type dispatch table is prefixed with `D`. The dispatch table
  for `type t` would be `_QDTt`.
* A type descriptor instance is prefixed with `C`. Intrinsic types can
  be encoded with their names and kinds. The type descriptor for the
  type `yourtype` above would be `_QCTyourtypeK4KN6`. The type
  descriptor for `REAL(4)` would be `_QCrealK4`.
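
The three examples in these bullets all place `D` or `C` directly after `_Q`, ahead of the encoded type. A minimal Python sketch of that pattern is shown below; the function names are hypothetical, and the sketch covers only unscoped types like those in the examples, since the document does not show where the tag goes for a type nested in a module.

```python
def dispatch_table_name(encoded_type):
    """Hypothetical helper: dispatch table for an already-encoded type."""
    return "_QD" + encoded_type


def type_descriptor_name(encoded_type):
    """Hypothetical helper: type descriptor for an already-encoded type."""
    return "_QC" + encoded_type


def encode_intrinsic(name, kind):
    """Encode an intrinsic type by its lower-cased name and kind."""
    return f"{name.lower()}K{kind}"


print(dispatch_table_name("Tt"))
print(type_descriptor_name("TyourtypeK4KN6"))
print(type_descriptor_name(encode_intrinsic("REAL", 4)))
```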

### Compiler-generated names

Compiler-generated names do not have to be mapped back to Fortran. These
names will be prefixed with `_QQ` and followed by a unique compiler-generated
identifier. There is, of course, no mapping back to a symbol
derived from the input source in this case, as no such symbol exists.
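
One way to picture this rule is a counter-based generator plus a predicate that tells the two classes of names apart. The Python sketch below is an assumption throughout: the document does not say what the unique identifier looks like, so the `hint` plus running number used here is purely illustrative, and both function names are hypothetical.

```python
import itertools

# Process-wide counter; every call draws a fresh number.
_next_id = itertools.count()


def fresh_internal_name(hint="tmp"):
    """Hypothetical generator: `_QQ` + hint + running number."""
    return f"_QQ{hint}{next(_next_id)}"


def maps_back_to_source(name):
    """True for `_Q` names that encode a source symbol, False for `_QQ`."""
    return name.startswith("_Q") and not name.startswith("_QQ")


a = fresh_internal_name()
b = fresh_internal_name()
print(a, b, maps_back_to_source(a), maps_back_to_source("_QPsub"))
```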