464 changes: 464 additions & 0 deletions flang/documentation/OpenMP-4.5-grammar.txt


670 changes: 670 additions & 0 deletions flang/documentation/OpenMP-semantics.md


1,339 changes: 1,339 additions & 0 deletions flang/documentation/OptionComparison.md


103 changes: 103 additions & 0 deletions flang/documentation/Overview.md
@@ -0,0 +1,103 @@
<!--===- documentation/Overview.md
Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
See https://llvm.org/LICENSE.txt for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

# Overview of Compiler Phases

Each phase produces either correct output or fatal errors.

## Prescan and Preprocess

See: [Preprocessing.md](Preprocessing.md).

**Input:** Fortran source and header files, command line macro definitions,
set of enabled compiler directives (to be treated as directives rather than
comments).

**Output:**
- A "cooked" character stream: the entire program as a contiguous stream of
normalized Fortran source.
Extraneous whitespace and comments are removed (except comments that are
compiler directives that are not disabled) and case is normalized.
- Provenance information mapping each character back to the source it came from.
This is used in subsequent phases to issue error messages that refer to source locations.

**Entry point:** `parser::Parsing::Prescan`

**Command:** `f18 -E src.f90` dumps the cooked character stream

## Parse

**Input:** Cooked character stream.

**Output:** A parse tree representing a syntactically correct program,
rooted at a `parser::Program`.
See: [Parsing.md](Parsing.md) and [ParserCombinators.md](ParserCombinators.md).

**Entry point:** `parser::Parsing::Parse`

**Command:**
- `f18 -fdebug-dump-parse-tree -fparse-only src.f90` dumps the parse tree
- `f18 -funparse src.f90` converts the parse tree to normalized Fortran

## Validate Labels and Canonicalize Do Statements

**Input:** Parse tree.

**Output:** The parse tree with label constraints and construct names checked,
and each `LabelDoStmt` converted to a `NonLabelDoStmt`.
See: [LabelResolution.md](LabelResolution.md).

**Entry points:** `semantics::ValidateLabels`, `parser::CanonicalizeDo`

## Resolve Names

**Input:** Parse tree (without `LabelDoStmt`) and `.mod` files from compilation
of USEd modules.

**Output:**
- Tree of scopes populated with symbols and types
- Parse tree with some refinements:
- each `parser::Name::symbol` field points to one of the symbols
- each `parser::TypeSpec::declTypeSpec` field points to one of the types
- array element references that were parsed as function references or
statement functions are corrected

**Entry points:** `semantics::ResolveNames`, `semantics::RewriteParseTree`

**Command:** `f18 -fdebug-dump-symbols -fparse-only src.f90` dumps the
tree of scopes and symbols in each scope

## Check DO CONCURRENT Constraints

**Input:** Parse tree with names resolved.

**Output:** Parse tree with semantically correct DO CONCURRENT loops.

## Write Module Files

**Input:** Parse tree with names resolved.

**Output:** For each module and submodule, a `.mod` file containing a minimal
Fortran representation suitable for compiling program units that depend on it.
See [ModFiles.md](ModFiles.md).

## Analyze Expressions and Assignments

**Input:** Parse tree with names resolved.

**Output:** Parse tree with `parser::Expr::typedExpr` filled in and semantic
checks performed on all expressions and assignment statements.

**Entry points:** `semantics::AnalyzeExpressions`, `semantics::AnalyzeAssignments`

## Produce the Intermediate Representation

**Input:** Parse tree with names and labels resolved.

**Output:** An intermediate representation of the executable program.
See [FortranIR.md](FortranIR.md).
164 changes: 164 additions & 0 deletions flang/documentation/ParserCombinators.md
@@ -0,0 +1,164 @@
<!--===- documentation/ParserCombinators.md
Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
See https://llvm.org/LICENSE.txt for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

## Concept
The Fortran language recognizer here can be classified as an LL recursive
descent parser. It is composed from a *parser combinator* library that
defines a few fundamental parsers and a few ways to compose them into more
powerful parsers.

For our purposes here, a *parser* is any object that attempts to recognize
an instance of some syntax from an input stream. It may succeed or fail.
On success, it may return some semantic value to its caller.

In C++ terms, a parser is any instance of a class that
1. has a `constexpr` default constructor,
1. defines a type named `resultType`, and
1. provides a function (`const` member or `static`) that accepts a reference to a
`ParseState` as its argument and returns a `std::optional<resultType>` as a
result, with the presence or absence of a value in the `std::optional<>`
signifying success or failure, respectively.
```
std::optional<resultType> Parse(ParseState &) const;
```
The `resultType` of a parser is typically the class type of some particular
node type in the parse tree.
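
As a minimal sketch of the three requirements above, the following toy parser recognizes a single decimal digit. The `ParseState` shown here is a hypothetical stand-in (just a cursor over a `std::string_view`), not the real class, and `DigitParser` is an illustrative name that is not part of the actual parser.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the real ParseState: a cursor over some text.
struct ParseState {
  std::string_view text;
  std::size_t pos{0};
};

// Toy parser that recognizes one decimal digit and returns its numeric value.
struct DigitParser {
  using resultType = int;            // requirement 2: a type named resultType
  constexpr DigitParser() = default; // requirement 1: constexpr default constructor
  // Requirement 3: an empty std::optional signals failure.
  std::optional<resultType> Parse(ParseState &state) const {
    if (state.pos < state.text.size()) {
      char ch{state.text[state.pos]};
      if (ch >= '0' && ch <= '9') {
        ++state.pos; // consume the digit only on success
        return ch - '0';
      }
    }
    return std::nullopt;
  }
};
```

In this sketch a failed parse leaves the cursor where it was, so the caller can try something else from the same position.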

`ParseState` is a class that encapsulates a position in the source stream,
collects messages, and holds a few state flags that determine tokenization
(e.g., are we in a character literal?). Instances of `ParseState` are
independent and complete -- they are cheap to duplicate whenever necessary to
implement backtracking.
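
The remark about cheap duplication can be illustrated with a small sketch, assuming a toy `ParseState` that holds only a cursor. `ArrowParser`, `Success`, and `Attempt` are hypothetical names chosen for this example; the real library provides its own combinators for this purpose.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the real ParseState.
struct ParseState {
  std::string_view text;
  std::size_t pos{0};
};

struct Success {};

// Toy parser for the two-character sequence "=>".
// It may consume '=' and then fail, leaving the cursor advanced.
struct ArrowParser {
  using resultType = Success;
  constexpr ArrowParser() = default;
  std::optional<Success> Parse(ParseState &state) const {
    if (state.pos < state.text.size() && state.text[state.pos] == '=') {
      ++state.pos;
      if (state.pos < state.text.size() && state.text[state.pos] == '>') {
        ++state.pos;
        return Success{};
      }
    }
    return std::nullopt;
  }
};

// Runs a parser and restores the saved state if it fails.
template <typename P>
std::optional<typename P::resultType> Attempt(const P &parser, ParseState &state) {
  ParseState backtrack{state}; // an independent, complete copy
  auto result = parser.Parse(state);
  if (!result.has_value()) {
    state = backtrack; // undo any partial consumption
  }
  return result;
}
```

Because the copy is self-contained, restoring it is a plain assignment; no undo log is needed.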

The `constexpr` default constructor of a parser is important. The functions
(below) that operate on instances of parsers are themselves all `constexpr`.
This use of compile-time expressions allows the entirety of a recursive
descent parser for a language to be constructed at compilation time through
the use of templates.

### Fundamental Predefined Parsers
These objects and functions are (or return) the fundamental parsers:

* `ok` is a trivial parser that always succeeds without advancing.
* `pure(x)` returns a trivial parser that always succeeds without advancing,
returning some value `x`.
* `fail<T>(msg)` denotes a trivial parser that always fails, emitting the
given message as a side effect. The template parameter is the type of
the value that the parser never returns.
* `cut` is a trivial parser that always fails silently.
* `nextCh` consumes the next character and returns its location,
and fails at EOF.
* `"xyz"_ch` succeeds if the next character consumed matches any of those
in the string and returns its location. Be advised that the source
will have been normalized to lower case (minuscule) letters outside
character and Hollerith literals and edit descriptors before parsing.

### Combinators
These functions and operators combine existing parsers to generate new parsers.
They are `constexpr`, so they should be viewed as type-safe macros.

* `!p` succeeds if p fails, and fails if p succeeds.
* `p >> q` fails if p does, otherwise running q and returning its value when
it succeeds.
* `p / q` fails if p does, otherwise running q and returning p's value
if q succeeds.
* `p || q` succeeds if p does, otherwise running q. The two parsers must
have the same type, and the value returned by the first succeeding parser
is the value of the combination.
* `first(p1, p2, ...)` returns the value of the first parser that succeeds.
All of the parsers in the list must return the same type.
It is essentially the same as `p1 || p2 || ...` but has a slightly
faster implementation and may be easier to format in your code.
* `lookAhead(p)` succeeds if p does, but doesn't modify any state.
* `attempt(p)` succeeds if p does, safely preserving state on failure.
* `many(p)` recognizes a greedy sequence of zero or more nonempty successes
of p, and returns `std::list<>` of their values. It always succeeds.
* `some(p)` recognizes a greedy sequence of one or more successes of p.
It fails if p immediately fails.
* `skipMany(p)` is the same as `many(p)`, but it discards the results.
* `maybe(p)` tries to match p, returning an `std::optional<T>` value.
It always succeeds.
* `defaulted(p)` matches p, and when p fails it returns a
default-constructed instance of p's resultType. It always succeeds.
* `nonemptySeparated(p, q)` repeatedly matches "p q p q p q ... p",
returning a `std::list<>` of only the values of the p's. It fails if
p immediately fails.
* `extension(p)` parses p if strict standard compliance is disabled,
or with a warning if nonstandard usage warnings are enabled.
* `deprecated(p)` parses p if strict standard compliance is disabled,
with a warning if deprecated usage warnings are enabled.
* `inContext(msg, p)` runs p within an error message context; any
message that `p` generates will be tagged with `msg` as its
context. Contexts may nest.
* `withMessage(msg, p)` succeeds if `p` does, and if it does not,
it discards the messages from `p` and fails with the specified message.
* `recovery(p, q)` is equivalent to `p || q`, except that error messages
generated from the first parser are retained, and a flag is set in
the ParseState to remember that error recovery was necessary.
* `localRecovery(msg, p, q)` is equivalent to
`recovery(withMessage(msg, p), defaulted(cut >> p) >> q)`.
It is useful for targeted error recovery situations within statements.

Note that
```
a >> b >> c / d / e
```
matches a sequence of five parsers, but returns only the result that was
obtained by matching `c`.
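
To make the note above concrete, here is a simplified sketch of how `>>` and `/` could be modeled as `constexpr` operators over parser types. It is not the library's implementation: `ParseState`, `Ch`, `SequenceRight`, and `SequenceLeft` are hypothetical names for this example. In C++ the `/` operator binds more tightly than `>>`, so the expression groups as `(a >> b) >> ((c / d) / e)`.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for the real ParseState.
struct ParseState {
  std::string_view text;
  std::size_t pos{0};
};

// Toy parser that matches exactly the character C and returns it.
template <char C> struct Ch {
  using resultType = char;
  constexpr Ch() = default;
  std::optional<char> Parse(ParseState &state) const {
    if (state.pos < state.text.size() && state.text[state.pos] == C) {
      ++state.pos;
      return C;
    }
    return std::nullopt;
  }
};

// p >> q: run p, then q, and keep q's value.
template <typename PA, typename PB> struct SequenceRight {
  using resultType = typename PB::resultType;
  constexpr SequenceRight() = default;
  std::optional<resultType> Parse(ParseState &state) const {
    if (PA{}.Parse(state).has_value()) {
      return PB{}.Parse(state);
    }
    return std::nullopt;
  }
};

// p / q: run p, then q, and keep p's value.
template <typename PA, typename PB> struct SequenceLeft {
  using resultType = typename PA::resultType;
  constexpr SequenceLeft() = default;
  std::optional<resultType> Parse(ParseState &state) const {
    auto result = PA{}.Parse(state);
    if (result.has_value() && PB{}.Parse(state).has_value()) {
      return result;
    }
    return std::nullopt;
  }
};

// The defaulted template parameters restrict the operators to parser types.
template <typename PA, typename PB, typename = typename PA::resultType,
    typename = typename PB::resultType>
constexpr SequenceRight<PA, PB> operator>>(PA, PB) {
  return {};
}

template <typename PA, typename PB, typename = typename PA::resultType,
    typename = typename PB::resultType>
constexpr SequenceLeft<PA, PB> operator/(PA, PB) {
  return {};
}

constexpr Ch<'a'> a;
constexpr Ch<'b'> b;
constexpr Ch<'c'> c;
constexpr Ch<'d'> d;
constexpr Ch<'e'> e;

// The whole grammar is assembled at compile time; only c's value is returned.
constexpr auto grammar = a >> b >> c / d / e;
```

Parsing the text `abcde` with `grammar` in this sketch consumes all five characters and yields the character matched by `c`.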

### Applicatives
The following *applicative* combinators combine parsers and modify or
collect the values that they return.

* `construct<T>(p1, p2, ...)` matches zero or more parsers in succession,
collecting their results and then passing them with move semantics to a
constructor for the type T if they all succeed.
If there is a single parser as the argument and it returns no usable
value but only success or failure (_e.g.,_ `"IF"_tok`), the default
nullary constructor of the type `T` is called.
* `sourced(p)` matches p, and fills in its `source` data member with the
locations of the cooked character stream that it consumed
* `applyFunction(f, p1, p2, ...)` matches one or more parsers in succession,
collecting their results and passing them as rvalue reference arguments to
some function, returning its result.
* `applyLambda([](&&x){}, p1, p2, ...)` is the same thing, but for lambdas
and other function objects.
* `applyMem(mf, p1, p2, ...)` is the same thing, but invokes a member
function of the result of the first parser for updates in place.
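
The idea behind `construct<T>` can be sketched as follows, under the assumption of a toy `ParseState` and a toy `Digit` parser. `Construct`, `TwoDigits`, and this simplified `construct` are hypothetical; the sketch omits the special case for parsers that return no usable value.

```cpp
#include <cstddef>
#include <optional>
#include <string_view>
#include <tuple>
#include <utility>

// Hypothetical stand-in for the real ParseState.
struct ParseState {
  std::string_view text;
  std::size_t pos{0};
};

// Toy parser: one decimal digit, returned as an int.
struct Digit {
  using resultType = int;
  constexpr Digit() = default;
  std::optional<int> Parse(ParseState &state) const {
    if (state.pos < state.text.size()) {
      char ch{state.text[state.pos]};
      if (ch >= '0' && ch <= '9') {
        ++state.pos;
        return ch - '0';
      }
    }
    return std::nullopt;
  }
};

// Runs the parsers Ps in order and, if all succeed, moves their
// results into a braced initialization of T.
template <typename T, typename... Ps> struct Construct {
  using resultType = T;
  constexpr Construct() = default;
  std::optional<T> Parse(ParseState &state) const {
    return ParseImpl(state, std::index_sequence_for<Ps...>{});
  }

private:
  template <std::size_t... I>
  std::optional<T> ParseImpl(ParseState &state, std::index_sequence<I...>) const {
    std::tuple<std::optional<typename Ps::resultType>...> results;
    // The fold over && runs left to right and stops at the first failure.
    if (((std::get<I>(results) = Ps{}.Parse(state)).has_value() && ...)) {
      return T{std::move(*std::get<I>(results))...};
    }
    return std::nullopt; // note: the state may be partially advanced here
  }
};

template <typename T, typename... Ps>
constexpr Construct<T, Ps...> construct(Ps...) {
  return {};
}

// Hypothetical parse tree node built from two digits.
struct TwoDigits {
  int tens;
  int ones;
};

constexpr auto twoDigits = construct<TwoDigits>(Digit{}, Digit{});
```

Applied to the text `42!`, `twoDigits` in this sketch produces a `TwoDigits` holding 4 and 2 and leaves the cursor before the `!`.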

### Token Parsers
Last, we have these basic parsers on which the actual grammar of Fortran
is built. All of the following parsers consume characters acquired from
`nextCh`.

* `space` always succeeds after consuming any spaces
* `spaceCheck` always succeeds after consuming any spaces, and can emit
a warning if there was no space in free form code before a character
that could continue a name or keyword
* `digit` matches one cooked decimal digit (0-9)
* `letter` matches one cooked letter (A-Z)
* `"..."_tok` matches the content of the string, skipping spaces before and
after. Internal spaces are optional matches. The `_tok` suffix is
optional when the parser appears before the combinator `>>` or after
the combinator `/`.
* `"..."_sptok` is a string match in which the spaces are required in
free form source.
* `"..."_id` is a string match for a complete identifier (not a prefix of
a longer identifier or keyword).
* `parenthesized(p)` is shorthand for `"(" >> p / ")"`.
* `bracketed(p)` is shorthand for `"[" >> p / "]"`.
* `nonemptyList(p)` matches a comma-separated list of one or more
instances of p.
* `nonemptyList(errorMessage, p)` is equivalent to
`withMessage(errorMessage, nonemptyList(p))`, which allows one to supply
a meaningful error message in the event of an empty list.
* `optionalList(p)` is the same thing, but can be empty, and always succeeds.

### Debugging Parser
Finally, a string literal `"..."_debug` denotes a parser that emits the string to
`llvm::errs()` and succeeds. It is useful for tracing while debugging a parser but
should not be committed to production code.
213 changes: 213 additions & 0 deletions flang/documentation/Parsing.md
@@ -0,0 +1,213 @@
<!--===- documentation/Parsing.md
Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
See https://llvm.org/LICENSE.txt for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

The F18 Parser
==============
This program source code implements a parser for the Fortran programming
language.

The draft ISO standard for Fortran 2018 dated July 2017 was used as the
primary definition of the language. The parser also accepts many features
from previous versions of the standard that are no longer part of the Fortran
2018 language.

It also accepts many features that have never been part of any version
of the standard Fortran language but have been supported by previous
implementations and are known or suspected to remain in use. As a
general principle, we want to recognize and implement any such feature
so long as it does not conflict with requirements of the current standard
for Fortran.

The parser is implemented in standard ISO C++ and requires the 2017
edition of the language and library. The parser constitutes a reentrant
library with no mutable or constructed static data. Best modern C++
programming practices are observed to ensure that the ownership of
dynamic memory is clear, that value rather than object semantics are
defined for the data structures, that most functions are free from
invisible side effects, and that the strictest available type checking
is enforced by the C++ compiler when the Fortran parser is built.
Class inheritance is rare and dynamic polymorphism is avoided in favor
of modern discriminated unions. To the furthest reasonable extent, the
parser has been implemented in a declarative fashion that corresponds
closely to the text of the Fortran language standard.

The several major modules of the Fortran parser are composed into a
top-level Parsing class, by means of which one may drive the parsing of a
source file and receive its parse tree and error messages. The interfaces
of the Parsing class correspond to the two major passes of the parser,
which are described below.

Prescanning and Preprocessing
-----------------------------
The first pass is performed by an instance of the Prescanner class,
with help from an instance of Preprocessor.

The prescanner generates the "cooked character stream", implemented
by a CookedSource class instance, in which:
* line ends have been normalized to single ASCII LF characters (UNIX newlines)
* all `INCLUDE` files have been expanded
* all continued Fortran source lines have been unified
* all comments and insignificant spaces have been removed
* fixed form right margins have been clipped
* extra blank card columns have been inserted into character literals
and Hollerith constants
* preprocessing directives have been implemented
* preprocessing macro invocations have been expanded
* legacy `D` lines in fixed form source have been omitted or included
* except for the payload in character literals, Hollerith constants,
and character and Hollerith edit descriptors, all letters have been
normalized to lower case
* all original non-ASCII characters in Hollerith constants have been
decoded and re-encoded into UTF-8

Lines in the cooked character stream can be of arbitrary length.

The purpose of the cooked character stream is to enable the implementation
of a parser whose sole concern is the recognition of the Fortran language
from productions that closely correspond to the grammar that is presented
in the Fortran standard, without having to deal with the complexity of
all of the source-level concerns in the preceding list.
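
A few of the normalizations in the list above can be sketched for a single free form line. This is a toy under stated assumptions, not the prescanner: `CookLine` is a hypothetical function that handles only `!` comments, trailing spaces, and case normalization outside quoted character literals. It ignores continuation lines, Hollerith constants, directives, and fixed form.

```cpp
#include <cctype>
#include <string>
#include <string_view>

// Toy normalization of one free form source line.
std::string CookLine(std::string_view line) {
  std::string cooked;
  char quote{'\0'}; // the active quote character, or '\0' outside a literal
  for (char ch : line) {
    if (quote != '\0') {
      cooked += ch; // the payload of a character literal is copied verbatim
      if (ch == quote) {
        quote = '\0';
      }
    } else if (ch == '\'' || ch == '"') {
      quote = ch;
      cooked += ch;
    } else if (ch == '!') {
      break; // the rest of the line is a comment
    } else {
      cooked += static_cast<char>(std::tolower(static_cast<unsigned char>(ch)));
    }
  }
  while (!cooked.empty() && cooked.back() == ' ') {
    cooked.pop_back(); // drop spaces left behind by a removed comment
  }
  return cooked;
}
```

A parser that reads the result never has to ask whether a `!` begins a comment or whether `PRINT` and `print` are the same keyword.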

The implementation of the preprocessor interacts with the prescanner by
means of _token sequences_. These are partitionings of input lines into
contiguous virtual blocks of characters, and are the only place in this
Fortran compiler in which we have reified a tokenization of the program
source; the parser proper does not have a tokenizer. The prescanner
builds these token sequences out of source lines and supplies them
to the preprocessor, which interprets directives and expands macro
invocations. The token sequences returned by the preprocessor are then
marshaled to constitute the cooked character stream that is the output of
the prescanner.

The preprocessor and prescanner can both instantiate new temporary
instances of the Prescanner class to locate, open, and process any
include files.

The tight interaction and mutual design of the prescanner and preprocessor
enable a principled implementation of preprocessing for the Fortran
language that implements a reasonable facsimile of the C language
preprocessor that is fully aware of Fortran's source forms, line
continuation mechanisms, case insensitivity, token syntax, &c.

The preprocessor always runs. There's no good reason for it not to.

The content of the cooked character stream is available and useful
for debugging, being as it is a simple value forwarded from the first major
pass of the compiler to the second.

Source Provenance
-----------------
The prescanner constructs a chronicle of every file that is read by the
parser, viz. the original source file and all others that it directly
or indirectly includes. One copy of the content of each of these files
is mapped or read into the address space of the parser. Memory mapping
is used initially, but files with DOS line breaks or a missing terminal
newline are immediately normalized in a buffer when necessary.

The virtual input stream, which marshals every appearance of every file
and every expansion of every macro invocation, is not materialized as
an actual stream of bytes. There is, however, a mapping from each byte
position in this virtual input stream back to whence it came (maintained
by an instance of the AllSources class). Offsets into this virtual input
stream constitute values of the Provenance class. Provenance values,
and contiguous ranges thereof, are used to describe and delimit source
positions for messaging.

Further, every byte in the cooked character stream supplied by the
prescanner to the parser can be inexpensively mapped to its provenance.
Simple `const char *` pointers to characters in the cooked character
stream, or to contiguous ranges thereof, are used as source position
indicators within the parser and in the parse tree.
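
One way to picture the inexpensive mapping from cooked characters back to provenance is a sorted table of contiguous ranges searched by binary search. The sketch below is an assumption about the general technique, not a description of the real data structures: `CookedToProvenance` and `Range` are hypothetical names, and it uses offsets rather than `const char *` pointers to keep the example short.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <vector>

using Provenance = std::size_t; // an offset into the virtual input stream

class CookedToProvenance {
public:
  // Records that the next `length` cooked characters came from the
  // contiguous provenance range beginning at `origin`.
  void Put(Provenance origin, std::size_t length) {
    ranges_.push_back(Range{cookedSize_, length, origin});
    cookedSize_ += length;
  }

  // Maps an offset in the cooked stream back to its provenance.
  std::optional<Provenance> Map(std::size_t cookedOffset) const {
    auto it = std::upper_bound(ranges_.begin(), ranges_.end(), cookedOffset,
        [](std::size_t offset, const Range &r) { return offset < r.cookedStart; });
    if (it == ranges_.begin()) {
      return std::nullopt;
    }
    --it; // the last range that starts at or before cookedOffset
    std::size_t delta{cookedOffset - it->cookedStart};
    if (delta >= it->length) {
      return std::nullopt; // past the end of the cooked stream
    }
    return it->origin + delta;
  }

private:
  struct Range {
    std::size_t cookedStart;
    std::size_t length;
    Provenance origin;
  };
  std::vector<Range> ranges_;
  std::size_t cookedSize_{0};
};
```

Lookup cost grows with the logarithm of the number of ranges, not with the size of the source, which is what makes it practical to defer the mapping until a message is actually emitted.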

Messages
--------
Message texts, and snprintf-like formatting strings for constructing
messages, are instantiated in the various components of the parser with
C++ user-defined string literals tagged with `_err_en_US` and `_en_US`
(signifying fatality and language, with the default being the dialect of
English used in the United States) so that they may be easily identified
for localization. As described above, messages are associated with
source code positions by means of provenance values.
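
The tagging scheme can be sketched with two literal operators. Only the suffixes `_err_en_US` and `_en_US` come from the text above; the `MessageText` type and its members are hypothetical and much simpler than whatever the parser actually uses.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical message type: the text plus a fatality flag.
struct MessageText {
  std::string_view text;
  bool isFatal;
};

// A fatal message in US English.
constexpr MessageText operator""_err_en_US(const char *str, std::size_t n) {
  return MessageText{std::string_view{str, n}, true};
}

// A non-fatal message in US English.
constexpr MessageText operator""_en_US(const char *str, std::size_t n) {
  return MessageText{std::string_view{str, n}, false};
}

constexpr MessageText missingParen = "expected ')'"_err_en_US;
constexpr MessageText context = "in the context: %s"_en_US;
static_assert(missingParen.isFatal && !context.isFatal);
```

Because every message passes through one of these suffixes, a simple text search for the suffix finds all the strings that would need translation.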

The Parse Tree
--------------
Each of the ca. 450 numbered requirement productions in the standard
Fortran language grammar, as well as the productions implied by legacy
extensions and preserved obsolescent features, maps to a distinct class
in the parse tree so as to maximize the efficacy of static type checking
by the C++ compiler.

A transcription of the Fortran grammar appears with production requirement
numbers in the commentary before these class definitions, so that one
may easily refer to the standard (or to the parse tree definitions while
reading that document).

Three paradigms collectively implement most of the parse tree classes:
* *wrappers*, in which a single data member `v` has been encapsulated
in a new type
* *tuples* (or product types), in which several values of arbitrary
types have been encapsulated in a single data member `t` whose type
is an instance of `std::tuple<>`
* *discriminated unions* (or sum types), in which one value whose type is
a dynamic selection from a set of distinct types is saved in a data
member `u` whose type is an instance of `std::variant<>`

The use of these patterns is a design convenience, and exceptions to them
are not uncommon wherever it made better sense to write custom definitions.
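
The three paradigms can be sketched with a few toy node types. The member names `v`, `t`, and `u` follow the text above; the node types `IntLiteral`, `Name`, `Operand`, and `Add`, and the `Describe` helper, are hypothetical and do not correspond to real parse tree classes.

```cpp
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <variant>

// Wrapper: a single data member v in a new type. Default and copy
// construction are deleted, so nodes can only be moved.
struct IntLiteral {
  explicit IntLiteral(long value) : v{value} {}
  IntLiteral() = delete;
  IntLiteral(const IntLiteral &) = delete;
  IntLiteral(IntLiteral &&) = default;
  long v;
};
static_assert(!std::is_copy_constructible_v<IntLiteral>);

// Another wrapper (kept copyable here only for brevity).
struct Name {
  explicit Name(std::string s) : v{std::move(s)} {}
  std::string v;
};

// Discriminated union: one alternative chosen at run time, stored in u.
struct Operand {
  std::variant<IntLiteral, Name> u;
};

// Tuple: several values stored in a single member t.
struct Add {
  std::tuple<Operand, Operand> t;
};

// A visitor selects behavior by the static type of the active alternative.
struct Describer {
  std::string operator()(const IntLiteral &x) const {
    return "literal " + std::to_string(x.v);
  }
  std::string operator()(const Name &x) const { return "name " + x.v; }
};

std::string Describe(const Operand &operand) {
  return std::visit(Describer{}, operand.u);
}
```

If a new alternative is added to the variant in this sketch, `std::visit` fails to compile until `Describer` handles it, which is the kind of static checking the design aims for.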

Parse tree entities should be viewed as values, not objects; their
addresses should not be abused for purposes of identification. They are
assembled with C++ move semantics during parse tree construction.
Their default and copy constructors are deliberately deleted in their
declarations.

The `std::list<>` data type is used in the parse tree so that pointers
to other relevant entries in the tree remain reliable. Because the tree's lists
are moved and spliced at certain points, `std::list<>` provides the necessary
guarantee that pointers into these lists remain stable.
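
The stability guarantee is a standard property of `std::list<>`: splicing transfers nodes between lists without copying or relocating the elements. A minimal sketch, with the hypothetical helper `AppendAll` and plain strings standing in for parse tree nodes:

```cpp
#include <list>
#include <string>

// Moves every element of `from` to the end of `to`. The elements are
// relinked, not copied, so pointers to them stay valid.
void AppendAll(std::list<std::string> &to, std::list<std::string> &from) {
  to.splice(to.end(), from);
}
```

A pointer taken to an element of `from` before the call still refers to the same element afterwards, now reachable through `to`. The same operation on a `std::vector<>` would require moving the elements and would invalidate such pointers.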

There is a general purpose library by means of which parse trees may
be traversed.

Parsing
-------
This compiler attempts to recognize the entire cooked character stream
(see above) as a Fortran program. It records the reductions made during
a successful recognition as a parse tree value. The recognized grammar
is that of a whole source file, not just of its possible statements,
so the parser has no global state that tracks the subprogram hierarchy
or the structure of their nested block constructs. The parser performs
no semantic analysis along the way, deferring all of that work to the
next pass of the compiler.

The resulting parse tree therefore necessarily contains ambiguous parses
that cannot be resolved without recourse to a symbol table. Most notably,
leading assignments to array elements can be misrecognized as statement
function definitions, and array element references can be misrecognized
as function calls. The semantic analysis phase of the compiler performs
local rewrites of the parse tree once it can be disambiguated by symbols
and types.

Formally speaking, this parser is based on recursive descent with
localized backtracking (specifically, it will not backtrack into a
successful reduction to try its other alternatives). It is not generated
as a table or code from a specification of the Fortran grammar; rather, it
_is_ the grammar, as declaratively respecified in C++ constant expressions
using a small collection of basic token recognition objects and a library
of "parser combinator" template functions that compose them to form more
complicated recognizers and their correspondences to the construction
of parse tree values.
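
The consequence of localized backtracking can be shown with a toy recognizer written as plain functions. All names here (`ParseState`, `Match`, `AOrAB`, `ABOrA`, and the two statement functions) are hypothetical. Once an alternative succeeds, a later failure does not cause the other alternatives to be retried, so the order of alternatives matters.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for the real ParseState.
struct ParseState {
  std::string_view text;
  std::size_t pos{0};
};

// Matches a fixed string at the current position.
bool Match(ParseState &state, std::string_view token) {
  if (state.text.substr(state.pos, token.size()) == token) {
    state.pos += token.size();
    return true;
  }
  return false;
}

// Alternatives "a" || "ab": both are tried from the same starting state.
bool AOrAB(ParseState &state) {
  ParseState start{state};
  if (Match(state, "a")) {
    return true; // committed: "ab" will never be reconsidered
  }
  state = start;
  return Match(state, "ab");
}

// The same alternatives in the other order: "ab" || "a".
bool ABOrA(ParseState &state) {
  ParseState start{state};
  if (Match(state, "ab")) {
    return true;
  }
  state = start;
  return Match(state, "a");
}

// (a || ab) followed by c, and (ab || a) followed by c.
bool StatementShortFirst(ParseState &state) {
  return AOrAB(state) && Match(state, "c");
}
bool StatementLongFirst(ParseState &state) {
  return ABOrA(state) && Match(state, "c");
}
```

In this sketch `StatementShortFirst` rejects the text `abc`, because `a` succeeds and the following `c` then fails against `b`, while `StatementLongFirst` accepts both `abc` and `ac`.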

Unparsing
---------
Parse trees can be converted back into free form Fortran source code.
This formatter is not really a classical "pretty printer", but is
more of a data structure dump whose output is suitable for compilation
by another compiler. It is also used for testing the parser, since a
reparse of an unparsed parse tree should be an identity function apart from
source provenance.
223 changes: 223 additions & 0 deletions flang/documentation/Preprocessing.md
@@ -0,0 +1,223 @@
<!--===- documentation/Preprocessing.md
Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
See https://llvm.org/LICENSE.txt for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

Fortran Preprocessing
=====================

Behavior common to (nearly) all compilers:
------------------------------------------
* Macro and argument names are sensitive to case.
* Fixed form right margin clipping after column 72 (or 132)
has precedence over macro name recognition, and also over
recognition of function-like parentheses and arguments.
* Fixed form right margin clipping does not apply to directive lines.
* Macro names are not recognized as such when spaces are inserted
into their invocations in fixed form.
This includes spaces at the ends of lines that have been clipped
at column 72 (or 132).
* Text is rescanned after expansion of macros and arguments.
* Macros are not expanded within quoted character literals or
quoted FORMAT edit descriptors.
* Macro expansion occurs before any effective token pasting via fixed form
space removal.
* C-like line continuations with backslash-newline are allowed in
directives, including the definitions of macro bodies.
* `/* Old style C comments */` are ignored in directives and
removed from the bodies of macro definitions.
* `// New style C comments` are not removed, since Fortran has OPERATOR(//).
* C-like line continuations with backslash-newline can appear in
old-style C comments in directives.
* After `#define FALSE TRUE`, `.FALSE.` is replaced by `.TRUE.`;
i.e., tokenization does not hide the names of operators or logical constants.
* `#define KWM c` allows the use of `KWM` in column 1 as a fixed form comment
line indicator.
* A `#define` directive intermixed with continuation lines can't
define a macro that's invoked earlier in the same continued statement.
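
Several of the behaviors listed above (case-sensitive names, rescanning after expansion, no expansion inside quoted literals, and `.FALSE.` becoming `.TRUE.`) can be sketched for keyword-like macros only. `Expand` and `MacroTable` are hypothetical names; the sketch ignores function-like macros, Hollerith constants, source form, and directives, and it suppresses recursive re-expansion of a macro within its own replacement, as C preprocessors do.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <string_view>

using MacroTable = std::map<std::string, std::string>;

bool IsNameChar(char ch) {
  return std::isalnum(static_cast<unsigned char>(ch)) || ch == '_';
}

// Expands keyword-like macros in `text`. `active` holds the macros that
// are currently being expanded, to prevent infinite recursion.
std::string Expand(std::string_view text, const MacroTable &macros,
    std::set<std::string> active = {}) {
  std::string out;
  std::size_t i{0};
  char quote{'\0'};
  while (i < text.size()) {
    char ch{text[i]};
    if (quote != '\0') { // inside a quoted literal: copy verbatim
      out += ch;
      if (ch == quote) {
        quote = '\0';
      }
      ++i;
    } else if (ch == '\'' || ch == '"') {
      quote = ch;
      out += ch;
      ++i;
    } else if (IsNameChar(ch) && !std::isdigit(static_cast<unsigned char>(ch))) {
      std::size_t start{i};
      while (i < text.size() && IsNameChar(text[i])) {
        ++i;
      }
      std::string name(text.substr(start, i - start));
      auto it = macros.find(name); // the lookup is case-sensitive
      if (it != macros.end() && active.count(name) == 0) {
        std::set<std::string> nested{active};
        nested.insert(name);
        out += Expand(it->second, macros, nested); // rescan the replacement
      } else {
        out += name;
      }
    } else {
      out += ch;
      ++i;
    }
  }
  return out;
}
```

Note that the periods around `FALSE` are ordinary punctuation to this scanner, which is why the name between them is still recognized as a macro.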

Behavior that is not consistent over all extant compilers but which
probably should be uncontroversial:
-----------------------------------
* Invoked macro names can straddle a Fortran line continuation.
* ... unless implicit fixed form card padding intervenes; i.e.,
in fixed form, a continued macro name has to be split at column
72 (or 132).
* Comment lines may appear with continuations in a split macro name.
* Function-like macro invocations can straddle a Fortran fixed form line
continuation between the name and the left parenthesis, and comment and
directive lines can be there too.
* Function-like macro invocations can straddle a Fortran fixed form line
continuation between the parentheses, and comment lines can be there too.
* Macros are not expanded within Hollerith constants or Hollerith
FORMAT edit descriptors.
* Token pasting with `##` works in function-like macros.
* Argument stringization with `#` works in function-like macros.
* Directives can be capitalized (e.g., `#DEFINE`) in fixed form.
* Fixed form clipping after column 72 or 132 is done before macro expansion,
not after.
* C-like line continuation with backslash-newline can appear in the name of
a keyword-like macro definition.
* If `#` is in column 6 in fixed form, it's a continuation marker, not a
directive indicator.
* `#define KWM !` allows KWM to signal a comment.

Judgement calls, where precedents are unclear:
----------------------------------------------
* Expressions in `#if` and `#elif` should support both Fortran and C
operators; e.g., `#if 2 .LT. 3` should work.
* If a function-like macro does not close its parentheses, line
continuation should be assumed.
* ... However, the leading parenthesis has to be on the same line as
the name of the function-like macro, or on a continuation line thereof.
* If macros expand to text containing `&`, it doesn't work as a free form
line continuation marker.
* `#define c 1` does not allow a `c` in column 1 to be used as a label
in fixed form, rather than as a comment line indicator.
* IBM claims to be ISO C compliant and therefore recognizes trigraph sequences.
* Fortran comments in macro actual arguments should be respected, on
the principle that a macro call should work like a function reference.
* If a `#define` or `#undef` directive appears among continuation
lines, it may or may not affect text in the continued statement that
appeared before the directive.

Behavior that few compilers properly support (or none), but should:
-------------------------------------------------------------------
* A macro invocation can straddle free form continuation lines in all of their
forms, with continuation allowed in the name, before the arguments, and
within the arguments.
* Directives can be capitalized in free form, too.
* `__VA_ARGS__` and `__VA_OPT__` work in variadic function-like macros.

In short, a Fortran preprocessor should work as if:
---------------------------------------------------
1. Fixed form lines are padded up to column 72 (or 132) and clipped thereafter.
2. Fortran comments are removed.
3. C-style line continuations are processed in preprocessing directives.
4. C old-style comments are removed from directives.
5. Fortran line continuations are processed (outside preprocessing directives).
Line continuation rules depend on source form.
Comment lines that are enabled compiler directives have their line
continuations processed.
Conditional compilation preprocessing directives (e.g., `#if`) may
appear among continuation lines, and have their usual effects upon them.
6. Other preprocessing directives are processed and macros expanded.
Along the way, Fortran `INCLUDE` lines and preprocessor `#include` directives
are expanded, and all these steps applied recursively to the introduced text.
7. Any Fortran comments created by macro replacement are removed.

Steps 5 and 6 are interleaved with respect to the preprocessing state.
Conditional compilation preprocessing directives always reflect only the macro
definition state produced by the active `#define` and `#undef` preprocessing directives
that precede them.
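
Step 1 of the list above is simple enough to sketch directly. `PadAndClip` is a hypothetical helper that treats one fixed form line as a card image of a given width; it does not model the exemption of directive lines from clipping.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Pads a fixed form line with blanks up to `columns` and drops
// anything that lies beyond that column.
std::string PadAndClip(std::string_view line, std::size_t columns = 72) {
  std::string card(line.substr(0, columns)); // clip
  card.resize(columns, ' ');                 // pad
  return card;
}
```

The padding matters when a character literal or a macro name is continued onto the next line: the blanks up to the right margin become part of the text, which is why a continued macro name has to be split exactly at the margin.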

If the source form is changed by means of a compiler directive (i.e.,
`!DIR$ FIXED` or `FREE`) in an included source file, its effects cease
at the end of that file.

Last, if the preprocessor is not integrated into the Fortran compiler,
new Fortran continuation line markers should be introduced into the final
text.

OpenMP-style directives that look like comments are not addressed by
this scheme but are obvious extensions.

Appendix
========
`N` in the table below means "not supported"; this doesn't
mean a bug, it just means that a particular behavior was
not observed.
`E` signifies "error reported".

The abbreviation `KWM` stands for "keyword macro" and `FLM` means
"function-like macro".

The first block of tests (`pp0*.F`) are all fixed-form source files;
the second block (`pp1*.F90`) are free-form source files.

```
f18
| pgfortran
| | ifort
| | | gfortran
| | | | xlf
| | | | | nagfor
| | | | | |
. . . . . . pp001.F keyword macros
. . . . . . pp002.F #undef
. . . . . . pp003.F function-like macros
. . . . . . pp004.F KWMs case-sensitive
. N . N N . pp005.F KWM split across continuation, implicit padding
. N . N N . pp006.F ditto, but with intervening *comment line
N N N N N N pp007.F KWM split across continuation, clipped after column 72
. . . . . . pp008.F KWM with spaces in name at invocation NOT replaced
. N . N N . pp009.F FLM call split across continuation, implicit padding
. N . N N . pp010.F ditto, but with intervening *comment line
N N N N N N pp011.F FLM call name split across continuation, clipped
. N . N N . pp012.F FLM call name split across continuation
. E . N N . pp013.F FLM call split between name and (
. N . N N . pp014.F FLM call split between name and (, with intervening *comment
. E . N N . pp015.F FLM call split between name and (, clipped
. E . N N . pp016.F FLM call split between name and ( and in argument
. . . . . . pp017.F KWM rescan
. . . . . . pp018.F KWM rescan with #undef (so rescan is after expansion)
. . . . . . pp019.F FLM rescan
. . . . . . pp020.F FLM expansion of argument
. . . . . . pp021.F KWM NOT expanded in 'literal'
. . . . . . pp022.F KWM NOT expanded in "literal"
. . E E . E pp023.F KWM NOT expanded in 9HHOLLERITH literal
. . . E . . pp024.F KWM NOT expanded in Hollerith in FORMAT
. . . . . . pp025.F KWM expansion is before token pasting due to fixed-form space removal
. . . E . E pp026.F ## token pasting works in FLM
E . . E E . pp027.F #DEFINE works in fixed form
. N . N N . pp028.F fixed-form clipping done before KWM expansion on source line
. . . . . . pp029.F \ newline allowed in #define
. . . . . . pp030.F /* C comment */ erased from #define
E E E E E E pp031.F // C++ comment NOT erased from #define
. . . . . . pp032.F /* C comment */ \ newline erased from #define
. . . . . . pp033.F /* C comment \ newline */ erased from #define
. . . . . N pp034.F \ newline allowed in name on KWM definition
. E . E E . pp035.F #if 2 .LT. 3 works
. . . . . . pp036.F #define FALSE TRUE ... .FALSE. -> .TRUE.
N N N N N N pp037.F fixed-form clipping NOT applied to #define
. . E . E E pp038.F FLM call with closing ')' on next line (not a continuation)
E . E . E E pp039.F FLM call with '(' on next line (not a continuation)
. . . . . . pp040.F #define KWM c, then KWM works as comment line initiator
E . E . . E pp041.F use KWM expansion as continuation indicators
N N N . . N pp042.F #define c 1, then use c as label in fixed-form
. . . . N . pp043.F #define with # in column 6 is a continuation line in fixed-form
E . . . . . pp044.F #define directive amid continuations
. . . . . . pp101.F90 keyword macros
. . . . . . pp102.F90 #undef
. . . . . . pp103.F90 function-like macros
. . . . . . pp104.F90 KWMs case-sensitive
. N N N N N pp105.F90 KWM call name split across continuation, with leading &
. N N N N N pp106.F90 ditto, with & ! comment
N N E E N . pp107.F90 KWM call name split across continuation, no leading &, with & ! comment
N N E E N . pp108.F90 ditto, but without & ! comment
. N N N N N pp109.F90 FLM call name split with leading &
. N N N N N pp110.F90 ditto, with & ! comment
N N E E N . pp111.F90 FLM call name split across continuation, no leading &, with & ! comment
N N E E N . pp112.F90 ditto, but without & ! comment
. N N N N E pp113.F90 FLM call split across continuation between name and (, leading &
. N N N N E pp114.F90 ditto, with & ! comment, leading &
N N N N N . pp115.F90 ditto, with & ! comment, no leading &
N N N N N . pp116.F90 FLM call split between name and (, no leading &
. . . . . . pp117.F90 KWM rescan
. . . . . . pp118.F90 KWM rescan with #undef, proving rescan after expansion
. . . . . . pp119.F90 FLM rescan
. . . . . . pp120.F90 FLM expansion of argument
. . . . . . pp121.F90 KWM NOT expanded in 'literal'
. . . . . . pp122.F90 KWM NOT expanded in "literal"
. . E E . E pp123.F90 KWM NOT expanded in Hollerith literal
. . E E . E pp124.F90 KWM NOT expanded in Hollerith in FORMAT
E . . E E . pp125.F90 #DEFINE works in free form
. . . . . . pp126.F90 \ newline works in #define
N . E . E E pp127.F90 FLM call with closing ')' on next line (not a continuation)
E . E . E E pp128.F90 FLM call with '(' on next line (not a continuation)
. . N . . N pp129.F90 #define KWM !, then KWM works as comment line initiator
E . E . . E pp130.F90 #define KWM &, use for continuation w/o pasting (ifort and nag seem to continue #define)
```
47 changes: 47 additions & 0 deletions flang/documentation/PullRequestChecklist.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
<!--===- documentation/PullRequestChecklist.md
Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
See https://llvm.org/LICENSE.txt for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

# Pull request checklist
Please review the following items before submitting a pull request. This list
can also be used when reviewing pull requests.
* Verify that new files have a license header with the correct file name.
* Run `git diff` on all modified files to look for spurious changes such as
`#include <iostream>`.
* If you added code that causes the compiler to emit a new error message, make
sure that you also added a test that causes that error message to appear
and verifies its correctness.
* Annotate the code and tests with appropriate references to constraint and
requirement numbers from the Fortran standard. Do not include the text of
the constraint or requirement, just its number.
* Alphabetize arbitrary lists of names.
* Check dereferences of pointers and optionals where necessary.
* Ensure that the scopes of all functions and variables are as local as
possible.
* Try to make all functions fit on a screen (40 lines).
* Build and test with both GNU and clang compilers.
* When submitting an update to a pull request, review previous pull request
comments and make sure that you've actually made all of the changes that
were requested.

## Follow the style guide
The following items are taken from the [C++ style guide](C++style.md). But
even though I've read the style guide, they regularly trip me up.
* Run clang-format using the git-clang-format script from LLVM HEAD.
* Make sure that all source lines have 80 or fewer characters. Note that
clang-format will do this for most code. But you may need to break up long
strings.
* Review declarations for proper use of `constexpr` and `const`.
* Follow the C++ [naming guidelines](C++style.md#naming).
* Ensure that the names evoke their purpose and are consistent with existing code.
* Use braced initializers.
* Review pointer and reference types to make sure that you're using them
appropriately. Note that the [C++ style guide](C++style.md) contains a
section that describes all of the pointer types along with their
characteristics.
* Declare non-member functions `static` when possible. Prefer
`static` functions over functions in anonymous namespaces.
436 changes: 436 additions & 0 deletions flang/documentation/RuntimeDescriptor.md

Large diffs are not rendered by default.

156 changes: 156 additions & 0 deletions flang/documentation/Semantics.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,156 @@
<!--===- documentation/Semantics.md
Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
See https://llvm.org/LICENSE.txt for license information.
SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
-->

# Semantic Analysis

The semantic analysis pass determines whether a syntactically correct Fortran
program is legal by enforcing the constraints of the language.

The input is a parse tree with a `Program` node at the root;
and a "cooked" character stream, a contiguous stream of characters
containing a normalized form of the Fortran source.

The semantic analysis pass takes a parse tree for a syntactically
correct Fortran program and determines whether it is legal by enforcing
the constraints of the language.

If the program is not legal, the results of the semantic pass will be a list of
errors associated with the program.

If the program is legal, the semantic pass will produce a (possibly modified)
parse tree for the semantically correct program with each name mapped to a symbol
and each expression fully analyzed.

All user errors are detected either prior to or during semantic analysis.
After it completes successfully the program should compile with no error messages.
There may still be warnings or informational messages.

## Phases of Semantic Analysis

1. [Validate labels](#validate-labels) -
Check all constraints on labels and branches
2. [Rewrite DO loops](#rewrite-do-loops) -
Convert all occurrences of `LabelDoStmt` to `DoConstruct`.
3. [Name resolution](#name-resolution) -
Analyze names and declarations, build a tree of Scopes containing Symbols,
and fill in the `Name::symbol` data member in the parse tree
4. [Rewrite parse tree](#rewrite-parse-tree) -
Fix incorrect parses based on symbol information
5. [Expression analysis](#expression-analysis) -
Analyze all expressions in the parse tree and fill in `Expr::typedExpr` and
`Variable::typedExpr` with analyzed expressions; fix incorrect parses
based on the result of this analysis
6. [Statement semantics](#statement-semantics) -
Perform remaining semantic checks on the execution parts of subprograms
7. [Write module files](#write-module-files) -
If no errors have occurred, write out `.mod` files for modules and submodules

If phase 1 or phase 2 encounters an error in any of the program units,
compilation terminates. Otherwise, phases 3-6 are all performed even if
errors occur.
Module files are written (phase 7) only if there are no errors.
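
The control flow described above can be sketched as follows. This is only an illustration of the sequencing rules, not the actual driver: `Phase`, `Result`, and `RunSemantics` are hypothetical names, and each phase is reduced to a callback that returns `true` on success.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for one semantic phase; run() returns true on success.
struct Phase {
  std::string name;
  std::function<bool()> run;
};

struct Result {
  std::vector<std::string> ran; // names of the phases that ran, in order
  bool ok{true}; // false once any phase has reported an error
};

// Expects exactly seven phases, ordered as in the numbered list above.
inline Result RunSemantics(const std::vector<Phase> &phases) {
  assert(phases.size() == 7);
  Result result;
  for (std::size_t i{0}; i < phases.size(); ++i) {
    bool isFatalPhase{i < 2}; // phases 1-2: an error terminates compilation
    bool isModuleWriter{i == 6}; // phase 7: runs only if there are no errors
    if (isModuleWriter && !result.ok) {
      break;
    }
    result.ran.push_back(phases[i].name);
    if (!phases[i].run()) {
      result.ok = false;
      if (isFatalPhase) {
        break;
      }
    }
  }
  return result;
}
```

With this sketch, an error in phase 4 still lets phases 5 and 6 run, but the module-writing phase is skipped.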

### Validate labels

Perform semantic checks related to labels and branches:
- check that any labels that are referenced are defined and in scope
- check branches into loop bodies
- check that labeled `DO` loops are properly nested
- check labels in data transfer statements

### Rewrite DO loops

This phase normalizes the parse tree by removing all unstructured `DO` loops
and replacing them with `DO` constructs.

### Name resolution

The name resolution phase walks the parse tree and constructs the symbol table.

The symbol table consists of a tree of `Scope` objects rooted at the global scope.
The global scope is owned by the `SemanticsContext` object.
It contains a `Scope` for each program unit in the compilation.

Each `Scope` in the scope tree contains child scopes representing other scopes
lexically nested in it.
Each `Scope` also contains a map of `CharBlock` to `Symbol` representing names
declared in that scope. (All names in the symbol table are represented as
`CharBlock` objects, i.e. as substrings of the cooked character stream.)

All `Symbol` objects are owned by the symbol table data structures.
They should be accessed as `Symbol *` or `Symbol &` outside of the symbol
table classes as they can't be created, copied, or moved.
The `Symbol` class has functions and data common across all symbols, and a
`details` field that contains more information specific to that type of symbol.
Many symbols also have types, represented by `DeclTypeSpec`.
Types are also owned by scopes.
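
As a rough illustration of the ownership model described above, the following sketch uses `std::string_view` as a stand-in for `CharBlock` and heavily simplified `Symbol` and `Scope` classes. The member names (`MakeChild`, `Declare`, `Find`) are assumptions made for this example, and the lookup that walks outward through enclosing scopes is a simplification rather than a description of the real name resolution rules.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string_view>

// Simplified symbol: owned by its scope, so it cannot be copied or moved.
struct Symbol {
  explicit Symbol(std::string_view n) : name{n} {}
  Symbol(const Symbol &) = delete;
  Symbol &operator=(const Symbol &) = delete;
  std::string_view name; // a substring of the (hypothetical) cooked stream
};

class Scope {
public:
  explicit Scope(Scope *parent = nullptr) : parent_{parent} {}
  Scope(const Scope &) = delete;
  Scope &operator=(const Scope &) = delete;

  // Creates a lexically nested scope owned by this one.
  Scope &MakeChild() { return children_.emplace_back(this); }

  // Declares a name here, or returns the symbol that already exists.
  Symbol &Declare(std::string_view name) {
    assert(!name.empty());
    return symbols_.try_emplace(name, name).first->second;
  }

  // Looks in this scope, then in enclosing scopes (illustrative only).
  Symbol *Find(std::string_view name) {
    for (Scope *scope{this}; scope; scope = scope->parent_) {
      auto it{scope->symbols_.find(name)};
      if (it != scope->symbols_.end()) {
        return &it->second;
      }
    }
    return nullptr;
  }

private:
  Scope *parent_;
  std::list<Scope> children_; // std::list keeps references to scopes stable
  std::map<std::string_view, Symbol> symbols_;
};
```

Callers only ever hold a `Symbol &` or `Symbol *`; the deleted copy operations make an accidental copy a compile-time error.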

Name resolution happens on the parse tree in this order:
1. Process the specification of a program unit:
1. Create a new scope for the unit
2. Create a symbol for each contained subprogram containing just the name
3. Process the opening statement of the unit (`ModuleStmt`, `FunctionStmt`, etc.)
4. Process the specification part of the unit
2. Apply the same process recursively to nested subprograms
3. Process the execution part of the program unit
4. Process the execution parts of nested subprograms recursively

After the completion of this phase, every `Name` corresponds to a `Symbol`
unless an error occurred.

### Rewrite parse tree

The parser cannot build a completely correct parse tree without symbol information.
This phase corrects mis-parses based on symbols:
- Array element assignments may be parsed as statement functions: `a(i) = ...`
- Namelist group names without `NML=` may be parsed as format expressions
- A file unit number expression may be parsed as a character variable

This phase also produces an internal error if it finds a `Name` that does not
have its `symbol` data member filled in. This error is suppressed if other
errors have occurred because in that case a `Name` corresponding to an erroneous
symbol may not be resolved.

### Expression analysis

Expressions that occur in the specification part are analyzed during name
resolution, for example, initial values, array bounds, type parameters.
Any remaining expressions are analyzed in this phase.

For each `Variable` and top-level `Expr` (i.e. one that is not nested below
another `Expr` in the parse tree) the analyzed form of the expression is saved
in the `typedExpr` data member. After this phase has completed, the analyzed
expression can be accessed using `semantics::GetExpr()`.

This phase also corrects mis-parses based on the result of expression analysis:
- An expression like `a(b)` is parsed as a function reference but may need
to be rewritten to an array element reference (if `a` is an object entity)
or to a structure constructor (if `a` is a derived type)
- An expression like `a(b:c)` is parsed as an array section but may need to be
rewritten as a substring if `a` is an object with type CHARACTER
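
The first correction above amounts to a decision based on what kind of entity `a` names. A minimal sketch of that decision, using hypothetical enumerations (`EntityKind`, `Rewrite`) rather than the real parse tree and symbol classes:

```cpp
// Hypothetical classification of the entity named `a` in `a(b)`.
enum class EntityKind { Function, Object, DerivedType };

// What the parsed function reference `a(b)` should become.
enum class Rewrite { FunctionReference, ArrayElement, StructureConstructor };

constexpr Rewrite ClassifyCall(EntityKind kind) {
  switch (kind) {
  case EntityKind::Object:
    return Rewrite::ArrayElement; // `a` is an object entity
  case EntityKind::DerivedType:
    return Rewrite::StructureConstructor; // `a` is a derived type
  case EntityKind::Function:
    break;
  }
  return Rewrite::FunctionReference; // the parse was already right
}
```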

### Statement semantics

Multiple independent checkers driven by the `SemanticsVisitor` framework
perform the remaining semantic checks.
By this phase, all names and expressions that can be successfully resolved
have been. But there may be names without symbols or expressions without
analyzed form if errors occurred earlier.

### Write module files

Separate compilation information is written out on successful compilation
of modules and submodules. These are used as input to name resolution
in program units that `USE` the modules.

Module files are stripped-down Fortran source for the module.
Parts that aren't needed to compile dependent program units (e.g. action statements)
are omitted.

The module file for module `m` is named `m.mod` and the module file for
submodule `s` of module `m` is named `m-s.mod`.
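
The naming rule can be expressed as a small helper. `ModFileName` is a hypothetical function written for this example; it only restates the convention above and does not reflect how the compiler actually builds the path.

```cpp
#include <cassert>
#include <string>

// Returns "m.mod" for module m, or "m-s.mod" for submodule s of module m.
// An empty submodule name means "this is a module, not a submodule".
inline std::string ModFileName(
    const std::string &module, const std::string &submodule = "") {
  assert(!module.empty());
  if (submodule.empty()) {
    return module + ".mod";
  }
  return module + "-" + submodule + ".mod";
}
```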
801 changes: 801 additions & 0 deletions flang/documentation/f2018-grammar.txt

Large diffs are not rendered by default.

38 changes: 38 additions & 0 deletions flang/documentation/flang-c-style.el
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
;;===-- documentation/flang-c-style.el ------------------------------------===;;
;;
;; Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
;; See https://llvm.org/LICENSE.txt for license information.
;; SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
;;
;;===----------------------------------------------------------------------===;;

;; Define a cc-mode style for editing C++ code in Flang.
;;
;; Inspired by the LLVM style in
;; https://github.com/llvm-mirror/llvm/blob/master/utils/emacs/emacs.el
;;

(c-add-style "flang"
'("gnu"
(fill-column . 80)
(c++-indent-level . 2)
(c-basic-offset . 2)
(indent-tabs-mode . nil)
(c-offsets-alist .
((arglist-intro . ++)
(innamespace . 0)
(member-init-intro . ++)
))
))


;;
;; Use the following to make it the default.
;;

(defun flang-c-mode-hook ()
(c-set-style "flang")
)

(add-hook 'c-mode-hook 'flang-c-mode-hook)
(add-hook 'c++-mode-hook 'flang-c-mode-hook)
1 change: 1 addition & 0 deletions flang/include/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
add_subdirectory(flang)
3 changes: 3 additions & 0 deletions flang/include/flang/CMakeLists.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
if(LINK_WITH_FIR)
add_subdirectory(Optimizer)
endif()
65 changes: 65 additions & 0 deletions flang/include/flang/Common/Fortran-features.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
//===-- include/flang/Common/Fortran-features.h -----------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_FORTRAN_FEATURES_H_
#define FORTRAN_COMMON_FORTRAN_FEATURES_H_

#include "flang/Common/Fortran.h"
#include "flang/Common/enum-set.h"
#include "flang/Common/idioms.h"

namespace Fortran::common {

ENUM_CLASS(LanguageFeature, BackslashEscapes, OldDebugLines,
FixedFormContinuationWithColumn1Ampersand, LogicalAbbreviations,
XOROperator, PunctuationInNames, OptionalFreeFormSpace, BOZExtensions,
EmptyStatement, AlternativeNE, ExecutionPartNamelist, DECStructures,
DoubleComplex, Byte, StarKind, QuadPrecision, SlashInitialization,
TripletInArrayConstructor, MissingColons, SignedComplexLiteral,
OldStyleParameter, ComplexConstructor, PercentLOC, SignedPrimary, FileName,
Convert, Dispose, IOListLeadingComma, AbbreviatedEditDescriptor,
ProgramParentheses, PercentRefAndVal, OmitFunctionDummies, CrayPointer,
Hollerith, ArithmeticIF, Assign, AssignedGOTO, Pause, OpenMP,
CruftAfterAmpersand, ClassicCComments, AdditionalFormats, BigIntLiterals,
RealDoControls, EquivalenceNumericWithCharacter, AdditionalIntrinsics,
AnonymousParents, OldLabelDoEndStatements, LogicalIntegerAssignment,
EmptySourceFile, ProgramReturn)

using LanguageFeatures = EnumSet<LanguageFeature, LanguageFeature_enumSize>;

class LanguageFeatureControl {
public:
LanguageFeatureControl() {
// These features must be explicitly enabled by command line options.
disable_.set(LanguageFeature::OldDebugLines);
disable_.set(LanguageFeature::OpenMP);
// These features, if enabled, conflict with valid standard usage,
// so they are disabled here by default.
disable_.set(LanguageFeature::BackslashEscapes);
disable_.set(LanguageFeature::LogicalAbbreviations);
disable_.set(LanguageFeature::XOROperator);
}
LanguageFeatureControl(const LanguageFeatureControl &) = default;
void Enable(LanguageFeature f, bool yes = true) { disable_.set(f, !yes); }
void EnableWarning(LanguageFeature f, bool yes = true) { warn_.set(f, yes); }
void WarnOnAllNonstandard(bool yes = true) { warnAll_ = yes; }
bool IsEnabled(LanguageFeature f) const { return !disable_.test(f); }
bool ShouldWarn(LanguageFeature f) const {
return (warnAll_ && f != LanguageFeature::OpenMP) || warn_.test(f);
}
// Return all spellings of operator names, depending on the features enabled
std::vector<const char *> GetNames(LogicalOperator) const;
std::vector<const char *> GetNames(RelationalOperator) const;

private:
LanguageFeatures disable_;
LanguageFeatures warn_;
bool warnAll_{false};
};
} // namespace Fortran::common
#endif // FORTRAN_COMMON_FORTRAN_FEATURES_H_
72 changes: 72 additions & 0 deletions flang/include/flang/Common/Fortran.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
//===-- include/flang/Common/Fortran.h --------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_FORTRAN_H_
#define FORTRAN_COMMON_FORTRAN_H_

// Fortran language concepts that are used in many phases are defined
// once here to avoid redundancy and needless translation.

#include "idioms.h"
#include <cinttypes>
#include <vector>

namespace Fortran::common {

// Fortran has five kinds of intrinsic data types, plus the derived types.
ENUM_CLASS(TypeCategory, Integer, Real, Complex, Character, Logical, Derived)

constexpr bool IsNumericTypeCategory(TypeCategory category) {
return category == TypeCategory::Integer || category == TypeCategory::Real ||
category == TypeCategory::Complex;
}

// Kinds of IMPORT statements. Default means IMPORT or IMPORT :: names.
ENUM_CLASS(ImportKind, Default, Only, None, All)

// The attribute on a type parameter can be KIND or LEN.
ENUM_CLASS(TypeParamAttr, Kind, Len)

ENUM_CLASS(NumericOperator, Power, Multiply, Divide, Add, Subtract)
const char *AsFortran(NumericOperator);

ENUM_CLASS(LogicalOperator, And, Or, Eqv, Neqv, Not)
const char *AsFortran(LogicalOperator);

ENUM_CLASS(RelationalOperator, LT, LE, EQ, NE, GE, GT)
const char *AsFortran(RelationalOperator);

ENUM_CLASS(Intent, Default, In, Out, InOut)

ENUM_CLASS(IoStmtKind, None, Backspace, Close, Endfile, Flush, Inquire, Open,
Print, Read, Rewind, Wait, Write)

// Union of specifiers for all I/O statements.
ENUM_CLASS(IoSpecKind, Access, Action, Advance, Asynchronous, Blank, Decimal,
Delim, Direct, Encoding, End, Eor, Err, Exist, File, Fmt, Form, Formatted,
Id, Iomsg, Iostat, Name, Named, Newunit, Nextrec, Nml, Number, Opened, Pad,
Pending, Pos, Position, Read, Readwrite, Rec, Recl, Round, Sequential, Sign,
Size, Status, Stream, Unformatted, Unit, Write,
Convert, // nonstandard
Dispose, // nonstandard
)

// Floating-point rounding modes; these are packed into a byte to save
// room in the runtime's format processing context structure.
enum class RoundingMode : std::uint8_t {
TiesToEven, // ROUND=NEAREST, RN - default IEEE rounding
ToZero, // ROUND=ZERO, RZ - truncation
Down, // ROUND=DOWN, RD
Up, // ROUND=UP, RU
TiesAwayFromZero, // ROUND=COMPATIBLE, RC - ties round away from zero
};

// Fortran arrays may have up to 15 dimensions (See Fortran 2018 section 5.4.6).
static constexpr int maxRank{15};
} // namespace Fortran::common
#endif // FORTRAN_COMMON_FORTRAN_H_
87 changes: 87 additions & 0 deletions flang/include/flang/Common/bit-population-count.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,87 @@
//===-- include/flang/Common/bit-population-count.h -------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_BIT_POPULATION_COUNT_H_
#define FORTRAN_COMMON_BIT_POPULATION_COUNT_H_

// Fast and portable functions that implement Fortran's POPCNT and POPPAR
// intrinsic functions. POPCNT returns the number of bits that are set (1)
// in its argument. POPPAR is a parity function that returns true
// when POPCNT is odd.

#include <cinttypes>

namespace Fortran::common {

inline constexpr int BitPopulationCount(std::uint64_t x) {
// In each of the 32 2-bit fields, count the bits that were present.
// This leaves a value [0..2] in each of these 2-bit fields.
x = (x & 0x5555555555555555) + ((x >> 1) & 0x5555555555555555);
// Combine into 16 4-bit fields, each holding [0..4]
x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333);
// Now 8 8-bit fields, each with [0..8] in their lower 4 bits.
x = (x & 0x0f0f0f0f0f0f0f0f) + ((x >> 4) & 0x0f0f0f0f0f0f0f0f);
// Now 4 16-bit fields, each with [0..16] in their lower 5 bits.
x = (x & 0x001f001f001f001f) + ((x >> 8) & 0x001f001f001f001f);
// Now 2 32-bit fields, each with [0..32] in their lower 6 bits.
x = (x & 0x0000003f0000003f) + ((x >> 16) & 0x0000003f0000003f);
// Last step: 1 64-bit field, with [0..64]
return (x & 0x7f) + (x >> 32);
}

inline constexpr int BitPopulationCount(std::uint32_t x) {
// In each of the 16 2-bit fields, count the bits that were present.
// This leaves a value [0..2] in each of these 2-bit fields.
x = (x & 0x55555555) + ((x >> 1) & 0x55555555);
// Combine into 8 4-bit fields, each holding [0..4]
x = (x & 0x33333333) + ((x >> 2) & 0x33333333);
// Now 4 8-bit fields, each with [0..8] in their lower 4 bits.
x = (x & 0x0f0f0f0f) + ((x >> 4) & 0x0f0f0f0f);
// Now 2 16-bit fields, each with [0..16] in their lower 5 bits.
x = (x & 0x001f001f) + ((x >> 8) & 0x001f001f);
// Last step: 1 32-bit field, with [0..32]
return (x & 0x3f) + (x >> 16);
}

inline constexpr int BitPopulationCount(std::uint16_t x) {
// In each of the 8 2-bit fields, count the bits that were present.
// This leaves a value [0..2] in each of these 2-bit fields.
x = (x & 0x5555) + ((x >> 1) & 0x5555);
// Combine into 4 4-bit fields, each holding [0..4]
x = (x & 0x3333) + ((x >> 2) & 0x3333);
// Now 2 8-bit fields, each with [0..8] in their lower 4 bits.
x = (x & 0x0f0f) + ((x >> 4) & 0x0f0f);
// Last step: 1 16-bit field, with [0..16]
return (x & 0x1f) + (x >> 8);
}

inline constexpr int BitPopulationCount(std::uint8_t x) {
// In each of the 4 2-bit fields, count the bits that were present.
// This leaves a value [0..2] in each of these 2-bit fields.
x = (x & 0x55) + ((x >> 1) & 0x55);
// Combine into 2 4-bit fields, each holding [0..4]
x = (x & 0x33) + ((x >> 2) & 0x33);
// Last step: 1 8-bit field, with [0..8]
return (x & 0xf) + (x >> 4);
}

template <typename UINT> inline constexpr bool Parity(UINT x) {
return BitPopulationCount(x) & 1;
}

// "Parity is for farmers." -- Seymour R. Cray

template <typename UINT> inline constexpr int TrailingZeroBitCount(UINT x) {
if ((x & 1) != 0) {
return 0; // fast path for odd values
} else {
return BitPopulationCount(static_cast<UINT>(x ^ (x - 1))) - !!x;
}
}
} // namespace Fortran::common
#endif // FORTRAN_COMMON_BIT_POPULATION_COUNT_H_
147 changes: 147 additions & 0 deletions flang/include/flang/Common/constexpr-bitset.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,147 @@
//===-- include/flang/Common/constexpr-bitset.h -----------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_CONSTEXPR_BITSET_H_
#define FORTRAN_COMMON_CONSTEXPR_BITSET_H_

// Implements a replacement for std::bitset<> that is suitable for use
// in constexpr expressions. Limited to elements in [0..63].

#include "bit-population-count.h"
#include <cstddef>
#include <cstdint>
#include <initializer_list>
#include <optional>
#include <type_traits>

namespace Fortran::common {

template <int BITS> class BitSet {
static_assert(BITS > 0 && BITS <= 64);
static constexpr bool partialWord{BITS != 32 && BITS != 64};
using Word = std::conditional_t<(BITS > 32), std::uint64_t, std::uint32_t>;
static constexpr Word allBits{
partialWord ? (static_cast<Word>(1) << BITS) - 1 : ~static_cast<Word>(0)};

constexpr BitSet(Word b) : bits_{b} {}

public:
constexpr BitSet() {}
constexpr BitSet(const std::initializer_list<int> &xs) {
for (auto x : xs) {
set(x);
}
}
constexpr BitSet(const BitSet &) = default;
constexpr BitSet(BitSet &&) = default;
constexpr BitSet &operator=(const BitSet &) = default;
constexpr BitSet &operator=(BitSet &&) = default;

constexpr BitSet &operator&=(const BitSet &that) {
bits_ &= that.bits_;
return *this;
}
constexpr BitSet &operator&=(BitSet &&that) {
bits_ &= that.bits_;
return *this;
}
constexpr BitSet &operator^=(const BitSet &that) {
bits_ ^= that.bits_;
return *this;
}
constexpr BitSet &operator^=(BitSet &&that) {
bits_ ^= that.bits_;
return *this;
}
constexpr BitSet &operator|=(const BitSet &that) {
bits_ |= that.bits_;
return *this;
}
constexpr BitSet &operator|=(BitSet &&that) {
bits_ |= that.bits_;
return *this;
}

constexpr BitSet operator~() const { return ~bits_; }
constexpr BitSet operator&(const BitSet &that) const {
return bits_ & that.bits_;
}
constexpr BitSet operator&(BitSet &&that) const { return bits_ & that.bits_; }
constexpr BitSet operator^(const BitSet &that) const {
return bits_ ^ that.bits_;
}
constexpr BitSet operator^(BitSet &&that) const { return bits_ ^ that.bits_; }
constexpr BitSet operator|(const BitSet &that) const {
return bits_ | that.bits_;
}
constexpr BitSet operator|(BitSet &&that) const { return bits_ | that.bits_; }

constexpr bool operator==(const BitSet &that) const {
return bits_ == that.bits_;
}
constexpr bool operator==(BitSet &&that) const { return bits_ == that.bits_; }
constexpr bool operator!=(const BitSet &that) const {
return bits_ != that.bits_;
}
constexpr bool operator!=(BitSet &&that) const { return bits_ != that.bits_; }

static constexpr std::size_t size() { return BITS; }
constexpr bool test(std::size_t x) const {
return x < BITS && ((bits_ >> x) & 1) != 0;
}

constexpr bool all() const { return bits_ == allBits; }
constexpr bool any() const { return bits_ != 0; }
constexpr bool none() const { return bits_ == 0; }

constexpr std::size_t count() const { return BitPopulationCount(bits_); }

constexpr BitSet &set() {
bits_ = allBits;
return *this;
}
constexpr BitSet &set(std::size_t x, bool value = true) {
if (!value) {
return reset(x);
} else {
bits_ |= static_cast<Word>(1) << x;
return *this;
}
}
constexpr BitSet &reset() {
bits_ = 0;
return *this;
}
constexpr BitSet &reset(std::size_t x) {
bits_ &= ~(static_cast<Word>(1) << x);
return *this;
}
constexpr BitSet &flip() {
bits_ ^= allBits;
return *this;
}
constexpr BitSet &flip(std::size_t x) {
bits_ ^= static_cast<Word>(1) << x;
return *this;
}

constexpr std::optional<std::size_t> LeastElement() const {
if (bits_ == 0) {
return std::nullopt;
} else {
return {TrailingZeroBitCount(bits_)};
}
}

Word bits() const { return bits_; }

private:
Word bits_{0};
};
} // namespace Fortran::common
#endif // FORTRAN_COMMON_CONSTEXPR_BITSET_H_
61 changes: 61 additions & 0 deletions flang/include/flang/Common/default-kinds.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,61 @@
//===-- include/flang/Common/default-kinds.h --------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_DEFAULT_KINDS_H_
#define FORTRAN_COMMON_DEFAULT_KINDS_H_

#include "flang/Common/Fortran.h"
#include <cstdint>

namespace Fortran::common {

// All address calculations in generated code are 64-bit safe.
// Compile-time folding of bounds, subscripts, and lengths
// consequently uses 64-bit signed integers. The name reflects
// this usage as a subscript into a constant array.
using ConstantSubscript = std::int64_t;

// Represent the default values of the kind parameters of the
// various intrinsic types. Most of these can be configured by
// means of the compiler command line.
class IntrinsicTypeDefaultKinds {
public:
IntrinsicTypeDefaultKinds();
int subscriptIntegerKind() const { return subscriptIntegerKind_; }
int sizeIntegerKind() const { return sizeIntegerKind_; }
int doublePrecisionKind() const { return doublePrecisionKind_; }
int quadPrecisionKind() const { return quadPrecisionKind_; }

IntrinsicTypeDefaultKinds &set_defaultIntegerKind(int);
IntrinsicTypeDefaultKinds &set_subscriptIntegerKind(int);
IntrinsicTypeDefaultKinds &set_sizeIntegerKind(int);
IntrinsicTypeDefaultKinds &set_defaultRealKind(int);
IntrinsicTypeDefaultKinds &set_doublePrecisionKind(int);
IntrinsicTypeDefaultKinds &set_quadPrecisionKind(int);
IntrinsicTypeDefaultKinds &set_defaultCharacterKind(int);
IntrinsicTypeDefaultKinds &set_defaultLogicalKind(int);

int GetDefaultKind(TypeCategory) const;

private:
// Default REAL just simply has to be IEEE-754 single precision today.
// It occupies one numeric storage unit by definition. The default INTEGER
// and default LOGICAL intrinsic types also have to occupy one numeric
// storage unit, so their kinds are also forced. Default COMPLEX must always
// comprise two default REAL components.
int defaultIntegerKind_{4};
int subscriptIntegerKind_{8};
int sizeIntegerKind_{4}; // SIZE(), UBOUND(), &c. default KIND=
int defaultRealKind_{defaultIntegerKind_};
int doublePrecisionKind_{2 * defaultRealKind_};
int quadPrecisionKind_{2 * doublePrecisionKind_};
int defaultCharacterKind_{1};
int defaultLogicalKind_{defaultIntegerKind_};
};
} // namespace Fortran::common
#endif // FORTRAN_COMMON_DEFAULT_KINDS_H_
224 changes: 224 additions & 0 deletions flang/include/flang/Common/enum-set.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,224 @@
//===-- include/flang/Common/enum-set.h -------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_ENUM_SET_H_
#define FORTRAN_COMMON_ENUM_SET_H_

// Implements a set of enums as a std::bitset<>. APIs from bitset<> and set<>
// can be used on these sets, whichever might be more clear to the user.
// This class template facilitates the use of the more type-safe C++ "enum
// class" feature without loss of convenience.

#include "constexpr-bitset.h"
#include "idioms.h"
#include "llvm/Support/raw_ostream.h"
#include <bitset>
#include <cstddef>
#include <initializer_list>
#include <optional>
#include <string>
#include <type_traits>

namespace Fortran::common {

template <typename ENUM, std::size_t BITS> class EnumSet {
static_assert(BITS > 0);

public:
// When the bitset fits in a word, use a custom local bitset class that is
// more amenable to constexpr evaluation than the current std::bitset<>.
using bitsetType =
std::conditional_t<(BITS <= 64), common::BitSet<BITS>, std::bitset<BITS>>;
using enumerationType = ENUM;

constexpr EnumSet() {}
constexpr EnumSet(const std::initializer_list<enumerationType> &enums) {
for (auto x : enums) {
set(x);
}
}
constexpr EnumSet(const EnumSet &) = default;
constexpr EnumSet(EnumSet &&) = default;

constexpr EnumSet &operator=(const EnumSet &) = default;
constexpr EnumSet &operator=(EnumSet &&) = default;

const bitsetType &bitset() const { return bitset_; }

constexpr EnumSet &operator&=(const EnumSet &that) {
bitset_ &= that.bitset_;
return *this;
}
constexpr EnumSet &operator&=(EnumSet &&that) {
bitset_ &= that.bitset_;
return *this;
}
constexpr EnumSet &operator|=(const EnumSet &that) {
bitset_ |= that.bitset_;
return *this;
}
constexpr EnumSet &operator|=(EnumSet &&that) {
bitset_ |= that.bitset_;
return *this;
}
constexpr EnumSet &operator^=(const EnumSet &that) {
bitset_ ^= that.bitset_;
return *this;
}
constexpr EnumSet &operator^=(EnumSet &&that) {
bitset_ ^= that.bitset_;
return *this;
}

constexpr EnumSet operator~() const {
EnumSet result;
result.bitset_ = ~bitset_;
return result;
}
constexpr EnumSet operator&(const EnumSet &that) const {
EnumSet result{*this};
result.bitset_ &= that.bitset_;
return result;
}
constexpr EnumSet operator&(EnumSet &&that) const {
EnumSet result{*this};
result.bitset_ &= that.bitset_;
return result;
}
constexpr EnumSet operator|(const EnumSet &that) const {
EnumSet result{*this};
result.bitset_ |= that.bitset_;
return result;
}
constexpr EnumSet operator|(EnumSet &&that) const {
EnumSet result{*this};
result.bitset_ |= that.bitset_;
return result;
}
constexpr EnumSet operator^(const EnumSet &that) const {
EnumSet result{*this};
result.bitset_ ^= that.bitset_;
return result;
}
constexpr EnumSet operator^(EnumSet &&that) const {
EnumSet result{*this};
result.bitset_ ^= that.bitset_;
return result;
}

constexpr bool operator==(const EnumSet &that) const {
return bitset_ == that.bitset_;
}
constexpr bool operator==(EnumSet &&that) const {
return bitset_ == that.bitset_;
}
constexpr bool operator!=(const EnumSet &that) const {
return bitset_ != that.bitset_;
}
constexpr bool operator!=(EnumSet &&that) const {
return bitset_ != that.bitset_;
}

// N.B. std::bitset<> has size() for max_size(), but that's not the same
// thing as std::set<>::size(), which is an element count.
static constexpr std::size_t max_size() { return BITS; }
constexpr bool test(enumerationType x) const {
return bitset_.test(static_cast<std::size_t>(x));
}
constexpr bool all() const { return bitset_.all(); }
constexpr bool any() const { return bitset_.any(); }
constexpr bool none() const { return bitset_.none(); }

// N.B. std::bitset<> has count() as an element count, while
// std::set<>::count(x) returns 0 or 1 to indicate presence.
constexpr std::size_t count() const { return bitset_.count(); }
constexpr std::size_t count(enumerationType x) const {
return test(x) ? 1 : 0;
}

constexpr EnumSet &set() {
bitset_.set();
return *this;
}
constexpr EnumSet &set(enumerationType x, bool value = true) {
bitset_.set(static_cast<std::size_t>(x), value);
return *this;
}
constexpr EnumSet &reset() {
bitset_.reset();
return *this;
}
constexpr EnumSet &reset(enumerationType x) {
bitset_.reset(static_cast<std::size_t>(x));
return *this;
}
constexpr EnumSet &flip() {
bitset_.flip();
return *this;
}
constexpr EnumSet &flip(enumerationType x) {
bitset_.flip(static_cast<std::size_t>(x));
return *this;
}

constexpr bool empty() const { return none(); }
void clear() { reset(); }
void insert(enumerationType x) { set(x); }
void insert(enumerationType &&x) { set(x); }
void emplace(enumerationType &&x) { set(x); }
void erase(enumerationType x) { reset(x); }
void erase(enumerationType &&x) { reset(x); }

constexpr std::optional<enumerationType> LeastElement() const {
if (empty()) {
return std::nullopt;
} else if constexpr (std::is_same_v<bitsetType, common::BitSet<BITS>>) {
return {static_cast<enumerationType>(bitset_.LeastElement().value())};
} else {
// std::bitset: just iterate
for (std::size_t j{0}; j < BITS; ++j) {
// std::bitset<>::test() takes a bit position, not an enumerator.
if (bitset_.test(j)) {
return {static_cast<enumerationType>(j)};
}
}
die("EnumSet::LeastElement(): no bit found in non-empty std::bitset");
}
}

template <typename FUNC> void IterateOverMembers(const FUNC &f) const {
EnumSet copy{*this};
while (auto least{copy.LeastElement()}) {
f(*least);
copy.erase(*least);
}
}

llvm::raw_ostream &Dump(
llvm::raw_ostream &o, std::string EnumToString(enumerationType)) const {
char sep{'{'};
IterateOverMembers([&](auto e) {
o << sep << EnumToString(e);
sep = ',';
});
return o << (sep == '{' ? "{}" : "}");
}

private:
bitsetType bitset_{};
};
} // namespace Fortran::common

template <typename ENUM, std::size_t values>
struct std::hash<Fortran::common::EnumSet<ENUM, values>> {
std::size_t operator()(
const Fortran::common::EnumSet<ENUM, values> &x) const {
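// std::hash is a class template, so name the specialization and invoke
// an instance of it. This assumes that std::hash<> is specialized for
// bitsetType.
using BitsetType =
typename Fortran::common::EnumSet<ENUM, values>::bitsetType;
return std::hash<BitsetType>{}(x.bitset());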
}
};
#endif // FORTRAN_COMMON_ENUM_SET_H_
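
A usage sketch for the set operations in enum-set.h, under stated assumptions: `MiniEnumSet` is a simplified stand-in backed directly by `std::bitset<>`, not the real `common::EnumSet`, and the `Attr` enumeration and `Attr_enumSize` are hypothetical names invented for the illustration. It shows the two meanings of `count()` that the header's comments distinguish.

```cpp
#include <bitset>
#include <cstddef>
#include <initializer_list>

// Hypothetical enumeration, used only for this illustration.
enum class Attr { Allocatable, Pointer, Target, Volatile };
constexpr std::size_t Attr_enumSize{4};

// Simplified stand-in for common::EnumSet: one bit per enumerator.
template <typename ENUM, std::size_t BITS> class MiniEnumSet {
public:
  MiniEnumSet() = default;
  MiniEnumSet(std::initializer_list<ENUM> enums) {
    for (ENUM x : enums) {
      set(x);
    }
  }

  MiniEnumSet &set(ENUM x) {
    bits_.set(static_cast<std::size_t>(x));
    return *this;
  }
  bool test(ENUM x) const { return bits_.test(static_cast<std::size_t>(x)); }

  // bitset-style count(): the number of elements in the set.
  std::size_t count() const { return bits_.count(); }
  // set-style count(x): 1 if x is present, otherwise 0.
  std::size_t count(ENUM x) const { return test(x) ? 1 : 0; }

  MiniEnumSet operator&(const MiniEnumSet &that) const {
    MiniEnumSet result = *this;
    result.bits_ &= that.bits_;
    return result;
  }
  MiniEnumSet operator|(const MiniEnumSet &that) const {
    MiniEnumSet result = *this;
    result.bits_ |= that.bits_;
    return result;
  }

private:
  std::bitset<BITS> bits_;
};

using Attrs = MiniEnumSet<Attr, Attr_enumSize>;
```

With `Attrs a{Attr::Pointer, Attr::Target}` and `Attrs b{Attr::Target, Attr::Volatile}`, the intersection `a & b` holds only `Attr::Target` and the union `a | b` holds three enumerators.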
845 changes: 845 additions & 0 deletions flang/include/flang/Common/format.h

Large diffs are not rendered by default.

166 changes: 166 additions & 0 deletions flang/include/flang/Common/idioms.h
@@ -0,0 +1,166 @@
//===-- include/flang/Common/idioms.h ---------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_IDIOMS_H_
#define FORTRAN_COMMON_IDIOMS_H_

// Defines anything that might ever be useful in more than one source file
// or that is too weird or too specific to the host C++ compiler to be
// exposed elsewhere.

#ifndef __cplusplus
#error this is a C++ program
#endif
#if __cplusplus < 201703L
#error this is a C++17 program
#endif
#if !__clang__ && defined __GNUC__ && __GNUC__ < 7
#error g++ >= 7.2 is required
#endif

#include <functional>
#include <initializer_list>
#include <list>
#include <optional>
#include <string>
#include <tuple>
#include <type_traits>
#include <variant>

#if __GNUC__ == 7
// Avoid a deduction bug in GNU 7.x headers by forcing the answer.
namespace std {
template <typename A>
struct is_trivially_copy_constructible<list<A>> : false_type {};
template <typename A>
struct is_trivially_copy_constructible<optional<list<A>>> : false_type {};
} // namespace std
#endif

// enable "this is a std::string"s with the 's' suffix
using namespace std::literals::string_literals;

namespace Fortran::common {

// Helper templates for combining a list of lambdas into an anonymous
// struct for use with std::visit() on a std::variant<> sum type.
// E.g.: std::visit(visitors{
// [&](const firstType &x) { ... },
// [&](const secondType &x) { ... },
// ...
// [&](const auto &catchAll) { ... }}, variantObject);

template <typename... LAMBDAS> struct visitors : LAMBDAS... {
using LAMBDAS::operator()...;
};

template <typename... LAMBDAS> visitors(LAMBDAS... x) -> visitors<LAMBDAS...>;

// Calls std::fprintf(stderr, ...), then abort().
[[noreturn]] void die(const char *, ...);

#define DIE(x) Fortran::common::die(x " at " __FILE__ "(%d)", __LINE__)

// For switch statement default: labels.
#define CRASH_NO_CASE DIE("no case")

// clang-format off
// For switch statements whose cases have return statements for
// all possibilities. Clang emits warnings if the default: is
// present, gcc emits warnings if it is absent.
#if __clang__
#define SWITCH_COVERS_ALL_CASES
#else
#define SWITCH_COVERS_ALL_CASES default: CRASH_NO_CASE;
#endif
// clang-format on

// For cheap assertions that should be applied in production.
// To disable, compile with '-DCHECK=(void)'
#ifndef CHECK
#define CHECK(x) ((x) || (DIE("CHECK(" #x ") failed"), false))
#endif

// User-defined type traits that default to false:
// Invoke CLASS_TRAIT(traitName) to define a trait, then put
// using traitName = std::true_type; (or false_type)
// into the appropriate class definitions. You can then use
// typename std::enable_if_t<traitName<...>, ...>
// in template specialization definitions.
#define CLASS_TRAIT(T) \
namespace class_trait_ns_##T { \
template <typename A> std::true_type test(typename A::T *); \
template <typename A> std::false_type test(...); \
template <typename A> \
constexpr bool has_trait{decltype(test<A>(nullptr))::value}; \
template <typename A> constexpr bool trait_value() { \
if constexpr (has_trait<A>) { \
using U = typename A::T; \
return U::value; \
} else { \
return false; \
} \
} \
} \
template <typename A> constexpr bool T{class_trait_ns_##T::trait_value<A>()};

#if !defined ATTRIBUTE_UNUSED && (__clang__ || __GNUC__)
#define ATTRIBUTE_UNUSED __attribute__((unused))
#endif

// Define enum class NAME with the given enumerators, a static
// function EnumToString() that maps enumerators to std::string,
// and a constant NAME_enumSize that captures the number of items
// in the enum class.

std::string EnumIndexToString(int index, const char *names);

template <typename A> struct ListItemCount {
constexpr ListItemCount(std::initializer_list<A> list) : value{list.size()} {}
const std::size_t value;
};

#define ENUM_CLASS(NAME, ...) \
enum class NAME { __VA_ARGS__ }; \
ATTRIBUTE_UNUSED static constexpr std::size_t NAME##_enumSize{[] { \
enum { __VA_ARGS__ }; \
return Fortran::common::ListItemCount{__VA_ARGS__}.value; \
}()}; \
ATTRIBUTE_UNUSED static inline std::string EnumToString(NAME e) { \
return Fortran::common::EnumIndexToString( \
static_cast<int>(e), #__VA_ARGS__); \
}

// Check that a pointer is non-null and dereference it
#define DEREF(p) Fortran::common::Deref(p, __FILE__, __LINE__)

template <typename T> constexpr T &Deref(T *p, const char *file, int line) {
if (!p) {
Fortran::common::die("nullptr dereference at %s(%d)", file, line);
}
return *p;
}

// Given a const reference to a value, return a copy of the value.
template <typename A> A Clone(const A &x) { return x; }

// C++ does a weird and dangerous thing when deducing template type parameters
// from function arguments: a parameter declared as A && is a forwarding
// reference, so it binds to lvalue arguments (deducing A as an lvalue
// reference type) as well as to rvalues. Template function declarations like
// template<typename A> int foo(A &&);
// need to be protected against this C++ language feature when functions
// may modify such arguments. Use these type functions to invoke SFINAE
// on a result type via
// template<typename A> common::IfNoLvalue<int, A> foo(A &&);
// or, for constructors, which have no result type, via a defaulted
// template parameter:
// template<typename A, typename = common::NoLvalue<A>> Foo(A &&);
// This works with parameter packs too.
template <typename A, typename... B>
using IfNoLvalue = std::enable_if_t<(... && !std::is_lvalue_reference_v<B>), A>;
template <typename... RVREF> using NoLvalue = IfNoLvalue<void, RVREF...>;
} // namespace Fortran::common
#endif // FORTRAN_COMMON_IDIOMS_H_
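
The `visitors` helper in idioms.h can be tried in isolation. The sketch below repeats the two-line definition so that it compiles on its own (C++17 is assumed); the `Value` variant and the `Describe` function are hypothetical names for the illustration. It shows a catch-all generic lambda losing to the more specific overloads.

```cpp
#include <string>
#include <variant>

// Same shape as common::visitors: inherit every lambda's operator().
template <typename... LAMBDAS> struct visitors : LAMBDAS... {
  using LAMBDAS::operator()...;
};
template <typename... LAMBDAS> visitors(LAMBDAS...) -> visitors<LAMBDAS...>;

// Hypothetical sum type for the illustration.
using Value = std::variant<int, double, std::string>;

std::string Describe(const Value &v) {
  return std::visit(
      visitors{
          [](int) { return std::string{"integer"}; },
          [](const std::string &) { return std::string{"string"}; },
          // Catch-all: chosen only when no specific lambda matches better.
          [](const auto &) { return std::string{"other"}; },
      },
      v);
}
```

Overload resolution picks the non-template lambda whenever it matches exactly, so the `double` alternative is the only one that reaches the catch-all here.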
141 changes: 141 additions & 0 deletions flang/include/flang/Common/indirection.h
@@ -0,0 +1,141 @@
//===-- include/flang/Common/indirection.h ----------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_INDIRECTION_H_
#define FORTRAN_COMMON_INDIRECTION_H_

// Define a smart pointer class template that is rather like
// non-nullable std::unique_ptr<>. Indirection<> is, like a C++ reference
// type, restricted to be non-null when constructed or assigned.
// Indirection<> optionally supports copy construction and copy assignment.
//
// To use Indirection<> with forward-referenced types, add
// extern template class Fortran::common::Indirection<FORWARD_TYPE>;
// outside any namespace in a header before use, and
// template class Fortran::common::Indirection<FORWARD_TYPE>;
// in one C++ source file later where a definition of the type is visible.

#include "idioms.h"
#include <memory>
#include <type_traits>
#include <utility>

namespace Fortran::common {

// The default case does not support (deep) copy construction or assignment.
template <typename A, bool COPY = false> class Indirection {
public:
using element_type = A;
Indirection() = delete;
Indirection(A *&&p) : p_{p} {
CHECK(p_ && "assigning null pointer to Indirection");
p = nullptr;
}
Indirection(A &&x) : p_{new A(std::move(x))} {}
Indirection(Indirection &&that) : p_{that.p_} {
CHECK(p_ && "move construction of Indirection from null Indirection");
that.p_ = nullptr;
}
~Indirection() {
delete p_;
p_ = nullptr;
}
Indirection &operator=(Indirection &&that) {
CHECK(that.p_ && "move assignment of null Indirection to Indirection");
auto tmp{p_};
p_ = that.p_;
that.p_ = tmp;
return *this;
}

A &value() { return *p_; }
const A &value() const { return *p_; }

bool operator==(const A &that) const { return *p_ == that; }
bool operator==(const Indirection &that) const { return *p_ == *that.p_; }

template <typename... ARGS>
static common::IfNoLvalue<Indirection, ARGS...> Make(ARGS &&... args) {
return {new A(std::move(args)...)};
}

private:
A *p_{nullptr};
};

// Variant with copy construction and assignment
template <typename A> class Indirection<A, true> {
public:
using element_type = A;

Indirection() = delete;
Indirection(A *&&p) : p_{p} {
CHECK(p_ && "assigning null pointer to Indirection");
p = nullptr;
}
Indirection(const A &x) : p_{new A(x)} {}
Indirection(A &&x) : p_{new A(std::move(x))} {}
Indirection(const Indirection &that) {
CHECK(that.p_ && "copy construction of Indirection from null Indirection");
p_ = new A(*that.p_);
}
Indirection(Indirection &&that) : p_{that.p_} {
CHECK(p_ && "move construction of Indirection from null Indirection");
that.p_ = nullptr;
}
~Indirection() {
delete p_;
p_ = nullptr;
}
Indirection &operator=(const Indirection &that) {
CHECK(that.p_ && "copy assignment of Indirection from null Indirection");
*p_ = *that.p_;
return *this;
}
Indirection &operator=(Indirection &&that) {
CHECK(that.p_ && "move assignment of null Indirection to Indirection");
auto tmp{p_};
p_ = that.p_;
that.p_ = tmp;
return *this;
}

A &value() { return *p_; }
const A &value() const { return *p_; }

bool operator==(const A &that) const { return *p_ == that; }
bool operator==(const Indirection &that) const { return *p_ == *that.p_; }

template <typename... ARGS>
static common::IfNoLvalue<Indirection, ARGS...> Make(ARGS &&... args) {
return {new A(std::move(args)...)};
}

private:
A *p_{nullptr};
};

template <typename A> using CopyableIndirection = Indirection<A, true>;

// For use with std::unique_ptr<> when declaring owning pointers to
// forward-referenced types, here's a minimal custom deleter that avoids
// some of the drama with std::default_delete<>. Invoke DEFINE_DELETER()
// later in exactly one C++ source file where a complete definition of the
// type is visible. Be advised, std::unique_ptr<> does not have copy
// semantics; if you need ownership, copy semantics, and nullability,
// std::optional<CopyableIndirection<>> works.
template <typename A> class Deleter {
public:
void operator()(A *) const;
};
} // namespace Fortran::common
#define DEFINE_DELETER(A) \
template <> void Fortran::common::Deleter<A>::operator()(A *p) const { \
delete p; \
}
#endif // FORTRAN_COMMON_INDIRECTION_H_
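
One reason to prefer an owning, non-null pointer over a plain member is that it lets a type contain itself. The sketch below assumes a cut-down stand-in (`MiniIndirection`, move-only, with `assert` in place of `CHECK`) rather than the real `common::Indirection`; `Node` and `Sum` are hypothetical names for the illustration.

```cpp
#include <cassert>
#include <optional>
#include <utility>

// Simplified stand-in for common::Indirection<A>: an owning pointer that is
// null only after it has been moved from.
template <typename A> class MiniIndirection {
public:
  MiniIndirection(A &&x) : p_{new A(std::move(x))} {}
  MiniIndirection(MiniIndirection &&that) : p_{that.p_} {
    assert(p_ && "move construction from a moved-from MiniIndirection");
    that.p_ = nullptr;
  }
  MiniIndirection &operator=(MiniIndirection &&that) {
    assert(that.p_ && "move assignment from a moved-from MiniIndirection");
    std::swap(p_, that.p_);
    return *this;
  }
  ~MiniIndirection() { delete p_; }

  A &value() { return *p_; }
  const A &value() const { return *p_; }

private:
  A *p_;
};

// Hypothetical recursive type: Node is still incomplete where the member is
// declared, which is fine because only a pointer to it is stored.
struct Node {
  int head;
  std::optional<MiniIndirection<Node>> tail;
};

// Sums the list by following the owned tail pointers.
int Sum(const Node &n) {
  return n.head + (n.tail ? Sum(n.tail->value()) : 0);
}
```

Nullability comes from the `std::optional<>` wrapper, not from the pointer itself, which matches the advice in the header's comment about combining ownership with optionality.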
115 changes: 115 additions & 0 deletions flang/include/flang/Common/interval.h
@@ -0,0 +1,115 @@
//===-- include/flang/Common/interval.h -------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_INTERVAL_H_
#define FORTRAN_COMMON_INTERVAL_H_

// Defines a generalized template class Interval<A> to represent
// the half-open interval [x .. x+n).

#include "idioms.h"
#include <algorithm>
#include <cstddef>
#include <utility>

namespace Fortran::common {

template <typename A> class Interval {
public:
using type = A;
constexpr Interval() {}
constexpr Interval(const A &s, std::size_t n = 1) : start_{s}, size_{n} {}
constexpr Interval(A &&s, std::size_t n = 1)
: start_{std::move(s)}, size_{n} {}
constexpr Interval(const Interval &) = default;
constexpr Interval(Interval &&) = default;
constexpr Interval &operator=(const Interval &) = default;
constexpr Interval &operator=(Interval &&) = default;

constexpr bool operator==(const Interval &that) const {
return start_ == that.start_ && size_ == that.size_;
}
constexpr bool operator!=(const Interval &that) const {
return !(*this == that);
}

constexpr const A &start() const { return start_; }
constexpr std::size_t size() const { return size_; }
constexpr bool empty() const { return size_ == 0; }

constexpr bool Contains(const A &x) const {
return start_ <= x && x < start_ + size_;
}
constexpr bool Contains(const Interval &that) const {
return Contains(that.start_) && Contains(that.start_ + (that.size_ - 1));
}
constexpr bool IsDisjointWith(const Interval &that) const {
return that.NextAfter() <= start_ || NextAfter() <= that.start_;
}
constexpr bool ImmediatelyPrecedes(const Interval &that) const {
return NextAfter() == that.start_;
}
void Annex(const Interval &that) {
size_ = (that.start_ + that.size_) - start_;
}
bool AnnexIfPredecessor(const Interval &that) {
if (ImmediatelyPrecedes(that)) {
size_ += that.size_;
return true;
}
return false;
}
void ExtendToCover(const Interval &that) {
if (size_ == 0) {
*this = that;
} else if (that.size_ != 0) {
const auto end{std::max(NextAfter(), that.NextAfter())};
start_ = std::min(start_, that.start_);
size_ = end - start_;
}
}

std::size_t MemberOffset(const A &x) const {
CHECK(Contains(x));
return x - start_;
}
A OffsetMember(std::size_t n) const {
CHECK(n < size_);
return start_ + n;
}

constexpr A Last() const { return start_ + (size_ - 1); }
constexpr A NextAfter() const { return start_ + size_; }
constexpr Interval Prefix(std::size_t n) const {
return {start_, std::min(size_, n)};
}
Interval Suffix(std::size_t n) const {
CHECK(n <= size_);
return {start_ + n, size_ - n};
}

constexpr Interval Intersection(const Interval &that) const {
if (that.NextAfter() <= start_) {
return {};
} else if (that.start_ <= start_) {
auto skip{start_ - that.start_};
return {start_, std::min(size_, that.size_ - skip)};
} else if (NextAfter() <= that.start_) {
return {};
} else {
auto skip{that.start_ - start_};
return {that.start_, std::min(that.size_, size_ - skip)};
}
}

private:
A start_;
std::size_t size_{0};
};
} // namespace Fortran::common
#endif // FORTRAN_COMMON_INTERVAL_H_
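
The four-branch `Intersection()` in interval.h can be cross-checked against a shorter formulation: take the larger start and the smaller end. The sketch below is that alternative, not the header's code; `Range` and `Intersect` are hypothetical names, and the element type is fixed to `std::size_t` for simplicity. The two should agree for non-empty intervals; for empty operands they may report different (equally empty) results.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical simplified interval over std::size_t: [start, start + size).
struct Range {
  std::size_t start{0};
  std::size_t size{0};
  std::size_t NextAfter() const { return start + size; }
};

// Intersection as "largest start, smallest end".
Range Intersect(const Range &a, const Range &b) {
  std::size_t lo{std::max(a.start, b.start)};
  std::size_t hi{std::min(a.NextAfter(), b.NextAfter())};
  return hi > lo ? Range{lo, hi - lo} : Range{};
}
```

For example, [2 .. 7) and [4 .. 14) intersect in [4 .. 7), while [0 .. 2) and [2 .. 5) merely touch, so their intersection is empty because the intervals are half-open.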
96 changes: 96 additions & 0 deletions flang/include/flang/Common/leading-zero-bit-count.h
@@ -0,0 +1,96 @@
//===-- include/flang/Common/leading-zero-bit-count.h -----------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_LEADING_ZERO_BIT_COUNT_H_
#define FORTRAN_COMMON_LEADING_ZERO_BIT_COUNT_H_

// A fast and portable function that implements Fortran's LEADZ intrinsic
// function, which counts the number of leading (most significant) zero bit
// positions in an integer value. (If the most significant bit is set, the
// leading zero count is zero; if no bit is set, the leading zero count is the
// word size in bits; otherwise, it's the largest left shift count that
// doesn't reduce the number of bits in the word that are set.)

#include <cinttypes>

namespace Fortran::common {
namespace {
// The following magic constant is a binary deBruijn sequence.
// It has the remarkable property that if one extends it
// (virtually) on the right with 5 more zero bits, then all
// of the 64 contiguous framed blocks of six bits in the
// extended 69-bit sequence are distinct. Consequently,
// if one shifts it left by any shift count [0..63] with
// truncation and extracts the uppermost six bit field
// of the shifted value, each shift count maps to a distinct
// field value. That means that we can map those 64 field
// values back to the shift counts that produce them,
// and (the point) this means that we can shift this value
// by an unknown bit count in [0..63] and then figure out
// what that count must have been.
// 0 7 e d d 5 e 5 9 a 4 e 2 8 c 2
// 0000011111101101110101011110010110011010010011100010100011000010
static constexpr std::uint64_t deBruijn{0x07edd5e59a4e28c2};
static constexpr std::uint8_t mapping[64]{63, 0, 58, 1, 59, 47, 53, 2, 60, 39,
48, 27, 54, 33, 42, 3, 61, 51, 37, 40, 49, 18, 28, 20, 55, 30, 34, 11, 43,
14, 22, 4, 62, 57, 46, 52, 38, 26, 32, 41, 50, 36, 17, 19, 29, 10, 13, 21,
56, 45, 25, 31, 35, 16, 9, 12, 44, 24, 15, 8, 23, 7, 6, 5};
} // namespace

inline constexpr int LeadingZeroBitCount(std::uint64_t x) {
if (x == 0) {
return 64;
} else {
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
x |= x >> 32;
// All of the bits below the uppermost set bit are now also set.
x -= x >> 1; // All of the bits below the uppermost are now clear.
// x now has exactly one bit set, so it is a power of two, so
// multiplication by x is equivalent to a left shift by its
// base-2 logarithm. We calculate that unknown base-2 logarithm
// by shifting the deBruijn sequence and mapping the framed value.
int base2Log{mapping[(x * deBruijn) >> 58]};
return 63 - base2Log; // convert to leading zero count
}
}

inline constexpr int LeadingZeroBitCount(std::uint32_t x) {
return LeadingZeroBitCount(static_cast<std::uint64_t>(x)) - 32;
}

inline constexpr int LeadingZeroBitCount(std::uint16_t x) {
return LeadingZeroBitCount(static_cast<std::uint64_t>(x)) - 48;
}

namespace {
static constexpr std::uint8_t eightBitLeadingZeroBitCount[256]{8, 7, 6, 6, 5, 5,
5, 5, 4, 4, 4, 4, 4, 4, 4, 4, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3,
3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
}

inline constexpr int LeadingZeroBitCount(std::uint8_t x) {
return eightBitLeadingZeroBitCount[x];
}

template <typename A> inline constexpr int BitsNeededFor(A x) {
return 8 * sizeof x - LeadingZeroBitCount(x);
}
} // namespace Fortran::common
#endif // FORTRAN_COMMON_LEADING_ZERO_BIT_COUNT_H_
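
The two ideas behind the 64-bit `LeadingZeroBitCount` can be checked separately. The sketch below copies the `deBruijn` constant from the header; the function names `IsolateTopBit` and `TopFieldsAreDistinct` are hypothetical and are not part of the header. The first function reproduces the smear-and-subtract step, and the second tests the property the header's comment relies on: every left shift in [0..63] leaves a distinct value in the top six bits.

```cpp
#include <cstdint>

// Constant copied from leading-zero-bit-count.h.
constexpr std::uint64_t deBruijn{0x07edd5e59a4e28c2};

// Smear the uppermost set bit into every lower position, then keep only
// that uppermost bit. Returns 0 when x is 0.
constexpr std::uint64_t IsolateTopBit(std::uint64_t x) {
  x |= x >> 1;
  x |= x >> 2;
  x |= x >> 4;
  x |= x >> 8;
  x |= x >> 16;
  x |= x >> 32;
  return x - (x >> 1);
}

// Returns true if all 64 truncating left shifts of the constant produce
// distinct six-bit fields in the most significant position.
constexpr bool TopFieldsAreDistinct() {
  std::uint64_t seen{0}; // bit f is set once field value f has appeared
  for (int shift{0}; shift < 64; ++shift) {
    std::uint64_t field{(deBruijn << shift) >> 58};
    std::uint64_t bit{std::uint64_t{1} << field};
    if (seen & bit) {
      return false;
    }
    seen |= bit;
  }
  return true;
}
```

Because the fields are distinct, a 64-entry table such as the header's `mapping` can invert the shift, which is how the base-2 logarithm of the isolated bit is recovered.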
102 changes: 102 additions & 0 deletions flang/include/flang/Common/real.h
@@ -0,0 +1,102 @@
//===-- include/flang/Common/real.h -----------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_REAL_H_
#define FORTRAN_COMMON_REAL_H_

// Characteristics of IEEE-754 & related binary floating-point numbers.
// The various representations are distinguished by their binary precisions
// (number of explicit significand bits and any implicit MSB in the fraction).

#include <cinttypes>

namespace Fortran::common {

// Total representation size in bits for each type
static constexpr int BitsForBinaryPrecision(int binaryPrecision) {
switch (binaryPrecision) {
case 8:
return 16; // IEEE single (truncated): 1+8+7
case 11:
return 16; // IEEE half precision: 1+5+10
case 24:
return 32; // IEEE single precision: 1+8+23
case 53:
return 64; // IEEE double precision: 1+11+52
case 64:
return 80; // x87 extended precision: 1+15+64
case 106:
return 128; // "double-double": 2*(1+11+52)
case 113:
return 128; // IEEE quad precision: 1+15+112
default:
return -1;
}
}

// Number of significant decimal digits in the fraction of the
// exact conversion of the least nonzero (subnormal) value
// in each type; i.e., a 128-bit quad value can be formatted
// exactly with FORMAT(E0.22981).
static constexpr int MaxDecimalConversionDigits(int binaryPrecision) {
switch (binaryPrecision) {
case 8:
return 93;
case 11:
return 17;
case 24:
return 105;
case 53:
return 751;
case 64:
return 11495;
case 106:
return 2 * 751;
case 113:
return 11530;
default:
return -1;
}
}

template <int BINARY_PRECISION> class RealDetails {
private:
// Converts bit widths to whole decimal digits
static constexpr int LogBaseTwoToLogBaseTen(int logb2) {
constexpr std::int64_t LogBaseTenOfTwoTimesTenToThe12th{301029995664};
constexpr std::int64_t TenToThe12th{1000000000000};
std::int64_t logb10{
(logb2 * LogBaseTenOfTwoTimesTenToThe12th) / TenToThe12th};
return static_cast<int>(logb10);
}

public:
static constexpr int binaryPrecision{BINARY_PRECISION};
static constexpr int bits{BitsForBinaryPrecision(binaryPrecision)};
static constexpr bool isImplicitMSB{binaryPrecision != 64 /*x87*/};
static constexpr int significandBits{binaryPrecision - isImplicitMSB};
static constexpr int exponentBits{bits - significandBits - 1 /*sign*/};
static constexpr int maxExponent{(1 << exponentBits) - 1};
static constexpr int exponentBias{maxExponent / 2};

static constexpr int decimalPrecision{
LogBaseTwoToLogBaseTen(binaryPrecision - 1)};
static constexpr int decimalRange{LogBaseTwoToLogBaseTen(exponentBias - 1)};

// Number of significant decimal digits in the fraction of the
// exact conversion of the least nonzero subnormal.
static constexpr int maxDecimalConversionDigits{
MaxDecimalConversionDigits(binaryPrecision)};

static_assert(binaryPrecision > 0);
static_assert(exponentBits > 1);
static_assert(exponentBits <= 15);
};

} // namespace Fortran::common
#endif // FORTRAN_COMMON_REAL_H_
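
The arithmetic in `RealDetails` is easy to verify by hand for the IEEE interchange formats. The sketch below redoes it in a standalone function, assuming an implicit most significant bit (so it does not cover the x87 case); `Fields` and `DeriveIeeeFields` are hypothetical names, and the total width is passed in directly instead of being looked up.

```cpp
// Field widths derived from the total width and the binary precision.
struct Fields {
  int significandBits;
  int exponentBits;
  int exponentBias;
};

// Assumes an implicit MSB, as in the IEEE half, single, double, and quad
// formats: one fewer significand bit is stored than the precision.
constexpr Fields DeriveIeeeFields(int totalBits, int binaryPrecision) {
  int significandBits{binaryPrecision - 1};
  int exponentBits{totalBits - significandBits - 1 /*sign*/};
  int maxExponent{(1 << exponentBits) - 1};
  return Fields{significandBits, exponentBits, maxExponent / 2};
}

// IEEE single precision: 1 sign + 8 exponent + 23 significand bits.
static_assert(DeriveIeeeFields(32, 24).exponentBits == 8, "single");
static_assert(DeriveIeeeFields(32, 24).exponentBias == 127, "single");
// IEEE double precision: 1 sign + 11 exponent + 52 significand bits.
static_assert(DeriveIeeeFields(64, 53).exponentBits == 11, "double");
static_assert(DeriveIeeeFields(64, 53).exponentBias == 1023, "double");
```

The results match the layouts listed in the comments of `BitsForBinaryPrecision`, for example 1+8+23 for single precision.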
76 changes: 76 additions & 0 deletions flang/include/flang/Common/reference-counted.h
@@ -0,0 +1,76 @@
//===-- include/flang/Common/reference-counted.h ----------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef FORTRAN_COMMON_REFERENCE_COUNTED_H_
#define FORTRAN_COMMON_REFERENCE_COUNTED_H_

// A class template of smart pointers to objects that carry their own
// reference counts and are deleted when the last reference is dropped.
// Lighter weight than std::shared_ptr<>. Not thread-safe.

namespace Fortran::common {

// A base class for reference-counted objects. Must be public.
template <typename A> class ReferenceCounted {
public:
ReferenceCounted() {}
void TakeReference() { ++references_; }
void DropReference() {
if (--references_ == 0) {
delete static_cast<A *>(this);
}
}

private:
int references_{0};
};

// A reference to a reference-counted object.
template <typename A> class CountedReference {
public:
using type = A;
CountedReference() {}
CountedReference(type *m) : p_{m} { Take(); }
CountedReference(const CountedReference &c) : p_{c.p_} { Take(); }
CountedReference(CountedReference &&c) : p_{c.p_} { c.p_ = nullptr; }
CountedReference &operator=(const CountedReference &c) {
c.Take();
Drop();
p_ = c.p_;
return *this;
}
CountedReference &operator=(CountedReference &&c) {
A *p{c.p_};
c.p_ = nullptr;
Drop();
p_ = p;
return *this;
}
~CountedReference() { Drop(); }
operator bool() const { return p_ != nullptr; }
type *get() const { return p_; }
type &operator*() const { return *p_; }
type *operator->() const { return p_; }

private:
void Take() const {
if (p_) {
p_->TakeReference();
}
}
void Drop() {
if (p_) {
p_->DropReference();
p_ = nullptr;
}
}

type *p_{nullptr};
};
} // namespace Fortran::common
#endif // FORTRAN_COMMON_REFERENCE_COUNTED_H_
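
A sketch of how the intrusive count behaves, assuming cut-down stand-ins (`RefCounted`, `CountedRef`) instead of the real classes; assignment is omitted, and `Message`, `liveMessages`, and the two helper functions are hypothetical names for the illustration. The counter tracks how many `Message` objects exist so that the deletion can be observed.

```cpp
// Number of Message objects currently alive (for observation only).
int liveMessages{0};

// Stand-in for common::ReferenceCounted: the count lives in the object.
template <typename A> class RefCounted {
public:
  void TakeReference() { ++references_; }
  void DropReference() {
    if (--references_ == 0) {
      delete static_cast<A *>(this);
    }
  }

private:
  int references_{0};
};

// Stand-in for common::CountedReference, without assignment operators.
template <typename A> class CountedRef {
public:
  explicit CountedRef(A *p) : p_{p} { Take(); }
  CountedRef(const CountedRef &that) : p_{that.p_} { Take(); }
  CountedRef &operator=(const CountedRef &) = delete;
  ~CountedRef() {
    if (p_) {
      p_->DropReference();
    }
  }

private:
  void Take() {
    if (p_) {
      p_->TakeReference();
    }
  }
  A *p_;
};

// The base class must be public so that the static_cast above is allowed.
struct Message : public RefCounted<Message> {
  Message() { ++liveMessages; }
  ~Message() { --liveMessages; }
};

// Two references share one object while both are in scope.
int LiveWhileShared() {
  CountedRef<Message> a{new Message};
  CountedRef<Message> b{a};
  return liveMessages;
}

// After both references are gone, the object has been deleted.
int LiveAfterScope() {
  {
    CountedRef<Message> a{new Message};
    CountedRef<Message> b{a};
  }
  return liveMessages;
}
```

Unlike `std::shared_ptr<>`, there is no separate control block: the count is a member of the object, which is why the pointee's class has to derive from the counting base.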
63 changes: 63 additions & 0 deletions flang/include/flang/Common/reference.h
@@ -0,0 +1,63 @@
//===-- include/flang/Common/reference.h ------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

// Implements a better std::reference_wrapper<> template class with
// move semantics, equality testing, and member access.
// Use Reference<A> in place of a real A& reference when assignability is
// required; safer than a bare pointer because it's guaranteed to not be null.

#ifndef FORTRAN_COMMON_REFERENCE_H_
#define FORTRAN_COMMON_REFERENCE_H_
#include <type_traits>
namespace Fortran::common {
template <typename A> class Reference {
public:
using type = A;
Reference(type &x) : p_{&x} {}
Reference(const Reference &that) : p_{that.p_} {}
Reference(Reference &&that) : p_{that.p_} {}
Reference &operator=(const Reference &that) {
p_ = that.p_;
return *this;
}
Reference &operator=(Reference &&that) {
p_ = that.p_;
return *this;
}

// Implicit conversions to references are supported only for
// const-qualified types in order to avoid any pernicious
// creation of a temporary copy in cases like:
// Reference<Type> ref{object};
// const Type &x{ref}; // creates ref to temp copy!
operator std::conditional_t<std::is_const_v<type>, type &, void>()
const noexcept {
if constexpr (std::is_const_v<type>) {
return *p_;
}
}

type &get() const noexcept { return *p_; }
type *operator->() const { return p_; }
type &operator*() const { return *p_; }

bool operator==(std::add_const_t<A> &that) const {
return p_ == &that || *p_ == that;
}
bool operator!=(std::add_const_t<A> &that) const { return !(*this == that); }
bool operator==(const Reference &that) const {
return p_ == that.p_ || *this == *that.p_;
}
bool operator!=(const Reference &that) const { return !(*this == that); }

private:
type *p_; // never null
};
template <typename A> Reference(A &) -> Reference<A>;
} // namespace Fortran::common
#endif
46 changes: 46 additions & 0 deletions flang/include/flang/Common/restorer.h
@@ -0,0 +1,46 @@
//===-- include/flang/Common/restorer.h -------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

// Utility: before overwriting a variable, capture its value and
// ensure that it will be restored when the Restorer goes out of scope.
//
// int x{3};
// {
// auto save{common::ScopedSet(x, 4)};
// // x is now 4
// }
// // x is back to 3

#ifndef FORTRAN_COMMON_RESTORER_H_
#define FORTRAN_COMMON_RESTORER_H_
#include "idioms.h"
#include <utility>
namespace Fortran::common {
template <typename A> class Restorer {
public:
explicit Restorer(A &p) : p_{p}, original_{std::move(p)} {}
~Restorer() { p_ = std::move(original_); }

private:
A &p_;
A original_;
};

template <typename A, typename B>
common::IfNoLvalue<Restorer<A>, B> ScopedSet(A &to, B &&from) {
Restorer<A> result{to};
to = std::move(from);
return result;
}
template <typename A, typename B>
common::IfNoLvalue<Restorer<A>, B> ScopedSet(A &to, const B &from) {
Restorer<A> result{to};
to = from;
return result;
}
} // namespace Fortran::common
#endif // FORTRAN_COMMON_RESTORER_H_
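
The save-and-restore idiom can be sketched independently of the header. The version below is an alternative formulation, not the header's code: `ScopedRestore`, `ScopedAssign`, and `Demo` are hypothetical names, and C++17 is assumed. It captures the old value before constructing the guard and returns the guard as a prvalue, so exactly one object ever restores the variable and the result does not depend on whether the compiler elides a copy of a named local.

```cpp
#include <utility>

// Restores `target` to `original` when destroyed. Copying is disabled so
// that two guards can never restore the same variable.
template <typename A> class ScopedRestore {
public:
  ScopedRestore(A &target, A original)
      : target_{target}, original_{std::move(original)} {}
  ScopedRestore(const ScopedRestore &) = delete;
  ScopedRestore &operator=(const ScopedRestore &) = delete;
  ~ScopedRestore() { target_ = std::move(original_); }

private:
  A &target_;
  A original_;
};

// Saves the current value of `to`, assigns `from`, and returns the guard.
template <typename A, typename B>
ScopedRestore<A> ScopedAssign(A &to, B &&from) {
  A original{std::move(to)};
  to = std::forward<B>(from);
  // Returning a prvalue: C++17 constructs it directly in the caller.
  return ScopedRestore<A>{to, std::move(original)};
}

// Encodes the value seen inside the scope and the value seen afterwards.
int Demo() {
  int x{3};
  int inside{0};
  {
    auto save = ScopedAssign(x, 4);
    inside = x; // x is 4 while the guard is alive
  }
  return inside * 10 + x; // x is back to 3 here
}
```

As with the header's `ScopedSet`, the guard must be bound to a named variable; discarding the return value would restore the original value immediately.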