[TableGen] Eliminate the 'code' type
Update the documentation.

Rework various backends that relied on the code type.

Differential Revision: https://reviews.llvm.org/D92269
Paul C. Anagnostopoulos committed Dec 3, 2020
1 parent 1365718 commit 415fab6
Showing 31 changed files with 278 additions and 285 deletions.
2 changes: 0 additions & 2 deletions clang/utils/TableGen/ClangOptionDocEmitter.cpp
@@ -217,8 +217,6 @@ std::string getRSTStringWithTextFallback(const Record *R, StringRef Primary,
StringRef Value;
if (auto *SV = dyn_cast_or_null<StringInit>(V->getValue()))
Value = SV->getValue();
else if (auto *CV = dyn_cast_or_null<CodeInit>(V->getValue()))
Value = CV->getValue();
if (!Value.empty())
return Field == Primary ? Value.str() : escapeRST(Value);
}
29 changes: 21 additions & 8 deletions llvm/docs/TableGen/BackEnds.rst
@@ -693,8 +693,8 @@ This class provides six fields.
table that holds the entries. If unspecified, the ``FilterClass`` name is
used.

* ``list<string> Fields``. A list of the names of the fields in the
collected records that contain the data for the table entries. The order of
* ``list<string> Fields``. A list of the names of the fields *in the
collected records* that contain the data for the table entries. The order of
this list determines the order of the values in the C++ initializers. See
below for information about the types of these fields.

@@ -706,13 +706,26 @@ This class provides six fields.

* ``bit PrimaryKeyEarlyOut``. See the third example below.

TableGen attempts to deduce the type of each of the table fields. It can
deduce ``bit``, ``bits<n>``, ``string``, ``Intrinsic``, and ``Instruction``.
These can be used in the primary key. TableGen also deduces ``code``, but it
cannot be used in the primary key. Any other field types must be specified
TableGen attempts to deduce the type of each of the table fields so that it
can format the C++ initializers in the emitted table. It can deduce ``bit``,
``bits<n>``, ``string``, ``Intrinsic``, and ``Instruction``. These can be
used in the primary key. Any other field types must be specified
explicitly; this is done as shown in the second example below. Such fields
cannot be used in the primary key.

One special case of the field type has to do with code. Arbitrary code is
represented by a string, but has to be emitted as a C++ initializer without
quotes. If the code field was defined using a code literal (``[{...}]``),
then TableGen will know to emit it without quotes. However, if it was
defined using a string literal or complex string expression, then TableGen
will not know. In this case, you can force TableGen to treat the field as
code by including the following line in the ``GenericTable`` record, where
*xxx* is the code field name.

.. code-block:: text
string TypeOf_xxx = "code";
Here is an example where TableGen can deduce the field types. Note that the
table entry records are anonymous; the names of entry records are
irrelevant.
@@ -793,7 +806,7 @@ pointer if no entry is found.

This example includes a field whose type TableGen cannot deduce. The ``Kind``
field uses the enumerated type ``CEnum`` defined above. To inform TableGen
of the type, the class derived from ``GenericTable`` must include a field
of the type, the record derived from ``GenericTable`` must include a string field
named ``TypeOf_``\ *field*, where *field* is the name of the field whose type
is required.

@@ -802,7 +815,7 @@ is required.
def CTable : GenericTable {
let FilterClass = "CEntry";
let Fields = ["Name", "Kind", "Encoding"];
GenericEnum TypeOf_Kind = CEnum;
string TypeOf_Kind = "CEnum";
let PrimaryKey = ["Encoding"];
let PrimaryKeyName = "lookupCEntryByEncoding";
}
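Note (illustrative, not part of the commit): the BackEnds.rst text above says a field is emitted without quotes when it was defined with a code literal, or when the ``GenericTable`` record carries ``TypeOf_xxx = "code"``. The sketch below models only that decision in standalone C++. `FieldModel` and `emitInitializer` are hypothetical names invented for this note; the real emitter is not shown in this excerpt and may handle escaping and other types differently.

```cpp
#include <string>

// Hypothetical stand-in for one table field; this is not an LLVM type.
struct FieldModel {
  std::string Value;          // the field's string value
  bool FromCodeLiteral;       // true if it was written as [{...}]
  std::string TypeOfOverride; // value of TypeOf_<field>, empty if absent
};

// Returns the text of the C++ initializer for the field.
// Assumption: code is emitted verbatim, anything else is quoted.
// String escaping is deliberately omitted from this sketch.
std::string emitInitializer(const FieldModel &F) {
  bool IsCode = F.FromCodeLiteral || F.TypeOfOverride == "code";
  if (IsCode)
    return F.Value;
  return "\"" + F.Value + "\"";
}
```

Under these assumptions, a field built from a string expression is quoted unless the override is present, which is the case the added documentation paragraph describes.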
18 changes: 3 additions & 15 deletions llvm/docs/TableGen/BackGuide.rst
@@ -287,9 +287,9 @@ value. The static function ``get()`` can be used to obtain the singleton

This class, a subclass of ``Init``, acts as the parent class of the classes
that represent specific value types (except for the unset value). These
classes include ``BitInit``, ``BitsInit``, ``CodeInit``, ``DagInit``,
``DefInit``, ``IntInit``, ``ListInit``, and ``StringInit``. (There are
additional derived types used by the TableGen parser.)
classes include ``BitInit``, ``BitsInit``, ``DagInit``, ``DefInit``,
``IntInit``, ``ListInit``, and ``StringInit``. (There are additional derived
types used by the TableGen parser.)

This class includes a data member that specifies the ``RecTy`` type of the
value. It provides a function to get that ``RecTy`` type.
@@ -330,18 +330,6 @@ The class provides the following additional functions.

* A function that gets a bit specified by an integer index.

``CodeInit``
~~~~~~~~~~~~

The ``CodeInit`` class is a subclass of ``TypedInit``. Its instances
represent arbitrary-length strings produced from ``code`` literals in the
TableGen files. It includes a data member that contains a ``StringRef`` of
the value.

The class provides the usual ``get()`` and ``getValue()`` functions. The
latter function returns the ``StringRef``.


``DagInit``
~~~~~~~~~~~

29 changes: 13 additions & 16 deletions llvm/docs/TableGen/ProgRef.rst
@@ -167,10 +167,11 @@ TableGen has two kinds of string literals:

.. productionlist::
TokString: '"' (non-'"' characters and escapes) '"'
TokCodeFragment: "[{" (shortest text not containing "}]") "}]"
TokCode: "[{" (shortest text not containing "}]") "}]"

A :token:`TokCodeFragment` is nothing more than a multi-line string literal
delimited by ``[{`` and ``}]``. It can break across lines.
A :token:`TokCode` is nothing more than a multi-line string literal
delimited by ``[{`` and ``}]``. It can break across lines and the
line breaks are retained in the string.

The current implementation accepts the following escape sequences::

@@ -254,7 +255,7 @@ high-level types (e.g., ``dag``). This flexibility allows you to describe a
wide range of records conveniently and compactly.

.. productionlist::
Type: "bit" | "int" | "string" | "code" | "dag"
Type: "bit" | "int" | "string" | "dag"
:| "bits" "<" `TokInteger` ">"
:| "list" "<" `Type` ">"
:| `ClassID`
@@ -271,11 +272,6 @@ wide range of records conveniently and compactly.
The ``string`` type represents an ordered sequence of characters of arbitrary
length.

``code``
The ``code`` type represents a code fragment. The values are the same as
those for the ``string`` type; the ``code`` type is provided just to indicate
the programmer's intention.

``bits<``\ *n*\ ``>``
The ``bits`` type is a fixed-sized integer of arbitrary length *n* that
is treated as separate bits. These bits can be accessed individually.
@@ -348,12 +344,12 @@ Simple values
The :token:`SimpleValue` has a number of forms.

.. productionlist::
SimpleValue: `TokInteger` | `TokString`+ | `TokCodeFragment`
SimpleValue: `TokInteger` | `TokString`+ | `TokCode`

A value can be an integer literal, a string literal, or a code fragment
literal. Multiple adjacent string literals are concatenated as in C/C++; the
simple value is the concatenation of the strings. Code fragments become
strings and then are indistinguishable from them.
A value can be an integer literal, a string literal, or a code literal.
Multiple adjacent string literals are concatenated as in C/C++; the simple
value is the concatenation of the strings. Code literals become strings and
are then indistinguishable from them.

.. productionlist::
SimpleValue2: "true" | "false"
@@ -616,14 +612,15 @@ name of a multiclass.

.. productionlist::
Body: ";" | "{" `BodyItem`* "}"
BodyItem: `Type` `TokIdentifier` ["=" `Value`] ";"
BodyItem: (`Type` | "code") `TokIdentifier` ["=" `Value`] ";"
:| "let" `TokIdentifier` ["{" `RangeList` "}"] "=" `Value` ";"
:| "defvar" `TokIdentifier` "=" `Value` ";"

A field definition in the body specifies a field to be included in the class
or record. If no initial value is specified, then the field's value is
uninitialized. The type must be specified; TableGen will not infer it from
the value.
the value. The keyword ``code`` may be used to emphasize that the field
has a string value that is code.

The ``let`` form is used to reset a field to a new value. This can be done
for fields defined directly in the body or fields inherited from
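Note (illustrative, not part of the commit): the ProgRef.rst grammar above defines ``TokCode`` as ``[{``, then the shortest text not containing ``}]``, then ``}]``, with line breaks retained. A minimal standalone C++ sketch of that rule follows. `lexCodeLiteral` is a hypothetical helper written for this note and is not the TableGen lexer, which is not shown in this excerpt.

```cpp
#include <string>

// Hypothetical helper: if Input starts with "[{", store the text up to
// (but not including) the first "}]" in Body and return true.
// Taking the first "}]" gives the "shortest text" the grammar asks for.
bool lexCodeLiteral(const std::string &Input, std::string &Body) {
  if (Input.compare(0, 2, "[{") != 0)
    return false; // not a code literal
  std::string::size_type End = Input.find("}]", 2);
  if (End == std::string::npos)
    return false; // unterminated literal
  // Everything between the delimiters is kept as-is, newlines included.
  Body = Input.substr(2, End - 2);
  return true;
}
```

The result is an ordinary string, which matches the statement above that code literals become strings.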
2 changes: 2 additions & 0 deletions llvm/include/llvm/TableGen/Error.h
@@ -22,6 +22,7 @@ namespace llvm {
void PrintNote(const Twine &Msg);
void PrintNote(ArrayRef<SMLoc> NoteLoc, const Twine &Msg);

LLVM_ATTRIBUTE_NORETURN void PrintFatalNote(const Twine &Msg);
LLVM_ATTRIBUTE_NORETURN void PrintFatalNote(ArrayRef<SMLoc> ErrorLoc,
const Twine &Msg);
LLVM_ATTRIBUTE_NORETURN void PrintFatalNote(const Record *Rec,
@@ -37,6 +38,7 @@ void PrintError(const Twine &Msg);
void PrintError(ArrayRef<SMLoc> ErrorLoc, const Twine &Msg);
void PrintError(const char *Loc, const Twine &Msg);
void PrintError(const Record *Rec, const Twine &Msg);
void PrintError(const RecordVal *RecVal, const Twine &Msg);

LLVM_ATTRIBUTE_NORETURN void PrintFatalError(const Twine &Msg);
LLVM_ATTRIBUTE_NORETURN void PrintFatalError(ArrayRef<SMLoc> ErrorLoc,
87 changes: 22 additions & 65 deletions llvm/include/llvm/TableGen/Record.h
@@ -58,7 +58,6 @@ class RecTy {
enum RecTyKind {
BitRecTyKind,
BitsRecTyKind,
CodeRecTyKind,
IntRecTyKind,
StringRecTyKind,
ListRecTyKind,
@@ -138,24 +137,6 @@ class BitsRecTy : public RecTy {
bool typeIsA(const RecTy *RHS) const override;
};

/// 'code' - Represent a code fragment
class CodeRecTy : public RecTy {
static CodeRecTy Shared;

CodeRecTy() : RecTy(CodeRecTyKind) {}

public:
static bool classof(const RecTy *RT) {
return RT->getRecTyKind() == CodeRecTyKind;
}

static CodeRecTy *get() { return &Shared; }

std::string getAsString() const override { return "code"; }

bool typeIsConvertibleTo(const RecTy *RHS) const override;
};

/// 'int' - Represent an integer value of no particular size
class IntRecTy : public RecTy {
static IntRecTy Shared;
@@ -306,7 +287,6 @@ class Init {
IK_FirstTypedInit,
IK_BitInit,
IK_BitsInit,
IK_CodeInit,
IK_DagInit,
IK_DefInit,
IK_FieldInit,
@@ -597,16 +577,18 @@ class IntInit : public TypedInit {

/// "foo" - Represent an initialization by a string value.
class StringInit : public TypedInit {
//// enum StringFormat {
//// SF_String, // Format as "text"
//// SF_Code, // Format as [{text}]
//// };
public:
enum StringFormat {
SF_String, // Format as "text"
SF_Code, // Format as [{text}]
};

private:
StringRef Value;
//// StringFormat Format;
StringFormat Format;

explicit StringInit(StringRef V)
: TypedInit(IK_StringInit, StringRecTy::get()), Value(V) {}
explicit StringInit(StringRef V, StringFormat Fmt)
: TypedInit(IK_StringInit, StringRecTy::get()), Value(V), Format(Fmt) {}

public:
StringInit(const StringInit &) = delete;
@@ -616,48 +598,25 @@ class StringInit : public TypedInit {
return I->getKind() == IK_StringInit;
}

static StringInit *get(StringRef);

StringRef getValue() const { return Value; }

Init *convertInitializerTo(RecTy *Ty) const override;
static StringInit *get(StringRef, StringFormat Fmt = SF_String);

bool isConcrete() const override { return true; }
std::string getAsString() const override { return "\"" + Value.str() + "\""; }

std::string getAsUnquotedString() const override {
return std::string(Value);
}

Init *getBit(unsigned Bit) const override {
llvm_unreachable("Illegal bit reference off string");
}
};

class CodeInit : public TypedInit {
StringRef Value;

explicit CodeInit(StringRef V)
: TypedInit(IK_CodeInit, static_cast<RecTy *>(CodeRecTy::get())),
Value(V) {}

public:
CodeInit(const StringInit &) = delete;
CodeInit &operator=(const StringInit &) = delete;

static bool classof(const Init *I) {
return I->getKind() == IK_CodeInit;
static StringFormat determineFormat(StringFormat Fmt1, StringFormat Fmt2) {
return (Fmt1 == SF_Code || Fmt2 == SF_Code) ? SF_Code : SF_String;
}

static CodeInit *get(StringRef);

StringRef getValue() const { return Value; }
StringFormat getFormat() const { return Format; }
bool hasCodeFormat() const { return Format == SF_Code; }

Init *convertInitializerTo(RecTy *Ty) const override;

bool isConcrete() const override { return true; }

std::string getAsString() const override {
return "[{" + Value.str() + "}]";
if (Format == SF_String)
return "\"" + Value.str() + "\"";
else
return "[{" + Value.str() + "}]";
}

std::string getAsUnquotedString() const override {
@@ -1438,6 +1397,9 @@ class RecordVal {
/// Get the type of the field value as a RecTy.
RecTy *getType() const { return TyAndPrefix.getPointer(); }

/// Get the type of the field for printing purposes.
std::string getPrintType() const;

/// Get the value of the field as an Init.
Init *getValue() const { return Value; }

@@ -1675,11 +1637,6 @@ class Record {
/// not a string and llvm::Optional() if the field does not exist.
llvm::Optional<StringRef> getValueAsOptionalString(StringRef FieldName) const;

/// This method looks up the specified field and returns
/// its value as a string, throwing an exception if the field if the value is
/// not a code block and llvm::Optional() if the field does not exist.
llvm::Optional<StringRef> getValueAsOptionalCode(StringRef FieldName) const;

/// This method looks up the specified field and returns
/// its value as a BitsInit, throwing an exception if the field does not exist
/// or if the value is not the right type.
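Note (illustrative, not part of the commit): the Record.h hunks above fold the old ``CodeInit`` into ``StringInit`` by adding a ``StringFormat`` tag, a ``determineFormat`` helper, and a format-aware ``getAsString``. The standalone C++ model below mirrors those three pieces so their interaction can be read in isolation. `StringModel` and `concat` are hypothetical names for this note. That `determineFormat` is used when two strings are combined is an assumption, since the call sites are not part of this excerpt.

```cpp
#include <string>

// Standalone model; the enum and function names mirror the diff,
// but nothing here is the LLVM class itself.
enum StringFormat {
  SF_String, // Format as "text"
  SF_Code    // Format as [{text}]
};

// The result has code format if either operand has it.
StringFormat determineFormat(StringFormat Fmt1, StringFormat Fmt2) {
  return (Fmt1 == SF_Code || Fmt2 == SF_Code) ? SF_Code : SF_String;
}

struct StringModel {
  std::string Value;
  StringFormat Format;

  // Print with the delimiters that match the stored format.
  std::string getAsString() const {
    if (Format == SF_String)
      return "\"" + Value + "\"";
    return "[{" + Value + "}]";
  }
};

// Hypothetical helper: how a concatenation might carry the format along.
StringModel concat(const StringModel &A, const StringModel &B) {
  return {A.Value + B.Value, determineFormat(A.Format, B.Format)};
}
```

In this model the value is always a plain string and the format only affects printing, which is consistent with the type remaining ``StringRecTy`` in the new constructor.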
22 changes: 13 additions & 9 deletions llvm/include/llvm/TableGen/SearchableTable.td
@@ -67,9 +67,13 @@ class GenericTable {
// List of the names of fields of collected records that contain the data for
// table entries, in the order that is used for initialization in C++.
//
// For each field of the table named XXX, TableGen will look for a value
// called TypeOf_XXX and use that as a more detailed description of the
// type of the field if present. This is required for fields whose type
// TableGen needs to know the type of the fields so that it can format
// the initializers correctly. It can infer the type of bit, bits, string,
// Intrinsic, and Instruction values.
//
// For each field of the table named xxx, TableGen will look for a field
// named TypeOf_xxx and use that as a more detailed description of the
// type of the field. This is required for fields whose type
// cannot be deduced automatically, such as enum fields. For example:
//
// def MyEnum : GenericEnum {
@@ -85,15 +89,15 @@ class GenericTable {
// def MyTable : GenericTable {
// let FilterClass = "MyTableEntry";
// let Fields = ["V", ...];
// GenericEnum TypeOf_V = MyEnum;
// string TypeOf_V = "MyEnum";
// }
//
// Fields of type bit, bits<N>, string, Intrinsic, and Instruction (or
// derived classes of those) are supported natively.
// If a string field was initialized with a code literal, TableGen will
// emit the code verbatim. However, if a string field was initialized
// in some other way, but should be interpreted as code, then a TypeOf_xxx
// field is necessary, with a value of "code":
//
// Additionally, fields of type `code` can appear, where the value is used
// verbatim as an initializer. However, these fields cannot be used as
// search keys.
// string TypeOf_Predicate = "code";
list<string> Fields;

// (Optional) List of fields that make up the primary key.