[FileCheck] Add precision to format specifier
Add a printf-style precision specifier that pads a number with leading
zeros to a given number of digits when matching it, if the value has
fewer digits than the precision. This works both with an empty numeric
expression (e.g. a variable definition from input) and when matching a
numeric expression. The syntax is as follows:

[[#%.<precision><format specifier>, ...]]

where <format specifier> is optional and ... is an optional variable
definition, an optional numeric expression, or both. In the absence of a
precision specifier, a variable definition will accept leading zeros.

Reviewed By: jhenderson, grimar

Differential Revision: https://reviews.llvm.org/D81667
Thomas Preud'homme committed Aug 30, 2020
1 parent 719548d commit 998709b
Showing 5 changed files with 329 additions and 116 deletions.
76 changes: 46 additions & 30 deletions llvm/docs/CommandGuide/FileCheck.rst
@@ -730,35 +730,60 @@ numeric expression constraint based on those variables via a numeric
substitution. This allows ``CHECK:`` directives to verify a numeric relation
between two numbers, such as the need for consecutive registers to be used.

The syntax to define a numeric variable is ``[[#%<fmtspec>,<NUMVAR>:]]`` where:
The syntax to capture a numeric value is
``[[#%<fmtspec>,<NUMVAR>:]]`` where:

* ``%<fmtspec>`` is an optional scanf-style matching format specifier to
indicate what number format to match (e.g. hex number). Currently accepted
format specifiers are ``%u``, ``%d``, ``%x`` and ``%X``. If absent, the
format specifier defaults to ``%u``.
* ``%<fmtspec>,`` is an optional format specifier to indicate what number
format to match and the minimum number of digits to expect.

* ``<NUMVAR>:`` is an optional definition of variable ``<NUMVAR>`` from the
captured value.

The syntax of ``<fmtspec>`` is: ``.<precision><conversion specifier>`` where:

* ``.<precision>`` is an optional printf-style precision specifier in which
``<precision>`` indicates the minimum number of digits that the value matched
must have, expecting leading zeros if needed.

* ``<conversion specifier>`` is an optional scanf-style conversion specifier
to indicate what number format to match (e.g. hex number). Currently
accepted conversion specifiers are ``%u``, ``%d``, ``%x`` and ``%X``. If
absent, the conversion specifier defaults to ``%u``.

* ``<NUMVAR>`` is the name of the numeric variable to define to the matching
value.

For example:

.. code-block:: llvm
; CHECK: mov r[[#REG:]], 0x[[#%X,IMM:]]
; CHECK: mov r[[#REG:]], 0x[[#%.8X,ADDR:]]
would match ``mov r5, 0xF0F0`` and set ``REG`` to the value ``5`` and ``IMM``
to the value ``0xF0F0``.
would match ``mov r5, 0x0000FEFE`` and set ``REG`` to the value ``5`` and
``ADDR`` to the value ``0xFEFE``. Note that due to the precision it would fail
to match ``mov r5, 0xFEFE``.
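The effect of the precision on matching can be sketched outside FileCheck. The snippet below is only an illustration under stated assumptions: ``hexUpperWildcard`` and ``matchesHexUpper`` are hypothetical helper names, ``std::regex`` stands in for ``llvm::Regex``, and the whole string is matched rather than searched for within a line. The pattern shape is the one this commit builds for ``%X`` with a precision.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not part of FileCheck): builds the wildcard pattern for
// an upper-case hex value, following the shape used in this commit.
// A Precision of 0 means that no precision was specified.
std::string hexUpperWildcard(unsigned Precision) {
  if (Precision == 0)
    return "[0-9A-F]+";
  // Optional extra digits that must not start with a zero, followed by
  // exactly Precision digits.
  return "([1-9A-F][0-9A-F]*)?[0-9A-F]{" + std::to_string(Precision) + "}";
}

// Returns true if Text, taken as a whole, is accepted by the wildcard.
// Assumption: std::regex (ECMAScript grammar) behaves like llvm::Regex for
// this simple pattern.
bool matchesHexUpper(const std::string &Text, unsigned Precision) {
  return std::regex_match(Text, std::regex(hexUpperWildcard(Precision)));
}
```

With a precision of 8, ``matchesHexUpper("0000FEFE", 8)`` is true while ``matchesHexUpper("FEFE", 8)`` is false. A value with more than 8 digits is still accepted as long as it has no leading zero.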

The syntax of a numeric substitution is
``[[#%<fmtspec>: <constraint> <expr>]]`` where:
As a result of the numeric variable definition being optional, it is possible
to only check that a numeric value is present in a given format. This can be
useful when the value itself is not useful, for instance:

* ``%<fmtspec>`` is the same matching format specifier as for defining numeric
variables but acting as a printf-style format to indicate how a numeric
expression value should be matched against. If absent, the format specifier
is inferred from the matching format of the numeric variable(s) used by the
expression constraint if any, and defaults to ``%u`` if no numeric variable
is used. In case of conflict between matching formats of several numeric
variables the format specifier is mandatory.
.. code-block:: gas
; CHECK-NOT: mov r0, r[[#]]
to check that a value is synthesized rather than moved around.


The syntax of a numeric substitution is
``[[#%<fmtspec>, <constraint> <expr>]]`` where:

* ``<fmtspec>`` is the same format specifier as for defining a variable but
in this context indicating how a numeric expression value should be matched
against. If absent, both components of the format specifier are inferred from
the matching format of the numeric variable(s) used by the expression
constraint if any, and default to ``%u`` if no numeric variable is used,
denoting that the value should be unsigned with no leading zeros. In case of
conflict between format specifiers of several numeric variables, the
conversion specifier becomes mandatory but the precision specifier remains
optional.

* ``<constraint>`` is the constraint describing how the value to match must
relate to the value of the numeric expression. The only currently accepted
@@ -824,20 +849,11 @@ but would not match the text:
Due to ``7`` being unequal to ``5 + 1`` and ``a0463443`` being unequal to
``a0463440 + 7``.

The syntax also supports an empty expression, equivalent to writing {{[0-9]+}},
for cases where the input must contain a numeric value but the value itself
does not matter:

.. code-block:: gas
; CHECK-NOT: mov r0, r[[#]]
to check that a value is synthesized rather than moved around.

A numeric variable can also be defined to the result of a numeric expression,
in which case the numeric expression constraint is checked and if verified the
variable is assigned to the value. The unified syntax for both defining numeric
variables and checking a numeric expression is thus
variable is assigned to the value. The unified syntax for both checking a
numeric expression and capturing its value into a numeric variable is thus
``[[#%<fmtspec>,<NUMVAR>: <constraint> <expr>]]`` with each element as
described previously. One can use this syntax to make a testcase more
self-describing by using variables instead of values:
132 changes: 91 additions & 41 deletions llvm/lib/Support/FileCheck.cpp
@@ -43,16 +43,28 @@ StringRef ExpressionFormat::toString() const {
llvm_unreachable("unknown expression format");
}

Expected<StringRef> ExpressionFormat::getWildcardRegex() const {
Expected<std::string> ExpressionFormat::getWildcardRegex() const {
auto CreatePrecisionRegex = [this](StringRef S) {
return (S + Twine('{') + Twine(Precision) + "}").str();
};

switch (Value) {
case Kind::Unsigned:
return StringRef("[0-9]+");
if (Precision)
return CreatePrecisionRegex("([1-9][0-9]*)?[0-9]");
return std::string("[0-9]+");
case Kind::Signed:
return StringRef("-?[0-9]+");
if (Precision)
return CreatePrecisionRegex("-?([1-9][0-9]*)?[0-9]");
return std::string("-?[0-9]+");
case Kind::HexUpper:
return StringRef("[0-9A-F]+");
if (Precision)
return CreatePrecisionRegex("([1-9A-F][0-9A-F]*)?[0-9A-F]");
return std::string("[0-9A-F]+");
case Kind::HexLower:
return StringRef("[0-9a-f]+");
if (Precision)
return CreatePrecisionRegex("([1-9a-f][0-9a-f]*)?[0-9a-f]");
return std::string("[0-9a-f]+");
default:
return createStringError(std::errc::invalid_argument,
"trying to match value with invalid format");
@@ -61,27 +73,47 @@ Expected<StringRef> ExpressionFormat::getWildcardRegex() const {

Expected<std::string>
ExpressionFormat::getMatchingString(ExpressionValue IntegerValue) const {
uint64_t AbsoluteValue;
StringRef SignPrefix = IntegerValue.isNegative() ? "-" : "";

if (Value == Kind::Signed) {
Expected<int64_t> SignedValue = IntegerValue.getSignedValue();
if (!SignedValue)
return SignedValue.takeError();
return itostr(*SignedValue);
if (*SignedValue < 0)
AbsoluteValue = cantFail(IntegerValue.getAbsolute().getUnsignedValue());
else
AbsoluteValue = *SignedValue;
} else {
Expected<uint64_t> UnsignedValue = IntegerValue.getUnsignedValue();
if (!UnsignedValue)
return UnsignedValue.takeError();
AbsoluteValue = *UnsignedValue;
}

Expected<uint64_t> UnsignedValue = IntegerValue.getUnsignedValue();
if (!UnsignedValue)
return UnsignedValue.takeError();
std::string AbsoluteValueStr;
switch (Value) {
case Kind::Unsigned:
return utostr(*UnsignedValue);
case Kind::Signed:
AbsoluteValueStr = utostr(AbsoluteValue);
break;
case Kind::HexUpper:
return utohexstr(*UnsignedValue, /*LowerCase=*/false);
case Kind::HexLower:
return utohexstr(*UnsignedValue, /*LowerCase=*/true);
AbsoluteValueStr = utohexstr(AbsoluteValue, Value == Kind::HexLower);
break;
default:
return createStringError(std::errc::invalid_argument,
"trying to match value with invalid format");
}

if (Precision > AbsoluteValueStr.size()) {
unsigned LeadingZeros = Precision - AbsoluteValueStr.size();
return (Twine(SignPrefix) + std::string(LeadingZeros, '0') +
AbsoluteValueStr)
.str();
}

return (Twine(SignPrefix) + AbsoluteValueStr).str();
}
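
The padding step at the end of ``getMatchingString`` can be illustrated in isolation. This is a minimal sketch, not the FileCheck implementation: ``formatSigned`` is a hypothetical name, it handles only the decimal case, and it uses plain ``int64_t`` instead of ``ExpressionValue``.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the padding step: renders Value in decimal and
// left-pads the digits with zeros up to Precision, keeping any sign in front
// of the zeros.
std::string formatSigned(int64_t Value, unsigned Precision) {
  bool Negative = Value < 0;
  // Compute the magnitude in unsigned arithmetic so that INT64_MIN does not
  // overflow.
  uint64_t Magnitude = Negative ? 0 - static_cast<uint64_t>(Value)
                                : static_cast<uint64_t>(Value);
  std::string Digits = std::to_string(Magnitude);
  std::string Padding;
  if (Precision > Digits.size())
    Padding = std::string(Precision - Digits.size(), '0');
  return std::string(Negative ? "-" : "") + Padding + Digits;
}
```

For instance, ``formatSigned(-5, 3)`` gives ``-005`` and ``formatSigned(1234, 3)`` gives ``1234``, since the precision is a minimum number of digits and never truncates.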

Expected<ExpressionValue>
@@ -720,41 +752,59 @@ Expected<std::unique_ptr<Expression>> Pattern::parseNumericSubstitutionBlock(
StringRef DefExpr = StringRef();
DefinedNumericVariable = None;
ExpressionFormat ExplicitFormat = ExpressionFormat();
unsigned Precision = 0;

// Parse format specifier (NOTE: ',' is also an argument separator).
size_t FormatSpecEnd = Expr.find(',');
size_t FunctionStart = Expr.find('(');
if (FormatSpecEnd != StringRef::npos && FormatSpecEnd < FunctionStart) {
Expr = Expr.ltrim(SpaceChars);
if (!Expr.consume_front("%"))
StringRef FormatExpr = Expr.take_front(FormatSpecEnd);
Expr = Expr.drop_front(FormatSpecEnd + 1);
FormatExpr = FormatExpr.trim(SpaceChars);
if (!FormatExpr.consume_front("%"))
return ErrorDiagnostic::get(
SM, Expr, "invalid matching format specification in expression");

// Check for unknown matching format specifier and set matching format in
// class instance representing this expression.
SMLoc fmtloc = SMLoc::getFromPointer(Expr.data());
switch (popFront(Expr)) {
case 'u':
ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::Unsigned);
break;
case 'd':
ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::Signed);
break;
case 'x':
ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::HexLower);
break;
case 'X':
ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::HexUpper);
break;
default:
return ErrorDiagnostic::get(SM, fmtloc,
"invalid format specifier in expression");
SM, FormatExpr,
"invalid matching format specification in expression");

// Parse precision.
if (FormatExpr.consume_front(".")) {
if (FormatExpr.consumeInteger(10, Precision))
return ErrorDiagnostic::get(SM, FormatExpr,
"invalid precision in format specifier");
}

Expr = Expr.ltrim(SpaceChars);
if (!Expr.consume_front(","))
if (!FormatExpr.empty()) {
// Check for unknown matching format specifier and set matching format in
// class instance representing this expression.
SMLoc FmtLoc = SMLoc::getFromPointer(FormatExpr.data());
switch (popFront(FormatExpr)) {
case 'u':
ExplicitFormat =
ExpressionFormat(ExpressionFormat::Kind::Unsigned, Precision);
break;
case 'd':
ExplicitFormat =
ExpressionFormat(ExpressionFormat::Kind::Signed, Precision);
break;
case 'x':
ExplicitFormat =
ExpressionFormat(ExpressionFormat::Kind::HexLower, Precision);
break;
case 'X':
ExplicitFormat =
ExpressionFormat(ExpressionFormat::Kind::HexUpper, Precision);
break;
default:
return ErrorDiagnostic::get(SM, FmtLoc,
"invalid format specifier in expression");
}
}

FormatExpr = FormatExpr.ltrim(SpaceChars);
if (!FormatExpr.empty())
return ErrorDiagnostic::get(
SM, Expr, "invalid matching format specification in expression");
SM, FormatExpr,
"invalid matching format specification in expression");
}
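
The order of the parsing steps above (``%``, then an optional ``.<precision>``, then an optional conversion character, then nothing else) can be sketched with standard C++ only. The names ``ParsedFormat`` and ``parseFormatSpec`` are hypothetical, whitespace trimming and diagnostics are left out, and precision overflow is not checked, so treat this as an approximation of the control flow rather than the actual parser.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; FileCheck uses ExpressionFormat.
struct ParsedFormat {
  unsigned Precision = 0; // 0 means that no precision was given
  char Conversion = '\0'; // '\0' means that no conversion character was given
};

// Parses "%", an optional ".<digits>" and an optional conversion character
// (u, d, x or X) from Spec. Returns false on malformed input.
bool parseFormatSpec(const std::string &Spec, ParsedFormat &Out) {
  Out = ParsedFormat();
  std::size_t Pos = 0;
  if (Spec.empty() || Spec[Pos] != '%')
    return false;
  ++Pos;

  // Optional precision: a '.' that must be followed by at least one digit.
  if (Pos < Spec.size() && Spec[Pos] == '.') {
    ++Pos;
    std::size_t DigitsStart = Pos;
    while (Pos < Spec.size() &&
           std::isdigit(static_cast<unsigned char>(Spec[Pos]))) {
      Out.Precision = Out.Precision * 10 + static_cast<unsigned>(Spec[Pos] - '0');
      ++Pos;
    }
    if (Pos == DigitsStart)
      return false;
  }

  // Optional conversion character.
  if (Pos < Spec.size()) {
    char C = Spec[Pos++];
    if (C != 'u' && C != 'd' && C != 'x' && C != 'X')
      return false;
    Out.Conversion = C;
  }

  // Anything left after the conversion character is an error.
  return Pos == Spec.size();
}
```

Under these assumptions ``%.8X`` yields a precision of 8 with conversion ``X``, ``%.3`` yields a precision of 3 with no conversion, and ``%.X`` is rejected because the ``.`` is not followed by a number.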

// Save variable definition expression if any.
@@ -814,7 +864,7 @@ Expected<std::unique_ptr<Expression>> Pattern::parseNumericSubstitutionBlock(
Format = *ImplicitFormat;
}
if (!Format)
Format = ExpressionFormat(ExpressionFormat::Kind::Unsigned);
Format = ExpressionFormat(ExpressionFormat::Kind::Unsigned, Precision);

std::unique_ptr<Expression> ExpressionPointer =
std::make_unique<Expression>(std::move(ExpressionASTPointer), Format);
@@ -948,7 +998,7 @@ bool Pattern::parsePattern(StringRef PatternStr, StringRef Prefix,
bool IsLegacyLineExpr = false;
StringRef DefName;
StringRef SubstStr;
StringRef MatchRegexp;
std::string MatchRegexp;
size_t SubstInsertIdx = RegExStr.size();

// Parse string variable or legacy @LINE expression.
@@ -992,7 +1042,7 @@ bool Pattern::parsePattern(StringRef PatternStr, StringRef Prefix,
return true;
}
DefName = Name;
MatchRegexp = MatchStr;
MatchRegexp = MatchStr.str();
} else {
if (IsPseudo) {
MatchStr = OrigMatchStr;
20 changes: 12 additions & 8 deletions llvm/lib/Support/FileCheckImpl.h
@@ -53,15 +53,17 @@ struct ExpressionFormat {

private:
Kind Value;
unsigned Precision = 0;

public:
/// Evaluates a format to true if it can be used in a match.
explicit operator bool() const { return Value != Kind::NoFormat; }

/// Define format equality: formats are equal if neither is NoFormat and
/// their kinds are the same.
/// their kinds and precision are the same.
bool operator==(const ExpressionFormat &Other) const {
return Value != Kind::NoFormat && Value == Other.Value;
return Value != Kind::NoFormat && Value == Other.Value &&
Precision == Other.Precision;
}

bool operator!=(const ExpressionFormat &Other) const {
@@ -76,12 +78,14 @@ struct ExpressionFormat {
StringRef toString() const;

ExpressionFormat() : Value(Kind::NoFormat){};
explicit ExpressionFormat(Kind Value) : Value(Value){};

/// \returns a wildcard regular expression StringRef that matches any value
/// in the format represented by this instance, or an error if the format is
/// NoFormat.
Expected<StringRef> getWildcardRegex() const;
explicit ExpressionFormat(Kind Value) : Value(Value), Precision(0){};
explicit ExpressionFormat(Kind Value, unsigned Precision)
: Value(Value), Precision(Precision){};

/// \returns a wildcard regular expression string that matches any value in
/// the format represented by this instance and no other value, or an error
/// if the format is NoFormat.
Expected<std::string> getWildcardRegex() const;

/// \returns the string representation of \p Value in the format represented
/// by this instance, or an error if conversion to this format failed or the