[TableGen][DecoderEmitter] Add extractBits() overloads #159405
@llvm/pr-subscribers-llvm-mc
Author: Sergei Barannikov (s-barannikov)
Changes
These differ from `fieldFromInstruction()` in that StartBit and NumBits are template parameters. Using them in the generated code significantly speeds up compilation in release builds (up to 4 times faster, depending on the compiler used).
Full diff: https://github.com/llvm/llvm-project/pull/159405.diff
2 Files Affected:
diff --git a/llvm/include/llvm/MC/MCDecoder.h b/llvm/include/llvm/MC/MCDecoder.h
index 175f6a9591558..770941da5f31c 100644
--- a/llvm/include/llvm/MC/MCDecoder.h
+++ b/llvm/include/llvm/MC/MCDecoder.h
@@ -58,6 +58,24 @@ uint64_t fieldFromInstruction(const std::bitset<N> &Insn, unsigned StartBit,
return ((Insn >> StartBit) & Mask).to_ullong();
}
+template <unsigned StartBit, unsigned NumBits, typename T>
+inline std::enable_if_t<std::is_unsigned_v<T>, T> extractBits(T Val) {
+ static_assert(StartBit + NumBits <= std::numeric_limits<T>::digits);
+ return (Val >> StartBit) & maskTrailingOnes<T>(NumBits);
+}
+
+template <unsigned StartBit, unsigned NumBits, size_t N>
+uint64_t extractBits(const std::bitset<N> &Val) {
+ static_assert(StartBit + NumBits <= N);
+ std::bitset<N> Mask = maskTrailingOnes<uint64_t>(NumBits);
+ return ((Val >> StartBit) & Mask).to_ullong();
+}
+
+template <unsigned StartBit, unsigned NumBits>
+uint64_t extractBits(const APInt &Val) {
+ return Val.extractBitsAsZExtValue(NumBits, StartBit);
+}
+
} // namespace llvm::MCD
#endif // LLVM_MC_MCDECODER_H
diff --git a/llvm/utils/TableGen/DecoderEmitter.cpp b/llvm/utils/TableGen/DecoderEmitter.cpp
index 3a464e01042dc..5a087eefaaf8d 100644
--- a/llvm/utils/TableGen/DecoderEmitter.cpp
+++ b/llvm/utils/TableGen/DecoderEmitter.cpp
@@ -1031,8 +1031,7 @@ static void emitBinaryParser(raw_ostream &OS, indent Indent,
// One variable part and no/zero constant part. Initialize `tmp` with the
// variable part.
auto [Base, Width, Offset] = OpInfo.fields().front();
- OS << Indent << "tmp = fieldFromInstruction(insn, " << Base << ", " << Width
- << ')';
+ OS << Indent << "tmp = extractBits<" << Base << ", " << Width << ">(insn)";
if (Offset)
OS << " << " << Offset;
OS << ";\n";
@@ -1042,8 +1041,8 @@ static void emitBinaryParser(raw_ostream &OS, indent Indent,
OS << Indent << "tmp = " << format_hex(OpInfo.InitValue.value_or(0), 0)
<< ";\n";
for (auto [Base, Width, Offset] : OpInfo.fields()) {
- OS << Indent << "tmp |= fieldFromInstruction(insn, " << Base << ", "
- << Width << ')';
+ OS << Indent << "tmp |= extractBits<" << Base << ", " << Width
+ << ">(insn)";
if (Offset)
OS << " << " << Offset;
OS << ";\n";
|
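To make the emitter change in the diff concrete, here is a minimal standalone sketch of the text the multi-field path would write after this change. It is not the real `emitBinaryParser`: `std::ostringstream` stands in for `raw_ostream`, the `Field` struct and the `emitFieldExtraction` name are hypothetical, and the initial value is simplified to a literal `0x0`.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one (Base, Width, Offset) entry of OpInfo.fields().
struct Field {
  unsigned Base;   // first bit of the field inside the instruction word
  unsigned Width;  // number of bits in the field
  unsigned Offset; // left shift applied when placing the field into the operand
};

// Produces the generated-code text for an operand assembled from several
// encoding fields, using the template-parameter call syntax from the diff.
std::string emitFieldExtraction(const std::vector<Field> &Fields,
                                const std::string &Indent) {
  std::ostringstream OS;
  OS << Indent << "tmp = 0x0;\n"; // simplified: constant part assumed zero
  for (const Field &F : Fields) {
    OS << Indent << "tmp |= extractBits<" << F.Base << ", " << F.Width
       << ">(insn)";
    if (F.Offset)
      OS << " << " << F.Offset;
    OS << ";\n";
  }
  return OS.str();
}
```

For two fields `{0, 5, 0}` and `{12, 4, 5}`, the sketch emits `tmp |= extractBits<0, 5>(insn);` followed by `tmp |= extractBits<12, 4>(insn) << 5;`, where the old emitter would have written `fieldFromInstruction(insn, 0, 5)` and `fieldFromInstruction(insn, 12, 4)`.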
These differ from `fieldFromInstruction()` in that StartBit and NumBits are template parameters. Using them in the generated code significantly speeds up compilation in release builds (up to 4 times faster, depending on the compiler used).
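The difference between the two calling styles can be sketched outside LLVM as follows. This is an illustration under stated assumptions, not the code in `MCDecoder.h`: `lowBitsMask` is a hypothetical stand-in for `maskTrailingOnes`, and `extractBitsRuntime` and `extractBitsSketch` are illustrative names covering only the unsigned-integer case.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for llvm::maskTrailingOnes: sets the low NumBits bits.
// Assumes NumBits does not exceed the width of T.
template <typename T> constexpr T lowBitsMask(unsigned NumBits) {
  static_assert(std::is_unsigned_v<T>, "unsigned types only");
  constexpr unsigned Digits = std::numeric_limits<T>::digits;
  // Cast before shifting so narrow types are not shifted as a negative int.
  return NumBits == 0
             ? T(0)
             : static_cast<T>(static_cast<T>(~T(0)) >> (Digits - NumBits));
}

// Runtime-parameter style, as with fieldFromInstruction(): the field position
// is an ordinary argument, so a bad range cannot be rejected at compile time.
template <typename T>
constexpr T extractBitsRuntime(T Val, unsigned StartBit, unsigned NumBits) {
  return static_cast<T>((Val >> StartBit) & lowBitsMask<T>(NumBits));
}

// Template-parameter style, as with the new extractBits() overloads: the
// field position is part of the type, so the range check is a static_assert.
template <unsigned StartBit, unsigned NumBits, typename T>
constexpr std::enable_if_t<std::is_unsigned_v<T>, T> extractBitsSketch(T Val) {
  static_assert(StartBit + NumBits <= std::numeric_limits<T>::digits,
                "field does not fit in T");
  return static_cast<T>((Val >> StartBit) & lowBitsMask<T>(NumBits));
}

// Both styles agree on the value; only the call syntax and checking differ.
static_assert(extractBitsSketch<4, 8>(std::uint32_t{0xABCD1234}) == 0x23);
static_assert(extractBitsRuntime<std::uint32_t>(0xABCD1234u, 4, 8) == 0x23);
```

With the template form, a call such as `extractBitsSketch<30, 8>(std::uint32_t{0})` fails to compile, whereas the runtime form would accept the same arguments.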
7ddc9c1 to 07c9725
I'll have to drop |
Any idea why compilation is faster? I'd expect templated code to slow down compilation. |
I've seen InstCombine, SimplifyCFG, RegisterCoalescer and TwoAddressInstructionPass in the time report. I'll try to collect more data. |
Nah, I was just curious. Trying this on our downstream target is not that easy, as we are currently quite a bit behind upstream. |
LGTM
Sorry, I can no longer reproduce the announced speedup. There is a difference, but it is small. |
It looks like I compared with and without the |