-
Notifications
You must be signed in to change notification settings - Fork 14.4k
Minimal unwinding information (DWARF CFI) checker #145633
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…mple CFI analysis
…/Write conditions
- Changing the register and not informing it using cfi directives - Changing the CFA and report it wrong - Load a register from another register stored location
/// way to modify it is to append a new row by updating it with a CFI directive, | ||
/// and the only way to read from it is to read the last row (i.e., current CFI | ||
/// state) from the table. The fetched row is constant and should not be | ||
/// modified or deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding all the comments!
This comment still leaves me confused about why it's necessary to store a table of rows at all. Why doesn't this class just store a single dwarf::UnwindRow
(or perhaps an optional
one), update it in place in update()
, and return it by value from getCurrentUnwindRow()
? What is the purpose of storing the whole history of previous rows, if there's no API to get them out again, and nothing inside the class implementation needs to refer to them to compute the new rows?
Perhaps you have an intention for future development which will make use of the stored history? If so, that seems like a thing to put in the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the main reason for the previous iterator
design was that when reporting the error to the user, we want to be able to refer to the exact directive that caused this error. To do that, the only way is to iterate the table from the last row to the current row, one by one, find the row that applies the change, and report the error to the corresponding directive.
But for now, in this version, errors are reported to the instruction. I can:
- Try to figure out a way to report the error to the directive: The logic is not very simple. To do so, I need to add a row <-> directive mapping. Additionally, between the last row and the current row, a value may change multiple times, which means we must choose a policy to determine which changes correspond to the reporting changes.
- Store a single
shared_pointer<UnwindRow>
as the current row, the analysis can read it, and it will be freed when no one has access to it. Aside from using smart pointers, I don't see a way to properly implement this.
Let me know which one I should pursue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see; so you don't need to keep all the rows right back to the start of the function, only the ones between the PrevRow
and NextRow
rows in DWARFCFIAnalysis::update
?
So it would be enough to give DWARFCFIState
a method to nearly empty its vector (reducing it to just one entry which was previously the final entry). Then DWARFCFIAnalysis::update
could start by calling that. And now the size of DWARFCFIState
doesn't grow in proportion to the size of the whole function, only the number of table rows generated by a single directive at a time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But whichever approach you take, if you're storing information in a vector which is not currently used for anything, please write down in a comment why you're storing it at all, and what it's intended to be used for in future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
The state now only stores the last row, and the other rows are copied instead of being shared with a smart pointer.
Sharing through a pointer might save one copy (which may be optimized away by the compiler), and that one extra copy is OK to have. Each row size is linear to the number of register bytes.
Analyzer(std::move(Analyzer)), LastDirectiveIndex(0) { | ||
assert(this->Analyzer && "Analyzer should not be null"); | ||
Receiver(std::move(Receiver)), LastDirectiveIndex(0) { | ||
assert(this->Receiver && "Analyzer should not be null"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the rename of Analyzer
to the more purpose-neutral Receiver
, but you left one unchanged in the assert failure message here 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
} | ||
|
||
// Each case is annotated with its corresponding number as described in the | ||
// `DWARFCFIAnalysis`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add a reference to the specific file, please? "As described in llvm/include/llvm/DWARFCFIChecker/DWARFCFIAnalysis.h
." If you're going to carefully map the code here to the comment in another file, it should be as easy as possible for someone to find that comment to compare against.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed, I guess this was the only place I referenced another file documentation. Let me know if there is another place that I can fix.
// Case 2 | ||
if (PrevLoc.getLocation() != NextLoc.getLocation()) { // Case 2.a | ||
Context->reportWarning(Inst.getLoc(), | ||
formatv("uncheckable change happened to register " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A small wording point: could you make sure this checker takes the blame for not being able to check this?
The current wording is ambiguous enough that a user might think it's their own fault. "Is my change uncheckable? Should I have written a different one to make it easier to check?" But in fact many of these changes are checkable in principle, it's just that this particular checker doesn't have the ability to check them yet.
I think wording along the lines of "checking correctness of changes to register unwinding rules is not yet implemented" would communicate more clearly what the user needs to know:
- it's not my fault
- there's nothing I can change in my code to make this warning go away
- the checker isn't telling me my rule is wrong, just that it doesn't know how to make sure it's right
- maybe in a future version this will be better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
/// way to modify it is to append a new row by updating it with a CFI directive, | ||
/// and the only way to read from it is to read the last row (i.e., current CFI | ||
/// state) from the table. The fetched row is constant and should not be | ||
/// modified or deleted. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But whichever approach you take, if you're storing information in a vector which is not currently used for anything, please write down in a comment why you're storing it at all, and what it's intended to be used for in future.
…m#145804) All these changes are being used in [PR#145633](llvm#145633) `CFIProgram`: - `addInstruction` methods already exists, but more convenient ones are private, this PR makes them public `UnwindLocation`: - Added a field accessor method for `Dereference` like other field access methods.
llvm/tools/llvm-mc/llvm-mc.cpp
Outdated
if (FileType == OFT_AssemblyFile) { | ||
if (ValidateCFI) { | ||
assert(FileType == OFT_Null); | ||
auto *FFS = new CFIFunctionFrameStreamer(Ctx, std::move(FFA)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
prefer make_unique here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
… class definition
…ammer's fault and the validation is not implemented yet. Change the tests accordingly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for all the fixes!
I've run out of things to complain about now 🙂 but I see that Petr has just nominated a couple more reviewers, so please give those people time to have thoughts as well.
/// instantiates a `DWARFCFIAnalysis` for each frame. The errors/warnings are | ||
/// emitted through the `MCContext` instance to the constructor. If a frame | ||
/// finishes without being started or if all the frames are not finished before | ||
/// this classes is destructured, the program fails through an assertion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"destructed".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
|
||
void CFIFunctionFrameStreamer::emitInstruction(const MCInst &Inst, | ||
const MCSubtargetInfo &STI) { | ||
updateReceiver(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels weird to me. updateReceiver works on LastInstruction, but not the instruction that came via emitInstruction. So updateReceiver defers analyzing Inst until the next call. It seems likely that there is a good reason for this, but I can't tell what it is.
At the very least this needs some kind of comment explaining what is going on. I might even have updateReciever take the instruction to analyze as a parameter, which makes things even more clear that current instruction is being deferred.
By my count, updateReciever has three call sites, and I can't tell off the top of my head which ones I would expect LastInstruction to be not present. I think two? Given that it is only used at the end of the function, perhaps it makes sense to split that portion into a new function that is called more selectively? (But perhaps that isn't possible.
In any event. Needs comments. Especially on why the very last LastInstruction will get analyzed even though it is deferred here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks to your comment, I was able to find a bug/design choice in MCStreamer
that the frames are always forming a stack, which is not always true. But this analysis, like MCStreamer
assumes it too.
The main reason for the delay is that the directives associated with an instruction come after the instruction, so when the streamer encounters an instruction, it means that the directives appended from the last instruction to now are the directives associated with the last instruction, not the new instruction.
I moved the updateDirectivesRange
method to the updateReceiver
and added documentation explaining its purpose and potential side effects. Also, I added documentation for each internal state stored in CFIFunctionFrameStreamer
and how updateReceiver
maintains them.
//===----------------------------------------------------------------------===// | ||
/// | ||
/// \file | ||
/// This file declares CFIFunctionFrameReceiver class. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What motivates the abstract base class here? it feels a little yagni, and adds complexity that may not be realistically needed. But perhaps there are plans for multiple implementations later?
llvm/tools/llvm-mc/llvm-mc.cpp
Outdated
@@ -519,8 +526,15 @@ int main(int argc, char **argv) { | |||
std::unique_ptr<MCInstrInfo> MCII(TheTarget->createMCInstrInfo()); | |||
assert(MCII && "Unable to create instruction info!"); | |||
|
|||
auto FFA = std::make_unique<CFIFunctionFrameAnalyzer>(Ctx, *MCII); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this constructed here, rather than created down at line 534 - the only place it's used?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved.
llvm/lib/DWARFCFIChecker/Registers.h
Outdated
inline MCPhysReg getSuperReg(const MCRegisterInfo *MCRI, MCPhysReg Reg) { | ||
if (isSuperReg(MCRI, Reg)) | ||
return Reg; | ||
for (auto SuperReg : MCRI->superregs(Reg)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some inconsistency with bracing in this patch - like this loop has braces, but a very similar loop at line 49 doesn't... I think technically according to the style guide, both should omit braces, but I could go either way.
(there's also some other rather long-bodied for/if{}
without the braces on the for which get a bit of a stretch - but, yeah, the rules are pretty vague)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Tried to change them to match the styling guide.
This PR adds a minimal version of
UnwindInfoChecker
described in here.This implementation looks into the modified registers by each instruction and checks:
This implementation does not support DWARF expressions and treats them as unknown unwinding rules.