-
Notifications
You must be signed in to change notification settings - Fork 298
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Arc] MemoryReadPortOp: remove enable and clock operand, make it pure #5194
[Arc] MemoryReadPortOp: remove enable and clock operand, make it pure #5194
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Thanks for tackling all of this 🥳!
SymbolTableCollection symbolTable; | ||
mlir::SymbolUserMap userMap(symbolTable, getOperation()); | ||
for (auto defOp : arcsToLower) { | ||
auto *terminator = defOp.getBodyBlock().getTerminator(); | ||
builder.setInsertionPoint(terminator); | ||
builder.create<func::ReturnOp>(terminator->getLoc(), | ||
terminator->getOperands()); | ||
terminator->erase(); | ||
builder.setInsertionPoint(defOp); | ||
auto funcOp = builder.create<func::FuncOp>(defOp.getLoc(), defOp.getName(), | ||
defOp.getFunctionType()); | ||
funcOp->setAttr("llvm.linkage", | ||
LLVM::LinkageAttr::get(builder.getContext(), | ||
LLVM::linkage::Linkage::Internal)); | ||
funcOp.getBody().takeBody(defOp.getBody()); | ||
|
||
for (auto *user : userMap.getUsers(defOp)) { | ||
builder.setInsertionPoint(user); | ||
ValueRange results = builder | ||
.create<func::CallOp>( | ||
user->getLoc(), funcOp, | ||
cast<CallOpInterface>(user).getArgOperands()) | ||
->getResults(); | ||
user->replaceAllUsesWith(results); | ||
user->erase(); | ||
} | ||
|
||
defOp.erase(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering if we can avoid the need for a FuncOp
in the future if we made MemoryReadOp
not have a MemoryRead
side-effect, since LegalizeStateUpdate guarantees that reads do the right thing even if they are misordered according to their side-effects. Sounds ugly though...
Alternatively, we could think about converting all arcs into funcs at this point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the one hand I would love to not have func.func operations at that stage already, on the other hand it would allow us to move read and write operation into the function and thus reducing the number of argument and result values of a call and the number of overall functions because we could merge arcs that are only separated by a memory effect operation.
Changing MemoryReadOp
to not have a memory read effect is likely not safe though. It's definitely fine before LegalizeStateUpdate, but we also use it after that pass and then, e.g., CSE, could change semantics. Another option could be lowering memories as part of LegalizeStateUpdate, not sure if I like that approach though.
Lowering all functions here already could totally be a way to go. I wasn't really sure what's the best way to deal with the memory side effect added in LowerState, so it would be great if we could discuss this in more detail at some point. My reasoning to not lower all arcs was that func.call
has arbitrary side-effects while arc.call
doesn't have any, so it could allow for more canonicalizations/optimizations, but having both at the same time might not be nice indeed.
arc.memory_read_port
making it a combinatorial read port. Latencies can be added using thearc.state
operation.arc.memory_read_port
pure to allow it to be moved intoarc.define
ops. This means the memory value has to be passed via argument. This will be considered to pass the whole memory as a value (likehw.array
) at this stage of the lowering. The LegalizeStateUpdates pass should make it easy to switch to pointer semantics for memories.sv.macro.decl
ops in StripSV (sorry for this unrelated change, but it's just 4 lines :) )!arc.memory
type and add a member function to conveniently compute the stride on-demand. Not removing it led to some issues in the lowering to LLVM which would have required bigger changes to AllocateState otherwise.memory_read_port
operandsfunc.call
op lowering in LowerArcToLLVM and lower arcs with LLVM internal linkage such that LLVM can remove those functions when getting inlined to reduce the objfile size to almost a third for rocket.Sorry for touching a lot of things in this PR. Let me know if I should try to split it up a bit more.