Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace llvm.memcpy et al's i1 isVolatile with i8 VolFlags #65748

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

urnathan
Copy link
Contributor

@urnathan urnathan commented Sep 8, 2023

Modify memcpy, memcpy.inline & memmove intrinsics to change the final i1 isVolatile parameter becomes an i8 VolFlags parameter. The former is true if either (or both) dst and src point to volatile storage. The new VolFlags separates dst and src volatilities to separate bits in the argument value. This allows any expansion code to only mark the volatile accesses as volatile. The motivating use is copying a volatile structure to/from non-volatile storage.

The patch series:

* Changes the intrinsics, documentation and AutoUpdater. The APIs are not updated, and the existing APIs convert false/true to 0/3 as required.
* Update MLIR
* Update Flang
* Update Polly (tests only)
* At this point the compiler builds, but tests fail
* Update LLVM & Clang non-UTC test cases
* Update LLVM & Clang UTC test cases
* At this point the conversion is done, but nothing makes use or can generate the new values.
* Update IRBuilder APIs to specify separate Dst & Src Volatilities. The API change is backwards compatible and a deprecation warning is added in a later patch (it's quite noisy until the remaing conversions are done).
* Update Clang code gen to specify separate Dst & Src Volatilities.
* At this point IRBuilder and Clang are generating more precise volatility information, but no target lowering makes use of it.
* Update TargetLowering APIs. New predicates added to MemTransferInst, and as with the above API change, this is source compatible, but will eventually educes deprecation warnings.
* Update llvm transforms and target lowering APIs. New EmitTargetCodeForMem{cpy,move} added to SelectionDagTargetInfo, source compatible by forwarding to the old API.
* Target-specific changes for the new APIs.
* Motivating testcase.
* Convert remaining old-API uses, but mark (some) locations as candidates for further optimization opportunities.
* Deprecate the compatibility APIs, this alerts out-of-tree maintainers.
* Conversion of AArch64's bespoke memcpy/memmove expansion to use the new EmitTargetCodeForMem{cpy,move} API.

The MLIR and Flang conversions took about 4 hours each -- not having previously looked at those code bases. Thus I believe any out-of-tree conversions will be straight forwards too. Just pay attention to the deprecation warnings and pass the correct tuple.

90 % of this diff are mechanical testcase changes (either via sed or UTC). Most of the rest is fairly mechanical, located via deprecation warnings.

Modify memcpy, memcpy.inline & memmove intrinsics to change the final
i1 isVolatile parameter to be an i8 VolFlags parameter. The former is
true if either (or both) dst and src point to volatile storage. The
new VolFlags separates dst and src volatilities to separate bits in
the argument value. This allows any expansion code to only mark the
volatile accesses as volatile. The motivating use is copying a
volatile structure to/from non-volatile storage. We get better code
generation when expanding such one-sided memcpys, as the non-volatile
side can (often) be held in registers.

Other than changing the intrinsics, the APIs are not adjusted in this patch.
This update MLIR for the intrinsics isVolatile->VolFlags change. The
MLIR parameter name remains isVolatile.
Fortran has no concept of 'volatile', but this updates Flang's IR to
represent memcpy/memmove's new VolFlags argument. We only pass i8:0
(replacing the previous i1:false).
Update the non-UTC llvm testcases for memcpy/memmove intrinsic
isVolatile->VolFlags change. Only the check lines are altered --
existing llvmir using the old boolean is left and altered during read
in.
Update the non-UTC clang testcases for memcpy/memmove intrinsic
isVolatile->VolFlags change. Only the check lines are altered --
existing llvmir using the old boolean is left and altered during read
in.
Regenerate llvm UTC tests for the isVolatile->VolFlags change.
Regenerate clang UTC tests for the isVolatile->VolFlags change. Most
of these were first generated via sed, and then regenerated via
UTC.
Introduce llvm::MemTransferVolatility. This provides separate Dst &
Src volatility flags, along with compatibility functions to convert
from and to a single bool. Those will be deprecated in a later
patch, so uses generate warnings.

The memcpy et al builders now take a MemTransferVolality, and encode
it into the VolFlags argument value.
Update Clang's generation of memcpy et al intrinsics to specify
separate Dst & Src volatilities. Completes Clang's conversion to the
new API.
Add separate dst & src volatility accessors to MemTransferInst. Add
MemTransferVolatility to selection dag's getMemcpy,
getMemmove. Existing uses are source compatible, but that API will be
deprecated in a later patch to catch any remaining uses.
Replace single isVolatile with VolFlags. Add a new
SelectionDAGTargetInfo::EmitCodeForMem{Cpy,Move} virtual function with
the new API. The default implementation of which forwards to the old
API.
Update all the Target code generation to use the new APIs. Does not
convert the EmitTargetCodeForMem{cpy,move} implementations.
The motivating testcase for memcpy/memmove VolFlags change. Copying a
volatile strucure to/from non-volatile storage should not pessimize
the latter. With the new flags, we can hoist the non-volatile object
into registers and process it directly there.
@urnathan urnathan requested review from a team as code owners September 11, 2023 18:47
@@ -214,14 +214,14 @@ class LLVM_MemcpyIntrOpBase<string name> :
/*requiresAccessGroup=*/1, /*requiresAliasAnalysis=*/1> {
dag args = (ins Arg<LLVM_AnyPointer,"",[MemWrite]>:$dst,
Arg<LLVM_AnyPointer,"",[MemRead]>:$src,
AnySignlessInteger:$len, I1Attr:$isVolatile);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to me that should handle this as we do with the FastMathFlags: that is a enum attribute modeling the bitfields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, looking ...

@@ -24790,8 +24790,8 @@ static SDValue LowerVACOPY(SDValue Op, const X86Subtarget &Subtarget,
return DAG.getMemcpy(
Chain, DL, DstPtr, SrcPtr,
DAG.getIntPtrConstant(Subtarget.isTarget64BitLP64() ? 24 : 16, DL),
Align(Subtarget.isTarget64BitLP64() ? 8 : 4), /*isVolatile*/ false, false,
false, MachinePointerInfo(DstSV), MachinePointerInfo(SrcSV));
Align(Subtarget.isTarget64BitLP64() ? 8 : 4), /*Vol=*/{false, false},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For understandability, should we use a MemTransferVolatility builder helper instead of embedding implicit values like this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, something like MemTransferVolatility().setDst(true/false).setSrc(true/false) ? Do you have preference for names?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No preference for names - do we need to handle runtime bool values or could a few basic hardcoded combos be enough MemTransferVolatility::neverVolatile() / MemTransferVolatility::readVolatile() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed a patch that uses MemTransferVolatility default ctor for the non-volatile case, and allows .Dst(bool), .Src(bool) and/or .All(bool) to explicitly specify. One does need the ability to provide runtime values there.

@nikic nikic requested review from nikic and removed request for a team September 14, 2023 07:58
@nikic
Copy link
Contributor

nikic commented Sep 15, 2023

Rather than permit {bool, bool}, use
MemTransferVolatility().Dst(bool).Src(bool).All(bool). One can only
set bits this way, not clear them.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants