[main] Add feature macro __ARM_FEATURE_RCPC #199

Merged
merged 2 commits into from
Sep 20, 2022

@minglotus-6 minglotus-6 commented Jun 16, 2022

To guard the usage of Load-AcquirePC instructions in inline assembly, https://reviews.llvm.org/D127798 introduces macro __ARM_FEATURE_RCPC in Clang/LLVM.

This patch documents the feature macro, its availability, and its use case.

Checklist: (mark with X those which apply)

  • If an issue reporting the bug exists, I have mentioned it in the
    PR (do not bother creating the issue if all you want to do is
    fix the bug yourself).
  • I have added/updated the SPDX-FileCopyrightText lines on top
    of any file I have edited. Format is SPDX-FileCopyrightText: Copyright {year} {entity or name} <{contact information}>
    (Please update existing copyright lines if applicable. You can
    specify year ranges with a hyphen, as in 2017-2019, and use
    commas to separate gaps, as in 2018-2020, 2022).
  • I have updated the Copyright section of the sources of the
    specification I have edited (this will show up in the text
    rendered in the PDF and other output formats supported). The
    format is the same as described in the previous item.
  • I have run the CI scripts (if applicable, as they might be
    tricky to set up on non-*nix machines). The sequence can be
    found in the contribution guidelines. Don't
    worry if you cannot run these scripts on your machine, your
    patch will be automatically checked in the Actions of the pull
    request.
  • I have added an item that describes the changes I have
    introduced in this PR in the section Changes for next release
    of the section Change Control/Document history
    of the document. Create Changes for next release if it does
    not exist. Notice that changes that are not modifying the
    content and rendering of the specifications (both HTML and PDF)
    do not need to be listed.
  • When modifying content and/or its rendering, I have checked the
    correctness of the result in the PDF output (please refer to the
    instructions on how to build the PDFs locally).
  • The variable draftversion is set to true in the YAML header
    of the sources of the specifications I have modified.
  • Please DO NOT add my GitHub profile to the list of contributors
    in the README page of the project.

@fpetrogalli
Contributor

fpetrogalli commented Jun 30, 2022

Hello @minglotus-6, thank you for your patch. May I ask you to go through the checklist I have added to the description of the PR? The Contribution Agreement in the CONTRIBUTING page allows you or the company you work for to retain copyright for the changes.

Kind regards,

Francesco

@lenary
Contributor

lenary commented Jun 30, 2022

Broadly, I am supportive of this from a technical perspective. However, I do have some minor comments.

The armv8a Arm ARM (https://developer.arm.com/documentation/ddi0487/ha/) defines the following values for the ID field for LRCPC in ID_AA64ISAR1_EL1:

  • 0b0000 corresponds to none of FEAT_LRCPC*
  • 0b0001 corresponds to FEAT_LRCPC (LDAPR* instructions)
  • 0b0010 corresponds to FEAT_LRCPC2 (LDAPUR* and STLUR*)

It would make sense for this macro, therefore, to match these values, as this gives a better idea of which instructions these enable in inline assembly. We do something similar with __ARM_ARCH (though with more complex rules). I'm not sure what to do about the 0-value case: as it seems reasonable also to leave the macro undefined when you have no LRCPC features.

I would also like the description to reference the official FEAT_* names somewhere, for the avoidance of doubt about exactly which values correspond to which instructions.

I think leaving the macro name as __ARM_FEATURE_RCPC is fine, as the compiler switch to enable FEAT_LRCPC instructions is +rcpc (Documented for GNU here: https://developer.arm.com/Tools%20and%20Software/GNU%20Toolchain#Supported-Devices, but Clang/LLVM matches this), and the consistency model in the Arm ARM is called "RCpc".
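
As a rough illustration of the mapping discussed in this comment, the sketch below extracts the LRCPC field from a raw ID_AA64ISAR1_EL1 value and returns it unchanged, which is the value the macro would take under this proposal. The helper name `rcpc_level_from_isar1` is hypothetical, and the field position (bits [23:20]) comes from the Arm ARM register description rather than from this thread, so it should be checked against the manual.

```c
#include <stdint.h>

/* Assumption: LRCPC occupies bits [23:20] of ID_AA64ISAR1_EL1. */
#define ISAR1_LRCPC_SHIFT 20u
#define ISAR1_LRCPC_MASK  0xFu

/* Hypothetical helper: returns the LRCPC field value
   (0 = none, 1 = FEAT_LRCPC, 2 = FEAT_LRCPC2). */
static inline unsigned rcpc_level_from_isar1(uint64_t isar1)
{
    return (unsigned)((isar1 >> ISAR1_LRCPC_SHIFT) & ISAR1_LRCPC_MASK);
}
```

Because the function takes the register value as a plain integer, it can be exercised on any host; reading the register itself is a separate, privileged operation that is not shown here.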

@minglotus-6
Contributor Author

> Hello @minglotus-6, thank you for your patch. May I ask you to go through the checklist I have added to the description of the PR? The Contribution Agreement in the CONTRIBUTING page allows you or the company you work for to retain copyright for the changes.
>
> Kind regards,
>
> Francesco

Sure. I had seen the checklist but was not sure whether I should fill it in. I have done so this time.

@minglotus-6
Contributor Author

> Broadly, I am supportive of this from a technical perspective. However, I do have some minor comments.
>
> The armv8a Arm ARM (https://developer.arm.com/documentation/ddi0487/ha/) defines the following values for the ID field for LRCPC in ID_AA64ISAR1_EL1:
>
>   • 0b0000 corresponds to none of FEAT_LRCPC*
>   • 0b0001 corresponds to FEAT_LRCPC (LDAPR* instructions)
>   • 0b0010 corresponds to FEAT_LRCPC2 (LDAPUR* and STLUR*)
>
> It would make sense for this macro, therefore, to match these values, as this gives a better idea of which instructions these enable in inline assembly. We do something similar with __ARM_ARCH (though with more complex rules). I'm not sure what to do about the 0-value case: as it seems reasonable also to leave the macro undefined when you have no LRCPC features.
>
> I would also like the description to reference the official FEAT_* names somewhere, for the avoidance of doubt about exactly which values correspond to which instructions.

Keeping the macro value consistent with the register field value makes a lot of sense to me. This allows inline assembly users to further gate the usage of LDAPUR* or STLUR* instructions. I now use a table to describe the values of the macro, and include the FEAT_* names in the feature columns.

I'd like to mention one thing and hear feedback :) I wonder if using bitmaps is a recommended approach, say more values could be added to LRCPC field of ID_AA64ISAR1_EL1 in the future.

To elaborate,

  1. The macro value is NOT described as a bitmap in the current PR, so users can write code like
#if defined(__ARM_FEATURE_RCPC)
  #if __ARM_FEATURE_RCPC == 2
    /* Use LDAPUR or STLUR instructions */
  #elif __ARM_FEATURE_RCPC == 1
    /* Use LDAPR instructions */
  #endif
#else
  /* Use non-RCPC instructions */
#endif
  2. On the other hand, many existing feature macros (e.g., __ARM_FP [1] or __ARM_FEATURE_CDE_COPROC [2]) are bitmaps, and their valid values (or the results of combining one-bit values) are documented.

[1] https://github.com/ARM-software/acle/blob/main/main/acle.md#hardware-floating-point
[2] https://github.com/ARM-software/acle/blob/main/main/acle.md#custom-datapath-extension
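
To make the first option concrete, here is a minimal sketch of an acquire load that uses LDAPR in inline assembly only when the compiler defines the macro, and otherwise falls back to the GCC/Clang `__atomic_load_n` builtin. The helper name `load_acquire_u32` is hypothetical. The condition uses `>= 1` so that higher values also take the LDAPR path; that is a choice made for this sketch, not something the PR text prescribes.

```c
#include <stdint.h>

/* Hypothetical helper: load-acquire of a 32-bit value.
   Uses LDAPR when the compiler reports FEAT_LRCPC support,
   otherwise falls back to the compiler's acquire-load builtin. */
static inline uint32_t load_acquire_u32(const uint32_t *p)
{
#if defined(__aarch64__) && defined(__ARM_FEATURE_RCPC) && __ARM_FEATURE_RCPC >= 1
    uint32_t v;
    /* "Q" is a memory operand addressed by a single base register. */
    __asm__ volatile("ldapr %w0, %1" : "=r"(v) : "Q"(*p) : "memory");
    return v;
#else
    /* Non-RCPC path: GCC/Clang builtin with acquire ordering. */
    return __atomic_load_n(p, __ATOMIC_ACQUIRE);
#endif
}
```

On targets where the macro is not defined, including non-AArch64 hosts, only the fallback branch is compiled, so the same source builds everywhere.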

> I think leaving the macro name as __ARM_FEATURE_RCPC is fine, as the compiler switch to enable FEAT_LRCPC instructions is +rcpc (Documented for GNU here: https://developer.arm.com/Tools%20and%20Software/GNU%20Toolchain#Supported-Devices, but Clang/LLVM matches this), and the consistency model in the Arm ARM is called "RCpc".

@minglotus-6
Contributor Author

minglotus-6 commented Jul 2, 2022

A side topic that is more relevant to Clang/LLVM than to this PR: --march=<other-features>+rcpc2 is not supported by Clang yet [1].

As a result, for armv8.2 with the rcpc2 extension, there is currently no way for a user to tell the compiler to set __ARM_FEATURE_RCPC to 2; the compiler can only set the macro to 1 when the user specifies --march=+rcpc. This should be feasible to fix in Clang/LLVM.

[1] https://github.com/llvm/llvm-project/blob/40d2ef841b68f6b493ce88bd750a92105a2b567d/llvm/include/llvm/Support/AArch64TargetParser.def#L129 is where features are defined, and https://godbolt.org/z/Gh4safcx7 exemplifies it
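
One simple way to observe which value a given set of flags produces is to expose the macro through a small function, as in the sketch below. The helper name `compiled_rcpc_level` is hypothetical, and treating an undefined macro as 0 is an assumption of this example, in line with the "leave it undefined when there are no LRCPC features" option discussed earlier.

```c
/* Hypothetical helper: the RCPC level the compiler was told to target,
   or 0 when __ARM_FEATURE_RCPC is not defined. */
static inline int compiled_rcpc_level(void)
{
#if defined(__ARM_FEATURE_RCPC)
    return __ARM_FEATURE_RCPC;
#else
    return 0;
#endif
}
```

Printing this value from a program built with and without the +rcpc extension shows what the compiler actually set. Dumping the predefined macros with the `-dM -E` options of GCC or Clang gives the same information without running anything.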

@fpetrogalli
Contributor

@minglotus-6 - thank you for updating the list.

May I ask you to add also the copyright statement (it is required in the contribution agreement) and update the variable draftversion of the YAML header to true?

WRT the copyright statements, what you need to do is described in the checklist of the description of the PR.

Please let me know if anything is unclear!

Thank you,

Francesco

Contributor

@lenary lenary left a comment

> I'd like to mention one thing and hear feedback :) I wonder if using bitmaps is a recommended approach, say more values could be added to LRCPC field of ID_AA64ISAR1_EL1 in the future.

Please do not propose a bitmap in this case. FEAT_LRCPC2 implies FEAT_LRCPC, while they have what look like orthogonal bitmap values, this is just a coincidence from assigning 1 then 2.

I much prefer your example (1), especially so programmers can write __ARM_FEATURE_RCPC >= 1 if they only have a version which requires LDAPR.

With respect to the FP and CDE extensions, the values in there are orthogonal (though the FP one tells you the exact values that preprocessor definition will ever have, showing it's not fully orthogonal).
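
The difference between the two readings can be sketched with two tiny predicates. Both function names are hypothetical and exist only to illustrate the point; the bitmap version is shown as the pitfall, not as a recommendation.

```c
#include <stdbool.h>

/* Ordinal reading: each level includes the ones below it,
   so FEAT_LRCPC2 (2) also provides LDAPR. */
static inline bool has_ldapr_ordinal(unsigned level)
{
    return level >= 1u;
}

/* Bitmap reading (pitfall): treats bit 0 as "LDAPR available". */
static inline bool has_ldapr_bitmap(unsigned level)
{
    return (level & 1u) != 0u;
}
```

With a level of 2, the ordinal check reports LDAPR as available while the bitmap check does not, even though FEAT_LRCPC2 implies FEAT_LRCPC. The same comparison written in the preprocessor is `__ARM_FEATURE_RCPC >= 1`.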

Below, I'm broadly happy, though I have one comment about the table entry.

@minglotus-6
Contributor Author

> @minglotus-6 - thank you for updating the list.
>
> May I ask you to add also the copyright statement (it is required in the contribution agreement) and update the variable draftversion of the YAML header to true?
>
> WRT the copyright statements, what you need to do is described in the checklist of the description of the PR.
>
> Please let me know if anything is unclear!
>
> Thank you,
>
> Francesco

Thanks for pointing this out Francesco! I've just read some guidelines inside the company I worked for and will also consult to see what I plan to fill in looks good. I'll update once I get some guidance inside the company. As you may see I'm not very familiar with this contributing process and I appreciate your patience and guidance here!

@minglotus-6
Contributor Author

> > I'd like to mention one thing and hear feedback :) I wonder if using bitmaps is a recommended approach, say more values could be added to LRCPC field of ID_AA64ISAR1_EL1 in the future.
>
> Please do not propose a bitmap in this case. FEAT_LRCPC2 implies FEAT_LRCPC, while they have what look like orthogonal bitmap values, this is just a coincidence from assigning 1 then 2.

Acknowledged, thanks for taking a look!

> I much prefer your example (1), especially so programmers can write __ARM_FEATURE_RCPC >= 1 if they only have a version which requires LDAPR.
>
> With respect to the FP and CDE extensions, the values in there are orthogonal (though the FP one tells you the exact values that preprocessor definition will ever have, showing it's not fully orthogonal).

This makes sense.

> Below, I'm broadly happy, though I have one comment about the table entry.

Applied the suggestions.

To guard the usage of Load-AcquirePC instructions in inline assembly, https://reviews.llvm.org/D127798 introduces macro __ARM_FEATURE_RCPC in Clang/LLVM.

This patch documents the existence of the feature MACRO, and its use case.

Co-authored-by: Sam Elliott <sam@lenary.co.uk>
@rearnsha

rearnsha commented Jul 5, 2022

Historically we have not added #defines to ACLE unless they guard, or describe, features or intrinsics provided by ACLE and that are visible directly to C programmers (ie, providing macros to conditionalize assembly language is out of scope). So exactly what does this new macro guard?

@minglotus-6
Contributor Author

minglotus-6 commented Jul 7, 2022

> > @minglotus-6 - thank you for updating the list.
> > May I ask you to add also the copyright statement (it is required in the contribution agreement) and update the variable draftversion of the YAML header to true?
> > WRT the copyright statements, what you need to do is described in the checklist of the description of the PR.
> > Please let me know if anything is unclear!
> > Thank you,
> > Francesco
>
> Thanks for pointing this out Francesco! I've just read some guidelines inside the company I worked for and will also consult to see what I plan to fill in looks good. I'll update once I get some guidance inside the company. As you may see I'm not very familiar with this contributing process and I appreciate your patience and guidance here!

Hi Francesco,
I updated both the checklist and the PR itself after getting replies from relevant teams inside the company (after a holiday weekend here). Could you take a look and see if it looks good? In particular, shall I ask for an email and append it in the copyright notices? Thanks!

@minglotus-6
Contributor Author

> Historically we have not added #defines to ACLE unless they guard, or describe, features or intrinsics provided by ACLE and that are visible directly to C programmers (ie, providing macros to conditionalize assembly language is out of scope). So exactly what does this new macro guard?

This is to guard the usage of FEAT_LRCPC and FEAT_LRCPC2 instructions in inline assembly. An example usage is in #199 (comment). I wonder if the use case looks reasonable, or are there better alternatives?

@minglotus-6
Contributor Author

A bump of this PR :-)

@lenary
Contributor

lenary commented Sep 20, 2022

I am happy with this from the technical side. I find the sentence about ID_AA64ISAR1_EL1 clear, though I realise this is not exactly how we've defined other system registers.

@rsandifo-arm
Contributor

LGTM too FWIW.

Regarding the issue that @rearnsha raised: it turned out that there were already other macros that broke the original principle. We agreed to relax the principle to something else, with a TODO to go back and make the existing features consistent. I've filed #215 for that.

@rsandifo-arm rsandifo-arm merged commit a405989 into ARM-software:main Sep 20, 2022
nstester pushed a commit to nstester/gcc that referenced this pull request Oct 4, 2022
ARM-software/acle#199 adds a new feature
macro for RCPC, for use in things like inline assembly.  This patch
adds the associated support to GCC.

Also, RCPC is required for Armv8.3-A and later, but the armv8.3-a
entry didn't include it.  This was probably harmless in practice
since GCC simply ignored the extension until now.  (The GAS
definition is OK.)

gcc/
	* config/aarch64/aarch64.h (AARCH64_ISA_RCPC): New macro.
	* config/aarch64/aarch64-arches.def (armv8.3-a): Include RCPC.
	* config/aarch64/aarch64-cores.def (thunderx3t110, zeus, neoverse-v1)
	(neoverse-512tvb, saphira): Remove RCPC from these Armv8.3-A+ cores.
	* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define
	__ARM_FEATURE_RCPC when appropriate.

gcc/testsuite/
	* gcc.target/aarch64/pragma_cpp_predefs_1.c: Add RCPC tests.
kraj pushed a commit to kraj/gcc that referenced this pull request Oct 20, 2022
ARM-software/acle#199 adds a new feature
macro for RCPC, for use in things like inline assembly.  This patch
adds the associated support to GCC.

Also, RCPC is required for Armv8.3-A and later, but the armv8.3-a
entry didn't include it.  This was probably harmless in practice
since GCC simply ignored the extension until now.  (The GAS
definition is OK.)

gcc/
	* config/aarch64/aarch64.h (AARCH64_FL_FOR_ARCH8_3): Add
	AARCH64_FL_RCPC.
	(AARCH64_ISA_RCPC): New macro.
	* config/aarch64/aarch64-cores.def (thunderx3t110, zeus, neoverse-v1)
	(neoverse-512tvb, saphira): Remove RCPC from these Armv8.3-A+ cores.
	* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define
	__ARM_FEATURE_RCPC when appropriate.

gcc/testsuite/
	* gcc.target/aarch64/pragma_cpp_predefs_1.c: Add RCPC tests.
kraj pushed a commit to kraj/gcc that referenced this pull request Oct 21, 2022
ARM-software/acle#199 adds a new feature
macro for RCPC, for use in things like inline assembly.  This patch
adds the associated support to GCC.

Also, RCPC is required for Armv8.3-A and later, but the armv8.3-a
entry didn't include it.  This was probably harmless in practice
since GCC simply ignored the extension until now.  (The GAS
definition is OK.)

gcc/
	* config/aarch64/aarch64.h (AARCH64_FL_FOR_ARCH8_3): Add
	AARCH64_FL_RCPC.
	(AARCH64_ISA_RCPC): New macro.
	* config/aarch64/aarch64-cores.def (thunderx3t110, zeus, neoverse-v1)
	(neoverse-512tvb, saphira): Remove RCPC from these Armv8.3-A+ cores.
	* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define
	__ARM_FEATURE_RCPC when appropriate.

gcc/testsuite/
	* gcc.target/aarch64/pragma_cpp_predefs_1.c: Add RCPC tests.
@vhscampos vhscampos added this to the 2022Q4 milestone Nov 14, 2022
@vhscampos
Member

@all-contributors please add @lenary for review.

@allcontributors
Contributor

@vhscampos

I've put up a pull request to add @lenary! 🎉

Contributor

@sallyarmneale sallyarmneale left a comment

Looks good to me
