Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow signed _ExtInt(1) #125

Merged
merged 1 commit into from
Oct 13, 2021
Merged

Conversation

lforg37
Copy link
Contributor

@lforg37 lforg37 commented Jul 15, 2021

No description provided.

@lforg37 lforg37 requested review from keryell and Ralender July 15, 2021 08:38
@keryell
Copy link
Member

keryell commented Jul 15, 2021

That looks good but I have the feeling there are 2 things here:

  • a new feature that should go into Clang/LLVM upstream instead: signed _ExtInt(1) (@AaronBallman feedback?);
  • some Intel FPGA code cleaning of Intel namespace pollution that should go directly into https://github.com/intel/llvm (@bader feedback?).

At least this shows what should be done to support real ap_int in SYCL along https://github.com/lforg37/extint_apint that Intel FPGA team should look at too if we want some common standard extension.

@AaronBallman
Copy link
Contributor

Can I get some background on why you'd like to make this change? The feature was just adopted by WG14 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2709.pdf) and this change won't conform to what was adopted. _ExtInt(1) is nonsensical because it consists of only a sign bit and has no value bits. Please see the requirements in 6.2.5 and 6.7.2 the linked document.

@lforg37
Copy link
Contributor Author

lforg37 commented Jul 15, 2021

_ExtInt(1) is nonsensical because it consists of only a sign bit and has no value bits.

In 2's complement the msb is a value bit with a negative weight, so an _ExtInt(1) can have two values : -1 and 0.

One example of an application is splitting a wide product as a sum of multiple subproducts.
If we take the example of a product between two 33 bits signed integers a and b, on an FPGA that has DSP able to compute 32x32 products, the product can be computed that way:

  • Each input is splitted between its msb (a_h and b_h), and the remaining 32 low bits (a_l and b_l).
  • The result is ((a_h x b_h) << 64) + ((a_h x b_l + b_h x a_l) << 32) + a_l x b_l.

This computation uses only one DSP for a_l x b_l, the three other products are done inexpensively on fpga logic.

Both a_h and b_h are conceptually _ExtInt(1) (one bit integers that can be either 0 or -1). There is no difference with the unsigned case for the 1x1 product, but in the case of non square products such as a_h x b_l, it is important that the product gives -b_l when the a_h bit is set, otherwise the result is false.

This example is in the case of a full square product and can look a bit artificial, but when computed truncated product (which makes sense on FPGA, has it is less expensive than computing the full product) the case of subproducts involving a 1 bit signed integer can also appear.

I'll search if there is other arithmetic application in which it makes sense to have signed ExtInt(1), but at least to represent the scaled value of the msb of a signed integer it has a meaning.

@AaronBallman
Copy link
Contributor

In 2's complement the msb is a value bit with a negative weight, so an _ExtInt(1) can have two values : -1 and 0.

We purposefully decided to not go down that route because experience with bit-fields showed that this is very confusing to users in practice. Also, up until C2x, you couldn't assume twos complement anyway.

I'll search if there is other arithmetic application in which it makes sense to have signed ExtInt(1), but at least to represent the scaled value of the msb of a signed integer it has a meaning.

Thank you for the information, that's helpful! I'm not certain it's sufficiently compelling to warrant this change, however. In C, signed integer types are divided into three groups: value bits, padding bits, and the sign bit. That's why _ExtInt(2) has one sign bit and one value bit. _ExtInt(1) would be ambiguous as to what you'd get, but I'd expect it would only be able to represent 0 and 1, not 0 and -1 (I'd expect value bits are useful -- a sign bit by itself is not).

However, it's not really me you'd have to convince. _ExtInt is a Clang feature that was proposed to WG14 and will be tracking that functionality. So really, this would require a proposal to WG14 to change the feature. That's not impossible, but it would require considerable motivation and I'm not certain FPGA uses will be sufficient or not.

@keryell
Copy link
Member

keryell commented Jul 15, 2021

@AaronBallman thanks for the feedback.
Here we are just talking about SYCL and C++20. C++20 uses two complement representation, so it makes sense to have a natural signed 1-bit value.
I do not really care about the C language, besides the fact that C++ inherit from most of C, so C2X is just interesting science-fiction for me. :-)
If I understand correctly, Tommy Hoffner, who was the Intel FPGA guy behind ISO/IEC JTC1 SC22 WG14 N2709 is no longer working at Intel, so Intel might not be interested by this anymore.
But I work for the real FPGA company where we care about this. ;-)
That said, all this is a useful extension for specific hardware, even if not in any standard, so it might be an interesting up-stream extension for Clang so other people can benefit from it? This PR could be updated to keep the same limitation and diagnostic in the official C mode but have this feature only when using some -fsigned-1-bit-extint or whatever option.
Otherwise we can just use it in our fork.

@AaronBallman
Copy link
Contributor

Here we are just talking about SYCL and C++20.

That's basically immaterial. The Clang implementation is still going to follow what WG14 standardized (that was part of the community's requirement for getting _ExtInt into Clang in the first place).

If I understand correctly, Tommy Hoffner, who was the Intel FPGA guy behind ISO/IEC JTC1 SC22 WG14 N2709 is no longer working at Intel, so Intel might not be interested by this anymore.

Intel is still interested in this functionality (I work for Intel, btw; I wasn't certain if you knew). I am also one of the primary authors on the WG14 proposal. :-)

That said, all this is a useful extension for specific hardware, even if not in any standard, so it might be an interesting up-stream extension for Clang so other people can benefit from it?

It's an option, but it'll get opposition if you attempt to upstream it (I certainly will push back on it pretty hard). It would be really bad for C and C++ to decide to support different bit-precise integers semantics and even worse when you consider compatibility with other compilers. This leads to user confusion over how the feature works and in what languages and compilers.

FWIW, our plan is to also propose this feature to WG21 (though likely spelled std::bit_int<N>) with the same semantics and ABI as the C feature.

Otherwise we can just use it in our fork.

That's definitely an option!

@lforg37
Copy link
Contributor Author

lforg37 commented Jul 15, 2021

Thank you for your feedback, there is just a few points on which I am not sure I understand what you meant.

We purposefully decided to not go down that route because experience with bit-fields showed that this is very confusing to users in practice.

I don't see the relation with bit-fields: can you elaborate ?
My remark was just that the only difference between 2's complement and unsigned integer representation is the sign of the msb weight.

It would be really bad for C and C++ to decide to support different bit-precise integers semantics and even worse when you consider compatibility with other compilers.

FWIW, our plan is to also propose this feature to WG21 (though likely spelled std::bit_int) with the same semantics and ABI as the C feature.

I understand why having a modification on an adopted feature is annoying, and might not be worth the trouble.
However, as in recent C++20 signed integers are represented in two's complement, why not allowing a signed std::bit_int<1> here ?
If I'm correct that is only a small overset of the C functionalities, which is only the removing of a restriction (which will not show unexpected discrepancies with _ExtInt C code due to the non promoting rules etc, that would just be an extra type).

(Maybe this is also the good timing to include a std::bit_int<0> that would be a type whose representation uses no place in memory and have always the value 0, which would avoid a lot of special case handling when designing custom sized datapath operators :-) )

@AaronBallman
Copy link
Contributor

I don't see the relation with bit-fields: can you elaborate ?

Sure! We have plenty of feedback that users get baffled by constructs like:

struct S {
  int i : 1;
};

because they don't know whether S::i represents 0 and 1 or 0 and -1. In many instances, the difference doesn't matter. But it does come up (for example, users who test values explicitly as in if (s.i == 1) as opposed to if (s.i) are in for a surprise). This leads to coding rules like https://rules.sonarsource.com/c/RSPEC-2216, MISRA C++ 9-6-4, etc.

My remark was just that the only difference between 2's complement and unsigned integer representation is the sign of the msb weight.

Understood!

However, as in recent C++20 signed integers are represented in two's complement, why not allowing a signed std::bit_int<1> here ?

Because the expectation is that STL implementations on compilers that support both C and C++ (Clang, GCC, etc) will implement bit_int as:

template <size_t N>
class bit_int {
  _BitInt(N) Val;
public:
  ...
};

(Maybe this is also the good timing to include a std::bit_int<0> that would be a type whose representation uses no place in memory and have always the value 0, which would avoid a lot of special case handling when designing custom sized datapath operators :-) )

Er, that could get awkward because sizeof(bit_int<0>) can't be 0 anyway (that would be a problem for doing something like: bit_int<0> HowDoesThisWork[100]; But I do agree that I wonder what WG21 will think of the generic programming aspects of the type.

@keryell
Copy link
Member

keryell commented Jul 16, 2021

If I understand correctly, Tommy Hoffner, who was the Intel FPGA guy behind ISO/IEC JTC1 SC22 WG14 N2709 is no longer working at Intel, so Intel might not be interested by this anymore.

Intel is still interested in this functionality (I work for Intel, btw; I wasn't certain if you knew). I am also one of the primary authors on the WG14 proposal. :-)

I know all of this, since I discovered Tommy Hoffner in the list of authors of N2702 few months ago and he was not in my list of known people. I do not think the 3 other authors are related to FPGA.

That said, all this is a useful extension for specific hardware, even if not in any standard, so it might be an interesting up-stream extension for Clang so other people can benefit from it?

It's an option, but it'll get opposition if you attempt to upstream it (I certainly will push back on it pretty hard).

I am starting the massage. :-)

It would be really bad for C and C++ to decide to support different bit-precise integers semantics and even worse when you consider compatibility with other compilers. This leads to user confusion over how the feature works and in what languages and compilers.

At the end this is still super low-level features for hardware people expected to know a little bit about what they are doing. In our case, it is mostly be contained inside in a SYCL or HLS C++ kernel because the semantics of a compact array of ap_int<3>[10] and pointers to its elements when hitting the road of a C++ std::byte-oriented memory is at most vendor specific in the FPGA realm...

FWIW, our plan is to also propose this feature to WG21 (though likely spelled std::bit_int<N>) with the same semantics and ABI as the C feature.

I am thinking about voting Strongly Against if there is no both signed and unsigned std::bit_int<1> ;-)

Otherwise we can just use it in our fork.

That's definitely an option!

Of course, since this is the current situation.
At the end this is good for Xilinx FPGA to have both signed and unsigned 1-bit variables while Intel FPGA can handle only unsigned 1-bit variables, so I am for the current status quo... ;-)

@AaronBallman
Copy link
Contributor

At the end this is still super low-level features for hardware people expected to know a little bit about what they are doing.

No, it's not; it's a builtin standard integer datatype. Your use case is for a super low-level use where hardware people are expected to know what they're doing, but you can't apply that use case to everyone. For example, this is a very useful feature for data exchange formats like files and network protocols; you can specify the precise bit-width but without the portability and usability issues of bit-fields. Also, general purpose programmers like the idea of an integer type that doesn't promote to int when you sneeze near it, so I suspect this will get a surprising amount of use outside of FPGA situations.

Again, I'm not saying "there's never a use for this." I'm pointing out that the notion of a datatype with no value bits is understood to be a sharp edge that was not compelling to replicate in a new datatype. If there's new information that makes it more compelling, that's fantastic -- but the responsible way forward is for someone to propose something to WG14 to see if they'd like to change the feature (preferably before C23 ships, but because this would be relaxing a restriction and work on C23 is rapidly winding down, it could be done post C23 with little harm, IMO). You asked me for feedback on upstreaming this change and my feedback is that I don't believe the Clang community is going to be willing to break from what WG14 standardized when the Clang community explicitly asked for the feature to go through the standardization process in the first place. CCing @mibintc and @erichkeane who were also authors on the feature and have been involved in the Clang implementation to see if they have opinions (I can't seem to find a GitHub handle for Tommy, in case someone knows of it).

@keryell
Copy link
Member

keryell commented Jul 16, 2021

Yes, thanks, this is interesting feedback.
This is true that our focus is low-level hardware and 2-complement encoding, so we do not end up into choosing between +0 and -0 for other encoding schemes.
Another domain interested by -1 and 0, along 0 and +1 of course, is machine-learning, when processing 1-bit data.
So, do you suggest we write a paper for WG14?
There is a C-C++ ISO SG22 meeting today organized by you, by the way... :-)

@AaronBallman
Copy link
Contributor

Yes, thanks, this is interesting feedback.
This is true that our focus is low-level hardware and 2-complement encoding, so we do not end up into choosing between +0 and -0 for other encoding schemes.

Thankfully, in C23, we only have twos complement now as well (we removed support for the other formats in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2412.pdf), but C still makes a distinction between "the sign bit" and "value bits" as part of the model.

Another domain interested by -1 and 0, along 0 and +1 of course, is machine-learning, when processing 1-bit data.

Good call!

So, do you suggest we write a paper for WG14?

Personally, I think that would be a good way forward. Ideally, the paper would have some concrete information about the performance benefits of allowing the construct along with the motivating examples. Basically, convince ourselves that the perf gains are worth the usability losses with numbers, see if this introduces new sharp edges and whether they lead to additional problems (for example, I think the proposed wording would need to figure out what to do with the integer model where there's a sign bit and value bits), and then pound out a paper from there. Once we have a paper in front of WG14, I think it's more reasonable to go to the Clang community with the functionality (note: they may stall until WG14 makes a decision; they did that with the integer conversion and promotion rules).

There is a C-C++ ISO SG22 meeting today organized by you, by the way... :-)

:-)

@erichkeane
Copy link
Contributor

Yes, thanks, this is interesting feedback.
This is true that our focus is low-level hardware and 2-complement encoding, so we do not end up into choosing between +0 and -0 for other encoding schemes.

Thankfully, in C23, we only have twos complement now as well (we removed support for the other formats in http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2412.pdf), but C still makes a distinction between "the sign bit" and "value bits" as part of the model.

Another domain interested by -1 and 0, along 0 and +1 of course, is machine-learning, when processing 1-bit data.

Good call!

So, do you suggest we write a paper for WG14?

Personally, I think that would be a good way forward. Ideally, the paper would have some concrete information about the performance benefits of allowing the construct along with the motivating examples. Basically, convince ourselves that the perf gains are worth the usability losses with numbers, see if this introduces new sharp edges and whether they lead to additional problems (for example, I think the proposed wording would need to figure out what to do with the integer model where there's a sign bit and value bits), and then pound out a paper from there. Once we have a paper in front of WG14, I think it's more reasonable to go to the Clang community with the functionality (note: they may stall until WG14 makes a decision; they did that with the integer conversion and promotion rules).

There is a C-C++ ISO SG22 meeting today organized by you, by the way... :-)

:-)

I think we'd want to wait until it is at least positively received by WG14 before we change clang (as the intent of the _ExtInt is, as you said, to track the WG14 standard. I'd received this feedback (1 bit signed ints) at one point in the past via email, but the submitter wasn't able to provide justification in a way that I thought I could justify it to a standards committee, and Tommy wasn't motivated at the time. I personally would be in support of such a paper thanks to the explanation here(at least if the paper had sufficient motivation section), think it is a small enough change that it should be a paper will only need a meeting or two.

@bader
Copy link
Contributor

bader commented Jul 17, 2021

I can't seem to find a GitHub handle for Tommy, in case someone knows of it

@d86thoffner?

@keryell
Copy link
Member

keryell commented Aug 12, 2021

So, I guess we need to dive into the C standard around integer encoding.
@AaronBallman is there a public Git repository for the on-going ISO C draft as there is one for ISO C++?

@AaronBallman
Copy link
Contributor

So, I guess we need to dive into the C standard around integer encoding.
@AaronBallman is there a public Git repository for the on-going ISO C draft as there is one for ISO C++?

There's not a public git repo for WG14 like there is for WG21, but we do have the latest working drafts in our document list (http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm). The latest WD I see is: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2596.pdf but that doesn't have the _BitInt paper's wording applied to it yet. So you'll have to use a combination of N2596 and the wording from http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2709.pdf for the moment.

@keryell
Copy link
Member

keryell commented Aug 12, 2021

There's not a public git repo for WG14 like there is for WG21, but we do have the latest working drafts in our document list (http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm).

Thanks for the feedback. I have started a C folder for some bookmarks in my Firefox. :-)
In your list I found http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2763.pdf "2021/07/09 Ballman, Adding a Fundamental Type for N-bit integers (updates N2709)" and http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2775.pdf "2021/07/26 Ballman, Literal suffixes for bit-precise integer". It looks like things are moving fast.
Our main issue is to estimate how difficult it would be to change the concept for integers based on sign bit + value bit(s) to a pure 2-complement uniform definition in the C standard.

@AaronBallman
Copy link
Contributor

There's not a public git repo for WG14 like there is for WG21, but we do have the latest working drafts in our document list (http://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_log.htm).

Thanks for the feedback. I have started a C folder for some bookmarks in my Firefox. :-)
In your list I found http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2763.pdf "2021/07/09 Ballman, Adding a Fundamental Type for N-bit integers (updates N2709)" and http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2775.pdf "2021/07/26 Ballman, Literal suffixes for bit-precise integer". It looks like things are moving fast.
Our main issue is to estimate how difficult it would be to change the concept for integers based on sign bit + value bit(s) to a pure 2-complement uniform definition in the C standard.

Whoops! I meant that http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2763.pdf is the final form of that document, sorry for that confusion!

As for "fast", the only two things we're still hoping to get added for C23 are: the literal suffix (because otherwise comparisons and mathematical operations with constants become more likely to accidentally induce a conversion) and a conversion specifier so that you can perform file IO. We have other stuff in the pipeline, but it's lower priority (like atomics, a bitwidthof operator, etc).

As for your main issue, it should be somewhat straightforward given that we're now mandating twos complement for C23. The trickiest bit will be finding any places that talk about "sign bit" and making sure you got all of the places, or properly redefining "sign bit" in terms of twos complement representation. Another issue will be timing -- WG14's proposal plate is filling up very quickly (and is potentially already filled). We're still considering new proposals for C23 at the next meeting and the meeting after (Nov 2021), but I believe those are the last meetings for new proposals. However, I think a case could be made that yours is not a new proposal but a refinement of an existing one, so it may still be possible to see this change in C23 (but just to set expectations, it's up to the convener how to schedule papers and so there's a very real possibility your proposal would be for C2y instead).

@@ -126,7 +126,9 @@ struct ConstantPipeStorage {
};

// Arbitrary precision integer type
namespace __spv {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this name __spv? What is the meaning?
If it has something related to SPIR-V, should it be something like __spirv?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, this is already named like that in up-stream... intel/llvm#3986

Copy link
Member

@keryell keryell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@keryell keryell merged commit d281d2b into triSYCL:sycl/unified/next Oct 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants