New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement BinaryInteger.words #10558
Conversation
/cc @dabrahams |
@swift-ci Please test |
There are some further things we could do to improve this code: One option is to define an internal shortcut API for doing conversions between standard integer types. This would eliminate the use of an iterator for these basic cases, potentially speeding things up. We could also consider changing the documented public API to require |
Build failed |
Build failed |
I know @moiseev has thoughts about the use of this technique, but another workaround would be to define a protocol |
@xwu We would need to split |
@lorentey Sorry, why? |
@xwu Ah, evidently I do not understand the basic underlying issue with
This is great; I'll try to implement your approach, and see where it gets us. |
It works as long as they're in the same file or you compile with whole-module optimization. Not a great constraint, but… |
c21cfef
to
7682e07
Compare
@xwu Sadly I couldn't get the constraint to work; I keep running into compiler limitations. On master it is currently possible to constrain I could, however, modify the PR to bring integer conversion performance to roughly the same level as it is on master. To do this, I upgraded So the PR as it stands replaces
|
@lorentey Mind if I have a stab at this over the weekend? I am very curious why the compile-time crashes are happening. Mocking up this protocol hierarchy in a playground doesn't crash. |
@lorentey I think I've got a path forward, which turns out to be much simpler. New insight (for me, at least): We don't need to be able to express the intended constraint (which, you're right, strangely won't compile). Instead, we only need to supply a default implementation of That much is doable. |
@xwu Of course, go right ahead! It would be so much better if we could make the constraints work. (I couldn’t reduce the issue, either; there is likely a critical puzzle piece that I’m still missing. I’ll keep looking for it, too.) |
@xwu But we can’t add the constraint on the protocol requirement itself, since it would need to be propagated to every place where an integer conversion is used in generic context. So we would still need to implement the unconstrained variant, wouldn’t we? (I don’t see how setting a default value would help, though. Maybe I misunderstood your approach!) |
@lorentey My approach would change nothing on the protocol requirement; however, the default implementation would be constrained. Therefore, no use in a generic or concrete context is affected. The default value is not crucial to the solution; it's icing for a transition from the status quo and helps the stdlib compile with minimal changes. WIP (the linker is complaining about strings for some reason; I may need to do a clean build): |
@xwu Interesting! What I don't understand is how constraining (An associated type's default value can have no effect in generic contexts, because implementations are free to replace the default type with any other type that satisfies the constraints.) The fundamental problem is that the generic extending/truncating initializer requirement uses an unconstrained binary integer. As long as |
If we're OK with modifying the original proposal, I think it would be possible in today's compiler to use a constraint like Using a sequence rather than a collection seems like a drastic change. However, if the only major use case for Adding |
I haven't read the details, but if you're having a problem with recursive constraints, the workaround these days is to declare the associatedtype in a less-refined concept, e.g. protocol _FixedWidthInteger {
associatedtype Words
}
protocol FixedWidthInteger : BinaryInteger, _FixedWidthInteger
where Words : Sequence && Words.Element : FixedWidthInteger
{
...
} HTH |
@dabrahams That is the initial suggestion I made. As @lorentey's experimentation showed (which I was able to reproduce), trying to use this particular workaround is fine with |
@lorentey Ouch, that's a terribly silly think-o on my part. So, to summarize, what's clear is that both the desired design as well as the go-to workaround for it are not currently possible. IMO, a new workaround that involves a new type In the meantime, I wonder if we get meaningfully closer to the final design by replacing one set of underscored implementations for another. Honestly, |
@dabrahams, @xwu I think the problems with Switching to |
@lorentey Well, if that's the problem, can't we just rename |
7682e07
to
e2864f9
Compare
@xwu: I could get the I pushed a new commit that implements it: The drawback is of course that |
on Fri Jul 07 2017, Xiaodi Wu <notifications-AT-github.com> wrote:
To use
this particular workaround is fine with `Words : Sequence` but breaks
horribly with `Words : Collection`. With the latter constraint, the
compiler crashes while trying to compile functions like `Swift.min`,
which makes no sense whatsoever, suggesting something going horribly
awry.
Sometimes the thing that resolves nonsensical problems when circular
conformances are created is to reorder source files in CMakeLists.txt
:-/
…--
-Dave
|
@@ -2320,24 +2292,24 @@ ${unsafeOperationComment(x.operator)} | |||
} | |||
% end | |||
|
|||
@inline(__always) | |||
@_transparent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is scary!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep; I made this change at the last minute to work around a weird performance regression. I should probably change it back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was changed from transparent because of the "sil explosion" at compile time. Watch the memory used by the swiftc while building the standard library, and if it hits ~6Gb – that's this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Holy moly! I'll revert with prejudice.
public typealias Indices = CountableRange<Int> | ||
public typealias SubSequence = BidirectionalSlice<${Self}.Words> | ||
|
||
public var _value: ${Self} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I made it public to enable @_transparent
/@inline(__always)
members. Since we have _lowWord
, this collection doesn't get used much, so I'll remove the attributes and change this to internal
. Nice!
(I see @_versioned
is a thing; I should read up on it. Any pointers?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
test/Prototypes/BigInt.swift
Outdated
count: end - twosComplementData.count)) | ||
public var words: [UInt] { | ||
_sanityCheck(UInt.bitWidth % Word.bitWidth == 0) | ||
var words: [UInt] = [] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reserveCapacity
here?
@swift-ci Please test |
@swift-ci Please smoke benchmark |
@swift-ci Please Test Source Compatibility |
@swift-ci Please test |
@swift-ci Please smoke benchmark |
@swift-ci Please Test Source Compatibility |
On Jul 11, 2017, at 12:07 AM, Dave Abrahams ***@***.***> wrote:
@lorentey <https://github.com/lorentey>:
BinaryInteger.Words is constrained to a Sequence of UInt elements.
Can you also constrain it to _Indexable, or do things start breaking?
Oh, I should’ve thought to try this before. With `_Indexable`, stdlib builds fine, but I get all sorts of weird errors all over the test suite. This is a new symptom; I’ll see if I can make sense of it.
The requirements _word(at:) and _countRepresentedWords are replaced with the words property.
The problem with this is that you can't get the count in a generic context without iterating the Sequence.
We do have `bitWidth`, though! From the point of view of an integer implementation, `bitWidth` may be an even better fit than just `words.count`.
[U]Int*._lowUWord is renamed _lowWord and promoted to a BinaryInteger requirement to speed up conversions between standard integer types. This results in a 30–40% speedup, so that conversion performance is the same before/after this PR.
Nice!
UInt.Words is implemented as a new bidirectional collection view. It should be a RandomAccessCollection, but defining it as such crashes the compiler. It may be possible to work around this by reordering CMakeLists.txt, but this doesn't seem especially urgent.
Reordering CMakeLists.txt does not typically fix a compiler crash. Did you file a bug report for the crash?
Not yet! I’ll link the bug here when I file it.
On 32-bit platforms, we also have UInt64.Words.
I'm confused. Why don't we have this for every integer type on all platforms?
I wanted to be conservative about the number of new types I introduce. For most of the standard integer types, `Words` would be a single-element collection, and it seemed wasteful to (essentially) duplicate the same custom implementation in all of them — especially in view of the next point.
If that’s not a concern, then I can easily remove the unification. It would actually make the code a little simpler!
The rest of the [U]Int* types implement words in terms of UInt.Words (or UInt64.Words, where applicable).
CollectionOfOne might be better in most of those cases.
Fully agreed; `[U]Int*.Words` should be `CollectionOfOne<UInt>` when possible. However, `CollectionOfOne` implements `RandomAccessCollection`, and sadly it crashes the same way.
I think BinaryInteger.Words would actually work better if it was officially downgraded from a Collection to a Sequence.
Why? That strongly implies you can only make a single pass over it, precludes iteration from MSW to LSW, and makes certain operations (e.g. words.count) unavailable. What is the upside for users of the library?
My underlying assumption is that `words` would only be used for implementing integer conversions, where single-pass LSW iteration is adequate. The upside would be that people providing external implementations of `BinaryInteger` won’t need to implement the full `Collection` API. For example, the `BigInt` prototype currently generates a temporary array for its `words`; it would be (slightly) easier to replace this with an on-the-fly sequence than an on-the-fly collection view.
Note that direct MSW->LSW iteration over `words` is not practical in general with SE-0104, as it doesn’t require `Words` to be a `BidirectionalCollection`.
Can we make that happen? (It'd be a small change to a somewhat esoteric feature, but I assume it needs a round trip to swift-evolution.)
Being able to index into words seems of limited usefulness to me, and adds considerable complexity: this PR would be 50–70 lines shorter if UInt.Words and DoubleWidth.Words were just single-pass Sequences. When converting between big integer types, it is nice to have an exact words.count; however, bitWidth already provides a quite usable alternative. (Without IndexDistance's slight inconvenience in generic contexts, even.)
I am not in favor of such a change at the moment. I'm not sure I ever would be in favor, but a basic prerequisite would be to implement a variable width integer type (e.g. BigNum)—with generic multiplication and division operators—whose words only satisfy Sequence, to prove that only Sequence conformance is needed.
I think you envision more use cases for `words` than I do. The only use case I can see for `words` as it stands in SE-0104 is to enable efficient conversion between any two implementations of `BinaryInteger`.
Indeed, it is not practical to express multiplication or division directly on `Sequence`s of `UInt`s, but I don’t think a forward collection would make much difference, either. A `RandomAccessCollection` would work slightly better, but it seems much more practical to always start by converting inputs to the same known representation, possibly rolling some preliminary steps (e.g. the normalization of the divisor) into the conversion loop. But that is just as easy when `words` is a sequence!
|
Build comment file:Optimized (O)Regression (1)
Improvement (3)
No Changes (317)
Unoptimized (Onone)Regression (1)
Improvement (1)
No Changes (319)
Hardware Overview
|
Build failed |
D'oh! I'll need to figure out how to run the 32-bit watchOS tests. |
We don’t need Words’ members to be @_transparent; simple conversions use _lowWord instead.
This fixes integer conversion issues on 32-bit platforms.
This adds 8 more collection views, but makes integer definitions more consistent across all the available bit widths and between 32-bit and 64-bit platforms.
I found the bug that broke 32-bit platforms; could you please trigger new builds for me? |
@swift-ci Please test |
@swift-ci Please smoke benchmark |
@swift-ci Please Test Source Compatibility |
@swift-ci Please smoke benchmark |
@swift-ci Please Test Source Compatibility |
Build comment file:Optimized (O)Regression (3)
Improvement (2)
No Changes (316)
Unoptimized (Onone)Improvement (2)
No Changes (319)
Hardware Overview
|
This PR is a followup to #10460, providing a better fix for SR-5275 by implementing BinaryInteger.words as the primary way of accessing the words of an integer. To achieve this, it defines the
words
property for all standard fixed with integer types, includingDoubleWidth
. TheWords
associated types are defined as views of a corresponding integer value.(To reduce the number of such collection types, only
UInt.Words
(and on 32-bit platforms,UInt64.Words
) are actually defined; the other standard integers use typealiases to one of these two.)For existing code in the standard library that accesses words in generic contexts, I had to work around not being able to explicitly constrain
BinaryInteger.Words
to aCollection
ofUInt
elements. Currently onlyFixedWidthInteger.init<T: BinaryInteger>(extendingOrTruncating:)
uses words in such a way, but it is a rather important initializer that is used in all sorts of integer conversions.I needed to define some sort of simplified internal API to iterate over the words of an integer in generic contexts. While the original
_word(at:)
&_countRepresentedWords
APIs look ideal for this, it does not seem possible to define them in terms ofwords
without index arithmetic operations that involve numeric conversions. Numeric conversions in_word(at:)
lead to tricky infinite recursions when the method is called from the initializers implementing these conversions.So as a proof of concept I added a
BinaryInteger._wordsIterator()
method that returns a type-erased iterator over the words of the integer. This is a somewhat ridiculous thing to do during, say, anUInt16
toInt32
conversion, but it does prove that the concept is viable. (The code passes all tests, although it does require a small change in a SILOptimizer test case.)