New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix Issue 19532 - chunkBy errors involving non-forward input ranges #6841
Fix Issue 19532 - chunkBy errors involving non-forward input ranges #6841
Conversation
private ElementType!Range prev; | ||
|
||
this(Range range, ElementType!Range _prev) | ||
this(ref Range range, ElementType!Range _prev) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should probably be return ref
cc @WalterBright
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it simply become: this(return ref Range range, ElementType!Range _prev)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Checked my local unit test build, the iteration.d file is built with -dip1000
. My impression was that this flag would catch needed changes. I'll wait additional clarification before making changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WalterBright Can you clarify? Should this be return ref
as per @thewilsonator? The PR as written compiles with -dip1000
. I searched phobos code and looked through the DIP 1000 document, but didn't see an example using return ref
in a constructor function. I'd like to identify the correct way to write this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually, the compiler is able to infer parameter attributes like scope
and return
for templates, lambdas, etc. so in many cases adding the attributes manually is not necessary / has no effect. However adding attributes is needed for cases where the compiler can't infer things on its own.
Which are those cases is highly implementation specific, and it depends on a case by case basis. As rule of thumb, try to go first without scope
/return
and if the compiler complains, add them as necessary.
Most importantly, have test cases involving static arrays in @safe
unittests, as the escape analysis enabled by -dip1000
will emit errors only in @safe
functions, and the point of the escape analysis is to disallow escaping a reference to a stack-allocated object.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ZombineDev Thanks for the explanation. Current chunkBy
unit tests are not @safe
, they are @system
instead. There is a separate issue listed for this, issue 18161. According to that report the problem is that the implementation for the forward range case uses RefCounted
.
My personal preference would be to leave the issue of making chunkBy
@safe
as a separate PR, and have this PR focus on the bug being fixed. That is, leave the implementation as written, without introducing scope / return
. However, if reviewers prefer, I could try to make the non-forward input range implementation @safe
and introduce unit tests for this purpose.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took a closer look at this topic and experimented with creating @safe
unit tests. Best I can tell, DIP 1000 is saying that the technique I used is not considered @safe
under -dip1000
. First time I've really looked at DIP 1000, but my interpretation is that it is saying that the GC can no longer be used to manage lifetime in @safe
code, and instead some other technique, such as reference counting, must be used. Implementing this appears be an issue, perhaps temporary, as RefCounted
is @system
at the moment.
It'd be fantastic if I've missed something basic and there's an easy way satisfy@safe
under -dip1000
. If so, great. But otherwise I recommend going forward with the PR as written. The bug being addressed is quite significant, enough that it makes chunkBy
dangerous to use. It'd be better to fix the bug even if it's @system
under -dip1000
. The forward input range implementation of chunkBy
is @system
due to use of RefCounted
, so presumably both pieces of the chunkBy
implementation could be made @safe
together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise looks good
2c2e2cc
to
b179840
Compare
Thanks for your pull request and interest in making D better, @jondegenhardt! We are looking forward to reviewing it, and you should be hearing from a maintainer soon.
Please see CONTRIBUTING.md for more information. If you have addressed all reviews or aren't sure how to proceed, don't hesitate to ping us with a simple comment. Bugzilla references
Testing this PR locallyIf you don't have a local development environment setup, you can use Digger to test this PR: dub fetch digger
dub run digger -- build "master + phobos#6841" |
This PR address problems in
std.algorithm.chunkBy
when operating on non-forward input ranges.chunkBy
has separate implementations for forward and non-forward input ranges.The non-forward version works by instantiating a new struct to iterate over each "chunk" in the input range. One problem with the existing implementation is that the range being chunked is getting copied when these structs are created. This causes them to get out of sync if the same chunk is retrieved multiple time. This PR addresses this by passing a reference to the underlying range to these structs so these structs stay in sync with each other and with the parent range. These changes are in the
ChunkByChunkImpl
struct.A second problem is that the existing implementation uses
empty
on the underlying range to defineempty
on the outer chunked range. This is a logical problem because the outer range becomes "empty" when the last chunk becomes empty, without apopFront
on the outer range. This causes common iteration patterns to issue apopFront
on the outer range after it becomes empty. Compounding the problem, thechunkBy
popFront
andfront
routines do not assert when called in an empty state. This makes diagnosis more difficult. Asserts are triggered when the caller or callee detects it.std.algorithm.map
detects this case, which is why it shows up in issue reports. (map.popFront
is called, it asserts because the underlyingchunkBy
range is "empty".) Iteration patterns that callchunkBy.popFront
will have the call forwarded to the underlying input range, which is where the error is triggered. This is why range producing algorithms likeroundRobin
,heapify
(binary heap), andmultiwayMerge
trigger asserts when the resulting range is passed tochunkBy
.This second problem is fixed by changing the "empty" definition to reflect that the outer range is not empty until
popFront
is run on the last chunk. Assert checks were also added to thepopFront
andfront
routines. The effect is to divorce the notion of emptying an individual chunk from the act ofpopFront
'ing the chunk off the outer range. UsingpopFront
on a chunk to pop it off the outer range still has the effect of emptying the chunk.Issues 16169 and 17966 are also fixed by this PR, they are variations on the theme.
The forward input range implementation was not examined as part of this PR. Some of the new unit test cases also include forward ranges were it makes sense.