Range adaptor: cache #1364

Merged
merged 1 commit into from Sep 25, 2014

Conversation

Collaborator

monarchdodra commented Jun 22, 2013

As discussed in this thread:
http://forum.dlang.org/thread/ovbjcmogezbvsxrwfcol@forum.dlang.org

This provides a range adaptor that caches the result of another range.

Meant as a lazy alternative to array. Not much else to say...?

Implemented as an obscure type

Wording in documentation might suck.

Please destroy.
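For illustration, a minimal sketch of the intended usage (the chain below is a hypothetical example, not code from this PR; `cache` sits between an expensive step and a consumer that re-reads `front`):

```d
import std.algorithm : cache, filter, map;
import std.range : take;
import std.stdio : writeln;

void main()
{
    // Stand-in for an expensive computation: without cache, filter
    // would evaluate it twice per passing element (once for the
    // predicate, once when front is read downstream).
    auto r = [1, 2, 3, 4, 5]
        .map!(x => x * x)
        .cache                      // evaluate each front exactly once
        .filter!(x => x % 2 == 1);
    writeln(r.take(2));             // prints [1, 9]
}
```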

@eco eco commented on an outdated diff Jun 23, 2013

std/algorithm.d
@@ -614,6 +614,204 @@ unittest
}
/**
+$(D cache) eagerly evaluates $(D front) of $(D range)
+on each construction or call to $(D popFront),
+to store the result in a cache.
+The result is then directly returned when $(D front) is called,
+rather than re-evaluated.
+
+This can be useful a function to place in a chain, after functons
+that have expensive evaluation, as a lazy alternative to $(XREF array,array).
+In particular, it can be placed after a call to $(D map), or before a call
+to $(D filter).
+
+$(D cache) may provide bidirectional iteration if needed, but since
+this comes at an increased cost, it must be explicitly requested via the
+call to $(D cahceBidir).
@eco

eco Jun 23, 2013

Member

Typo on cacheBidir

@andralex andralex commented on an outdated diff Jun 23, 2013

std/algorithm.d
+ }
+ }
+
+ static if (isInfinite!R)
+ enum empty = false;
+ else
+ bool empty() @property
+ {
+ return source.empty;
+ }
+
+ @property
+ {
+ E front()
+ {
+ version(assert) if (source.empty) throw new RangeError;
@andralex

andralex Jun 23, 2013

Owner

s/source.empty/empty/

@andralex andralex commented on an outdated diff Jun 23, 2013

std/algorithm.d
+ else
+ bool empty() @property
+ {
+ return source.empty;
+ }
+
+ @property
+ {
+ E front()
+ {
+ version(assert) if (source.empty) throw new RangeError;
+ return f;
+ }
+ static if (Bidir) E back()
+ {
+ version(assert) if (source.empty) throw new RangeError;
@andralex

andralex Jun 23, 2013

Owner

s/source.empty/empty/

@andralex andralex and 1 other commented on an outdated diff Jun 23, 2013

std/algorithm.d
+ static if (Bidir) E back()
+ {
+ version(assert) if (source.empty) throw new RangeError;
+ return b;
+ }
+ static if (hasLength!R) auto length()
+ {
+ return source.length;
+ }
+ }
+
+ void popFront()
+ {
+ version(assert) if (source.empty) throw new RangeError;
+ source.popFront();
+ if (!source.empty)
@andralex

andralex Jun 23, 2013

Owner

I think this may affect performance because we have two tests of source.empty per iteration. Unfortunately I don't see an easy way out of this. One possibility would be to add an enum:

enum State { cached, staleCache, done }`

and then store such a value as part of Cache and update it appropriately. But then management of that state will have its own overhead.
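A rough sketch of that enum-based idea (hypothetical, not the PR's code — the PR caches eagerly, while this variant refreshes lazily so each primitive only needs one test):

```d
import std.range.primitives;

// Hypothetical: track cache validity explicitly so popFront does a
// single source.empty test, and front refreshes only when stale.
struct StatefulCache(R) if (isInputRange!R)
{
    enum State { cached, staleCache, done }

    private R source;
    private ElementType!R f;
    private State state;

    this(R r)
    {
        source = r;
        state = source.empty ? State.done : State.staleCache;
    }

    @property bool empty() { return state == State.done; }

    @property ElementType!R front()
    {
        if (state == State.staleCache)
        {
            f = source.front;      // evaluated at most once per element
            state = State.cached;
        }
        return f;
    }

    void popFront()
    {
        source.popFront();
        state = source.empty ? State.done : State.staleCache;
    }
}

void main()
{
    auto c = StatefulCache!(int[])([1, 2, 3]);
    assert(c.front == 1);
    c.popFront();
    assert(c.front == 2);
}
```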

@monarchdodra

monarchdodra Jun 24, 2013

Collaborator

Given the flow of the code, though, this should be relatively easy to optimize away, no? Since it's basically "if (empty) return" followed by "if (empty) break"... That, and it's about the same as almost every adaptor (take, filter, splitter, joiner...).

I (personally) find that straight-up update tends to be easier than lazy update. Plus, it gives more deterministic behavior, which is always good.

Owner

andralex commented Jun 23, 2013

I think a complete caching proposal would add two more overloads of cache: cache(range, 42) caches up to 42 items ahead of the current one and offers a method auto ref ElementType!R ahead(size_t) that offers access for up to 42 elements and size_t aheadAvailable() returning how many items ahead are available.

Also, cache(42, 1024) caches 42 elements ahead and 1024 elements behind. Should offer auto ref ElementType!R behind(size_t) and size_t behindAvailable().

We should also offer cacheBehind for the rare case in which lookbehind is needed but not lookahead.

As a rule of thumb for index calculations, cache(range, 0, 0) should only cache front.

Owner

andralex commented Jun 23, 2013

As a rule of thumb for index calculations, cache(range, 0, 0) should only cache front.

(Not sure whether that's the best rule.)

Collaborator

monarchdodra commented Jun 24, 2013

For, say, "cache(range, 42)", what would be the semantics (before I get working on it, I mean)? Is this OK?
"Eagerly evaluates the first 42 elements of the range, then caches them in a private buffer. As elements are popped from the front, more items are eagerly evaluated ahead."

"Calls to index[i] that are outside of that cache will be lazily evaluated, but will not raise an error."

Would aheadAvailable() really be very useful? Would it be any different from "min(42, length)"? Maybe it might be fishy for infinite ranges...? I guess it can't hurt to have it.

Implementation wise, this would be implemented via a circular buffer? (Cycle) ?

Would it be OK to instead have this signature: cache!1024(range) (look ahead size must be static ?)

I'm not sure how the "cache behind" would work? Ranges don't "grow", and indexes can't be negative...? The semantic usage would seem very strange to me.

Member

DmitryOlshansky commented Jun 24, 2013

I think a complete caching proposal would add two more overloads of cache: cache(range, 42) caches up to 42 items ahead of the current one and offers a method auto ref ElementType!R ahead(size_t) that offers access for up to 42 elements and size_t aheadAvailable() returning how many items ahead are available.

[...]

Truth be told, I don't see how this design will carry its own weight without a good enough use case. Also, I'd say that having behind or ahead return an RA range instead would eliminate an extra primitive.

I see the immediate value in using singular .cache in an UFCS chain as a bit of aspirin to ease the transient .front headache.

Fixed-sized look-ahead/look-behind (commonly with length=1) could be useful but it turns out that having these things as "always-on" is a bad idea performance-wise.

Now, the main trouble with enable/disable and a std.algorithm-style adapter is that, starting with an InputRange and calling auto withLookahead = lookahead(range),
you may have fun with the new range withLookahead. But then how do you stop this accumulation and keep the perceived position intact? Another obstacle is that each adapter is a new type unrelated to the original source.

A second point is that, e.g., a parser requires unrestricted lookahead length...

A third point is that where and how (as in container/allocation policy) this adapter stores items is important.

Would it be OK to instead have this signature: cache!1024(range) (look ahead size must be static ?)

No, nor do I think that a limit defined at run-time will suffice.

Collaborator

monarchdodra commented Jun 28, 2013

So which direction should we take? Truth be told, I don't really see much point to having "cache(range, howMany)": ranges are either iterated RA style or forward style. I know of few algorithms that pop the range while accessing only the first N elements.

When I proposed this range, I really only had simple caching in mind. I think it provides useful and simple functionality.

I see the immediate value in using singular .cache in an UFCS chain as a bit of aspirin to ease the transient .front headache.

I'm actually unsure it would help much: as soon as popFront is called, the cache is refreshed and the internal range popped too, so the problem is still there.

Member

DmitryOlshansky commented Jun 28, 2013

I see the immediate value in using singular .cache in an UFCS chain as a bit of aspirin to ease the transient .front headache.

I'm actually unsure it would help much: as soon as popFront is called, the cache is refreshed and the internal range popped too, so the problem is still there.

Well, by itself no, since one has to deep copy first, but something like:

range.map!(x => x.idup).cache

will at least avoid allocations per access to front down the pipeline.

In short, it makes sure you do the deep copy only once. In fact, the irony is that map used to cache values, but then it was fixed not to. Now we finally allow folks to do it again, but explicitly.
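That pattern, spelled out (a sketch mirroring the unittest added in this PR; the input data is made up):

```d
import std.algorithm : cache, map;

void main()
{
    char[][] bufs = ["hello".dup, "world".dup];

    // Each element is idup'd exactly once, when cache evaluates front;
    // without .cache, every access to front would allocate a fresh copy.
    auto strings = bufs.map!(a => a.idup).cache;

    // Two reads of front yield the very same immutable array.
    assert(strings.front is strings.front);
}
```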

Member

quickfur commented Jul 18, 2013

The transient .front problem can't (at the moment) be solved generically, certainly not with Cache. You basically have to know what the element type is, and how to make a copy of it. If it's a shallow array, you can use .idup (you may have to do more than .idup if the array elements themselves are reference types). Otherwise, if it's a reference type, you have to know how to make a copy of said type. Barring a generic deep-copy function in Phobos, this is the best we can do right now.

Member

quickfur commented Sep 25, 2013

ping

Might want to rebase this to prod the autotester onward.

Collaborator

monarchdodra commented Oct 6, 2013

Might want to rebase this to prod the autotester onward.

Rebased.

ping

I'm here. I don't recall there is anything else that requires changing?

Member

quickfur commented Dec 30, 2013

Hmm. Now there's a whole bunch of compile errors. Rebase again?

Collaborator

monarchdodra commented Jan 2, 2014

Hmm. Now there's a whole bunch of compile errors. Rebase again?

Fixed. I also did a bit of cleanup.

Member

eco commented Jan 2, 2014

I just hit the problem this solves so I'm eager to see this included.

I had a long pseudo member chain with a std.net.curl call early in it. It was taking far longer than it should have to process so sticking some writelns in revealed everything was happening twice because of a filter() call near the end of the chain.

Once this is merged I think it would be useful to add some warnings to functions like filter() that advise people to use cache() if the preceding elements of the chain are costly.
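The double-evaluation effect described above can be made visible with a counter in the chain (a contrived stand-in for the std.net.curl call, not eco's actual code; the call counts assume Phobos' usual filter behavior of testing front for the predicate and reading it again when consumed):

```d
import std.algorithm : cache, filter, map;
import std.array : array;
import std.stdio : writeln;

void main()
{
    int calls;
    auto noCache = [1, 2, 3, 4]
        .map!((x) { ++calls; return x * 10; })  // pretend this is expensive
        .filter!(x => x > 15)
        .array;
    // filter evaluates front once per predicate test and again per
    // consumed element: 4 tests + 3 yielded elements.
    writeln(calls);

    calls = 0;
    auto withCache = [1, 2, 3, 4]
        .map!((x) { ++calls; return x * 10; })
        .cache                                  // evaluate each front once
        .filter!(x => x > 15)
        .array;
    writeln(calls);                             // one evaluation per element
}
```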

Collaborator

monarchdodra commented Jan 2, 2014

I just hit the problem this solves so I'm eager to see this included.

That's good to hear, because I was actually wondering how useful this really is, and was having trouble coming up with a good use case.

Indeed, filter does "double" calls to front, and cache is a good adapter to place right before (in particular, as you say, if the previous call is expensive, such as a map).

Once this is merged I think it would be useful to add some warnings to functions like filter() that advise people to use cache() if the preceding elements of the chain are costly.

No reason not to do it now :) if it gets merged.

Member

quickfur commented Jan 2, 2014

On Thu, Jan 02, 2014 at 01:54:15PM -0800, monarch dodra wrote:

I just hit the problem this solves so I'm eager to see this
included.

That's good to hear, because I myself was actually wondering how
useful this actually is, and was having trouble coming up with a good
use case.

Indeed, filter does "double" calls to front, and cache is a good
adapter to place right before (in particular, as you say, if the
previous call is expensive, such as a map).
[...]

Just out of curiosity, why can't filter be changed so that it doesn't
access .front twice?

T

People tell me that I'm paranoid, but they're just out to get me.

Member

eco commented Jan 2, 2014

(in particular, as you say, if the previous call is expensive, such as a map)

@monarchdodra It's actually not just the previous call. The std.net.curl call was several steps further up the chain so everything was getting double called all the way up to the beginning of the chain which is terribly unexpected. I actually had to comment out each step of the chain over and over until I isolated what was causing it.

Here's the code.

Perhaps filter() should make use of cache() internally to avoid the issue entirely.

Collaborator

monarchdodra commented Jan 3, 2014

Just out of curiosity, why can't filter be changed so that it doesn't access .front twice?

Intrinsically, I'm not sure it's possible, because of the way it works. Each element must be tested, and then the user calls front, leading to a second call to front. I don't think there's any way to work around that.
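To make that concrete, here is the general shape of a filter adaptor (illustrative only, simplified from what Phobos does; details differ):

```d
import std.range.primitives;

// Nothing is stored between the predicate test and the user's read,
// so each passing element has source.front evaluated twice.
struct SimpleFilter(alias pred, R) if (isInputRange!R)
{
    R source;

    this(R r)
    {
        source = r;
        skipFailing();
    }

    private void skipFailing()
    {
        while (!source.empty && !pred(source.front))  // 1st evaluation
            source.popFront();
    }

    @property bool empty() { return source.empty; }

    @property auto front() { return source.front; }   // 2nd evaluation

    void popFront()
    {
        source.popFront();
        skipFailing();
    }
}

void main()
{
    auto f = SimpleFilter!(x => x % 2 == 0, int[])([1, 2, 3, 4]);
    assert(f.front == 2);
}
```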

Perhaps filter() should make use of cache() internally to avoid the issue entirely.

filter could cache the result internally, but doing that is actually difficult and not 100% generic. It'll fail for ranges with const elements or non-assignable elements. That's a restriction that's not acceptable. In contrast, cache is opt-in. Furthermore, cache (as I wrote it) will error with a very explicit explanation (IMO) about the requirements on the element type (assignability of ElementType to Unqual!ElementType).

Member

quickfur commented Jul 17, 2014

ping all reviewers
Are we moving forward with this, or is this a bad idea and we should close it?

Collaborator

monarchdodra commented Jul 17, 2014

I still think it would be useful to have. It can be a useful complement in otherwise long and costly UFCS chains.

But I'm not passionate about it. Anybody of the opinion this range is actually bad/wrong?

Semi-on-topic: I'm not too sure about providing bidirectional access anymore. 99% of the functionality is in forward iteration anyway, and bidirectional iteration could have a surprising effect, as the "last" element would be evaluated twice.

Member

DmitryOlshansky commented Jul 17, 2014

IMO we should go with it. Thoughts about cached range as a new kind of range need some public discussion.

Collaborator

Dicebot commented Jul 17, 2014

I have this bookmarked for some more detailed experiments during the upcoming weekend.

Member

eco commented Jul 17, 2014

I'm 100% in favor of this. Right now I have to use array(), which eagerly caches everything unnecessarily, in situations where I'd use cache(). cache() would be a big improvement because it lets you retain some laziness and uses far less memory.

I can't really think of uses for the multi-element cache. Maybe to compartmentalize part of a long chain so there is less pressure on the CPU cache? I welcome the feature in case it becomes handy for something though.

Unless there is some huge defect in the concept I'm not seeing it should definitely go forward once everyone who wants to review the implementation has done so.

@eco eco commented on the diff Jul 17, 2014

std/algorithm.d
+to store the result in a cache.
+The result is then directly returned when $(D front) is called,
+rather than re-evaluated.
+
+This can be useful a function to place in a chain, after functons
+that have expensive evaluation, as a lazy alternative to $(XREF array,array).
+In particular, it can be placed after a call to $(D map), or before a call
+to $(D filter).
+
+$(D cache) may provide bidirectional iteration if needed, but since
+this comes at an increased cost, it must be explicitly requested via the
+call to $(D cacheBidirectional). Furthermore, a bidirectional cache will
+evaluate the "center" element twice, when there is only one element left in
+the range.
+
+$(D cache) does not provide random access primitives,
@eco

eco Jul 17, 2014

Member

I wonder if @DmitryOlshansky's sliding window range idea could be used here and, more broadly, if there is actually some overlap between cache() and his concept.

@DmitryOlshansky

DmitryOlshansky Jul 17, 2014

Member

There is, which is something I find intriguing after looking at this again. And given that @andralex wasn't objecting to adding a couple of primitives, that's a good sign.

I'm going to (hopefully soon) do a short writeup on a buffer range design and new i/o.

@quickfur

quickfur Jul 17, 2014

Member

I like the idea of a sliding window range. It could even be applicable to the prospective std.io.

Member

quickfur commented Aug 5, 2014

Looks like we have merge conflicts. Please rebase.

@JakobOvrum JakobOvrum commented on an outdated diff Aug 20, 2014

std/algorithm.d
@@ -731,6 +732,231 @@ unittest
}
/**
+$(D cache) eagerly evaluates $(D front) of $(D range)
+on each construction or call to $(D popFront),
+to store the result in a cache.
+The result is then directly returned when $(D front) is called,
+rather than re-evaluated.
+
+This can be useful a function to place in a chain, after functons
@JakobOvrum

JakobOvrum Aug 20, 2014

Member

s/useful a function/a useful function/
s/functons/functions/ (typo)

@JakobOvrum JakobOvrum commented on an outdated diff Aug 20, 2014

std/algorithm.d
+}
+@safe pure nothrow unittest
+{
+ //safety etc
+ auto a = [1, 2, 3, 4];
+ assert(equal(a.cache(), a));
+ assert(equal(a.cacheBidirectional(), a));
+}
+unittest
+{
+ char[][] stringbufs = ["hello".dup, "world".dup];
+ auto strings = stringbufs.map!((a)=>a.idup)().cache();
+ assert(strings.front is strings.front);
+}
+
+private struct Cache(R, bool Bidir)
@JakobOvrum

JakobOvrum Aug 20, 2014

Member

Bidir should not be capitalized

@JakobOvrum JakobOvrum commented on an outdated diff Aug 20, 2014

std/algorithm.d
+ assert(strings.front is strings.front);
+}
+
+private struct Cache(R, bool Bidir)
+{
+ private
+ {
+ alias E = ElementType!R;
+ alias UE = Unqual!E;
+
+ R source;
+ UE f = UE.init;
+ static if (Bidir) UE b = UE.init;
+
+ static assert(isAssignable!(UE, E) && is(UE : E),
+ algoFormat("Cannot instanciate a Cache from %s because is %s not assignable to %s.", R.stringof, E.stringof, UE.stringof));
@JakobOvrum

JakobOvrum Aug 20, 2014

Member

Instantiate*

I don't think referring to Cache in error messages (or documentation) is wise as long as Cache is undocumented.

@JakobOvrum JakobOvrum and 1 other commented on an outdated diff Aug 20, 2014

std/algorithm.d
+ if (!range.empty)
+ {
+ f = range.front;
+ static if (Bidir) b = range.back;
+ }
+ }
+
+ static if (isInfinite!R)
+ enum empty = false;
+ else
+ bool empty() @property
+ {
+ return source.empty;
+ }
+
+ E front() @property
@JakobOvrum

JakobOvrum Aug 20, 2014

Member

if the underlying range has assignable elements, then this range can propagate that with setter properties

@monarchdodra

monarchdodra Aug 20, 2014

Collaborator

Hum... I guess. Though I wonder how often one would use cache with a range that has assignable elements... Why not I guess?

@JakobOvrum JakobOvrum and 2 others commented on an outdated diff Aug 20, 2014

std/algorithm.d
+This can be useful a function to place in a chain, after functons
+that have expensive evaluation, as a lazy alternative to $(XREF array,array).
+In particular, it can be placed after a call to $(D map), or before a call
+to $(D filter).
+
+$(D cache) may provide bidirectional iteration if needed, but since
+this comes at an increased cost, it must be explicitly requested via the
+call to $(D cacheBidirectional). Furthermore, a bidirectional cache will
+evaluate the "center" element twice, when there is only one element left in
+the range.
+
+$(D cache) does not provide random access primitives,
+as $(D cache) would be unable to cache random accesses.
+If $(D Range) provides slicing primitives,
+then $(D cache) will also be slicible,
+but $(D hasSlicing!Cache) will not yield true.
@monarchdodra

monarchdodra Aug 20, 2014

Collaborator

It's because it does not provide random access, and hasSlicing checks for RA.

@JakobOvrum

JakobOvrum Aug 21, 2014

Member

Roger, thanks

@schuetzm

schuetzm Aug 23, 2014

Contributor

Could you write "will not yield true (because hasSlicing implies random access)." to clarify?

Member

JakobOvrum commented Aug 20, 2014

The currently proposed functionality is forward-compatible with Andrei's proposed extensions. I think we should focus on the currently proposed functionality first, as it has firmly demonstrated value.

Dicebot removed the needs review label Aug 21, 2014

Collaborator

Dicebot commented Aug 21, 2014

Removing the "needs review" label; there does seem to be enough approval to proceed with this.

Collaborator

monarchdodra commented Aug 21, 2014

Removing the "needs review" label; there does seem to be enough approval to proceed with this.

I'll just close this then.

Collaborator

Dicebot commented Aug 21, 2014

Erm? You are not going to proceed with it anymore? It should still be kept open if someone wants to take it over as there is clear interest in having it merged.

Dicebot reopened this Aug 21, 2014

Collaborator

Dicebot commented Aug 21, 2014

or, probably, you have misread it as "does NOT seem to be enough approval"? :)

Member

eco commented Aug 21, 2014

Yeah, there must have been some miscommunication here. I'm looking forward to this getting merged. There's potentially some overlap with Dmitry's sliding window idea but this is definitely a feature we should have one way or another.

@monarchdodra monarchdodra commented on an outdated diff Aug 22, 2014

std/algorithm.d
+ assert(strings.front is strings.front);
+}
+
+private struct Cache(R, bool bidir)
+{
+ private
+ {
+ alias E = ElementType!R;
+ alias UE = Unqual!E;
+
+ R source;
+ UE f = UE.init;
+ static if (bidir) UE b = UE.init;
+
+ static assert(isAssignable!(UE, E) && is(UE : E),
+ algoFormat("Cannot instanciate range with %s because is %s elements are not assignable to %s.", R.stringof, E.stringof, UE.stringof));
@monarchdodra

monarchdodra Aug 22, 2014

Collaborator

I suck at English... will re-fix tonight :(

Collaborator

monarchdodra commented Aug 22, 2014

or, probably, you have misread it as "does NOT seem to be enough approval"? :)

Actually, yeah, that is how I read it :D Weird.

@DmitryOlshansky DmitryOlshansky and 1 other commented on an outdated diff Aug 23, 2014

std/algorithm.d
+ enum empty = false;
+ else
+ bool empty() @property
+ {
+ return source.empty;
+ }
+
+ E front() @property
+ {
+ import core.exception : RangeError;
+ version(assert) if (empty) throw new RangeError();
+ return f;
+ }
+ static if (bidir) E back() @property
+ {
+ import core.exception : RangeError;
@DmitryOlshansky

DmitryOlshansky Aug 23, 2014

Member

May just as well import it on the struct level.

UPDATE: private import core.exception : RangeError;

@monarchdodra

monarchdodra Aug 23, 2014

Collaborator

Remind me what a private import implies... ?

Collaborator

monarchdodra commented Aug 25, 2014

Alright, I touched up the documentation mistakes where issues were brought up.

I also tweaked the implementation to hold a TypeTuple field. It makes the constructor a tiny bit more complex (edit: fixed), but it limits the overall need for static ifs later on (such as in save), since you can just do a TypeTuple assignment without caring about how many members it holds.

Personal comment: I really love generic tuples. Those things are amazing, and let you do some really wild things...

I haven't squashed yet though, so as to keep track of any changes. I'll do that when it's "LGTM".
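The TypeTuple trick described above, in minimal form (a sketch with made-up names, not the PR's exact code):

```d
import std.typetuple : TypeTuple;  // std.meta.AliasSeq in later releases

// One declaration expands to one or two fields depending on bidir,
// and a single tuple assignment covers them all, so use sites like
// save need no static ifs.
struct Sketch(bool bidir)
{
    static if (bidir) alias CacheTypes = TypeTuple!(int, int);
    else              alias CacheTypes = TypeTuple!int;
    CacheTypes caches;
}

void main()
{
    Sketch!true a, b;
    a.caches[0] = 1;
    a.caches[1] = 2;
    b.caches = a.caches;  // one assignment, however many fields exist
    assert(b.caches[0] == 1 && b.caches[1] == 2);
}
```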

@JakobOvrum JakobOvrum and 1 other commented on an outdated diff Aug 25, 2014

std/algorithm.d
+}
+
+private struct Cache(R, bool bidir)
+{
+ import core.exception : RangeError;
+
+ private
+ {
+ alias E = ElementType!R;
+ alias UE = Unqual!E;
+
+ R source;
+
+ static if (bidir) alias Args = TypeTuple!(UE, UE);
+ else alias Args = TypeTuple!UE;
+ Args args = Args.init;
@JakobOvrum

JakobOvrum Aug 25, 2014

Member

args? This is never passed to a function AFAICS

@monarchdodra

monarchdodra Aug 25, 2014

Collaborator

That's a declaration of the member field(s) args, defined as type Args. It's a pretty bad name, but I don't really know how else to name a field (or fields) that carries both front and back, or the type name for that matter... Maybe CacheTypes and caches instead?

@monarchdodra

monarchdodra Aug 25, 2014

Collaborator

Fixed

@JakobOvrum

JakobOvrum Aug 25, 2014

Member

New name is much better, thanks

@ghost

ghost commented Sep 10, 2014

Other than a few styling / unittest nitpicks this LGTM. It's a welcome feature too, and should be in the next changelog.

Member

DmitryOlshansky commented Sep 10, 2014

@AndrejMitrovic How about adding a special label like "to the changelog" to mark things for later inclusion into the changelog?

@ghost

ghost commented Sep 10, 2014

I never thought about that, good idea actually. However to avoid having to remove labels manually when a new changelog is made (because old pulls will still have the changelog label) I suggest we use version-tagged labels, e.g. changelog_v2.068.

Collaborator

monarchdodra commented Sep 10, 2014

I don't know how the changelog works. Should I do anything else here in regards to that?

@ghost

ghost commented Sep 10, 2014

I don't know how the changelog works. Should I do anything else here in regards to that?

No need to worry, we'll handle it.

Member

DmitryOlshansky commented Sep 10, 2014

@AndrejMitrovic Cool. Versionized labels feel just right.

Collaborator

Dicebot commented Sep 10, 2014

Note that this won't be in 2.067 as branch for it has been already forked : https://github.com/D-Programming-Language/phobos/tree/2.067

Unless it is going to be restarted, of course.

@ghost

ghost commented Sep 10, 2014

Note that this won't be in 2.067 as branch for it has been already forked

I don't get it, 2.066 just came out. 2.067 can't come out that quickly, can it? And what commits is it going to contain anyway, there haven't been that many?

Member

yebblies commented Sep 10, 2014

I don't get it, 2.066 just came out. 2.067 can't come out that quickly, can it? And what commits is it going to contain anyway, there haven't been that many?

Everything that went into master while 2.066 was in alpha/beta except for the regression fixes which went into both.

@ghost

ghost commented Sep 10, 2014

Ah. I'm clearly out of my depth now for how things are done. I guess we'll have to label it 2.068 then? But isn't that like... a year from now (judging by how infrequent our releases are)? :/

Member

yebblies commented Sep 10, 2014

The idea is that releases should be more frequent. Who knows what will actually happen, though.

Collaborator

monarchdodra commented Sep 11, 2014

AndrejMitrovic added the changelog_v2.067 label a day ago

Oh, OK. I get it now.

Member

DmitryOlshansky commented Sep 11, 2014

It seems very close to the idea of Scala's sliding (they had it way back in 2009):
http://daily-scala.blogspot.ru/2009/11/iteratorsliding.html

Collaborator

monarchdodra commented Sep 11, 2014

Addressed the points raised by @AndrejMitrovic in regards to style. Also improved the unittest to show the effects of using cache over not using it (in a fashion I find well expressed, mind you).

I also removed "support" for objects with a disabled this, as it led to using said objects without actually initializing them, so I wasn't actually supporting them.

If all is well and good, I'll give it a squash.

@quickfur quickfur commented on an outdated diff Sep 11, 2014

std/algorithm.d
+{
+ import core.exception : RangeError;
+
+ private
+ {
+ alias E = ElementType!R;
+ alias UE = Unqual!E;
+
+ R source;
+
+ static if (bidir) alias CacheTypes = TypeTuple!(UE, UE);
+ else alias CacheTypes = TypeTuple!UE;
+ CacheTypes caches;
+
+ static assert(isAssignable!(UE, E) && is(UE : E),
+ algoFormat("Cannot instanciate range with %s because %s elements are not assignable to %s.", R.stringof, E.stringof, UE.stringof));
@quickfur

quickfur Sep 11, 2014

Member

Nitpick: "instantiate" not "instanciate".

Collaborator

monarchdodra commented Sep 15, 2014

Anything else?

@JakobOvrum JakobOvrum commented on the diff Sep 15, 2014

std/algorithm.d
+private struct Cache(R, bool bidir)
+{
+ import core.exception : RangeError;
+
+ private
+ {
+ alias E = ElementType!R;
+ alias UE = Unqual!E;
+
+ R source;
+
+ static if (bidir) alias CacheTypes = TypeTuple!(UE, UE);
+ else alias CacheTypes = TypeTuple!UE;
+ CacheTypes caches;
+
+ static assert(isAssignable!(UE, E) && is(UE : E),
@JakobOvrum

JakobOvrum Sep 15, 2014

Member

Shouldn't this be moved to the template constraints?

@monarchdodra

monarchdodra Sep 15, 2014

Collaborator

It would be hard to express in a short and "grokkable" way for the end user.

Also, I still believe that template constraints are for "logical disambiguation", not "internal validation". This is the overload that works on input ranges. From there, the static assert is only designed as a way to cleanly express a compile error.

We do not want to allow someone else to write a "cache" that also operates on input ranges, with completely different semantics, just because the element type has a particular trait. That would require explicit disambiguation (even if one of them wouldn't compile anyway, it is close enough logically to be ambiguous).

@JakobOvrum JakobOvrum and 1 other commented on an outdated diff Sep 15, 2014

std/algorithm.d
+ static if (bidir) alias CacheTypes = TypeTuple!(UE, UE);
+ else alias CacheTypes = TypeTuple!UE;
+ CacheTypes caches;
+
+ static assert(isAssignable!(UE, E) && is(UE : E),
+ algoFormat("Cannot instantiate range with %s because %s elements are not assignable to %s.", R.stringof, E.stringof, UE.stringof));
+ }
+
+ this(R range)
+ {
+ source = range;
+ if (!range.empty)
+ {
+ caches[0] = range.front;
+ static if (bidir)
+ caches[1] = range.back;
@monarchdodra

monarchdodra Sep 15, 2014

Collaborator

Oops. Missed it. Done.

@JakobOvrum JakobOvrum and 1 other commented on an outdated diff Sep 15, 2014

std/algorithm.d
+ }
+
+ static if (hasSlicing!R)
+ {
+ static if (hasLength!R)
+ {
+ typeof(this) opSlice(size_t low, size_t high)
+ {
+ return typeof(this)(source[low .. high]);
+ }
+ }
+ else
+ {
+ auto opSlice(size_t low, size_t high)
+ {
+ return this[low .. $].take(high - low);
@JakobOvrum

JakobOvrum Sep 15, 2014

Member

Won't this fail if the below static-if condition is false, as opDollar won't be available?

@monarchdodra

monarchdodra Sep 15, 2014

Collaborator

Hum... I had thought that for infinite ranges, [n .. $] was mandatory to satisfy hasSlicing. Not yet I guess. Alright, gonna rewrite it.

@monarchdodra @monarchdodra monarchdodra Range adaptor: cache
As discussed in this thread:
http://forum.dlang.org/thread/ovbjcmogezbvsxrwfcol@forum.dlang.org

This provides a range adaptor that cache the result of another range.

Meant as a lazy alternative to array. Not much else to say...?

Documentation might suck.
ebcc18e
Member

quickfur commented Sep 23, 2014

ping
Anything else? This pull has been green for a while now. Are we ready to merge?

Collaborator

monarchdodra commented Sep 25, 2014

Anything else?

AFAIK, no.

Member

DmitryOlshansky commented Sep 25, 2014

Let's do it.

Member

DmitryOlshansky commented Sep 25, 2014

Auto-merge toggled on

@DmitryOlshansky DmitryOlshansky added a commit that referenced this pull request Sep 25, 2014

@DmitryOlshansky DmitryOlshansky Merge pull request #1364 from monarchdodra/cache
Range adaptor: cache
83e90a8

@DmitryOlshansky DmitryOlshansky merged commit 83e90a8 into dlang:master Sep 25, 2014

1 check passed

default Pass: 10
Details
Collaborator

monarchdodra commented Sep 25, 2014

Merged #1364.

Cool. Thanks.

Member

quickfur commented Sep 25, 2014

On Thu, Sep 25, 2014 at 04:20:43PM -0700, Михаил Страшун wrote:

..and yet another epic PR taken care of! :)

Go, team Phobos! :-P

Member

MartinNowak commented Sep 29, 2014

Perhaps filter() should make use of cache() internally to avoid the issue entirely.

Filter could just cache its front element like any other range, at least if front has a side effect.
Can someone please write a bug report?

Collaborator

monarchdodra commented Sep 30, 2014

Filter could just simply cache it's front element like any other range, at least if front has a side-effect.

The point is that filter doesn't have a side effect on its elements; it merely (sometimes) evaluates front more than once. Making filter systematically cache its elements would bloat it, be potentially counterproductive (there is no overhead for accessing array elements), and, most of all, restrict it: no lvalue access, and it requires isAssignable!(ElementType!R)...

Owner

CyberShadow commented Oct 2, 2014

Please add new functions to the cheat sheet as well as the function list.

Owner

CyberShadow commented Mar 18, 2015

A regression has been filed, caused by this pull request:
https://issues.dlang.org/show_bug.cgi?id=14301

Contributor

ntrel commented Mar 6, 2016

Maybe too late now, but IMO cache should really be in std.range, it's not an algorithm.
