-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collections: remove redundant calls to .seq #3089
Conversation
Students of Scala's history might recall that at introduction of parallel collections, GenIterable et al were *not* added; instead the parallel collections inherited from the existing interfaces. This of course was an invitation to widespread disaster as any existing code that foreach-ed over a collection might now experience unwanted concurrency. The first attempt to fix this was to add the `.seq` method to convert a parallel colleciton to a sequential one. This was added in e579152, and call sites in the standard library with side-effecting foreach calls were changed to use that. But this was (rightly) deemed inadequate, as we could hardly expect people to change existing code or remember to do this in new code. So later, in 3de9615, GenIterable was sprouted, and parallel collections were re-parented. This commit removes residual needless calls to .seq when the static type of the receiver is already a plain Iterable, which are no-ops.
/cc @Ichoran (I've got a few performance oriented pull requests in the works that you might be interested in.) |
@@ -97,7 +97,7 @@ trait TraversableOnce[+A] extends Any with GenTraversableOnce[A] { | |||
// for internal use | |||
protected[this] def reversed = { | |||
var elems: List[A] = Nil | |||
self.seq foreach (elems ::= _) | |||
self foreach (elems ::= _) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this
should work as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whoa, whoa, that won't work without seq, will it? elems isn't
synchronized. If you traverse in parallel you'll end up with missed
elements.
Where are these pull requests?
--Rex
On Thu, Oct 31, 2013 at 4:52 AM, Aleksandar Prokopec <
notifications@github.com> wrote:
In src/library/scala/collection/TraversableOnce.scala:
@@ -97,7 +97,7 @@ trait TraversableOnce[+A] extends Any with GenTraversableOnce[A] {
// for internal use
protected[this] def reversed = {
var elems: List[A] = Nil
- self.seq foreach (elems ::= _)
- self foreach (elems ::= _)
Seems like this should work as well.
—
Reply to this email directly or view it on GitHubhttps://github.com//pull/3089/files#r7339648
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We know that self
is already TraversableOnce
, not a GenTraversableOnce
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay. Fair enough then; I agree there's no need for .seq.
--Rex
On Thu, Oct 31, 2013 at 7:53 AM, Jason Zaugg notifications@github.comwrote:
In src/library/scala/collection/TraversableOnce.scala:
@@ -97,7 +97,7 @@ trait TraversableOnce[+A] extends Any with GenTraversableOnce[A] {
// for internal use
protected[this] def reversed = {
var elems: List[A] = Nil
- self.seq foreach (elems ::= _)
- self foreach (elems ::= _)
We know that self is already TraversableOnce, not a GenTraversableOnce.
—
Reply to this email directly or view it on GitHubhttps://github.com//pull/3089/files#r7343601
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other pull requests will be trickling in. I've been throwing a lot of ideas around in https://github.com/retronym/scala/compare/topic/opt2 and will be working to find what's worthwhile in there with @gkossakowski next week.
On that branch, I saw improvements in the order of 10-15% of a warmed up resident compiler FSC repeatedly processing src/library/util/**.scala
. But the usual caveats apply; that result might not generalize, and I might have made things incorrect along the way.
Some (many) of the commits there seemed to deliver cleaner YourKit profiles without real improvements in elapsed time.
Some profile results suprised me (as usual). A lot of time was reported in the constructors of AbstractSeq
and friends. I ended up writing a custom ListBuffer
that didn't participate in the collections hierarchy at all to, and use that in a custom version of map
that was specialized for List
(no allocations on Nil.map, no indirection through the CanBuildFrom).
As soon as we figure out which changes seem to be the most beneficial, I'll let you know.
LGTM. Strange that this wasn't noticed earlier. |
LGTM also. |
I really need to get Jeremy Manson's async stack trace capturer (I hesitate to call it a profiler; it's very low level) working at a level where it could be useful for profiling the compiler. I'm not sure whether YourKit uses AsyncGetCallTrace; it didn't used to because it wasn't part of the JVMTI specification. But you can have huge discrepancies between what you profile and what actually is taking the time unless you use it. |
@Ichoran I did try once to get the Oracle profiler up (http://www.oracle.com/technetwork/server-storage/solarisstudio/overview/index.html) which I've heard peeks under the hood to avoid the safepoint sampling bias and also gives some insights into JIT. But I got stuck and set it aside. |
Collections: remove redundant calls to .seq
Leveraged -Xmigration to burn off some warts which arose in the new collections. Warnings put in place for behavioral changes, allowing the following. 1) Buffers: create new collections on ++ and -- like all the other collections. 2) Maps: eliminated never-shipped redundant method valuesIterable and supplied these return types: def keys: Iterable[A] def keysIterator: Iterator[A] def values: Iterable[B] def valuesIterator: Iterator[B] def keySet: Set[A] I concluded that keys should return Iterable because keySet also exists on Map, and is not solely in the province of Maps even if we wanted to change it: it's defined on Sorted and also appears in some Sets. So it seems sensible to have keySet return a Set and keys return the more general type. Closes scala#3089, scala#3145. Review by odersky.
Students of Scala's history might recall that at introduction of
parallel collections, GenIterable et al were not added; instead
the parallel collections inherited from the existing interfaces.
This of course was an invitation to widespread disaster as any
existing code that foreach-ed over a collection might now experience
unwanted concurrency.
The first attempt to fix this was to add the
.seq
method toconvert a parallel colleciton to a sequential one. This was added
in e579152, and call sites in the standard library with
side-effecting foreach calls were changed to use that.
But this was (rightly) deemed inadequate, as we could hardly expect
people to change existing code or remember to do this in new code.
So later, in 3de9615, GenIterable was sprouted, and parallel
collections were re-parented.
This commit removes residual needless calls to .seq when the static
type of the receiver is already a plain Iterable, which are no-ops.
Review by @axel22