New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove empty parameter list from `iterator()` #520

Closed
NthPortal opened this Issue Mar 22, 2018 · 22 comments

Comments

Projects
None yet
8 participants
@NthPortal
Copy link
Contributor

NthPortal commented Mar 22, 2018

There was a discussion on https://gitter.im/scala/collection-strawman about whether or not IterableOnce.iterator()/Iterable.iterator() should have an empty parameter list, or none at all. I will do my best to record the main points of the discussion below.


(mostly quotes from Gitter)

I ask "Why does IterableOnce.iterator() have an empty parameter list, rather than no parameter list?"

@szeiger and @julienrf suggest it is because iterator() is a side-effecting method, in that two returned Iterators have distinct identities, so the method is not referentially transparent.

@martijnhoekstra notes that "If everything that's not referentially transparent is a side effect, obtaining an iterator is a side-effect".

I ask "What about toArray? The Arrays are mutable, so while initially they would be equivalent, you could modify them so they are not".

@szeiger: "Indeed, if you go by referential transparency a lot more methods would need an empty argument list. What about size? When used on a mutable collection it's not referentially transparent. Of course, nothing is. The distinction doesn't make any sense there. So what about List.size? It is referentially transparent but that's not good enough because substituting such calls has huge performance implications. In a pure functional language the compiler can help you with that but Scala can't."

@SethTisue: "My intuition likes the 'initially they would be equivalent' side of your way of phrasing it, @NthPortal. But I'm having trouble backing up my intuition with a convincing argument [...] I think my intuition is something like, require the () if it introduces mutation or side effects into a program that wouldn't otherwise have it. And toArray doesn't do that, because you are free to write code that uses Array but never mutates an array".

@martijnhoekstra: "Neither does iterator() though - provided you never call next() [...] There is a lot less you can do with an iterator without modifying it than you can do with an Array without modifying it though".

@SethTisue: "In Scala through 2.12, .iterator has no parameter lists and the compiler says 'Iterator[...] does not take parameters' if you write .iterator(). So somebody made a different decision on this years ago. Just leaving it like it was doesn't feel terrible to me. Even if you buy the argument that it ought to have the (), I'm not sure it's worth annoying people with the change, especially since (as Stefan has pointed out) the change will become more annoying when future Scala requires the ()".

@Ichoran says that "My intuition is that () as a way to denote referential transparency is not a good idea anyway. Instead, () is useful as a marker for things that observably mutate the object on which the method is called. That is, for some method bar, the behavior of x.bar will be different after x.foo()." @SethTisue likes this intuition.

@SethTisue

This comment has been minimized.

Copy link
Member

SethTisue commented Mar 22, 2018

and then @julienrf said "I’m OK to go back to iterator for the sake of backward compatibility."

@marcelocenerine

This comment has been minimized.

Copy link
Contributor

marcelocenerine commented Mar 23, 2018

This reminds me of this discussion long time ago: https://groups.google.com/forum/#!topic/scala-language/RlV9O1RDmis

@odersky

This comment has been minimized.

Copy link
Contributor

odersky commented Mar 23, 2018

The uniform access principle says that () is redundant if the operation could have been implemented as a (mutable) field. A more precise way of saying that is that it's OK to drop the () for member m on value x if

x.m() == x.m()

assuming some single-threadedness guarantees*. So that means length can be written without () for both mutable and immutable collections. currentTimeMillis() must be written with () because its value may change from one call to the next.

Furthermore:

toList, toSeq: No() because of the way equality is defined on collections.
But it's iterator(), toArray(), or next() since different calls yield different values.

I think it would be nice if we could enforce that principle with the collections redesign, provided porting efforts are acceptable. For Scala 2.13, is there a cost for changing to iterator()? For Scala 3, we will enforce the right parentheses on the call site. On the other hand, that should be easy to rewrite automatically. The rewrite tool could do that already for Scala 2.13, in order to keep things tidy.

  • Single-threadedness guarantees: If the type in question is assumed to be safe for multi-threaded use, we must account for possible concurrent calls, which might force us to add (). For other types, we assume singlethreaded use, with the exception of System interrupts (e.g. currentTimeMillis().
@Ichoran

This comment has been minimized.

Copy link
Contributor

Ichoran commented Mar 24, 2018

But xs.iterator == xs.iterator for every extant collection type. If xs.iterator is on an iterator, it is a reference to itself, and iterators are equal to themselves. If it is on an Iterable, the two copies are independent.

Also, if xs is an iterator, it is not true that xs.toList == xs.toList.

@odersky

This comment has been minimized.

Copy link
Contributor

odersky commented Mar 24, 2018

But xs.iterator == xs.iterator for every extant collection type.

Not if we mean == literally:

scala> List(1, 2, 3)
res0: List[Int] = List(1, 2, 3)

scala> res0.iterator == res0.iterator
res1: Boolean = false

Also, if xs is an iterator, it is not true that xs.toList == xs.toList.

That's indeed a problem, but I guess it is part of the understanding that iterators are "use once", so you could argue that it.toList == it.toList does not even make sense. By the same token,

xs.map(f) == xs.map(f)

for every collection xs and every pure function f, but the same is not true for iterators.

@NthPortal

This comment has been minimized.

Copy link
Contributor

NthPortal commented Mar 24, 2018

It is a more general problem with IterableOnce (or IterableOnceOps), that (virtually) all of its operations consume it, while that is not the case for Iterable, which extends it.

The only real way to solve it without breaking the inheritance structure of the collections is to allow overriding the number of empty parameter lists for a method, which seems weird.

@Ichoran

This comment has been minimized.

Copy link
Contributor

Ichoran commented Mar 25, 2018

Oops, I'd forgotten that Iterator uses only reference equality.

I agree with the uniform access principle, but I don't think that equality as a proof of uniform access works properly. Take, for instance,

case class C(i: Int) {
  def bigger: C = new C(i + 1)
}

class D(val i: Int) {
  def bigger(): D = new D(i + 1)
}

These two behave identically with regard to the bigger method. The only difference is that C has value equality and D has reference equality, so that c.bigger == c.bigger but not d.bigger() == d.bigger().

So the equality test seems not to be necessary since you can break it by essentially irrelevant modifications to other parts of the class. For instance, if you add or remove an equals method, are you supposed to refactor all your ()'s?

Furthermore, the return value isn't even the important thing. Take, for instance,

class E[A] {
  val calls = new ArrayBuffer[() => A]
  def add(call: () => A): Unit = { calls += call; () }
  def drop: Unit = {
    if (calls.nonEmpty) calls.remove(calls.length -1)
    ()
  }
}

Now, despite the fact that e.drop == e.drop, from the signature of drop we know that it is pointless unless it is side-effecting, and if the signature changed to return something (e.g. a constant ID value), that wouldn't make it any less side-effecting.

So certainly the equality test isn't sufficient because we need to know whether the class itself has changed or not.

Since equality is neither necessary nor sufficient for determining whether something obeys the replace-me-with-a-field rule, I think we shouldn't really use it. Sometimes it will give the same answer, but it can have errors either way.

Now, I think phrasing the uniform access principle in terms of whether a call is indistinguishable from field access is fine, but that leaves us with a rather unpleasant situation in inheritance hierarchies where some subsets of the hierarchy are immutable. You're basically forced to always use () if you belong to a hierarchy where the method could have returned a copy of something mutable.

@NthPortal

This comment has been minimized.

Copy link
Contributor

NthPortal commented Apr 3, 2018

Can toArray actually have empty parentheses, given that it has an implicit ClassTag parameter?

@szeiger

This comment has been minimized.

Copy link
Collaborator

szeiger commented Apr 19, 2018

It can. Everything looks normal as long as the implicit parameter list stays implicit but you're up for a surprise if you're used to calling with without parentheses and then want to pass in an explicit ClassTag:

scala> implicit def f[T : scala.reflect.ClassTag]() = ()
f: [T]()(implicit evidence$1: scala.reflect.ClassTag[T])Unit

scala> f

scala> f()

scala> f()(implicitly[scala.reflect.ClassTag[Int]])

scala> f(implicitly[scala.reflect.ClassTag[Int]])
                   ^
       error: no arguments allowed for nullary method f: ()(implicit evidence$1: scala.reflect.ClassTag[T])Unit

@szeiger szeiger added this to the Scala 2.13.0-M4 milestone Apr 19, 2018

@lrytz

This comment has been minimized.

Copy link
Member

lrytz commented May 10, 2018

It seems there's an agreement to follow @odersky's strategy #520 (comment). So I'm closgin this, created #571 for toArray

@lrytz lrytz closed this May 10, 2018

@NthPortal

This comment has been minimized.

Copy link
Contributor

NthPortal commented May 10, 2018

@lrytz I'm not as convinced of this agreement - I haven't seen anyone address @Ichoran's points

@lrytz

This comment has been minimized.

Copy link
Member

lrytz commented May 10, 2018

@NthPortal thanks for speaking up, I was also testing the waters :-) I'll reopen for now.

@Ichoran says

I agree with the uniform access principle, but I don't think that equality as a proof of uniform access works properly

I agree with that. Can we just drop the second part in @odersky's strategy?

The uniform access principle says that () is redundant if the operation could have been implemented as a (mutable) field. A more precise way of saying that is that it's OK to drop the () for member m on value x if

x.m() == x.m()

This would leave us with "if the operation could have been implemented as a (mutable) field".

So this would mean .iterator(), .toArray(), also .toBuffer() because the result is mutable, but .toList.

@Ichoran remarks

that leaves us with a rather unpleasant situation in inheritance hierarchies where some subsets of the hierarchy are immutable. You're basically forced to always use () if you belong to a hierarchy where the method could have returned a copy of something mutable.

I'd like to see an example here.

@lrytz lrytz reopened this May 10, 2018

@Ichoran

This comment has been minimized.

Copy link
Contributor

Ichoran commented May 10, 2018

@lrytz - As an example, consider tail(), flatten(), and sum(). None of these can be fields because any of them might return (different instances of) mutable objects. (Admittedly, sum is usually only defined on immutable objects, but there's no reason you can't define Numeric on a mutable one.)

@NthPortal

This comment has been minimized.

Copy link
Contributor

NthPortal commented May 11, 2018

@lrytz An example from a very different angle is Factory[Coll[_]]#empty[A], where whether or not empty can be a field depends on what inheritance hierarchy Coll is in, even though the Factorys are the same trait (and are not in different places in an inheritance hierarchy). That is, the existence of inheritance hierarchies in general affects whether or not empty can be a field.

@NthPortal

This comment has been minimized.

Copy link
Contributor

NthPortal commented May 11, 2018

To generalize what @Ichoran said, any method on a scala.collection.CollOps[A, CC[_], C] which returns a C or a CC[_] is going to have this problem.

@lrytz

This comment has been minimized.

Copy link
Member

lrytz commented May 11, 2018

Thanks for the good examples!

Do we all agree that we don't want to add () to tail/flatten/empty etc?

Can we just say "add () if the method can perform side effects"? That's what @Ichoran seems to suggest (PR description, last paragraph).

Iterator.next(), println(), but .iterator, .tail, .empty, .toArray, .toList. I personally like it to have () as a sign that a side effect is happening. Multi-threading would not matter, no () for methods that only observe (but don't perform) effects. So in principle also .currentTimeMillis (well that's defined in Java anyway - but do we really need an exception for "exception of System interrupts" as Martin mentions above?)

@lrytz

This comment has been minimized.

Copy link
Member

lrytz commented May 11, 2018

It leads to the question what is a side effect. Referential transparency is not a useful answer here, as we're working with mutable state. Effects are IO, observable state changs (changes to local / newly allocated state is not an effect, niether is updating a cache). Exceptions are excluded (.head).

@SethTisue

This comment has been minimized.

Copy link
Member

SethTisue commented May 11, 2018

Can we just say "add () if the method can perform side effects"?

Yes, please. This how I've understood and taught () for more than a decade now, and it's how I see others understanding and teaching it, too.

I dislike iterator() and toArray() — both direct dislike, and dislike of the migration annoyance we will cause people if we do this.

@lrytz

This comment has been minimized.

Copy link
Member

lrytz commented May 11, 2018

@NthPortal

This comment has been minimized.

Copy link
Contributor

NthPortal commented May 11, 2018

I noticed last night that Factory[Coll]#newBuilder[A] got () as well in the collections redesign as well - is that something that should be reverted as well?

@lrytz

This comment has been minimized.

Copy link
Member

lrytz commented May 11, 2018

Searching a bit in the library, there's also

  • Iterable.newSpecificBuilder()
  • Builder.result() - this should stay because the builder should not be used after
  • SetOps.subsets() - this needs to stay because it's overloaded, scala/bug#9116

And outside collections

  • Random.nextInt() and friends - this should stay, as getting a number modifies the state of the random instance
  • RichInt.longValue() and friends -- they are defined with () in the base class ScalaNumericAnyConversions
  • Numeric.Ops.toInt() and friends
@Ichoran

This comment has been minimized.

Copy link
Contributor

Ichoran commented May 11, 2018

Methods on Random change the instance of the class, so they should have () under any scheme we've discussed! Likewise with Builder.result()--the behavior after calling it is undefined (or defined if it's a ReusableBuilder, but you still shouldn't call it again).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment