Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add productElementName and productElementNames methods to Product #6972

Merged
merged 1 commit into from
Aug 8, 2018

Conversation

cb372
Copy link
Contributor

@cb372 cb372 commented Jul 25, 2018

This continues @olafurpg's work in #6951.

Starting with his commit, I've:

  • updated the synthetic productElementName to use the (decoded) case class parameter names instead of the accessor names
  • added some tests
  • fixed a typo
  • added Scaladoc

@scala-jenkins scala-jenkins added this to the 2.13.0-M5 milestone Jul 25, 2018
@cb372
Copy link
Contributor Author

cb372 commented Jul 25, 2018

Oops, just noticed that the behaviour is also incorrect for multiple param lists. The synthetic only picks up the first param list. Investigating now.

@cb372
Copy link
Contributor Author

cb372 commented Jul 25, 2018

Oops, just noticed that the behaviour is also incorrect for multiple param lists.

Scratch that. I misunderstood the behaviour of productArity:

Welcome to the Ammonite Repl 1.1.0
(Scala 2.12.4 Java 1.8.0_144)
If you like Ammonite, please support our development at www.patreon.com/lihaoyi
@ case class MultipleParamLists(a: String, b: Int)(c: Boolean)
defined class MultipleParamLists

@ val x = MultipleParamLists("foo", 123)(true)
x: MultipleParamLists = MultipleParamLists("foo", 123)

@ x.productArity
res2: Int = 2

// def productElementNameMethod = perElementMethod(nme.productElementName, StringTpe)(x => LIT(x.name.toString))
def productElementNameMethod = {
val constrParamAccessors = clazz.constrParamAccessors
createSwitchMethod(nme.productElementName, constrParamAccessors.indices, StringTpe)(idx => LIT(constrParamAccessors(idx).name.dropLocal.toString))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably should be .name.dropLocal.decodedName to fix:

scala> case class Symbols(:: : Int, || : Int)
defined class Symbols

scala> Symbols(0, 0).productElementNames.toList
res4: List[String] = List($colon$colon, $bar$bar)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Added a test case for this.

@SethTisue SethTisue added the release-notes worth highlighting in next release notes label Jul 26, 2018
@cb372
Copy link
Contributor Author

cb372 commented Jul 26, 2018

I did a quick empirical measurement of the classfile size impact for a simple case class.

case class FiveFields(a: Int, b: Int, c: Int, d: Int, e: Int)

Scala 2.12.6:

-rw-r--r--  1 chris  staff  2207 Jul 26 16:56 FiveFields$.class
-rw-r--r--  1 chris  staff  5798 Jul 26 16:56 FiveFields.class

This PR:

-rw-r--r--  1 chris  staff  2207 Jul 26 16:47 FiveFields$.class
-rw-r--r--  1 chris  staff  6291 Jul 26 16:47 FiveFields.class

Regarding @retronym's comment from the previous PR:

We could ratchet the fixed cost down a bit by adding a static factory method to scala.runtime for the IIOBE.

I tried that but it actually makes the classfile slightly bigger. It reduces the bytecode size for the productElementName method, but I guess you end up with more stuff in the symbols table. Unless I've misunderstood what you meant, or I've implemented it really badly? (Both quite likely.)

This PR as it stands:

  public java.lang.String productElementName(int);
    Code:
       0: iload_1
       1: istore_2
       2: iload_2
       3: tableswitch   { // 0 to 4
                     0: 36
                     1: 41
                     2: 46
                     3: 51
                     4: 56
               default: 61
          }
      36: ldc           #109                // String a
      38: goto          76
      41: ldc           #110                // String b
      43: goto          76
      46: ldc           #111                // String c
      48: goto          76
      51: ldc           #112                // String d
      53: goto          76
      56: ldc           #113                // String e
      58: goto          76
      61: new           #98                 // class java/lang/IndexOutOfBoundsException
      64: dup
      65: iload_1
      66: invokestatic  #96                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
      69: invokevirtual #101                // Method java/lang/Object.toString:()Ljava/lang/String;
      72: invokespecial #104                // Method java/lang/IndexOutOfBoundsException."<init>":(Ljava/lang/String;)V
      75: athrow
      76: areturn

Using a factory method to build the exception:

  public java.lang.String productElementName(int);
    Code:
       0: iload_1
       1: istore_2
       2: iload_2
       3: tableswitch   { // 0 to 4
                     0: 36
                     1: 41
                     2: 46
                     3: 51
                     4: 56
               default: 61
          }
      36: ldc           #110                // String a
      38: goto          69
      41: ldc           #111                // String b
      43: goto          69
      46: ldc           #112                // String c
      48: goto          69
      51: ldc           #113                // String d
      53: goto          69
      56: ldc           #114                // String e
      58: goto          69
      61: getstatic     #101                // Field scala/runtime/package$.MODULE$:Lscala/runtime/package$;
      64: iload_1
      65: invokevirtual #105                // Method scala/runtime/package$.aioobe:(I)Ljava/lang/ArrayIndexOutOfBoundsException;
      68: athrow
      69: areturn

The file size increases from 6291 to 6324 bytes.

The factory method looks like this:

package scala

package object runtime {
  @noinline
  final def aioobe(n: Int): ArrayIndexOutOfBoundsException =
    new ArrayIndexOutOfBoundsException(n.toString)
}

and the callsite looks like this:

    def createSwitchMethod(name: Name, range: Seq[Int], returnType: Type)(f: Int => Tree) = {
      createMethod(name, List(IntTpe), returnType) { m =>
        val arg0    = Ident(m.firstParam)
        val default = DEFAULT ==> Throw(Apply(getMemberMethod(RuntimePackage, TermName("aioobe")), arg0))
        val cases   = range.map(num => CASE(LIT(num)) ==> f(num)).toList :+ default

        Match(arg0, cases)
      }
    }

@retronym
Copy link
Member

A call to a real Java static method is a bit smaller in bytecode that a module load and a virtual call.

@cb372
Copy link
Contributor Author

cb372 commented Jul 27, 2018

I think this one is now ready to go, unless anyone has any more comments.

I'd like to investigate the static factory method optimisation further and potentially submit a separate PR for it later, if that's OK. I don't think it needs to block this PR.

Copy link
Member

@dwijnand dwijnand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you Chris and Olaf.

Copy link
Contributor

@julienrf julienrf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not convinced by this feature. In addition to the (small, I concede) impact on the bytecode size, introducing this feature means that we will have to support it afterwards. But is it really worth it? It encourages using a poorly typed API and we already have better tools for generic programming (e.g. shapeless or scalaz-deriving).

@tarsa
Copy link

tarsa commented Aug 5, 2018

It encourages using a poorly typed API and we already have better tools for generic programming (e.g. shapeless or scalaz-deriving).

I thought the main purpose of product element name is to define some nonstandard formatters for data classes. The standard toString for case classes and tuples doesn't for example quote strings, which makes converting ScalaTest output (i.e. x was not equal to y, where x is what we copy-paste) to code more tedious.

Using scalaz-deriving or shapeless is going to not only increase classpath size considerably and possibly introduce JAR hell for such a small feature, but they require some changes in the code. This PR adds method which could be used by e.g. ScalaTest out of the box without any changes to user code.

@SethTisue
Copy link
Member

scalaz-deriving or shapeless

not to mention the impact on compile times...

personally I'm convinced this should go in. @tarsa's use case simply shouldn't be hard, this is really basic functionality, lacking this just makes Scala look silly and bad.

sure, yes, Product is lowbrow stuff, but sometimes the lowbrow solution is good enough. given that Product exists at all, it's silly for the names not to be available, they should have been available in the first place. this comes up all the time on Stack Overflow, Gitter, IRC, etc., has been coming up for 10+ years. and the effect of withholding productElementName is simply that people end up doing even grosser stuff with Java reflection instead.

if people need to be warned of limitations of Product, I think the Product Scaladoc would be a fine place to do that

more principled, highfalutin' solutions will remain available for those who want them.

@lrytz what do you think?

needs rebase.

@cb372
Copy link
Contributor Author

cb372 commented Aug 7, 2018

Rebased.

@lrytz
Copy link
Member

lrytz commented Aug 7, 2018

LGTM too. @cb372 could you sqash all changes into a single commit?

@cb372
Copy link
Contributor Author

cb372 commented Aug 7, 2018

Squashed (and rebased again)

@cb372
Copy link
Contributor Author

cb372 commented Aug 7, 2018

The build now fails with a compile error, but I think it's unrelated to my changes. I see the same error on 2.13.x branch: https://travis-ci.org/scala/scala/jobs/413124970#L1021

@lrytz
Copy link
Member

lrytz commented Aug 7, 2018

Thanks, I'll look at the build failure.

@lrytz
Copy link
Member

lrytz commented Aug 7, 2018

#7010 should fix the build, you can rebase again when it's merged.

This commit adds two methods to the `scala.Product` trait:
```scala
trait Product {
  /** Returns the field name of element at index n */
  def productElementName(n: Int): String
  /** Returns field names of this product. Must have same length as productIterator */
  def productElementNames: Iterator[String]
}
```

Both methods have a default implementation which returns the empty
string for all field names.

This commit then changes the code-generation for case classes to
synthesize a `productElementName` method with actual class field names.

The benefit of this change is that it becomes possible to pretty-print
case classes with field names, for example
```scala
case class User(name: String, age: Int)

def toPrettyString(p: Product): String =
  p.productElementNames.zip(p.productIterator)
   .map { case (name, value) => s"$name=$value" }
   .mkString(p.productPrefix + "(", ", ", ")")

toPrettyString(User("Susan", 42))
// res0: String = User(name=Susan, age=42)
```

The downside of this change is that it produces more bytecode for each
case-class definition. Running `:javacp -c` for a case class with three
fields yields the following results
```scala
> case class A(a: Int, b: Int, c: Int)
> :javap -c A
  public java.lang.String productElementName(int);
    Code:
       0: iload_1
       1: istore_2
       2: iload_2
       3: tableswitch   { // 0 to 2
                     0: 28
                     1: 33
                     2: 38
               default: 43
          }
      28: ldc           78                 // String a
      30: goto          58
      33: ldc           79                 // String b
      35: goto          58
      38: ldc           80                 // String c
      40: goto          58
      43: new           67                 // class java/lang/IndexOutOfBoundsException
      46: dup
      47: iload_1
      48: invokestatic  65                 // Method scala/runtime/BoxesRunTime.boxToInteger:(I)Ljava/lang/Integer;
      51: invokevirtual 70                 // Method java/lang/Object.toString:()Ljava/lang/String;
      54: invokespecial 73                 // Method java/lang/IndexOutOfBoundsException."<init>":(Ljava/lang/String;)V
      57: athrow
      58: areturn
```

Thanks to Adriaan's help, the estimated cost per `productElementName`
appears to be fixed 56 bytes and then 10 bytes for each field with
the following breakdown:

* 3 bytes for the
  [string info](https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-4.html#jvms-4.4.3)
  (the actual characters are already in the constant pool)
* 4 bytes for the tableswitch entry
* 2 bytes for the ldc to load the string
* 1 byte for areturn

In my opinion, the bytecode cost is acceptably low thanks to the fact
that field name literals are already available in the constant pool.
@tpolecat
Copy link
Contributor

tpolecat commented Aug 7, 2018

What is the use case for this? As @SethTisue mentioned people ask about this constantly on Gitter and so on, but in the end it's never what they really want.

@SethTisue
Copy link
Member

What is the use case for this?

what @tarsa said. but also other tasks involving abstracting over multiple case classes. yes, doing that with Product, you may eventually hit a wall where you can't accomplish your goal, but not necessarily. sometimes you're abstracting over a limited set of case classes for a limited purpose.

As @SethTisue mentioned people ask about this constantly on Gitter and so on, but in the end it's never what they really want.

I disagree.

Certainly some of them eventually change their minds about what they want, later in their Scala journeys.

I'll admit there is an irreducible element of speculation here. What happens next, after their question on Gitter/SO/etc? We don't have a team of people in white lab coats with clipboards tracking these people afterwards.

But my best guess is that 1) a substantial portion eventually decide they need a "proper" solution like shapeless, but also 2) a substantial portion are able to get their job done with Product and/or reflection or whatever and move on and fight other battles.

@cb372
Copy link
Contributor Author

cb372 commented Aug 7, 2018

The two most obvious use cases that spring to mind are:

  • including the field names in the string representation of a case class
  • turning a case class into a Map[String, Any]

For the former, I think it's a no-brainer that you should be able to do that without having to reach for an external library, especially a "heavyweight" library like shapeless, scalaz-deriving or magnolia, or roll it yourself using macros/scalameta. It should just work out of the box. (It's too late to change the format of the toString representation, but we could potentially add a toLabelledString or something like that to the stdlib. This PR would be the first step toward that.)

As for the latter, that's something I've had to do a reasonable number of times in the past. It would be convenient to be able to achieve it with a few lines of code and no dependencies, instead of depending on shapeless or whatever.

@tpolecat
Copy link
Contributor

tpolecat commented Aug 7, 2018

List extends Product so even in this common case trying to abstract over Product will yield an undesirable string representation; and I can't imagine why you would ever want a Map[String, Any]. This all seems kind of shortsighted to me.

@lihaoyi
Copy link
Contributor

lihaoyi commented Aug 8, 2018

com-lihaoyi/PPrint#4 is an issue in the wild where this could be used

@dwijnand dwijnand requested a review from lrytz August 8, 2018 06:30
@dwijnand
Copy link
Member

dwijnand commented Aug 8, 2018

This is green and yours to land, @lrytz (already 👍 from me).

@olafurpg
Copy link
Member

olafurpg commented Aug 8, 2018

@tpolecat pretty-printing libraries like http://www.lihaoyi.com/PPrint/ already handle collection types so List would not be treated as a case class.

Note that productElementName is a low-level abstraction that can be used as a building block for higher-level APIs. For example, even macros like Magnolia use Product.productElement instead of accessing fields directly softwaremill/magnolia@6599b37. Derivation libraries could likewise use productElementName instead of inlining field name literals at every call-site.

@lrytz lrytz merged commit d9e0f5f into scala:2.13.x Aug 8, 2018
@cb372 cb372 deleted the productElementName branch August 8, 2018 09:29
@lrytz
Copy link
Member

lrytz commented Aug 8, 2018

For reference, scala-library.jar size went from 3.79 to 3.8 MB

@olafurpg
Copy link
Member

olafurpg commented Aug 8, 2018

Hurray! Thank you for everybody for the reviews and discussions 🎉

I'm wondering if Predef should expose an extension method Product.productToString 🤔 That would make this functionality accessible to everybody out-of-the-box avoiding the need to copy-paste the same prettyPrint method for basic use-cases. This would bring us close to implement #6936 without breaking changes.

Judging by the discussions in https://contributors.scala-lang.org/t/case-class-tostring-new-behavior-proposal-with-implementation/2056/43 there is a big demand to pretty-print case classes that can be copy-pasted back into source code and compile. Some requirements

  • handle collection types so List prints as List(....) instead of as a case class List(head=x,tail=xs)
  • handle tuple names so (1, 2) is not printed as (_1=1,_2=2)
  • strings are quoted and escaped properly, multiline strings might want to use triple-quotes """
  • recursive product values are handled properly, for example User(name="Susan",address=Address(zip="101",country="is"))
  • handle empty field names to avoid hanging = like Prefix(="value")
  • (optional, up for debate), line wrapping to a configurable line width like pprint does http://www.lihaoyi.com/PPrint/ It might also be desirable to expose this via Iterator[String] to support truncation for large data structures.

(essentially, I would love to have pprint in the standard library 😄 )

Where would be the best place to discuss this further? Contributors, scala/scala-dev, scala/bug?

@SethTisue
Copy link
Member

SethTisue commented Aug 8, 2018

Where would be the best place to discuss this further? Contributors

yes please https://contributors.scala-lang.org

@retronym
Copy link
Member

Noting for posterity: This change caused a slight slowdown (700ms to 715ms) in the scalap compiler benchmark.

The simplest explanation is that that corpus includes a higher ratio of case classes to total LoC that the others, and the cost of synthesizing, transforming, and code-gen for the new method is visible. I haven't dug deeper to confirm this though.

@nafg
Copy link
Contributor

nafg commented Aug 10, 2018

Why does List extend Product?

@SethTisue
Copy link
Member

Why does List extend Product?

let's discuss on https://contributors.scala-lang.org rather than take this PR off on a tangent

@nafg
Copy link
Contributor

nafg commented Aug 10, 2018

I asked because the answer could be relevant to some of the challenges here. I don't think it's a tangent but it's probably moot since it's merged and late.

@joshlemer
Copy link
Member

Are we continuing on the old thread contributors.scala-lang.org?

@SethTisue SethTisue changed the title Product element name Add productElementName and productElementNames methods to Product Aug 22, 2018
@SethTisue
Copy link
Member

aaaand we have our first user! #7242

@MariosPapasofokli
Copy link

Where would be the best place to discuss this further? Contributors, scala/scala-dev, scala/bug?

@olafurpg Did this discussion (Product.productToString) happen somewhere else ? Cannot seem to find it. It would be great if you could point me to it

@olafurpg
Copy link
Member

@MariosPapasofokli
Copy link

@olafurpg thank you

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-notes worth highlighting in next release notes welcome hello new contributor!
Projects
None yet