Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement used scopes for used names and implement PatMat scope #267

Merged
merged 7 commits into from
Mar 28, 2017

Conversation

romanowski
Copy link
Contributor

Motivation

In our large codebase we use sealed classes/traits a lot and we quite often adds new values to them (e.g. we represent algorithms this way). Children of such sealed classes/traits change quite often (and this is usually done by changing mixed traits) causing all usages of sealed classes to be recompiled.

How we can improve zinc?

I am almost sure that (excluding macros and other non-standard) the only place where base trait is used that needs to be recompiled on change in child's type is pattern matching (to emit warning). To be more we need to invalidate all places where we match on type that contains mentioned class. Example below:

sealed trait A
object FooA

If we change FooA to FooA with Serializable we need to recompile all direct usages of FooA (this is not changed) but we don't need to recompile e.g. this: def bar(a: A) = println(a).

However every usage like:

def foo(a: A) = a match { ... }
def foor(b: Option[A]) = b match { ... }

Needs to be recompiled.

If we are using A inside case statements we don't need to recompile that usages (since warning won't be emitted in such case. Similarly if we are using unnaply to create value e.g. List(A, name) = myList() no warning will be emitted so we don't need to recompile this.

@retronym Please correct me if I am wrong since I am not compiler expert :)

Implemetation detalis

In order to support this in zinc I created UseScope that tells about scope where name is used. So far we've got 3 scopes: Default, Implicit and PatMat. As you may guess current logic is implemented as Default. Implicits are used so far only to find out if given class does not contains any implicit change (so is only used in API not to track implicit dependencies, but we may change this in future). Hashes are produced for each applicable scope so

  • plain def will declare hashes only in Default scopes
  • implicit class/def/val in both Default and Implicit
  • sealed trait in Default and Implicit

Hashes for Implicit and Default are equal (but we can change that) and for PatMat we include types of children.

If for given class only PatMat scope is changed we don't invalidate all places where only Default is used.

In future this mechanism can be used in #227 #226

For certain reasons this time I have to paste the below disclaimer:
THIS PROGRAM IS SUBJECT TO THE TERMS OF THE BSD 3-CLAUSE LICENSE.

THE FOLLOWING DISCLAIMER APPLIES TO ALL SOFTWARE CODE AND OTHER MATERIALS CONTRIBUTED IN CONNECTION WITH THIS SOFTWARE:
THIS SOFTWARE IS LICENSED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE AND ANY WARRANTY OF NON-INFRINGEMENT, ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA,
OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. THIS SOFTWARE MAY BE REDISTRIBUTED TO OTHERS ONLY BY EFFECTIVELY USING THIS OR
ANOTHER EQUIVALENT DISCLAIMER IN ADDITION TO ANY OTHER REQUIRED LICENSE TERMS.
ONLY THE SOFTWARE CODE AND OTHER MATERIALS CONTRIBUTED IN CONNECTION WITH THIS SOFTWARE, IF ANY, THAT ARE ATTACHED TO (OR OTHERWISE ACCOMPANY) THIS SUBMISSION (AND ORDINARY
COURSE CONTRIBUTIONS OF FUTURES PATCHES THERETO) ARE TO BE CONSIDERED A CONTRIBUTION. NO OTHER SOFTWARE CODE OR MATERIALS ARE A CONTRIBUTION.

So it also means that we unified usages of implicit and macro names.
Better serialization is required.
@eed3si9n eed3si9n added the ready label Mar 15, 2017
Now only files that use sealed class as pattern match target gets recompiled when we add/modify children of sealed class/trait.
@romanowski
Copy link
Contributor Author

I will include benchmark results soon.


import scala.reflect.ClassTag

case class EnumSetSerializer[E <: Enum[E]: ClassTag](allValues: Array[E]) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is quite naive approach. I am open for suggestion about possible alternatives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is scary...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But works :) As said I am open for suggestions.

@jvican jvican self-assigned this Mar 15, 2017
Copy link
Member

@jvican jvican left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @romanowski, I didn't want to yak shave so I've just pushed a commit that fixes some questions/concerns I had. More info in the commit message.

These changes look great and I'm happy to merge them! I have some questions first, though:

  1. The provided tests are great, but we should have several scripted tests to check that this actually has the right behaviour. Can you please do that?
  2. It seems you've glossed over extractors in pattern matching targets. It would be great if you include them in the tests and make sure they work (make sure to check that nested extractors work too).
  3. The analysis that you provide in this PR for pattern matching is not as smart as it could be, if I understand these changes correctly. For instance:
def foo(a: A) = a match { ... }

This has to be recompiled iff the pattern match is exhaustively checking every possible child. If you have the following body a match { case a2: SubTypeOfA => ???; case a: A => ??? }, you don't need to recompile it since there's a default case that covers all the subtypes of A. You can generalize this as: "whenever there's a default case that covers the sealed super type, it's safe to ignore recompilation".

Please, confirm if this is an issue with your PR. The way I see this right now, it looks like the usability of these changes is not that great because if you a in the previous pattern match is A and not Any, you will always recompile. Perhaps I'm wrong and I didn't understand these changes well enough, though.

You don't have to do this in this PR, this is already an improvement over what we had before. We can do it as a follow up. I'm happy to implement it.

That aside, serialisation is far than ideal, but there's no clear escape, so I'm fine with that 😉.

@@ -16,7 +16,7 @@ import collection.mutable
import xsbti.compile.{ DeleteImmediatelyManagerType, IncOptions, TransactionalManagerType }
import xsbti.compile.ClassFileManager

object ClassFileManager {
object ClassFileManagers {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this change to the API? I suggest we leave it as it was before since it's a "companion object" of ClassFileManager. I'm open to changing the name in other PR, but I don't see the motivation for this name change here.

@@ -36,15 +36,15 @@ private final class IncrementalNameHashing(log: sbt.util.Logger, options: IncOpt
if (SameAPI(a, b))
None
else {
val aNameHashes = a.nameHashes
val bNameHashes = b.nameHashes
val aNameHashes = a.nameHashes()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nameHashes returns something, no? I think it's safe to leave the parens as they were 😄.

if (memberRef.nonEmpty) s"by transitiveInheritance: $memberRef" else ""

s"""$change invalidates ${all.size} classes due to ${memberRefInvalidator.invalidationReason(change)}
|\t$byTransitiveInheritance $byLocalInheritance $byMemberRef
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New lines? 💃

{
hash = 1
hashClass(c)
finalizeHash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we finalize hash here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To keep API unified and force users to use HashApi.apply.

Copy link
Contributor Author

@romanowski romanowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvican Thanks for that.

I see that you don't like pattern matching on null but I am fine with that.

The only thing we should change is manetioned vals (to defs).

val existingScopes = scopedNamesInFirstClass.get(topLevelName)
if (existingScopes == null)
scopedNamesInFirstClass.put(topLevelName, topLevelScopes)
else existingScopes.addAll(topLevelScopes)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This emits warning. That is why I added () in the end.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW this is quite strange to do a review in your own PR :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've realised that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, this is by no means my PR, it's yours @romanowski.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't worry, I don't care about that much :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to push changes and then discuss them, than discussing them and forcing you to spend time on it. It's just more efficient 😉.

@@ -78,63 +76,71 @@ class ExtractUsedNames[GlobalType <: CallbackGlobal](val global: GlobalType) ext
}
}

private val DefaultScopes = EnumSet.of(UseScope.Default)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EnumSets are mutable so both this vals should be defs

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the tip, this was not clear to me!

This commit makes minor changes to `ExtractUsednames` in the spirit of
better readability and some microoptimization to leave it the way it was
before (see the rewrite of `add` by a `contains` + `add`).

It also makes some changes to the API: sets some classes to final and
renames `ClassFileManagers` to `ClassFileManager` for consistency with
the rest of the API. In the future, I'm happy to consider a name change,
but for now it's better to stick to the convention of `ClassFileManager`
being acting like a "companion object".
@jvican
Copy link
Member

jvican commented Mar 16, 2017

Oh, I almost forgot @romanowski. Post the benchmarks when you have them, I'm curious to see the results. Here you go with some instructions on how to run them: https://github.com/sbt/zinc/blob/1.0/CONTRIBUTING.md.

I would use the Scala standard library, you can cherry-pick its benchmarks from this commit.

@romanowski
Copy link
Contributor Author

romanowski commented Mar 18, 2017

Benchmarks as promised. I've also have full outputs if needed.

@romanowski
Copy link
Contributor Author

romanowski commented Mar 18, 2017

Regarding @jvican review:

  1. My tests are more or less scripted tests done in Scala code but I also added scripted one.

  2. I don't know if I get this correctly. Do you want me to include a tests for cases like one below?

val List(Option(SealedChildren1(a)), None, Some(b: Sealed), Some(SealedChildren2(c)) = ???
  1. You are right, this is quite simple solution. So it should say that child of sealed trait/class changed zinc recompiles every place where warning can be reported. So there is a lot of space to improve that (the only thing we need to do is to improve mechanism to decide if we depend on given name in PatMat Scope). Good and quite cheap and easy to implement option will be check if default case is provided.

@jvican
Copy link
Member

jvican commented Mar 19, 2017

I don't know if I get this correctly. Do you want me to include a tests for cases like one below?

Yes, that's right, and the same for @:

any match {
  case s @ Some(value) =>
  case None =>

@jvican
Copy link
Member

jvican commented Mar 19, 2017

I will have a look at the benchmarks soon, I'm now unavailable. Thanks for responding to all the feedback @romanowski.

// Register names from pattern match target type in PatMatTarget scope
case ValDef(mods, _, tpt, _) if mods.isCase && mods.isSynthetic =>
updateCurrentOwner()
PatMatDependencyTraverser.traverse(tpt.tpe)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the types of synthetic case vals seems like a fairly robust way to find usages of sealed-ness. As you mentioned, this doesn't cover hidden dependencies from macros (e.g.. shapeless Generic). As long as we provide configuration to let people enable/disable this optimization, that's okay in my view.

@jvican
Copy link
Member

jvican commented Mar 20, 2017

Benchmark results on this one look great, there's a small decrease of 300s in p95, but we can live with it. Thanks for making them.

Last thing on this PR before proposing to merge it: can we add an option to IncOptions that enables/disables this mode as Jason suggests?

In the long term, when the support for sealed classes detection improves, we can make it default. For now, though, this PR may undercompile under some scenarios (macro dependencies) while before we were overcompiling. We know macro dependencies are not common, but we should try to protect users from any case in which Zinc undercompiles.

@romanowski
Copy link
Contributor Author

Option added and optimization is off by default.

@jvican
Copy link
Member

jvican commented Mar 27, 2017

Perfect, this is good to go @romanowski. I'll push the approve button in a few hours, I'm in mobile now. Can someone else have a look at this patch? @eed3si9n @dwijnand

*/
public enum UseScope {
// Don't add more than 6 scopes. Otherwise, change `Mapper` implementation.
Default, Implicit, PatMatTarget
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add documentation on what these scopes represent?

@eed3si9n
Copy link
Member

eed3si9n commented Mar 27, 2017

Added a minor comment, but overall looks good. Nice work!

@romanowski
Copy link
Contributor Author

romanowski commented Mar 28, 2017

Added some documentation to UseScope.

Also backported some small changes from our integration (ClassloaderCache can save us even 10% of compilation time, this is amazing!).

@jvican
Copy link
Member

jvican commented Mar 29, 2017

In future this mechanism can be used in #227 #226

Please, let's avoid this kind of statements without further double checking with maintainers. I explained in the Gitter channel that this is not correct, and this can lead to confusion as shown in #226 and #227.

cunei pushed a commit to cunei/zinc that referenced this pull request Oct 25, 2017
filter properties based on a criteria
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants