Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MatchError in ApiPhase.processScalaUnit with -sourcepath compiler option #1304

Closed
lrytz opened this issue Dec 6, 2023 · 7 comments · Fixed by #1309 · May be fixed by scala/scala#10905
Closed

MatchError in ApiPhase.processScalaUnit with -sourcepath compiler option #1304

lrytz opened this issue Dec 6, 2023 · 7 comments · Fixed by #1309 · May be fixed by scala/scala#10905
Assignees

Comments

@lrytz
Copy link
Contributor

lrytz commented Dec 6, 2023

Setup:

  • small sbt project with two source files for classes A and B
  • set the -sourcepath compiler flag
  • change source file B
  • touch (save) the source file A without modifications

Zinc's incremental compilation runs the compiler on the changed source file B. For class A, the compiler finds both a classfile and a source file. Because the source file has a more recent timestamp it creates a SourcefileLoader for symbol A, which invokes compileLate and creates a plain BatchSourceFile.

The API phase crashes because it expects an xsbti.VirtualFile: https://github.com/sbt/zinc/blob/v1.9.5/internal/compiler-bridge/src/main/scala/xsbt/API.scala#L62

I'm encountering this when switching branches in scala/scala.

(edit from Dale: switched A and B in the setup, to match the description below.)

@Friendseeker
Copy link
Member

Friendseeker commented Dec 8, 2023

Thanks for the bug report & explanation of the bug!

Would you mind to provide link to a small Github Repo as repro? I was trying to follow the setup locally, but without much success. A Github Repo repro would really help out.

@lrytz
Copy link
Contributor Author

lrytz commented Dec 8, 2023

@Friendseeker I managed to reproduce it as a scripted test:

https://github.com/sbt/zinc/compare/develop...lrytz:zinc:t1304?expand=1

[error] Caused by: scala.MatchError: /var/folders/8d/8tnybb497mbgq2v08vgcr5c80000gn/T/sbt_a286cfd0/src/main/scala/A.scala (of class scala.reflect.io.PlainFile)
[error] 	at xsbt.API$ApiPhase.processScalaUnit(API.scala:62)
[error] 	at xsbt.API$ApiPhase.processUnit(API.scala:57)
[error] 	at xsbt.API$ApiPhase.apply(API.scala:53)
[error] 	at scala.tools.nsc.Global$GlobalPhase.applyPhase(Global.scala:453)
[error] 	at scala.tools.nsc.Global$GlobalPhase.run(Global.scala:400)
[error] 	at xsbt.API$ApiPhase.run(API.scala:35)

@lrytz
Copy link
Contributor Author

lrytz commented Dec 8, 2023

A comment I made on our internal chat:

I guess the right fix is to override something in zinc's Global subclass to make sure the apropriate file subclass is created. Not sure if the right hooks are available. I have never looked at the many, many, many File subclasses that exist in the compiler and in zinc, so I'd need to spend some time.

@dwijnand
Copy link
Member

dwijnand commented Jan 3, 2024

It seems, given Lukas' analysis, that the compiler is over-compiling the touched but unchanged A source file, due to the naive incremental compilation logic that exists in the compiler. If the source should have been recompiled, it would've been passed by zinc, as an AbstractZincFile.

So rather than try to fix this by trying to fixup the compilation unit file (in #1309) I think we should try to avoid getting here. Ideally we could get the compiler to not trigger its own incremental compilation logic, somehow. But in the meantime perhaps we can skip the incremental analysis, assuming that we don't wipe out any metadata for A, that is still true - though we'll still have a fresh classfile, with a new timestamp...

@Friendseeker
Copy link
Member

Friendseeker commented Jan 3, 2024

It seems, given Lukas' analysis, that the compiler is over-compiling the touched but unchanged A source file, due to the naive incremental compilation logic that exists in the compiler. If the source should have been recompiled, it would've been passed by zinc, as an AbstractZincFile.

So rather than try to fix this by trying to fixup the compilation unit file (in #1309) I think we should try to avoid getting here. Ideally we could get the compiler to not trigger its own incremental compilation logic, somehow. But in the meantime perhaps we can skip the incremental analysis, assuming that we don't wipe out any metadata for A, that is still true - though we'll still have a fresh classfile, with a new timestamp...

Perhaps we can emit an error when zinc is on and -sourcepath option is set to tell user to remove usage of -sourcepath option and instead rely on zinc to provide all source files.

By the way, are there cases where -sourcepath is strictly needed? If not, deprecating -sourcepath is also an option.

@lrytz
Copy link
Contributor Author

lrytz commented Jan 4, 2024

Sorry for the long comment this turned into...

By the way, are there cases where -sourcepath is strictly needed

The user-facing functionality for -sourcepath is in conjunction with scaladoc's -doc-source-url:

  -doc-source-url <url>                       A URL pattern used to link to the source file, with some variables supported: For example, for `scala.collection.Seq` €{TPL_NAME} gives `Seq`, €{TPL_OWNER} gives `scala.collection`, €{FILE_PATH} gives `scala/collection/Seq`, €{FILE_EXT} gives `.scala`, €{FILE_PATH_EXT} gives `scala/collection/Seq.scala`, and €{FILE_LINE} gives `25` (without the backquotes). To obtain a relative path for €{FILE_PATH} and €{FILE_PATH_EXT} instead of an absolute one, use the -sourcepath setting.

That's the reason people seem to use this setting: https://github.com/search?q=%22-sourcepath%22%20path%3A*.sbt&type=code.

The original use case for -sourcepath (to work like -classpath but for source files) is only needed for compiling the standard library. (I assume the reason it's not a -Y flag is because it was copied from javac.)

The compiler needs to look up certain symbols from the standard library, see https://github.com/scala/scala/blob/v2.13.12/src/reflect/scala/reflect/internal/Definitions.scala#L371.

Usually these symbols are provided by scala-library.jar, i.e., the compiler is invoked with scalac -cp scala-library.jar project/sources/**.

However, when compiling the standard library we don't put scala-library.jar on the classpath, the compiler is invoked as scalac -sourcepath src/library src/library/**. Library symbols need to be found before the corresponding source file is processed by the namer phase, which normally creates new symbols. That requires -sourcepath.

I think @dwijnand is right about removing the compiler's built-in incremental compile option for when both a classfile and a source file is available. That was probably useful back in the days before we had actual incremental compilation, but now it can be removed.

But even then, -sourcepath is needed when doing a full compilation of scala/scala/src/library. This allows the symbols to exist before namer phase. Once the source file is processed by namer, the existing symbol gets a new type and a new position referring to the VirtualFile (updatePosFlags).

It turns out that the SourceLoaders (the lazy types that would trigger compileLate on completion) are always replaced before they are completed, so we don't encounter the plain BatchSourceFile instances that would lead to the MatchError described in this ticket. It seems finding these symbols is enough, their types are never forced early. However, that is probably just by luck.

So in principle we should still go ahead with your PR to zinc, to make sure a VirtualFile is created when completing a SourceLoader.

To clean things up we could

  • remove the compiler's poor invalidation
  • turn -sourcepath into a scaladoc-only setting, or create a new one with a different name
  • deprecate -sourcepath for the compiler
  • create a new -Y flag for compiling the standard library

@dwijnand
Copy link
Member

dwijnand commented Jan 4, 2024

So in principle we should still go ahead with your PR to zinc, to make sure a VirtualFile is created when completing a SourceLoader.

A big source of my hesitation is I don't like the slippery slope that is AnalysisCallback3 extends AnalysisCallback2...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants