watchSources is redundant and not precise #3718

OlegYch · 2017-11-06T18:47:42Z

This is more of an idea than an issue.

Recent issues with ~ like #3695 and #3501, and continuing performance problems with change detection (although slightly alleviated by NIO watching and upcoming enhancements in hashing) have led me to think that watchSources needs to go.

Instead ~ has to deduce the set of files consumed by each task and feed the changes to the task.
One way to do that might be to track files accessed by each task in a bit of global mutable in-memory state - ie each task would be given a (cache: (File => (File => T) => T), modified: Set[File]). This state intentionally would not survive sbt restart, but imho this is ok at the moment, though it would be nice to unify that with existing caching facilities.

Perhaps there are better ways, but this was the first thing i could imagine and i'd like to receive some feedback on this

The text was updated successfully, but these errors were encountered:

ShaneDelmore · 2017-11-06T23:03:55Z

I use watchSources to depend on more than what is referenced during compilation. For example, I may have test oracles files I want to compare output too, zipped files, json files, things like that, and when my code changes I want to rerun my tests, but also if any of my test oracles change I want to rerun my tests. Isn't this what watchSources is designed for?

OlegYch · 2017-11-06T23:53:10Z

@ShaneDelmore good point
run and test depend on products which reads config files, and you could make it depend on other files which are only read at runtime, just like you do with watchSources, so that usecase should be well covered

ShaneDelmore · 2017-11-07T01:04:41Z

Just to make sure I am understanding this, is the proposed change to add test oracles to products instead of watchSources to enable ~test to catch changes to test resources?

OlegYch · 2017-11-07T01:37:02Z

@ShaneDelmore any of the tasks that all of test testQuick and testOnly depend on, Test / products might be a good option
the proposed change is not limited to tests though, but rather to any task that reads files
currently watchSources has no idea what task it's used for, and therefore is either too eager or misses stuff and cannot be used for incremental change detection

ShaneDelmore · 2017-11-07T03:04:21Z

In that case, I see your point but fear the outcome. I imagine conversations like these:

NewSbtUser: I want to make ~test rerun tests when I change a file, how do I do it?
SbtExpert: Add the files to watchSources.
NewSbtUser: That makes sense, thanks!

Vs
NewSbtUser: I want to make ~test rerun tests when I change a file, how do I do it?
SbtExpert: Add the files to any that test, testQuick, and testOnly depend on.
NewSbtUser: Huh?

I’m trying to see how this conversation ends with a happy user. I’m not saying veto it for that reason alone, but hopefully we at least have a good answer for this.

eatkins · 2018-11-09T17:53:21Z

@OlegYch I am not sure that your suggestion is really practical. I can't imagine an implementation that wouldn't require adding a wrapper for every task that reads a file and much of that file consumption may be downstream from sbt. I do, however, think that watchSources should be better scoped than they currently are. For example, suppose that project foo depends on project bar but only for the compile configuration, if you touch a bar test source file, it will trigger a rebuild in ~foo/compile which doesn't make sense.

In general, this is really hard to get right without manual intervention. For example, as a simple strawman, suppose that we reworked things so that there was a single task:
val allSourceFiles = taskKey[Seq[Source]]("All of the compilation source files")
that was defined as

allSourceFiles := Seq(
  Source(unmanagedSourceDirectories.value, "*.scala" | "*.java", NothingFilter),
  Source(managedSourceDirectories.value, "*.scala" | "*.java", NothingFilter)
)

We still couldn't reuse this task for watchSources because the managed source files are updated during compilation, which would cause them to trigger a build, leading to an infinite loop. I believe that the cache of modified files would also suffer from this problem. The only way to fix that would be to add a bunch of exclusion rules, at which point we may as well just whitelist the files that should trigger anyway.

Where we can do better though is with documentation. I don't think that the documentation of ~ is especially helpful. I also have added some functionality that is completely undocumented and am hoping to update that before the 1.3.0 release.

OlegYch · 2018-11-09T18:13:26Z

@eatkins yep this definitely needs more thought
maybe we could take a look at what bazel, mill etc do with their pervasive caching

eatkins · 2018-11-09T18:26:56Z

Yeah, there's no doubt that there are things we can learn from other tools. That being said, last time I looked at Mill back in May, it was clear to me that if big projects start using it, they are going to have to rewrite the file watching portion. It currently uses a dead simple polling implementation that has terrible performance once the number of files it's monitoring exceeds O(1000).

OlegYch · 2018-11-09T23:43:13Z

@eatkins that's kinda beside the point, this issue is more about accuracy of watching changed sources (or task inputs in general) which is pretty low atm in sbt
better scoping of watchSources might be a start, but accurately tracking all inputs should be the goal imo

eatkins · 2018-11-10T02:19:35Z

Fair enough, though I still think it's worth pointing out that mill is not definitively better than sbt at watching for source changes. In my opinion, the main difference is that mill is much more opinionated about how the build is structured than sbt is. I don't think that it is more accurate per se. If you do not override the default source directories and you don't try to do anything other than compile scala and java sources, then I think sbt is just as accurate as mill.

Sbt makes it really easy to add any file as a watch source regardless of whether or not it is actually a compilation source. Granted, this functionality is not well documented so probably not readily available to most end users. That is more of a documentation problem than a technical problem. To me, the big technical challenge is that practically everything is override-able in sbt and also that with a large user base, there are a ton of nasty edge cases, e.g. symlinked directories or source files. If a project deviates from standard layout, or if it adds extra build steps, like running make in a jni project, then it is on the user to manually update the watchSources. In general, this is quite easy though, because, take the jni case, you can just do something like

watchSources += Source(jniDir, "*.c" | "*.h", NothingFilter)

and now it will trigger on any changes to all of those files. I'm not sure how easy it is to do something like that with other build tools.

Out of curiosity, what are your biggest issues with ~ today? In my projects, I find that it works pretty consistently well in sbt > 1.1.5, and especially in sbt 1.3.x (which is available as a nightly build). More specifically, in which situation do you find it to be inaccurate?

eatkins · 2018-11-10T02:24:37Z

I should add that I've put an enormous amount of work into improving ~ (much of which is still pending in the 1.3.0 release) and I want to ensure that it is meeting the needs of end users. For the most part, I believe that we've solved the hard problem: reliably watch any directory or file for changes even when there are symlinks involved. Watching the right files should be relatively easy in comparison.

OlegYch · 2018-11-10T16:26:04Z

@eatkins atm the immediately visible problem is that almost no plugins that use custom source dirs (eg sbt-protoc) adjust watchSources
i don't think that's an easy problem to solve, and will probably require changes in sbt api
otherwise the watching subsystem seems to work pretty good after your recent changes (in 1.2.x)

eatkins · 2018-11-10T18:18:21Z

Aah, yes, that makes a ton of sense. I think that if the watchSources were scoped by task (and by this I really mean that the ~ command should correctly apply scopes to the watchSources key given the tasks that it's going to run), then it wouldn't be too bad to get plugins to work with ~ out of the box. If this functionality existed, then we could update the docs to suggest that plugin authors add a $PLUGIN_TASK / watchSources := ... in their plugin definition and reach out to the more popular plugins and try and get them to add watchSources.

OlegYch · 2018-11-14T18:15:55Z

this sounds like a good start (though i imagined more pervasive inputs tracking)
not sure how this will work though, eg if you run compile and it depends on protoc (via generatedSources), how would ~compile know about protoc / watchSources ?

eatkins · 2018-11-14T19:02:57Z

I could be mistaken, but I think it's possible to get all of the dependent scopes for a task. If that's the case, then if compile depends on protoc, then we can append protoc / watchSources to the compile / watchSources list.

eatkins · 2018-11-16T18:59:05Z

I'm actually keen to address this because I've run into some frustrations with the scoping myself recently.

eatkins · 2018-11-21T02:35:26Z

@OlegYch my hunch was correct. I will have a PR up soon that works exactly the way I described in the comment.

eatkins · 2018-11-23T19:05:22Z

@OlegYch I've been thinking about it some more and I don't think I quite understood what you were saying. Upon further reflection, I think that you're right that we should deprecate and remove watchSources. My current idea is that we should add a new key (I'm not committed to this name):

val taskSources = settingKey[Seq[sbt.io.internal.Source]]("The sources for a particular task")

The idea is that taskSources are all of the files that specify the unmanaged inputs to a task. Note, however, that because they are specified with sbt.io.internal.Source rather than Seq[File] the taskSources key can be a SettingKey rather than a TaskKey, which is better for performance.

To reimplement the compile task, we'd have something like

Compile / compile / taskSources := unmanagedSourceDirectories.value.map {
  d => Source(d.value, includeFilter in unmanagedSources, excludeFilter in unmanagedSources)
}
Compile / compile / sources := taskSources.value.flatMap {
  s => IO.listFiles(s.base, s.includeFilter -- s.excludeFilter)
} ++ managedSources.value

With these definitions, we could use the taskSources for watchSources. The improvement is that any changes to taskSources automatically propagate to watchSources. For many (most?) tasks, there would be no distinction between sources and taskSources.

I think the syntax for all of this can be made pretty nice. For example, in the protobuf plugin use case, we should be able to do something like:

protoc / taskSources := protocSourceDirectory.value / ** / *.proto

and then the protoc task (and any task that transitively depens on it) will automatically watch all "*.proto" files.

If we had this infrastructure, we could also build out caching/persistence of setting/task output similar to what mill does: http://www.lihaoyi.com/mill/page/tasks.html. This would be nice because there are certain tasks that don't neatly fit into the task/setting dichotomy because they should only run if their inputs have actually changed. For compile, we almost get this for free with zinc, but most other tasks will run even if there is no chance that their output will change.

eatkins · 2018-11-23T19:08:05Z

@eed3si9n @dwijnand I'm curious what you think about ^^

dwijnand · 2018-11-23T21:29:22Z

Sounds reasonable to me.

dwijnand · 2019-05-31T07:38:04Z

According to #4512 (comment), fixed in #4512.

eed3si9n mentioned this issue Jan 10, 2018

sbt 1 triggered execution (file watch, ~, tilde) epic #3868

Closed

eed3si9n added the area/watch Issues around ~ (triggered execution, watch) label Jun 29, 2018

eed3si9n added the uncategorized Used for Waffle integration label Sep 18, 2018

eed3si9n removed the x/waffle label Nov 8, 2018

eatkins mentioned this issue Dec 12, 2018

Support disabling the sbt task linter #4485

Merged

1 task

eatkins mentioned this issue Jan 11, 2019

Watch rewrite #4512

Merged

1 task

dwijnand closed this as completed May 31, 2019

dwijnand added this to the 1.3.0 milestone May 31, 2019

github-actions bot locked as resolved and limited conversation to collaborators May 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

watchSources is redundant and not precise #3718

watchSources is redundant and not precise #3718

OlegYch commented Nov 6, 2017 •

edited

Loading

ShaneDelmore commented Nov 6, 2017

OlegYch commented Nov 6, 2017 •

edited

Loading

ShaneDelmore commented Nov 7, 2017

OlegYch commented Nov 7, 2017

ShaneDelmore commented Nov 7, 2017

eatkins commented Nov 9, 2018

OlegYch commented Nov 9, 2018

eatkins commented Nov 9, 2018 •

edited

Loading

OlegYch commented Nov 9, 2018

eatkins commented Nov 10, 2018 •

edited

Loading

eatkins commented Nov 10, 2018

OlegYch commented Nov 10, 2018

eatkins commented Nov 10, 2018

OlegYch commented Nov 14, 2018

eatkins commented Nov 14, 2018

eatkins commented Nov 16, 2018

eatkins commented Nov 21, 2018

eatkins commented Nov 23, 2018 •

edited

Loading

eatkins commented Nov 23, 2018

dwijnand commented Nov 23, 2018

dwijnand commented May 31, 2019

watchSources is redundant and not precise #3718

watchSources is redundant and not precise #3718

Comments

OlegYch commented Nov 6, 2017 • edited Loading

ShaneDelmore commented Nov 6, 2017

OlegYch commented Nov 6, 2017 • edited Loading

ShaneDelmore commented Nov 7, 2017

OlegYch commented Nov 7, 2017

ShaneDelmore commented Nov 7, 2017

eatkins commented Nov 9, 2018

OlegYch commented Nov 9, 2018

eatkins commented Nov 9, 2018 • edited Loading

OlegYch commented Nov 9, 2018

eatkins commented Nov 10, 2018 • edited Loading

eatkins commented Nov 10, 2018

OlegYch commented Nov 10, 2018

eatkins commented Nov 10, 2018

OlegYch commented Nov 14, 2018

eatkins commented Nov 14, 2018

eatkins commented Nov 16, 2018

eatkins commented Nov 21, 2018

eatkins commented Nov 23, 2018 • edited Loading

eatkins commented Nov 23, 2018

dwijnand commented Nov 23, 2018

dwijnand commented May 31, 2019

OlegYch commented Nov 6, 2017 •

edited

Loading

OlegYch commented Nov 6, 2017 •

edited

Loading

eatkins commented Nov 9, 2018 •

edited

Loading

eatkins commented Nov 10, 2018 •

edited

Loading

eatkins commented Nov 23, 2018 •

edited

Loading