
Make watching a bit more resilient to nonatomic bursts of filesystem changes #3009

Closed

Conversation

swaldman
Contributor

I'll apologize in advance for this one. I can't say whether the problem it solves (or at least might solve) is generally worth the extra bit of complexity. Anyway, I won't be insulted at all if the answer is no.

I use mill to build static-site generators. Before I generate, I use mill -w to serve my soon-to-be-static sites on localhost for edit-refresh cycles on whatever I'll be publishing. It works pretty well, but occasionally, for a variety of reasons, the watch cycles fail, and I have to manually restart mill. Rather than just tolerate that (it's tolerable!), I wondered whether it wouldn't be better to try to make watching a bit more robust to the kind of conditions that seem to flummox it.

The most frequent condition, in my case, has to do with emacs lock files. An example error and stack trace is at the bottom of this note. Basically, when I begin to modify a file, emacs creates a lockfile, and mill restarts harmlessly. That's no problem. But when I save my modified file, emacs first saves, then deletes its lockfile. That sets up a race for mill, which walks source directories and then hashes, in some fashion, the files it finds. The walk sometimes occurs before the lockfile is removed, while the attempt to examine the file happens after it has been deleted, leading to a java.nio.file.NoSuchFileException.
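A minimal sketch of that race, using the os-lib walk and perms calls that show up in the stack trace below; srcRoot and files are just illustrative names, not the actual code:

    // The walk enumerates everything under the source root, including
    // emacs's transient ".#" lockfile, which still exists at this point.
    val files = os.walk(srcRoot)

    // ... emacs finishes the save and deletes its lockfile ...

    // The hashing pass then stats each path collected earlier; the
    // now-deleted lockfile throws java.nio.file.NoSuchFileException.
    files.foreach(p => os.perms(p))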

I don't think this issue is likely to be restricted to emacs. In general, I don't think it's uncommon for filesystem changes to happen in short, nonatomic bursts, so maybe it would be good for watching to be resilient to that.

This PR is kind of a dumb attempt at that. In the happy case, it should have very low overhead. But when an Exception would have occurred due to hitting this kind of race badly, we pause 50 milliseconds, then try again.

50 milliseconds is a guess! The intention is to be short enough not to slow builds down noticeably, but long enough that bursts of sequential filesystem changes are likely to complete before the retry. I don't know for sure that it's a great number. But it does seem like at worst it increases the likelihood of a watch succeeding and otherwise does little harm. I'd obviously be thrilled with better numbers or more sophisticated approaches.
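Roughly, the shape of the change is something like the sketch below. This is just an illustration of the approach with hypothetical names (retryOnTransientMiss is not the literal code in the diff):

    // Run a computation; if it fails with a transient NoSuchFileException,
    // pause briefly on the theory that the nonatomic burst of filesystem
    // changes that raced us will have settled, then try exactly once more.
    def retryOnTransientMiss[T](delayMillis: Long = 50)(compute: => T): T =
      try compute
      catch {
        case _: java.nio.file.NoSuchFileException =>
          Thread.sleep(delayMillis)
          compute // if the second attempt also fails, let the exception propagate
      }

    // e.g. wrapping the signature recomputation the watch loop polls:
    //   retryOnTransientMiss() { pathRef.recomputeSig() }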

Normally I'd want to log the initial failure somewhere, but I don't know of a logger I can access in Watchable, and I wasn't sure whether just spewing to Console.err is okay.

Anyway, for what it's worth!

The example error trace is below:

An unexpected error occurred
Exception in thread "MillServerActionRunner" java.nio.file.NoSuchFileException: /Users/swaldman/development/gitproj/drafts.interfluidity.com/drafts/untemplate/com/interfluidity/drafts/mainblog/entries_2024_02/.#entry-situated-vs-unsituated-virtues.md.untemplate
	at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:106)
	at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
	at java.base/sun.nio.fs.UnixFileAttributeViews$Posix.readAttributes(UnixFileAttributeViews.java:257)
	at java.base/sun.nio.fs.UnixFileAttributeViews$Posix.readAttributes(UnixFileAttributeViews.java:168)
	at java.base/sun.nio.fs.UnixFileSystemProvider.readAttributes(UnixFileSystemProvider.java:148)
	at java.base/java.nio.file.Files.readAttributes(Files.java:1851)
	at java.base/java.nio.file.Files.getPosixFilePermissions(Files.java:2125)
	at os.perms$.apply(PermsOps.scala:13)
	at mill.api.PathRef$.$anonfun$apply$3(PathRef.scala:129)
	at mill.api.PathRef$.$anonfun$apply$3$adapted(PathRef.scala:122)
	at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:576)
	at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:574)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:933)
	at scala.collection.IterableOps$WithFilter.foreach(Iterable.scala:903)
	at mill.api.PathRef$.apply(PathRef.scala:122)
	at mill.api.PathRef.recomputeSig(PathRef.scala:21)
	at mill.util.Watchable$Path.poll(Watchable.scala:21)
	at mill.util.Watchable.validate(Watchable.scala:15)
	at mill.util.Watchable.validate$(Watchable.scala:15)
	at mill.util.Watchable$Path.validate(Watchable.scala:20)
	at mill.runner.Watching$.$anonfun$statWatchWait$1(Watching.scala:75)
	at mill.runner.Watching$.$anonfun$statWatchWait$1$adapted(Watching.scala:75)
	at scala.collection.immutable.List.forall(List.scala:386)
	at mill.runner.Watching$.statWatchWait0$1(Watching.scala:75)
	at mill.runner.Watching$.statWatchWait(Watching.scala:98)
	at mill.runner.Watching$.watchAndWait(Watching.scala:67)
	at mill.runner.Watching$.watchLoop(Watching.scala:45)
	at mill.runner.MillMain$.$anonfun$main0$1(MillMain.scala:219)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59)
	at scala.Console$.withErr(Console.scala:193)
	at mill.api.SystemStreams$.$anonfun$withStreams$2(SystemStreams.scala:62)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59)
	at scala.Console$.withOut(Console.scala:164)
	at mill.api.SystemStreams$.$anonfun$withStreams$1(SystemStreams.scala:61)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:59)
	at scala.Console$.withIn(Console.scala:227)
	at mill.api.SystemStreams$.withStreams(SystemStreams.scala:60)
	at mill.runner.MillMain$.main0(MillMain.scala:101)
	at mill.runner.MillServerMain$.main0(MillServerMain.scala:83)
	at mill.runner.MillServerMain$.main0(MillServerMain.scala:35)
	at mill.runner.Server.$anonfun$handleRun$1(MillServerMain.scala:187)
	at java.base/java.lang.Thread.run(Thread.java:833)

@lihaoyi
Member

lihaoyi commented Feb 11, 2024

What version of Mill are you on? This should have been fixed in #2832, I think, which went out in 0.11.6.

@swaldman
Contributor Author

Great!

I am still running 0.11.5. I'll just upgrade.

Thank you!

swaldman closed this Feb 11, 2024