New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Millisecond-precision file modification timestamps using native code #92

Merged
merged 2 commits into from Dec 6, 2017

Conversation

Projects
None yet
4 participants
@cunei
Contributor

cunei commented Dec 1, 2017

See [JDK-8177809] File.lastModified() is losing milliseconds (always ends in 000).

The Java methods lastModified() and getLastModifiedTime(), on many JDKs and operating systems, return a number rounded to the whole second, losing the millisecond part. Since those timestamps are used by sbt to detect file changes, the result is that rapid file changes may go undetected, leading to mysterious errors.
This pull request uses JNA (a native code interface) to replace lastModified() with native calls that return timestamps with full millisecond precision, on Linux 64-bit, Linux 32-bit, Windows, and macOS. The new methods are called getModifiedTime() and setModifiedTime(). The code will fall back to Java if no native support is available, or if the property "sbt.io.jdktimestamps" is defined and is not "false".
A diagnostic method sbt.io.Milli.getMilliSupportDiagnostic(projectDir: File): Option[String] is supplied to check whether the filesystem/JDK/OS in use support sub-second modification timestamps. It is intended to be run once after the project is loaded, with something like:

sbt.io.Milli.getMilliSupportDiagnostic(projectDir) map log.debug

If the test fails, it returns a diagnostic string that will remain available in the logs.

@cunei cunei added the in progress label Dec 1, 2017

@cunei

This comment has been minimized.

Show comment
Hide comment
@cunei

cunei Dec 1, 2017

Contributor

Please do not merge yet, I will use the pull requests to perform additional validations

Contributor

cunei commented Dec 1, 2017

Please do not merge yet, I will use the pull requests to perform additional validations

@ijuma

This comment has been minimized.

Show comment
Hide comment
@ijuma

ijuma Dec 1, 2017

The linked ticket states that the Files method works correctly. Is that not true?

ijuma commented Dec 1, 2017

The linked ticket states that the Files method works correctly. Is that not true?

@eed3si9n

This comment has been minimized.

Show comment
Hide comment
@eed3si9n

eed3si9n Dec 1, 2017

Member

I added the link to JDK-8177809.

@cunei Could you check this?

Member

eed3si9n commented Dec 1, 2017

I added the link to JDK-8177809.

@cunei Could you check this?

@cunei

This comment has been minimized.

Show comment
Hide comment
@cunei

cunei Dec 2, 2017

Contributor

If you are referring to Files.getLastModifiedTime(), the answer is no. On my 16.04 Ubuntu:

println(getModifiedTime(filePath))   // native
1512178359814

println(Files.getLastModifiedTime((new File(filePath)).toPath).toMillis)
1512178359000
Contributor

cunei commented Dec 2, 2017

If you are referring to Files.getLastModifiedTime(), the answer is no. On my 16.04 Ubuntu:

println(getModifiedTime(filePath))   // native
1512178359814

println(Files.getLastModifiedTime((new File(filePath)).toPath).toMillis)
1512178359000
@eed3si9n

This comment has been minimized.

Show comment
Hide comment
@eed3si9n

eed3si9n Dec 2, 2017

Member

@cunei Yup that's the one. Thanks for the checking.

Member

eed3si9n commented Dec 2, 2017

@cunei Yup that's the one. Thanks for the checking.

@eed3si9n

This comment has been minimized.

Show comment
Hide comment
@eed3si9n

eed3si9n Dec 3, 2017

Member

If anyone reading this want to try it at home. From sbt shell, run:

> eval java.nio.file.Files.getLastModifiedTime(java.nio.file.Paths.get("build.sbt")).toMillis
[info] ans: Long = 1512061937000
Member

eed3si9n commented Dec 3, 2017

If anyone reading this want to try it at home. From sbt shell, run:

> eval java.nio.file.Files.getLastModifiedTime(java.nio.file.Paths.get("build.sbt")).toMillis
[info] ans: Long = 1512061937000
@ijuma

This comment has been minimized.

Show comment
Hide comment
@ijuma

ijuma Dec 4, 2017

I checked on macOS with APFS and Java 10:

Welcome to Scala 2.12.4 (Java HotSpot(TM) 64-Bit Server VM, Java 10-ea).
Type in expressions for evaluation. Or try :help.

scala> import java.io.File
import java.io.File

scala> import java.nio.file.Files
import java.nio.file.Files

scala> val f = new File("Downloads")
f: java.io.File = Downloads

scala> Files.getLastModifiedTime(f.toPath).toMillis
res0: Long = 1512376609000

scala> f.lastModified
res1: Long = 1512376609808

So, it looks like Files.getLastModifiedTime is still broken.

ijuma commented Dec 4, 2017

I checked on macOS with APFS and Java 10:

Welcome to Scala 2.12.4 (Java HotSpot(TM) 64-Bit Server VM, Java 10-ea).
Type in expressions for evaluation. Or try :help.

scala> import java.io.File
import java.io.File

scala> import java.nio.file.Files
import java.nio.file.Files

scala> val f = new File("Downloads")
f: java.io.File = Downloads

scala> Files.getLastModifiedTime(f.toPath).toMillis
res0: Long = 1512376609000

scala> f.lastModified
res1: Long = 1512376609808

So, it looks like Files.getLastModifiedTime is still broken.

@ijuma

This comment has been minimized.

Show comment
Hide comment
@ijuma

ijuma Dec 4, 2017

I asked in the nio-dev mailing list and Alan clarified:

The behavior on Linux should be correct for both methods on JDK 10, except if you build OpenJDK with a very old gcc. The discussion on core-libs-dev was 8uX where different versions of gcc are used.

Once this code is updated in JDK 10 then the behavior should be consistent. It will also sort out the issue on OS X (which did not support the higher precision timestamps when originally implemented).

ijuma commented Dec 4, 2017

I asked in the nio-dev mailing list and Alan clarified:

The behavior on Linux should be correct for both methods on JDK 10, except if you build OpenJDK with a very old gcc. The discussion on core-libs-dev was 8uX where different versions of gcc are used.

Once this code is updated in JDK 10 then the behavior should be consistent. It will also sort out the issue on OS X (which did not support the higher precision timestamps when originally implemented).

@cunei

This comment has been minimized.

Show comment
Hide comment
@cunei

cunei Dec 5, 2017

Contributor

Travis fails because of compiler-bridge_2.13.0-M2:1.0.5, but the rest seems otherwise ok.

Contributor

cunei commented Dec 5, 2017

Travis fails because of compiler-bridge_2.13.0-M2:1.0.5, but the rest seems otherwise ok.

@cunei

This comment has been minimized.

Show comment
Hide comment
@cunei

cunei Dec 5, 2017

Contributor

The five commits concerning millisecond timestamps should be good to go; a unit test is still missing, but right now I would proceed with the merge in order to unblock Dale; also, there is no practical way to write a unit test that covers the various platforms: by definition I can only test the native calls on the architecture in use at build time. The startup diagnostic is also intended to warn if native timestamps do not work as intended (see sbt/sbt#3789 (comment))
So, I would merge at this point. The process is a bit tricky, since it involves all five modules and each depends on the other. So, it should work like this:

  • merge #92 against io/1.0.x
  • bump io version number in 1.0.x, release.
  • merge sbt/util#134 against util/1.0.x
  • bump util version number in 1.0.x, bump io dependency version, release.
  • merge sbt/librarymanagement#189 against lm/1.0.x
  • bump lm version number in 1.0.x, bump io, util dependency versions, release.
  • merge sbt/zinc#463 against zinc/1.0.x
  • bump zinc version number in 1.0.x, bump io, util, lm dependency versions, release.
  • merge sbt/sbt#3789 against sbt/1.0.x
  • bump sbt version number in 1.0.x, bump io, util, lm, zinc dependency versions, release.
  • Then, proceed to 1.x.
  • merge each 1.0.x into 1.x.
  • make sure there are no additional lastModified, getLastModifiedTime, setModified, setLastModifiedTime, or copyLastModified calls in 1.x with respect to 1.0.x, and in case replace them with getModifiedTime/setModifiedTime/copyModifiedTime. A tricky bit is that in internal/zinc-core/src/main/scala/sbt/internal/inc/Stamp.scala, util-cache/src/main/scala/sbt/util/FileInfo.scala there are occurrencies of the lastModified identifier which are distinct from the java.io.File one, and should be kept as they are.
  • re-release the whole chain for 1.x, as above.
    I have never released sbt's modules, so I would defer the process to @dwijnand or @eed3si9n. Kindly review the five pull requests and merge as appropriate.
Contributor

cunei commented Dec 5, 2017

The five commits concerning millisecond timestamps should be good to go; a unit test is still missing, but right now I would proceed with the merge in order to unblock Dale; also, there is no practical way to write a unit test that covers the various platforms: by definition I can only test the native calls on the architecture in use at build time. The startup diagnostic is also intended to warn if native timestamps do not work as intended (see sbt/sbt#3789 (comment))
So, I would merge at this point. The process is a bit tricky, since it involves all five modules and each depends on the other. So, it should work like this:

  • merge #92 against io/1.0.x
  • bump io version number in 1.0.x, release.
  • merge sbt/util#134 against util/1.0.x
  • bump util version number in 1.0.x, bump io dependency version, release.
  • merge sbt/librarymanagement#189 against lm/1.0.x
  • bump lm version number in 1.0.x, bump io, util dependency versions, release.
  • merge sbt/zinc#463 against zinc/1.0.x
  • bump zinc version number in 1.0.x, bump io, util, lm dependency versions, release.
  • merge sbt/sbt#3789 against sbt/1.0.x
  • bump sbt version number in 1.0.x, bump io, util, lm, zinc dependency versions, release.
  • Then, proceed to 1.x.
  • merge each 1.0.x into 1.x.
  • make sure there are no additional lastModified, getLastModifiedTime, setModified, setLastModifiedTime, or copyLastModified calls in 1.x with respect to 1.0.x, and in case replace them with getModifiedTime/setModifiedTime/copyModifiedTime. A tricky bit is that in internal/zinc-core/src/main/scala/sbt/internal/inc/Stamp.scala, util-cache/src/main/scala/sbt/util/FileInfo.scala there are occurrencies of the lastModified identifier which are distinct from the java.io.File one, and should be kept as they are.
  • re-release the whole chain for 1.x, as above.
    I have never released sbt's modules, so I would defer the process to @dwijnand or @eed3si9n. Kindly review the five pull requests and merge as appropriate.
@ijuma

This comment has been minimized.

Show comment
Hide comment
@ijuma

ijuma Dec 5, 2017

Btw, is this a significant enough issue that it's worth keeping native code around even though it has been fixed in the JDK itself?

ijuma commented Dec 5, 2017

Btw, is this a significant enough issue that it's worth keeping native code around even though it has been fixed in the JDK itself?

Reverted sbt.version, all checks passed.

@cunei

This comment has been minimized.

Show comment
Hide comment
@cunei

cunei Dec 5, 2017

Contributor

The fix is scheduled for inclusion only in JDK 10, but we will need to support JDK 8 and JDK 9 for many years to come. Also, this issue caused some of our tests to fail non-deterministically, and caused a bunch of other collateral issues. So, yep, we really ought to fix this.

Contributor

cunei commented Dec 5, 2017

The fix is scheduled for inclusion only in JDK 10, but we will need to support JDK 8 and JDK 9 for many years to come. Also, this issue caused some of our tests to fail non-deterministically, and caused a bunch of other collateral issues. So, yep, we really ought to fix this.

@ijuma

This comment has been minimized.

Show comment
Hide comment
@ijuma

ijuma Dec 5, 2017

JDK 9 will be EOL'd by Oracle in March when JDK 10 ships (it's not a LTS release), so I'd be surprised if anyone supports it for a long time. JDK 8 will be around for a while, I agree.

However, people have the option of running SBT with JDK 10+ while supporting older JDK versions via the --release flag (there are plans for scalac to support this flag too) so there won't be a strong reason to run SBT with Java 8 as far as I can tell.

Anyway, you are the ones maintaining the code so your call. :)

ijuma commented Dec 5, 2017

JDK 9 will be EOL'd by Oracle in March when JDK 10 ships (it's not a LTS release), so I'd be surprised if anyone supports it for a long time. JDK 8 will be around for a while, I agree.

However, people have the option of running SBT with JDK 10+ while supporting older JDK versions via the --release flag (there are plans for scalac to support this flag too) so there won't be a strong reason to run SBT with Java 8 as far as I can tell.

Anyway, you are the ones maintaining the code so your call. :)

@eed3si9n

This comment has been minimized.

Show comment
Hide comment
@eed3si9n

eed3si9n Dec 5, 2017

Member

We came across this behavior during integration testing, but we also suspect it's been affecting other areas in sbt such as triggered execution and incremental compilation.

Member

eed3si9n commented Dec 5, 2017

We came across this behavior during integration testing, but we also suspect it's been affecting other areas in sbt such as triggered execution and incremental compilation.

@eed3si9n

This comment has been minimized.

Show comment
Hide comment
@eed3si9n

eed3si9n Dec 5, 2017

Member

@cunei Thanks for all the ripple PRs. I'm going to run Dbuild and see what happens.

Member

eed3si9n commented Dec 5, 2017

@cunei Thanks for all the ripple PRs. I'm going to run Dbuild and see what happens.

cunei added some commits Nov 30, 2017

Add jna, jna-platform, and Millisecond timestamps native support
See [[JDK-8177809] File.lastModified() is losing milliseconds
(always ends in 000)](https://bugs.openjdk.java.net/browse/JDK-8177809).

The Java methods `lastModified()` and `getLastModifiedTime()`, on
many JDKs and operating systems, return a number rounded to the
whole second, losing the millisecond part. Since those timestamps
are used by sbt to detect file changes, the result is that rapid
file changes may go undetected, leading to mysterious errors.

This pull request uses JNA (a native code interface) to replace
`lastModified()` with native calls that return timestamps with full
millisecond precision, on Linux 64-bit, Linux-32bit, Windows, and
OSX. The new methods are called `getModifiedTime()` and
`setModifiedTime()`. The code will fall back to Java if no native
support is available, or if the property "sbt.io.jdktimestamps" is
defined and is not `"false"`.

A diagnostic method
`sbt.internal.io.Milli.getMilliSupportDiagnostic(projectDir: File):
Option[String]` is supplied to check whether the filesystem/JDK/OS
in use support sub-second modification timestamps. It is intended
to be run once after the project is loaded, with something like:
```
sbt.internal.io.Milli.getMilliSupportDiagnostic(projectDir) map log.debug
```
If the test fails, it returns a diagnostic string that will remain
available in the logs.
Convert lastModified/setLastModified to Milli calls
Also preserve the current behavior of IO.copyLastModified(),
for compatibility.

The method IO.copyLastModified() has an underspecified, and generally
inconsistent behavior. In particular, if the source file is missing,
it will silently set the target modification time to 1st January 1970,
returning success. Also, because it uses lastModified(), it may
trim away the millisecond part of the timestamp without notice.

The replacement copyModifiedTime(), instead, has true millisecond
precision. Also, it will correctly throw a FileNotFoundException
if any of the two files are missing, or an IOException in case an
IO exception occurs.

This commit deprecates copyLastModified() in favor of the new
copyModifiedTime(), adding a Scaladoc explanation.

@cunei cunei changed the base branch from 1.0.x to 1.x Dec 6, 2017

@eed3si9n eed3si9n changed the base branch from 1.x to 1.1.x Dec 6, 2017

@eed3si9n

This comment has been minimized.

Show comment
Hide comment
@eed3si9n

eed3si9n Dec 6, 2017

Member

As per discussion during sbt 1 meeting, I am retargeting this to 1.1.x branch.

Member

eed3si9n commented Dec 6, 2017

As per discussion during sbt 1 meeting, I am retargeting this to 1.1.x branch.

@eed3si9n eed3si9n merged commit 66c06ea into sbt:1.1.x Dec 6, 2017

@eed3si9n eed3si9n removed the in progress label Dec 6, 2017

@eed3si9n

This comment has been minimized.

Show comment
Hide comment
@eed3si9n

eed3si9n Dec 6, 2017

Member

The dbuild validation ran up until the vscode stuff, which is not yet in 1.0.x so I'd consider it a pass.

Member

eed3si9n commented Dec 6, 2017

The dbuild validation ran up until the vscode stuff, which is not yet in 1.0.x so I'd consider it a pass.

@cunei

This comment has been minimized.

Show comment
Hide comment
@cunei

cunei Dec 6, 2017

Contributor

I temporarily used as base 1.x for util and lm as well, since 1.1.x had not been created yet.
The validator has compiled all the branches successfully, but using the result binaries to re-compile sbt failed. It seems there are more points in the code where sbt tries to get timestamps of non-existent files. Here is the run: https://jenkins.scala-sbt.org:8592/job/sbt-community/90/console

[sbt-zinc] [error] java.io.FileNotFoundException: No such file or directory: /localhome/jenkinssbt/workspace/sbt-community/target-0.9.9/extraction/76f69ff08fa3644f09af512f3ac4b1ba8edd8ec1/projects/698b3541df4104f83b9b3935c97ecdb2d797dbc7/project/project/target/scala-2.12/sbt-1.0/classes
[sbt-zinc] [error] 	at sbt.internal.io.PosixMilli.checkedIO(Milli.scala:102)
[sbt-zinc] [error] 	at sbt.internal.io.Linux64Milli$.getModifiedTimeNative(Milli.scala:187)
[sbt-zinc] [error] 	at sbt.internal.io.Linux64Milli$.getModifiedTimeNative(Milli.scala:182)
[sbt-zinc] [error] 	at sbt.internal.io.MilliNative.getModifiedTime(Milli.scala:63)

I can keep bypassing the exception, but inspecting files that do not actually exist is probably not a great idea; I'd rather check why that happens, and fix the client code instead. OK? @eed3si9n

Contributor

cunei commented Dec 6, 2017

I temporarily used as base 1.x for util and lm as well, since 1.1.x had not been created yet.
The validator has compiled all the branches successfully, but using the result binaries to re-compile sbt failed. It seems there are more points in the code where sbt tries to get timestamps of non-existent files. Here is the run: https://jenkins.scala-sbt.org:8592/job/sbt-community/90/console

[sbt-zinc] [error] java.io.FileNotFoundException: No such file or directory: /localhome/jenkinssbt/workspace/sbt-community/target-0.9.9/extraction/76f69ff08fa3644f09af512f3ac4b1ba8edd8ec1/projects/698b3541df4104f83b9b3935c97ecdb2d797dbc7/project/project/target/scala-2.12/sbt-1.0/classes
[sbt-zinc] [error] 	at sbt.internal.io.PosixMilli.checkedIO(Milli.scala:102)
[sbt-zinc] [error] 	at sbt.internal.io.Linux64Milli$.getModifiedTimeNative(Milli.scala:187)
[sbt-zinc] [error] 	at sbt.internal.io.Linux64Milli$.getModifiedTimeNative(Milli.scala:182)
[sbt-zinc] [error] 	at sbt.internal.io.MilliNative.getModifiedTime(Milli.scala:63)

I can keep bypassing the exception, but inspecting files that do not actually exist is probably not a great idea; I'd rather check why that happens, and fix the client code instead. OK? @eed3si9n

@fommil

This comment has been minimized.

Show comment
Hide comment
@fommil

fommil Jan 25, 2018

Cool! However I recommend having a pure jvm fallback in case of exception. Your detection logic could fail, or abis could change, then we're basically not on the jvm anymore.

fommil commented Jan 25, 2018

Cool! However I recommend having a pure jvm fallback in case of exception. Your detection logic could fail, or abis could change, then we're basically not on the jvm anymore.

@fommil

This comment has been minimized.

Show comment
Hide comment
@fommil

fommil Jan 25, 2018

Also, are you sure JNA isn't adding a performance bottleneck? These are very frequent calls and even submilli slowdowns will become apparent on large codebase as multisecond slowdowns. We really need JNI bulk operations to avoid problems. But does it also suffer from precision problems? I'd personally prefer performance over precision in this case, note that we cannot say "correctness" because we're still limited by the filesystem precision.

fommil commented Jan 25, 2018

Also, are you sure JNA isn't adding a performance bottleneck? These are very frequent calls and even submilli slowdowns will become apparent on large codebase as multisecond slowdowns. We really need JNI bulk operations to avoid problems. But does it also suffer from precision problems? I'd personally prefer performance over precision in this case, note that we cannot say "correctness" because we're still limited by the filesystem precision.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment