
@blainefreestone blainefreestone commented Jul 21, 2025

Why

The content of the .tasty file differed between Bazel builds, making classes.jar files non-deterministic. Example diff:

@@ -76,7 +76,7 @@
 000004b0  6f 75 72 63 65 46 69 6c  65 01 88 69 6e 74 65 72  |ourceFile..inter|
 000004c0  6e 61 6c 02 83 f9 01 83  02 84 01 84 01 82 3f 85  |nal...........?.|
 000004d0  9d 01 85 00 d0 01 e2 5f  5f 73 61 6e 64 62 6f 78  |.......__sandbox|
-000004e0  2f 36 2f 5f 6d 61 69 6e  2f 73 72 63 2f 6a 76 6d  |/6/_main/src/jvm|
+000004e0  2f 39 2f 5f 6d 61 69 6e  2f 73 72 63 2f 6a 76 6d  |/9/_main/src/jvm|
 000004f0  2f 63 6f 6d 2f 6c 75 63  69 64 63 68 61 72 74 2f  |/com/lucidchart/|
 00000500  61 64 6d 69 6e 2f 73 65  72 76 69 63 65 2f 75 74  |admin/service/ut|
 00000510  69 6c 2f 63 6f 6e 66 69  67 2f 46 65 61 74 75 72  |il/config/Featur|

As you can see, the Scala compiler is embedding the source file path. In the first build it was __sandbox/6/_main/src/jvm... and in the second it was __sandbox/9/_main/....

The embedded path has to be independent of the machine and the sandbox directory for the build to be deterministic.

What

Added the -sourceroot flag to the compilation command, set to the absolute path of the workDir variable. With this flag, the source path written to the .tasty file is relative to that directory. The flag is only added for Scala 3, where it was introduced. The option is also filtered out of the analysis_store file before that file is written, because its value makes that file non-deterministic.
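To make the idea concrete, here is a minimal sketch in Python. It is not the worker code from this PR. The function names, the flat list-of-strings option format, and the example paths are all assumptions made for illustration. It shows the three pieces described above: adding the flag only for Scala 3, stripping the flag and its value before the options are persisted, and why a path relative to the source root no longer depends on the sandbox number.

```python
from pathlib import PurePosixPath


def add_sourceroot(options, work_dir, scala_major):
    """Return options with -sourceroot appended, but only for Scala 3+.

    Hypothetical helper: assumes options are a flat list of strings.
    """
    if scala_major >= 3:
        return [*options, "-sourceroot", work_dir]
    return list(options)


def strip_sourceroot(options):
    """Drop -sourceroot and the value that follows it.

    Hypothetical helper: models filtering the option out before the
    options are written to a persisted analysis file.
    """
    result = []
    skip_next = False
    for opt in options:
        if skip_next:
            # This element is the value of -sourceroot; drop it too.
            skip_next = False
            continue
        if opt == "-sourceroot":
            skip_next = True
            continue
        result.append(opt)
    return result


def recorded_path(source, sourceroot):
    """Model of the path recorded once it is made relative to the root."""
    return str(PurePosixPath(source).relative_to(sourceroot))


# Illustrative paths only; the real sandbox layout may differ.
first = recorded_path(
    "/work/__sandbox/6/_main/src/jvm/com/example/Foo.scala",
    "/work/__sandbox/6/_main",
)
second = recorded_path(
    "/work/__sandbox/9/_main/src/jvm/com/example/Foo.scala",
    "/work/__sandbox/9/_main",
)
print(first == second)
```

Both calls yield the same relative path even though the sandbox directories differ, which is the property the .tasty output needs.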

Testing

I pointed Bazel at my local repository with these changes and then built //src/jvm/com/lucidchart/admin/service/util/config:config a couple of times. Before the change, the difference shown in the example above was present. After the change, the .tasty file recorded a relative source file path and there was no difference between runs.

000004a0  81 0b 82 f6 82 0b 82 f6  83 0b 82 f6 84 01 8a 53  |...............S|
000004b0  6f 75 72 63 65 46 69 6c  65 01 88 69 6e 74 65 72  |ourceFile..inter|
000004c0  6e 61 6c 02 83 f9 01 83  02 84 01 84 01 82 3f 85  |nal...........?.|
000004d0  9d 01 85 00 d0 01 d0 73  72 63 2f 6a 76 6d 2f 63  |.......src/jvm/c|
000004e0  6f 6d 2f 6c 75 63 69 64  63 68 61 72 74 2f 61 64  |om/lucidchart/ad|
000004f0  6d 69 6e 2f 73 65 72 76  69 63 65 2f 75 74 69 6c  |min/service/util|
00000500  2f 63 6f 6e 66 69 67 2f  46 65 61 74 75 72 65 4e  |/config/FeatureN|
00000510  6f 74 69 66 69 63 61 74  69 6f 6e 43 6f 6e 66 69  |otificationConfi|
00000520  67 2e 73 63 61 6c 61 17  81 92 17 81 e6 3f 83 9d  |g.scala......?..|

I also tested that this doesn't invalidate the Bazel cache by running it multiple times and getting cache hits. I also investigated the commands with aquery and they were not changing between runs (despite different sandbox worker directories involved).

The determinism build showed a decrease of 294 in Scala output mismatches (from 424 to 130). Almost every classes.jar and analysis_store.gz (and text) mismatch has been eliminated. Dozens still remain, but a quick analysis of a couple of them suggests they are unrelated to this fix and caused by other, specific factors. These will have to be fixed individually.

Automated Test

The tests/determinism directory contains an automated test which builds a dummy target 5 times and compares the TASTy files between runs to ensure determinism. When this test was run locally before the change, it failed. After the change, it succeeded. The test takes about 5-10 minutes to run because the output base is cleaned between runs and the cache is disabled to ensure accurate results.
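The comparison step of such a test can be sketched as follows. This is not the script in tests/determinism. It assumes each run's .tasty files have already been extracted into their own directory, and the function names are hypothetical. It hashes every .tasty file per run and reports any file whose digest differs from the first run, or which is missing from one of the runs.

```python
import hashlib
from pathlib import Path


def tasty_digests(run_dir):
    """Map each .tasty file (path relative to run_dir) to its SHA-256."""
    root = Path(run_dir)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*.tasty"))
    }


def nondeterministic_files(run_dirs):
    """Return sorted relative paths that differ from the first run.

    Requires at least one run directory. A file that is missing from
    any run is also reported as a mismatch.
    """
    baseline, *others = [tasty_digests(d) for d in run_dirs]
    mismatched = set()
    for digests in others:
        for name in baseline.keys() | digests.keys():
            if baseline.get(name) != digests.get(name):
                mismatched.add(name)
    return sorted(mismatched)
```

An empty result means every compared .tasty file was byte-identical across runs. Comparing digests rather than timestamps or sizes keeps the check sensitive to single-byte differences like the sandbox number in the example diff.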

@blainefreestone blainefreestone changed the title from "Fix tasty file non-determinism with sourceroot argument in multiplex worker" to "Fix tasty file non-determinism with sourceroot argument in multiplex worker and analysis_store.gz changes" Jul 21, 2025
@blainefreestone blainefreestone force-pushed the bfreestone-fixes-source-filepath-determinism-issue branch from 0e5232a to df73027 July 25, 2025 15:01
@blainefreestone blainefreestone marked this pull request as ready for review July 25, 2025 15:27
…order to make the filepath relative in the tasty file

this should not invalidate the bazel cache every time because of the
path mapping set in the bazel rule config

remove sourceroot flag from analysis_store file content (it's not deterministic)

add -sourceroot conditionally (it was introduced in Scala 3); change from --sourceroot to -sourceroot

!fixup add -sourceroot

!fixup add -sourceroot

!fixup add -sourceroot

add comment explaining change
@blainefreestone blainefreestone force-pushed the bfreestone-fixes-source-filepath-determinism-issue branch from df73027 to 1c8d0e1 July 28, 2025 22:56
@blainefreestone blainefreestone force-pushed the bfreestone-fixes-source-filepath-determinism-issue branch from 3e7609b to 0fb2ec3 July 29, 2025 19:26
@blainefreestone blainefreestone merged commit 05d2c06 into lucid-master Jul 29, 2025
1 check passed

3 participants