Experiment in lock-free module invalidation #9639

hubertp · 2024-04-05T11:04:53Z

Pull Request Description

Experimenting with invalidating modules' indexes without requiring full write-context locks. That should noticeably speed up execution, since index invalidation no longer has to wait for a broad lock.
It also improves performance by letting the background job executor run on a thread pool larger than 1.

Checklist

Please ensure that the following checklist has been satisfied before submitting the PR:

The documentation has been updated, if necessary.
Screenshots/screencasts have been attached, if there are any visual changes. For interactive or animated visual changes, a screencast is preferred.
All code follows the Scala, Java, and Rust style guides. In case you are using a language not listed above, follow the Rust style guide.
All code has been tested:
- Unit tests have been written where possible.
- If GUI codebase was changed, the GUI was tested when built using ./run ide build.

Deserialization of suggestions sounds like a fully parallelizable job that can be done independently for individual libraries. In fact, it looks like all jobs have sufficient synchronization primitives to avoid problems, so we can easily increase the pool.

4e6 · 2024-04-08T10:42:53Z

...ment-common/src/main/scala/org/enso/interpreter/instrument/job/AnalyzeModuleInScopeJob.scala

@@ -60,8 +61,13 @@ final class AnalyzeModuleInScopeJob(
        exports = ModuleExportsDiff.compute(prevExports, newExports),
        updates = SuggestionDiff.compute(Tree.empty, newSuggestions)
      )
-      sendModuleUpdate(notification)
-      module.setIndexed(true)
+      if (!module.indexed()) {


Now I see that because it is effectively the only place that marks the module as indexed, just synchronizing around the indexed flag should be enough 👍

JaroslavTulach

Please find some smaller issues spread as inline comments. The bigger concern is expressed here.

Thanks for bringing this isIndex() and setIndexed() flag to my attention. I never knew what it is good for. Now it looks like it is related to the suggestion database. I am scared whenever I see synchronization being spread across multiple files. I am horrified seeing synchronization being spread across multiple projects (Module is in runtime while the main logic is in runtime-instrument-common).

We desperately need some encapsulation.

It is probably acceptable (in the current state of affairs) for Module to hold some internal state needed by the suggestion DB and let the runtime-instrument-common module invalidate it and schedule its recomputation. However, it is not acceptable for a different module to poke around the Module flag and change its value randomly. Making such a field synchronized doesn't improve the situation at all.

JaroslavTulach · 2024-04-09T04:51:55Z

...ent-common/src/main/scala/org/enso/interpreter/instrument/execution/JobExecutionEngine.scala

      false
    )

  private val backgroundJobExecutor: ExecutorService =
-    context.newFixedThreadPool(1, "background-job-pool", false)
+    context.newCachedThreadPool("background-job-pool", 1, 4, 20, false)


Using more CPUs is probably desirable, but it also opens a new space for race conditions. I'd like to know the benefits, before we go that route.

Deserialization jobs are quite expensive in terms of execution time and there is no need to execute them sequentially.
This is particularly important when modules' invalidation is triggered.

Deserialization jobs are quite expensive in terms of execution time

Are they? We are supposed to have lazy deserialization; in that case most of the deserialization happens later, during code execution, anyway. However, I can see that SuggestionsCache still relies on the old java.io.ObjectInputStream mechanism, probably because we are not benchmarking SuggestionsCache in our startup benchmarks.

This is particularly important when modules' invalidation is triggered.

Maybe you want to share some profiling output to demonstrate how important this is.

Btw. background-job-pool is probably used for many tasks other than deserialization. Are they all ready for parallel processing?
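To make the concern about parallel readiness concrete, here is a minimal sketch using plain `java.util.concurrent` rather than Enso's `context.newFixedThreadPool` / `context.newCachedThreadPool` helpers (their parameter meanings are not spelled out in this thread, so they are not modelled here). `BackgroundPoolSketch` and `maxConcurrency` are hypothetical names. It only measures how many jobs overlap: with one thread, jobs can never overlap; with more threads they may, which is exactly where unsynchronized shared state starts to matter.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class BackgroundPoolSketch {
  /** Runs `jobs` short jobs on `pool` and returns the highest number seen running at once. */
  static int maxConcurrency(ExecutorService pool, int jobs) {
    AtomicInteger running = new AtomicInteger();
    AtomicInteger max = new AtomicInteger();
    CountDownLatch done = new CountDownLatch(jobs);
    for (int i = 0; i < jobs; i++) {
      pool.execute(() -> {
        // Record how many jobs are active right now, keeping the maximum.
        max.accumulateAndGet(running.incrementAndGet(), Math::max);
        try {
          Thread.sleep(20); // stand-in for an expensive job (assumption)
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        } finally {
          running.decrementAndGet();
          done.countDown();
        }
      });
    }
    try {
      done.await();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return max.get();
  }
}
```

Calling `maxConcurrency(Executors.newFixedThreadPool(1), 6)` always yields 1, while a pool of four threads can report anything up to 4, depending on scheduling.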

JaroslavTulach · 2024-04-09T05:08:14Z

...ment-common/src/main/scala/org/enso/interpreter/instrument/job/AnalyzeModuleInScopeJob.scala

@@ -60,8 +61,13 @@ final class AnalyzeModuleInScopeJob(
        exports = ModuleExportsDiff.compute(prevExports, newExports),
        updates = SuggestionDiff.compute(Tree.empty, newSuggestions)
      )
-      sendModuleUpdate(notification)
-      module.setIndexed(true)
+      if (!module.indexed()) {


Line 44 sets an internal module field and then leaves the synchronized section...

Now we are changing the field again, but in the meantime the state wasn't synchronized. What makes you believe the result of such an operation is going to be consistent?

Btw., what is the invariant to be consistent with?

JaroslavTulach · 2024-04-09T05:28:37Z

engine/runtime/src/main/java/org/enso/interpreter/runtime/Module.java

+    Indexed
+  }
+
+  private IndexState indexState;


Let me assume the point of this IndexState is to have a tri-state status which can be invalidated, scheduled for recomputation and consumed, when it is available. I believe such a concept is best expressed by a Future. If we change the code in Module to just:

    private volatile Future<Object> index;

    public final void indexing(Future<Object> newIndex) {
      this.index = newIndex;
    }

    public final Future<Object> index() {
      return this.index;
    }

I'll consider it to be encapsulated enough (for now). @hubertp, is such a Future-based interface enough for your purposes?

I don't believe that encapsulates the whole logic, as it wouldn't express a situation when in-progress indexing would be marked as dirty and require re-indexing. I could obviously be wrong, but there was a reason why logic was added to Analyze*Job to deal with this scenario without locking everything, as it was.

I continue to believe Future is the best concept to represent the desired behavior.

it wouldn't express a situation when in-progress indexing would be marked as dirty and require re-indexing

No, (cancellable) Future represents such concepts neatly, I believe.
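For illustration only, one way a cancellable future could cover the "in-progress indexing gets marked dirty" case is sketched below. This is not code from the PR: `IndexHolder`, `invalidate` and `publish` are hypothetical names, and `CompletableFuture` is used instead of a bare `Future` so that a job can try to publish its result. Invalidation cancels whatever was pending and installs a fresh future; a job that finishes late finds that its future was cancelled and its result is rejected.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Future;

final class IndexHolder {
  private volatile CompletableFuture<Object> index = new CompletableFuture<>();

  /** Marks the index dirty: cancels the pending computation and tracks a new one. */
  synchronized CompletableFuture<Object> invalidate() {
    index.cancel(false); // no effect if the old future already completed
    index = new CompletableFuture<>();
    return index;
  }

  /** Tries to publish a computed index; returns false if `target` was cancelled meanwhile. */
  static boolean publish(CompletableFuture<Object> target, Object result) {
    return target.complete(result);
  }

  /** The future consumers wait on for the current index. */
  Future<Object> index() {
    return index;
  }
}
```

A job would call `invalidate()` before it starts computing and `publish(...)` when it is done; only if `publish` returns `true` would it notify clients.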

Extracted logic used for indexing modules to a separate map available through `RuntimeContext`. That way we avoid mixing LS logic into the runtime `Module`. Adjusted thread pools to better reflect the needs.

hubertp · 2024-04-09T15:49:26Z

Please find some smaller issues spread as inline comments. The bigger concern is expressed here.

This was useful feedback, although I don't believe I have made it worse than it used to be. Previously we would use a rather broad writeCompilationLock with little consideration of how that affects the scheduling of tasks.
Either way, I extracted the logic out of the runtime Module, as I agree that this is not an ideal place for indexing flag. Instead, similarly to PendingEdits, it is now carried around in RuntimeContext. Let me know what you think.

JaroslavTulach · 2024-04-10T04:59:22Z

I extracted the logic out of the runtime Module, as I agree that this is not an ideal place for indexing flag. Instead, similarly to PendingEdits, it is now carried around in RuntimeContext.

From a code perspective, such a change doesn't have a real impact on the behavior. Whatever flaws were present so far are still present. The only real impact is an organizational one: this PR moves the code away from the engine/runtime project, so the changes in the engine/runtime project are non-controversial now.

JaroslavTulach · 2024-04-10T05:03:14Z

...trument-common/src/main/scala/org/enso/interpreter/instrument/execution/ModuleIndexing.scala

+    val NotIndexed, NeedsIndexing, Indexed = Value
+  }
+
+  private val modules: ConcurrentMap[Module, IndexState.Value] =


The code is now rewritten into a new ModuleIndexing Scala class that uses ConcurrentHashMap. Good encapsulation, but I still miss the answer to the fundamental question: What is the invariant to be consistent with?

JaroslavTulach

Visual inspection of Analyze*Job line 44 and line 64 still hints there is a possibility of race condition. The code might have been working mostly correctly so far (as there was just a single threaded background pool). However, as the throughput of the background thread pool is about to be increased from 1 the race conditions are likely to appear - until we justify by spelling out some invariants/constraints why they cannot...

JaroslavTulach · 2024-04-10T05:12:46Z

...ment-common/src/main/scala/org/enso/interpreter/instrument/job/AnalyzeModuleInScopeJob.scala

@@ -60,8 +61,13 @@ final class AnalyzeModuleInScopeJob(
        exports = ModuleExportsDiff.compute(prevExports, newExports),
        updates = SuggestionDiff.compute(Tree.empty, newSuggestions)
      )
-      sendModuleUpdate(notification)
-      module.setIndexed(true)
+      if (!ctx.state.suggestions.markAsIndexed(module)) {


The problem I see continues to be the same. Line 44 deals with state.suggestions and line 64 does that as well. But meanwhile anything can happen. What if a new request to markAsNotIndexed comes in another thread?
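The gap described here is a classic check-then-act race. A small sketch, under the assumption that the state is kept in a concurrent map keyed by module (the names `CheckThenActSketch`, `isIndexed`, and the string key are hypothetical; the method names `markAsIndexed` / `markAsNotIndexed` mirror the ones mentioned in this thread but the bodies are made up). The interleaving is replayed on a single thread so the outcome is deterministic: each map operation is atomic on its own, yet the stale result still ends up marked as indexed.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class CheckThenActSketch {
  enum State { NOT_INDEXED, INDEXED }

  private final ConcurrentMap<String, State> modules = new ConcurrentHashMap<>();

  boolean isIndexed(String module) {
    return modules.getOrDefault(module, State.NOT_INDEXED) == State.INDEXED;
  }

  void markAsNotIndexed(String module) {
    modules.put(module, State.NOT_INDEXED);
  }

  void markAsIndexed(String module) {
    modules.put(module, State.INDEXED);
  }

  /** Replays the problematic interleaving; returns true if the stale index "wins". */
  static boolean staleIndexWins() {
    CheckThenActSketch s = new CheckThenActSketch();
    String m = "Main";
    boolean shouldIndex = !s.isIndexed(m); // early check by the job
    // ... the job computes suggestions from the module as it is now ...
    s.markAsNotIndexed(m);                 // meanwhile: another thread invalidates the module
    if (shouldIndex) {
      s.markAsIndexed(m);                  // late mark: the invalidation is silently lost
    }
    return s.isIndexed(m);
  }
}
```

`staleIndexWins()` returns `true`: the module is reported as indexed even though it was invalidated after the suggestions were computed.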

there was a reason why logic was added to Analyze*Job to deal with this scenario

My experience with Enso code base and its readiness for concurrent execution doesn't support claims that advocate reason and/or logic. Even if they were present, it is questionable they were used in a sound way. Except using the most advanced concurrent building blocks, I only see attempts to avoid answers to the basic question: how that is supposed to work at all?

Introducing an immutable `IndexState` record that allows us to determine if the just computed index is really up-to-date and could be sent to the clients.

JaroslavTulach

The trick with the previous index state is cute, but I am afraid there is a GC problem. We need to be able to GC IndexState instances that are no longer needed.

JaroslavTulach · 2024-04-11T12:03:13Z

...ment-common/src/main/scala/org/enso/interpreter/instrument/job/AnalyzeModuleInScopeJob.scala

      val notification = Api.SuggestionsDatabaseModuleUpdateNotification(
        module = moduleName.toString,
        actions =
          Vector(Api.SuggestionsDatabaseAction.Clean(moduleName.toString)),
        exports = ModuleExportsDiff.compute(prevExports, newExports),
        updates = SuggestionDiff.compute(Tree.empty, newSuggestions)
      )
-      sendModuleUpdate(notification)
-      module.setIndexed(true)
+      if (ctx.state.suggestions.markAsIndexed(module, state)) {


OK, this looks race condition free.
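A rough sketch of why passing the previously observed state can close the race, assuming the final step is a compare-and-set on the map entry. This is not the actual `ModuleIndexing` code: `ModuleIndexingSketch`, `find`, and the string key are hypothetical, and `IndexState` is modelled as a plain immutable class compared by identity rather than as the record mentioned in the commit message.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class ModuleIndexingSketch {
  /** Immutable snapshot; equals() is not overridden, so instances compare by identity. */
  static final class IndexState {
    final boolean indexed;

    IndexState(boolean indexed) {
      this.indexed = indexed;
    }
  }

  private final ConcurrentMap<String, IndexState> modules = new ConcurrentHashMap<>();

  /** Returns the current state, registering a not-indexed one for unknown modules. */
  IndexState find(String module) {
    return modules.computeIfAbsent(module, k -> new IndexState(false));
  }

  /** Invalidation always installs a new instance, so older snapshots stop matching. */
  void markAsNotIndexed(String module) {
    modules.put(module, new IndexState(false));
  }

  /** Atomically marks the module as indexed, but only if `seen` is still its stored state. */
  boolean markAsIndexed(String module, IndexState seen) {
    return modules.replace(module, seen, new IndexState(true));
  }
}
```

A job would call `find` before computing suggestions and send the notification only when `markAsIndexed(module, seen)` returns `true`; an invalidation in between makes the call return `false`, so the stale result is dropped.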

JaroslavTulach · 2024-04-12T07:49:41Z

...nstrument-common/src/main/java/org/enso/interpreter/instrument/execution/ModuleIndexing.java

+  }
+
+  /**
+   * @return true, if module has been isIndexed. False otherwise.


The return value isn't true anymore.

Experiment in lock-free module invalidation

0110d86

Experimenting with invalidating modules' indexes without requiring full write-context locks. That should significantly improve the execution.

hubertp added the CI: No changelog needed Do not require a changelog entry for this PR. label Apr 5, 2024

Increae pool for background jobs

65e901b

Deserialization of suggestions sounds like a fully parallelizable job that can be done independently for individual libraries. In fact, it looks like all jobs have sufficient synchronization primitives to avoid problems, so we can easily increase the pool.

4e6 reviewed Apr 8, 2024

enso-bot bot mentioned this pull request Apr 8, 2024

Make execution multithreaded to allow for faster visualization responses #9529

Closed

executors with non-empty queue

bd590ca

hubertp marked this pull request as ready for review April 8, 2024 21:51

hubertp requested review from JaroslavTulach and Akirathan as code owners April 8, 2024 21:51

JaroslavTulach requested changes Apr 9, 2024

JaroslavTulach reviewed Apr 9, 2024

hubertp added 2 commits April 9, 2024 17:21

Fix broken logic

655d9aa

PR Review: Extract indexing logic

a65178d

Extracted logic used for indexing modules to a separate map available through `RuntimeContext`. That way we avoid mixing LS logic into the runtime `Module`. Adjusted thread pools to better reflect the needs.

hubertp requested a review from JaroslavTulach April 9, 2024 15:49

enso-bot bot mentioned this pull request Apr 10, 2024

Support returning small BigInteger values from Java #9392

Closed

JaroslavTulach reviewed Apr 10, 2024

JaroslavTulach requested changes Apr 10, 2024

hubertp added 2 commits April 11, 2024 12:06

Eliminate race-conditions when indexing

641d968

Introducing an immutable `IndexState` record that allows us to determine if the just computed index is really up-to-date and could be sent to the clients.

docs

2d662fd

JaroslavTulach requested changes Apr 11, 2024

hubertp added 2 commits April 11, 2024 14:54

Avoid GC problems

6348cf4

naming is hard

e6b6cf2

hubertp requested a review from JaroslavTulach April 11, 2024 13:08

enso-bot bot mentioned this pull request Apr 12, 2024

Generate completion of Table.join join criteria using data from both joined tables #5629

Closed

JaroslavTulach approved these changes Apr 12, 2024

JaroslavTulach reviewed Apr 12, 2024

Merge branch 'develop' into wip/hubert/lock-free-invalidation

99d6560

hubertp requested a review from jdunkerley as a code owner April 16, 2024 07:59

hubertp requested review from radeusgd, GregoryTravis and AdRiley as code owners April 16, 2024 07:59

hubertp force-pushed the wip/hubert/lock-free-invalidation branch from c14d1f0 to 99d6560 Compare April 16, 2024 08:26

Update doc

7881f9d

hubertp added the CI: Ready to merge This PR is eligible for automatic merge label Apr 16, 2024

mergify bot merged commit ca9e150 into develop Apr 16, 2024
37 checks passed

mergify bot deleted the wip/hubert/lock-free-invalidation branch April 16, 2024 09:59
