Pre-compute Suggestion DB during build time #5068

Closed
wdanilo opened this issue Feb 5, 2023 · 12 comments · Fixed by #5698
Labels
--low-performance -compiler p-high Should be completed in the next sprint x-new-feature Type: new feature request

Comments

Member

wdanilo commented Feb 5, 2023

Original issue is #184350816.

Motivation: Apparently we still re-parse documentation on every startup rather than reading the data from a cache. The parser is slow and needs to be rewritten (tracked as #182497187, but not planned). A prototype implementation of caching on branch wip/jtulach/CacheSuggestions_184350816 shows a 5-10% speed-up by eliminating the generateDocumentation method altogether. We should mitigate the urgent need for a Rust documentation parser by implementing the caching.

Plan: There is a build sequence that processes the standard libraries during the sbt build (Regenerating manifest for [distribution/lib/Standard/Base/]). We shall enhance it to pre-compute the Suggestion DB initial upload data and store it as JSON (preferably) next to the library manifest. During the initialization of the IDE/engine connection, we check whether the pre-computed data is available next to the library manifest. If so, we send the prepared JSON and skip the computation entirely.
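
The lookup described in the plan can be sketched as follows. This is only an illustration in Python, not the engine code: the file name `suggestions.json`, the `load_suggestions` helper, and the `compute` callback are hypothetical names chosen for this example.

```python
import json
from pathlib import Path
from typing import Callable, List, Tuple

# Hypothetical file name; the plan only says the data sits next to the manifest.
SUGGESTIONS_FILE = "suggestions.json"


def load_suggestions(
    library_root: Path, compute: Callable[[], List[dict]]
) -> Tuple[List[dict], bool]:
    """Return (suggestions, from_cache) for one library directory."""
    cached = library_root / SUGGESTIONS_FILE
    if cached.is_file():
        try:
            # Pre-computed data found: reuse it and skip the computation.
            return json.loads(cached.read_text(encoding="utf-8")), True
        except (OSError, json.JSONDecodeError):
            pass  # Unreadable or corrupt cache: fall back to computing.
    return compute(), False
```

The fallback keeps startup working for libraries that were built without the pre-computed file; only the fast path depends on the build step.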

Expectation: Following the current (and only) startup measurement methodology, we are able to get down to 9600ms from the previous 11020ms. That is roughly consistent with the 1164ms that the measurement attributes to generateDocumentation.
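
For reference, the arithmetic behind these figures, using only the three numbers quoted above (the variable names are just for this example):

```python
previous_ms = 11020               # startup before caching
expected_ms = 9600                # startup with the prototype
generate_documentation_ms = 1164  # time measured in generateDocumentation

saved_ms = previous_ms - expected_ms                     # 1420 ms
saved_pct = 100 * saved_ms / previous_ms                 # about 12.9 %
doc_pct = 100 * generate_documentation_ms / previous_ms  # about 10.6 %

print(f"saved {saved_ms} ms ({saved_pct:.1f}% of startup)")
print(f"generateDocumentation alone: {doc_pct:.1f}% of startup")
```

The measured saving is somewhat larger than the time spent in generateDocumentation alone, so a single run sits at or slightly above the upper end of the 5-10% estimate.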

Conclusion: we can speed the engine initialization up by 5-10% by computing the suggestion database as part of the build of the standard libraries. Let's do it.

@wdanilo wdanilo added this to the Beta Release milestone Feb 6, 2023
@JaroslavTulach JaroslavTulach changed the title from "Investigate reading documentation data from caches rather than re-parsing it on startup" to "Pre-compute Suggestion DB during build time" Feb 7, 2023
@JaroslavTulach JaroslavTulach added p-high Should be completed in the next sprint and removed p-low Low priority labels Feb 7, 2023

enso-bot bot commented Feb 10, 2023

Dmitry Bushev reports a new STANDUP for yesterday (2023-02-09):

Progress: Started working on the task. Updated the SBT build logic to run the compilation of the standard library. Later it will generate the suggestion files. Resolved a couple of issues with the "triage" label. It should be finished by 2023-02-17.

Next Day: Next day I will be working on the #5068 task. Continue working on tasks


enso-bot bot commented Feb 10, 2023

Dmitry Bushev reports a new STANDUP for today (2023-02-10):

Progress: Continue working on the task. Started working on the compiler logic to generate the suggestions during the compilation task. Looked into whether I can reuse some of the existing module cache logic, but for now I will only share the execution context. It should be finished by 2023-02-17.

Next Day: Next day I will be working on the #5068 task. Continue working on tasks


enso-bot bot commented Feb 14, 2023

Dmitry Bushev reports a new STANDUP for yesterday (2023-02-13):

Progress: Continue working on the task. Created a stub for the serialization manager and integrated it into the compiler pipeline. Started looking into the language server logic. It should be finished by 2023-02-17.

Next Day: Next day I will be working on the #5068 task. Continue working on tasks


enso-bot bot commented Feb 14, 2023

Dmitry Bushev reports a new STANDUP for today (2023-02-14):

Progress: Continue working on the task. Updated EnsureCompiledJob to compile only the required dependencies. Right now the filtering logic is not optimal, but I may look into the package repository to optimize it. Started to update the language server, but then decided that I can reuse the existing notification, so the logic should stay the same. It should be finished by 2023-02-17.

Next Day: Next day I will be working on the #5068 task. Continue working on tasks


enso-bot bot commented Feb 28, 2023

Dmitry Bushev reports a new 🔴 DELAY for yesterday (2023-02-27):

Summary: There is a 14-day delay in the implementation of the Pre-compute Suggestion DB during build time (#5068) task.
It will cause a 0-day delay for the delivery of this weekly plan.

Delay Cause: Vacation


enso-bot bot commented Feb 28, 2023

Dmitry Bushev reports a new STANDUP for yesterday (2023-02-27):

Progress: Catching up after the vacation. Reviewed the changes in the PR. Started working on deserialization of the suggestions in the language server. It should be finished by 2023-03-03.

Next Day: Next day I will be working on the #5068 task. Continue working on the task


enso-bot bot commented Mar 1, 2023

Dmitry Bushev reports a new STANDUP for yesterday (2023-02-28):

Progress: Continue working on the task. Started the refactoring to move the serialization logic to the package level. Jaroslav pointed out that the current runtime serialization will be replaced by the upcoming PR, so the serialization should remain in the runtime for now. Started working on preparing the appropriate notification to the language server instead. It should be finished by 2023-03-03.

Next Day: Next day I will be working on the #5068 task. Continue working on the task

@JaroslavTulach JaroslavTulach removed their assignment Mar 2, 2023

enso-bot bot commented Mar 2, 2023

Dmitry Bushev reports a new STANDUP for yesterday (2023-03-01):

Progress: Continue working on the task. Updated and rebased the PR on the current develop branch. Implemented the suggestion serialization logic based on the new caching infrastructure. Started testing. It should be finished by 2023-03-03.

Next Day: Next day I will be working on the #5068 task. Continue working on the task


enso-bot bot commented Mar 3, 2023

Dmitry Bushev reports a new STANDUP for yesterday (2023-03-02):

Progress: Continue working on the task. Fixed the serialization logic. Ensured that the suggestions are successfully generated during the library building. It should be finished by 2023-03-03.

Next Day: Next day I will be working on the #5068 task. Continue working on the task


enso-bot bot commented Mar 6, 2023

Dmitry Bushev reports a new 🔴 DELAY for today (2023-03-06):

Summary: There is a 3-day delay in the implementation of the Pre-compute Suggestion DB during build time (#5068) task.
It will cause a 1-day delay for the delivery of this weekly plan.

Delay Cause: The lazy compilation caused a lot of failures in the runtime tests. Had to go through pretty much all of them and fix the imports.


enso-bot bot commented Mar 6, 2023

Dmitry Bushev reports a new STANDUP for today (2023-03-06):

Progress: Continue working on the task. Started testing. Updated the language server logic, updated a bunch of failing tests, and fixed an issue with the checksum computation for the stored suggestions. PR is ready to review. It should be finished by 2023-03-06.

Next Day: Next day I will be working on the #5068 task. Continue working on the task

@mergify mergify bot closed this as completed in #5698 Mar 8, 2023
mergify bot pushed a commit that referenced this issue Mar 8, 2023
Close #5068

Cache suggestions during the `buildEngineDistribution` command, and read them from the disk when the library is loaded. Initial graph coloring takes ~20 seconds vs ~25 seconds on the develop branch.
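
A minimal sketch of the write-at-build, read-at-load cycle, combined with the checksum for stored suggestions mentioned in an earlier standup. It is written in Python for illustration only. What the real checksum covers is not stated in this thread; the sketch assumes a SHA-256 digest over the library source files, and `digest_of`, `write_cache`, `read_cache`, and the JSON field names are hypothetical.

```python
import hashlib
import json
from pathlib import Path
from typing import Iterable, List, Optional


def digest_of(sources: Iterable[Path]) -> str:
    """SHA-256 over the source files, read in a stable (sorted) order."""
    h = hashlib.sha256()
    for src in sorted(sources):
        h.update(src.read_bytes())
    return h.hexdigest()


def write_cache(cache: Path, sources: Iterable[Path], suggestions: List[dict]) -> None:
    """Build-time step: store the suggestions together with the source digest."""
    payload = {"digest": digest_of(sources), "suggestions": suggestions}
    cache.write_text(json.dumps(payload), encoding="utf-8")


def read_cache(cache: Path, sources: Iterable[Path]) -> Optional[List[dict]]:
    """Load-time step: return cached suggestions, or None if missing or stale."""
    if not cache.is_file():
        return None
    payload = json.loads(cache.read_text(encoding="utf-8"))
    if payload.get("digest") != digest_of(sources):
        return None  # Sources changed since the build: do not trust the cache.
    return payload["suggestions"]
```

Returning `None` for a stale cache lets the caller fall back to recomputing the suggestions instead of serving outdated entries.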

[peek-develop-branch.webm](https://user-images.githubusercontent.com/357683/223504462-e7d48262-4f5e-4724-b2b0-2cb97fc05140.webm)
[peek-suggestions-branch.webm](https://user-images.githubusercontent.com/357683/223504464-0fe86c04-8c4b-443c-ba96-6c5e2fb1e396.webm)
Member

JaroslavTulach commented Mar 10, 2023

After integration of this PR as well as #5568 and #5791, we managed to speed up startup (according to our methodology, #5569) from 11s to 8s. I believe the majority of the speed-up comes from pre-computing the suggestion DB. Great work!

1074ms in SqlSuggestionRepo

Still, I see at least a second spent in SqlSuggestionRepo, so I suggest continuing with
