New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature/10385 loadnexus speed #56
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This is the first step towards reusing the same spectrum information rather than fetching it renewed each time. One caveat with this is that you will HAVE to be sure that all workspaces in the group are equivalent in terms of instrument detector mappings etc. What i intend to do next is: 1) For workspaces that look to relate to multiperiod group workspaces, default to reuse the spectrum-detectector mappings, once obtained on a lazy basis. 2) Put a boolean property with value in place (which defaults to true) called "PredictiveInstrumentLoading". If false, we would go back to old behaviour.
There is a naming issue here, which is actually the root cause of the problem! The query that was being performed retrieved the number of workspace entries, NOT the number of periods. Going forward. If the number of periods exactly matches the number of workspace entries, we have a multiperiod group workspace, and can happily apply the optimisations we want. Next step will be to actually extract the nperiods. To do that we need to try to fetch it out of the logs for The first entry.
There are lots of TODOs still to resolve. We also need to add tests for this sort of multiperiod loading.
We are not currently handling log extraction properly, which is why the test is failing. File size is just over the threshold, but shouldn't be a problem. I've tried to trim it down as much as possible.
This fixes the regression test. We clear the logs off the temporary workspace, that way we don't have to do a similar thing in the loop, and we avoid copying them over each time. This is not a fully formed solution, but it's a start.
Also better robustness around workspace type.
Should read better now, although the entire algorithm is still a mess.
This does result in a speed up, but the caching is being bypassed at the moment since our strategy is to clone and overwrite. I've removed the caching code as part of the clean up, but I've kept the SpecInfo type around as it makes the usage clearer, and does not result in a performance penalty.
Extend tests. Do not process spectrum lists.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Saving of large group workspaces is much faster now. Load times are still poor. Profiling should reveal why.
Loading multi period group workspaces should be fast.