Improve the IronPython interpreter's re-execution performance #6879
The aim of this PR is to improve the IronPython interpreter's performance by using a static container that acts as a cache, storing a copy of the previously executed and compiled script. If the script doesn't change between runs, the previously compiled copy is retrieved from the cache and reused.
This is a very common scenario, for example when you combine a custom node containing a Python script with a List.Map or one of the List.Combine nodes, or when the node has defined inputs and can be used straight away without any function-application nodes.
Currently, the script is rebuilt every time the ipy method is called, and that penalty quickly adds up for large data sets. The net benefit will greatly depend on the code complexity and on the size and structure of the data set. Here are some of my test results:
two flat lists; the node has undefined inputs but is programmed to work at a list-wide level, so the ipy interpreter is called only twice. The code is relatively simple, and we should expect virtually identical execution times between the two:
Edit: Case 2 was being bottlenecked by the Revit API. It was making too many element collector calls, which were slowing things down. I refactored the code and now the difference in performance is clearer.
one flat list of ~100 items fed through a List.Map node. The ipy interpreter is called once per item, for a total of a hundred executions. The code is slightly more complex than in the first example. With the change, we manage to halve the execution time:
two flat lists, one with 10 views and the other with 100 lines. The node is defined to work on a singleton level:
My take from this is that Python scripts set up to work at a singleton level will get the biggest performance uplift. This would also allow people to write simpler nodes and rely on Dynamo's native node lacing.
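The effect described above can be reproduced with a rough, self-contained timing sketch: when a node is mapped over a list, the same script runs once per item, so without a cache the compile cost is paid N times instead of once. (The script and call count here are stand-ins, not the actual test cases.)

```python
import time

# Stand-in for a moderately sized Python-node script.
source = "\n".join(f"x{i} = {i} * 2" for i in range(300))
N = 100  # e.g. one interpreter call per item of a 100-item list

start = time.perf_counter()
for _ in range(N):
    exec(compile(source, "<script>", "exec"), {})   # rebuilt every call
uncached = time.perf_counter() - start

start = time.perf_counter()
code = compile(source, "<script>", "exec")          # compiled once, reused
for _ in range(N):
    exec(code, {})
cached = time.perf_counter() - start

print(f"uncached: {uncached:.4f}s  cached: {cached:.4f}s")
```

On a singleton-level node the per-call work is small, so compilation dominates and the cached variant wins by the largest margin, matching the results above.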
@aparajit-pratap do you think storing the compiled code like that will affect anything negatively? Do you have any other ideas on how to improve performance?
@dimven disk IO is a big problem for Python. I think your reuse of the script engine caches the imported modules. This is a big saving in time, as modules like the entire Revit API and ProtoGeometry do not need to be reloaded from disk every time the Python node is called. This is why your third test case shows big improvements.
I think we'll need to carefully consider stateful bugs this could cause.
@mjkkirschner, thanks for looking into this. I revised the code in case 2, and now the difference is more apparent. I suspect that the additional compile step at line 50 helps. I also had to cache the code as a string, because after the compile step I could no longer extract it from the cached engine.
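The two-part cache entry described above can be sketched as follows (names are illustrative, not the PR's actual identifiers): since the raw source can no longer be read back out of the compiled engine, the cache keeps the source string alongside the compiled object and recompiles only when the incoming script differs from the stored copy.

```python
# Hypothetical two-part cache: the source text is stored next to the
# compiled code so the "has the script changed?" check stays possible.
_cached_source = None
_cached_code = None

def get_compiled(source):
    global _cached_source, _cached_code
    if source != _cached_source:             # script changed (or first run)
        _cached_code = compile(source, "<script>", "exec")
        _cached_source = source              # remember the text itself
    return _cached_code

first = get_compiled("OUT = 1")
second = get_compiled("OUT = 1")   # unchanged script: same object reused
print(first is second)  # True
```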
@dimven this is a clever fix indeed! Off the top of my head I can't think of any cases where this could lead to bugs due to maintaining state. It would be good to run the test suite on your branch to begin with, to ensure there are no regressions. Once we are confident in the changes, it would be good to merge this in. Will keep you posted. Thanks again for the fix, and keep them coming :)