Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve the ironPython interpreter's re-execution performance #6879

Merged
merged 1 commit into from Jul 28, 2016
Merged

Improve the ironPython interpreter's re-execution performance #6879

merged 1 commit into from Jul 28, 2016

Conversation

dimven
Copy link
Contributor

@dimven dimven commented Jul 14, 2016

Purpose

The aim of this PR is to improve the ironPython interpreter's performance through the use of a static container that acts as a cache and stores a copy of the previously executed and compiled script - if the script doesn't change between runs, the previously compiled copy will be called from the cache and reused.

This is a very common scenario when you for example combine a custom node that contains a python script with a List.Map or one of the List.Combine nodes; or when the node has defined inputs and can be used straight - away without any function application nodes.

Currently, the script is rebuilt every time the ipy method is called and that penalty quickly adds up for large data sets. The net benefit will greatly depend upon the code complexity and the size and structure of the data set. Here are some of my test results:

Case 1

two flat lists, the node has undefined inputs but is programmed to work on a list-wide level and thus the ipy interpreter is called only twice. Tho code is relatively simple and we should expect virtually identical execution times between the two:

case1

2016-07-14_13-29-15

Case 2

Edit: Case 2 was being bottle-necked by the Revit API. It was making too many element collector calls that were slowing things down. I refactored the code and now the difference in performance is clearer.

one flat list of ~100 items fed through a list.map node. The ipy interpreter is called once per item for a total of hundred executions. The code is slightly more complex than the first example. With the change, we manage to halve execution time:

case2a

2016-07-15_11-00-31

Case 3

two flat lists, one with 10 views and the other with 100 lines. The node is defined to work on a singleton
level and the lacing is set to cross product and the code is very simple. The ipy interpreter is called 10k times and we get a five-fold improvement in execution times.

case3

2016-07-14_14-31-11

My take from this is that python scripts set up to work on a singleton level will get the biggest uplift in performance. This would also allow people to code simpler nodes and rely on Dynamo's native node lacing.

Declarations

  • The code base is in a better state after this PR
  • Is documented according to the standards
  • The level of testing this PR includes is appropriate
  • User facing strings, if any, are extracted into *.resx files
  • All tests pass using the self-service CI.
  • Snapshot of UI changes, if any.

Reviewers

@aparajit-pratap do you think storing the compiled code like that will affect anything negatively? Do you have any other ideas on how to improve performance ?

@mjkkirschner
Copy link
Member

mjkkirschner commented Jul 14, 2016

@dimven disk io is a big problem for python - I think your reuse of the script engine caches the imported modules. This is a big savings in time as modules like the entire revit api and protogeometry do not need to be reloaded from disk every time the python node is called. This is why your third test case has big improvements.

I think we'll need to carefully consider stateful bugs this could cause.

@dimven
Copy link
Contributor Author

dimven commented Jul 15, 2016

@mjkkirschner , thanks for looking into this. I revised the code in case 2 and now the difference is more apparent. I suspect that the additional compile step at line 50 helps. I had to also cache the code as a string because after the compile step, I could no longer extract it from the cached engine.

@aparajit-pratap
Copy link
Contributor

@dimven this is a clever fix indeed! Off the top of my head I can't think of any cases where this could lead to bugs due to maintaining a state. It would be good to run the test suite on your branch to begin with to ensure there are no regressions. Once we are confident of the changes it would be good to merge this in. Will keep you posted. Thanks again for the fix and keep them coming :)

@ke-yu
Copy link
Contributor

ke-yu commented Jul 26, 2016

@aparajit-pratap @mjkkirschner notice this PR has been pending for a while. As the change is minor, I think we could merge this PR now. What do you think?

@aparajit-pratap aparajit-pratap merged commit 08b21b1 into DynamoDS:master Jul 28, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants