In Java interpreter ignore subroutines and perform code split based on the AST size #158

Merged: 15 commits from iaroslav/issue-152 into master on Feb 6, 2020

Conversation

@izeigerman (Member) commented on Jan 26, 2020

After investigating possible solutions for #152, I came to the conclusion that with the existing design it's extremely hard to come up with an optimal algorithm to split code into subroutines on the interpreter side (as opposed to the assemblers).
The primary reason is that, since we always interpret one expression at a time, it's hard to predict both the depth of the current subtree and the number of expressions left to interpret in other branches.
I made some progress by splitting expressions into separate subroutines based on the size of the code generated so far (i.e. a code size threshold), but more often than not I'd end up with degenerate subroutines like this one:

public static double subroutine2(double[] input) {
    return 22.640634908349323;
}

That's why I took a simpler approach and attempted to optimize the interpreter that caused trouble in the first place: the R one.
I slightly modified its behavior: when the binary expressions count threshold is exceeded, it no longer splits them into separate variable assignments, but moves them into their own subroutines. Although this might not be optimal for simpler models (like linear ones), it helps tremendously with gradient boosting and random forest models.
Since those models are a summation of independent estimators, we end up putting every N (5 by default) estimators into their own subroutine, which improves execution time.
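To illustrate the effect at the AST level, here's a minimal sketch (the actual change lives in the R interpreter; the helper below is made up and assumes m2cgen's AST classes as of this PR, including SubroutineExpr):

# Rough sketch, not the actual m2cgen code: group every N estimators and
# wrap each group's partial sum into its own subroutine.
from m2cgen import ast


def assemble_grouped_sum(estimator_exprs, group_size=5):
    def sum_exprs(exprs):
        total = exprs[0]
        for e in exprs[1:]:
            total = ast.BinNumExpr(total, e, ast.BinNumOpType.ADD)
        return total

    # Put every `group_size` estimators into their own subroutine ...
    groups = [
        ast.SubroutineExpr(sum_exprs(estimator_exprs[i:i + group_size]))
        for i in range(0, len(estimator_exprs), group_size)
    ]
    # ... and sum the subroutine results to obtain the final prediction.
    return sum_exprs(groups)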
@StrikerRUS please let me know what you think.

@coveralls commented on Jan 26, 2020

Coverage Status: coverage decreased (-0.1%) to 95.806% when pulling 70e71f7 on iaroslav/issue-152 into a3082b5 on master.

@StrikerRUS (Member) commented
@izeigerman All your thoughts seem reasonable to me. However, the R tests still have an unacceptable execution time or even hang forever 🙁.

@izeigerman (Member, Author) commented on Jan 26, 2020

Yeah, apparently now it's being killed due to hitting the memory constraint :( I'm now seriously considering limiting the number of models tested for R.

PS: it completes within 30 minutes on my local machine, though that's not very reassuring, since more test models will definitely lead to a violation of the time constraint.

@izeigerman (Member, Author) commented
Ended up excluding the large XGBoost and LightGBM models from the R test suite.

@StrikerRUS (Member) commented
OK, so removing the subroutine wrapper for each tree actually means that XGBoost and LightGBM are not supported in R anymore: the generated code for a tiny model cannot be executed in a feasible time (I guess the R session crashes, but Travis doesn't report anything for some reason and simply hangs forever). I don't think that's a good trade-off for the slowness in Java.
I believe it would be much better to continue supporting all models for both languages by adding a mechanism that allows filtering by language. For instance, something like this: ast.SubroutineExpr(self._assemble_tree(t), exclude_lang={'Java'}). Or any other solution, just not one that drops support for a subset of models.
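Purely as an illustration of what such a filter could look like (the SubroutineExpr class below is a stand-in and exclude_lang is not an actual m2cgen parameter):

# Purely hypothetical; exclude_lang and these helpers are not part of the
# actual m2cgen API.
class SubroutineExpr:
    def __init__(self, expr, exclude_lang=frozenset()):
        self.expr = expr
        self.exclude_lang = set(exclude_lang)


def maybe_unwrap(expr, interpreter_lang):
    # Drop the subroutine wrapper when the current language has opted out,
    # so the wrapped expression gets interpreted inline instead.
    if isinstance(expr, SubroutineExpr) and interpreter_lang in expr.exclude_lang:
        return expr.expr
    return expr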

@izeigerman (Member, Author) commented on Jan 26, 2020

Hm

the generated code for a tiny model cannot be executed in a feasible time (I guess the R session crashes

I couldn't see any evidence of that. Could you please share more context? Which tiny models? Can this be reproduced locally (I couldn't)?

For instance, something like this: ast.SubroutineExpr(self._assemble_tree(t), exclude_lang={'Java'}).

This will introduce a tight coupling between assemblers and interpreters, which is IMHO a poor design decision.

Btw, tests are passing now (with large boosting models excluded).

@izeigerman (Member, Author) commented
Additionally, it doesn't seem like R was designed for interpreting extensive amounts of code. We're already pushing its limits, and I have a feeling it's not hard to come up with a large enough model that will blow up the R interpreter.
So I suggest we admit that this platform is quite limited in its capabilities (whether in memory or in runtime), though I think it should still support smaller ensemble models.
Can you please provide an example of a model that fails?

@StrikerRUS (Member) commented
I couldn't see any evidence of that. Could you please share more context?

I don't believe that 7.5GB is the harsh memory constraint you mentioned. Instead, I think it hangs for some reason similar to a context overflow, caused by a lot of ifs in one function.

Which tiny models?

You excluded LightGBM with the following params: LIGHTGBM_PARAMS_LARGE = dict(n_estimators=100, num_leaves=100, max_depth=64, random_state=RANDOM_SEED). I don't think these can be treated as a serious model in a real-world application.

This will introduce a tight coupling between assemblers and interpreters, which is IMHO a poor design decision.

Agree! I just gave an example I came up with in a minute that would allow us to keep supporting LightGBM and XGBoost for R. Of course, I believe we will be able to develop a smarter and more efficient solution.

@izeigerman (Member, Author) commented
@StrikerRUS, somehow our conversation was very thought-provoking. I've just got another idea and preliminary tests look very promising. Please ignore this PR for now; I'll tag you once it's ready to be revisited. Thanks!

@StrikerRUS (Member) commented
Additionally, it doesn't seem like R was designed for interpreting extensive amounts of code. We're already pushing its limits, and I have a feeling it's not hard to come up with a large enough model that will blow up the R interpreter.

I think that's applicable to practically every programming language, especially an interpreted one. 😃

So I suggest we admit that this platform is quite limited in its capabilities (whether in memory or in runtime), though I think it should still support smaller ensemble models.

Yes, but wrapping each tree into a function allows us to support bigger models. Why should we consciously limit the number of supported models when we know how to overcome the limitation?
In addition, it's very natural to do so. One tree, one function sounds super reasonable. For dealing with extremely deep trees we have a mixin, which is a useful addition.
It seems that Java has its own problems too, as it becomes slower with a larger number of methods. And here we come again to the problem of giving more love to one language...

Can you please provide an example of a model that fails?

Sorry, didn't get it. Which model did you mean?

I strongly believe that we should continue wrapping trees into subroutines for R and somehow not do it for Java. It looks like the best solution we could offer to users.

@StrikerRUS (Member) commented
Ah, I'm replying to an outdated comment again! 😄 Seems that something is wrong with my browser: it doesn't show new comments without pressing F5.

I've just got another idea and preliminary tests look very promising.

Wow, that sounds awesome! I'll be happy to see it. Thank you very much for the constructive conversation!

izeigerman changed the title from "In R interpreter split the binary expressions into subroutine instead of just variables" to "In Java interpreter ignore subroutines and perform code split based on the AST size" on Jan 27, 2020
@izeigerman (Member, Author) commented
Hey @StrikerRUS, this PR is ready to be revisited.
It appears I've finally managed to come up with a good enough heuristic that can successfully compete with the SubroutineExpr approach. I've compared models from the E2E tests as well as some random configurations of boosting models, and the number of generated Java methods is more or less the same.
Of course, the new approach is not as stable and predictable as the old one, so outliers are to be expected.
Another downside of the new algorithm is that it's more computationally demanding, so it takes more time to interpret the AST. At this point I consider this downside negligible.
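To make the idea concrete, here is a minimal sketch of the size-based split, assuming a recursive node counter over the AST (the dispatch table, names, and threshold below are illustrative, not the actual implementation):

# Illustrative sketch only: the dispatch table is partial, and the names
# and threshold are assumptions rather than the code added in this PR.
from m2cgen import ast

CHILD_EXPRS = {
    ast.BinNumExpr: lambda e: [e.left, e.right],
    ast.PowExpr: lambda e: [e.base_expr, e.exp_expr],
    ast.VectorVal: lambda e: e.exprs,
    ast.IfExpr: lambda e: [e.test, e.body, e.orelse],
}


def count_exprs(expr):
    # Size of the subtree rooted at `expr`; expression types missing from
    # the table are treated as leaves in this sketch.
    for expr_type, get_children in CHILD_EXPRS.items():
        if isinstance(expr, expr_type):
            return 1 + sum(count_exprs(c) for c in get_children(expr))
    return 1


def needs_own_method(expr, size_threshold=1000):
    # The Java interpreter would move `expr` into a separate method once
    # its subtree grows beyond the threshold.
    return count_exprs(expr) > size_threshold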
In a follow-up PR I'm planning to do the following:

  1. Generalize the solution I came up with for Java and reuse it in R.
  2. Drop the SubroutineExpr altogether.

Thanks!

@StrikerRUS (Member) left a comment
@izeigerman Thank you very much for rethinking the approach! I like it much more than the previous one. Here are some of my initial comments before a detailed review.

.travis.yml (outdated; resolved)
.travis.yml (outdated; resolved)
m2cgen/assemblers/ensemble.py (outdated; resolved)
m2cgen/ast.py (resolved)
m2cgen/interpreters/c/interpreter.py (outdated; resolved)
@StrikerRUS (Member) left a comment
Great stuff! Here are my comments. Also, I believe the code examples should be regenerated due to the changes in the Java interpreter and ensemble code.

.travis.yml (resolved)
m2cgen/ast.py (resolved)
m2cgen/ast.py (outdated; resolved)
m2cgen/ast.py (outdated; resolved)
tests/interpreters/test_java.py (outdated; resolved)
tests/test_ast.py (resolved)
@izeigerman (Member, Author) commented
@StrikerRUS Sorry about the delay, this should be good to go. Thanks!

@StrikerRUS (Member) left a comment
LGTM! Thanks a lot! Left two minor comments up to you.

Comment on lines +237 to +239
(PowExpr, lambda e: [e.base_expr, e.exp_expr]),
(VectorVal, lambda e: e.exprs),
(IfExpr, lambda e: [e.test, e.body, e.orelse]),
@StrikerRUS (Member) commented
I think these exprs can be wrapped into tuples too for a consistent interface. Like, ((PowExpr,), lambda e: [e.base_expr, e.exp_expr]),

@izeigerman (Member, Author) commented
Sorry, I'm not 100% sold on this :D



def test_count_all_exprs_types():
    expr = ast.BinVectorNumExpr(
@StrikerRUS (Member) commented
Maybe an even more complicated expr with deeper nesting? 🙂

@izeigerman (Member, Author) commented

TBH, I don't see the value in having even deeper nesting, since it won't really provide additional coverage. It would also make the test more complex and somewhat obscure its purpose. Thanks for the feedback though.

@izeigerman (Member, Author) commented
Thanks for your review 👍

izeigerman merged commit 81c3e6a into master on Feb 6, 2020
izeigerman deleted the iaroslav/issue-152 branch on February 6, 2020
StrikerRUS mentioned this pull request on Mar 7, 2020