Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement constant operation simplification #26

Closed
dktcoding opened this issue Jan 18, 2015 · 14 comments

Comments

@dktcoding
Copy link

commented Jan 18, 2015

There's a small optimization that can be easily implemented, which is to simplify operations that don't contain variables.

For instance: when this expression is parsed 2 + sin(x) + 12 / 4 ^ 2 exp4j generates 2.0 x sin + 12.0 4.0 2.0 ^ / + when it could generate 2.0 x sin + 0.75 + or even better 2.75 x sin +.

I currently have a small working (it passes all of the tests) implementation that is able do the first simplification assuming that all the functions are deterministic (BTW adding a non-deterministic flag is trivial), the second optimization will be pretty hard without using trees.

I believe that is a good starting point for other types of simplifications like * 1, / 1, f(x) + f(x) -> 2 * f(x), etc.

Is anyone interested?

@fasseg

This comment has been minimized.

Copy link
Owner

commented Jan 19, 2015

Im certainly interested in improving exp4j performance. Let me wrap my head around it and take a look at your implementation. It will probably take me some days but I will come back to you.

@dktcoding

This comment has been minimized.

Copy link
Author

commented Jan 19, 2015

I've changed a bit PerformanceTest to evaluate a more complex formula, in this case log(x) - y * sqrt(x^cos(y)) + 43/9 * sin(x) - 3^(-3), in this case there are only two small simplifications which result in log(x) - y * sqrt(x^cos(y)) + 4.7778 * sin(x) - 0.037037 but there's a ~5% performance improvement.

+------------------------+---------------------------+--------------------+
| Implementation         | Calculations per Second   | Percentage of Math |
+------------------------+---------------------------+--------------------+
| Java Math              |                2575981.00 |           100.00 % |
| exp4j no opt           |                 605679.70 |            23.51 % |
| exp4j opt              |                 746622.00 |            28.98 % |
| JSR-223 (Java Script)  |                    202.70 |             0.01 % |
+------------------------+---------------------------+--------------------+

I will upload a couple more patches with some extra tests, the modifications to PerformanceTest and the deterministic flag.

@fasseg

This comment has been minimized.

Copy link
Owner

commented Jan 20, 2015

This looks indeed very promising. I wish I had time today, but it will have to wait until the weekend at least. Im very sorry about the delay, please don't be discouraged. I very much like what I see ;)

@fasseg

This comment has been minimized.

Copy link
Owner

commented Jan 27, 2015

Hey @dktcoding
looking at Simplifer.simplify() I noticed that you have to iterate over the whole expression a second time. This does increase the time complexity of the evaluation.
With n being the input size, s being the additional steps exp4j has to take backwards sometimes and p being the simplfied steps: exp4j's time complexity is currently O(n) = 2*(n+s), because for evaluation there's one go over the input for tokenization and one for calculation. After adding the Simplifier it seems to me it's O(n) = 3*(n+s) - p since there's an extra iteration over the input.
So while your simplifier helps with small expressions containing a rather large amount of simplifiable statements, a large expression with no simplifiable statements will become slower.
Maybe it makes sense to make this an optional mechanism in exp4j, so that a user can activate it if he does know that the optimization will benefit him.
I will check this with your implementation and a large expression and post the results.

@fasseg

This comment has been minimized.

Copy link
Owner

commented Jan 27, 2015

So up to a expression length of 163840 statements it seems your implementation is still faster than the original one. Needless to say that I am impressed ;)

@fasseg

This comment has been minimized.

Copy link
Owner

commented Jan 27, 2015

Im just thinking: Wouldn't it make sense to put this optimization directly into the tokenizer? Although it would have to check the following two tokens before returning the first one, but these can then be cached internally, so the perfomance gain would be even stabler with large inputs?

@fasseg

This comment has been minimized.

Copy link
Owner

commented Jan 27, 2015

And now I see that you already have the ArrayStack optimization in the code base, so the impact of large inputs is even smaller with the much faster Stack impl.
I guess I will have to write a test with an astronomically long input to finally see the old naive implementation overtake your optimized one ;)

@dktcoding

This comment has been minimized.

Copy link
Author

commented Jan 27, 2015

Yes, the computational complexity of build() is increased.

Thinking of the library as a whole, in the worst case scenario there's an evaluate() for every build(), and more likely, several calls to evaluate() for every build(), so, for evaluate() the change is from O(n) = k*(n+s) to O(n) = k*(n-p+s) considering no simplifications there's no change whatsoever, but given a large expression it's highly unlikely to have p = 0.

This of course changes with the usage given to exp4j, my reasoning was:

  1. No problem with small expressions
  2. Big expressions and few variables -> lot's to simplify
  3. Big expressions and lots of variables -> computational hell no matter what
@dktcoding

This comment has been minimized.

Copy link
Author

commented Jan 27, 2015

Apparently I forgot to add the issue tag in commit 742f6e1, which actually adds a system property (exp4j.simplify_enabled) to disable the simplifier (I currently use it for testing the code).

About adding the code to Tokenizer, it might be a good idea, I did this way because it seemed cleaner, and probably easier to add other optimizations in the future.

Those numbers where before adding ArrayStack, currently the performance of exp4j is around ~30% of compiled Java math, which IMHO is blazing fast.

@fasseg

This comment has been minimized.

Copy link
Owner

commented Feb 23, 2015

How about something like this in the builder pattern instead:

new EpressionBuilder("3+4")
    .optimize()       // a clearly documented builder method that turns on co optimization
    .build();

I don't think I like very much the idea of using a property since it will not be convienient in a multi-threaded environment where optimized and non optimized expression have to be evaluated.

@fasseg

This comment has been minimized.

Copy link
Owner

commented Mar 2, 2015

I added a branch cherry picking your Simplifier approach and added some documentation for the suggested feature:
https://github.com/fasseg/exp4j/tree/optimize

fasseg added a commit that referenced this issue Mar 2, 2015

@dktcoding

This comment has been minimized.

Copy link
Author

commented Mar 29, 2015

Sorry for the long absence I had trouble with 2-factor auth (ps: be careful when changing phones... sigh).

The idea sounds great, and you are absolutely right about the system property.

@fasseg

This comment has been minimized.

Copy link
Owner

commented Apr 14, 2015

Okay so I will merge this into master soon, when I find some time...

@fasseg

This comment has been minimized.

Copy link
Owner

commented Jun 23, 2015

Feature has been merged to master

@fasseg fasseg closed this Jun 23, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.