Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port sumpy's cse implementation to sympy #13221

Merged
merged 16 commits into from Sep 23, 2017

Conversation

@isuruf
Copy link
Member

commented Aug 29, 2017

This uses @mattwala's code in sumpy. I've also bundled an implementation of OrderedSet from http://code.activestate.com/recipes/576694/ which was suggested in python docs. OrderedSet code is licensed MIT. Is that okay to have without changing sympy's license?

Fixes #12411

self.func_to_argset[func_i].update(new_args)


class Unevaluated(object):

This comment has been minimized.

Copy link
@isuruf

isuruf Aug 29, 2017

Author Member

This was initially added for symengine compatibility in sumpy, but this also has a performance benefit in that we can identify which expressions are needed to be evaluated at the end of the cse. (https://github.com/sympy/sympy/pull/13220/files#diff-44a512e7d608776ee035c0ab6a4baa75L411 was removed because of this)



# OrderedSet code is based on http://code.activestate.com/recipes/576694/ by
# Raymond Hettinger. Licensed under MIT License

This comment has been minimized.

Copy link
@ylemkimon

ylemkimon Aug 29, 2017

Contributor

If you've used MIT licensed code, you should include a copy of copyright notice, which looks like:

Copyright <YEAR> <COPYRIGHT HOLDER>

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@isuruf isuruf force-pushed the isuruf:sumpy branch from 4512488 to 6d9f2e4 Aug 29, 2017
@isuruf

This comment has been minimized.

Copy link
Member Author

commented Aug 29, 2017

Hmm, this PR uses sets in addition to OrderedSet and that introduces randomness which in turn affects the cse output. Is that okay?

@smichr

This comment has been minimized.

Copy link
Member

commented Aug 29, 2017

introduces randomness which in turn affects the cse output. Is that okay?

No. Wherever the items in the sets are iterated over, the set, S, should be replaced with ordered(S).

@isuruf isuruf force-pushed the isuruf:sumpy branch from 2c67bae to 5686040 Aug 30, 2017
@@ -445,7 +445,7 @@ def _find_opts(expr):
_find_opts(e)

# split muls into commutative
comutative_muls = set()
comutative_muls = OrderedSet()

This comment has been minimized.

Copy link
@smichr

smichr Aug 30, 2017

Member

commutative -> commutative (I know that python doesn't care, but if we're going to spell out this much, let's go ahead and give it an M&M. :-))

seen_subexp = set()
excluded_symbols = set()
seen_subexp = OrderedSet()
excluded_symbols = OrderedSet()

This comment has been minimized.

Copy link
@smichr

smichr Aug 30, 2017

Member

Hey, what's happening down at line 527 where we have list(map(...)) not being assigned to anything?

This comment has been minimized.

Copy link
@isuruf

isuruf Aug 30, 2017

Author Member

map is an iterator in python3 that's why there is a list call to make sure it's evaluated.


seen_subexp = set()
excluded_symbols = set()
seen_subexp = OrderedSet()

This comment has been minimized.

Copy link
@smichr

smichr Aug 30, 2017

Member

No need for OrderedSet here as far as I can see -- the sets are used for containment checks and are not iterated over. But double check.

@@ -555,6 +744,9 @@ def cse(exprs, symbols=None, optimizations=None, postprocess=None,
reduced_exprs[i] = m

if postprocess is None:
return replacements, reduced_exprs
result = replacements, reduced_exprs
print(result[1].args[1].args)

This comment has been minimized.

Copy link
@isuruf

isuruf Sep 7, 2017

Author Member

Adding this print statement here, makes the test failing in the previous commit pass. Any ideas

This comment has been minimized.

Copy link
@isuruf

isuruf Sep 7, 2017

Author Member

It doesn't. Ignore my above comment

@isuruf isuruf force-pushed the isuruf:sumpy branch from 96dba2d to 29cb692 Sep 7, 2017
@smichr

This comment has been minimized.

Copy link
Member

commented Sep 9, 2017

I see there was not much of a change in terms of expected results. What are the benefits of this new implementation?

A: efficiency; see link on issue page.

@asmeurer

This comment has been minimized.

Copy link
Member

commented Sep 13, 2017

You shouldn't change the git author like that. That's effectively dishonest, because he never created that commit.

@isuruf

This comment has been minimized.

Copy link
Member Author

commented Sep 13, 2017

Okay. I'll fix that and add the license above.

@asmeurer asmeurer added the codegen label Sep 13, 2017
@asmeurer

This comment has been minimized.

Copy link
Member

commented Sep 13, 2017

Also I suppose one could use OrderedDict as a proxy for OrderedSet, where each value is None. It might be faster since OrderedDict is written in C.

@isuruf isuruf force-pushed the isuruf:sumpy branch from 22d5eea to 980c844 Sep 13, 2017
isuruf added 8 commits Aug 29, 2017
ODE solving used cse and to extract the Constant terms in the ODEs
and then replaced terms like C2/2 with C2. Issue was that C2 can
occur in multiple places and therefore replacing is not valid
@isuruf isuruf force-pushed the isuruf:sumpy branch from 980c844 to 761e5f6 Sep 13, 2017
@isuruf

This comment has been minimized.

Copy link
Member Author

commented Sep 13, 2017

OrderedDict wouldn't have intersection, union defined.. That's a good idea. I'll do that

if iterable:
self.map = collections.OrderedDict((item, None) for item in iterable)
else:
self.map = collections.OrderedDict()

This comment has been minimized.

Copy link
@isuruf

isuruf Sep 14, 2017

Author Member

@asmeurer, thanks for the idea. This implementation is much simpler.

@asmeurer

This comment has been minimized.

Copy link
Member

commented Sep 14, 2017

The odict_keys objects support set operations (at least in Python 3), though it looks like they return unordered sets. So maybe there is room for OrderedSet in the Python collections module. Originally, before Python had a set object people used this dict with None values trick (see https://www.youtube.com/watch?v=V5-JH23Vk0I).

@isuruf

This comment has been minimized.

Copy link
Member Author

commented Sep 14, 2017

@asmeurer, now that the OrderedSet code is not used, there's no license issue. Is there anything to be done regarding the license of sumpy code? You mentioned that there is no need to do anything at #12411 (comment). Just double checking.

@asmeurer

This comment has been minimized.

Copy link
Member

commented Sep 14, 2017

@mattwala agreed to relicense the code, so there are no issues.


def intersection(self, other):
result = []
for val in self:

This comment has been minimized.

Copy link
@mattwala

mattwala Sep 14, 2017

Suggestion: The loop should iterate through the smaller of the two sets to reduce the number of lookups that are needed.

This comment has been minimized.

Copy link
@mattwala

mattwala Sep 14, 2017

Never mind, that does not preserve the order.

"""
return [self.value_number_to_value[argn] for argn in sorted(argset)]

def get_or_add_value_number(self, value):

This comment has been minimized.

Copy link
@smichr

smichr Sep 16, 2017

Member

rather than introducing a new name, can we just use the already known setdefault as is used for dictionaries?

>>> d = {1: 2}
>>> d.setdefault(1, 0)
2
>>> d.setdefault(3, 0)
0
>>> d
{1: 2, 3: 0}

This comment has been minimized.

Copy link
@smichr

smichr Sep 21, 2017

Member

This would introduce variation wrt sumpy...which might get updated and then it might be harder to update this. So I'm OK with it as it is. Are you ready to commit this?

This comment has been minimized.

Copy link
@isuruf

isuruf Sep 21, 2017

Author Member

Yes, it'll be easier if the implementations were same.

@isuruf

This comment has been minimized.

Copy link
Member Author

commented Sep 21, 2017

@moorepants, @bjodah would you mind running the benchmarks on this PR? Additionally, I can run the benchmarks if you can give me instructions on how to do so.

@isuruf

This comment has been minimized.

Copy link
Member Author

commented Sep 21, 2017

commit cse.TimeCSE.time_cse large_exprs.TimeLargeExpressionOperations.time_cse
This PR 3.48 ms 65.28 ms
master 4.07 ms 22.24 s
sympy-1.1.1 4.08 ms 189.66 ms
sympy-1.0 4.05 ms 195.33 ms
@smichr

This comment has been minimized.

Copy link
Member

commented Sep 23, 2017

OK, I don't see any reason to not move forward with this and nobody else has raised objections. This is in.

@smichr smichr merged commit 0852d92 into sympy:master Sep 23, 2017
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@isuruf isuruf deleted the isuruf:sumpy branch Sep 23, 2017
@moorepants

This comment has been minimized.

Copy link
Member

commented Oct 14, 2017

@isuruf

This comment has been minimized.

Copy link
Member Author

commented Oct 14, 2017

That's weird. I ran the same benchmark and there was a speedup. (I only checked the total time for that benchmark reported by asv, but it seems there is a slow down for all parameters of the benchmark)

@moorepants

This comment has been minimized.

Copy link
Member

commented Oct 14, 2017

When I get back to my office Monday I'll run them on this machine again. I'm not sure if it matters, but I do have the cache size set quite high on this machine.

@moorepants

This comment has been minimized.

Copy link
Member

commented Oct 19, 2017

It looks like for python 2.7 this implementation is slower and for python 3.6 it is faster.

@asmeurer

This comment has been minimized.

Copy link
Member

commented Oct 20, 2017

Are you sure it's this PR. I know there have been general slowdowns in Python 2.7 since #12945

@moorepants

This comment has been minimized.

Copy link
Member

commented Oct 20, 2017

If you zoom in to the portion after Bjorn reverted that cse change that slowed things down, you'll see it: http://www.moorepants.info/misc/sympy-asv/#cse.TimeCSE.time_cse

You can flip the switches on the left to focus on a single py2.7 to py3.6 comparison. It points right to this commit.

@moorepants

This comment has been minimized.

Copy link
Member

commented Oct 20, 2017

And yes, #12945 has contributed to slowdowns in a number of the benchmarks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.