Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Coffee migration #98

Merged
merged 143 commits into from May 3, 2017
Merged

Coffee migration #98

merged 143 commits into from May 3, 2017

Conversation

tj-sun
Copy link
Contributor

@tj-sun tj-sun commented Feb 19, 2017

This commit migrates loop optimisations in coffee to gem.

Test if functionalities are migrated

Results of flops comparison are in the link below

The result shows that gem.optimise() matches or betters coffee level 2 by and large. It also demonstrates that coffee level 2 cannot further optimise ASTs generated by gem.optimise().

There are some cases for which gem.optimise() are slightly worse. One example is mixed_poisson, dim=2, p=1, q=1, nf=0.

tsfc output
coffee o2
this PR

Notice COFFEE can merge temporaries t24 and t23. Such optimisation is not implemented here yet.

Compile time is generally OK, e.g. holzapfel takes 4.02s, vs tsfc 0.91s, COFFEE O2 13.80s.

Minor problems:

  • Several subtle cases to refine further, e.g. Power nodes are not analysed yet (might be beneficial to expand the product).
  • latex printing not completed yet.

TODOs:

  • factorise non argument factors in the end (this will likely require expanding the products in rest which is probably expensive, maybe this is another optimisation level)

Results link

tj-sun and others added 30 commits December 16, 2016 12:35
- remove Terminal if statement (because all node have children field)
- singledispatch pattern for reassociation
- Memoizer for reassociation
- use the default reuse_if_untouched() instead of rewriting it
- pop queue from front to make it FIFO
- idempotent test fails, haven't figured out why
- new helper function to collect terms
- turns out it's problem with python set being not stable
- probably need to write a pass to rearrange operands in a specific sequence
(e.g. first by rank, then alphabetically, so that A+B+C == B+C+A) to make
testing easier
- only partially done, still a lot to do
- some basic latex formatting for Gem nodes, to help debugging
- return gem IR as well from compile_integral
- is_equal() and get_hash() for Sum, so that A+B == B+A
- more latex printing
- some examples
- new attempt to do factorise()
- fix bug in latex() for Literal
- remove dev.py, move functionality to optimise.py
- remove previous factorise function
- update collect_term function
- bug fix: expand child after trying factorisation with index i
- still have some problems with Memoizer
- recursive call times out sometimes
- fix bug in expanding products
- still need to Memoize count_flop()
- expand_all_product()
- count_flop()
- collect_terms()
- still quite slow, need to think about heuristics, such as avoid
unnecessary expansion of products
- Need to sort free indices before factorising because sometimes they appear
in different order
- update on tests
- streamlining the algorithm slightly, more to do there
if set(argument_indices) & set(new_node.free_indices):
new_atomics.append(new_node)
else:
new_rest = new_node
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branching looks icky to me. Is it because monomial_sum_to_expression does the wrong thing if there is only one monomial in the monomial sum? If so, we should fix that. It feels like the "generic" logic here should work for a "single monomial" case correctly too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch becomes:

node = monomial_sum_to_expression(sub_monomial_sum)
if set(argument_indices) & set(new_node.free_indices):
    new_monomial_sum.add(sum_indices, (oa, node), one)
else:
    new_monomial_sum.add(sum_indices, (oa, ), node)

# Create new MonomialSum for the factorised out terms
sub_monomial_sum = MonomialSum()
for _atomics, _rest in zip(all_atomics, all_rest):
sub_monomial_sum.add((), _atomics, _rest)
Copy link
Contributor

@wence- wence- Apr 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sub_monomial_sum = MonomialSum()
for monomial in monomials:
    atomics = list(monomial.atomics)
    atomics.remove(oa)
    sub_monomial_sum.add((), atomics, monomial.rest)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended.

# Just one monomial with this group, add to new MonomialSum straightaway
monomial, = monomials
new_monomial_sum.add(*monomial)
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This branch can go, we think.

new_monomial_sum.add(*monomial)
# We should not drop monomials
assert sum(map(len, itervalues(factor_group))) + len(new_monomial_sum) == len(monomial_sum)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sum_indices = next(iter(monomial_sum)).sum_indices

Should probably assert that all the provided monomials have the same sum_indices.

sub_monomial, = sub_monomial_sum
new_atomics = sub_monomial.atomics
new_atomics += (oa,) # add back common factor
new_rest = sub_monomial.rest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we can just do:

new_monomial_sum.add(sum_indices, sub_monomial.atomics + (oa, ), sub_monomial.rest)

new_rest = new_node
# Pick sum indices from the first monomial
sum_indices = monomials[0].sum_indices
new_monomial_sum.add(sum_indices, new_atomics, new_rest)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These lines disappear.

sub_monomial_sum = MonomialSum()
for _atomics, _rest in zip(all_atomics, all_rest):
sub_monomial_sum.add((), _atomics, _rest)
sub_monomial_sum = optimise_monomial_sum(sub_monomial_sum, argument_indices)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This call does too much work, because it groups on sum_indices, but we know sub_monomial_sum only has sum_indices (). So factor out core part and call that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling optimise_mononials now, which doesn't do grouping (probably it should return an iterable of monomials rather than MonomialSum in the next version, will see)

if len(atomic_index) == 0:
return ((), ())
if len(atomic_index) == 1:
return ((next(iterkeys(atomic_index)), ), ())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix this bug, should do:

optimal_atomic, = atomic_index.keys()
return (optimal_atomic, )

and the empty branch should just return ().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

- new optimise_monomials method which does group by sum indices
- better grouping by optimal atomics in factorise_atomics
- remove unnecessary IF statements
new_monomial_sum.add(sum_indices, (oa, node), one)
else:
new_monomial_sum.add(sum_indices, (oa, ), node)
return new_monomial_sum
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking, is this function returning a MonomialSum because this function is returning a MonomialSum? Or is there a deeper reason?

- factorise_atomics and optimise_monomials now take in and also
reutrn iterable of monomials, instead of MonomialSum
- add assertion to check all monomials have same sum indices in
optimise_monomials
- remove logic in early return from factorise_atomics
- better assertion for checking sum indices
- docstring
- typos
Copy link
Contributor

@wence- wence- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I am basically happy. The comments in this round a very minor (mostly more stylistic issues), what do you think of them?

atomics = list(monomial.atomics)
atomics.remove(oa) # remove common factor
sub_monomials.append(Monomial((), tuple(atomics), monomial.rest))
sub_monomials = optimise_monomials(sub_monomials, argument_indices)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, so we pull out the common factor, then call back into optimise_monomials to potentially exploit further factorisation opportunities in the now remaining expression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Added a line comment.

atomics.remove(oa) # remove common factor
sub_monomials.append(Monomial((), tuple(atomics), monomial.rest))
sub_monomials = optimise_monomials(sub_monomials, argument_indices)
assert len(sub_monomials) > 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should get at least one monomial back. Otherwise presumably the entire expression was zero or similar?

Copy link
Contributor Author

@tj-sun tj-sun Apr 28, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm there might be a problem here. I might get no monomials back when factorising a*(stuff) + a*(-stuff), I guess I should change this to return [], and add base case to make_sum to handle empty input?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually it's probably okay since we don't simplify stuff + (-1)*stuff, and if we do (later), we will just get Zero out of monomial_sum_to_expression, so we still get a monomial with Zero as rest, and propagate upwards.
So I think I just need to remove this assertion. Will test it.

# new MonomialSum directly
sub_monomial, = sub_monomials
new_monomials.append(
Monomial(sum_indices, sub_monomial.atomics + (oa,), sub_monomial.rest))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Special case the "single monomial" case, because IIUC, there may be some structure here that you want to keep. Question: are the "atomics" in a monomial ordered? (I only ask because here you append the common factor to the new monomial, whereas below you prepend it).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will (potentially) change the generated code when we do grouping of nodes with same free indices. Changed to prepend.

Monomial(sum_indices, sub_monomial.atomics + (oa,), sub_monomial.rest))
else:
# result is a sum, we need to create a new node
node = monomial_sum_to_expression(sub_monomials)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've got a sum of monomials, so convert to a gem node, so that we can create a new Monomial that multiplies by the common factor?

This does an optimised for flop minimisation (?) version of:

reduce(Sum, IndexSum(reduce(Product, m.atomics, m.rest), m.sum_indices) for m in sub_monomials)

I think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we discussed having a short comment explanation in the code of why you choose to split the two cases apart (because the else branch would work on a singleton list of monomials, no?).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some notes.

# result is a sum, we need to create a new node
node = monomial_sum_to_expression(sub_monomials)
if set(argument_indices) & set(node.free_indices):
new_monomials.append(Monomial(sum_indices, (oa, node), one))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the new node's free indices intersect with the argument indices, then there might be opportunity for further refactorisation later (so the new node is a possible new ATOMIC), otherwise, it's a COMPOUND?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The terminology collect_monomials uses with @tj-sun's classifier: if node's free indices have intersection with the argument indices, then it's either ATOMIC (only one argument index) or COMPOUND (more than one argument index), otherwise it's OTHER.

Product(A3i, Product(Bj, Ek))),
Product(Z, Product(A1i, Product(Bj, Fk))))
result, = optimise_expressions([expr], ((j,), (k,)))
assert count_flop(result) == 2680
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these tests should also assert that the resulting expression is the expected one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All changed. Removed count_flop as well (might need it in tempoary_graph branch later)

# A * B[i]
assert result.children[1].children[1] == Product(A, Bi)
# t * P[i]
assert result.children[0].children[1].children[1] == Product(t, Pi)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to just assert that the result is equal to the expected expression?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

result = replace_division([d])[0]

assert isinstance(result, Product)
assert isinstance(result.children[1], Division)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, just test that the resulting expression is the expected one?

gem/optimise.py Outdated


def make_product(factors, sum_indices=()):
"""Create a Product from collection of factors. Calls sum_factorise to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can link here, say Uses :func:`sum_factorise` to ... ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended.

gem/optimise.py Outdated
return True
return False
mapper = Memoizer(_reassociate_product)
mapper.stop_at = stop_at
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As dicussed, this added code is not used anywhere except in the test suite. Should it be pulled out into a separate changeset?

@miklos1
Copy link
Member

miklos1 commented Apr 28, 2017 via email

@miklos1
Copy link
Member

miklos1 commented Apr 28, 2017 via email

@miklos1
Copy link
Member

miklos1 commented Apr 28, 2017

I suggest to make this a squash merge when we get there, since the overall lines/commit rate is very low.

@miklos1
Copy link
Member

miklos1 commented Apr 28, 2017 via email

tj-sun and others added 2 commits April 28, 2017 16:20
- remove count_flops
- remove reassociate and test
- optimisation testing for expression rather than flops
- update penality constant for ILP
- use defaultdict to map atomics in find_optimal_atomics
- rewrite some comments
@miklos1
Copy link
Member

miklos1 commented Apr 29, 2017

Pushed a little cleanup to gem.optimise. Among others, fixed the docstrings of make_sum and make_product.

@@ -14,7 +14,8 @@
"quadrature_degree": "auto",

# Default mode
"mode": "vanilla",
# "mode": "vanilla",
"mode": "coffee",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we aren't ready to make this switch yet.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we do make the switch. In this case, just remove the commented line.


:returns: an iterable of factorised :class:`Monomials`s
"""
if not optimal_atomics or len(monomials) < 2:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<= 1 would be slightly better style: what is special about 2?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amended.

@miklos1
Copy link
Member

miklos1 commented Apr 29, 2017

Okay, I think we can merge this next business day.

@tj-sun tj-sun merged commit b7251e1 into master May 3, 2017
@tj-sun tj-sun deleted the coffee-migration branch December 15, 2017 12:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants