Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong contains in PartitionDiagrams #25662

Open
mantepse opened this issue Jun 25, 2018 · 27 comments
Open

Wrong contains in PartitionDiagrams #25662

mantepse opened this issue Jun 25, 2018 · 27 comments

Comments

@mantepse
Copy link
Contributor

In 8.3 beta 7:

sage: import sage.combinat.diagram_algebras as da
sage: PD = da.PartitionDiagrams(3/2)
sage: PD.list()
[{{-2, -1, 1, 2}},
 {{-2, 1, 2}, {-1}},
 {{-2, 2}, {-1, 1}},
 {{-2, -1, 2}, {1}},
 {{-2, 2}, {-1}, {1}}]
sage: [[-2,-1],[2,1]] in PD
True

The fix is to remove the individual __contains__ methods, and correct the generic one, and optimize afterwards. However, to avoid a mess, I'd like to wait for the dependencies.

Depends on #25659
Depends on #25462
Depends on #25642

CC: @alauve @tscrim @zabrocki

Component: combinatorics

Branch/Commit: u/mantepse/wrong_contains_in_partitiondiagrams @ ab58ad8

Reviewer: Daniel Krenn

Issue created by migration from https://trac.sagemath.org/ticket/25662

@mantepse mantepse added this to the sage-8.3 milestone Jun 25, 2018
@mantepse
Copy link
Contributor Author

comment:1

I do not understand these doctests:

 sage: import sage.combinat.diagram_algebras as da
            sage: R.<x> = QQ[]
            sage: PA = PartitionAlgebra(2, x, R, 'P')
            sage: PA([]) == PA.one()
            True
            sage: D = da.DiagramAlgebra(2, x, R, 'P', da.PartitionDiagrams(2))
            sage: D([]) == D.one()
            Traceback (most recent call last):
            ...
            ValueError: invalid input of []

Why is PA([]) supposed to give the identity, but D([]) throw an error? Is it because of UnitDiagramMixin magic?

Put differently, the check method of AbstractPartitionDiagram explicitely allows the empty diagram, irrespective of order. Should [] in PartitionDiagrams(2) return true or false?

@mantepse
Copy link
Contributor Author

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Aug 15, 2018

Commit: c32cbed

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Aug 15, 2018

Branch pushed to git repo; I updated commit sha1. Last 10 new commits:

3d5f553Merge branch 'public/combinat/speedup_set_partitions-25462' of git://trac.sagemath.org/sage into t/25659/make_braueralgebra_faster
c7fdf9bInitial Cython version of the PerfectMatchings iterator.
dd4441aAdded fixed point free involution generator and convert to SP. Fixing bug in PerfectMatchingsIterator.
034589dOptimizing generation using fixed-point-free involutions.
7942eabUsing FPF involution iterator and skipping sorting.
80b13f4Use the new backend iterators in diagram_algebras.py.
b486103add reference
77d6902Minor tweaks.
14ed798Merge branch 'public/combinat/speedup_brauer_algebra-25659' of git://trac.sagemath.org/sage into t/25662/wrong_contains_in_partitiondiagrams
c32cbedfix doctests destroyed by merge

@mantepse
Copy link
Contributor Author

comment:5

ping?

@mantepse
Copy link
Contributor Author

mantepse commented Jan 3, 2019

comment:6

ping ping?

@dkrenn
Copy link
Contributor

dkrenn commented Jan 6, 2019

comment:7

Started reviewing:
1.

+            obj = self._element_constructor_(obj)

I think here self(obj) is the correct way to do as this includes the shortcuts for elements with the "correct" parent.

+        if self.order not in ZZ:
+            # check that k and -k are in the same block
+            k = ZZ(self.order + ZZ(1)/ZZ(2))

I have no glue when the order is not in ZZ (apperently it can have half values, but is this always the case? (I am a bit scared about this code...)

So, either someone tells me, where this is explained or someone else takes a look (I'd prefer the latter).

Rest looks good AFAICS.

@dkrenn
Copy link
Contributor

dkrenn commented Jan 6, 2019

Reviewer: Daniel Krenn

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 6, 2019

Branch pushed to git repo; I updated commit sha1. New commits:

e7d2738Merge branch 'u/mantepse/wrong_contains_in_partitiondiagrams' of git://trac.sagemath.org/sage into HEAD
fee85a7replace self._element_constructor_(obj) with self(obj)

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 6, 2019

Changed commit from c32cbed to fee85a7

@mantepse
Copy link
Contributor Author

mantepse commented Jan 6, 2019

comment:9

Wow, that was quick - thank you!

OK, I admit I don't know about the first one, so I simply tried and it looks OK. If I understand correctly, self(obj) will invoke __call__, which in turn calls _element_constructor_, right? I did not check whether there is a performance penality.

The second one is by design of this class (which is not myself). For the partition algebra, it makes algebraically a lot of sense to have half-integer order. Combinatorially, this simply implies that the final elements of the top and the bottom row, by convention k and -k, are in the same block.

By design, all other diagram algebras are currently regarded as sub-algebras of the partition algebra. This may be up to debate, but is certainly outside the scope of this ticket.

Martin


New commits:

e7d2738Merge branch 'u/mantepse/wrong_contains_in_partitiondiagrams' of git://trac.sagemath.org/sage into HEAD
fee85a7replace self._element_constructor_(obj) with self(obj)

@dkrenn
Copy link
Contributor

dkrenn commented Jan 6, 2019

comment:10

Replying to @mantepse:

OK, I admit I don't know about the first one, so I simply tried and it looks OK. If I understand correctly, self(obj) will invoke __call__, which in turn calls _element_constructor_, right? I did not check whether there is a performance penality.

The thing is that _element_constructor_ is not called if the element has already the parent self (maybe there are some additional shortcuts if the element_class already fits). So, therefore this is the way to convert an element. I do not think that the performance is here an issue (and I am very certain that one could construct an example, where your code does not work). So therefore, I am in favor of changing it.

The second one is by design of this class (which is not myself). For the partition algebra, it makes algebraically a lot of sense to have half-integer order. Combinatorially, this simply implies that the final elements of the top and the bottom row, by convention k and -k, are in the same block.

Ok. So then the code is somehow fine for me (because if for whatever reason the order is not half-integer, then there will be an error and not a false result).

It might be worth considering restructuring the code so that if self.order in ZZ: return True so that it is clear that the code ends here in this case and only otherwise we do something. (This is also the style on the lines before).

By design, all other diagram algebras are currently regarded as sub-algebras of the partition algebra. This may be up to debate, but is certainly outside the scope of this ticket.

Clear, this is not for discussion here.

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 6, 2019

Branch pushed to git repo; I updated commit sha1. New commits:

ab58ad8restructure flow

@sagetrac-git
Copy link
Mannequin

sagetrac-git mannequin commented Jan 6, 2019

Changed commit from fee85a7 to ab58ad8

@mantepse
Copy link
Contributor Author

mantepse commented Jan 6, 2019

comment:13

Done!

@dkrenn
Copy link
Contributor

dkrenn commented Jan 7, 2019

comment:14

Thanks, LGTM.

@tscrim
Copy link
Collaborator

tscrim commented Jan 7, 2019

comment:15

There are a number of things wrong or misleading. First, self(foo) redirects to self._element_constructor_(foo) unless foo is already an element of the parent self. I generally disagree with testing self(foo) and catching an error first. This is very slow for things that inherit from AbstractPartitionDiagram. I believe you also get None in PartitionDiagrams(0) returning True whereas the current version, this is False.

I think what you should do is actually properly fix the __contains__ that are there. This might include a better check method too.

@mantepse
Copy link
Contributor Author

mantepse commented Jan 7, 2019

comment:16

As with your comment on the permutation code, I am completely lost.

Maybe you could provide a patch yourself, so I can see what you mean?

Patching the individual __contains__ is not the way to go in my opinion, because it is quite error prone (as visible in the original code). As I wrote above, I think it is better to have a working version first, and optimise afterwards. Note that there are many other diagram algebras that would be good to have, so it is important that the generic method works well.

@tscrim
Copy link
Collaborator

tscrim commented Jan 7, 2019

comment:17

Replying to @mantepse:

As with your comment on the permutation code, I am completely lost.

Maybe you could provide a patch yourself, so I can see what you mean?

No because there is no patch to provide. What I am saying is that I do not think your proposed change is the way forward with reasoning why. Although I did see one typo above that I have fixed.

Patching the individual __contains__ is not the way to go in my opinion, because it is quite error prone (as visible in the original code). As I wrote above, I think it is better to have a working version first, and optimise afterwards. Note that there are many other diagram algebras that would be good to have, so it is important that the generic method works well.

It is definitely the way to go because it has better separation-of-concerns (checks are localized to each part), it is faster, and it should not be hard to do. If you want to make it work, then fix it properly.

@mantepse
Copy link
Contributor Author

mantepse commented Jan 7, 2019

comment:18

OK, I'll abandon this for the moment, I don't understand what you want me to do (except for implementing a method for each diagram algebra separately, which I believe is wrong). I'll do the quick fix in the permutations file, but this one is too much work right now. Exams coming up.

@mantepse
Copy link
Contributor Author

mantepse commented Jan 7, 2019

Changed author from Martin Rubey to none

@dkrenn
Copy link
Contributor

dkrenn commented Jan 8, 2019

comment:19

Replying to @tscrim:

There are a number of things wrong or misleading. First, self(foo) redirects to self._element_constructor_(foo) unless foo is already an element of the parent self. I generally disagree with testing self(foo) and catching an error first. This is very slow for things that inherit from AbstractPartitionDiagram. I believe you also get None in PartitionDiagrams(0) returning True whereas the current version, this is False.

I think what you should do is actually properly fix the __contains__ that are there. This might include a better check method too.

I am not sure either, why you think the proposed code is so bad. Isn't having a quite general implementation favored over individual implementations for each class. (But it might also be that I do not really understand your concerns; so, could we elaborate on this...?)

@tscrim
Copy link
Collaborator

tscrim commented Jan 8, 2019

comment:20

Replying to @dkrenn:

Replying to @tscrim:

There are a number of things wrong or misleading. First, self(foo) redirects to self._element_constructor_(foo) unless foo is already an element of the parent self. I generally disagree with testing self(foo) and catching an error first. This is very slow for things that inherit from AbstractPartitionDiagram. I believe you also get None in PartitionDiagrams(0) returning True whereas the current version, this is False.

I think what you should do is actually properly fix the __contains__ that are there. This might include a better check method too.

I am not sure either, why you think the proposed code is so bad. Isn't having a quite general implementation favored over individual implementations for each class. (But it might also be that I do not really understand your concerns; so, could we elaborate on this...?)

The __contains__ method should be fast as a general rule. Moreover, general code is usually more complicated as it has poorer separation-of-conerns. This is why we have subclasses. We also already have implementations that mostly work. I see this proposal as 1 step forward, 2 steps back.

@mantepse
Copy link
Contributor Author

mantepse commented Jan 8, 2019

comment:21

A few questions, comments and a clarification:

1.) I actually did propose to implement efficient methods once the generic check works, and as need arises. As you know I spent quite a bit of time speeding up the code for diagram algebras, because it was too slow for my computations. Thus, I think you could trust me that I care about speed and thought about how to achieve it.

Fun fact: removing the "specialised" TemperleyLiebDiagrams.__contains__ results in a factor 0.8 speedup of the following test:

sage: r=5; td = da.TemperleyLiebDiagrams(r); l = da.PartitionDiagrams(r).list()
sage: %timeit sum(1 for d in l if d in td)

I am guessing that this is because it first checks whether the candidate is in BrauerDiagrams, thus duplicating a lot of work.

2.) I do think the general rule is to go for correctness first, and then optimize. (This is not saying that my patch is perfect, of course. In particular, I think I should have removed the __contains__ from TemperleyLiebDiagrams and PlanarDiagrams, too.)

In the case at hand I think it is particularly important to have a well working generic check, to make it less work to implement new diagram algebras. In particular, #25637 moves three from partition_algebra.py to this module, using the same generic __contains__ method.

I do not see what "separation-of-concerns" has to do with my approach: I am simply determining containment by trying to construct the diagram.

3.) I understand that you think "it is definitively the way to go", but I do not see why. I think your argument about speed is invalid, because the current implementation is buggy, and duplication of code is often a source of bugs.

@tscrim
Copy link
Collaborator

tscrim commented Jan 9, 2019

comment:22

Replying to @mantepse:

A few questions, comments and a clarification:

1.) I actually did propose to implement efficient methods once the generic check works, and as need arises. As you know I spent quite a bit of time speeding up the code for diagram algebras, because it was too slow for my computations. Thus, I think you could trust me that I care about speed and thought about how to achieve it.

Martin, I know you care about speed. Patronizing me is not useful to the discussion.

Fun fact: removing the "specialised" TemperleyLiebDiagrams.__contains__ results in a factor 0.8 speedup of the following test:

sage: r=5; td = da.TemperleyLiebDiagrams(r); l = da.PartitionDiagrams(r).list()
sage: %timeit sum(1 for d in l if d in td)

I am guessing that this is because it first checks whether the candidate is in BrauerDiagrams, thus duplicating a lot of work.

Probably. That is an interesting data point.

2.) I do think the general rule is to go for correctness first, and then optimize. (This is not saying that my patch is perfect, of course. In particular, I think I should have removed the __contains__ from TemperleyLiebDiagrams and PlanarDiagrams, too.)

IMO that is even more a step in the wrong direction.

In the case at hand I think it is particularly important to have a well working generic check, to make it less work to implement new diagram algebras. In particular, #25637 moves three from partition_algebra.py to this module, using the same generic __contains__ method.

I feel that is a bit of a misrepresentation of those __contains__ as we are deprecating those classes.

I do not see what "separation-of-concerns" has to do with my approach: I am simply determining containment by trying to construct the diagram.

You are taking things that should be in separate classes and pulling them to one class. The containment check for being a Brauer diagram should be done by the Brauer diagram parent class after it has been determined that it is a partition diagram by the partition diagram class. Each class tests what it is suppose to be.

3.) I understand that you think "it is definitively the way to go", but I do not see why. I think your argument about speed is invalid, because the current implementation is buggy, and duplication of code is often a source of bugs.

That does not make it invalid. It just means we need to fix a bug. We are only arguing about how to fix the bug. IMO, your fix makes the code less clear, harder to optimize in the future (as any optimization would undo your change), and likely generically makes the code slower (but I have not run timings yet).

There is also very little duplication of code. The only part that is duplicated is the part that converts a non (abstract) partition diagram to an object of the parent. If you are worried about that, you can make that into a function. However, the meat of the issue has no duplication.

I wonder if the only thing that is going wrong is this part:

# check that k and -k are in the same block

@mantepse
Copy link
Contributor Author

mantepse commented Jan 9, 2019

comment:23

Replying to @tscrim:

Martin, I know you care about speed. Patronizing me is not useful to the discussion.

reciprocally.

In the case at hand I think it is particularly important to have a well working generic check, to make it less work to implement new diagram algebras. In particular, #25637 moves three from partition_algebra.py to this module, using the same generic __contains__ method.

I feel that is a bit of a misrepresentation of those __contains__ as we are deprecating those classes.

No. Those classes (rooks, planar rooks and permutations) are migrated to the better framework in diagram_algebra.py, only the old implementations are deprecated.

You are taking things that should be in separate classes and pulling them to one class.

No. This is absolutely not what the patch does. I am only removing code.

Besides:

I believe you also get None in PartitionDiagrams(0) returning True [...]

No. I agree it is a good idea to check this corner case!

Anyway. My offer is as follows:

  • remove the __contains__ here
  • do deprecate module combinat.partition_algebra #25637, so that all algebras are in this class
  • discuss and implement a better __contains__ machinery - but after having migrated the other classes from partition_algebra.py to diagram_algebra.py.

A final question:

I wonder if the only thing that is going wrong is this part:

# check that k and -k are in the same block

I don't really know. The logic is (for my brain) scattered over too many places.

The containment check for being a Brauer diagram should be done by the Brauer diagram parent class after it has been determined that it is a partition diagram by the partition diagram class. Each class tests what it is suppose to be.

However, I think it would make sense to swap the __contains__ and check methods: the meat of the check could be in __contains__ of the parent classes, and the check method could be generic in AbstractPartitionDiagram.

The advantage is speed, the cost is that one probably looses some information for the error message. For example, TemperleyLieb is the intersection of Brauer and Planar. Currently the user is told whether the diagram has bad blocks, and if all blocks are OK, whether the diagram is planar. A generic check could probably only say "This is not a Temperley-Lieb diagram".

The other way to gain performance and remove duplication of code is probably to remove AbstractSetPartition.

@tscrim
Copy link
Collaborator

tscrim commented Jan 9, 2019

comment:24

Replying to @mantepse:

Replying to @tscrim:

You are taking things that should be in separate classes and pulling them to one class.

No. This is absolutely not what the patch does. I am only removing code.

Yes it is. That is precisely what you are doing. Everything goes into the base class's __contains__, so that is the only method that is being called instead of the subclasses' __contains__.

Anyway. My offer is as follows:

  • remove the __contains__ here

This is the sticking point and what I disagree with doing. How about we fix the actual bug?

#25637 is basically independent of this and those old classes are meant to be deprecated. So that is irrelevant to this discussion.

  • discuss and implement a better __contains__ machinery - but after having migrated the other classes from partition_algebra.py to diagram_algebra.py.

This is not a good strategy. Fix the problem at hand. It sounds like you are suggesting to remove code just to add it back in. If you are not sure and feel it warrants a full design discussion, then let us fix the bug using the current framework, which should be easy, and then have the discussion.

A final question:

I wonder if the only thing that is going wrong is this part:

# check that k and -k are in the same block

I don't really know. The logic is (for my brain) scattered over too many places.

Are you saying you are unable to debug this with the current implementation?

The containment check for being a Brauer diagram should be done by the Brauer diagram parent class after it has been determined that it is a partition diagram by the partition diagram class. Each class tests what it is suppose to be.

However, I think it would make sense to swap the __contains__ and check methods: the meat of the check could be in __contains__ of the parent classes, and the check method could be generic in AbstractPartitionDiagram.

I generally agree with this, but your proposed change actually goes the opposite direction of this.

The advantage is speed, the cost is that one probably looses some information for the error message. For example, TemperleyLieb is the intersection of Brauer and Planar. Currently the user is told whether the diagram has bad blocks, and if all blocks are OK, whether the diagram is planar. A generic check could probably only say "This is not a Temperley-Lieb diagram".

In properly optimized code, there would be no speed advantage to having a generic method. Generic code needs to work generically, and so it cannot take advantage of special structure.

The other way to gain performance and remove duplication of code is probably to remove AbstractSetPartition.

Removing ABCs is usually independent of performance, and it generally involves more code duplication because there is less commonality with related classes.

@mkoeppe mkoeppe removed this from the sage-8.3 milestone Dec 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants