In the paper, it is implied that by default, all attributes and reference slots of a class belong to the same subproblem. However, the current implementation uses the opposite convention: that by default, each attribute or reference slot is in its own subproblem, requiring manual blocking in order to define bigger subproblems. We should consider changing this to match the paper.
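To make the two conventions concrete, here is a minimal Python sketch. It is not PClean code and does not use PClean's API; the function `default_subproblems`, the convention labels `"paper"` and `"implementation"`, and the slot names are all hypothetical and only illustrate how the same class would be partitioned under each default.

```python
from typing import List, Sequence


def default_subproblems(slots: Sequence[str], convention: str) -> List[List[str]]:
    """Partition a class's attribute/reference slots into subproblems.

    Hypothetical illustration only (not PClean's API):
    - "paper": all slots share one subproblem unless blocked otherwise.
    - "implementation": each slot is its own subproblem unless blocked manually.
    """
    if convention == "paper":
        # One joint subproblem; an empty class has no subproblems at all.
        return [list(slots)] if slots else []
    if convention == "implementation":
        # One singleton subproblem per slot, in declaration order.
        return [[slot] for slot in slots]
    raise ValueError(f"unknown convention: {convention!r}")


# Hypothetical slots of a single class.
slots = ["name", "city", "landlord"]

paper_default = default_subproblems(slots, "paper")
implementation_default = default_subproblems(slots, "implementation")
```

Under the `"paper"` convention the three slots form a single subproblem, while under the `"implementation"` convention they form three singleton subproblems that have to be merged by hand to get the same behavior.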
As a first-time user, the first code I wrote did not include subproblem blocking, and I got nonsense results with no indication that I had done anything wrong. I think defaulting to bigger blocks, and in particular blocking everything together, makes sense from a new user's perspective. A common default strategy for new users will be to either (i) wait for the run to finish, because it looks like it is only taking a minute or so, and (hopefully) get somewhat sensible results, or (ii) kill the run early and re-run it on less data. Subproblems should be seen by users as a performance optimization, not a 'thing needed to get good results'.
(And the type of subproblem solution strategy employed in PClean is great, because it is often possible to guarantee the exact solution to a subproblem. That makes the "slow but exact" starting point in the iteration space available, and users only need to venture into the much more complex "faster but less accurate" part of the space after they have seen some results they can make sense of.)