Consider changing default subproblem blocking #2

Closed
alex-lew opened this issue Mar 4, 2021 · 1 comment · Fixed by #16

Comments

@alex-lew
Contributor

alex-lew commented Mar 4, 2021

The paper implies that, by default, all attributes and reference slots of a class belong to the same subproblem. However, the current implementation uses the opposite convention: by default, each attribute or reference slot is in its own subproblem, and larger subproblems must be defined by manual blocking. We should consider changing this to match the paper.
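
To make the two conventions concrete, here is a minimal Python sketch. It is not PClean code: the function names `singleton_blocks` and `single_block` and the slot names are hypothetical, and a subproblem is modeled simply as a list of slot names.

```python
from typing import List


def singleton_blocks(slots: List[str]) -> List[List[str]]:
    """Convention described for the current implementation:
    each attribute or reference slot forms its own subproblem."""
    return [[slot] for slot in slots]


def single_block(slots: List[str]) -> List[List[str]]:
    """Convention the paper implies: all attributes and reference
    slots of a class share one subproblem."""
    return [list(slots)] if slots else []


# Hypothetical slots of one class, for illustration only.
slots = ["name", "city", "landlord"]
print(singleton_blocks(slots))  # one subproblem per slot
print(single_block(slots))      # one subproblem containing every slot
```

Under the first convention, a user who writes no blocking gets many small subproblems; under the second, the same model yields a single joint subproblem.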

@marcoct
Contributor

marcoct commented Mar 6, 2021

As a first-time user, the first code I wrote did not include subproblem blocking, and I got nonsense results with no indication that I had done anything wrong. I think defaulting to bigger blocks (in particular, blocking everything together) makes sense from a new user's perspective. I expect a common default strategy will be either (i) to wait for the run to finish, because it looks like it is only taking a minute or so, and (hopefully) get somewhat sensible results, or (ii) to kill the run early and re-run it on less data. I think that subproblems should be seen by users as a performance optimization, not a 'thing needed to get good results'.

(And the type of subproblem solution strategy employed in PClean is great, because it has the property that it is often possible to guarantee the exact solution to a subproblem, so the "slow but exact" starting point in the iteration space is available, and only after users see some results that they can make sense of do they need to venture into the much more complex "faster but less accurate" part of the space.)

@alex-lew alex-lew added this to To do in PClean Kanban Mar 6, 2021
@alex-lew alex-lew moved this from To do to In progress in PClean Kanban Mar 6, 2021
PClean Kanban automation moved this from In progress to Done Mar 6, 2021
alex-lew added a commit that referenced this issue Mar 6, 2021
…lems

Treat contiguous statements as belonging to the same subproblem by default (resolves #2)
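
One possible reading of the commit message above, sketched in Python rather than taken from the PClean source: statements that appear one after another with no explicit block are merged into a single subproblem, while an explicitly blocked group stays a subproblem of its own. The function `group_contiguous` and the representation (a plain string for an unblocked statement, a list for an explicit block) are assumptions made for illustration.

```python
from typing import List, Union

# Assumed representation: a str is an unblocked statement,
# a list of str is an explicitly blocked group of statements.
Statement = Union[str, List[str]]


def group_contiguous(statements: List[Statement]) -> List[List[str]]:
    """Merge each run of contiguous unblocked statements into one
    subproblem; keep every explicit block as its own subproblem."""
    subproblems: List[List[str]] = []
    run: List[str] = []
    for stmt in statements:
        if isinstance(stmt, list):
            # An explicit block ends the current run of unblocked statements.
            if run:
                subproblems.append(run)
                run = []
            subproblems.append(list(stmt))
        else:
            run.append(stmt)
    if run:
        subproblems.append(run)
    return subproblems


# Hypothetical statement names, for illustration only.
print(group_contiguous(["a", "b", ["c", "d"], "e"]))
```

With no explicit blocks at all, this grouping puts every statement in one subproblem, which matches the "block everything together" default discussed in the comments.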