Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix implementation of Product.logprob_disjoint_union #12

Closed
fsaad opened this issue Feb 18, 2020 · 4 comments
Closed

Fix implementation of Product.logprob_disjoint_union #12

fsaad opened this issue Feb 18, 2020 · 4 comments

Comments

@fsaad
Copy link
Collaborator

fsaad commented Feb 18, 2020

https://github.com/probcomp/sum-product-dsl/blob/ba6e29362dec68422924cc23151cad2bce263eea/src/spn.py#L316

Faulty reasoning in disjoint-union algorithm for probabilities.

Key idea: Pr[A or B] // where A and B are in DNF
= Pr[A or (B and ~A)] // disjoint union property
= Pr[A] + Pr[B and ~A] // since events are disjoint

Since A and B are in DNF, write them as
A = ϕ(A; X) and ϕ(A; Y)
B = ϕ(B; X) and ϕ(B; Y)

To assess B and ~A, we have:
[ϕ(B; X) and ϕ(B; Y)] and ~[ϕ(A; X) and ϕ(B; Y)]

= [ϕ(B; X) and ϕ(B; Y)] and [~ϕ(A; X) or ~ϕ(B; Y)]

= [ϕ(B; X) and ϕ(B; Y) and ~ϕ(A; X)]
or [ϕ(B; X) and ϕ(B; Y) and ~ϕ(B; Y)]

= [ϕ(B; X) and ~ϕ(A; X) and ϕ(B; Y)]
or [ϕ(B; X) and ϕ(B; Y) and ~ϕ(B; Y)]

the issue is that clauses in this DNF expression not necessarily
disjoint.

Example:

ϕ(A; X) = X > 0
ϕ(A; Y) = Y < 1
ϕ(B; X) = X < 1
ϕ(B; Y) = Y < 3

Then the final expression is:

[(X < 1) and ~(X > 0) and (Y < 3)]
or [(X < 1) and (Y < 3) and ~(Y < 1)]
= [(X < 1) and (X ≤ 0) and (Y < 3)]
or [(X < 1) and (Y < 3) and (Y ≥ 1)]
= [ (X ≤ 0) and (Y < 3) ] or [ (X < 1) and (1 ≤ Y < 3) ]

which is not disjoint, and reduces to inclusion-exclusion.

@fsaad
Copy link
Collaborator Author

fsaad commented Feb 18, 2020

To assess B and ~A, we have:
[ϕ(B; X) and ϕ(B; Y)] and ~[ϕ(A; X) and ϕ(B; Y)]

= [ϕ(B; X) and ϕ(B; Y)] and [~ϕ(A; X) or ~ϕ(B; Y)]

Specifically, the current implementation fails to apply De Morgan's law in the final equality, and instead computes [ϕ(B; X) and ϕ(B; Y)] and [~ϕ(A; X) and ~ϕ(B; Y)] in the make_disjoint_conjunction:
https://github.com/probcomp/sum-product-dsl/blob/ba6e29362dec68422924cc23151cad2bce263eea/src/spn.py#L354-L362

@fsaad
Copy link
Collaborator Author

fsaad commented Feb 18, 2020

In general, the exponential (in the number of clauses in the DNF statement) query time needed to compute the probability of a union is not a function of the underlying model class but rather a function of the query class, and it does not seem possible to prove one can compute it efficiently under any circumstance.

@fsaad fsaad changed the title Expunge faulty implementation of Product.logprob_disjoint_union Fix implementation of Product.logprob_disjoint_union Feb 18, 2020
@fsaad
Copy link
Collaborator Author

fsaad commented Feb 18, 2020

This issue has a downstream effect on the condition, and is thus blocking #10.

The Inclusion-Exclusion principle can give us the required weights but the sub-terms do not yield a DNF expression with disjoint clauses, i.e.,

Pr[ A or B or C]
= Pr[A] + Pr[B and ~A] + Pr[C and ~A and ~B]

From Inclusion--Exclusion we have
Pr[A or B or C] =
(+1) Pr[A] Pr[B] Pr[C]
(-1) Pr[AB] Pr[AC] Pr[BC]
(+1) Pr[ABC]

Thus we have:
Pr[A] = Pr[A]
Pr[B and ~A] = Pr[B] - Pr[AB]
Pr[C and ~A and ~B] = Pr[C] - Pr[AC] - Pr[BC] + Pr[ABC]

The result is a Sum SPN, where the events and weights of each of the three conditioned children is shown above.

The challenge is how do we condition Product on an event such has Pr[B and ~A]. The previous (faulty) reasoning treated [B and ~A] as a single conjunction, but in fact it can be the disjunction of two non-disjoint sets as was shown in the original post, and so it needs to be decomposed into non-overlapping rectangles.

@fsaad
Copy link
Collaborator Author

fsaad commented Feb 18, 2020

A temporary WIP has been pushed to 3981c98.

In the meantime, we will side-step this issue by

  • use inclusion-exclusion algorithm to compute Product.logprob
  • discard all clauses that do not have probability zero.
  • disallow condition on an OR expression for Product with non pairwise disjoint clauses.
  • remove the disjoint_union algorithm from Product.

fsaad pushed a commit that referenced this issue Feb 19, 2020
@fsaad fsaad closed this as completed in b64f0cb Feb 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant