Add support for functional dependencies within a variable table #471
Comments
This really should have been in the code a year ago @feiranwang. @zhangce Can you fix this?
For functional dependencies, DeepDive can recognize columns of variable relations without
The OneIsTrue factor type doesn't seem to be implemented as a constraint, so a categorical variable seems like the way to go. For grounding, I'd do a GROUP BY @key to get an array of tuple IDs for each group; each such array would be a categorical variable in the sampler.
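The grounding step described above can be sketched in a few lines. This is only an illustration of the GROUP BY idea, not DeepDive's actual grounding code: the function `group_by_key`, the row layout, and the column names `mention` and `entity` are all assumptions made for the example.

```python
from collections import defaultdict


def group_by_key(rows, key_columns):
    """Group tuple IDs by the values of their key columns.

    rows: iterable of dicts, each carrying an "id" field plus data columns.
    Returns a dict mapping each key tuple to the list of tuple IDs sharing it.
    Each list would become one categorical variable in the sampler.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        groups[key].append(row["id"])
    return dict(groups)


# Hypothetical entity-linking candidates: one row per (mention, entity) pair.
rows = [
    {"id": 0, "mention": "m1", "entity": "e1"},
    {"id": 1, "mention": "m1", "entity": "e2"},
    {"id": 2, "mention": "m1", "entity": "e3"},
    {"id": 3, "mention": "m2", "entity": "e1"},
    {"id": 4, "mention": "m2", "entity": "e4"},
]

# Equivalent in spirit to: SELECT mention, ARRAY_AGG(id) ... GROUP BY mention
categorical_vars = group_by_key(rows, ["mention"])
```

Here `categorical_vars` maps each mention to the tuple IDs competing for it, so exactly one ID per list would be true in any sampled state.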
Right, unless we turn it into a categorical variable, there's probably no difference in the blowup. This also sounds doable at the DDlog level, by desugaring such cases. I think you'll want to correlate such categorical variables with other boolean ones, but I believe that is currently not possible. I think we'll need to add a few more ways to mix them as well to make this actually useful: implications, etc. Please correct me if I'm wrong. With my limited exposure to such use cases, the current Categorical/Multinomial support doesn't really typecheck with the rest in my head, so I wish someone could clarify everything with a full-blown example.
Very good point! Yes, it looks like we do have to keep the identities of the underlying boolean variables in those groups so that they can correlate with other variables. I don't know how categorical variables are currently implemented in the sampler. For things to work, the underlying boolean variables must be referenceable in the factors. Conceptually it'd be cleaner to say all variables are boolean and it's possible to enforce one-is-true constraints among a group of variables. As for implementation, I don't know if it's easier for the Variable class or the Factor class to handle such constraints...
@alldefector Sorry, I'm not quite following here. It seems this can be done using multinomial variables or boolean variables with one-is-true constraints. Could you give a concrete example that requires more support in the system?
Yes, it is the one-is-true constraint. However, the sampler doesn't seem to support this constraint currently. There is a factor type by the same name, but it's not a constraint. Also, there is no front-end support in DeepDive / DDlog to take advantage of it once we do have such support in the sampler.
Vanilla Gibbs sampling gets broken if you add in a "hard" constraint (i.e. Of course this problem goes away with actual categorical variables. I actually think implementing this simple block Gibbs scheme might be On Thu, Jan 28, 2016 at 4:44 PM alldefector notifications@github.com
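A block Gibbs update for a one-is-true group could look roughly like the sketch below. The usual intuition is that flipping a single boolean in a valid one-hot state always violates a hard constraint, so single-site updates cannot move between valid states, whereas resampling the whole group at once can. This is a generic illustration under that assumption, not the DeepDive sampler's code; the function `resample_block` and the per-variable log-weights are hypothetical.

```python
import math
import random


def resample_block(log_weights, rng=random):
    """Jointly resample a group of booleans under a one-is-true constraint.

    log_weights[i] is the unnormalized log-potential of the state in which
    variable i is the single true one (an assumption of this sketch).
    Returns a one-hot list of booleans.
    """
    # Subtract the max before exponentiating for numerical stability.
    m = max(log_weights)
    probs = [math.exp(w - m) for w in log_weights]
    total = sum(probs)

    # Draw one index with probability proportional to probs.
    r = rng.random() * total
    acc = 0.0
    chosen = len(log_weights) - 1  # fallback guards against rounding error
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            chosen = i
            break
    return [i == chosen for i in range(len(log_weights))]


state = resample_block([0.0, 1.0, -2.0], random.Random(0))
```

Every state this step produces has exactly one true variable, which is the same thing a single categorical variable gives the sampler directly.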
That'd be great! FWIW, these ancient systems probably had such blocking from the get-go:
ancient systems... dastardly. On Thu, Jan 28, 2016 at 5:15 PM alldefector notifications@github.com
Bump. Can someone explain what's going on here? @netj
Also, @netj and @feiranwang, it should be super easy to declare categorical variables that are linked. Say, for our old entity linking design, they need to be able to express LinksToOne(candidate, entity) as a categorical random variable (each candidate maps to one entity). This should be super easy (@key is fine). @zhangce, I know you're traveling, but your thoughts are welcome when you get back online :) @ajratner I don't want to redo the full constraints: no one seems to use them, and they have lots of code complexity... maybe in the summer :)
OK, we do have categorical vars now.
This is the front-end support for the multinomial variable type in the sampler.
This is critical for classification targets such as entity linking, where the table schema is <object, class> and for each value of 'object', only one 'class' can be true in any state. If we don't support this constraint, we'd have to use the pairwise exclusivity rule to emulate it -- which would result in dramatic blowups in the factor graph.
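To make the blowup concrete, here is a small back-of-the-envelope sketch. It assumes the emulation adds one exclusion factor per unordered pair of candidate classes for each object, which is a simplification chosen for illustration; the function name `pairwise_exclusion_factors` is hypothetical. Note that pairwise exclusion only enforces "at most one", so an emulation would need something extra to force one class to be true.

```python
def pairwise_exclusion_factors(num_classes):
    """Number of pairwise "not both true" factors needed for one object.

    One factor per unordered pair of classes: k * (k - 1) / 2.
    """
    return num_classes * (num_classes - 1) // 2


# Factors per object under pairwise emulation, versus a single
# categorical variable per object with native support.
for k in (2, 10, 100, 1000):
    print(k, pairwise_exclusion_factors(k))
```

The count grows quadratically in the number of classes per object (45 factors for 10 classes, 4950 for 100), while native support needs one categorical variable per object.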