Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unification issue #166

Closed
pguillebert opened this issue Jan 26, 2016 · 15 comments
Closed

Unification issue #166

pguillebert opened this issue Jan 26, 2016 · 15 comments
Milestone

Comments

@pguillebert
Copy link

Hi

I have this unexplainable behaviour :

(defrecord Thing [id data])
(defrecord Config [k])

(defrule rule-one
  [?config <- Config]
  [?new <- Thing (= id nil)
   ;; this unification works (case 1):
   ;;(= ?var (get-in data [1]))
   ;; this does not ! (case 2)
   (= ?var (get-in data [(:k ?config)]))
   ]
  [:not [Thing (not= id nil)
         ;; this works (case 1)
         ;; (= ?var (get-in data [1]))
         ;; does not (case 2)
         (= ?var (get-in data [(:k ?config)]))
         ]]
  =>
  (println "Found with k=" (:k ?config) ?new))

(defn gooo
  []
  (-> (mk-session :cache false)
      (insert  (->Config 1)
               ;; Things are different (at vector position 1),
               ;; this should print something
               (->Thing nil [ 1  4  5 ])
               (->Thing 131 [ 1  2  5 ])
               )
      (fire-rules)) nil)

So in case 1 it works, but if I try to extract the vector position from Config record, it silently fails. This may be related to the :not condition, I don't know.

Am I doing something wrong ?

@rbrush
Copy link
Contributor

rbrush commented Jan 26, 2016

This is a bug. What's happening is Clara is sees the = operator and is trying to do a hash-based join between ?var and (get-in data [(:k ?config)]). But that fails because the (get-in data [(:k ?config)]) doesn't have visibility to other facts at the point in the rete network where the hash is created.

The fix will be to check for nested use of other facts and skip the hash-based join in this case. I think we can get that out quickly for the upcoming 0.10 release. In the meantime, there is an ugly-ish workaround to get Clara to avoid attempting to do a hash-based join here:

Have a function that does the equality check:

(defn equals [a b]
  (= a b))

And them replace (= ?var (get-in data [(:k ?config)])) with a call to that function, like (equals ?var (get-in data [(:k ?config)])). This avoids the issue since Clara won't attempt to do a hash-based unification unless it sees = or ==.

Thanks for reporting this, and apologies for any issues it may have caused!

@pguillebert
Copy link
Author

Thanks for the quick reply. Unfortunately, the workaround you describe does not work, the ?var ends up unbound :

ExceptionInfo Using variable that is not previously bound. This can happen when an expression uses a previously unbound variable, or if a variable is referenced in a nested part of a parent expression, such as (or (= ?my-expression my-field) ...). 
Unbound variables: #{?var}  clojure.core/ex-info (core.clj:4593)

@rbrush
Copy link
Contributor

rbrush commented Jan 26, 2016

I think I misdiagnosed the problem. While Clara will unify variables, it won't yet define new variables that depend on function of previous variables, so it wouldn't define a new ?var that depends on (:k ?config). This is something we can add, and failing silently is the worst thing that we can do here.

I'm not quite sure what behavior you're looking for, but some variation of the following might work. Rather than defining a ?var variable that is the result of a function of ?config, we just get the data from each and do a comparison as a normal rule constraint.

(defrule rule-one
  [?config <- Config]
  [?new <- Thing (= id nil)
   (= ?first-data data)]
  [:not [Thing (not= id nil)
         (= (get-in ?first-data [(:k ?config)])
             (get-in data [(:k ?config)]))]]
  =>
  (println "Found with k=" (:k ?config) ?new))

@pguillebert
Copy link
Author

OK thanks this works around my issue. I liked the symmetry in expression in my initial issue but I guess I'll have to use the workaround for now.

I have absolutely no idea how complex this would be to add this feature, but I second failing silently is not what we want :)

Thanks again for you insight.

rbrush added a commit that referenced this issue Jan 27, 2016
@rbrush
Copy link
Contributor

rbrush commented Jan 27, 2016

It should be fixed in the above commit. (It's on the issue-166 branch if you want to take it for a spin.) I had actually added much of what was needed here previously, but it wasn't working in the case where a given variable was bound from other variables multiple times.

If you do get a chance to take the issue-166 branch for a spin, please let me know if you have any problems here. I'm targeting this for the 0.10.0 release, which we are going to push as soon as we get a couple unrelated optimizations finished.

@rbrush rbrush added this to the 0.10.0 milestone Jan 27, 2016
@pguillebert
Copy link
Author

I'm testing this branch, this indeed solves the issue with the clojure snippet above. But, in my real-world test suite, after editing my rules to match the new allowed syntax, one of the test fails. So somehow this proves a bit problematic.

In my real-world rules, facts are Java Beans, any chance this can impact the fix ?

I'm just editing one constraint to change from the workaround style to the initial issue style. Other than that, I'm not editing anything else.

Also other real-world tests relying on the edited rule seem to work as previously.

So... a bit puzzled.

@rbrush
Copy link
Contributor

rbrush commented Jan 27, 2016

It's hard to imagine that using Java Beans would have an impact on this, since that interaction is pretty well contained elsewhere in the code. (I'm assuming the Java Beans aren't being mutated while participating in the Clara session.)

If you leave the workaround to this issue in, does your code continue to work on this branch as it did in 0.9.2? Some other things changed on this stream since 0.9.2, so I'd want to make sure nothing broke independently of this refactoring.

Also, this might change the order in which rules fire or the results are exposed in a query. I've had tests break across changes because I had unknowingly made ordering assumptions that weren't guaranteed.

@pguillebert
Copy link
Author

Ok so Beans are not the issue.

When I leave the workaround, all tests run OK using 0.10.0-SNAPSHOT. So apparently only this change breaks my code.

I also tested the new syntax with 0.9.2, as you would expect many tests break.

Regarding order of rules, I don't think I make assumptions ; the only one is that truth maintenance will ensure that if conditions that fired a rule changes the result is retracted.

Output of the engine is a considered a Set so not considered ordered at all.

I'll try to reproduce in a simpler environment. There are maybe other conditions in the real rule that trigger differently.

Thanks

@pguillebert
Copy link
Author

I think I may hitting another limit with my real rule. Below, you will see that in the real use case, I also wanna make sure that another related fact ContainerType has not been already inserted :

(defrecord Thing [id data])
(defrecord Config [k])
(defrecord ContainerType [thing])

(defrule rule-one
  [?config <- Config]
  [?new <- Thing (= id nil)
   (= ?var (get-in data [(:k ?config)]))]
  [:not [Thing (not= id nil)
         (= ?var (get-in data [(:k ?config)]))]]
  ;; This is problematic
  [:not [ContainerType (= ?new thing)]]
  =>
  (println "Found with k=" (:k ?config) ?new))

(defn gooo
  []
  (-> (mk-session :cache false)
      (insert  (->Config 1)
               ;; Things are different (at vector position 1),
               ;; this should print something
               (->Thing nil [ 1  4  5 ])
               (->Thing 131 [ 1  2  5 ])
               ;; Add a completely unrelated ContainerType
               ;; with a different Thing inside
               (->ContainerType (->Thing 232 [8 6 4]))
               )
      (fire-rules)) nil)

So as you can see I try to unify ?new, my match, with the contents of a ContainerType. I added a completely unrelated ContainerType and suddenly this prevents my rule from firing (apparently for all values of ContainerType ?)

I guess this fails is because I'm using ?new in another condition ?

@rbrush
Copy link
Contributor

rbrush commented Jan 27, 2016

Ugh, looks like an edge case I didn't cover in the above fix. I'll get this taken care of either tonight or tomorrow.

These are hitting a number of complex join flows in negations that we obviously haven't exercised well. I really appreciate the simple reproducing examples and your patience as we work through them.

@pguillebert
Copy link
Author

No worries, I understand this use case is not necessarily the most trivial :)

@rbrush
Copy link
Contributor

rbrush commented Jan 28, 2016

Alright, the above commit addresses the latest issue described here. As a byproduct it is more aggressive about detecting variable bindings in negations that are not visible elsewhere and therefore don't make sense. That's probably a good thing, but if such unusable bindings are in existing code an exception may now be seen.

Hopefully there are no further edge cases I'm overlooking! I'm going to spend some more time making sure, but wanted to get this available at least as a draft.

@pguillebert
Copy link
Author

Congrats, it solves my issue with the failing test. 👍

@rbrush
Copy link
Contributor

rbrush commented Jan 29, 2016

And there was much rejoicing! Merged into master and targeted to 0.10. I might add some further tests for other use cases (like deeply nested rules that follow such patterns), but this is a good step as it is.

@rbrush
Copy link
Contributor

rbrush commented Feb 4, 2016

Closed in preparation for the 0.10 release.

@rbrush rbrush closed this as completed Feb 4, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants