Is there a way to list all the facts? #48

Closed
tbatchelli opened this issue Mar 21, 2014 · 12 comments
@tbatchelli

During development, it is important to have some introspection into what the Rete network is doing as a result of facts and rules. The only way I have found to introspect the network is via queries, which requires adding new queries and is not very interactive.

Is there a way to list all the facts, the rules that fired, and why they fired? If not, consider this a request for adding such features. I am reading the paper on which this project is based to see if I can give it a try.

@tbatchelli tbatchelli changed the title Is there a way to list all the facts Is there a way to list all the facts? Mar 21, 2014
@rbrush
Contributor

rbrush commented Mar 22, 2014

This isn't available right now, but is something I'd like to add. A relatively simple addition would be an "explain" function, which given a rule could list all pending activations and all supporting facts for that rule. For a query it would list all items that could be queried by it, and all supporting information that caused a query to match.

Ultimately I'd like to expose the entire rete network and its contents for inspection tools, but this would be more complicated to build and use.

Did you have some use cases in mind you're looking to tackle? Are you generally wanting to understand why your rule base is behaving in a certain way, or is there something more specific?

Thanks!

-Ryan

@tbatchelli
Author

The use case I care about is to both validate when things work as expected and provide some insight when things don’t.

If I were to put myself in the place of a Clara user who didn’t understand RETE, I would need to know which facts were deduced from the rule set and fact set, and why. Even better, I’d like to see how the outcome would change if I added or modified one rule.

The explain function that you mention seems to only cover one rule at a time?

I did some exploration of the rete network, and I wasn’t sure at the time that all the information needed for the goal above is retained. For example: are all generated facts retained, or are the ones for which there isn’t a query ‘lost’?

@rbrush
Contributor

rbrush commented Mar 25, 2014

Clara doesn't maintain a reference to facts that don't match any criteria for any rule. This is done on purpose, for a use case where I'm getting an arbitrarily long sequence of events and I need non-matching items to be garbage collected. However, we can get a list of facts that match any criteria (even if they don't match the whole rule) and return the nodes in the Rete network that match those facts. If you need to keep track of facts that don't match any criteria at all, it's probably necessary to keep them in a list external to Clara's working memory.

I think the best way to approach this need might be to expose the entire working memory and the Rete network itself as a data structure, which we'd define with Prismatic Schema. Users could then write arbitrary functions that inspect that structure, and turn it into a displayable form for troubleshooting. I could imagine a table of every condition for every rule, and the set of facts that match that condition.

I should have some time in the next few days to hack around with this. Ultimately I'd love to render the Rete network in a browser so one can click on each node and see its contents...but just getting everything visible as a data structure would be a step in that direction, and allow simpler means of inspection as well.

Let me know if you have any further thoughts on how we might make this easier for users. Most rule engines don't do a great job of explainability, so I'd love to do something better here.

@tbatchelli
Author

Having a catch-all rule that matches all facts would solve my use case, I believe, as this is something only needed at development time, but not for production. If I understand correctly, this would require no changes to the current implementation, right?

Exposing the rete network as a data structure would be fantastic. From this we could then iterate on what makes it usable for the non-developer user.

In a way, I think it’s worth considering a development/production duality, where at development time metadata about who, what, where and why is kept around, whereas in production this info is not kept. On this note, a different approach could be to log all firings for a ‘session’ somewhere (a log file, data, a channel, etc.), with each log entry containing enough context to follow the RETE network, alongside the rete-as-data model you suggested. If you have the RETE network as data, including the node names, and the logs, then you can go through the network's firings step by step and understand what happened (as opposed to keeping data around in the nodes themselves), or even build a graphical tool that replays the log without necessarily running the network.
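
To make the log-replay idea concrete, here is a minimal sketch in plain Clojure. It does not use any Clara API: the event shape (`:op`, `:fact`, `:rule`) and the names `apply-event` and `replay` are assumptions made up for illustration, not something Clara provides.

```clojure
;; Hypothetical log entries: each records an operation on working memory,
;; the fact involved, and the rule (if any) that caused it.
(def fact-a {:type :fact :value :a})
(def fact-b {:type :fact :value :b})

(def sample-log
  [{:op :insert  :fact fact-a :rule nil}        ; inserted by the user
   {:op :insert  :fact fact-b :rule 'a-then-b}  ; inserted by a rule firing
   {:op :retract :fact fact-a :rule nil}])

(defn apply-event
  "Apply one log entry to a set of facts and return the new set."
  [facts {:keys [op fact]}]
  (case op
    :insert  (conj facts fact)
    :retract (disj facts fact)))

(defn replay
  "Return the sequence of working-memory states, starting from the empty
  set, one state per log entry."
  [events]
  (reductions apply-event #{} events))

(def states (replay sample-log))
```

Because `replay` only folds over data, a tool could step through `states` one entry at a time without running the rule engine at all.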


@rbrush
Contributor

rbrush commented Mar 25, 2014

Haha, of course you're right that a catch-all rule or query that matches all facts would work if you want to enumerate everything. Not sure why that didn't occur to me.

I think we can add more development-time instrumentation as well, such as the change log you suggest. This is related to an idea way back in issue #16 that I haven't tackled yet, where changes to the working memory could be logged as a mechanism for fault recovery (essentially a write-ahead log of rule state changes). That log would also be replayable to see the state transitions of the rule engine, so it could be useful as a development tool. I hadn't touched that one in some time due to competing priorities but might dust it off.

In any case, I'm going to move forward with the rete-as-data idea. Just exposing it as a data structure shouldn't be terribly involved, so hopefully I can get something committed in the next few days.

rbrush pushed a commit that referenced this issue Mar 28, 2014
@rbrush
Contributor

rbrush commented Mar 28, 2014

Took an initial stab at this in the above commit. The clara.tools.inspect namespace will contain tools for inspecting the contents of a rule session. Right now it's just an inspect function that returns a map with the following keys:

  • :rule-matches -- a map of rule structures to their matching Rete tokens.
  • :query-matches -- a map of query structures to their matching Rete tokens.
  • :condition-matches -- a map of conditions pulled from each rule to facts they match.

The Rete tokens are simply a list of facts and bound variables that caused the rule to fire. That should help explain the behavior. The condition-matches would help troubleshooting non-firings, by seeing exactly what conditions were matched (and implicitly, which ones didn't) for a rule.
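
As a rough illustration of troubleshooting a non-firing with this kind of data, the sketch below works on plain maps shaped like the description above. It does not call Clara: the sample data, the exact shape of the conditions, and the helper name `unmatched-conditions` are assumptions for illustration, and the real `inspect` output may differ (for example, in how unmatched conditions are represented).

```clojure
;; Hypothetical conditions, shaped like the :lhs entries of a rule.
(def cond-a {:type :fact :constraints '[(= val :a)]})
(def cond-b {:type :fact :constraints '[(= val :b)]})

;; Hypothetical rule structures and a :condition-matches style map
;; (condition -> facts that matched it).
(def sample-rules
  [{:name "a-then-d"       :lhs [cond-a]}
   {:name "a-and-b-then-c" :lhs [cond-a cond-b]}])

(def sample-condition-matches
  {cond-a [{:type :fact :value :a}]})

(defn unmatched-conditions
  "For each rule, return the conditions from its :lhs that have no facts
  recorded in condition-matches."
  [rules condition-matches]
  (into {}
        (for [{:keys [lhs] rule-name :name} rules]
          [rule-name (vec (remove #(seq (get condition-matches %)) lhs))])))

(def report (unmatched-conditions sample-rules sample-condition-matches))
```

In this sample, `a-and-b-then-c` is reported with `cond-b` as its only unmatched condition, which is the kind of hint that explains why a rule did not fire.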

We can expose more internals of the rete network itself as well. That will be somewhat more complicated, but given in tandem with functions that walk or inspect it could support more sophisticated tooling.

Keeping this in a branch for now as we kick around the approach some more, but if this seems close as is we can merge into master soon.

@tbatchelli
Author

I was finally able to work on it and I have some results to share and a question.

I made a function to print out the rule-matches. So given the following rule set and facts:

(defn run-test-2 []
  (let [rules
        [(make-rule
          'a-then-b
          [:fact [{disc :disc val :value}] (and )]
          [:fact [{val :value}] (= val :a)]
          =>
          (insert! {:type :fact :value :b}))
         (make-rule
          'a-then-d
          [:fact [{val :value}] (= val :a)]
          =>
          (insert! {:type :fact :value :d}))
         (make-rule
          'a-and-b-then-c
          [:fact [{val :value}] (= val :a)]
          [:fact [{val :value}] (= val :b)]
          =>
          (insert! {:type :fact :value :c}))
         (make-query
          'get-all-the-facts []
          [?fact <- :fact])]
        print-all-the-facts!
        (fn [s]
          (doseq [{:keys [?fact]} (query s 'get-all-the-facts)]
            (println "value:" (:value ?fact)))
          s)]
    (-> (mk-session rules :cache false :fact-type-fn :type)
;;        (insert {:type :fact :value :a :disc 2}) ;; use a discriminator to distinguish between same facts
        (insert {:type :fact :value :a :disc 1})
        (fire-rules)
        (print-all-the-facts!))))

I get this output:

value: :a
value: :b
value: :d
value: :c
rule a-and-b-then-c
    executed:
    (clara.rules/insert! {:type :fact, :value :c})
    because these facts:
    {:type :fact, :value :b}
    {:disc 1, :type :fact, :value :a}
    matched:
    type=:fact and [(= val :a)] where [{val :value}]
    type=:fact and [(= val :b)] where [{val :value}]

rule a-then-d
    executed:
    (clara.rules/insert! {:type :fact, :value :d})
    because these facts:
    {:disc 1, :type :fact, :value :a}
    matched:
    type=:fact and [(= val :a)] where [{val :value}]

rule a-then-b
    executed:
    (clara.rules/insert! {:type :fact, :value :b})
    because these facts:
    {:disc 1, :type :fact, :value :a}
    matched:
    type=:fact and [(= val :a)] where [{val :value}]

print-rule-matches is defined as:

(defn rule-matches [s]
  (for [[k v] (:rule-matches (inspect s))
        {:keys [facts bindings]} v]
    (let [name (:name k)]
      [name {:rule (select-keys k [:rhs :lhs])
             :facts facts
             :bindings bindings}])))

(defn print-rule-matches [s]
  (let [matches (rule-matches s)]
    (doseq [[name {:keys [rule facts bindings]}]  matches]
      (let [{:keys [lhs rhs]} rule]
        (printf "rule %s\n    executed:\n" name)
        (printf "\t%s\n" rhs)
        (printf "    because these facts:\n")
        (doseq [fact facts]
          (printf "\t%s\n" fact))
        (printf "    matched:\n")
        (doseq [{:keys [type constraints args]} lhs]
          (printf "\ttype=%s and %s where %s\n" type constraints args))
        (println)))))

The outcome is pretty much what I had initially wanted.

The only remaining question that I have is whether I can identify which facts matched which conditions. If you look at the printout of rule a-and-b-then-c, there are two matching facts, but the order in which the facts are listed does not match the order in which the conditions are listed. This would be the last thing needed to provide a full explanation, imho, and therefore we'd be very close to the initial goal :)

NOTES:

  1. Notice that, as you suggested, having a catch-all rule keeps all the facts around.
  2. I use facts defined as maps to get around recompilation issues.
  3. make-rule and make-query are just non-def versions of defrule and defquery.

@rbrush
Contributor

rbrush commented Apr 2, 2014

Nice! This looks like it will be really useful. We might have to pull your print-rule-matches function into Clara itself once we get this ironed out. You have me wanting to use it.

I think we can connect the matched facts with their conditions, but it will require a bit more information to be propagated through the Rete network. Right now the tokens we send through the network (terminating at the production nodes) simply contain a sequence of facts, so there isn't a direct way to get to the matching conditions. However, we can enhance those tokens to keep a map of {condition fact}, which can easily be inspected by tooling. Functions like print-rule-matches could then simply iterate through the {condition fact} map and show exactly how things were matched.

I'll try to spend some time on this tomorrow. Fun stuff!

@rbrush
Contributor

rbrush commented Apr 4, 2014

Alright, in the above commit I included the matching condition in the token. I basically replaced the :facts field in the token with a :matches field, which is a sequence of [fact, condition] tuples indicating which condition each fact matched.
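
For readers following along, a sequence of [fact, condition] tuples could be turned into explanation lines roughly as follows. This is a sketch over plain data, not Clara's own code: the sample tuples and the helper name `explain-matches` are hypothetical, and the condition keys (`:type`, `:constraints`) are taken from the print-rule-matches function earlier in this thread.

```clojure
;; Hypothetical :matches value: each entry pairs a fact with the
;; condition it satisfied.
(def sample-matches
  [[{:type :fact :value :b}
    {:type :fact :constraints '[(= val :b)] :args '[{val :value}]}]
   [{:type :fact :value :a :disc 1}
    {:type :fact :constraints '[(= val :a)] :args '[{val :value}]}]])

(defn explain-matches
  "Return one human-readable line per [fact condition] tuple."
  [matches]
  (for [[fact condition] matches]
    (str (pr-str fact)
         " matched type=" (:type condition)
         " and " (pr-str (:constraints condition)))))

;; Print the explanation, one line per matched fact.
(doseq [line (explain-matches sample-matches)]
  (println line))
```

Since each fact travels with its own condition, the output no longer depends on the order in which facts and conditions happen to be listed.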

I haven't handled this well for accumulators yet, but it works for simple conditions. If you check out the test-inspect unit test you can see it in action. The test just compares the condition to the one in the rule to validate it's correct, but of course an explain function would grab the condition and display it accordingly.

One note of caution: I'm probably going to change/move the :cmeta part of the condition for unrelated reasons soon, so access to that might break, but the other parts of the condition should be usable.

@rbrush
Contributor

rbrush commented Apr 5, 2014

I went ahead and merged these changes into master since I think they're a step forward as they are and I wanted to use them in unrelated work. We can continue to make further improvements on this issue, though. (Normally I'd log a separate issue, but I think we can be a bit less formal here since it's a small project.)

@rbrush rbrush added this to the 0.5.0 milestone May 11, 2014
@rbrush
Contributor

rbrush commented May 11, 2014

The above commit adds human-readable explanations to the clara.tools.inspect namespace. I'm probably going to go ahead and close this in preparation for an 0.5.0 release. We can open additional issues if there are other features that are desired.

@rbrush
Contributor

rbrush commented May 11, 2014

Closing this for the 0.5 release as discussed above. Feel free to open additional issues for anything else that would be nice in this space.

@rbrush rbrush closed this as completed May 11, 2014