BackwardChainer: working with stricter assumption from AtomSpace by williampma · Pull Request #241 · opencog/atomspace

williampma · 2015-08-28T10:05:42Z

changes to make BC works with some strict assumption (not yet satisfied) from the AtomSpace
- all usage of VariableNode are unique & well-defined, in that the same named VariableNode is never declared and appears in more than one scope
- atoms like (SatisfyingSetLink (VariableNode $X) (humans eat $X)), (SatisfyingSetLink (VariableNode $Y) (humans eat $Y)) are treated as exactly the same atom
This allows me to remove hacky codes that were causing all sort of contradictions in terms of VariableNode, allows me to cleanly unify to the output of rules like member-to-evaluation-rule, etc. In general, BC probably won't work if the above 2 points are not satisfied.

However, there're still a little bit of work-around codes left. eg. right now all the rules' BindLink share same named VariableNode everywhere; PM will still return (SatisfyingSetLink (VariableNode $X) (humans eat $X)), (SatisfyingSetLink (VariableNode $Y) (humans eat $Y)) as separate results.

FC will also needed to be changed so that rules like https://github.com/opencog/opencog/blob/master/opencog/reasoning/pln/rules/evaluation-to-member-rule.scm will output unique VariableNode inside the output SatisfyingSetLink between steps.
optimize rule selection + unification bases on recent discussion
and other stuff

- when there's no free vars in the atom

- Need to always allow non-var VariableNode to match to any-name VariableNode

- Since BC should be working on the assumption that each VariableNode has one unique scope (ie. the same VariableNode should never appear on more than one scope)

- select rule first before unification, and save the unification result

BackwardChainer: working with stricter assumption from AtomSpace

linas · 2015-08-28T16:34:40Z

OK, so .. treating two different atoms as the same, because they differ only by a re-labelling of variables is called "alpha conversion". Ben started talking about this here: opencog/opencog#1742 I recommend that alpha-conversion be done using some sort of independent data-scrubber run periodically, or maybe run when an atom is added to the atomspace; it should be an independent unit of code, applying some data-scrubbing rules. It should not be hard-coded into the guts of things, as that will cause other troubles.

Also: you can only do alpha-conversion only if there are no conflicts with other variables. So, for example, you can convert (SatisfyingSetLink (VariableNode $X) (humans eat $X)) to (SatisfyingSetLink (VariableNode $Y) (humans eat $Y)) but ONLY if it doesn't appear inside of some other atoms that is already using VariableNode $Y because then you would accidentally have two different variables being given the same name.

bgoertzel · 2015-08-28T16:39:09Z

OK, so .. treating two different atoms as the same, because they differ

only by a re-labelling of variables is called "alpha conversion". Ben
started talking about this here: opencog/opencog#1742
opencog/opencog#1742 I recommend that
alpha-conversion be done using some sort of independent data-scrubber run
periodically, or maybe run when an atom is added to the atomspace; it
should be an independent unit of code, applying some data-scrubbing rules.
It should not be hard-coded into the guts of things, as that will cause
other troubles.

My suggestion is that alpha-conversion be done whenever a variable-scoping
link is added to the Atomspace

Also: you can only do alpha-conversion only if there are no conflicts with
other variables. So, for example, you can convert (SatisfyingSetLink
(VariableNode $X) (humans eat $X)) to (SatisfyingSetLink (VariableNode
$Y) (humans eat $Y)) but ONLY if it doesn't appear inside of some other
atoms that is already using VariableNode $Y because then you would
accidentally have two different variables being given the same name.

Hmmm, is this really a problem? In OpenCog we can assume VariableNode
names are unique; i.e. for each VariableNode named x, only one scoping link
will bind a variable named x...

linas · 2015-08-28T17:15:08Z

You may "intellectually" consider that each scoped variable node is unique, but nothing currently prevents users from using the same variable name in different scoped expressions. so e.g. people could write: (SatisfyingSetLink (VariableNode $x) (humans eat $x)) and also (BindLink (AndLink (humans eat $x) (InheritanceLink $x $y) ) and so now you have $x bound up in two different expressions. alpha-converting $x to $y is safe for the first expr, but would destroy the second one.

We can solve this second problem like so: Make every scoped link derive from LambdaLink. Currently, when you construct a LambdaLink, it runs some C++ code to figure ouot where the variables are, what the body is; some general pre-processing, so that later users can get fast access to the variables. What we could do in this C++ code is to create new, unique (random) variable names, go through the body, and paste those in. Now, you cannot accidentally bind the same variable in two different expressions.

This raises two issues:
(1) it still doesn't solve the alpha-equivalence problem. you could still have multiple alpha-equivalent links in the atomspace.

(2) There's a practical problem: whoever is creating the LambdaLink might get tripped up by the changes. I can't think of a convincing example right now, but these come up in the code, where old code assumes that A is done before B but newer code has to twiddle the contens of B that A doesn't know about, and it can be messy/tricky/confusing. I'm hitting an issue like that right now, with DefineLink, where I need to use variables before I know what they are. Yuck. Solvable, but a time-sink.

bgoertzel · 2015-08-28T17:26:16Z

You may "intellectually" consider that each scoped variable node is unique,
but nothing currently prevents users from using the same variable name in
different scoped expressions.

true. But the conclusion William and I came to, after significant
hand-wringing, was that something SHOULD prevent users from doing this...

so e.g. people could write: (SatisfyingSetLink (VariableNode $x) (humans
eat $x)) and also (BindLink (AndLink (humans eat $x) (InheritanceLink $x
$y) ) and so now you have $x bound up in two different expressions.

Right. We should not let this happen.

We can solve this second problem like so: Make every scoped link derive
from LambdaLink. Currently, when you construct a LambdaLink, it runs some
C++ code to figure ouot where the variables are, what the body is; some
general pre-processing, so that later users can get fast access to the
variables. What we could do in this C++ code is to create new, unique
(random) variable names, go through the body, and paste those in. Now, you
cannot accidentally bind the same variable in two different expressions.

Basically, at the time a new scoping link is created, we want to both

-- enforce alpha-equivalence

-- enforce the rule that each variable is bound only by a single scoping
link

If the easiest way to do this involves making all the scoping links inherit
from LambdaLink, that's certainly fine...

(2) There's a practical problem: whoever is creating the LambdaLink might

get tripped up by the changes. I can't think of a convincing example right
now, but these come up in the code, where old code assumes that A is done
before B but newer code has to twiddle the contens of B that A doesn't know
about, and it can be messy/tricky/confusing. I'm hitting an issue like that
right now, with DefineLink, where I need to use variables before I know
what they are. Yuck. Solvable, but a time-sink.

Hmmm....

One could make a function "get_new_variable_name" that gets a new
VariableNode name known not to be in the Atomspace already (care would have
to be take to cover the case that this is called in multiple threads
concurrently...). Then the workflow to follow when creating a
variable-bearing Atom structure would be:

-- first call get_new_variable_name to get new, unique variable names for
the variables to be used

-- then create the scoping link that scopes these variable names

This avoids ever changing the name of a VariableNode. But I'm not sure if
it would solve the problem you're alluding to, which I don't fully
understand

Anyway, it seems we need to

-- enforce alpha-equivalence

-- enforce the rule that each variable is bound only by a single scoping
link

to make backward chaining work in a non-horrible way...

ben

linas · 2015-08-28T17:51:10Z

OK. I guess you can do the alpha-equivalence check when an atom is added to the atomspace. You'd have to run the pattern matcher, see if there already is some other alpha-equivalent lambda-link in the atomspace, and return that.

The only problem I see is a stupid but nasty shared-library problem, where the atomspace needs the pattern matcher needs the atomspace etc. and CMake pukes on this. It has been very very painful. As far as I can tell, CMake enforces this because its a MS Windows limitation, since I think it would work OK with glibc. So far, I've been able to hack around this, but its ... fragile.

pennachin · 2015-08-29T00:56:07Z

Hi,

On Sat, Aug 29, 2015 at 1:26 AM, bgoertzel notifications@github.com wrote:

You may "intellectually" consider that each scoped variable node is unique,

but nothing currently prevents users from using the same variable name in
different scoped expressions.

true. But the conclusion William and I came to, after significant
hand-wringing, was that something SHOULD prevent users from doing this...

Maybe we could do what Lisp does for macros, where (gensym) is used to
automatically create a new, unique, symbol name, which can be reused within
that expression. So we'd have some syntactic sugar that lets people express
their wish for a new variable without having to name it. In scheme, this
would be straightforward, and it prevents the need for expensive checks
(what happens if $x isn't in RAM, but is in backing store, etc?)

Cassio

Cassio Pennachin

ngeiswei · 2015-08-31T07:04:55Z

Guys, maybe I'm missing something here, but if all variables are scoped (and I mean explicitly scoped, not like today). Then no need to constraint the user to have different variable names. Really it's a hard constraint to satisfy, you can't ask user A and user B working on 2 separate component of the same system not to use the same variable names. Neither can you have the user enter that

LambdaLink
   VariableNode "$X"
   PlusLink
     VariableNode  "$X"
     NumberNode 1

and have opencog rename that behind his back and display

LambdaLink
   VariableNode "$opencog-variable-h34ea9b"
   PlusLink
     VariableNode  "$opencog-variable-h34ea9b"
     NumberNode 1

the next time he wants to look it up.

I understand that, as William pointed out, you cannot know in advance whether a variable is gonna be scoped of not (cause before adding the scope link you need to add the variable). So let's make everything variable use explicitly scoped, if it's not then, it's got to be a global one, cause there would be no such thing as a local unscoped variable.

Alpha-equivalence should be short live and transparently performed for the user, the result of an alpha-convertion should only live in the RAM of the execution mechanism, not in the AtomSpace.

Do you agree?

linas · 2015-08-31T19:46:54Z

Nil,

I agree, except that "explicit scoping" has nothing to do with it. The problem is that the scoping is not being honored. If some variable is implicitly scoped, and the code does not respect that, and gloms it anyway, then, yes, bad things will happen. Its exactly the same bad things that would happen if it was explicitly scoped. So, the answer is if a variable is scoped, explicitly or implicitly, then code should honor that scoping!

Examples where scoping is not being honored would make this conversation easier, though. I'm guessing that these show up in the chainers, and they might be hard to deal with. I can't tell from here.

williampma · 2015-09-01T03:08:53Z

I am nervous coming up with an example, since I am still kind of confused.

Here is one perhaps:

Given

EvaluationLink
   eat
   ListLink
      humans
      birds

applying evaluation-to-member-rule can give (assuming variables naming are not unique yet)

MemberLink
   humans
   SatisfyingSetLink
       $X
       EvaluationLink
           eat
           ListLink
                $X
                birds
MemberLink
   birds
   SatisfyingSetLink
       $X
       EvaluationLink
           eat
           ListLink
                humans
                $X

Now, applying evaluation-to-member-rule again, it will find the EvaluationLink inside these scoped one and produce

MemberLink
   birds
   SatisfyingSetLink
       $X
       EvaluationLink
           eat
           ListLink
                $X
                $X

One of the $X in the internal EvaluationLink is actually from a different scope, but we cannot tell. With different variable naming we see

MemberLink
   birds
   SatisfyingSetLink
       $Y
       EvaluationLink
           eat
           ListLink
                $X
                $Y

ie. this new MemberLink is actually living inside the scope of

   SatisfyingSetLink
       $X
       EvaluationLink
           eat
           ListLink
                $X
                birds

from within the top-most MemberLink.

williampma · 2015-09-01T03:14:20Z

Sorry, messed up the example. Fixed (I think).

bgoertzel · 2015-09-01T03:15:20Z

William --

yeah, without unique variable names in the Atomspace, what needs to be done
here is a step of "alpha conversion" --- whenever an inference step is
done, unique names would need to be given to all the variables involved,
for use within that inference... to avoid this sort of confusion

Or, enforcing uniqueness of variable names in the Atomspace avoids the
confusion...

ben

On Tue, Sep 1, 2015 at 11:08 AM, William Ma notifications@github.com
wrote:

I am nervous coming up with an example, since I am still kind of confused.

Here is one perhaps:

Given

EvaluationLink
eat
ListLink
humans
birds

applying evaluation-to-member-rule can give (assuming variables naming are
not unique yet)

MemberLink
humans
SatisfyingSetLink
$X
EvaluationLink
eat
ListLink
$X
birds
MemberLink
birds
SatisfyingSetLink
$X
EvaluationLink
eat
ListLink
humans
$X

Now, applying evaluation-to-member-rule again, it will find the
EvaluationLink inside these scoped one and produce

MemberLink
$X
SatisfyingSetLink
$X
EvaluationLink
eat
ListLink
$X
$X

One of the $X in the internal EvaluationLink is actually from a different
scope, but we cannot tell. With different variable naming we see

MemberLink
$X
SatisfyingSetLink
$Y
EvaluationLink
eat
ListLink
$X
$Y

ie. this new MemberLink is actually living inside the scope of

SatisfyingSetLink
$X
EvaluationLink
eat
ListLink
$X
birds

from within the top-most MemberLink.

—
Reply to this email directly or view it on GitHub
#241 (comment).

Ben Goertzel, PhD
http://goertzel.org

"The reasonable man adapts himself to the world: the unreasonable one
persists in trying to adapt the world to himself. Therefore all progress
depends on the unreasonable man." -- George Bernard Shaw

linas · 2015-09-01T05:00:16Z

All that Nil was saying is that, for the second rule application, you cannot use $X again; you must not paste such a thing into the inside of an expression where $X is already bound (implicitly or explicitly). Basically, when you are converting an expression using a rule of some kind, you must scan the entire expression for variables, and then pick a variable name that does not yet occur inside that expression.

williampma · 2015-09-01T05:31:35Z

That's not the only problem. Even with new var name as I did at the end with

MemberLink
   birds
   SatisfyingSetLink
       $Y
       EvaluationLink
           eat
           ListLink
                $X
                $Y

You cannot find out where $X is scoped because $X is under two different scope. So the original creation of them need to have different name.

Also, with rule's output using new var name, it will create a bunch of same atom (SatisfyingSetLink (VariableNode $X) (humans eat $X)), (SatisfyingSetLink (VariableNode $Y) (humans eat $Y)), hence the "alpha conversion" idea.

There are other ideas that I have floating in my mind (like never allow PM to match to stuff already inside a scope, or make the output somehow insert the new MemberLink into the other scoping link...), but I think the restriction mentioned at the top seems cleanest.

linas · 2015-09-01T05:59:17Z

Hmm. I guess I don't understand the problem. When you write

MemberLink
   birds
   SatisfyingSetLink
       $Y
       EvaluationLink
           eat
           ListLink
                $X
                $Y

the $X is clearly free, since its not explicitly scoped, and there is nowhere where it can be implicitly scoped. What's wrong with $X being free? It seems to say "birds belong to the set of things that get eaten", and that seems like a valid statement that can be assigned a plausible TV.

But I'm just saying things; I don't quite understand the "big picture" that is being pursued here.

If you want to scope both at the same time, you could stack one on the other, or as Nil suggested, do them in "parallel":

MemberLink
   ListLink 
       humans
       birds
   SatisfyingSetLink
       $X $Y
       EvaluationLink
           eat
           ListLink
                $X
                $Y

either way, stacking them, or doing them in "parallel" would require some new kind of rule.

linas · 2015-09-01T06:01:55Z

You could even convert the "parallel" case to a stacked case using two different curry rules (for two different orders) and converted the stacked version to a parallel version with an uncurry rule.

williampma added 12 commits August 26, 2015 12:21

Allow non-var VariableNode to match to any VariableNode

299b575

SatisfyingSetLink is scoping

e8ef045

Add const to definition

3564dbe

Refactor code into new find_premises method

d50ae65

Allow something to match to itself

ea3cd25

- when there's no free vars in the atom

Moved general non-var matching to general PMCB

bcb0ea1

- Need to always allow non-var VariableNode to match to any-name VariableNode

Makes the variable on a rule's output completely unique

c57dc9c

Removed QuoteLink-ing the variable

a3f7520

- Since BC should be working on the assumption that each VariableNode has one unique scope (ie. the same VariableNode should never appear on more than one scope)

Update unit-test to reflect changes to var-name assumption

c278ca1

Allow Rule object to be UNDEFINED

465d932

Optimize rule selection + unification

db771cd

- select rule first before unification, and save the unification result

Fixes bug with standardized-apart

e73edf0

williampma changed the title ~~BackwardChainer: working with strictly assumption from AtomSpace~~ BackwardChainer: working with stricter assumption from AtomSpace Aug 28, 2015

williampma added a commit that referenced this pull request Aug 28, 2015

Merge pull request #241 from williampma/BCMisc2

d84ea31

BackwardChainer: working with stricter assumption from AtomSpace

williampma merged commit d84ea31 into opencog:master Aug 28, 2015

williampma deleted the BCMisc2 branch August 28, 2015 10:16

linas mentioned this pull request Aug 28, 2015

Loading VariableNodes from backing store opencog/opencog#1742

Closed

Conversation

williampma commented Aug 28, 2015

Uh oh!

linas commented Aug 28, 2015

Uh oh!

bgoertzel commented Aug 28, 2015

Uh oh!

linas commented Aug 28, 2015

Uh oh!

bgoertzel commented Aug 28, 2015

Uh oh!

linas commented Aug 28, 2015

Uh oh!

pennachin commented Aug 29, 2015

Cassio

Uh oh!

ngeiswei commented Aug 31, 2015

Uh oh!

linas commented Aug 31, 2015

Uh oh!

williampma commented Sep 1, 2015

Uh oh!

williampma commented Sep 1, 2015

Uh oh!

bgoertzel commented Sep 1, 2015

Uh oh!

linas commented Sep 1, 2015

Uh oh!

williampma commented Sep 1, 2015

Uh oh!

linas commented Sep 1, 2015

Uh oh!

linas commented Sep 1, 2015

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants