Research Notes - Analogy comparison experiment #7

Open
dnorman opened this issue Jun 12, 2020 · 0 comments


dnorman commented Jun 12, 2020

Following on Notes #1 and #3 ...

Within the analogy_compare experiment, I have managed to get Analogy querying working, such that a candidate analogy may be tested against an AnalogyQuery struct.

This now successfully provides the correct results:

fn experiment() {
    // In this experiment, we are approximating the following MBQL
    // $x = Bind("Hot")
    // $y = Ground($x : "Cold")

    let mut x = Symbol::null();
    let mut y = FuzzySet::new();

    // For simplicity, let's say these are all the analogies in the system
    let candidates = [//
                      Analogy::from_left_right("a1", sym!["Hot1", "Hot2", "Heated1"], sym!["Mild1", "Mild2", "Cold3"]),
                      Analogy::from_left_right("a2", sym!["Hot3"], sym!["Cold1", "Cold2"]),
                      Analogy::from_left_right("a3", sym!["Cold3"], sym!["Hot3"])];

    // Imagine we looked up the ClaimIDs of all Claims related to the Artifacts "Hot" and "Cold".
    // query is an AnalogyQuery struct, which contains a FuzzySet<AnalogyMember>.
    // from_left_right constructs an AnalogyQuery with one AnalogyMember per ClaimID ("Hot1", "Hot2", etc.)
    let query = AnalogyQuery::from_left_right(sym!["Hot1", "Hot2", "Hot3"], sym!["Cold1", "Cold2", "Cold3"]);
    println!("Query is: {}", query);

    for candidate in &candidates {
        let v: FuzzySet<analogy::AnalogyMember> = candidate.interrogate(&query).expect("All of the above should match");
        println!("v is {}", v);
        x.set.union(v.left());
        y.union(v);
    }

    println!("symbol x is: {}", x);
    println!("symbol y is: {}", y);
}

This renders the following:

Query is: [Hot1~1.00, Hot2~1.00, Hot3~1.00 <-> Cold1~1.00, Cold2~1.00, Cold3~1.00]
v is [Hot1~0.33, Hot2~0.33 <-> Cold3~0.67]
v is [Hot3~0.67 <-> Cold1~0.33, Cold2~0.33]
v is [Hot3~0.33 <-> Cold3~0.33]
symbol x is: {Hot1~0.33, Hot2~0.33, Hot3~0.67}
symbol y is: [Hot1~0.33, Hot2~0.33, Hot3~0.67 <-> Cold1~0.33, Cold2~0.33, Cold3~0.67]

When both sides of the candidate analogy match the query to at least some degree, we include the matching terms from each side, but we scale their degrees within the output set by the degree of the match on the opposite side.

With that in mind, we can see that the first candidate Analogy matches 2/3 of the members on the left side of the AnalogyQuery (Hot1 and Hot2) and 1/3 of the members on the right side (Cold3). We therefore scale the matching members of the right by the degree of the match on the left (2/3), and the matching members of the left by the degree of the match on the right (1/3), yielding [Hot1~0.33, Hot2~0.33 <-> Cold3~0.67].

(Of course, if either side of the AnalogyQuery has a zero degree of matching, then the candidate Analogy is fully rejected, and there is no resultant FuzzySet for that candidate.)
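
The scaling rule described above can be sketched in a self-contained way. This is only an illustration under stated assumptions, not the code from the analogy_compare experiment: `Side`, `FuzzySide`, `side`, `overlap`, `match_oriented` and this free-standing `interrogate` are hypothetical names, sides are treated as crisp sets of ClaimIDs, the degree of a match is taken as the fraction of the query side that was matched, and a candidate that fails in its stated orientation is retried flipped (which is how the third candidate appears to match in the output above).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical stand-in for one side of an analogy: a crisp set of ClaimIDs.
type Side = BTreeSet<&'static str>;
/// Hypothetical stand-in for one side of the output: ClaimID -> degree.
type FuzzySide = BTreeMap<&'static str, f64>;

fn side(ids: &[&'static str]) -> Side {
    ids.iter().copied().collect()
}

/// Returns the members shared with the query side, and the fraction of the
/// query side they cover (assumption: this fraction is the "degree of match").
fn overlap(candidate: &Side, query: &Side) -> (Vec<&'static str>, f64) {
    if query.is_empty() {
        return (Vec::new(), 0.0);
    }
    let hits: Vec<&'static str> = candidate.intersection(query).copied().collect();
    let degree = hits.len() as f64 / query.len() as f64;
    (hits, degree)
}

/// Matches one orientation. Each output member takes the degree of the match
/// on the OPPOSITE side; a zero match on either side rejects the candidate.
fn match_oriented(
    c_left: &Side,
    c_right: &Side,
    q_left: &Side,
    q_right: &Side,
) -> Option<(FuzzySide, FuzzySide)> {
    let (left_hits, left_degree) = overlap(c_left, q_left);
    let (right_hits, right_degree) = overlap(c_right, q_right);
    if left_hits.is_empty() || right_hits.is_empty() {
        return None;
    }
    let left: FuzzySide = left_hits.into_iter().map(|id| (id, right_degree)).collect();
    let right: FuzzySide = right_hits.into_iter().map(|id| (id, left_degree)).collect();
    Some((left, right))
}

/// Tries the candidate as given, then flipped (assumption, see lead-in).
fn interrogate(candidate: (&Side, &Side), query: (&Side, &Side)) -> Option<(FuzzySide, FuzzySide)> {
    match_oriented(candidate.0, candidate.1, query.0, query.1)
        .or_else(|| match_oriented(candidate.1, candidate.0, query.0, query.1))
}

fn main() {
    let q_left = side(&["Hot1", "Hot2", "Hot3"]);
    let q_right = side(&["Cold1", "Cold2", "Cold3"]);
    let a1 = (side(&["Hot1", "Hot2", "Heated1"]), side(&["Mild1", "Mild2", "Cold3"]));

    let (left, right) = interrogate((&a1.0, &a1.1), (&q_left, &q_right))
        .expect("a1 matches both sides of the query");
    println!("{:?} <-> {:?}", left, right);
}
```

Under these assumptions, the first candidate gives Hot1 and Hot2 a degree of 1/3 and Cold3 a degree of 2/3, which agrees with the first `v is` line above.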

So, why do we want to scale output members within the set by the degree of the opposing side? Because an Analogy creates an associative relationship between the left Symbol and the right Symbol (both of which are themselves FuzzySets, at least before we convert them into a "sided" FuzzySet within the Analogy).

We expressly lack pairwise relationships between members of these two FuzzySets, because each one represents some abstract concept. We are inferring the applicability of the right side based on the matching of the left side, and vice versa. It therefore stands to reason that such inference be scaled by the strength of the match of the opposing symbol.

A brief digression on Symbols:

A Symbol represents / abstracts some concept by constraining degrees of freedom within a high-dimensional semantic space by means of its Members and their respective degrees.

When I conjure the abstract notion of a "Dog" within my mind, there is some ephemeral meaning which is constrained in its degrees of freedom based on an elaborate unspoken context. Maybe I was thinking of the fur, plus a specific dog from childhood, plus the abstract idea of kinship with another creature, plus several other dimensions which I would have difficulty articulating.

Even if we imagine that a "Perfect" brain-computer interface magically existed, the concept which I conjured would still occupy a more or less fuzzy region within semantic space which has been constrained to some degree. This is not a question of perfect rendering. While the region within said semantic space may be more sharply or dully partitioned, said boundary cannot ever be fully crisp. That we clamp that degree between 0.0 and 1.0 is merely an implementation detail rather than a representation of actually-perfect confidence or non-confidence.

Weighted union of candidate Analogies

One thing within the experimental code which is almost certainly wrong is the way unions are being performed across the output of each candidate Analogy interrogation.

We must explore a more appropriate means of composing these candidate Analogy interrogation outputs in a weighted fashion, rather than simply taking the maximum degree of each discrete matching member into the final output FuzzySet.

The maximum-degree union (which the current code uses) is problematic: it lets Members from a small subset of candidate Analogies with a high degree of matching compete on equal footing with Members from a corpus of thousands with a low degree of matching, which is likely not what we want.

However, we also don't want to attenuate the signal of such a well-matching subset of candidate Analogies, as a simple weighted score would. Presumably some middle ground must be found that balances these considerations: neither a simple weighted score nor a maximum-degree FuzzySet union.

For the time being, I will call this the FuzzySet-union signal-to-noise ratio problem.
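
To make the two extremes concrete, here is a minimal sketch that composes the same interrogation outputs both ways. It does not propose a solution to the problem. `Degrees`, `max_union` and `mean_union` are hypothetical names rather than part of the experiment's code, and "simple weighted score" is interpreted here, as an assumption, as the mean degree across all candidate outputs, with absent members counting as zero.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a FuzzySet: member -> degree in 0.0..=1.0.
type Degrees = BTreeMap<String, f64>;

/// Maximum-degree union: each member keeps its highest degree across all
/// outputs, regardless of how many candidates produced it.
fn max_union(outputs: &[Degrees]) -> Degrees {
    let mut acc = Degrees::new();
    for out in outputs {
        for (member, &degree) in out {
            let slot = acc.entry(member.clone()).or_insert(0.0);
            *slot = (*slot).max(degree);
        }
    }
    acc
}

/// Naive weighted score (assumption): the mean degree over ALL outputs,
/// so a member absent from an output contributes zero for that output.
fn mean_union(outputs: &[Degrees]) -> Degrees {
    let mut acc = Degrees::new();
    if outputs.is_empty() {
        return acc;
    }
    for out in outputs {
        for (member, &degree) in out {
            *acc.entry(member.clone()).or_insert(0.0) += degree;
        }
    }
    let n = outputs.len() as f64;
    for degree in acc.values_mut() {
        *degree /= n;
    }
    acc
}

fn main() {
    // One well-matching candidate output, plus a thousand weakly matching ones.
    let mut outputs = vec![Degrees::from([("Strong".to_string(), 0.9)])];
    for _ in 0..1000 {
        outputs.push(Degrees::from([("Weak".to_string(), 0.1)]));
    }

    println!("max:  {:?}", max_union(&outputs));
    println!("mean: {:?}", mean_union(&outputs));
}
```

In this made-up scenario the maximum-degree union ignores how many candidates back each member, while the mean pushes the single strong match to nearly zero. Any eventual approach would presumably have to sit somewhere between the two.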

dnorman added a commit that referenced this issue Jun 12, 2020
…ms to work correctly. Identified an open question about the manner of unifying these outputs into the final output symbols, as described in #7
@dnorman dnorman added this to Research in ankurah Jun 18, 2020