Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GMM identifiability problem #302

Closed
jacowp357 opened this issue Nov 20, 2020 · 11 comments
Closed

GMM identifiability problem #302

jacowp357 opened this issue Nov 20, 2020 · 11 comments

Comments

@jacowp357
Copy link

jacowp357 commented Nov 20, 2020

Hi,

My understanding is that there are K! equivalent solutions for K components in a Gaussian mixture model (known as the identifiability issue). In other words, if K=2:

  • component 0: N(-1,1), component 1: N(1,1) is the same solution as
  • component 0: N(1,1), component 1: N(-1,1).

In my Infer.NET application I need component 0 to always represent category 0 of the latent variable Z. I have tried initialising the prior mean of component 0 and component 1 to "enforce" this, but I think my data is too overwhelming. What other ways can I try to get this kind of behaviour from a GMM in Infer.NET? Maybe a small number of labels/observations for some Z nodes?

Screenshot 2020-11-20 at 12 06 26

@tminka
Copy link
Contributor

tminka commented Nov 20, 2020

Can you clarify what you mean by "component 0 to always represent category 0 of the latent variable Z". How do you define "component 0" other than "category 0 of the latent variable Z"?

@jacowp357
Copy link
Author

Hi Tom,

My apologies for not providing you with the entire model and more background. I assume my issue is related to the mixture model (top part of the graph below) and it reminded me of the "identifiability problem" (hope my drawing of the mixture is correct):

Screenshot 2020-11-23 at 13 52 59

I created a dataset that reproduces my issue, which is in the code below. The x observed is sampled from 2 weighted Gaussian distributions (0.9)N(-70,20) and (0.1)N(-120,20), where the former is suppose to be represented by state zero of the Bernoulli random variable sIsBad. In other words, higher x values should be good (sIsBad=0). But the results show the opposite. I'm wondering if this could be due to my data being imbalanced? Or can I create some kind of constraint to enforce that means[1] will represent N(-120,20)? My understanding is that this is what is happening in the using (var block = Variable.ForEach(rangeS)) below. Or should I create another Bernoulli layer between sIsBad and x with a noisy factor?

using System;
using Microsoft.ML.Probabilistic.Models;
using Microsoft.ML.Probabilistic.Models.Attributes;
using Microsoft.ML.Probabilistic.Distributions;
using Microsoft.ML.Probabilistic.Algorithms;
using Microsoft.ML.Probabilistic.Math;
using Microsoft.ML.Probabilistic.Compiler;
using Microsoft.ML.Probabilistic.Compiler.Transforms;
using Microsoft.ML.Probabilistic.Compiler.Visualizers;
using Range = Microsoft.ML.Probabilistic.Models.Range;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using System.Runtime.InteropServices;
using System.IO;

namespace TestInfer
{
    class Program
    {
        static void Main(string[] args)
        {
            int[][] edgeConnectionsObserved = new int[20][]
            {
                new int [] { 0, 2, 7, 10, 13, 16, 23, 25, 27, 29, 30, 35, 42, 46, 47 },
                new int [] { 1, 3, 7, 12, 14, 19, 22, 25, 27, 28, 29, 34, 38, 39, 45, 49 },
                new int [] { 0, 1, 4, 11, 13, 15, 29, 31, 33, 41, 43 },
                new int [] { 0, 4, 5, 8, 9, 11, 12, 15, 16, 18, 19, 23, 24, 27, 29, 30, 35, 42, 47 },
                new int [] { 0, 18, 19, 25, 29, 30, 32, 35, 36, 39, 40, 47 },
                new int [] { 5, 6, 8, 12, 25, 28, 39, 41, 48, 49 },
                new int [] { 1, 10, 11, 13, 14, 19, 27, 31, 39, 42, 44, 46, 49 },
                new int [] { 1, 2, 5, 9, 22, 24, 35, 38, 40, 42, 43, 45, 46, 47 },
                new int [] { 0, 3, 7, 15, 16, 18, 26, 27, 29, 34, 35, 45, 46, 47, 49 },
                new int [] { 1, 8, 15, 18, 30, 31, 34, 41, 45, 46, 48 },
                new int [] { 5, 8, 14, 15, 17, 24, 25, 26, 27, 34, 36, 39, 45 },
                new int [] { 3, 4, 9, 13, 16, 17, 18, 23, 29, 31, 36, 44, 46 },
                new int [] { 2, 3, 10, 14, 15, 16, 20, 26, 29, 30, 31, 34, 35, 36, 46 },
                new int [] { 0, 2, 9, 13, 14, 15, 20, 27, 31, 33, 40 },
                new int [] { 7, 8, 11, 18, 22, 27, 29, 39, 41, 42, 46, 48 },
                new int [] { 5, 10, 17, 19, 23, 24, 31, 42, 46, 47 },
                new int [] { 0, 9, 20, 23, 24, 26, 31, 34, 35, 40, 49 },
                new int [] { 5, 6, 13, 17, 29, 30, 33, 34, 35, 37, 39, 40 },
                new int [] { 8, 10, 11, 13, 15, 19, 25, 26, 27, 31, 34, 40, 42, 47 },
                new int [] { 0, 1, 3, 7, 13, 14, 18, 22, 24, 29, 30, 37, 38, 41, 46 }
            };

            int numSNodes = 50;
            int numCNodes = 20;

            int[] numEdgesForEachCNodeObserved = new int[numCNodes];
            for (int n = 0; n < numCNodes; n++)
            {
                numEdgesForEachCNodeObserved[n] = edgeConnectionsObserved[n].Length;
            }

            bool[] isUnhappyObserved = new bool[20] { false,
                                                      true,
                                                      true,
                                                      true,
                                                      true,
                                                      false,
                                                      false,
                                                      true,
                                                      true,
                                                      false,
                                                      true,
                                                      false,
                                                      true,
                                                      true,
                                                      true,
                                                      true,
                                                      true,
                                                      true,
                                                      true,
                                                      true };

            double[] xObserved = new double[50] { -69.34, -69.13, -65.86, -68.79, -69.17, -70.47, -64.30, -70.27,
                                                  -72.05, -74.26, -65.50, -64.35, -70.80, -68.66, -71.52, -69.03,
                                                  -62.69, -66.53, -58.74, -69.71, -71.30, -65.22, -63.22, -72.34,
                                                  -73.02, -72.74, -70.76, -71.53, -68.51, -120.13, -64.89, -58.61,
                                                  -81.15, -67.62, -61.73, -69.80, -67.65, -71.92, -74.05, -75.44,
                                                  -117.79, -73.21, -116.93, -71.09, -67.50, -67.14, -71.81, -69.91, -71.26, -65.53 };

            Range rangeS = new Range(numSNodes);
            Range rangeC = new Range(numCNodes);

            VariableArray<int> numOfEdgesForEachC = Variable.Array<int>(rangeC).Named("numOfEdgesForEachC").Attrib(new DoNotInfer());
            Range cToS = new Range(numOfEdgesForEachC[rangeC]).Named("cToS");

            VariableArray<VariableArray<int>, int[][]> sTouched = Variable.Array(Variable.Array<int>(cToS), rangeC).Named("sTouched").Attrib(new DoNotInfer());

            VariableArray<bool> s = Variable.Array<bool>(rangeS).Named("s");
            s[rangeS] = Variable.Bernoulli(0.5).ForEach(rangeS);

            VariableArray<bool> isUnhappy = Variable.Array<bool>(rangeC).Named("isUnhappy");

            VariableArray<bool> hadBads = Variable.Array<bool>(rangeC).Named("hadBads");

            Variable<double> trueUnhappy = Variable.Beta(7, 2).Named("trueUnhappy");
            Variable<double> falseUnhappy = Variable.Beta(2, 7).Named("falseUnhappy");

            // the Gaussian mixture part of the model
            Range k = new Range(2);
            VariableArray<double> means = Variable.Array<double>(k).Named("means");
            means[k] = Variable.GaussianFromMeanAndPrecision(0, 0.0001).ForEach(k);

            VariableArray<double> precisions = Variable.Array<double>(k).Named("precisions");
            precisions[k] = Variable.GammaFromShapeAndRate(1, 1).ForEach(k);

            VariableArray<double> xNodes = Variable.Array<double>(rangeS).Named("xNodes");

            using (var block = Variable.ForEach(rangeS))
            {
                using (Variable.If(s[rangeS]))
                {
                    // this is for the bad s - Gaussian component 1 is for bad x
                    xNodes[rangeS] = Variable.GaussianFromMeanAndPrecision(means[1], precisions[1]);
                }
                using (Variable.IfNot(s[rangeS]))
                {
                    xNodes[rangeS] = Variable.GaussianFromMeanAndPrecision(means[0], precisions[0]);
                }
            }

            using (Variable.ForEach(rangeC))
            {
                var relevantSites = Variable.Subarray(s, sTouched[rangeC]).Named("relevantS");

                //create the AnyTrue factor
                var notRelevantS = Variable.Array<bool>(cToS);
                notRelevantS[cToS] = !relevantSites[cToS];
                hadBads[rangeC] = !Variable.AllTrue(notRelevantS).Named("anyTrue");

                // add noise factor for isUnhappy observations
                using (Variable.If(hadBads[rangeC]))
                {
                    isUnhappy[rangeC].SetTo(Variable.Bernoulli(trueUnhappy));
                }
                using (Variable.IfNot(hadBads[rangeC]))
                {
                    isUnhappy[rangeC].SetTo(Variable.Bernoulli(falseUnhappy));
                }
            }

            /********* observations *********/
            isUnhappy.ObservedValue = isUnhappyObserved;
            numOfEdgesForEachC.ObservedValue = numEdgesForEachCNodeObserved;
            sTouched.ObservedValue = edgeConnectionsObserved;
            xNodes.ObservedValue = xObserved;
            /*******************************/

            /********** inference **********/
            var InferenceEngine = new InferenceEngine(new ExpectationPropagation());
            InferenceEngine.NumberOfIterations = 50;

            Bernoulli[] sPosteriors = InferenceEngine.Infer<Bernoulli[]>(s);
            Bernoulli[] hadBadPosteriors = InferenceEngine.Infer<Bernoulli[]>(hadBads);
            Beta trueUnhappyPosteriors = InferenceEngine.Infer<Beta>(trueUnhappy);
            Beta falseUnhappyPosteriors = InferenceEngine.Infer<Beta>(falseUnhappy);
            Gaussian[] postMean = InferenceEngine.Infer<Gaussian[]>(means);
            Gamma[] postPrecision = InferenceEngine.Infer<Gamma[]>(precisions);
            /*******************************/

            for (int i = 0; i < 2; i++)
            {
                Console.WriteLine("Posterior Gaussian (Gaussian mean): {0}", postMean[i]);
                Console.WriteLine("Posterior Gamma (Gaussian precision): {0}", postPrecision[i]);
            }

            Console.WriteLine("True unhappy: {0}", trueUnhappyPosteriors);
            Console.WriteLine("False unhappy: {0}", falseUnhappyPosteriors);

        }
    }
}

@jacowp357
Copy link
Author

jacowp357 commented Nov 24, 2020

When I add a Beta prior over the sIsBad random variable (i.e., the weights) it seems to help, but I do not know how to explain it. Maybe this prior allows for "perspective" among the sIsBad random variables?

Adding this prior: Variable<double> weights = Variable.Beta(1, 1).Named("weights"); results in the two Gaussians flipping the other way:

Posterior Gaussian (Gaussian mean): Gaussian(-68.94, 0.3859) (for the zero state of sIsBad)
Posterior Gaussian (Gaussian mean): Gaussian(-118.3, 1.276) (for the 1 state of sIsBad).

@tminka
Copy link
Contributor

tminka commented Nov 24, 2020

It sounds like you simply want means[1] < means[0]. You can enforce that by a constraint:
Variable.ConstrainTrue(means[1] < means[0]);

@jacowp357
Copy link
Author

jacowp357 commented Sep 23, 2022

Hi Tom, how can the constraint Variable.ConstrainTrue(means[1] < means[0]); be done for a vector? I get this error:

Neither of the operators (<,<=,>,>=) has a registered factor for argument type Microsoft.ML.Probabilistic.Math.Vector.
at Microsoft.ML.Probabilistic.Models.Variable1.GreaterThan(Variable 1 a, Variable 1 b

@tminka
Copy link
Contributor

tminka commented Sep 23, 2022

Variable.ConstrainTrue(mean[1][i] < means[0][i]);

@jacowp357
Copy link
Author

jacowp357 commented Sep 26, 2022

Hi Tom,

I'm getting this error when trying that "Cannot apply indexing with [] to an expression of type 'Variable'"

I'm not sure how to access the individual vector elements.

@tminka
Copy link
Contributor

tminka commented Sep 26, 2022

What is the type of mean? I need more context here. Maybe open a new issue?

@jacowp357
Copy link
Author

jacowp357 commented Sep 26, 2022

Range k = new Range(2);
VariableArray<Vector> means = Variable.Array<Vector>(k).Named("means");

means[k] = Variable.VectorGaussianFromMeanAndPrecision(Vector.FromArray(0, 0), PositiveDefiniteMatrix.IdentityScaledBy(2, 1)).ForEach(k);

Variable.ConstrainTrue(means[1][0] < means[0][0]);  //the first element of the vector means[1] should be smaller than the first element of the vector means[0]
Variable.ConstrainTrue(means[1][1] < means[0][1]);   //the second element of the vector means[1] should be smaller than the second element of the vector means[0]

but this code does not work

@tminka
Copy link
Contributor

tminka commented Sep 26, 2022

Vectors are indexed using Variable.GetItem

@jacowp357
Copy link
Author

Spot on! Many thanks Tom!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants