Linear Regression with NuML #83

Open
johnstaveley opened this issue Mar 26, 2018 · 0 comments

I'm trying to do a really basic linear regression prediction (Z = 2 * V + 1) using NuML. Given that the data is perfectly linear, I can't understand why the predicted value is so far off, unless I am doing something wrong. I have the following sample class:

public class Sample
{
    public float V { get; set; }
    public float X { get; set; }
    public float Y { get; set; }
    public float Z { get; set; }

    public Func<float, float, float, float> OutputStrategy { get; set; }
    public Sample(Func<float, float, float, float> outputStrategy)
    {
        OutputStrategy = outputStrategy;
    }
    public void Seed(int i)
    {
        V = (float) i;
        X = (float) 2 * i;
        Y = (float) 3 * i;
        Z = OutputStrategy(V, X, Y);
    }
}

and I have the NuML code to set up the source values and predict an answer for an arbitrary new data point:

NB: The output strategy is simply 2 * A + 1, where A is V. I've also tried it with multivariate analysis, and the prediction is even further off.

public static void Main(string[] args)
{
    // Generate sample data
    int sampleSize = 1000;
    Sample[] samples = new Sample[sampleSize];
    Func<float, float, float, float> outputStrategy = (A, B, C) => 2 * A + 1;
    for (int i = 0; i < sampleSize; i++)
    {
        samples[i] = new Sample(outputStrategy);
        samples[i].Seed(i);
    }

    // calculate model
    var generator = new LinearRegressionGenerator();
    var descriptor = Descriptor.New("Samples")
        .With("V").As(typeof(float))
        .With("X").As(typeof(float))
        .With("Y").As(typeof(float))
        .Learn("Z").As(typeof(float));
    generator.Descriptor = descriptor;
    var model = Learner.Learn(samples, 0.6, 50, generator);

    // Use prediction
    var targetSample = new Sample(outputStrategy);
    targetSample.Seed(sampleSize + 1);
    var predictedSample = model.Model.Predict(targetSample);
    var predictedValue = predictedSample.Z;
    var actualValue = outputStrategy(targetSample.V, targetSample.X, targetSample.Y);
    Console.Write(
        "Predicted Value = {0}, Actual Value = {1}, Difference = {2} {3:0.00}%",
        predictedValue,
        actualValue,
        actualValue - predictedValue,
        (decimal) (actualValue - predictedValue) / (decimal) predictedValue * 100M);
    Console.ReadKey();
}

This gives a difference of about 0.5%, which is surprising given that the data lies on a perfectly straight line. I have tried training on different percentages of the dataset and changing the number of iterations, but neither makes any difference to the output.
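
As a point of comparison, an exact least-squares fit can be computed directly for this data. The sketch below is not NuML code and makes no claim about how LinearRegressionGenerator fits its model; `ClosedFormCheck` and `Fit` are hypothetical names introduced here. It uses only V as the input, since X and Y are exact multiples of V in the sample data.

```csharp
using System;

// Hypothetical helper, not part of NuML: closed-form ordinary least squares
// for a single input, used only as a baseline for the same synthetic data.
public static class ClosedFormCheck
{
    // Returns the slope and intercept that minimise the squared error of y = slope * x + intercept.
    public static (double Slope, double Intercept) Fit(double[] x, double[] y)
    {
        int n = x.Length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++)
        {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++)
        {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }

        double slope = sxy / sxx;
        return (slope, meanY - slope * meanX);
    }

    public static void Main()
    {
        // Same data as in the issue: V = i, Z = 2 * V + 1.
        const int sampleSize = 1000;
        var v = new double[sampleSize];
        var z = new double[sampleSize];
        for (int i = 0; i < sampleSize; i++)
        {
            v[i] = i;
            z[i] = 2 * v[i] + 1;
        }

        var (slope, intercept) = Fit(v, z);

        // Predict for the same unseen point used above (V = sampleSize + 1).
        double predicted = slope * (sampleSize + 1) + intercept;
        Console.WriteLine("Slope = {0}, Intercept = {1}, Predicted = {2}", slope, intercept, predicted);
    }
}
```

If this baseline recovers a slope of 2 and an intercept of 1 (up to rounding), that would suggest the remaining error in the NuML prediction comes from how the model is fitted rather than from the data itself.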

If I use an even slightly more complicated model, the predictions are much worse. If I use logistic regression, the predicted value of Z is always 1.
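
One thing that may be worth checking in the logistic case: assuming NuML's logistic regression applies the standard logistic (sigmoid) function to a linear score, which this issue does not confirm, the output is bounded between 0 and 1 and saturates for large inputs, so it could not represent targets like Z = 2 * V + 1. A minimal sketch of that behaviour, with hypothetical names (`SigmoidCheck`, `Sigmoid`):

```csharp
using System;

// Hypothetical illustration, not NuML code: the standard logistic function.
public static class SigmoidCheck
{
    // Maps any real score into the open interval (0, 1); large scores round to 1.
    public static double Sigmoid(double t) => 1.0 / (1.0 + Math.Exp(-t));

    public static void Main()
    {
        // Scores on the scale of the Z values in this issue quickly saturate.
        foreach (double t in new[] { 0.0, 3.0, 21.0, 2003.0 })
        {
            Console.WriteLine("Sigmoid({0}) = {1}", t, Sigmoid(t));
        }
    }
}
```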
