Closed
Description
I'm having trouble training a k-means clustering pipeline. Even though all my features have the same type (double), fitting the pipeline fails with a schema mismatch between double and single.
My data class looks something like this:
public class Loja
{
    public double encrypted_5_zipcode { get; set; }
    public double faturamento_total { get; set; }
    public double transacoes_total { get; set; }

    public static IEnumerable<Loja> ReadCsvSkipingErrors(string filePath)
    {
        //...
    }
}
And the main program:
var lojas = Loja.ReadCsvSkipingErrors(fpath);
MLContext mlContext = new MLContext(seed: 1);
IDataView data = mlContext.Data.LoadFromEnumerable(lojas);
var dataProcessPipeline = mlContext.Transforms.Concatenate("Features",
    nameof(Loja.encrypted_5_zipcode),
    nameof(Loja.faturamento_total),
    nameof(Loja.transacoes_total)
);
var trainer = mlContext.Clustering.Trainers.KMeans(featureColumnName: "Features", numberOfClusters: 3);
var trainigPipeline = dataProcessPipeline.Append(trainer);
var trainedModel = trainigPipeline.Fit(data);
The result is:
Unhandled Exception: System.ArgumentOutOfRangeException: Schema mismatch for feature column 'Features': expected Vector<Single>, got Vector<Double>
Parameter name: inputSchema
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.CheckInputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
If I change the type of the properties in class "Loja" to float, it works normally. Is it possible to use double in this case?
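One possible way to keep the double properties, sketched here rather than confirmed in this thread: the exception says the trainer expects Vector<Single>, so each column could be converted to Single inside the pipeline before concatenation. This assumes ML.NET 1.0 or later, where `Transforms.Conversion.ConvertType` takes a `DataKind`. The output column names (`zipcode_f`, `faturamento_f`, `transacoes_f`) and the in-memory sample rows are hypothetical placeholders, not part of the original program.

```csharp
using System.Collections.Generic;
using Microsoft.ML;
using Microsoft.ML.Data;

public class Loja
{
    public double encrypted_5_zipcode { get; set; }
    public double faturamento_total { get; set; }
    public double transacoes_total { get; set; }
}

public static class Program
{
    public static void Main()
    {
        // Hypothetical rows standing in for Loja.ReadCsvSkipingErrors(fpath).
        var lojas = new List<Loja>
        {
            new Loja { encrypted_5_zipcode = 1, faturamento_total = 100, transacoes_total = 10 },
            new Loja { encrypted_5_zipcode = 1, faturamento_total = 110, transacoes_total = 12 },
            new Loja { encrypted_5_zipcode = 2, faturamento_total = 500, transacoes_total = 40 },
            new Loja { encrypted_5_zipcode = 2, faturamento_total = 520, transacoes_total = 45 },
            new Loja { encrypted_5_zipcode = 3, faturamento_total = 900, transacoes_total = 80 },
            new Loja { encrypted_5_zipcode = 3, faturamento_total = 950, transacoes_total = 85 },
        };

        var mlContext = new MLContext(seed: 1);
        IDataView data = mlContext.Data.LoadFromEnumerable(lojas);

        var conversion = mlContext.Transforms.Conversion;

        // Convert each double column to a new Single column (hypothetical names),
        // then concatenate the Single columns into the "Features" vector.
        var pipeline = conversion
            .ConvertType("zipcode_f", nameof(Loja.encrypted_5_zipcode), DataKind.Single)
            .Append(conversion.ConvertType("faturamento_f", nameof(Loja.faturamento_total), DataKind.Single))
            .Append(conversion.ConvertType("transacoes_f", nameof(Loja.transacoes_total), DataKind.Single))
            .Append(mlContext.Transforms.Concatenate("Features", "zipcode_f", "faturamento_f", "transacoes_f"))
            .Append(mlContext.Clustering.Trainers.KMeans(featureColumnName: "Features", numberOfClusters: 3));

        // With a Vector<Single> feature column the schema check should pass.
        var trainedModel = pipeline.Fit(data);
    }
}
```

Note that Single carries roughly 7 significant decimal digits, so very large or very precise double values lose precision in the conversion.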