
Is it possible to use 64 bit floats when concatenating features? #4141

@bandozia

I'm having trouble training a k-means clustering pipeline. Even though all my features have the same type (double), fitting the pipeline fails with a schema mismatch between double and single.

My data class looks something like this:

public class Loja
{
    public double encrypted_5_zipcode { get; set; }
    public double faturamento_total { get; set; }
    public double transacoes_total { get; set; } 
    public static IEnumerable<Loja> ReadCsvSkipingErrors(string filePath)
    {
        //...
    }
}

And the main program:

var lojas = Loja.ReadCsvSkipingErrors(fpath);
MLContext mlContext = new MLContext(seed: 1);
IDataView data = mlContext.Data.LoadFromEnumerable(lojas);

// Combine the three double columns into a single "Features" vector column.
var dataProcessPipeline = mlContext.Transforms.Concatenate("Features",
    nameof(Loja.encrypted_5_zipcode),
    nameof(Loja.faturamento_total),
    nameof(Loja.transacoes_total));

var trainer = mlContext.Clustering.Trainers.KMeans(featureColumnName: "Features", numberOfClusters: 3);
var trainingPipeline = dataProcessPipeline.Append(trainer);
var trainedModel = trainingPipeline.Fit(data); // the exception is thrown here

The result is:

Unhandled Exception: System.ArgumentOutOfRangeException: Schema mismatch for feature column 'Features': expected Vector<Single>, got Vector<Double>
Parameter name: inputSchema
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.CheckInputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Trainers.TrainerEstimatorBase`2.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.GetOutputSchema(SchemaShape inputSchema)
   at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)

If I change the property types in the Loja class to float, it works normally. Is it possible to use double in this case?
