Model builder reports input data and validation data in csv do not match #2928

@pjsgsy

Model Builder reports that the input data and validation data CSV files do not match. I have checked the format of each with several utilities, and they all report the correct number of columns and correctly formatted numbers in each file. As the files are several hundred MB and that is the only error Model Builder reports, it's hard to find what it does not like.

Would it be possible to at least output the line number of the csv or the line of data it has stumbled on, so the matter can at least be investigated? As far as I can see in the logs, all we get at the moment is an error. Nothing to indicate why or where.

This is kind of driving me crazy. Every lint I run with separate tools says the CSV files are OK. If I select the training file and let Model Builder do the training split, it's fine. If I use the separate validation file, I get this:

Train dataset and validate dataset schema is not compatible. Column(s) col0, col1, col2, col3, col4, col5, col6, col7, col8, col9, col10, col11, col12, col13, col14, col15, col16, col17, col18, col19, col20, col21, col22, col23, col24, col25, col26, col27, col28, col29, col30, col31, col32, col33, col34, col35, col36, col37, col38, col39, col40, col41, col42, col43, col44, col45, col46, col47, col48, col49, col50, col51, col52, col53, col54, col55, col56 are not found in validate dataset.

at Microsoft.ML.ModelBuilder.AutoMLEngine.GetValidateDatasetLoaderOption(ITrainingConfiguration config, IEnumerable`1 validateColumnInfo, Char decimalSeparator, String delimiter, Boolean hasHeader) in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 318
at Microsoft.ML.ModelBuilder.AutoMLEngine.d__31.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 410
--- End of stack trace from previous location where exception was thrown ---
at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Microsoft.ML.ModelBuilder.AutoMLEngine.d__21.MoveNext() in /_/src/Microsoft.ML.ModelBuilder.AutoMLService/AutoMLEngineService/AutoMLEngine.cs:line 106

Yet, the validation file seems fine and correctly formatted.

As stated, a few more details about where and what it is stumbling on would help immensely.
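
Until Model Builder reports the offending line itself, a standalone check along these lines might narrow things down. This is only a rough sketch in Python and not Model Builder's own validation logic. The helper names (`is_number`, `scan_csv`, `compare_files`) are made up for illustration. It rests on an unconfirmed assumption: the `col0` ... `col56` names in the error look like auto-generated names, so one file may be read as having a header row while the other is not. The sketch compares the first row of each file and reports the first line whose field count differs from that file's first row.

```python
import csv
import io


def is_number(text):
    """True if the field parses as a float, so it is probably data and not a header name."""
    try:
        float(text)
        return True
    except ValueError:
        return False


def scan_csv(stream, delimiter=","):
    """Scan a CSV text stream.

    Returns (first_row, bad_line), where bad_line is (line_number, field_count)
    for the first row whose field count differs from the first row, or None.
    """
    reader = csv.reader(stream, delimiter=delimiter)
    first_row = next(reader, None)
    if first_row is None:
        return [], None
    for row in reader:
        if len(row) != len(first_row):
            # reader.line_num is the number of physical lines read so far
            return first_row, (reader.line_num, len(row))
    return first_row, None


def compare_files(train_stream, validate_stream, delimiter=","):
    """Return a list of human-readable differences between the two files."""
    train_first, train_bad = scan_csv(train_stream, delimiter)
    valid_first, valid_bad = scan_csv(validate_stream, delimiter)
    problems = []
    if len(train_first) != len(valid_first):
        problems.append(
            f"column count differs: train={len(train_first)}, validate={len(valid_first)}"
        )
    # Heuristic (an assumption): a first row with any non-numeric field is a header.
    train_header = not all(is_number(f) for f in train_first)
    valid_header = not all(is_number(f) for f in valid_first)
    if train_header != valid_header:
        problems.append(
            f"header mismatch: train header={train_header}, validate header={valid_header}"
        )
    elif train_header and train_first != valid_first:
        problems.append("header names differ")
    for name, bad in (("train", train_bad), ("validate", valid_bad)):
        if bad:
            problems.append(f"{name}: line {bad[0]} has {bad[1]} fields")
    return problems
```

To run it on real files, open each one with `open(path, newline="", encoding="utf-8-sig")` and pass the two file objects to `compare_files`. The `utf-8-sig` codec drops a leading byte-order mark, which is one thing that can make a first field differ without being visible in an editor. The header heuristic will misfire on files whose data columns are text, so treat its output as a hint rather than a verdict.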
