[DeepLearning_ImageClassification_Training] Failed to run the sample. #689

JustLuoyu · 2019-10-10T09:57:05Z

Throwing an exception while defining the pipeline:
// 4. Define the model's training pipeline
var pipeline = mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: "LabelAsKey", inputColumnName: "Label", keyOrdinality: ValueToKeyMappingEstimator.KeyOrdinality.ByValue)
    .Append(mlContext.Model.ImageClassification("ImagePath", "LabelAsKey",
        arch: ImageClassificationEstimator.Architecture.ResnetV2101,
        epoch: 100, // An epoch is one learning cycle where the learner sees the whole training data set.
        batchSize: 30, // batchSize sets the number of images to feed the model at a time. It needs to divide the training set evenly or the remaining part won't be used for training.
        metricsCallback: (metrics) => Console.WriteLine(metrics), // OPTIONAL (*1*)
        validationSet: transformedValidationDataView));

Below are exception messages and stack trace:

Google.Protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.
at Google.Protobuf.CodedInputStream.RefillBuffer(Boolean mustSucceed)
at Google.Protobuf.CodedInputStream.ReadRawBytes(Int32 size)
at Google.Protobuf.CodedInputStream.ReadBytes()
at Tensorflow.TensorProto.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Tensorflow.AttrValue.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Google.Protobuf.FieldCodec.<>c__DisplayClass16_0`1.<ForMessage>b__0(CodedInputStream input)
at Google.Protobuf.Collections.MapField`2.Codec.MessageAdapter.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Google.Protobuf.Collections.MapField`2.AddEntriesFrom(CodedInputStream input, Codec codec)
at Tensorflow.NodeDef.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Google.Protobuf.FieldCodec.<>c__DisplayClass16_0`1.<ForMessage>b__0(CodedInputStream input)
at Google.Protobuf.Collections.RepeatedField`1.AddEntriesFrom(CodedInputStream input, FieldCodec`1 codec)
at Tensorflow.GraphDef.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Tensorflow.MetaGraphDef.MergeFrom(CodedInputStream input)
at Google.Protobuf.MessageExtensions.MergeFrom(IMessage message, Byte[] data)
at Google.Protobuf.MessageParser`1.ParseFrom(Byte[] data)
at Tensorflow.saver._import_meta_graph_with_return_elements(String meta_graph_or_file, Boolean clear_devices, String import_scope, String[] return_elements)
at Microsoft.ML.Transforms.Dnn.DnnUtils.<>c__DisplayClass5_0.b__0(Graph graph)
at Tensorflow.Python.tf_with[TIn,TOut](TIn py, Func`2 action)

CESARDELATORRE · 2019-10-11T17:11:17Z

I was not able to repro this issue.
In any case, we just updated this feature to Preview-2 which has great improvements such as:

Can score with in-memory images
Can use GPU

I just updated the same sample (still not using GPU in the sample, though, since it needs to use a different NuGet package and you need to install CUDA, too), but can you try this new version of the sample? In the tests we're doing it works out-of-the-box.

https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ImageClassification_Training

Read the sample's README.MD if you want to see the changes highlighted compared to Preview1. Most of the changes are related to the in-memory images usage.

Thanks,

l1pus · 2019-10-11T20:58:19Z

Thank you for the reply.

I have tried the new version (preview-2), but I still experience exceptions. The program runs until this line:

Program.cs Line 74 ITransformer trainedModel = pipeline.Fit(trainDataView);
which produces this exception:

System.ArgumentOutOfRangeException: „The size of input lines is not consistent
Arg_ParamName_Name”

Also, at the beginning of program execution I'm getting this info:

2019-10-11 22:53:35.596168: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2

Maybe this is causing the errors during program execution?
My current CPU is an i7-6700K.

CESARDELATORRE · 2019-10-11T21:45:24Z

The warning 'Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2' is normal. It comes from TensorFlow and it does not impact the training.

I am not able to repro your problem, as the sample works out of the box on all the machines we're trying. It must be related to the software/dependencies on the machine or the OS version, but I'm not sure.

Can you provide details on what OS version (Win10? Win7? macOS? etc.) and related software (VS version?) you have installed, please?

l1pus · 2019-10-11T21:48:08Z

I'm working on Windows 10 Pro,
Version 1903
OS build: 18326.356

Visual Studio 16.4.0 Preview 1.0

CESARDELATORRE · 2019-10-11T21:49:55Z

@codemzs - Any idea? Looks like a low level TensorFlow dependency issue?

codemzs · 2019-10-11T22:45:16Z

It’s because the meta file was not fully downloaded. @ashbhandare has fixed this issue. She can comment more.

JustLuoyu · 2019-10-12T02:16:40Z

Thank you! I just updated my repository. And here comes another problem:
.Append(mlContext.Transforms.LoadImages(outputColumnName: "Image", imageFolder: fullImagesetFolderPath, useImageType: false, inputColumnName: "ImagePath"))
The LoadImages method doesn't have a parameter named "useImageType".

codemzs · 2019-10-12T02:28:08Z

Have you upgraded the image analytics NuGet package to the latest version?

JustLuoyu · 2019-10-12T02:54:52Z

The version I used is 1.4.0-preview2. It's the latest, but it is a preview version.

CESARDELATORRE · 2019-10-14T16:49:05Z

Can you try deleting the TensorFlow meta file (deleting the whole /bin folder within your project would do), then re-compile and re-run?
It could be that the TensorFlow meta file was only partially downloaded.

l1pus · 2019-10-14T17:19:48Z

Well, deleting the meta file sadly didn't help; still the same error. :(

System.ArgumentOutOfRangeException: „The size of input lines is not consistent
Arg_ParamName_Name”

paga-coder · 2019-10-16T09:20:32Z

The error was resolved when I modified the following in TextLoader.cs (project Microsoft.ML.Data):
if (needInputSize && inputSize == 0)
{
    int min = 0;
    int max = 0;
    if (Utils.Size(lines) > 0)
        Parser.GetInputSize(parent, lines, out min, out max);
    if (max == 0)
        throw ch.ExceptUserArg(nameof(Column.Source), "Can't determine the number of source columns without valid data");
    ch.Assert(min <= max);
    // Comment out these lines:
    //if (min < max)
    //    throw ch.ExceptUserArg(nameof(Column.Source), "The size of input lines is not consistent");
    // We reserve SrcLim for variable.
    inputSize = Math.Min(min, SrcLim - 1);
}
(Sorry for the formatting, I am a beginner with the GitHub editor.)
Modified line numbers: 701 and 702.
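
For anyone who wants to inspect their data before patching the library, here is a minimal sketch of the condition that check guards against: lines whose field counts differ (min < max). It is not ML.NET's parser; the class and method names are hypothetical, the tab separator is an assumption, and quoting or sparse formats are ignored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of ML.NET: a naive stand-in for the
// min/max field-count comparison in the snippet above.
public static class LineSizeCheck
{
    // Returns the smallest and largest field count over all non-blank lines.
    // Returns (0, 0) when there are no non-blank lines.
    public static (int Min, int Max) GetFieldCountRange(IEnumerable<string> lines, char separator = '\t')
    {
        int min = int.MaxValue;
        int max = 0;
        foreach (string line in lines)
        {
            if (string.IsNullOrWhiteSpace(line))
                continue; // skip blank lines

            int count = line.Split(separator).Length;
            min = Math.Min(min, count);
            max = Math.Max(max, count);
        }
        return max == 0 ? (0, 0) : (min, max);
    }

    // True only when at least one non-blank line exists and all
    // non-blank lines have the same number of fields.
    public static bool IsConsistent(IEnumerable<string> lines, char separator = '\t')
    {
        var (min, max) = GetFieldCountRange(lines, separator);
        return max > 0 && min == max;
    }
}
```

A possible use is LineSizeCheck.IsConsistent(System.IO.File.ReadLines(path)), where path points at whichever text file the loader is reading; a false result would suggest rows with missing or extra fields rather than a problem in the check itself.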

drjahu · 2019-10-21T09:57:43Z

I'm also having this problem: "The size of input lines is not consistent Arg_ParamName_Name" at
"ITransformer trainedModel = pipeline.Fit(trainDataView);"
I use Microsoft.ML 1.4.0-preview2 and Microsoft.ML.DNN 0.16.0-preview2

CESARDELATORRE · 2019-10-21T18:10:16Z

@codemzs - Any idea about this issue some folks are getting?
I cannot repro this issue. I have the sample running properly on several machines, training on CPU or GPU.

drjahu · 2019-10-22T07:44:15Z

How can I help to debug this issue? I would love to get this sample up and running.

ashbhandare · 2019-10-23T20:31:50Z

> Well, deleting the meta file sadly didn't help; still the same error. :(
>
> System.ArgumentOutOfRangeException: „The size of input lines is not consistent
> Arg_ParamName_Name”

This exception looks different from the initial reported exception:

Google.Protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.

Can you add the stack trace and the complete exception message?

mmieczynski · 2019-10-26T10:09:56Z

Hi, I've got the same issue, here's my error message:

Unhandled Exception: System.ArgumentOutOfRangeException: The size of input lines is not consistent
Parameter name: Source
at Microsoft.ML.Data.TextLoader.Bindings..ctor(TextLoader parent, Column[] cols, IMultiStreamSource headerFile, IMultiStreamSource dataSample)
at Microsoft.ML.Data.TextLoader..ctor(IHostEnvironment env, Options options, IMultiStreamSource dataSample)
at Microsoft.ML.Transforms.ImageClassificationTransformer.GetShuffledData(String path)
at Microsoft.ML.Transforms.ImageClassificationTransformer.TrainAndEvaluateClassificationLayer(String trainBottleneckFilePath, Options options, String validationSetBottleneckFilePath)
at Microsoft.ML.Transforms.ImageClassificationTransformer..ctor(IHostEnvironment env, Options options, DnnModel tensorFlowModel, IDataView input)
at Microsoft.ML.Transforms.ImageClassificationEstimator.Fit(IDataView input)
at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
at ImageClassification.Train.Program.Main(String[] args) in C:\Users\macie\source\repos\DeepLearning_ImageClassification_Training\ImageClassification.Train\Program.cs:line 74

I think the image data is somehow incorrect, because just before the error, during the "Bottleneck Computation" phase, no image names are shown. Example:

Phase: Bottleneck Computation, Dataset used: Train, Image Index: 1, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 2, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 3, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 4, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 5, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 6, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 7, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 8, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 9, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 10, Image Name:

ashbhandare · 2019-11-05T22:18:14Z

The original issue "Google.Protobuf.InvalidProtocolBufferException" has been resolved. The second issue "The size of input lines is not consistent" is being tracked here: #717

ashbhandare closed this as completed Nov 18, 2019
