[DeepLearning_ImageClassification_Training] Failed to run the sample. #689

Closed
JustLuoyu opened this issue Oct 10, 2019 · 18 comments

@JustLuoyu

An exception is thrown while defining the pipeline:
// 4. Define the model's training pipeline
var pipeline = mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: "LabelAsKey", inputColumnName: "Label", keyOrdinality: ValueToKeyMappingEstimator.KeyOrdinality.ByValue)
    .Append(mlContext.Model.ImageClassification("ImagePath", "LabelAsKey",
        arch: ImageClassificationEstimator.Architecture.ResnetV2101,
        epoch: 100, // An epoch is one learning cycle where the learner sees the whole training data set.
        batchSize: 30, // batchSize sets the number of images to feed the model at a time. It needs to divide the training set evenly or the remaining part won't be used for training.
        metricsCallback: (metrics) => Console.WriteLine(metrics), // OPTIONAL (*1*)
        validationSet: transformedValidationDataView));
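
The batchSize comment above says that images beyond the last full batch are not used for training. A minimal sketch of that arithmetic follows, assuming the leftover is simply the remainder of an integer division. The `BatchMath` class and the image count of 1000 are hypothetical and are not part of the sample or of ML.NET.

```csharp
using System;

public static class BatchMath
{
    // Hypothetical helper (not an ML.NET API): returns how many full batches
    // fit into imageCount and how many images are left over after them.
    public static (int FullBatches, int Leftover) Split(int imageCount, int batchSize)
    {
        if (imageCount < 0) throw new ArgumentOutOfRangeException(nameof(imageCount));
        if (batchSize <= 0) throw new ArgumentOutOfRangeException(nameof(batchSize));

        // Integer division gives the number of full batches,
        // the remainder gives the images that do not fill a batch.
        return (imageCount / batchSize, imageCount % batchSize);
    }
}
```

For example, `BatchMath.Split(1000, 30)` yields 33 full batches and 10 leftover images, so with an assumed training set of 1000 images a batch size of 25 or 40 would leave nothing over.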

Below are the exception message and stack trace:

Google.Protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.
at Google.Protobuf.CodedInputStream.RefillBuffer(Boolean mustSucceed)
at Google.Protobuf.CodedInputStream.ReadRawBytes(Int32 size)
at Google.Protobuf.CodedInputStream.ReadBytes()
at Tensorflow.TensorProto.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Tensorflow.AttrValue.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Google.Protobuf.FieldCodec.<>c__DisplayClass16_0`1.<ForMessage>b__0(CodedInputStream input)
at Google.Protobuf.Collections.MapField`2.Codec.MessageAdapter.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Google.Protobuf.Collections.MapField`2.AddEntriesFrom(CodedInputStream input, Codec codec)
at Tensorflow.NodeDef.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Google.Protobuf.FieldCodec.<>c__DisplayClass16_0`1.<ForMessage>b__0(CodedInputStream input)
at Google.Protobuf.Collections.RepeatedField`1.AddEntriesFrom(CodedInputStream input, FieldCodec`1 codec)
at Tensorflow.GraphDef.MergeFrom(CodedInputStream input)
at Google.Protobuf.CodedInputStream.ReadMessage(IMessage builder)
at Tensorflow.MetaGraphDef.MergeFrom(CodedInputStream input)
at Google.Protobuf.MessageExtensions.MergeFrom(IMessage message, Byte[] data)
at Google.Protobuf.MessageParser`1.ParseFrom(Byte[] data)
at Tensorflow.saver._import_meta_graph_with_return_elements(String meta_graph_or_file, Boolean clear_devices, String import_scope, String[] return_elements)
at Microsoft.ML.Transforms.Dnn.DnnUtils.<>c__DisplayClass5_0.b__0(Graph graph)
at Tensorflow.Python.tf_with[TIn,TOut](TIn py, Func`2 action)

@CESARDELATORRE
Contributor

CESARDELATORRE commented Oct 11, 2019

I was not able to repro this issue.
In any case, we just updated this feature to Preview-2 which has great improvements such as:

  • Can score with in-memory images
  • Can use GPU

I just updated the same sample (still not using GPU in the sample, though, since it needs to use a different NuGet package and you need to install CUDA, too), but can you try this new version of the sample? In the tests we're doing it works out-of-the-box.

https://github.com/dotnet/machinelearning-samples/tree/master/samples/csharp/getting-started/DeepLearning_ImageClassification_Training

Read the sample's README.MD if you want to see the changes highlighted compared to Preview1. Most of the changes are related to the in-memory images usage.

Thanks,

@l1pus

l1pus commented Oct 11, 2019

Thank you for the reply.

I have tried the new version (preview-2), but I still experience exceptions. The program runs until this line:

Program.cs, line 74: ITransformer trainedModel = pipeline.Fit(trainDataView);
which produces this exception:

System.ArgumentOutOfRangeException: „The size of input lines is not consistent
Arg_ParamName_Name”

Also, at the beginning of program execution I'm getting this message:

2019-10-11 22:53:35.596168: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2

Maybe this is causing the errors during execution of the program?
My current CPU is an i7-6700K.

@CESARDELATORRE
Contributor

The warning 'Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2' is normal. It comes from TensorFlow and it does not impact the training.

I am not able to repro your problem, as the sample works out of the box on all the machines we're trying. It must be related to the software/dependencies on the machine or to the OS version, but I'm not sure.

Can you provide details on what OS version (Win10? Win7? macOS? etc.) and related software (VS version?) you have installed, please?

@l1pus

l1pus commented Oct 11, 2019

I'm working on Windows 10 Pro,
Version 1903
Build: 18326.356

Visual Studio 16.4.0 Preview 1.0

@CESARDELATORRE
Contributor

@codemzs - Any idea? Looks like a low level TensorFlow dependency issue?

@codemzs
Member

codemzs commented Oct 11, 2019

It’s because the meta file was not fully downloaded. @ashbhandare has fixed this issue. She can comment more.

@JustLuoyu
Author

Thank you! I just updated my repository. And here comes another problem:
.Append(mlContext.Transforms.LoadImages(outputColumnName: "Image", imageFolder: fullImagesetFolderPath, useImageType: false, inputColumnName: "ImagePath"))
The LoadImages method doesn't have a parameter named "useImageType".

@codemzs
Member

codemzs commented Oct 12, 2019

Have you upgraded the image analytics nuget to the latest version?

@JustLuoyu
Author

The version I used is 1.4.0-preview2. It's the latest, but it is a preview version.

@CESARDELATORRE
Contributor

Can you try deleting the TensorFlow meta file (deleting the whole /BIN folder within your project would do), then re-compile and re-run?
It could be that the TensorFlow meta file was only partially downloaded.
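
The cleanup suggested above can be sketched as a small helper that lists candidate files under the build output folder and deletes them, so that they are fetched again on the next run. This is only a sketch: the `MetaFileCleanup` class is hypothetical, and the `*.meta` search pattern is an assumption about how the downloaded file is named, not something confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class MetaFileCleanup
{
    // Hypothetical helper: finds files matching the pattern anywhere under root.
    // The default "*.meta" pattern is an assumption about the file name.
    public static List<FileInfo> Find(string root, string pattern = "*.meta")
    {
        if (!Directory.Exists(root)) return new List<FileInfo>();
        return Directory.EnumerateFiles(root, pattern, SearchOption.AllDirectories)
                        .Select(path => new FileInfo(path))
                        .ToList();
    }

    // Deletes the given files, printing each path and size first so that a
    // suspiciously small (possibly truncated) file is easy to spot.
    public static int Delete(IEnumerable<FileInfo> files)
    {
        int deleted = 0;
        foreach (var file in files)
        {
            Console.WriteLine($"Deleting {file.FullName} ({file.Length} bytes)");
            file.Delete();
            deleted++;
        }
        return deleted;
    }
}
```

A call such as `MetaFileCleanup.Delete(MetaFileCleanup.Find("bin"))` from the project folder would remove the matching files; deleting the whole bin folder by hand achieves the same thing.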

@l1pus

l1pus commented Oct 14, 2019

Well, deleting the meta file didn't help, sadly; still the same error. :(

System.ArgumentOutOfRangeException: „The size of input lines is not consistent
Arg_ParamName_Name”

@paga-coder

paga-coder commented Oct 16, 2019

The error was resolved when I modified the following in TextLoader.cs (project Microsoft.ML.Data):
if (needInputSize && inputSize == 0)
{
    int min = 0;
    int max = 0;
    if (Utils.Size(lines) > 0)
        Parser.GetInputSize(parent, lines, out min, out max);
    if (max == 0)
        throw ch.ExceptUserArg(nameof(Column.Source), "Can't determine the number of source columns without valid data");
    ch.Assert(min <= max);
    // Comment out these lines:
    //if (min < max)
    //    throw ch.ExceptUserArg(nameof(Column.Source), "The size of input lines is not consistent");

    // We reserve SrcLim for variable.
    inputSize = Math.Min(min, SrcLim - 1);
}
(Sorry for the formatting; I am a beginner with the Git editor.)
Modified line numbers: 701 and 702.
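
The check that is commented out above throws when the smallest and largest number of fields per line differ. Rather than disabling it, the same condition can be inspected on a data file directly. The following is a rough sketch, not the TextLoader implementation: `FieldCountCheck` is a hypothetical helper, the tab separator is an assumption, and it ignores quoting and escaping rules that the real parser may apply.

```csharp
using System;
using System.Collections.Generic;

public static class FieldCountCheck
{
    // Hypothetical diagnostic: returns the smallest and largest number of
    // separator-delimited fields over all non-empty lines, or (0, 0) if
    // there are no non-empty lines. Min < Max means inconsistent lines.
    public static (int Min, int Max) GetFieldCountRange(IEnumerable<string> lines, char separator = '\t')
    {
        int min = int.MaxValue;
        int max = 0;
        foreach (var line in lines)
        {
            if (string.IsNullOrEmpty(line)) continue; // skip blank lines
            int count = line.Split(separator).Length;
            min = Math.Min(min, count);
            max = Math.Max(max, count);
        }
        return max == 0 ? (0, 0) : (min, max);
    }
}
```

Passing `System.IO.File.ReadLines(path)` for a suspect file and comparing `Min` with `Max` shows whether some lines have more or fewer fields than others, which is the situation the exception message describes.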

@drjahu

drjahu commented Oct 21, 2019

I'm also having this problem: "The size of input lines is not consistent Arg_ParamName_Name" at
"ITransformer trainedModel = pipeline.Fit(trainDataView);"
I use Microsoft.ML 1.4.0-preview2 and Microsoft.ML.DNN 0.16.0-preview2

@CESARDELATORRE
Contributor

@codemzs - Any idea about this issue some folks are getting?
I cannot repro this issue. I have the sample running properly on several machines, training on CPU or GPU.

@drjahu

drjahu commented Oct 22, 2019

How can I help to debug this issue? I would love to get this sample up and running.

@ashbhandare

> Well, deleting the meta file didn't help, sadly; still the same error. :(
>
> System.ArgumentOutOfRangeException: „The size of input lines is not consistent
> Arg_ParamName_Name”

This exception looks different from the initially reported exception:

Google.Protobuf.InvalidProtocolBufferException: While parsing a protocol message, the input ended unexpectedly in the middle of a field. This could mean either that the input has been truncated or that an embedded message misreported its own length.

Can you add the stack trace and the complete exception message?

@mmieczynski

Hi, I've got the same issue, here's my error message:

Unhandled Exception: System.ArgumentOutOfRangeException: The size of input lines is not consistent
Parameter name: Source
at Microsoft.ML.Data.TextLoader.Bindings..ctor(TextLoader parent, Column[] cols, IMultiStreamSource headerFile, IMultiStreamSource dataSample)
at Microsoft.ML.Data.TextLoader..ctor(IHostEnvironment env, Options options, IMultiStreamSource dataSample)
at Microsoft.ML.Transforms.ImageClassificationTransformer.GetShuffledData(String path)
at Microsoft.ML.Transforms.ImageClassificationTransformer.TrainAndEvaluateClassificationLayer(String trainBottleneckFilePath, Options options, String validationSetBottleneckFilePath)
at Microsoft.ML.Transforms.ImageClassificationTransformer..ctor(IHostEnvironment env, Options options, DnnModel tensorFlowModel, IDataView input)
at Microsoft.ML.Transforms.ImageClassificationEstimator.Fit(IDataView input)
at Microsoft.ML.Data.EstimatorChain`1.Fit(IDataView input)
at ImageClassification.Train.Program.Main(String[] args) in C:\Users\macie\source\repos\DeepLearning_ImageClassification_Training\ImageClassification.Train\Program.cs:line 74

I think the image data is somehow not correct, because just before the error, during the "Bottleneck Computation" phase, no image names are shown. Example:

Phase: Bottleneck Computation, Dataset used: Train, Image Index: 1, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 2, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 3, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 4, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 5, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 6, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 7, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 8, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 9, Image Name:
Phase: Bottleneck Computation, Dataset used: Train, Image Index: 10, Image Name:

@ashbhandare

The original issue "Google.Protobuf.InvalidProtocolBufferException" has been resolved. The second issue "The size of input lines is not consistent" is being tracked here: #717
