
Searching for documentation on image regression #5595

Open
Titibo26 opened this issue Jan 21, 2021 · 5 comments
Labels
image (Bugs related image datatype tasks), question (Further information is requested), regression (Bugs related regression tasks)

Comments

@Titibo26

Titibo26 commented Jan 21, 2021

Hi,

I'd like to train a model to do depth estimation on monocular RGB pictures. I think this can be done through regression with ResNet or DenseNet.

I have a dataset ( https://cs.nyu.edu/~silberman/datasets/nyu_depth_v2.html ) with pairs of pictures (input / expected result):
rgb_img_1 / depth_img_1
And I have an Excel file with the path to each file.

I started with the image classification tutorial ( https://docs.microsoft.com/fr-fr/dotnet/machine-learning/tutorials/image-classification ), but now I have to turn it into a regression model, as I'm looking for depth values for each pixel of a picture.

I know that I have to change my model generation:

```csharp
public static ITransformer GenerateModel(MLContext mlContext)
{
    IDataView trainingData = mlContext.Data.LoadFromTextFile<ImageData>(path: _trainTagsCsv, separatorChar: ',', hasHeader: false);

    IEstimator<ITransformer> pipeline = mlContext.Transforms.LoadImages(outputColumnName: "input", imageFolder: _imagesFolder, inputColumnName: nameof(ImageData.InputImagePath))
        // The image transforms convert the images into the model's expected format.
        .Append(mlContext.Transforms.ResizeImages(outputColumnName: "input", imageWidth: InceptionSettings.ImageWidth, imageHeight: InceptionSettings.ImageHeight, inputColumnName: "input"))
        .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "input", interleavePixelColors: InceptionSettings.ChannelsLast, offsetImage: InceptionSettings.Mean))
        .Append(mlContext.Model.LoadTensorFlowModel(_inceptionTensorFlowModel)
            .ScoreTensorFlowModel(outputColumnNames: new[] { "softmax2_pre_activation" }, inputColumnNames: new[] { "input" }, addBatchDimensionInput: true))
        .Append(mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: "LabelKey", inputColumnName: "Label"))
        .Append(mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(labelColumnName: "LabelKey", featureColumnName: "softmax2_pre_activation"))
        .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabelValue", "PredictedLabel"))
        .AppendCacheCheckpoint(mlContext);

    ITransformer model = pipeline.Fit(trainingData);

    IDataView testData = mlContext.Data.LoadFromTextFile<ImageData>(path: _testTagsCsv, hasHeader: false);
    IDataView predictions = model.Transform(testData);

    // Create an IEnumerable of the predictions for displaying results.
    IEnumerable<ImagePrediction> imagePredictionData = mlContext.Data.CreateEnumerable<ImagePrediction>(predictions, true);
    DisplayResults(imagePredictionData);

    MulticlassClassificationMetrics metrics = mlContext.MulticlassClassification
        .Evaluate(predictions, labelColumnName: "LabelKey", predictedLabelColumnName: "PredictedLabel");

    Console.WriteLine($"LogLoss is: {metrics.LogLoss}");
    Console.WriteLine($"PerClassLogLoss is: {String.Join(" , ", metrics.PerClassLogLoss.Select(c => c.ToString()))}");

    return model;
}
```

Could you tell me where I can find docs and resources to understand:

  • how to choose and use an appropriate model
  • how to transform my inputs to make them usable by the model

  1. Also, I have a .onnx of DenseNet. Would it be easier to go this way instead of using an ML.NET model? (But I'd like to deeply understand the ML.NET framework.)

  2. Also, I took a look at AutoML but I don't think it can solve my regression problem with image inputs. Is this right?

Thanks,

@michaelgsharp added the question (Further information is requested) and regression (Bugs related regression tasks) labels Jan 21, 2021
@michaelgsharp
Member

@justinormont could you take a quick look at this?

@justinormont
Contributor

Using AutoML

@Titibo26 -- You can have AutoML run the model selection and hyperparameter optimization for you. You would want to feed your current image pipeline as a preFeaturizer to AutoML. See: preFeaturizer example

You'll want to also tell AutoML to ignore the original filenames, otherwise it will treat them as features, causing leakage.

Non-AutoML

For your current pipeline, you can just replace your current trainer with a regression trainer. You also won't need the MapValueToKey and MapKeyToValue.

For the evaluation step, you'll need to switch from MulticlassClassificationMetrics to RegressionMetrics and switch mlContext.MulticlassClassification to mlContext.Regression along with reading corresponding regression metrics, like r2, instead of log-loss.
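A minimal sketch of that swap, assuming the same column names as the tutorial code above; `imageFeaturizer` stands in for the unchanged LoadImages/ResizeImages/ExtractPixels/ScoreTensorFlowModel stages, and the choice of the Sdca trainer is an assumption (any regression trainer would do):

```csharp
// Sketch only: `imageFeaturizer` abbreviates the tutorial's image pipeline.
// Assumes the "Label" column already contains a numeric (Single) target.
IEstimator<ITransformer> pipeline = imageFeaturizer
    // No MapValueToKey / MapKeyToValue: regression labels stay plain Singles.
    .Append(mlContext.Regression.Trainers.Sdca(
        labelColumnName: "Label",
        featureColumnName: "softmax2_pre_activation"))
    .AppendCacheCheckpoint(mlContext);

ITransformer model = pipeline.Fit(trainingData);

// Evaluate with regression metrics instead of multiclass ones.
IDataView predictions = model.Transform(testData);
RegressionMetrics metrics = mlContext.Regression.Evaluate(predictions, labelColumnName: "Label");
Console.WriteLine($"R^2: {metrics.RSquared}, RMSE: {metrics.RootMeanSquaredError}");
```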

@justinormont added the image (Bugs related image datatype tasks) label Jan 21, 2021
@Titibo26
Author

Hi, thank you for the answer.
I'm giving AutoML a try as it can be very useful for some applications. After a few attempts I'm starting over from the preFeaturizer example. I don't understand some parts of the process of running an experiment.

Firstly, the DataView interface that I need should be a collection of objects containing the RGB string path AND the depth string path, right?

Regarding the example, the process creates a dataview containing paths to image files, as shown:
// Show data in DataView: Showing 1 rows with the columns
// ######################################################
// Row--> | Label:data/nyu2_train/living_room_0038_out/37.jpg| col1:data/nyu2_train/living_room_0038_out/37.png

The next step is about pre-featurizing data. Regarding the method's doc, I guess the aim is to tell AutoML which value is what.
So my KeyValuePair list should be KeyValuePair<string, string>, as for each image the path of the RGB is the input and the path of the depth is the output, right?

The next step is about customizing column information. This is where I need to tell AutoML that my column with the RGB path and my column with the depth path are images. I also need to tell it not to take the column name / label into account.

I can then handle the next steps of the experiment...
So my biggest misunderstanding is about how to go from a CSV file containing image paths and depth paths to ready-to-experiment data.

So far, when I run my experiment with experiment.Execute(TrainDataView, columnInformation, preFeaturizer, progressHandler);, this exception is thrown:
System.ArgumentException : 'Provided label column 'Label' was of type String, but only type Single is allowed.'

I think my mistakes are in the "pre-featurizing" and "customizing column information" steps.
Here is my code so far (I implemented almost everything in one method to make tests and reporting easier):

```csharp
public static void GenerateModel()
{
    var mlContext = new MLContext();

    // My CSV file named "_trainTagsCsv" contains 1 column with the path of an RGB img followed by the path of the corresponding depth img (separated by ',')
    // EX: data/nyu2_train/living_room_0038_out/37.jpg,data/nyu2_train/living_room_0038_out/37.png

    // Feeding the columnInference with the CSV containing the previous data.
    ColumnInferenceResults columnInference = mlContext.Auto().InferColumns(_trainTagsCsv, 0, false, ',', null, null, false, false);
    // columnInference should provide one column with paths to the RGB image and one column with paths to the depth image.

    // Load data from files using inferred columns.
    TextLoader textLoader = mlContext.Data.CreateTextLoader(columnInference.TextLoaderOptions);
    TrainDataView = textLoader.Load(_trainTagsCsv);
    TestDataView = textLoader.Load(_testTagsCsv);
    // Each dataView should contain, for each image, the path to the RGB and the path to the depth.

    // STEP 1: Display first few rows of the training data.
    ConsoleHelper.ShowDataViewInConsole(mlContext, TrainDataView);
    // ConsoleHelper output:
    // Show data in DataView: Showing 4 rows with the columns
    // ######################################################
    // Row--> | Label:data/nyu2_train/living_room_0038_out/37.jpg| col1:data/nyu2_train/living_room_0038_out/37.png

    // STEP 2: Build a pre-featurizer for use in the AutoML experiment.
    IEstimator<ITransformer> preFeaturizer = mlContext.Transforms.Conversion.MapValue("depth_img", new[] { new KeyValuePair<string, string>("rgb_path", "depth_path") }, "rgb_img");

    // STEP 3: Customize column information returned by InferColumns API.
    ColumnInformation columnInformation = columnInference.ColumnInformation;
    columnInformation.CategoricalColumnNames.Remove("label");
    columnInference.ColumnInformation.ImagePathColumnNames.Add("label");

    // STEP 4: Initialize a cancellation token source to stop the experiment.
    var cts = new CancellationTokenSource();

    // STEP 5: Initialize our user-defined progress handler that AutoML will
    // invoke after each model it produces and evaluates.
    var progressHandler = new RegressionExperimentProgressHandler();

    // STEP 6: Create experiment settings.
    var experimentSettings = new RegressionExperimentSettings();
    experimentSettings.MaxExperimentTimeInSeconds = 3600;

    // Set the metric that AutoML will try to optimize over the course of the experiment.
    experimentSettings.OptimizingMetric = RegressionMetric.RootMeanSquaredError;

    // Set the cache directory to null.
    // This will cause all models produced by AutoML to be kept in memory
    // instead of written to disk after each run, as AutoML is training.
    experimentSettings.CacheDirectory = null;

    // Don't use LbfgsPoissonRegression and OnlineGradientDescent trainers during this experiment.
    experimentSettings.Trainers.Remove(RegressionTrainer.LbfgsPoissonRegression);
    experimentSettings.Trainers.Remove(RegressionTrainer.OnlineGradientDescent);

    // Cancel the experiment after the user presses any key.
    experimentSettings.CancellationToken = cts.Token;
    CancelExperimentAfterAnyKeyPress(cts);

    // STEP 7: Run AutoML regression experiment.
    var experiment = mlContext.Auto().CreateRegressionExperiment(experimentSettings);

    ConsoleHelper.ConsoleWriteHeader("=============== Running AutoML experiment ===============");
    Console.WriteLine($"Running AutoML regression experiment...");
    var stopwatch = Stopwatch.StartNew();
    ExperimentResult<RegressionMetrics> experimentResult = experiment.Execute(TrainDataView, columnInformation, preFeaturizer, progressHandler);
    // ERROR THROWN: System.ArgumentException : 'Provided label column 'Label' was of type String, but only type Single is allowed.'

    Console.WriteLine($"{experimentResult.RunDetails.Count()} models were returned after {stopwatch.Elapsed.TotalSeconds:0.00} seconds{Environment.NewLine}");

    /*
    IDataView predictions = model.Transform(TestDataView);
    var metrics = mlContext.Regression.Evaluate(predictions, labelColumnName: LabelColumnName, scoreColumnName: "Score");
    */

    // To use if I need to manipulate images?
    /*
    .Append(mlContext.Transforms.ResizeImages(outputColumnName: "input", imageWidth: MidasSettings.ImageWidth, imageHeight: MidasSettings.ImageHeight, inputColumnName: "input"))
    .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "input"))
    */
}
```
Could you please take a look at steps 2 and 3 (pre-featurizer and customize column) and tell me where I am wrong?
Thank you,

TD

@justinormont
Contributor

'Provided label column 'Label' was of type String, but only type Single is allowed.'

For a regression task, you'll have to ensure the datatype of the "Label" column is Single.

It would be helpful if you could post a zip of the full solution, or more simply place an unzipped copy in a new GitHub repo.
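One hedged sketch of a loader whose label is already Single. The `DepthData` class, the `DepthValue` column, and the scalar-target assumption are mine, not from the thread: a per-pixel depth map cannot be a single Single, so a scalar target (e.g. mean depth per image) is assumed here purely for illustration:

```csharp
// Sketch: declare the label column as Single (float) when loading the CSV,
// so AutoML's regression task sees a numeric target rather than a string path.
public class DepthData
{
    [LoadColumn(0)]
    public string RgbImagePath;

    // Hypothetical scalar target; float == Single, the type AutoML requires.
    [LoadColumn(1), ColumnName("Label")]
    public float DepthValue;
}

IDataView data = mlContext.Data.LoadFromTextFile<DepthData>(
    path: _trainTagsCsv, separatorChar: ',', hasHeader: false);
```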

PreFeaturizer

The preFeaturizer will consist of the pipeline which loads the image from disk, extracts the pixels, and runs the ONNX/TF model.

```csharp
IEstimator<ITransformer> preFeaturizer = mlContext.Transforms.LoadImages(outputColumnName: "input", imageFolder: _imagesFolder, inputColumnName: nameof(ImageData.InputImagePath))
    // The image transforms convert the images into the model's expected format.
    .Append(mlContext.Transforms.ResizeImages(outputColumnName: "input", imageWidth: InceptionSettings.ImageWidth, imageHeight: InceptionSettings.ImageHeight, inputColumnName: "input"))
    .Append(mlContext.Transforms.ExtractPixels(outputColumnName: "input", interleavePixelColors: InceptionSettings.ChannelsLast, offsetImage: InceptionSettings.Mean))
    .Append(mlContext.Model.LoadTensorFlowModel(_inceptionTensorFlowModel)
        .ScoreTensorFlowModel(outputColumnNames: new[] { "softmax2_pre_activation" }, inputColumnNames: new[] { "input" }, addBatchDimensionInput: true));
```

Column purposes

```csharp
// STEP 3: Customize column information returned by InferColumns API.
ColumnInformation columnInformation = columnInference.ColumnInformation;
columnInformation.CategoricalColumnNames.Remove("label");
columnInformation.LabelColumnName = "label";
columnInformation.IgnoredColumnNames.Add(nameof(ImageData.InputImagePath));
```

@Titibo26
Author

Titibo26 commented Jan 22, 2021

Here is a repo with the zipped solution:
https://github.com/Titibo26/Depth_Estimation_withAutoMl
Code is in "Depth_estimator.cs". Some RGB and depth pictures are in "data", and "nyu2_subset_train.csv" contains all the paths. Normally you should be able to launch the app, which will execute "GenerateModel()" when loaded.

I still can't figure out how to convert my CSV with image paths into image input for AutoML.
Please forget about running an ONNX/TF model for now; I'd like to understand AutoML.
I still think my error is with the preFeaturizer AND/OR the columnInference.

Thank you

EDIT: In my final implementation it seems that my data gets loaded, but an exception is fired when I execute the experiment:
System.ArgumentException : 'Duplicate column name Label is present in two or more distinct properties of provided column information', even if I ignore the label column. In my columnInformation object, I can't see two label names being the same...
Can it be related to https://github.com/dotnet/machinelearning/issues/4263 ?
Or can it be because my img files and my depth files have the same name except for the extension (in which case does the preFeaturizer ignore .png and .jpeg)?
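A small diagnostic sketch that may help locate the duplicate: it walks each purpose collection of `ColumnInformation` and reports any column name assigned more than one purpose, which is the kind of conflict this exception typically points at. The property names are from ML.NET's AutoML API; the checking logic itself is only illustrative:

```csharp
// Sketch: report columns that appear in more than one purpose list
// (or both as LabelColumnName and in a list).
var ci = columnInference.ColumnInformation;
var seen = new Dictionary<string, string>();

void Check(IEnumerable<string> names, string purpose)
{
    foreach (var name in names)
        Console.WriteLine(seen.TryAdd(name, purpose)
            ? $"{name} -> {purpose}"
            : $"DUPLICATE: {name} is both {seen[name]} and {purpose}");
}

Check(new[] { ci.LabelColumnName }, "Label");
Check(ci.CategoricalColumnNames, "Categorical");
Check(ci.NumericColumnNames, "Numeric");
Check(ci.TextColumnNames, "Text");
Check(ci.ImagePathColumnNames, "ImagePath");
Check(ci.IgnoredColumnNames, "Ignored");
```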
