In [1]:
#r "nuget: Microsoft.ML"
using System.Collections.Generic; // List<T>, used for the sample data below
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;

This example comes from the ML.NET documentation: https://docs.microsoft.com/en-us/dotnet/api/microsoft.ml.transformextensionscatalog.copycolumns?view=ml-dotnet

In [2]:
class InputData
{
    public int ImageId { get; set; }
    public float[] Features { get; set; }
}

// Adds the Label column produced by CopyColumns; ImageId and Features are inherited.
class TransformedData : InputData
{
    public int Label { get; set; }
}

In [3]:
var mlContext = new MLContext();

In [4]:
var samples = new List<InputData>()
{
    new InputData(){ ImageId = 1, Features = new [] { 1.0f, 1.0f, 1.0f } },
    new InputData(){ ImageId = 2, Features = new [] { 2.0f, 2.0f, 2.0f } },
    new InputData(){ ImageId = 3, Features = new [] { 3.0f, 3.0f, 3.0f } },
    new InputData(){ ImageId = 4, Features = new [] { 4.0f, 4.0f, 4.0f } },
    new InputData(){ ImageId = 5, Features = new [] { 5.0f, 5.0f, 5.0f } },
    new InputData(){ ImageId = 6, Features = new [] { 6.0f, 6.0f, 6.0f } },
};

In [5]:
var dataview = mlContext.Data.LoadFromEnumerable(samples);

CopyColumns is commonly used to rename columns. For example, if you want to use ImageId as the training target and your trainer expects a column named "Label", you can use CopyColumns to expose ImageId under the name Label. Strictly speaking this is a copy, not a rename: the ImageId column still exists, but it won't be materialized unless something actually requests it (for example, if you save the transformed data without explicitly dropping the column). This is a general property of IDataView's lazy evaluation.

In [6]:
var pipeline = mlContext.Transforms.CopyColumns("Label", "ImageId");

In [7]:
var transformedData = pipeline.Fit(dataview).Transform(dataview);

In [8]:
mlContext.Data.CreateEnumerable<TransformedData>(transformedData, reuseRowObject: false)

index,Label,ImageId,Features
0,1,1,"[ 1, 1, 1 ]"
1,2,2,"[ 2, 2, 2 ]"
2,3,3,"[ 3, 3, 3 ]"
3,4,4,"[ 4, 4, 4 ]"
4,5,5,"[ 5, 5, 5 ]"
5,6,6,"[ 6, 6, 6 ]"
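
The output above still carries ImageId next to Label, because CopyColumns copies rather than renames. If the original name should disappear from the schema, one option is to chain DropColumns after the copy. The cell below is a minimal sketch of that idea, not part of the original documentation example: the types SketchInput and SketchRenamed and the variable names are hypothetical, chosen so they do not clash with the cells above, and the sample is cut down to two rows.

```csharp
#r "nuget: Microsoft.ML"
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML;

// Hypothetical row types for this sketch (not from the original example).
class SketchInput
{
    public int ImageId { get; set; }
    public float[] Features { get; set; }
}

// Output shape after the rename: Label replaces ImageId.
class SketchRenamed
{
    public int Label { get; set; }
    public float[] Features { get; set; }
}

var sketchContext = new MLContext();
var sketchData = sketchContext.Data.LoadFromEnumerable(new List<SketchInput>
{
    new SketchInput { ImageId = 1, Features = new[] { 1.0f, 1.0f, 1.0f } },
    new SketchInput { ImageId = 2, Features = new[] { 2.0f, 2.0f, 2.0f } },
});

// Copy ImageId to Label, then drop ImageId so only the new name remains.
var renamePipeline = sketchContext.Transforms.CopyColumns("Label", "ImageId")
    .Append(sketchContext.Transforms.DropColumns("ImageId"));
var renamed = renamePipeline.Fit(sketchData).Transform(sketchData);

// Collect the column names exposed by the output schema and print them.
var columnNames = renamed.Schema.Select(column => column.Name).ToList();
Console.WriteLine(string.Join(", ", columnNames));

// Read the rows back through the type that no longer has ImageId.
var rows = sketchContext.Data
    .CreateEnumerable<SketchRenamed>(renamed, reuseRowObject: false)
    .ToList();
```

Inspecting the schema, rather than only the rows, is the more direct check here: CreateEnumerable only reads the columns that the target type declares, so a row type without ImageId would hide the column even if it were still present.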
