MLLabelUtils is a utility library for transforming targets, which can arrive in many forms (booleans, values in $\{-1,1\}$, continuous variables), into a specific label encoding. This matters in many classification problems: for instance, NNs expect vectors of the form $[1,0,0]$ for three classes, while SVMs expect scalars in the set $\{-1,1\}$.


In [69]:
using MLLabelUtils
include("load_titanic.jl");

In [74]:
train, train_targets, test, test_targets = load();

Note that $train\_targets$ is an Int64 array of ones and zeros.
To detect this encoding automatically, we could use:

In [79]:
# Automatically infers the best-fitting label encoding
encoded_targets = labelenc(train_targets)

MLLabelUtils.LabelEnc.ZeroOne{Int64,Float64}(0.5)

Note that for the previous step to work, the array must be truly 1-dimensional. Targets are often stored as one of the columns of the training dataset, and a column taken from a matrix may come back as a matrix with a single column rather than as a vector. In that case this method fails: it treats the input as a matrix and therefore returns a "OneOfK" encoding.

To avoid this:

In [76]:
println("Current type is: $(typeof(train_targets))")
train_targets = train_targets[:]
println("The modified type is: $(typeof(train_targets))")

Current type is: Array{Int64,2}
The modified type is: Array{Int64,1}
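
The difference between a one-column matrix and a vector can be reproduced with base Julia alone. The sketch below uses a small made-up matrix `data` (hypothetical, not the Titanic dataset) whose last column holds the targets, and assumes Julia 1.x:

```julia
# Hypothetical toy dataset: the last column holds the 0/1 targets.
data = [22 1 0;
        38 0 1;
        26 0 1]

col_matrix = data[:, 3:3]   # a range index keeps both dimensions (3×1 matrix)
col_vector = data[:, 3]     # a scalar index drops the dimension (3-element vector)

println(ndims(col_matrix))  # number of dimensions of the one-column matrix
println(ndims(col_vector))  # number of dimensions of the vector

# Either of these flattens a one-column matrix into a true vector.
flat_a = col_matrix[:]      # returns a new vector (copy)
flat_b = vec(col_matrix)    # reshapes, sharing memory with col_matrix
```

Whether a column arrives as a matrix or a vector depends on how it was indexed or loaded, so checking `ndims` before calling `labelenc` is a cheap safeguard.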


And now we would get the correct format:

In [None]:
encoded_targets = labelenc(train_targets);


If we were to train an SVM, we would need margin-based targets, which can be computed with:

In [78]:
convertlabel(LabelEnc.MarginBased, train_targets);
# Returns an array of 1 and -1
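
For 0/1 targets, the margin-based encoding amounts to mapping 0 to -1 and leaving 1 unchanged. A hand-rolled sketch in base Julia follows; it is illustrative only, not how MLLabelUtils implements `convertlabel`, and `to_margin` is a hypothetical helper:

```julia
# Hypothetical helper: map 0/1 targets to -1/+1 while keeping the element type.
to_margin(y) = map(v -> v == 1 ? one(v) : -one(v), y)

y = Int8[0, 1, 0, 1, 1]
margin = to_margin(y)
println(margin)
println(eltype(margin))
```

Using `one(v)` rather than the literal `1` keeps the result in the input's element type instead of promoting it to `Int64`.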

In [8]:

# Checks whether the given targets conform to a specific encoding
islabelenc(train_targets, LabelEnc.ZeroOne); # true
islabelenc(train_targets, LabelEnc.MarginBased); # false
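
Conceptually, such a check only needs to verify that every value belongs to the set allowed by the encoding. A minimal base-Julia sketch of that idea is given here; the helper names are hypothetical, and the library's own checks may be stricter or differ in detail:

```julia
# Hypothetical helpers: test membership in the value set of each encoding.
is_zero_one(y) = all(v -> v == 0 || v == 1, y)
is_margin(y)   = all(v -> v == 1 || v == -1, y)

y = [0, 1, 1, 0]
println(is_zero_one(y))
println(is_margin(y))
```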

In [7]:
# Can also compute the label map: a dictionary that gives, for each label, the indices of the observations carrying it
true_targets = [:yes,:no,:maybe,:yes, :maybe];
labelmap(true_targets)

Dict{Symbol,Array{Int64,1}} with 3 entries:
  :yes   => [1, 4]
  :maybe => [3, 5]
  :no    => [2]
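
The same kind of dictionary can be built by hand in a single pass over the targets. This is only a sketch of the idea in base Julia, with `index_map` as a hypothetical helper rather than the library's implementation:

```julia
# Hypothetical helper: for each distinct label, collect the positions where it occurs.
function index_map(targets)
    result = Dict{eltype(targets), Vector{Int}}()
    for (i, label) in enumerate(targets)
        # get! returns the existing index list, or inserts and returns an empty one
        push!(get!(result, label, Int[]), i)
    end
    return result
end

targets = [:yes, :no, :maybe, :yes, :maybe]
m = index_map(targets)
println(m[:yes])
```

Such a map is convenient for stratified sampling, since it gives direct access to all observations of a given class.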

One can convert one type of label to another:

In [9]:
true_targets = Int8[0, 1, 0, 1, 1];

In [10]:
convertlabel([:yes,:no], true_targets); # Equivalent to LabelEnc.NativeLabels([:yes,:no])

In [11]:
convertlabel(LabelEnc.MarginBased, true_targets) # Preserves eltype

5-element Array{Int8,1}:
 -1
  1
 -1
  1
  1

In [12]:
convertlabel(LabelEnc.MarginBased(Float32), true_targets) # Force new eltype

5-element Array{Float32,1}:
 -1.0
  1.0
 -1.0
  1.0
  1.0

In [15]:
# Converting to a one-of-k (one-hot) encoding for NNs; each column is one observation
convertlabel(LabelEnc.OneOfK(Float32,3), [-1,1,-1,1,0,1,-1])

3×7 Array{Float32,2}:
 1.0  0.0  1.0  0.0  0.0  0.0  1.0
 0.0  1.0  0.0  1.0  0.0  1.0  0.0
 0.0  0.0  0.0  0.0  1.0  0.0  0.0
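
The output above is consistent with the labels being assigned to rows in order of first appearance (-1, then 1, then 0). Under that assumption, the encoding and its inverse can be sketched in base Julia (Julia 1.x assumed; this is a hand-rolled illustration, not the library's code):

```julia
y = [-1, 1, -1, 1, 0, 1, -1]

# Distinct labels in order of first appearance (assumed row order).
labels = unique(y)

# One row per label, one column per observation.
onehot = Float32[labels[i] == y[j] for i in 1:length(labels), j in 1:length(y)]

# Decode: the row index of the largest entry in each column selects the label.
decoded = [labels[argmax(onehot[:, j])] for j in 1:size(onehot, 2)]

println(size(onehot))
println(decoded == y)
```

The decoding step is the same operation one would apply to the output layer of a network to turn per-class scores back into a predicted label.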