Ordinals and minor changes to LogitDistLoss #88
Conversation
To address the question you asked me on Discourse.
That is a very fair question, since that file uses a couple of advanced concepts. I have not yet looked into ordinal losses, so I don't have any comments on the concrete implementation yet, but I am happy to walk through how to define such a "decorator" type.

First we need to define the type itself, which needs two type parameters, as you correctly assumed. We bind a type variable L for the wrapped loss and N for the number of levels:

struct OrdinalMarginLoss{L<:MarginLoss,N} <: SupervisedLoss
    loss::L
end

With this type defined we have in principle everything we need to work with it, but it would be unnecessarily verbose:

julia> OrdinalMarginLoss{HingeLoss,5}(HingeLoss())
OrdinalMarginLoss{LossFunctions.L1HingeLoss,5}(LossFunctions.L1HingeLoss())

To make this type more convenient to work with we should also define an outer constructor. We will use the type Val to pass N:

OrdinalMarginLoss(loss::T, ::Type{Val{N}}) where {T,N} = OrdinalMarginLoss{T,N}(loss)

This will allow a shorter constructor where we don't have to repeat typing the loss:

julia> OrdinalMarginLoss(HingeLoss(), Val{5})
OrdinalMarginLoss{LossFunctions.L1HingeLoss,5}(LossFunctions.L1HingeLoss())
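For reference, the whole pattern can be sketched in a self-contained way. The abstract types and DummyLoss below are minimal stand-ins for illustration, not the real LossFunctions definitions, and nlevels is a hypothetical helper:

```julia
# Minimal stand-in type hierarchy -- not the real LossFunctions definitions.
abstract type SupervisedLoss end
abstract type MarginLoss <: SupervisedLoss end
struct DummyLoss <: MarginLoss end

# Decorator type: L is the wrapped loss type, N the number of ordinal levels.
struct OrdinalMarginLoss{L<:MarginLoss,N} <: SupervisedLoss
    loss::L
end

# Outer constructor lifting N into the type domain via Val.
OrdinalMarginLoss(loss::T, ::Type{Val{N}}) where {T,N} = OrdinalMarginLoss{T,N}(loss)

# Hypothetical helper: since N is a type parameter, it can be
# recovered at compile time without storing it as a field.
nlevels(::OrdinalMarginLoss{L,N}) where {L,N} = N

ord = OrdinalMarginLoss(DummyLoss(), Val{5})
@assert ord isa OrdinalMarginLoss{DummyLoss,5}
@assert nlevels(ord) == 5
```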
src/supervised/ordinal.jl
for fun in (:value, :deriv, :deriv2)
    @eval @fastmath function ($fun)(loss::OrdinalMarginLoss, target::Number, output::Number)
        retval = 0
I think a problem you may have is that initializing this variable with an Int may cause type instability, since it is likely that value or deriv return some float.
You're absolutely right, I just checked it with @code_warntype and it shows that retval is of type Any. I'm thinking that I should initialize it with the type of the output, since the type instability is making the function quite slow.
Since you are only using + to accumulate items, it should work if you initialize with the first result.
i.e. set it to retval = ($fun)(loss.loss, 1, output - 1) and change the first loop to for t = 2:target-1.
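A generic sketch of this seeding pattern, outside the PR's macro context (f below is a stand-in for value/deriv, not actual LossFunctions code):

```julia
# f stands in for value/deriv: it may return a Float64 even for Int input.
f(x) = abs2(x) / 2

# Seeding with an Int literal makes retval change type on the first +=.
function accumulate_unstable(xs)
    retval = 0
    for x in xs
        retval += f(x)
    end
    retval
end

# Seeding with the first result fixes retval's type from the start.
function accumulate_stable(xs)
    retval = f(xs[1])
    for x in xs[2:end]
        retval += f(x)
    end
    retval
end

@assert accumulate_stable([1, 2, 3]) == 7.0
@assert accumulate_stable([1, 2, 3]) == accumulate_unstable([1, 2, 3])
```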
Actually, it seems that even if I specify retval = zero(output), I still get type instability: the type of retval is inferred to be Any. How would you enforce type stability in this case?
That's because in

struct OrdinalMarginLoss <: SupervisedLoss
    loss::MarginLoss
    nlevels::Int
end

the member loss is weakly typed (its declared type is the abstract MarginLoss). This means that no matter how beautifully clean the function value is, the call value(loss.loss, ...) will always be a run-time lookup where the compiler can't know which method will be called. Try defining the type as I outlined in an earlier post, using type variables.
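A self-contained toy illustration of the difference (hypothetical types, not the LossFunctions ones):

```julia
abstract type AbstractThing end
struct Thing <: AbstractThing end
compute(::Thing, x) = 2x

# Abstractly typed field: compute(w.inner, x) is a run-time dispatch,
# because the compiler only knows that inner isa AbstractThing.
struct WeakWrapper
    inner::AbstractThing
end

# Field type as a parameter: for a StrongWrapper{Thing} the compiler
# knows inner's concrete type and can dispatch statically.
struct StrongWrapper{T<:AbstractThing}
    inner::T
end

call(w, x) = compute(w.inner, x)

# Same results either way; compare the two with @code_warntype to see
# that only the parametric version infers a concrete return type.
@assert call(WeakWrapper(Thing()), 3) == 6
@assert call(StrongWrapper(Thing()), 3) == 6
```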
Sorry, I forgot to provide some more context. I've redefined

struct OrdinalMarginLoss{L<:MarginLoss, N} <: SupervisedLoss
    loss::MarginLoss
end

function OrdinalMarginLoss(loss::T, ::Type{Val{N}}) where {T<:MarginLoss,N}
    typeof(N) <: Number || _serror()
    OrdinalMarginLoss{T,N}(loss)
end

as you outlined, and I'm trying to define the value, deriv, and deriv2 functions as follows:
for fun in (:value, :deriv, :deriv2)
    @eval @fastmath @generated function ($fun)(loss::OrdinalMarginLoss{T, N},
            target::Number, output::Number) where {T <: MarginLoss, N}
        quote
            retval = zero(output)
            @nexprs $N t -> begin
                not_target = (t != target)
                sgn = sign(target - t)
                retval += not_target * ($($fun))(loss.loss, sgn, output - t)
            end
            retval
        end
    end
end
It gives the correct answer, but @code_warntype still flags retval as being of type ::Any. Is there a different reason for this?
struct OrdinalMarginLoss{L<:MarginLoss,N} <: SupervisedLoss
    loss::L
end

Note the loss::L. This is the important part and the sole reason we even want the type of the loss as a type parameter.
I wouldn't bother with the generated function and @fastmath at first. Try to make the currently submitted implementation type stable first; otherwise you are dealing with many complex factors at the same time, which can be quite a hassle when trying to track down issues.
Plus, I think almost all of the performance benefit will come from type stability, with the unrolling probably being a small micro-optimization.
I missed the
Thanks for investing time in this.
After some testing, it seems that the loop version and the unrolled version are almost equally fast, but I think the loop is more readable. I just pushed the update to this PR.
src/supervised/ordinal.jl
for fun in (:value, :deriv, :deriv2)
    @eval @fastmath function ($fun)(loss::OrdinalMarginLoss{T, N},
            target::Number, output::Number) where {T <: MarginLoss, N}
        retval = zero(output)
I would still suggest using the first loop element as initialization, to catch all kinds of sneaky type-instability edge cases. For example, this code is unstable in the following case:
julia> using LossFunctions
julia> output = 1 # Int
1
julia> value(L2HingeLoss(), -1., output)
4.0
julia> value(LogitMarginLoss(), -1, output)
1.3132616875182228
This is because the type of the return value depends on the type of target, the type of output, and on what the loss does. Some losses, such as LogitMarginLoss, will always result in a float no matter the type of output.
That does make sense, and it's easier to reason about types if the underlying MarginLoss is responsible for determining the output type.
Changes made.
src/supervised/ordinal.jl
retval = zero(output)
for t = 1:N
    not_target = (t != target)
    sgn = sign(target - t)
Does this always work out? I remember sign having some inconvenient behaviour, for example:

julia> sign(0)
0
The only time that sign(x) ∉ {-1, 1} is indeed when x == 0. However, in this case target == t, so the whole thing gets multiplied by 0 anyway, which is what should happen.
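A concrete sketch of that edge case (simplified: the multiplier below stands in for the loss term in the actual loop):

```julia
# At t == target, sign(target - t) degenerates to 0, but the not_target
# flag is also 0, so the whole term vanishes regardless.
target, t, output = 3, 3, 2.5
not_target = (t != target)      # false, which acts as 0 when multiplied
sgn = sign(target - t)          # sign(0) == 0
term = not_target * sgn * (output - t)
@assert sgn == 0
@assert term == 0
```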
Sorry for the delay. This PR is on my agenda today!
Very nice, thanks!
LogitDistLoss incurs numerical problems for large values of diff, since there is an exp(abs2(...)) term which blows up to Inf very quickly. These problems are improved somewhat by simplifying the equation.
I also added an "ordinalization" of margin losses that allows them to predict over ordinal targets, so that an output greater than the maximum or less than the minimum target is not penalized as strongly as it would be by a DiffLoss.
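A minimal illustration of the kind of overflow being addressed (generic, not the exact LogitDistLoss formula):

```julia
# exp of a large squared difference overflows Float64 to Inf...
diff = 30.0
@assert exp(abs2(diff)) == Inf    # exp(900) overflows (exp caps near 709)

# ...whereas algebraically rearranged forms that keep the exponent small,
# e.g. working through log1p, stay finite:
@assert isfinite(log1p(exp(-abs(diff))))
```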