Hello,
first of all, thank you very much for your awesome work!
I am currently looking through the code that fits the FLAME model to the 2D landmarks of a given image, and I am trying to understand it. However, I am not sure what the logic is behind the formula used to create the loss for finding the appropriate scale, translation and rotation to align the projected FLAME landmarks with the target landmarks.
More specifically, I am talking about these two lines:
I do understand that the factor is the maximum range of either the x- or y-coordinates of the target landmarks. But why is the difference between the projected landmarks and the target landmarks squared, reduced and then divided by the square of the factor? Why not calculate the average distance between the projected and the target landmarks and reduce that?
D0miH changed the title from "Understanding loss for the optimization of the rigid transformation" to "Understanding the loss function for the optimization of the rigid transformation" on Jun 21, 2020
Hello,
the loss itself is just the sum of squared L2 distances between the projected 2D landmarks and the target landmarks. One could also use a different distance (for instance the average of absolute distances, which resembles the L1 distance). Taking the average instead of the sum over all landmarks only changes the loss by a constant factor, which does not matter because the number of landmarks is fixed.
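Written out, the relation between the summed and the averaged variant looks as follows. The symbols are chosen here for illustration only and do not come from the repository: $p_i$ is the $i$-th projected landmark, $t_i$ the matching target landmark, and $N$ the number of landmarks.

```latex
L_{\text{sum}} = \sum_{i=1}^{N} \lVert p_i - t_i \rVert_2^2,
\qquad
L_{\text{mean}} = \frac{1}{N} \sum_{i=1}^{N} \lVert p_i - t_i \rVert_2^2
               = \frac{1}{N}\, L_{\text{sum}}
```

Because $N$ is constant, both variants have the same minimizer. Switching from one to the other only rescales the weight of the landmark term relative to the other terms.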
The overall objective is a weighted sum of losses, in which the landmark loss is the only term that depends on the actual image size (all other regularizers are independent of the image size and depend only on the model dimensions). To compensate for this, we divide by a normalization factor that depends on the face size within the image. Without this normalization, the balance between the regularizers and the landmark loss would depend strongly on the size of the face within the image.
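The effect of the normalization can be checked with a small NumPy sketch. This is not the repository's code: the function name `normalized_landmark_loss`, the 51 random landmarks, and the scale factor 4 are assumptions made for illustration. The factor is computed as described in the question, as the larger of the x- and y-range of the target landmarks.

```python
import numpy as np


def normalized_landmark_loss(projected, target):
    """Sum of squared 2D distances, divided by the squared face extent.

    Hypothetical helper for illustration; both inputs have shape (N, 2).
    """
    # Face extent: the larger of the x-range and the y-range of the targets.
    extent = np.max(target.max(axis=0) - target.min(axis=0))
    squared_error = np.sum((projected - target) ** 2)
    return squared_error / extent ** 2


# Synthetic data (assumption): 51 target landmarks and a noisy projection.
rng = np.random.default_rng(0)
target = rng.uniform(0.0, 100.0, size=(51, 2))
projected = target + rng.normal(0.0, 2.0, size=(51, 2))

# The same face, but shown 4 times larger in the image.
scale = 4.0
loss_small = normalized_landmark_loss(projected, target)
loss_large = normalized_landmark_loss(scale * projected, scale * target)

# Unnormalized sums of squared distances for comparison.
raw_small = np.sum((projected - target) ** 2)
raw_large = np.sum((scale * projected - scale * target) ** 2)

print("normalized:", loss_small, loss_large)
print("raw ratio: ", raw_large / raw_small)
```

The two normalized values agree up to floating-point error, while the unnormalized loss grows with the square of the scale. Dividing by the squared factor rather than the factor itself is what cancels this quadratic growth, so the weights of the regularizers keep the same relative influence whatever the face size.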