
weights in curve_fit #69

Closed
jcrbloch opened this Issue May 2, 2018 · 6 comments

jcrbloch commented May 2, 2018

I should have opened a new issue for this, so here it comes:
The parameter w for weights in curve_fit is not very thoroughly documented. What I see is:
w: (optional) weight applied to the residual; can be a vector (of length(x) size or empty) or matrix (inverse covariance matrix)
From this I assumed that when providing w as a vector the routine expects the inverse of the variance, as this would be in line with the concept of covariance matrix if w is given as a matrix.

However, from my application, and after comparing my results with fits done in C and with gnuplot, it looks as if curve_fit uses the weight vector as inverse standard deviations rather than inverse variances. I had a quick non-expert look at the curve_fit source, and that is also what I think I see in the code. Could you confirm this and explain the logic?
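To make the two readings concrete, here is a small standalone sketch (plain Python rather than Julia, with made-up numbers) of the objective each convention produces:

```python
# Weighted least squares minimizes chi2 = sum(w_i * r_i^2) with w_i = 1/var_i.
# If a solver instead multiplies the *residuals* by the weight vector before
# squaring, then passing 1/var_i minimizes a different objective; the residual
# scaling must use 1/std_i, the square root of the weight.

residuals = [0.5, -1.0, 2.0]   # hypothetical residuals r_i = y_i - f(x_i)
sigma = [0.1, 0.2, 0.5]        # hypothetical per-point standard deviations

# Correct chi-square: inverse variances applied to squared residuals.
chi2 = sum((1.0 / s**2) * r**2 for r, s in zip(residuals, sigma))

# Equivalent: scale each residual by 1/std, then square.
chi2_scaled = sum(((1.0 / s) * r)**2 for r, s in zip(residuals, sigma))

# Not equivalent: scale each residual by the inverse variance, then square.
chi2_wrong = sum(((1.0 / s**2) * r)**2 for r, s in zip(residuals, sigma))

print(chi2, chi2_scaled, chi2_wrong)
```

So if the solver squares its residual function internally, the vector it is handed must be 1/std for the chi-square to come out right, which would explain the behaviour I am seeing.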

Kind regards,
Jacques Bloch


Contributor

iewaij commented Jun 19, 2018

Hi @jcrbloch thanks for reporting the issue!

Could you elaborate on why "curve_fit uses the weight vector as inverse standard deviations"? In LsqFit.jl, the weight is assumed to be 1/var(ε_i) and the covariance matrix is calculated as inv(J' * fit.wt * J), in the same way as defined in the GNU Scientific Library. Following Weighted and General Least Squares, I think the logic behind it is essentially:

Under heteroskedastic errors, where Ω is a diagonal matrix, the GLS estimator \hat{\beta} = (J' \Omega^{-1} J)^{-1} J' \Omega^{-1} Y has cov(\hat{\beta}) = \sigma^2 (J' \Omega^{-1} J)^{-1}. If w = cov(ε)^{-1} = \Omega^{-1} / \sigma^2, then cov(\hat{\beta}) = \sigma^2 (J' \Omega^{-1} J)^{-1} = (J' w J)^{-1} and \hat{\beta} is the best linear unbiased estimator (BLUE).
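As a quick numeric check of this formula (hypothetical data, plain Python rather than Julia): for a linear model y = a + b*x the weighted normal equations (J' W J) β = J' W Y with W = Diagonal(w), w_i = 1/var(ε_i), give the estimator, and inv(J' W J) is its covariance:

```python
xs = [0.0, 1.0, 2.0, 3.0]   # hypothetical predictor values
ys = [1.0, 2.9, 5.2, 6.8]   # hypothetical observations
w = [4.0, 1.0, 1.0, 0.25]   # hypothetical inverse variances 1/var(eps_i)

# Accumulate J' W J (2x2) and J' W Y (2x1) for Jacobian rows [1, x_i].
s00 = sum(w)
s01 = sum(wi * x for wi, x in zip(w, xs))
s11 = sum(wi * x * x for wi, x in zip(w, xs))
t0 = sum(wi * y for wi, y in zip(w, ys))
t1 = sum(wi * x * y for wi, x, y in zip(w, xs, ys))

# Solve the 2x2 system (J' W J) [a, b] = J' W Y by Cramer's rule.
det = s00 * s11 - s01 * s01
a = (s11 * t0 - s01 * t1) / det   # intercept
b = (-s01 * t0 + s00 * t1) / det  # slope

# Diagonal of cov(beta_hat) = inv(J' W J).
cov_aa, cov_bb = s11 / det, s00 / det
print(a, b, cov_aa, cov_bb)
```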

jcrbloch commented Jun 19, 2018

jcrbloch commented Jun 19, 2018

To comment further, after looking at issue #71: I think the implementation of the fit with an error matrix was perfectly fine, and there is no need to change it. The problem only arises when passing an error vector, as I explained in my detailed post above.
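For what it's worth, a sketch (plain Python, hypothetical numbers) of why the matrix form is internally consistent: with an inverse covariance matrix W, the solver can minimize r' W r by Cholesky-factoring W = L L' and scaling the residuals to u = L' r, since then r' W r = u' u. For a diagonal W this scaling reduces to multiplying by sqrt.(w), i.e. by inverse standard deviations, which is exactly the vector case:

```python
import math

# Hypothetical 2x2 inverse covariance matrix W (symmetric positive definite)
# and residual vector r.
W = [[4.0, 1.0],
     [1.0, 2.0]]
r = [0.3, -0.7]

# Direct quadratic form r' W r.
quad = sum(r[i] * W[i][j] * r[j] for i in range(2) for j in range(2))

# Cholesky factorization W = L L', done by hand for the 2x2 case.
l00 = math.sqrt(W[0][0])
l10 = W[1][0] / l00
l11 = math.sqrt(W[1][1] - l10 * l10)

# Scaled residuals u = L' r, so that r' W r = u' u.
u0 = l00 * r[0] + l10 * r[1]
u1 = l11 * r[1]
print(quad, u0 * u0 + u1 * u1)
```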

Contributor

iewaij commented Jun 20, 2018

Yes, it is a mistake. Thanks very much for pointing it out! #72 should fix this.

Contributor

iewaij commented Jun 23, 2018

I think a possible explanation for passing the reciprocal of the standard deviation (σ) to the residual function is to avoid the sqrt() computation when the weight is estimated from fit.resid.^2 (though the current implementation is still wrong either way). I don't know much about computational complexity, but here is my reasoning.

Scenario 1: pass the weight vector as the reciprocal of the estimated variances of the error, which is PR #72:

var_error = fit.resid.^2
wt = 1 ./ var_error
curve_fit(model, tdata, ydata, wt, p0)

# behind the scenes
sqrt_wt = sqrt.(wt)  # which costs a lot of time
f(p) = sqrt_wt .* (model(xpts, p) - ydata)  # the residual function for the least squares algorithm
fit = lmfit(f, p0, wt; kwargs...)
covar = inv(J' * Diagonal(fit.wt) * J)

Scenario 2: pass the weight vector as the reciprocal of the estimated standard deviation of the error, which is essentially PR #74:

std_error = abs.(fit.resid)  # which doesn't cost much time
sqrt_wt = 1 ./ std_error
curve_fit(model, tdata, ydata, sqrt_wt, p0)

# behind the scenes
f(p) = sqrt_wt .* (model(xpts, p) - ydata)  # the residual function for the least squares algorithm
wt = sqrt_wt.^2
fit = lmfit(f, p0, wt; kwargs...)
covar = inv(J' * Diagonal(fit.wt) * J)

Just for reference, I ran the two scenarios: Scenario 1 costs nearly 2x the time of Scenario 2. The notebook can be viewed and reproduced here; make sure you've restarted the kernel.
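Either way, here is a quick check (plain Python, made-up residuals) that the two scenarios hand the least squares algorithm the same scaled residuals, so only the cost of the internal sqrt differs:

```python
import math

r = [0.4, -1.2, 0.9]                    # hypothetical residuals
std_error = [abs(v) for v in r]         # Scenario 2: estimated std devs
sqrt_wt = [1.0 / s for s in std_error]  # 1/std, passed directly
wt = [s * s for s in sqrt_wt]           # Scenario 1: 1/variance

# Both scenarios end up scaling the residuals by the same factor:
scaled_1 = [math.sqrt(w) * ri for w, ri in zip(wt, r)]  # sqrt taken internally
scaled_2 = [sw * ri for sw, ri in zip(sqrt_wt, r)]      # sqrt avoided
print(scaled_1, scaled_2)
```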

Collaborator

pkofod commented Oct 8, 2018

@jcrbloch this took a while, but sqrt is now applied to the weights for consistency with the matrix form. It's not in the latest release, but it's on master. The documentation is not up to date, though.

@pkofod pkofod closed this Oct 11, 2018
