Problems with BigFloat #844

protogeezer · 2020-08-14T19:04:09Z

Here are some problems where optimization with BigFloat data sometimes results in
ERROR: UndefRefError: access to undefined reference

There is a possibly related problem where supplying a gradient function doesn't work at all:
ERROR: ArgumentError: Value and slope at step length = 0 must be finite.
yet seemingly similar problems don't work.

test.jl.zip

pkofod · 2020-08-14T19:34:43Z

You're missing a resids allocation somewhere. I will look at the undefined referenced. Guessing from the NLsolve issue, there's a similar in here somewhere as well.

pkofod · 2020-08-14T20:08:09Z

I have a fix for the bigfloat stuff, but you should know that


	function hess!(H,x)
		H = fill(T(0.0),(3,3));
		for i = 1:3
			v = m[i,:];
			n = norm(x - m[i,:]);
			n3 = n*n*n;
			H[1,1] += 1/n - (x[1] - v[1])*(x[1] - v[1])/n3;
			H[1,2] +=     - (x[2] - v[2])*(x[1] - v[1])/n3;
			H[1,3] +=     - (x[3] - v[3])*(x[1] - v[1])/n3;

			H[2,1] +=     - (x[1] - v[1])*(x[2] - v[2])/n3;
			H[2,2] += 1/n - (x[2] - v[2])*(x[2] - v[2])/n3;
			H[2,3] +=     - (x[3] - v[3])*(x[2] - v[2])/n3;

			H[3,1] +=     - (x[1] - v[1])*(x[3] - v[3])/n3;
			H[3,2] +=     - (x[2] - v[2])*(x[3] - v[3])/n3;
			H[3,3] += 1/n - (x[3] - v[3])*(x[3] - v[3])/n3;
		end
	end

does not do what you think it does :) You're assigning a new matrix fill(T(0), (3,3)) to the name H, so you end up not updating the hessian in place. You need


	function hess!(H,x)
		H .= T(0.0)
		for i = 1:3
			v = m[i,:];
			n = norm(x - m[i,:]);
			n3 = n*n*n;
			H[1,1] += 1/n - (x[1] - v[1])*(x[1] - v[1])/n3;
			H[1,2] +=     - (x[2] - v[2])*(x[1] - v[1])/n3;
			H[1,3] +=     - (x[3] - v[3])*(x[1] - v[1])/n3;

			H[2,1] +=     - (x[1] - v[1])*(x[2] - v[2])/n3;
			H[2,2] += 1/n - (x[2] - v[2])*(x[2] - v[2])/n3;
			H[2,3] +=     - (x[3] - v[3])*(x[2] - v[2])/n3;

			H[3,1] +=     - (x[1] - v[1])*(x[3] - v[3])/n3;
			H[3,2] +=     - (x[2] - v[2])*(x[3] - v[3])/n3;
			H[3,3] += 1/n - (x[3] - v[3])*(x[3] - v[3])/n3;
		end
	end

pkofod · 2020-08-14T20:09:18Z

the same for g

pkofod · 2020-08-14T20:19:00Z

I'm also going to leave this as an exercise for the user, but I think your gradient might be wrong. If I use AD it takes one step and has a gradient norm of around 1e-30, but if I use your grad!, it comes up at around 3.

protogeezer · 2020-08-18T15:33:47Z

Your comment about how I used fill to initialize H was a revelation. I’m unclear as to why that assignment isn’t an error. The manual section on functions (https://docs.julialang.org/en/v1.5/manual/functions/) states that changes to function arguments are visible to the calling function. No exceptions are documented. In fact, it is colorable that the phrase "Modifications to mutable values (such as Arrays) made within a function will be visible to the caller” implies what I did was in-bounds - it should have just worked. It doesn't say something like “(which, for example, is limited to updating Array elements)." Or maybe the arg binding (as opposed to the data) should be immutable so trying to overwrite it would cause an error? I can’t imagine there is a use case where it is desirable to not return the result of a function, given that the args are shared and mutable. Finally, the code I submitted was created to demonstrate calls to Optim.optimize that didn’t execute when I thought they should have. It was not a working example at all levels. Stephen

…

On Aug 14, 2020, at 2:08 PM, Patrick Kofod Mogensen ***@***.***> wrote: I have a fix for the bigfloat stuff, but you should that function hess!(H,x) H = fill(T(0.0),(3,3)); for i = 1:3 v = m[i,:]; n = norm(x - m[i,:]); n3 = n*n*n; H[1,1] += 1/n - (x[1] - v[1])*(x[1] - v[1])/n3; H[1,2] += - (x[2] - v[2])*(x[1] - v[1])/n3; H[1,3] += - (x[3] - v[3])*(x[1] - v[1])/n3; H[2,1] += - (x[1] - v[1])*(x[2] - v[2])/n3; H[2,2] += 1/n - (x[2] - v[2])*(x[2] - v[2])/n3; H[2,3] += - (x[3] - v[3])*(x[2] - v[2])/n3; H[3,1] += - (x[1] - v[1])*(x[3] - v[3])/n3; H[3,2] += - (x[2] - v[2])*(x[3] - v[3])/n3; H[3,3] += 1/n - (x[3] - v[3])*(x[3] - v[3])/n3; end end does not do what you think it does :) You're assigning a new matrix fill(T(0), (3,3)) to the name H, so you end up not updating the hessian in place. You need function hess!(H,x) H .= T(0.0) for i = 1:3 v = m[i,:]; n = norm(x - m[i,:]); n3 = n*n*n; H[1,1] += 1/n - (x[1] - v[1])*(x[1] - v[1])/n3; H[1,2] += - (x[2] - v[2])*(x[1] - v[1])/n3; H[1,3] += - (x[3] - v[3])*(x[1] - v[1])/n3; H[2,1] += - (x[1] - v[1])*(x[2] - v[2])/n3; H[2,2] += 1/n - (x[2] - v[2])*(x[2] - v[2])/n3; H[2,3] += - (x[3] - v[3])*(x[2] - v[2])/n3; H[3,1] += - (x[1] - v[1])*(x[3] - v[3])/n3; H[3,2] += - (x[2] - v[2])*(x[3] - v[3])/n3; H[3,3] += 1/n - (x[3] - v[3])*(x[3] - v[3])/n3; end end — You are receiving this because you authored the thread. Reply to this email directly, view it on GitHub, or unsubscribe.

pkofod · 2020-08-18T18:06:36Z

Your comment about how I used fill to initialize H was a revelation. I’m unclear as to why that assignment isn’t an error. The manual section on functions (https://docs.julialang.org/en/v1.5/manual/functions/) states that changes to function arguments are visible to the calling function. No exceptions are documented. In fact, it is colorable that the phrase "Modifications to mutable values (such as Arrays) made within a function will be visible to the caller” implies what I did was in-bounds - it should have just worked. It doesn't say something like “(which, for example, is limited to updating Array elements)." Or maybe the arg binding (as opposed to the data) should be immutable so trying to overwrite it would cause an error? I can’t imagine there is a use case where it is desirable to not return the result of a function, given that the args are shared and mutable. Finally, the code I submitted was created to demonstrate calls to Optim.optimize that didn’t execute when I thought they should have. It was not a working example at all levels. Stephen

I don't know if you're reading my response as anything but an attempt to help you, but I do get a sense that you're annoyed with me the other way around. The discussion around the docs is probably not optimal to aim at me as I had very little to do with the language documentation over the years, this package is a separate thing. I saw a common mistake made by new users, and just wanted to point it out.

I'm closing this, as the newest tag of NLSolversBase and the next tag of this package will fix your problems. For language or usage questions please refer to https://discourse.julialang.org/ .

Thank you for reporting the bigfloat bug.

pkofod closed this as completed Aug 18, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Problems with BigFloat #844

Problems with BigFloat #844

protogeezer commented Aug 14, 2020

pkofod commented Aug 14, 2020

pkofod commented Aug 14, 2020 •

edited

pkofod commented Aug 14, 2020

pkofod commented Aug 14, 2020

protogeezer commented Aug 18, 2020 via email

pkofod commented Aug 18, 2020

Problems with BigFloat #844

Problems with BigFloat #844

Comments

protogeezer commented Aug 14, 2020

pkofod commented Aug 14, 2020

pkofod commented Aug 14, 2020 • edited

pkofod commented Aug 14, 2020

pkofod commented Aug 14, 2020

protogeezer commented Aug 18, 2020 via email

pkofod commented Aug 18, 2020

pkofod commented Aug 14, 2020 •

edited