Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problems with BigFloat #844

Closed
protogeezer opened this issue Aug 14, 2020 · 6 comments
Closed

Problems with BigFloat #844

protogeezer opened this issue Aug 14, 2020 · 6 comments

Comments

@protogeezer
Copy link

Here are some problems where optimization with BigFloat data sometimes results in
ERROR: UndefRefError: access to undefined reference

There is a possibly related problem where supplying a gradient function doesn't work at all:
ERROR: ArgumentError: Value and slope at step length = 0 must be finite.
yet seemingly similar problems don't work.

test.jl.zip

@pkofod
Copy link
Member

pkofod commented Aug 14, 2020

You're missing a resids allocation somewhere. I will look at the undefined referenced. Guessing from the NLsolve issue, there's a similar in here somewhere as well.

@pkofod
Copy link
Member

pkofod commented Aug 14, 2020

I have a fix for the bigfloat stuff, but you should know that


	function hess!(H,x)
		H = fill(T(0.0),(3,3));
		for i = 1:3
			v = m[i,:];
			n = norm(x - m[i,:]);
			n3 = n*n*n;
			H[1,1] += 1/n - (x[1] - v[1])*(x[1] - v[1])/n3;
			H[1,2] +=     - (x[2] - v[2])*(x[1] - v[1])/n3;
			H[1,3] +=     - (x[3] - v[3])*(x[1] - v[1])/n3;

			H[2,1] +=     - (x[1] - v[1])*(x[2] - v[2])/n3;
			H[2,2] += 1/n - (x[2] - v[2])*(x[2] - v[2])/n3;
			H[2,3] +=     - (x[3] - v[3])*(x[2] - v[2])/n3;

			H[3,1] +=     - (x[1] - v[1])*(x[3] - v[3])/n3;
			H[3,2] +=     - (x[2] - v[2])*(x[3] - v[3])/n3;
			H[3,3] += 1/n - (x[3] - v[3])*(x[3] - v[3])/n3;
		end
	end

does not do what you think it does :) You're assigning a new matrix fill(T(0), (3,3)) to the name H, so you end up not updating the hessian in place. You need


	function hess!(H,x)
		H .= T(0.0)
		for i = 1:3
			v = m[i,:];
			n = norm(x - m[i,:]);
			n3 = n*n*n;
			H[1,1] += 1/n - (x[1] - v[1])*(x[1] - v[1])/n3;
			H[1,2] +=     - (x[2] - v[2])*(x[1] - v[1])/n3;
			H[1,3] +=     - (x[3] - v[3])*(x[1] - v[1])/n3;

			H[2,1] +=     - (x[1] - v[1])*(x[2] - v[2])/n3;
			H[2,2] += 1/n - (x[2] - v[2])*(x[2] - v[2])/n3;
			H[2,3] +=     - (x[3] - v[3])*(x[2] - v[2])/n3;

			H[3,1] +=     - (x[1] - v[1])*(x[3] - v[3])/n3;
			H[3,2] +=     - (x[2] - v[2])*(x[3] - v[3])/n3;
			H[3,3] += 1/n - (x[3] - v[3])*(x[3] - v[3])/n3;
		end
	end

@pkofod
Copy link
Member

pkofod commented Aug 14, 2020

the same for g

@pkofod
Copy link
Member

pkofod commented Aug 14, 2020

I'm also going to leave this as an exercise for the user, but I think your gradient might be wrong. If I use AD it takes one step and has a gradient norm of around 1e-30, but if I use your grad!, it comes up at around 3.

@protogeezer
Copy link
Author

protogeezer commented Aug 18, 2020 via email

@pkofod
Copy link
Member

pkofod commented Aug 18, 2020

Your comment about how I used fill to initialize H was a revelation. I’m unclear as to why that assignment isn’t an error. The manual section on functions (https://docs.julialang.org/en/v1.5/manual/functions/) states that changes to function arguments are visible to the calling function. No exceptions are documented. In fact, it is colorable that the phrase "Modifications to mutable values (such as Arrays) made within a function will be visible to the caller” implies what I did was in-bounds - it should have just worked. It doesn't say something like “(which, for example, is limited to updating Array elements)." Or maybe the arg binding (as opposed to the data) should be immutable so trying to overwrite it would cause an error? I can’t imagine there is a use case where it is desirable to not return the result of a function, given that the args are shared and mutable. Finally, the code I submitted was created to demonstrate calls to Optim.optimize that didn’t execute when I thought they should have. It was not a working example at all levels. Stephen

I don't know if you're reading my response as anything but an attempt to help you, but I do get a sense that you're annoyed with me the other way around. The discussion around the docs is probably not optimal to aim at me as I had very little to do with the language documentation over the years, this package is a separate thing. I saw a common mistake made by new users, and just wanted to point it out.

I'm closing this, as the newest tag of NLSolversBase and the next tag of this package will fix your problems. For language or usage questions please refer to https://discourse.julialang.org/ .

Thank you for reporting the bigfloat bug.

@pkofod pkofod closed this as completed Aug 18, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants