add accumulate_Hessian_times_input to STIR objective function and prior #1244

Closed
KrisThielemans opened this issue Apr 25, 2024 · 11 comments · Fixed by #1253

@KrisThielemans

STIR has functionality for multiplying the Hessian with an image, for both priors and objective functions.

In STIR it's called accumulate_Hessian_times_input, but "input" is a bit confusing (it computes H . input), so suggestions are welcome (@epapoutsellis @gschramm @Imraj-Singh @paskino).

This would have to be added to sirf.STIR.ObjectiveFunction. I think the C++ version can be as-is, but the Python one should be

ret = accumulate_Hessian_times_input(current_estimate, input, optional_subset, out=optional)

Without a subset, it should use all of them, I think (or at least, it should do the same as gradient does).
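
For concreteness, a minimal sketch of how the proposed Python call might look (the method does not exist yet; obj_fun, current_estimate and x are illustrative names for an already set-up objective function and two images):

    Hx = obj_fun.accumulate_Hessian_times_input(current_estimate, x)        # all subsets
    Hx0 = obj_fun.accumulate_Hessian_times_input(current_estimate, x, 0)    # subset 0 only
    out = current_estimate.get_uniform_copy(0)
    obj_fun.accumulate_Hessian_times_input(current_estimate, x, out=out)    # write the result into out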

@evgueni-ovtchinnikov

Done for the prior. @KrisThielemans, please have a look at the SIRF branch acc-hess and let me know whether I am on the right track. Also, advice on testing is needed (I have a vague idea about input and output, but not about current_estimate).

@KrisThielemans commented May 2, 2024

Yes, this looks perfect to me. It'll be exactly the same for the GeneralisedObjectiveFunction, aside from the cast (this is due to a STIR hierarchy choice/difficulty), so both could call an intermediate function.

Any ideas for a better name than "input"? Somewhat in the vein of sapyb etc., accumulate_Hessian_times_x; not sure if that is clearer. If we rename, then I think we should do this at the C++ level as well, so that the SIRF names are consistent.

current_estimate has to be a somewhat sensible reconstructed image.
For the log-likelihood, the main restriction is that its forward projection should NOT be zero where the data is non-zero (see below).

For the log-likelihood with sinogram-data, the check could be to just do the explicit calculation:

H x = acq_model.backward(data / (acq_model.forward(current_estimate)^2 * x)

(element-wise multiplications), where 0/0 is defined as 0 (which STIR will take care of in this case). You can get a current_estimate that satisfies the restriction (avoiding 1/0) by using a small number of MLEM iterations.

For the prior, this can be checked by comparing the numerical second-order derivative with the Hessian. This is done in STIR (see test_priors), so I don't see that it's necessary to repeat that test here (especially as it was hard to get it to work with some priors due to numerical issues). Maybe it can be done for the quadratic prior only, as there the Hessian is constant (the second derivative is constant), so the numerical issues disappear. The main thing, of course, is whether the wrapper works. So as long as we have a check that it doesn't fail, it uses the same wrapping mechanism for the objective function and the prior, and there is the above check plus a check for the quadratic prior that H*x is a multiple of x, I'd be happy to merge without a numerical check for the priors.
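
As an illustration only (not STIR's test_priors), a finite-difference check of this kind could look roughly as follows; prior and image are assumed to be an already set-up QuadraticPrior and a template ImageData, and the argument order of accumulate_Hessian_times_input is assumed to match the acc-hess branch:

    # For the quadratic prior the Hessian is constant, so
    # (grad(u + eps*x) - grad(u)) / eps should match H x up to rounding.
    eps = 1e-3
    u = image.clone()
    x = image.get_uniform_copy(1)
    Hx = image.get_uniform_copy(0)
    Hx = prior.accumulate_Hessian_times_input(Hx, u, x)
    fd = (prior.get_gradient(u + x * eps) - prior.get_gradient(u)) / eps
    print((fd - Hx).norm() / Hx.norm())   # should be close to zero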

@evgueni-ovtchinnikov

Sorry, still lost in the dark...

H x = ...: the third closing bracket is missing.

x: my understanding was that x is in image space, whereas forward produces an element of the acquisition space, so how can the two be multiplied element-wise? (I take it ^2 is element-wise too.)

@gschramm commented May 3, 2024

@KrisThielemans @evgueni-ovtchinnikov If I am not completely wrong, I think H x in sinogram mode should read:

H x = - acq_model.backward(data * acq_model.forward(x) / (acq_model.forward(current_estimate)^2))

meaning the Hessian of the Poisson logL at current_estimate (an image) applied to x (another image).
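
Expressed with SIRF calls, this could look roughly like the following sketch (acq_model, acq_data, current_estimate and x stand for an already set-up acquisition model, the measured data and two images):

    # Hessian of the Poisson logL at current_estimate, applied to x (sinogram mode);
    # the multiplications and the division are element-wise on AcquisitionData
    fwd = acq_model.forward(current_estimate)
    Hx = - acq_model.backward(acq_data * acq_model.forward(x) / (fwd * fwd))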

@KrisThielemans

yes. sorry for the mistake!

Both x and current_estimate are images; H is a linear operation applied to x, which we compute on the fly with the above formula.

@KrisThielemans

A possible approach is to add similar lines for the log-likelihood already and push; then Georg can already use it while you create the test.

@gschramm commented May 3, 2024

Indeed. "Hessian applied" of the Poisson logL (in sinogram mode) would be very helpful to advance the DL notebooks. And the same in listmode would be even better ;-)

@evgueni-ovtchinnikov

In osl_reconstruction.py I added after

    recon.reconstruct(image)

the following check:

    output = image.get_uniform_copy(0)
    x = image.clone()
    curr_est = image.clone()
    output = prior.accumulate_Hessian_times_input(output, curr_est, x)
    print(output.norm())
    y = acq_model.forward(curr_est)
    output = - acq_model.backward(acq_data * acq_model.forward(x)/(y * y))
    print(output.norm())

and got

2046.584716796875
188793.703125

and if I set penalty factor to 0, I get

0.0
188520.3125

Looking at QuadraticPrior<elemT>::accumulate_Hessian_times_input, the first value is correct in the latter case, so there must be a problem with the formula for H x.

@evgueni-ovtchinnikov

Oh I see, that formula must be for the objective function...

@KrisThielemans

For the quadratic prior, a not very strong check would be that the output doesn't depend on curr_est. Note that the recommended Python signature is the one above:

ret = accumulate_Hessian_times_input(current_estimate, input, optional_subset, out=optional)

which is different from what you have above.
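
A sketch of that (admittedly weak) check, using the argument order from the branch as in the earlier snippet (illustrative only):

    # For the quadratic prior the Hessian does not depend on the current estimate,
    # so H x computed at two different estimates should agree
    x = image.clone()
    out1 = image.get_uniform_copy(0)
    out1 = prior.accumulate_Hessian_times_input(out1, image, x)
    out2 = image.get_uniform_copy(0)
    out2 = prior.accumulate_Hessian_times_input(out2, image * 2, x)
    print((out1 - out2).norm())   # should be (numerically) zero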

@KrisThielemans

I've given the naming more thought. In fact, the accumulate name ONLY makes sense if out is specified, but that is non-standard (other functions with out replace the content).

I therefore think that in SIRF we need to have this in Python:

ret = multiply_with_Hessian(current_estimate, image, optional_subset, out=optional)

which should then call STIR's version (schematically)

out.fill(0);
accumulate_Hessian_times_input(out, current_estimate, image, optional_subset);

Obviously, we'd ideally have this function on the C++ side as well, but I don't quite know (!) what our conventions are there.
The naming is admittedly still a bit confusing.
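
A rough sketch of what the Python-side wrapper could look like under this proposal (multiply_with_Hessian and the default subset handling are part of the proposal, not existing SIRF API; the argument order of the underlying call is schematic):

    # Proposed wrapper, schematic only: clear the output, then let STIR accumulate into it
    def multiply_with_Hessian(self, current_estimate, image, subset=-1, out=None):
        if out is None:
            out = image.get_uniform_copy(0)
        else:
            out.fill(0)  # accumulate_Hessian_times_input adds to out, so it must start at zero
        self.accumulate_Hessian_times_input(out, current_estimate, image, subset)
        return out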
