
Address numpy deprecation warning #506

Closed
wants to merge 3 commits

Conversation

OriolAbril
Contributor

I am seeing:

DeprecationWarning: Conversion of an array with ndim > 0 to a scalar is deprecated, and will error in future. Ensure you extract a single element from your array before performing this operation. (Deprecated NumPy 1.25.)

coming from these lines; I think this change would fix them. I'm not really sure how to add a test for that, or whether you'd prefer a different approach.
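
For context, the warning comes from implicitly converting an array with ndim > 0 (even a one-element one) to a Python scalar, which NumPy 1.25 deprecated. A minimal sketch, separate from the PR itself, of what NumPy now complains about:

import warnings
import numpy as np

warnings.simplefilter("always")
x = np.array([1.0])  # shape (1,), ndim == 1
float(x)             # DeprecationWarning under NumPy >= 1.25
x.item()             # fine: explicitly extracts the single element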

@dfm
Owner

dfm commented Mar 28, 2024

Thanks! Can you share a code snippet and all the relevant versions where you're seeing this? I can't reproduce this warning.

@OriolAbril
Contributor Author

OriolAbril commented Mar 28, 2024

I think the default is to not show DeprecationWarnings to users. Try setting:

import warnings
warnings.simplefilter('always')  # show all warnings, including DeprecationWarning

I see it on anything with blobs. I haven't checked multiple versions; I am on Python 3.10, NumPy 1.26.4, and emcee 3.1.4.

Some example snippets:

import numpy as np
import emcee

# single blob
sampler = emcee.EnsembleSampler(6, 1, lambda x: (-(x**2), 3))
sampler.run_mcmc(np.random.normal(size=(6, 1)), 20)

# multiple blobs
sampler = emcee.EnsembleSampler(6, 1, lambda x: (-(x**2), (np.random.normal(x), 3)))
sampler.run_mcmc(np.random.normal(size=(6, 1)), 20)

@dfm
Owner

dfm commented Apr 2, 2024

The problem here is really that there is a bug in the log probability that you're specifying. The log probability should always be a scalar, but in this case it has size ndim. The "correct" log prob is probably something like:

- lambda x: (-(x**2), 3)
+ lambda x: (-np.sum(x**2), 3)

I'd actually probably rather raise an exception for non-scalar probability, rather than trying to coerce it into a scalar. I expect that this is a fairly small corner case (log probs that have shape (1,) instead of ())...

That being said, relying on numpy's deprecation for this doesn't seem like a good approach either! Would you be willing to update this PR to explicitly check that the log prob is a scalar?

@OriolAbril
Contributor Author

Sounds good. I thought (1,) arrays and scalars were interchangeable; I'll fix my code too. Do you have a preferred way to check that something is a scalar? np.ndim(x) == 0?

@dfm
Owner

dfm commented Apr 2, 2024

I guess they used to be interchangeable, but if numpy is deprecating this behavior, it might be best for emcee to follow suit!

> Do you have a preferred way to check something is a scalar? np.ndim(x) == 0?

This seems sensible to me. Thank you!!
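
A rough sketch of the kind of explicit check being discussed, with a hypothetical helper name (this is not the code that was merged into emcee):

import numpy as np

def check_scalar_log_prob(log_prob):
    # Hypothetical helper: reject non-scalar log probabilities
    # instead of silently coercing them to a scalar.
    if np.ndim(log_prob) != 0:
        raise ValueError(
            "The log probability function must return a scalar; "
            f"got an array with shape {np.shape(log_prob)} instead."
        )
    return float(log_prob)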

@andyfaff
Contributor

andyfaff commented Apr 2, 2024

In scipy we come across this rather more often than I'd like, with a lot of jumping through hoops. Anyway, for a multivariate function that should return a scalar, we deal with it as:

fx = fun(np.copy(x), *args)
# Make sure the function returns a true scalar
if not np.isscalar(fx):
    try:
        # deals with array-like returns, e.g. np.array(1.0) or np.array([1.0])
        fx = np.asarray(fx).item()
    except (TypeError, ValueError) as e:
        raise ValueError(
            "The user-provided objective function "
            "must return a scalar value."
        ) from e
return fx

It should deal with 1.0, np.double(1.0), np.array(1.0), np.array([1.0])
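
For illustration, a quick check (my own snippet, not from scipy) that the coercion handles each of those cases without a warning:

import numpy as np

for fx in (1.0, np.double(1.0), np.array(1.0), np.array([1.0])):
    if not np.isscalar(fx):
        fx = np.asarray(fx).item()
    print(type(fx).__name__, fx)  # a scalar float (or float64) equal to 1.0 in every case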

@OriolAbril
Contributor Author

I understood the goal wasn't to deal with np.array([1.0]) but to raise an error when it is encountered, hence the change from the initial use of item to ndim and explicit error raising.

@andyfaff
Contributor

andyfaff commented Apr 3, 2024

My comment was from the sidelines: it shows what other projects do in a similar situation, not necessarily what should be done here.

@dfm
Owner

dfm commented Apr 3, 2024

Thanks @andyfaff!! This is very useful context.

It's interesting that scipy has decided to handle the deprecation this way, and it definitely makes me less certain about how we want to navigate this. The original motivation for emcee was to be roughly compatible with scipy's optimize interface, so maybe it is right to match that.

Another more radical approach would be to explicitly sum the log prob result down to a scalar, but that would be a bigger change to the expected interface.

In the end I guess I don't have a strong opinion so I'm happy with either raising the error or using the scipy approach that @andyfaff shared above. What do you think @OriolAbril?

Thanks both!!

@andyfaff
Contributor

andyfaff commented Apr 5, 2024

I've dealt with this issue, and added unit tests as part of #510. I went with the fix that I suggested.

As mentioned above, it will deal with 1.0, np.float64(1.0), np.array(1.0), and np.array([1.0]). The latter is quite common in user-land; see #509.

@OriolAbril
Contributor Author

> I've dealt with this issue, and added unit tests as part of #510. I went with the fix that I suggested.

Closing this, then.

@OriolAbril OriolAbril closed this Apr 5, 2024
@OriolAbril OriolAbril deleted the patch-1 branch April 5, 2024 16:58
@dfm
Owner

dfm commented Apr 5, 2024

Thanks, @OriolAbril! And thanks again for tracking this down. I'm happy with where this landed in the end.
