Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor: integer overflow error #14

Open
keflavich opened this issue Jan 22, 2023 · 8 comments
Open

Minor: integer overflow error #14

keflavich opened this issue Jan 22, 2023 · 8 comments

Comments

@keflavich
Copy link
Contributor

I encountered this error running on one image:

  File "//orange/adamginsburg/jwst/brick/analysis/crowdsource_catalogs_long.py", line 419, in main
    results_blur  = fit_im(np.nan_to_num(data), psf_model_blur, weight=weight,
  File "/blue/adamginsburg/adamginsburg/repos/crowdsource/crowdsource/crowdsource_base.py", line 839, in fit_im
    tflux, tmodel, tmsky = fit_once(
  File "/blue/adamginsburg/adamginsburg/repos/crowdsource/crowdsource/crowdsource_base.py", line 300, in fit_once
    xloc = numpy.zeros(repeat*numpy.sum(sz*sz).astype('i4') +
ValueError: negative dimensions are not allowed

a bit of digging shows that it's an integer overflow error:

ipdb> numpy.sum(sz*sz)
7866546941
ipdb> numpy.sum(sz*sz).astype('i4')
-723387651

I have 595,000 stars with 149x149 pix PSFs, apparently.

I'm going to try this with i8 in place of i4, but I suspect I'll run out of memory.

I'm posting this mostly in case someone else hits the same error; it's kind of obvious when you dig a tiny bit, but I didn't see it at first.

@andrew-saydjari
Copy link
Collaborator

All of the production runs for DECaPS used a 320,000 max number of sources threshold, so this was never a problem. But this is good to know and document... I don't any reason not to upgrade the type.

@schlafly
Copy link
Owner

I suspect all 149x149 PSFs is also an error; there are some hard coded constants relating stamp size to peak brightness in get_sizes that are likely inappropriate for these images. 19×19 is the usual "small" stamp size that usually goes far enough into the wings to be fine.

@keflavich
Copy link
Contributor Author

Thanks for pointing that out @schlafly. It does seem that get_sizes already has a safety valve in place to avoid making all PSFs huge. I wonder / suspect it's caused by

    if weight is not None:
        sz[weight[x, y] == 0] = 149  # saturated/off edge sources get big PSF

and having zero weights. I'm guessing now, combined with the huge number of sources, this might be some sort of edge effect where a lot of sources are being found right at the edge where the weight & value should be zero, but for some reason are not.

@andrew-saydjari the i4 limit shouldn't necessarily be there, but with >500k sources with big PSFs, the memory requirements get somewhat absurd - at least, it might be a good idea to include warnings when the xloc size exceeds a couple (tens of?) GB if you choose to s/i4/i8/.

@keflavich
Copy link
Contributor Author

I did a pre-cataloging background subtraction, which subtracted positive background from pixels that originally were zeros, and these are all along the edge. That explains how I had so many sources with the biggest PSF and the overflow is (probably always?) prevented with this fix, i.e., limit the number of huge PSFs for edge cases:
https://github.com/schlafly/crowdsource/pull/15/files#diff-978259f1f0a70a1db6e36f69ed55ba414a19836112f1b66a40953b04102fb9deL589-R595

@keflavich
Copy link
Contributor Author

This is still a real bug; if the number of sources * footprint is too large, the check on the line below fails:

zsz = (repeat*numpy.sum(sz*sz) + nskypar*npixim).astype('i4')
if zsz >= 2**32:

@keflavich
Copy link
Contributor Author

Oh, that's code I added anyway. The fix for this specific issue is to just s/i4/i8/

@schlafly
Copy link
Owner

I'm happy to merge that PR; I agree it looks straightforward. You have a case where you really want more than 2B pixels to enter that matrix?

@keflavich
Copy link
Contributor Author

No, I don't think I do! But it's better to catch the too-many-pixels error and give a clean message rather than the cryptic one. I'll issue an appropriate PR.

@keflavich keflavich mentioned this issue Dec 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants