Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PhoSim indexing fails with very large numbers of input objects #24

Closed
drphilmarshall opened this issue Nov 4, 2016 · 12 comments
Closed

Comments

@drphilmarshall
Copy link

This is a re-post from @rbiswas4 at LSSTDESC/Twinkles#383:

PhoSim indexes astrophysical sources with floats. For very large integers used as instance catalog indices , this causes indexing failures due to numerical rounding issues. This was found in the first set of Twinkles runs. A ticket describing the problem - PHOSIM-27 - was filed on the PhoSim Jira.

For Twinkles Run 3, this problem was reported in LSSTDESC/Twinkles#312 and taking advantage of the restricted nature (single CCD, rather than all sky), this issue was solved for Twinkles Run 3 in LSSTDESC/Twinkles#340 . However, in general (where there will be more objects due to larger sky area) the problem is unsolved.

@cwwalter
Copy link
Member

cwwalter commented Nov 4, 2016

Thanks for moving/filing this!

@cwwalter
Copy link
Member

cwwalter commented Jul 5, 2017

@rbiswas4 and @drphilmarshall Is this an issue for us in DC1? DC2? According to PHOSM-27, John feels this shouldn't be an issue since really they are stored in double precision.

@rbiswas4
Copy link
Member

rbiswas4 commented Jul 5, 2017

It is a DC1 issue. I think this was addressed by @danielsf and @SimonKrughoff in the context of Twinkles. The solution depended on the fact that we had a really small area. This may or may not be ok for DC2, depending on the area.

@cwwalter
Copy link
Member

cwwalter commented Jul 5, 2017

So, just to make sure I understand from reading the Twinkles PR: This an issue solely in making the catalogs correct?

If this is going to be an issue for DC2, some one should follow up on PhoSim-27 and let John know you don't think his comment really solves the issue.

@rbiswas4
Copy link
Member

rbiswas4 commented Jul 5, 2017

So, just to make sure I understand from reading the Twinkles PR: This an issue solely in making the catalogs correct?

Not sure what you mean: The catalogs and phosim images are actually generated correctly (at least in the cases we have seen). However, if you want to do comparisons of the input catalogs and the phosim centroid files (which I think is a very useful step), then the indices are often different (because of the representation as doubles).

If this is going to be an issue for DC2, some one should follow up on PhoSim-27 and let John know you don't think his comment really solves the issue.

Whether it is an issue might depend on the size of DC2 ... I am traveling this week, so I don't have the details. I guess @danielsf would be able to help ...

@danielsf
Copy link
Contributor

danielsf commented Jul 5, 2017

I actually do not think this will be a problem for DC2 (nor was it a problem for DC1). It was only an issue in Twinkles. Someone should check my reasoning though:

PhoSim stores the indices as double-precision floats. This means they can store an integer of 2^52 ~ 4.5e15 exactly. Therefore, the PhoSim indexing would only break down if we ended up needing more than 4.5e15 independent sources. Twinkles ran afoul of this limit because, when we injected lensed AGN and SNe, we were forced to bit-pack their indexes to prevent them from colliding with the uniqueIDs of the ~ 3.5e10 galaxy uniqueIDs already allotted by CatSim.

So: as long as we aren't injecting new transients into DC2 (are we?), we should be able to live with the way PhoSim currently handles object indexes.

@cwwalter
Copy link
Member

cwwalter commented Jul 5, 2017

Now the thinking is that DC2 will include transients at normal density with an embedded Twinkles field.

@danielsf
Copy link
Contributor

danielsf commented Jul 5, 2017

In that case, we will have a problem. As Rahul said, we were able to fix this in Twinkles because it was a single chip on a single point in the sky, so we only had to keep a few million unique galaxy indices, rather than a few tens of billions. Once we break that assumption, we need PhoSim to store indices as ints or strings.

@danielsf
Copy link
Contributor

danielsf commented Jul 5, 2017

I thought @SimonKrughoff hacked together a quick solution on his own branch of PhoSim and issued a pull request. Whether or not that pull request gets merged to PhoSim master, could we just use Simon's branch with his quick fix?

@rbiswas4
Copy link
Member

rbiswas4 commented Jul 5, 2017

transients at normal density with an embedded Twinkles field
@cwwalter Twinkles had transients/variables at higher than normal density and it was a 10 year survey.

So, to me the questions are:

  • How many years is DC2 ?
  • What is the density of transient/variables used? From your statement above does it mean that it is the normal density except for a particular chip location where it will be overdense as in Twinkles?

These two questions will be important in quantifying the number of transient/variables (as a hack) Scott is talking about.

@cwwalter
Copy link
Member

cwwalter commented Jul 5, 2017

Yes. Your understanding is correct. Right now we are planning 10 years, 6 bands, 300 sq degrees. We will have normal density for transients over the WFD field and an embedded Twinkles DDF. I'm not sure what the size of that is. That is for conversation at SUNYSB.

@fjaviersanchez
Copy link
Contributor

I think that this is not relevant anymore so I'm closing this (please, feel free to reopen @cwwalter)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants