
Test PhoSim 3.7 with subset of DC1 inputs #19

Closed

jchiang87 opened this issue Oct 24, 2017 · 45 comments
@jchiang87 (Contributor)

This will be done on KNL with a couple of "typical" visits, with both the default and the new "quick background" settings, for comparison to DC1 production runs to derive expected scalings for DC2.

We should also do a couple of long runs from DC1, i.e., ones where the background light was high or there was an off-sensor bright star, to get a sense of the long execution-time tails.

@sethdigel commented Oct 24, 2017

Here are some selected sensor visits from DC1, extracted from a DC1 metadata file attached here.

The obsId entries below refer to minion_1016.

| obsId  | raftId | sensorId | DC1 sensors in visit | Comment |
|--------|--------|----------|----------------------|---------|
| 270676 | R42 | S21 | 17  | Maximum moonBright (combination of moonAlt and moonPhase) |
| 194113 | R30 | S10 | 130 | Max CPU time for a visit with at least one mag = 10 star and moonAlt < 0 |
| 220091 | R34 | S12 | 114 | Max CPU time at max airmass (1.479) |
| 220090 | R20 | S02 | 41  | Faintest brightest star (mag = 13.15) with moonAlt < 0 |
| 233988 | R42 | S02 | 13  | Min airmass (1.022) with min CPU time |
| 201828 | R13 | S21 | 131 | Visit with the median CPU time (56 ks) |
| 300306 | R33 | S11 | 107 | Visit with the average CPU time (89 ks) |

Note that for DC1 the instance catalogs were modified so that stars brighter than 10th magnitude were limited to 10th magnitude. This essentially eliminated the dependence of CPU time on the magnitude of the brightest star in the instance catalog. Before this change, Tom did run some sensor visits with the original stellar magnitudes, which ranged as bright as magnitude ~5. These had a very large effect on the CPU times, with the largest effects actually on sensors adjacent to the bright stars. This page has some notes from a test simulation of an entire focal plane image that Tom ran about a year ago, with a then-current version of phoSim.

(October 30: I updated the table to include a column with the total number of sensors in the corresponding DC1 visit.)
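
For reference, a minimal sketch (in pandas) of how selections like these can be pulled from a sensor-visit metadata table; the file name and column names here are hypothetical stand-ins and would need to be adapted to the actual DC1 metadata file:

```python
# Sketch only: file and column names are hypothetical.
import pandas as pd

df = pd.read_csv("dc1_sensor_visit_metadata.csv")

picks = {
    "max moonBright": df["moonBright"].idxmax(),
    "max airmass": df["airmass"].idxmax(),
    "min airmass": df["airmass"].idxmin(),
    # visits whose CPU times are closest to the median / mean
    "median CPU time": (df["cpuTime"] - df["cpuTime"].median()).abs().idxmin(),
    "average CPU time": (df["cpuTime"] - df["cpuTime"].mean()).abs().idxmin(),
}
# max CPU time among visits with a mag = 10 star and the moon below the horizon
moon_down = df[(df["brightestStarMag"] <= 10.0) & (df["moonAlt"] < 0)]
picks["max CPU, moon down"] = moon_down["cpuTime"].idxmax()

for label, idx in picks.items():
    row = df.loc[idx]
    print(f"{label}: obsId={row['obsId']} {row['raftId']} {row['sensorId']}")
```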

@cwwalter (Member)

What should we do about the magnitude 10 cut for these tests? Is the thought that this can now be removed?

@johnrpeterson commented Oct 25, 2017 via email

@sethdigel

Thank you, John. That Google document mentions DC1 simulations of bright stars in one place: "Second, there was a rare but very significant bright star optimization bug which caused some of the brighter stars (10-12 magnitude) to be simulated unoptimized when they fell in certain places off the chip." For DC1, once the magnitudes of the brightest stars were limited to 10, the reported CPU times of the sensor visits did not depend strongly on the magnitude. Here's a plot from the first Confluence page linked in my comment above (sensor visits with moonAlt > 0 are in red, and moonAlt < 0 in blue):

[image: CPU time vs. brightest-star magnitude for the DC1 sensor visits]

The brightest star magnitude is that of the brightest star in the trimcat file for that sensor visit, regardless of whether it is off chip.

I think that you have fixed the bright star optimization bug in v3.7, but I'm fairly sure that we do not have recent tests of the impact of bright stars on the execution time. The brightest stars (magnitudes less than 10) have (or had) a compounded effect because they end up in the trimcat files for many sensors in the visit. It might be worth re-running the same visit that Tom ran in the second link in my comment above, if the various inputs are still available.

@johnrpeterson commented Oct 25, 2017 via email

@sethdigel

The first plot on this page shows the locations of the brightest stars in the obsId 1668469 visit (from minion_1016?) together with the CPU time required for the individual sensor visits. The brightest stars were brighter than magnitude ~7, and each was surrounded by an array of sensor locations for which the simulations did not finish within the 120 CPU-hour limit of the SLAC batch farm. Fainter stars had smaller regions of influence on the CPU times.

I think that re-running this visit, if the various inputs are still available, would be a good test of the feasibility of including realistic magnitude distributions for the stars, e.g., whether threads simulating bright stars influence the overall efficiency.

I'm afraid that I don't know enough to be able to do this; even if I knew how to run it in the pipeline system at SLAC, it would be better run at NERSC. I think that @TomGlanzman is the only one who knows how it all works. If it is interesting, I could make some individual runs with, e.g., a single bright star at various distances from a single sensor; that would not answer questions about overall throughput/efficiency, though.
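
For concreteness, here is a sketch of what those single-star runs could look like as instance catalogs, stepping one bright star away from the pointing center. The header keywords and object-line layout follow the DC1-era instance catalog format as I understand it and should be checked against the PhoSim v3.7 docs; the SED file name is just an example:

```python
# Sketch only: one magnitude-5 star at increasing offsets from the center.
RA0, DEC0 = 31.1134, -10.0970  # hypothetical pointing center (deg)

HEADER = f"""rightascension {RA0}
declination {DEC0}
mjd 59580.0
filter 2
nsnap 1
vistime 30.0
obshistid {{obshistid}}
seed {{obshistid}}
"""

for i, offset_deg in enumerate([0.0, 0.1, 0.2, 0.5, 1.0]):
    obshistid = 9000000 + i
    with open(f"single_star_offset_{offset_deg:.1f}.txt", "w") as f:
        f.write(HEADER.format(obshistid=obshistid))
        # one bright star, offset in RA from the pointing center
        f.write(f"object {obshistid} {RA0 + offset_deg:.5f} {DEC0} 5.0 "
                "starSED/kurucz/km30_5250.fits_g00_5370.gz "
                "0 0 0 0 0 0 point none none\n")
```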

@sethdigel

I've updated the table above to include the total number of sensors in the selected DC1 visits. None of them corresponds to a full focal plane, but I think most have more than enough for a reasonable test of current processing efficiency in the pipeline(s).

@katrinheitmann (Contributor)

@sethdigel Seth, what is the plan on this now? Is @TomGlanzman going to run some more tests? Do we have an estimated timeline for when this will happen? Thanks!!

@sethdigel

My understanding is that Tom has it on his to-do list to (re)run a sample of DC1 visits. I don't know the schedule. I think that he was working on getting a prototype DC2 simulation workflow going (involving Pegasus/HTCondor at NERSC) to run these.

@cwwalter (Member)

Hi @TomGlanzman. Can you let us know what the plan for these is? Is this tied into the new workflow as @sethdigel mentions?

@TomGlanzman (Contributor)

There have been many (sometimes conflicting) requests related to getting the DC2 phosim production running. The ultimate goal is, as I understand it, to have a Pegasus/HTCondor workflow handle this project at NERSC running on KNL. The immediate goal, however, is to resurrect and clone the DC1 phosim workflow (using the SLAC Pipeline workflow engine) and get that running within a DC2 context. This will provide us with important performance benchmarks and, assuming proper DC2 catalogs are available, a prototype DC2 dataset. I am hoping to complete the commissioning of the new workflow this week.

@johnrpeterson commented Nov 13, 2017 via email

@katrinheitmann (Contributor) commented Nov 13, 2017 via email

@johnrpeterson commented Nov 13, 2017 via email

@katrinheitmann (Contributor) commented Nov 13, 2017 via email

@johnrpeterson commented Nov 13, 2017 via email

@cwwalter (Member) commented Nov 13, 2017

Hi All,

There seems to be a mixture of things being discussed now.

This issue as we originally wrote it is not about testing DC2 production or any other issues related to catalogs, etc. It is to test 3.7 with the new settings on some of the same inputs from DC1, under careful control, so we can get real scaling numbers. It would be fine to run this in the old workflow on KNL, but running it in the new one would be OK too. I would still like this to happen independent of the other things being discussed.

It would be good to make new issues for other requests so we can track things in a focused way.

@johnrpeterson commented Nov 13, 2017 via email

@salmanhabib

I am not worried about scaling (agree with John) but we do want to check that the quick background method is acceptable. I would like to see an objective set of tests for this (will chat separately with John).

@TomGlanzman (Contributor) commented Nov 13, 2017

I sent this file to John via email, but want to publish it here also for comment. This command file is a clone of the one used for DC1, with the addition of the 'quickbackground' and 'sourceperthread' directives. Is this a good starting point for DC2?

```
# Enable centroid file
centroidfile 1

# Disable sensor effects
cleardefects
fringing 0

# Disable dirt
contaminationmode 0

# Set the nominal dark sky brightness
zenith_v 21.8

# Quick background
backalpha 0.1
backbeta 4.0
backgamma 1000.0
backdelta 1.0
activebuffer 600

# Number of sources handled by a single thread
sourceperthread 100
```
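
(For reference: a physics-override file like this is passed to PhoSim on the command line along with the instance catalog, e.g. `phosim.py instance_catalog -c commands.txt`; the exact invocation may vary by version and installation.)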

@cwwalter (Member)

@TomGlanzman Thanks. For the real DC2 we need to check that the selected options are turned on. For example, if we want BF, we might need 'chargesharing 1' if cleardefects is clearing that too.

Also, I can't remember: are we setting a fixed sky background with the zenith_v command? I remember that there was some bug in Twinkles that required us to do that, and then it was decided it should be removed. Was that this command or something different?

@TomGlanzman (Contributor)

@cwwalter If any of the "selected options" are known today, I would really like to include them asap to generate some benchmarks. Yes, 'cleardefects' does clear the 'chargesharing', so if the latter is desired, it would need to be reactivated explicitly.

Don't recall the story behind zenith_v - perhaps someone listening in can comment?

@johnrpeterson commented Nov 14, 2017 via email

@johnrpeterson commented Nov 14, 2017 via email

@SimonKrughoff

If realistic time correlations are now in phosim, I agree we can take out `zenith_v`.

The problem before was that simulating back to back observations using different phosim processes could cause the sky background to change by a few magnitudes.

@SimonKrughoff commented Nov 14, 2017

> its accurate to below 1% in normalization and the large scale patterns are still there (vignetting & various backgrounds).

@johnrpeterson do you have some sort of validation data that shows this? Or is this essentially a by-eye test?

@johnrpeterson commented Nov 14, 2017 via email

@cwwalter (Member)

Hi John,

I think you were trying to attach an image (or a link)? It didn't work. Can you try again?

BTW, would you consider going to the GH website to enter your responses? Because of something in the Purdue security mail scanner, each of your answers is being turned into a big mess. Go ahead and look at what the last entry looked like on the web site. Rachel and I have been editing the comments to remove the extra material, but it would be great if we could bypass that. Thanks!

-Chris

@johnrpeterson

sorry. here it is:

[image: backgroundopt]

@salmanhabib

@johnrpeterson I think this is great. What we should do is write up a small document stating how you go from the full background to the acceptable quick background approximation, and what the validation tests are.

@rmandelb (Contributor)

@johnrpeterson - just to expand on @salmanhabib 's request, when you do that, can you please clarify a little what is being plotted in each panel of this figure so we know what we should be taking away from it? (I have a guess as to what you're doing in this validation test, but the labels are not quite obvious enough that I am 100% positive of my interpretation.)

@SimonKrughoff

@johnrpeterson and one more request, when you plot those distributions (like the lower right), will you also plot the ratio so we can see fractionally how well they agree?
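
In case it helps, here is a minimal sketch of that comparison, assuming full- and quick-background simulations of the same flat are available as FITS images (file names are hypothetical):

```python
# Sketch only: histogram pixel values of two flats, then plot the bin-by-bin ratio.
import numpy as np
import matplotlib.pyplot as plt
from astropy.io import fits

full = fits.getdata("flat_full_background.fits").ravel()
quick = fits.getdata("flat_quick_background.fits").ravel()

bins = np.linspace(min(full.min(), quick.min()),
                   max(full.max(), quick.max()), 100)
h_full, _ = np.histogram(full, bins=bins)
h_quick, _ = np.histogram(quick, bins=bins)
centers = 0.5 * (bins[:-1] + bins[1:])

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.step(centers, h_full, label="full background")
ax1.step(centers, h_quick, label="quick background")
ax1.set_ylabel("pixels per bin")
ax1.legend()

# ratio of the two distributions, leaving empty bins blank
with np.errstate(divide="ignore", invalid="ignore"):
    ratio = np.where(h_full > 0, h_quick / h_full, np.nan)
ax2.plot(centers, ratio)
ax2.axhline(1.0, linestyle="--")
ax2.set_xlabel("pixel value")
ax2.set_ylabel("quick / full")
plt.savefig("background_histogram_ratio.png")
```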

@cwwalter (Member)

@TomGlanzman

> If any of the "selected options" are known today, I would really like to include them asap to generate some benchmarks. Yes, 'cleardefects' does clear the 'chargesharing', so if the latter is desired, it would need to be reactivated explicitly.

OK, this is under active discussion right now, but I suspect

- BF
- Tree rings
- Cosmic rays
- Bleeding

will be on, and

- Fringing
- Saturation/non-linearity
- Crosstalk, hot pixels, columns, etc.

likely won't be on but might be.

@johnrpeterson

Rachel, for the validation test: this is a flat near the edge of the field, so there is a lot of vignetting. The three panels are the image with the different levels of approximation. The histogram is the relative flux in the pixels.

Chris, with quick background the sensor physics will be averaged over for the background & dome light on small spatial scales, so you cannot see the effects in the background, but could see them in the astrophysical sources. This makes turning on "tree rings", "BF", "fringing", for example, a little bit strange and inconsistent. Not saying you couldn't do this, but you would have to think through what DM is doing with these things. You could keep these on in the default optimization and there is no issue with that.

@cwwalter (Member)

> This makes turning on "tree rings", "BF", "fringing", for example, a little bit strange and inconsistent. Not saying you couldn't do this, but you would have to think through what DM is doing with these things. You could keep these on in the default optimization and there is no issue with that.

Hi John,

Thanks. As you know, I'm discussing now with DM people what corrections are currently possible, etc. I'm not sure I'm following completely what you are saying, so I want to make sure I understand (as we are working with the SAWG to check some related items in some tests now). Are you saying that with the quick background the photon bunching optimization is large enough that you won't expect to see BF, tree rings, etc. in background sky photons, but that you expect the effect will still be there for any astrophysical source, and they will be summed?

@johnrpeterson

Yes, basically. It's the large-scale patterns that are preserved, but not the smaller-scale sensor details. [But not sure what you mean by "summed".]

So let's take each one:

1. Brighter-fatter on
   - normal optimization: will see correlations in flats & background; will see stars getting brighter & fatter
   - quick optimization: will see stars getting brighter & fatter; no correlations in flats & background
2. Tree rings on
   - normal optimization: will see tree rings in flats & background; will see stars/galaxies having photometric/astrometric/PSF size/ellipticity effects related to tree rings
   - quick optimization: no tree rings in flats & background; will see stars/galaxies having photometric/astrometric/PSF size/ellipticity effects related to tree rings
3. Fringing on
   - normal optimization: will see fringing in flats & background; will see stars/galaxies having photometric effects related to fringing
   - quick optimization: no fringing in flats & background; will see stars/galaxies having photometric effects related to fringing

@cwwalter (Member)

> not sure what you mean by "summed"

Sorry, I just meant that the background with no effect and the sources with an effect will sum to the total electron level. This is what we are testing with the SAWG, because the BF correction kernel is extracted using the flats and we wanted to quantify the residual correlation size relative to the BF. So we will consider this.

Thanks.
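
For what it's worth, a minimal sketch of the residual-correlation check, measuring nearest-neighbor pixel correlations in a simulated flat (file name hypothetical), which is the kind of signal the BF kernel extraction relies on:

```python
# Sketch only: nearest-neighbor pixel correlations in a flat-field image.
import numpy as np
from astropy.io import fits

flat = fits.getdata("flat.fits").astype(float)
flat -= flat.mean()

var = (flat ** 2).mean()
# correlation with the neighbor one pixel away in x and in y
corr_x = (flat[:, :-1] * flat[:, 1:]).mean() / var
corr_y = (flat[:-1, :] * flat[1:, :]).mean() / var
print(f"nearest-neighbor correlation: x={corr_x:.4f}, y={corr_y:.4f}")
```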

@johnrpeterson

Definitely, that is fine with regular optimization.

For the others it really depends on what DM does, so we have to look into it. With fringing, for example, it's not common to even correct photometry for fringing effects, but it is common to remove fringing from the background. With tree rings, historically they have never been corrected correctly, so no one knows this, but it wouldn't be that important to correct the background for tree rings; it would be important to correct the astrometry of sources. Same with what you say about brighter-fatter: you'd use the flats to infer what to do about the sources, but probably not correct the flats, I'm guessing. So it's safe to say that none of the physics would be completely "reversed", but it's important to be clear about which observable consequences DM is capable of correcting now.

Also, with BF, you could make some special flats with normal optimization and then apply the BF corrections to images in the DC with quick background later.

@salmanhabib

I would like to suggest a few limited tests on one FOV (or some reasonable fraction thereof), where we compare the full-on background, the approximate background, and runs with fixed sources including all the background options. This way, we will have a data set where at least some of these questions can be directly addressed quantitatively. We can do this at Argonne.

@TomGlanzman (Contributor) commented Nov 17, 2017

As of 20:43 Sun 11/19/2017, all of the runs mentioned below are complete.

Update: A set of 11 visits using (mostly) DC1 visits and the DC1 phosim configuration, but with quickbackground enabled, is nearing completion. The first three visits are the first three DC1 visits. The 4th visit is not from DC1 (it was rejected because it took too many hours to complete) and is a performance check. The final seven visits are those suggested by Seth (above). These visits are summarized in this spreadsheet:

https://docs.google.com/spreadsheets/d/17-EUgaDMtfgQLt84WquxeAjycqJaqHJZS_jZqrQXOVw/edit?usp=sharing

The remaining sensor-visits should complete before midnight tonight. As I will be on vacation all next week, please feel free to take a look at the data once they are available (by checking the workflow web page: http://srs.slac.stanford.edu/Pipeline-II/exp/LSST-DESC/task.jsp?refreshRate=60&task=49412792).

@TomGlanzman (Contributor) commented Dec 14, 2017

Getting back to the issue of the phoSim command file to use for DC2, I am trying to keep notes on how the discussion is evolving. My current guess, based on DC1 with recent updates, follows. Note that it does not include all of @cwwalter's list, as I am not certain which combination of phoSim commands corresponds to his list - perhaps @johnrpeterson can help with the association?

```
# Enable centroid file production
centroidfile 1

# Disable sensor effects
cleardefects
fringing 0

# Enable brighter-fatter
chargesharing 1

# Disable dirt
contaminationmode 0

# Quick background
backalpha 0.1
backbeta 4.0
backgamma 1000.0
backdelta 1.0
activebuffer 600

# Number of sources handled by a single thread
sourceperthread 100
```
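
Since the question of which options differ between the DC1 file and DC2 candidates keeps coming up, here is a small sketch for diffing two physics-override files (file names hypothetical; the parser just splits each non-comment line into a command name and its arguments):

```python
# Sketch only: compare two phoSim physics-override files command by command.
def parse_commands(path):
    commands = {}
    with open(path) as f:
        for line in f:
            line = line.split("#", 1)[0].strip()  # drop comments and blanks
            if not line:
                continue
            name, *args = line.split()
            commands[name] = args  # e.g. {'cleardefects': [], 'backalpha': ['0.1']}
    return commands

dc1 = parse_commands("commands_dc1.txt")
dc2 = parse_commands("commands.txt")
for name in sorted(set(dc1) | set(dc2)):
    if dc1.get(name) != dc2.get(name):
        print(name, dc1.get(name), "->", dc2.get(name))
```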

@TomGlanzman (Contributor)

The actual commands.txt file (listed above) is in GitHub, currently here: https://github.com/TomGlanzman/DC2-phoSim-2/blob/master/NERSC/commands.txt

Please comment on its content in this issue.

@jchiang87 (Contributor, Author)

I propose that we discuss the content of this file the same way we would handle any other code: make a branch where the new version of the file is introduced, issue a pull request for that branch, and in the pull request people can comment directly on the relevant lines with proposals for changes. Since PRs (assuming the repo is configured properly) can only be merged after review, this ensures that the reviewers agree that the proposed changes are acceptable. Since this is a file that would be used for DC2 production, I think this commands.txt file should be in the DC2_Repo repository.

@TomGlanzman (Contributor)

Here is the PR to discuss the contents of the PhoSim commands/physicsOverride file:
#63

@katrinheitmann (Contributor)

Since we have shifted our testing to the DC2 era, I will close this issue. The most important point, checking the background implementations in more detail, has been moved to #113.
