Staged protoDC2 image generation #33

Closed · jchiang87 opened this issue Nov 6, 2017 · 272 comments

@jchiang87
Contributor

Since we will not be able to generate the full protoDC2 data set before the Sprint Week, we would like to produce an initial subset that would still be useful for the Working Groups in the near term.

We would like to do the full 25 sq degrees, so downscoping for the initial stage would mean fewer bands and a shorter observation time frame.

Questions:

  • How many visits (or sensor-visits) seem feasible to have done by Sprint Week? @TomGlanzman?
  • How many and which bands? @cwwalter?
  • What depth? 1 year? @cwwalter?
@cwwalter
Member

cwwalter commented Nov 7, 2017

It is hard to answer this question before we first address what the WGs would like to do in the sprint week. What studies do people envision that they can't address with DC1 or the catalog? Knowing this will help answer the question of what exactly would be useful. Do we have any information on this? For example, what is setting the 25 sq degrees?

Generally, I would think having all 6 bands is very important, since we haven't tried this yet from either an analysis or a CI perspective.

@katrinheitmann
Contributor

Weak lensing studies cannot be done with DC1. One weak lensing person (maybe Mike J.?) said two bands are sufficient for some reasonable tests. For the sprint week, one main goal is to make sure the full pipeline is set up correctly. Another aim is to ensure that all the data formats are well under control and that we can set up cross-catalog queries. The size of 25 sq degrees was chosen so that the area is small enough to be easily handled by CatSim, big enough that some statistics can be measured and validated, small enough that we could use the flat-sky approximation for weak lensing, and so on for many more reasons -- but mostly, small enough to be easily handled and big enough to enable validation tests.

@cwwalter
Member

cwwalter commented Nov 7, 2017

OK, if we mostly want to make sure the full pipelines work, then I would think all six bands are important. It might also be useful to think about how many exposures are necessary to test stack functionality. Maybe 10 minimum?

Also, do you want it to be dithered?

@jchiang87
Contributor Author

We have run the L2 pipeline on Twinkles data for all six LSST bands at 10 year depth for a DDF. I don't think the criteria should primarily be stack functionality for these data. I think we should try to provide something that the WGs would find useful for Sprint Week, so that we can understand and exercise catalog access, queries, etc. from their perspectives.

Don't we already know whether we want protoDC2 to be dithered? I would have assumed yes. There are still fundamental items that I don't see specified anywhere that are required before anyone can generate instance catalogs or images. What opsim db are we using? What object classes will be included in the instance catalogs? Just non-variable objects? Is a dithering implementation available now? If not, then I say we pick an opsim db and go with no dithering.

@cwwalter
Member

cwwalter commented Nov 8, 2017

> We have run the L2 pipeline on Twinkles data for all six LSST bands at 10 year depth for a DDF. I don't think the criteria should primarily be stack functionality for these data. I think we should try to provide something that the WGs would find useful for Sprint Week, so that we can understand and exercise catalog access, queries, etc. from their perspectives.

Oh, OK, I thought if we were testing the pipelines having the six bands would be important, and that it might also be useful from a CI perspective just to think about how to handle all of the different bands since we will need to do this for DC2.

> Don't we already know whether we want protoDC2 to be dithered? I would have assumed yes. There are still fundamental items that I don't see specified anywhere that are required before anyone can generate instance catalogs or images. What opsim db are we using? What object classes will be included in the instance catalogs? Just non-variable objects? Is a dithering implementation available now? If not, then I say we pick an opsim db and go with no dithering.

Yes, good questions. I think we need to have a bit more of a conversation about what protoDC2 is and exactly what it is for (including this scaled-down version for the hack week). It sounds like (at least for imSim) we can't make it feature-complete relative to what we would like to do by the spring. We are also still waiting to hear from the WGs and to make a decision on the cadence, including how to handle transient sources. So this would argue for focusing on what we want to test and running with what is ready, with the understanding that this output will be replaced in the future. So I think a clear list of what we want to test/study will help with this. Katrin said above that we need at least two bands for WL and:

" For the sprint week, one main goal is to make sure the full pipeline is set up correctly. Another aim is to ensure that all the data formats are well under control and we can set up cross-catalog queries."

I think for this, 6 bands, non-dithered, should be fine for now to start testing those things. There is a good chance we will find basic issues we will need to fix, so having everything perfect on this first go is not critical.

But I am of course interested in everyone else's opinion. If it is important that we use whatever output we make now (for NERSC allocation reasons or something), that is of course important too.

@katrinheitmann
Contributor

We have now established a sprint at the December meeting jointly with the WLPipe group for running the downscaled DC2 catalogs end-to-end through the weak lensing pipeline (this would even lead to cosmology parameters if it all works). Chihway sent the following message: "Usually weak lensing is done on r band or riz bands, but I think looking at all bands will be good if that’s easy to get."

Then we should discuss the PhoSim options for now, since that's what we are going to use for the Sprint week. My understanding is that all the features we want for DC2 are complete in PhoSim, right? According to issue #4, the features are: "BF, tree-rings and saturation, and perhaps fringing and cosmic rays." Is that correct?

Why would we do non-dithered?

If the LSS group has time during the Sprint week, they could also run some simple tests on the catalogs.

I think for now we should not worry about the transients, though I can ping Rahul again if people think they should be included as well.

@rmandelb Rachel: can you help us make the list of what is needed for a useful WL test? I am also happy to ask Chihway directly, but it would be good to first have a plan, one that we agree can be done, to show her. Thanks!

@rmandelb
Contributor

rmandelb commented Nov 8, 2017

Are you asking what's needed on the extragalactic catalog side, what types of images/bands/area for the image sims, what should be run on the DM side, all of the above, or something else entirely?

@katrinheitmann
Contributor

katrinheitmann commented Nov 8, 2017 via email

@jchiang87
Contributor Author

Since @danielsf will be generating the instance catalogs, I'm hoping he can weigh in with any items we might have missed that would be needed to make those files.

Regarding dithering: I think we should definitely include it, but my concern is that we don't have a suitable dithering implementation on hand. For DC1, we selected a region of the sky and a baseline opsim db file, then ran an afterburner code to generate the dithered visits. So I'm wondering: what opsim db file should we use for protoDC2, and can we use the old dithering afterburner, or is there something newer we should use?

@salmanhabib

Is there some "standard" dithering implementation from the project folks? One would think there would at least be a straw man --

@rmandelb
Contributor

rmandelb commented Nov 8, 2017

This is not a complete answer, but it has a few high-level points that I think we have to consider before I can give those details.

It depends on what level of test you are trying to do. For example: if you want to do this test with a subset of bands (which I personally think is good: we should have >1 band but not all 6), then we would have to use the true redshifts since we cannot get photo-z. Using true redshifts means we'd need the capability to easily match DM outputs against the truth catalogs. Is that functionality we expect to have in place by the sprint week? If not, then we would need all 6 bands plus photo-z code, which seems challenging in other respects.

The other question we have to ask is whether we need more area than we really want to do for this test using images. The images of real galaxies have shape noise, so realistically for weak lensing to have decent S/N there has to be a decent area. (In contrast, with the extragalactic catalog we could imagine using just the lensing shear without the intrinsic galaxy shape to test the pipeline without shape noise.) So I tried to answer this question of the area required, to figure out whether an image-based WL test even makes sense at this stage. Our estimate is that 1-year depth images should give an effective source number density (with reasonable cuts) of ~12/arcmin^2. In the regime of shape noise-limited WL measurements (which the LSST survey will not be, but a small-area sim will be), the S/N for cosmic shear scales like sqrt(area) * neff.

For HSC we expect to have S/N for cosmic shear of around 25 in our 136 deg^2 with neff=24/arcmin^2. Let's say for this test of DC2, it's interesting even with S/N=10. Given our LSST Y1 neff=12/arcmin^2, we can roughly estimate the area needed for an interesting WL test (this is ignoring redshift scalings which will modify it slightly):

area needed = (HSC area) * (2 * (10/25))^2

where the 2 comes from the number density difference, and the 10/25 comes from the S/N we can accept for this test. So we need about 0.64 times the HSC area, which is around 90 deg^2.
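Spelling out that arithmetic (a minimal sketch in Python, using only the numbers quoted above):

```python
# Re-evaluating the area estimate above; all inputs are the numbers quoted in this comment.
hsc_area = 136.0   # deg^2 of HSC used for the reference S/N
hsc_neff = 24.0    # HSC effective source density, per arcmin^2
hsc_sn   = 25.0    # expected HSC cosmic-shear S/N over that area

y1_neff   = 12.0   # estimated neff at LSST 1-year depth, with reasonable cuts
target_sn = 10.0   # S/N we would accept for this DC2 test

# S/N scales like sqrt(area) * neff, so:
#   target_sn / hsc_sn = (y1_neff / hsc_neff) * sqrt(area / hsc_area)
area_needed = hsc_area * ((hsc_neff / y1_neff) * (target_sn / hsc_sn)) ** 2
print(f"area needed ~ {area_needed:.0f} deg^2")  # ~87 deg^2, i.e. roughly 90 deg^2
```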

I guess if it's 2 bands = 1/3 of the bands for DC2, and 90/300 of the area for DC2, and 1/10 of the time (Y1 instead of Y10), then that amount of imaging would correspond to 1% of DC2. So maybe that's not bad after all?

Anyway, if it's acceptable to do 2 bands to Y1 depth in a ~90 deg^2 area, and if we'll be able to match to the truth catalogs easily, then we could plan to do this WL test. And I will give you the rest of what you asked for, but first I wanted to check: do the basics sound feasible?

If not, the WLPipe team definitely has something they can do at sprint week (working from the extragalactic catalog would already enable an interesting test).

@rmandelb
Contributor

rmandelb commented Nov 8, 2017

I just realized the initial issue in this thread said 25 deg^2, so maybe this test is not doable after all?

@rmandelb
Contributor

rmandelb commented Nov 8, 2017

If we want to stick with what's doable for 25 deg^2, then I think it's pretty clear we cannot do the project of going all the way to cosmology. I did some thinking about what could be usefully done:

  • A bunch of null tests for PSF model fidelity, and certain shear systematics. (The sensitivity of the tests will not be great, so this would mainly be a demonstration that we can get things all the way through the null test pipeline.) This would not require output vs. input catalog matching, and would operate purely on the DM outputs.

  • If there is concern about shear conventions, we can use output vs. input catalog comparisons to confirm that the shear conventions are all under control.

  • Basic sanity check of the measured RMS ellipticity dispersion (as a function of magnitude, redshift, etc.) compared to that in HSC or DES.

Edited to add: I think that 2 bands at Y1 depth would be enough. We should have a few pointings (not just 1), but within that 25 deg^2 constraint it should be fine.

@danielsf
Contributor

danielsf commented Nov 8, 2017

@jchiang87 I am finalizing the InstanceCatalog generation code here

https://github.com/danielsf/gcr-catalogs/tree/sed_fitting

I've been backchanneling Eve any time I run across something I don't understand. So far, I think we can work with protoDC2 as it is, now. I will let you know here if there is an insurmountable problem.

@cwwalter
Member

cwwalter commented Nov 9, 2017

> If we want to stick with what's doable for 25 deg^2, then I think it's pretty clear we cannot do the project of going all the way to cosmology. I did some thinking about what could be usefully done:

Are there basic performance checks (shapes, etc.) that we should/could be doing now with the DC1 dataset, with no external shear applied, in r-band? I'm a bit worried that people are excited about testing the analysis pipelines, but we are missing basic tests that we can do now and that would also exercise people learning how to interface with the data.

@cwwalter
Member

cwwalter commented Nov 9, 2017

> Is there some "standard" dithering implementation from the project folks? One would think there would at least be a straw man

Basically no. Dithering is added via an afterburner which adds a new column to the OpSim database. In the past Simon had done one, and then for DC1 Humna and Eric designed an optimized one, including applying a rotation on each filter change.

I guess if we are using the same OpSim database as before, those columns are still there, so we can use them for this small test. We also designed a way to remove the unnecessary simulation of sensors that would give us uneven depth in the region chosen by the LSS group. You still get more visits than with no dithering, but the "wasted" sensors are removed. There are a couple of ways of doing this.

This is one of the reasons I asked what we really wanted to test at the hack day. If we are making a small test that doesn't rely on the depth smoothing that dithering gives you, then running with the standard OpSim non-dithered pointings will give you what you need, with no extra work and no extra run factor for dithering and trimming issues.
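To make the afterburner idea concrete, here is a minimal sketch of such a post-processing step. The database filename, the Summary/fieldRA/fieldDec/obsHistID names (angles in radians, per the OpSim v3 schema), and the purely random in-field offsets are assumptions for illustration only; the actual DC1 afterburner used optimized offsets and applied a rotation on each filter change.

```python
import sqlite3
import numpy as np

# Minimal sketch of a dithering "afterburner": read the undithered pointings from an
# OpSim sqlite database and write per-visit dithered coordinates back as new columns.
FOV_RADIUS = np.radians(1.75)  # ~LSST field-of-view radius

conn = sqlite3.connect("minion_1016_sqlite.db")  # placeholder opsim db filename
cur = conn.cursor()
rows = cur.execute("SELECT obsHistID, fieldRA, fieldDec FROM Summary").fetchall()

for col in ("ditheredRA", "ditheredDec"):
    try:
        cur.execute(f"ALTER TABLE Summary ADD COLUMN {col} REAL")
    except sqlite3.OperationalError:
        pass  # column already exists from a previous run

rng = np.random.default_rng(42)
for obsHistID, ra, dec in rows:
    # uniform random offset within the field of view (flat-sky approximation)
    r = FOV_RADIUS * np.sqrt(rng.uniform())
    theta = rng.uniform(0.0, 2.0 * np.pi)
    d_ra = r * np.cos(theta) / max(np.cos(dec), 1e-6)
    d_dec = r * np.sin(theta)
    cur.execute(
        "UPDATE Summary SET ditheredRA = ?, ditheredDec = ? WHERE obsHistID = ?",
        (ra + d_ra, dec + d_dec, obsHistID),
    )

conn.commit()
conn.close()
```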

@cwwalter
Member

cwwalter commented Nov 9, 2017

Actually what the scheduler will really do about dithering is an interesting question. I can follow up on this in project meetings. @SimonKrughoff or @danielsf do you know anything about the plan / status of the work now?

@cwwalter
Member

cwwalter commented Nov 9, 2017

BTW our 40 square degrees before was basically 4 undithered pointings. So it is a bit discretized and, as I mentioned, you need to think about whether depth uniformity is important when dithering is included.

@rmandelb
Contributor

rmandelb commented Nov 9, 2017

> Are there basic performance checks (shapes, etc.) that we should/could be doing now with the DC1 dataset, with no external shear applied, in r-band?

There are PSF modeling checks that could be done, I am sure.

Were any shear estimation algorithms run in DC1?

@cwwalter
Member

cwwalter commented Nov 9, 2017

> Were any shear estimation algorithms run in DC1?

Yes, I think so; we ran the entire standard DM measurement algorithm suite. Let me check the actual variable list...

@cwwalter
Member

cwwalter commented Nov 9, 2017

Looks like we have at least SDSS shape and HSM. Take a peek here:

https://github.com/LSSTDESC/SSim_DC1/blob/master/Notebooks/Butler%20access%20demo.ipynb

@rmandelb
Contributor

rmandelb commented Nov 9, 2017

Yep. Any of the tests I listed in #33 (comment) could be done with DC1, it would seem.

@rmandelb
Contributor

rmandelb commented Nov 9, 2017

Actually now that I think of it I had already recommended the PSF model tests for DC1 (but I think I didn't realize regaussianization had been run, so I didn't recommend any galaxy shape tests).

@TomGlanzman
Contributor

There is now an operational workflow at NERSC running phoSim v3.7.1 with the quickbackground configuration on the first three DC1 visits. Due to queue latency, full results may not be available until ~Thursday, but some initial results are dribbling out of the system, which may allow for a realistic estimate of how much CPU and elapsed time will be required to generate [[staged]proto]DC2.

The first result is that, using the DC1 sky, 34 sensor-visits were completed on a single KNL node in ~177 minutes. There were 34 instances of phoSim per node running with 8 threads each. (Recall that DC1 imposed a mag=10 limit on bright stars.) This result holds for a fully-loaded KNL node in a realistic production context. It represents an average raytrace throughput of 1 sensor-visit every 5.2 minutes per node. Unless there are some super-scaling I/O issues, this should be a good planning number until final catalogs are available for test.

By comparison, DC1 sensor-visits averaged 560 minutes (with long tails on the distribution) running 8 instances per node with 8 threads each. The raytrace throughput on Haswell was more like one sensor-visit every 70 minutes per node, or a factor of 13.5 less (per node). This factor accounts for all changes in hardware, node loading, phoSim code fixes & improvements, and phoSim configuration differences (e.g., quickbackground).

A limitation of the KNL result is that it is still too early to measure operational inefficiencies - which could be significant. Also, I've not yet had the opportunity to run this new workflow at scale... with 100s of nodes, so there may be scaling surprises waiting to be discovered.


With regard to preparing data for the Sprint Week, consider what might be done in a single (very ideal) day of a ramped-up workflow running on 500 KNL nodes.

(1 sensor-visit / 5.2 min / node) * (500 nodes) * (1440 min/day) * (1 focal plane / 189 sensors) = 732 focal planes/day

This early in the game a derating factor of 5-10 is probably prudent.
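For reference, here is the same planning arithmetic in a few lines of Python, with the derating factors applied at the end (the numbers are the ones quoted above):

```python
# Throughput numbers quoted above, and the resulting planning estimate.
knl_rate = 177 / 34          # ~5.2 min per sensor-visit per node (34 sensor-visits in ~177 min)
haswell_rate = 560 / 8       # DC1: 560 min/sensor-visit, 8 instances per node -> ~70 min per node
print(f"KNL: {knl_rate:.1f} min/sensor-visit/node; "
      f"vs DC1 Haswell: {haswell_rate / knl_rate:.1f}x faster")   # roughly the factor of 13.5 above

nodes = 500                  # the (very ideal) ramped-up allocation considered above
sensors_per_focal_plane = 189
focal_planes_per_day = (1 / knl_rate) * nodes * 1440 / sensors_per_focal_plane   # ~732
for derate in (1, 5, 10):    # apply the prudent early-days derating factor
    print(f"derating factor {derate:>2}: ~{focal_planes_per_day / derate:.0f} focal planes/day")
```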

The trick will be assembling the missing pieces (instance catalogs, dithered map of sensors to simulate, agreeing on the desired phoSim configuration), AND getting jobs to run at NERSC in a timely way. In general, it can take days for jobs to start at NERSC so this becomes a critical factor given the limited time.

  • Tom

@salmanhabib

salmanhabib commented Nov 15, 2017

Tom's runs are done with what would nominally be the best number of threads (34 * 8 = 272, the total number of hardware threads on the KNL). From the previous KNL performance numbers I got from Glenn and John, I had estimated 3.3 minutes per job, so I am a little puzzled by the 5.2 minutes we are seeing. What's the difference between these two cases?

@fjaviersanchez

@cwwalter I don’t know if it’s just a fluctuation, but you are right that there should be no trend if the moon is below the horizon, especially given that when the altitude is below zero the moon is deactivated. The only thing I can think of is that PhoSim needs the moon to get the correct background levels?

@fjaviersanchez

@sethdigel: Is it possible to repeat one of these exposures with the moon below zero but activated, and see if it makes a difference?

@sethdigel

@fjaviersanchez I looked at the code and the phoSim reference guide a bit. I can't say that I entirely understand the code, but I suspect that changing the one line that sets moonMagnitude to 10000 when moonalt < 0 may not be all that informative. On closer look I see that, not surprisingly, the calculation is more complicated than that. I took a quick look at Krisciunas & Schaefer (1991), cited as the source of the moon brightness model, and I'd guess that the code is separately calculating the Mie scattering and Rayleigh scattering components of the sky brightness. I got excited briefly by seeing that Rayleigh scattering depends on the square of the cosine of the angular distance from the Moon. But this looks like the component that gets turned off when the Moon goes below the horizon. Anyway, I don't doubt that John & co. have carefully validated the model.

@sethdigel

sethdigel commented Dec 9, 2017

Here's something a little interesting. As part of DC2-phoSim-1, Tom ran the same full focal plane visit (obsHistID 138143) using phoSim compiled with two different compilers, gcc and Intel. Here's how the reported CPU times compare:

[Figure: cpu_comp_11_12 - per-sensor-visit CPU times, gcc vs. Intel builds of phoSim v3.7.1]

The points are the CPU times for the individual sensor visits. The gcc times are about 33% longer on average - probably not a surprise. These are for phoSim v3.7.1, and both used KNL hosts.

He also ran the same visit using phoSim v3.7.5 with the Intel compiler.

[Figure: cpu_comp_11_13 - per-sensor-visit CPU times, phoSim v3.7.1 vs. v3.7.5, both built with the Intel compiler]

For this visit phoSim v3.7.5 is 20% slower on average. This is probably not a surprise either, as it is doing less optimization. Both used KNL hosts.

I got the CPU times from the RunRaytrace step of DC2-phoSim-1

@johnrpeterson

Although it should be a little bit slower, it might not be as significant as these plots show. The reason is that it is only slowing down a few bright stars, so if you look carefully you may realize you just have one thread lingering after the other 7 complete, or something like that. Then when you fully load the node with a few dozen phoSim instances running simultaneously, the overall slowdown should probably be rather small.

But 20% is within the noise of a lot of our estimates, so I am happy with this.

@TomGlanzman
Contributor

TomGlanzman commented Dec 13, 2017

This morning a discussion about phoSim instanceCatalog generation got started, some details of which should be preserved. The upshot is that the current scheme is not a good match to a large production on cori-knl. Various ideas are being discussed. Below is the entire thread for reference, earliest message first.

------------------------------------

Tom Glanzman (Dec 13, 8:55 AM):

Scott and Jim,

Thank you for your advice and reassurance yesterday. The DC2 instance catalog generator seems to be working fine mechanically.

One concern is that while this process clocked in at 15 minutes on a cori login node, my first attempt to run it on KNL required a whopping 103 minutes to complete. Despite not knowing exactly what is going on internally with generateDc2InstCat.py, I did monitor the catalog file sizes while the job was running and can state that much of this 103 minutes was spent building the gal_cat_947161.txt file. I observed the size of this file growing at roughly 100 MB every 7 minutes. This file is ~1.1 GB prior to gzipping.

How exactly does the galaxy catalog part of this process work? Is there a large galaxy catalog stored somewhere at NERSC and, if so, perhaps it should be copied to $SCRATCH for performance reasons?

- Tom

Scott Daniel (Dec 13, 09:31 AM):

Hi Tom,

The galaxy catalog is stored at NERSC. Moving it to $SCRATCH will entail some work, simply because we are using Yao-Yuan Mao's generic catalog reader to access the catalog, and that contains some hard-coded paths to the catalog. I suspect it will be straightforward to redirect it to a $SCRATCH copy, I just don't know how to do that off the top of my head. I suspect it will involve creating a modified version of this file

https://github.com/LSSTDESC/gcr-catalogs/blob/master/GCRCatalogs/catalog_configs/proto-dc2_v2.1.yaml

but we should check with Yao-Yuan if we decide to go down this path.

One other concern: eventually, we will retarget this code to the actual DC2 galaxy catalog. I do not know how large that will be and whether or not it will be possible to load it into $SCRATCH. That is probably a question for Eve Kovacs.

Cheers,
Scott

Tom Glanzman (Dec 13, 1:37 PM):

Hi Scott,

Thanks for that pointer. I've been able to trace gcr-catalogs activity to this NERSC directory:

/global/projecta/projectdirs/lsst/groups/CS/descqa/catalog

But from there it becomes more obscure which files are actually used for instanceCatalog generation. Exactly which of these files, some of which are ~40 GB each, are used? If one or more large catalog files are routinely read during instanceCatalog generation, then we should seriously consider moving/copying them to Lustre ($SCRATCH) for production.

I have added Yao for some assistance, and Mustafa so he can help us make this work in a large production.

- Tom

Yao-Yuan Mao (Dec 13, 10:42 AM):

Tom,

You can find the actual paths of the galaxy catalogs (i.e. pre-CatSim catalogs) in the yaml config files, which are listed here:

https://github.com/LSSTDESC/gcr-catalogs/tree/master/GCRCatalogs/catalog_configs

For example, protoDC2 v2.1 points to /global/projecta/projectdirs/lsst/groups/CS/descqa/catalog/ANL_AlphaQ_v2.1.hdf5

However, that catalog is already outdated. You may want to consider switching to v2.1.1.

Best,
Yao

Tom Glanzman (Dec 13, 1:51 PM):

Hi Yao,

Thanks for that advice; who is the one to make such a change? And how does that propagate through the various bits we have put together for protoDC2?

I think the immediate issue is that we cannot efficiently read these huge files from Cori compute nodes (i.e., KNL nodes). Our 'project' disk space is in a r/w GPFS area which is inefficient for the compute nodes. We need to consider moving the large files into the Lustre filesystem ($SCRATCH). How would you recommend we do that?

-rw-r--r-- 1 kovacs   lsst 41012452272 Dec  2 16:19 ANL_AlphaQ_v2.1.hdf5

(Mustafa: are there other tricks we might use with a 40 GB file?)

- Tom

Yao-Yuan Mao (Dec 13, 11:02 AM):

For now I would say you can just copy the file to the disk you want, and change the path in the yaml file (or use the config_overwrite keyword in GCRCatalogs.load_catalog).

Does that sound reasonable?

Best,
Yao

Tom Glanzman (Dec 13, 11:18 AM):

Thanks Yao. I will wait for Scott/Jim to chime in, as there were a number of steps involved in making gcr-catalogs operational for the instanceCatalog generation. Hopefully, they can also help with your recommended updates.

In the meantime, I will attempt to take your advice, copying ANL_AlphaQ_v2.hdf5 to $SCRATCH and then modifying this file:

/global/projecta/projectdirs/lsst/production/DC2/gcr-catalogs/GCRCatalogs/catalog_configs/proto-dc2_v2.0.yaml

which contains:

subclass_name: alphaq.AlphaQGalaxyCatalog
filename: /global/projecta/projectdirs/lsst/groups/CS/descqa/catalog/ANL_AlphaQ_v2.hdf5
lightcone: true
creators: ['Eve Kovacs', 'Danila Korytov', 'Katrin Heitmann', 'Andrew Benson']

Thanks,
- Tom

Scott Daniel (Dec 13, 11:25 AM):

Hi Tom,

I cannot speak to the final size of the DC2 catalog. That is a question for Eve Kovacs and the ANL team that will actually be producing these catalogs.

The code that ultimately reads in the catalogs is here:

https://github.com/LSSTDESC/gcr-catalogs/blob/master/GCRCatSimInterface/DatabaseEmulator.py#L98

It will definitely not scale. Because protoDC2 is so small, the code just reads the whole catalog into memory and deals with spatially querying it (i.e. finding only those galaxies which are within your field of view) on the fly. This will not work for the final DC2 catalog. How we deal with scaling will depend on what kind of spatial index (probably healpixels, but I'm not sure) the catalog producers apply to the DC2 catalog.

Does that answer all of your questions?

Cheers,
Scott

Tom Glanzman (Dec 13, 2:32 PM):

Hi Scott,

If the gcr-catalogs code truly reads ~40 GB into memory and then sifts through it in the process of creating the phoSim instanceCatalog, then this is even worse than I had feared: we would only be able to run *two* instances per KNL node! This will not even scale to the needs of protoDC2. Adding Eve to the cc: list.

- Tom

Yao-Yuan Mao (Dec 13, 11:35 AM):

Tom,

What's the typical sky area you want to be kept in memory at once?

Best,
Yao

Tom Glanzman (Dec 13, 2:46 PM):

Yao,

This step produces the full (pre-trimmed) instance catalog used by phoSim for a single visit, so the amount of sky is based on how many sensors are involved. In terms of memory, one KNL node has 96 GB to share amongst 272 hyperthreads (or 68 cores). Given that this particular step is single-threaded, we must keep the memory footprint to a minimum. I'd like to plan on running of order 50 simultaneous instances of this step, although they may not all end up on the same KNL node (over which I have little control). Ideally, such a step would limit itself to just a few GB.

- Tom

Yao-Yuan Mao (Dec 13, 11:54 AM):

OK. Given the numbers you mentioned, I think you want less than about 2 sq. degs per instance (in terms of the original galaxy catalog). This is not likely to happen for file-based catalogs. I think there are two options (or more):

(1) We use GCRCatalogs to convert the catalog to a database, or

(2) Somehow we use CatSim to get only the necessary part of the catalog.

However, I am not sure if my suggestion is useful at all, because I am not sure how this process actually works. Scott or Tom, can you elaborate a bit more on the connection between CatSim and this pre-trimming step? Is CatSim being run on the fly? Or do you use CatSim to create an instance catalog and then trim it?

Best,
Yao

Scott Daniel (Dec 13, 3:01 PM):

Hi Yao,

Assuming that the DC2 catalog files are sharded into healpixels, how does the GCR's native "query by healpix id" functionality work? Is it only capable of loading an entire shard (i.e. if the catalog is natively divided into nside=8 healpixels, is GCR limited to loading an nside=8 healpixel at a time, or can it load a subset of that)?

Cheers,
Scott

Yao-Yuan Mao (Dec 13, 3:03 PM):

It'll load whatever chunk the catalog file was broken into, so if the catalog uses nside=8, GCR is limited to loading an nside=8 healpixel at a time.

We can obviously create a file that is broken into smaller chunks, but I'm not sure if this is the best route...

Best,
Yao

Yao-Yuan Mao (Dec 13, 12:13 PM):

Scott,

It only loads needed columns (unless the underlying file does not support that, but protoDC2 does).

So I think we can edit this function: https://github.com/LSSTDESC/gcr-catalogs/blob/master/GCRCatSimInterface/DatabaseEmulator.py#L97

We first load in ra and dec, create a mask, and then load in the other quantities one by one and filter them with the mask.

Also, GCR supports renaming labels -- something we can also change in the said function.

BTW, in this file:
https://github.com/LSSTDESC/DC2_Repo/blob/master/scripts/protoDC2/generateDc2InstCat.py#L118

we should now use 'protoDC2' instead of 'proto-dc2_v2.0'.

Yao

Scott Daniel (Dec 13, 12:16 PM):

I agree, Yao. That masking scheme will probably work.

@yymao
Member

yymao commented Dec 13, 2017

What I learned in that giant thread ---

  • We need to be able to access the catalogs in a low-memory overhead fashion.
  • For now, we can probably achieve that by improving this module (see the sketch at the end of this comment). I think @danielsf and I will start to make this change.
  • For full DC2, we either need to use a database, or break the catalog files into small enough sky patches. @evevkovacs
  • If the catalog is file-based, we probably need to move/copy it to scratch for runtime efficiency.

Also learned but not related to the previous points:

  • Recent updates to protoDC2 did not propagate to the image pipeline. We should use 'protoDC2' at this place. @jchiang87
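
Here is a minimal sketch of that load-ra/dec-first, build-a-mask, then filter-column-by-column idea. The load_column helper and the circular field-of-view cut are placeholders for illustration, not the actual GCRCatalogs / DatabaseEmulator API:

```python
import numpy as np

def load_column(name):
    """Placeholder for a column-by-column read of the galaxy catalog
    (e.g. one dataset at a time out of the protoDC2 HDF5 file)."""
    raise NotImplementedError

def load_field_of_view(ra0_deg, dec0_deg, radius_deg, quantities):
    """Return only the rows inside a circular field of view, never holding
    more than one full column in memory at a time."""
    ra = np.radians(load_column("ra"))
    dec = np.radians(load_column("dec"))
    ra0, dec0 = np.radians(ra0_deg), np.radians(dec0_deg)

    # angular separation via the haversine formula, then the spatial mask
    hav = (np.sin((dec - dec0) / 2.0) ** 2
           + np.cos(dec) * np.cos(dec0) * np.sin((ra - ra0) / 2.0) ** 2)
    sep = 2.0 * np.arcsin(np.sqrt(hav))
    mask = sep < np.radians(radius_deg)

    # load the remaining quantities one by one and filter immediately, so the
    # memory high-water mark stays near the size of a single column
    out = {"ra": np.degrees(ra[mask]), "dec": np.degrees(dec[mask])}
    for q in quantities:
        out[q] = load_column(q)[mask]
    return out
```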

@TomGlanzman
Contributor

@yymao It would be good, for the sake of provenance, to know exactly which version of that catalog was used for any given run. A publicized tag of gcr-catalogs would probably suffice.

(Also, I can override the default, proto-dc2_v2.0, if necessary.)

@danielsf
Contributor

@yymao Could you please take a stab at fixing the CatSimInterface as we described? I have a lot of other work that I need to focus on this week. Thanks.

@yymao
Member

yymao commented Dec 13, 2017

@danielsf sure.
@TomGlanzman I'll let you know when I finish the change, hopefully by tomorrow...

@cwwalter
Member

cwwalter commented Dec 13, 2017

Can someone send me that email (or reformat it somehow)? Because of the way the wrapping is working it is basically impossible to read here...

(I tried editing it and putting it in triple quotes but that didn't help)

@cwwalter
Member

Yao sent it to me... no need for more copies...

@cwwalter
Member

So that I understand: one of the constraints here is that you have decided that the job that produces the instance catalog has to be the same one that runs the phoSim instance, correct?

@TomGlanzman
Contributor

@cwwalter I assume that last question is for me? So, not quite. The first workflow batch job creates the instanceCatalog, then runs phosim.py up to the point where it attempts to submit condor jobs and then quits, followed by a bit of bookkeeping. This job is composed of single-threaded computation. In general, it will run on a KNL node where a lot of other jobs are running, so memory consumption is a concern.

@cwwalter
Member

Right, sorry... So by "job" I mean it all runs together on the same processor. Could you make the instance catalogs on a haswell node (I assume this step is pretty fast), then copy the file to the memory space that looks like a disk on the KNL side (sorry, I can't remember what this is called), and then run the raytrace on KNL?

@TomGlanzman
Contributor

@cwwalter Yes, but... There are two distinct problems related to instanceCatalog generation: 1) compute nodes (either knl or haswell) reading a 40 GB file residing on the r/w GPFS project area file system; and 2) a large memory footprint (experimentally determined to be >~8 GB, high-water mark). Once the instanceCatalog has been generated, I do not anticipate further problems (based on DC1 experience).

@salmanhabib

I am surprised that reading a 40GB file is a bottleneck. It should not be -- we should look at how the IO is being done.

As to the second question, how many individual compute jobs are needed to generate the instance catalog?

@cwwalter
Member

That's why I asked about this feature I remembered you, Salman, and Heather explaining. I thought there was some on-chip memory that acted like a fast disk. I was asking if you could stage the file there... Or is it too small?

@salmanhabib

There's a burst buffer on Cori that can help speed up IO, but even without it we should be ok. (I would avoid the BB for now if we don't have to use it -- another thing that can go wrong.)

@evevkovacs
Contributor

@TomGlanzman @yymao Re the issue with using an out-of-date version of protoDC2: Tom, are you using the latest version of GCRCatalogs? Last week, Yao implemented version control for both the catalogs and the reader to tell the user whether they are using the latest version. You should switch to that.

@TomGlanzman
Contributor

@cwwalter Yes, we could experiment with the burst buffer, but as is written up here, whether there is a net benefit depends on the details of the I/O. But how to characterize the I/O? The gcr-catalogs authors might wish to chime in. Also @MustafaMustafa might wish to comment on whether there are specific tools available to characterize I/O on running jobs.

@salmanhabib A single compute (slurm) job is currently required to generate the instanceCatalog for a single visit. That step is single-threaded and somewhat memory intensive. The reason for thinking the disk I/O is a bottleneck is that running the exact same IC generation on knl is 7x slower than on a cori login node (haswell-like), and the standard advice is not to perform I/O to the r/w GPFS file system. As you have KNL at Argonne, I would welcome your confirmation of this behavior and further analysis of the cause of this slow-down.

@evevkovacs I am using the setup provided by @jchiang87 and @danielsf which is not the latest and greatest. That is good news about the versioning so I will follow up with them about upgrading.

@TomGlanzman
Contributor

Update on instanceCatalog generation: after moving the 40GB catalog from GPFS to Lustre, the elapsed time to run the same config dropped from 103 min to 94 min. The time on cori login remains at 14.5 min.

The memory high-water mark remains at just above 8 GB.

Note to @cwwalter - Our current workflow system should allow running different steps on different architectures. This is a mode with which I have no experience, and it would significantly complicate the operation of the workflow (due to managing pilots on both cori-knl and cori-haswell).

@cwwalter
Member

@TomGlanzman Wasn't there something other than the burst buffer where we were trying to put our software when the startup time was so slow?

@TomGlanzman
Contributor

@cwwalter There are several methods recommended by NERSC:

  1. Avoid r/w GPFS and use Lustre
  2. Use the r/o GPFS in /global/common/software
  3. Use the burst buffer
  4. Use a container

You may be referring to item 2.? I have heard (and I do not mean from DJT) that 1. > 2.

@cwwalter
Member

Yes, I think I was remembering 2.

@TomGlanzman
Contributor

Thanks to @yymao, the gcr-catalogs package has reduced its memory footprint from ~6.2 GB (max resident) down to ~1 GB. This problem is solved. We are getting very close to having a working config.

Note that NERSC (cori) is down today until 20:00 PST for maintenance.

@katrinheitmann
Contributor

This issue is now outdated. Run 1.0 is underway, and for Run 1.1 we have new specifications. If there is anything in this issue that should be kept open, we should capture it in a new issue. So for now, I am closing this.
