This repository has been archived by the owner on May 6, 2024. It is now read-only.

Re-Simulation Processor #49

Closed
tomeichlersmith opened this issue Mar 10, 2021 · 8 comments · Fixed by #95 or LDMX-Software/ldmx-sw#1183

Comments

@tomeichlersmith
Member

We already have two generators that require an input file formatted the same as our event files and these generators attempt to restore the configuration of the random number generator in Geant4 at the time of the first simulation.

I've been trying to use these generators to do some re-sim and it isn't working.

I think this is because the Simulator processor is designed to be the first Producer in a production chain. I think we can get around this issue by writing another processor that is specifically focused on re-simulation. This processor would be designed to provide the current event bus to other simulation parts before starting the simulation, so that Geant4 can be reconfigured identically to the original run. Then we can run the ReSimulator processor as the first Producer in an analysis chain.

This will probably require some modifications to other parts of our simulation infrastructure. Another potential solution is to try to have the Simulator work for both running modes; however, I ran into issues trying a few "easy" fixes, so I think separating them will be easier.

@EinarElen
Contributor

EinarElen commented Feb 21, 2023

I started looking into this a bit for my own work. As far as I can tell, what we store in the eventSeed part of the event header isn't what you need to restore the random number state. We store the state of the RNG at the end of the simulated event, which isn't helpful if you want to repeat a simulation. You'd want the RNG state at the start of the event (preferably at the start of the first successful attempt for the event) or wherever you actually want to restart the simulation from (e.g. the end of the primary generation for RootCompleteReSim). I don't know if it is possible to repeat a simulation with the same RNG state from the scoring planes.

I've tried to see if you can rescue things with the existing eventSeed, but I don't think that's possible.
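To illustrate why the end-of-event state is not enough, here is a minimal sketch that uses `std::mt19937` as a stand-in for the Geant4/CLHEP engine. The helper names (`simulateEvent`, `saveState`, `restoreState`, `compareReplays`) are hypothetical and are not part of ldmx-sw or Geant4; the "event" is just a handful of random draws.

```cpp
#include <cassert>
#include <random>
#include <sstream>
#include <string>
#include <vector>

using Engine = std::mt19937;  // assumption: stand-in for the Geant4 RNG engine
using Draws = std::vector<Engine::result_type>;

// Stand-in for one simulated event: consume a few random numbers.
Draws simulateEvent(Engine& engine, int nDraws = 4) {
  assert(nDraws >= 0);
  Draws draws;
  for (int i = 0; i < nDraws; ++i) draws.push_back(engine());
  return draws;
}

// Serialize the full engine state to text.
std::string saveState(const Engine& engine) {
  std::ostringstream out;
  out << engine;
  return out.str();
}

// Overwrite the engine state with a previously saved one.
void restoreState(Engine& engine, const std::string& state) {
  std::istringstream in{state};
  in >> engine;
}

struct ReplayResult {
  bool fromStartMatches;  // replay from the state saved before the event
  bool fromEndMatches;    // replay from the state saved after the event
};

// Run one event, then replay it from both saved states and compare.
ReplayResult compareReplays(unsigned seed) {
  Engine engine{seed};
  const std::string stateAtStart = saveState(engine);
  const Draws original = simulateEvent(engine);
  const std::string stateAtEnd = saveState(engine);

  Engine replay;
  restoreState(replay, stateAtStart);
  const bool fromStart = (simulateEvent(replay) == original);
  restoreState(replay, stateAtEnd);
  const bool fromEnd = (simulateEvent(replay) == original);
  return {fromStart, fromEnd};
}
```

Calling `compareReplays(42)` should report a match only for the start-of-event state; restoring the end-of-event state produces the draws of the *next* event instead.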

Have you any thoughts on this @tomeichlersmith ?

@EinarElen
Contributor

For context, what I'm doing for my purposes is something morally equivalent to:

  • Add a parameter to Simulator called repeat_events which takes a list of event numbers to record the seeds from/repeat
  • If no such events have been run before (or if repeat_events is empty),
    • run the full simulation including the history leading up to the events you are interested in as normal. At the start of the events of interest, record the RNG state to a clean stringstream
      G4Random::saveFullState(seedStream);
  • If the event is not aborted, write the contents of the stream to disk
  • If the seed files are on disk already
    • Set the maxEvents to the number of events you want to resimulate
    • Restore the RNG state from that event file, i.e. something like
      auto eventToRepeat{repeat_events_[eventNumber - 1]};
      std::ifstream seedInputFile{"seed_run_" + std::to_string(run_) + "_event_" +
                                  std::to_string(eventToRepeat) + ".txt"};
      G4Random::restoreFullState(seedInputFile);

and this seems to be working for my purpose. I haven't checked whether the events are exactly the same yet, though.
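The two-pass workflow described above can be sketched roughly as follows. This is not the actual Simulator code: `std::mt19937` stands in for the Geant4 engine, stream insertion/extraction stands in for `G4Random::saveFullState`/`restoreFullState`, and `seedFileName`, `recordPass`, and `repeatPass` are hypothetical names. Handling of aborted events is left out for brevity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <fstream>
#include <map>
#include <random>
#include <string>
#include <vector>

using Engine = std::mt19937;  // assumption: stand-in for the Geant4 RNG engine
using Draws = std::vector<Engine::result_type>;

// Same file naming scheme as in the comment above.
std::string seedFileName(int run, int event) {
  return "seed_run_" + std::to_string(run) + "_event_" +
         std::to_string(event) + ".txt";
}

// Stand-in for the Geant4 processing of one event.
Draws simulateEvent(Engine& engine) {
  Draws draws;
  for (int i = 0; i < 3; ++i) draws.push_back(engine());
  return draws;
}

// First pass: simulate every event, writing the start-of-event RNG state
// to disk for the events of interest.
std::map<int, Draws> recordPass(int run, int nEvents,
                                const std::vector<int>& repeatEvents) {
  assert(nEvents >= 0);
  Engine engine{static_cast<Engine::result_type>(run)};
  std::map<int, Draws> results;
  for (int event = 1; event <= nEvents; ++event) {
    const bool ofInterest = std::find(repeatEvents.begin(), repeatEvents.end(),
                                      event) != repeatEvents.end();
    if (ofInterest) {
      std::ofstream seedFile{seedFileName(run, event)};
      seedFile << engine;  // state *before* the event consumes any numbers
    }
    results[event] = simulateEvent(engine);
  }
  return results;
}

// Second pass: simulate only the events of interest, restoring the saved
// state before each one. Events without a readable seed file are skipped.
std::map<int, Draws> repeatPass(int run, const std::vector<int>& repeatEvents) {
  std::map<int, Draws> results;
  Engine engine;
  for (std::size_t eventNumber = 1; eventNumber <= repeatEvents.size();
       ++eventNumber) {
    const int eventToRepeat = repeatEvents[eventNumber - 1];
    std::ifstream seedFile{seedFileName(run, eventToRepeat)};
    if (!(seedFile >> engine)) continue;
    results[eventToRepeat] = simulateEvent(engine);
  }
  return results;
}
```

With this sketch, `recordPass(7, 10, {3, 8})` followed by `repeatPass(7, {3, 8})` should give identical draws for events 3 and 8 without re-running the eight other events.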

@tomeichlersmith
Member Author

First of all, I agree that moving the recording of the RNG state to the start of the event is key. I think, in general, my long term view is to write a separate processor for handling re-simulations.

Separate ReSim Class

I think we need a separate ReSimulator that mirrors Simulator for most actions. The reasons I think it needs to be separate are the following:

  • Simulator is built upon the assumption that there is no input file. This includes a bunch of auxiliary work it does to ensure that the event file has the information required by downstream processors. Including a "resim" option would highly complicate this.
  • Re-instating the RNG is only one piece of the puzzle. In order to have the same events, we also need to ensure that the primary generator is the same and that the detector is the same (or at least is being changed intentionally). This can be done by looking at the input file's run header information.
  • "on disk already" -> This is a simple task in theory, but is currently not well handled by our Framework. Storing the event seeds in their own file, separate from the event file, should work, but it would add complexity for folks trying to share their setup with each other.
  • I don't think the repeat_events parameter is necessary for the Simulator. Instead, I'd include this parameter in the ReSimulator and it will just "Abort" events while reading the input file until a matched event number is found. Additionally, if that parameter is not provided, the ReSimulator can just re-sim all events.
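The event selection in the last bullet could look something like the sketch below. The function names are hypothetical, and an empty list is assumed to mean "parameter not provided"; in the real processor the unselected events would be aborted rather than filtered out of a vector.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: should the ReSimulator process this input event?
// Assumption: an empty repeatEvents list means "re-simulate everything".
bool shouldResimulate(int eventNumber, const std::vector<int>& repeatEvents) {
  if (repeatEvents.empty()) return true;
  return std::find(repeatEvents.begin(), repeatEvents.end(), eventNumber) !=
         repeatEvents.end();
}

// Walk over the event numbers found in the input file and return the ones
// that would be re-simulated; all others would be aborted.
std::vector<int> selectEvents(const std::vector<int>& inputEvents,
                              const std::vector<int>& repeatEvents) {
  std::vector<int> selected;
  for (int eventNumber : inputEvents) {
    assert(eventNumber > 0);  // assumption: event numbers start at 1
    if (shouldResimulate(eventNumber, repeatEvents)) {
      selected.push_back(eventNumber);
    }
  }
  return selected;
}
```

For an input file holding events 1 through 5 and `repeat_events = [2, 5]`, only events 2 and 5 are selected; with no `repeat_events`, all five are.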

For implementation, I'd have an abstract base that both Simulator and ReSimulator inherit from so that they can share common methods while providing their own specializations on top. Writing a whole new ReSimulator would be a lot of work, so if you got something working, I'd be interested in seeing the code. It may be better than what I'm imagining and so should be merged.
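As a rough picture of the base-class idea, the sketch below keeps the shared per-event flow in a common base and lets each processor supply only the RNG handling. All class and method names here (`SimulatorBase`, `processEvent`, `prepareRandomState`) are hypothetical, and the real processors would derive from the Framework's producer type and do actual Geant4 work instead of returning strings.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical shared base; none of these names come from ldmx-sw.
class SimulatorBase {
 public:
  virtual ~SimulatorBase() = default;

  // Common per-event flow: the RNG step is specialized, the rest is shared.
  std::vector<std::string> processEvent(int eventNumber) {
    assert(eventNumber > 0);
    std::vector<std::string> steps;
    steps.push_back(prepareRandomState(eventNumber));
    steps.push_back("run Geant4 event");
    return steps;
  }

 protected:
  // Each derived processor decides how the RNG state is handled.
  virtual std::string prepareRandomState(int eventNumber) = 0;
};

// Fresh simulation: record the state so the event can be repeated later.
class Simulator : public SimulatorBase {
 protected:
  std::string prepareRandomState(int) override {
    return "record RNG state at start of event";
  }
};

// Re-simulation: restore the state stored with the input event.
class ReSimulator : public SimulatorBase {
 protected:
  std::string prepareRandomState(int) override {
    return "restore RNG state from input event";
  }
};
```

The point of the split is that anything identical between the two modes lives once in the base, while the input-file handling stays out of Simulator entirely.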

@EinarElen
Contributor

Yeah, I just wanted to be clear about what I was doing for my needs, and that it looks like picking the seed in a different place is enough for reproducing most basic events. I agree that dumping things to a random file isn't what we want :)

A base class approach was what I had in mind. I think, to get started, the check for ensuring the same primary generator/detector can be left for later (and for the detector, I guess we actually wouldn't want to keep it the same in most use-cases). I haven't written this yet, as what I have has been enough for what I need right now, but I think I could give it a go some time this week since I'm working on related things anyway.

@EinarElen
Contributor

EinarElen commented Feb 21, 2023

One question would be, do we know if anyone is currently using the eventSeed part in their work? I could imagine someone depending on it as a basic check for identical events or something but I'm hoping not. If not, changing what it means would be straight-forward.

One concern would be that we'd be reading that stream once for every attempted event but I think that overhead is relatively small compared with the rest of the Geant4 processing.

@tomeichlersmith
Member Author

I am not aware of anyone using the eventSeed currently and I agree that the overhead of reading the stream is negligible relative to the rest of the Geant4 sim.

@EinarElen
Contributor

Ok, I'll see what I can do for this. If a resim processor works, would there be any purpose to keeping the rootCompleteReSim generator? It would change its behavior, but afaik that behavior is wrong right now anyway.

@tomeichlersmith
Member Author

No - that generator would be removed.

The ResimFromEcalSP generator would be kept around to (hopefully) enable re-sim of particles leaving the ECal to make studying different HCal geometries more efficient, but that is a much more complicated problem since it is attempting to start an event in the middle of it.
