"Hooks" for ShootFromSnapshotsSimulation (i.e., committor) #755

Merged
merged 23 commits into from
Jun 29, 2021

Conversation

dwhswenson
Copy link
Member

@dwhswenson dwhswenson commented Jan 30, 2018

This begins to introduce the idea of "hooks" (which @jhprinz has previously called "reporters") to our main simulation objects. The goals here are to:

  1. Enable much more flexibility for users
  2. Simplify code by putting repeated sections into hooks (e.g., storage management is the same in PathSampling or Committors: save the result after each step, sync after save_frequency steps, and sync at the end -- all this code can go into a storage hook, avoiding repeated code and making the main code easier to read)
  3. Prepare for task-based parallelization: By reorganizing the structure of the simulator, it should be easier to properly create a task for the parallelization I will be working on.

In this PR, I add the basic functions for hooks, and replace code in the ShootFromSnapshotsSimulation with hook-based versions. The default behavior will be indistinguishable from the original, but now there are a lot more possibilities.
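As a rough illustration of goal 2, the storage pattern described above (save after each step, sync every `save_frequency` steps, sync at the end) could be factored into a hook along these lines. This is only a sketch: `InMemoryStorage`, `StorageHookSketch`, `run_simulation`, and the simplified method signatures are hypothetical and are not the classes added in this PR.

```python
class InMemoryStorage:
    """Hypothetical stand-in for a storage backend."""
    def __init__(self):
        self.saved = []
        self.n_syncs = 0

    def save(self, obj):
        self.saved.append(obj)

    def sync(self):
        self.n_syncs += 1


class StorageHookSketch:
    """Save after every step; sync every ``save_frequency`` steps and at the end."""
    def __init__(self, storage, save_frequency):
        self.storage = storage
        self.save_frequency = save_frequency

    def after_step(self, step_number, results):
        self.storage.save(results)
        if step_number % self.save_frequency == 0:
            self.storage.sync()

    def after_simulation(self):
        self.storage.sync()


def run_simulation(n_steps, hooks):
    """Toy main loop: it contains no storage logic, only hook calls."""
    for step_number in range(1, n_steps + 1):
        results = {"step": step_number}  # placeholder for the real work
        for hook in hooks:
            hook.after_step(step_number, results)
    for hook in hooks:
        hook.after_simulation()


storage = InMemoryStorage()
run_simulation(n_steps=5, hooks=[StorageHookSketch(storage, save_frequency=2)])
```

The same hook object could then be attached to any simulation loop that calls `after_step` and `after_simulation`, which is where the de-duplication would come from.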

This builds off of #754, and should only be merged after that.

  • Add general support for hooks
  • Add tests for attaching hooks in general
  • Add hooks to replace parts of ShootFromSnapshotsSimulation
  • Test individual replacement hooks
  • Run Committor with hooks
  • Add docs about hooks

@dwhswenson dwhswenson self-assigned this Jan 30, 2018
@dwhswenson dwhswenson changed the title "Hooks" for ShootFromSnapshotsSimulation (i.e., committor) [WIP] "Hooks" for ShootFromSnapshotsSimulation (i.e., committor) Jan 30, 2018
@hejung
Copy link
Contributor

hejung commented May 18, 2018

I have a question concerning the after_step methods:
In your implementation of ShootFromSnapshotsSimulation it looks like it is expected that after_step returns hook_state, however the hooks implemented so far do not return anything on calling after_step?
Did I miss something important? Or should after_step return the input hook_state or possibly modify it and return the modified? Should hook_state be a dict?

PS: I have a working implementation of PathSampling with hooks @ https://github.com/hejung/openpathsampling/tree/pathsamp_hooks , I'll still have to test it extensively, I have however used it without problems in the past month.

@hejung
Copy link
Contributor

hejung commented May 18, 2018

Found it, it happens in the run_hooks in PathSimulator, but now it looks to me like every hook (before/after_step/simulation) can return a value, is there a reason to only collect the hook_state for after_step() ?

@dwhswenson
Copy link
Member Author

Some of the design of the hooks is based on the idea that eventually we'll use them as part of our parallelization schemes, which will use a task-based approach (probably using dask.distributed). This is still in development, but the idea was that anything that has side-effects needs to have an explicit output which is part of the input of a future step. For example, you could have an on-the-fly analysis hook, which could be done at the same time as the next simulation step, but not before the next analysis step. So the hook_state returns any internal state that depends on previous steps.

Hopefully clearer example: imagine the analysis (after_step) is a histogram, but that adding to the histogram takes a long time (so you do the analysis in parallel with the simulation). You can add step $n$ to the histogram while running step $n+1$, but you shouldn't add step $n+1$ to the histogram until you've completed addition of step $n$ to the histogram. Task-based programs can go a long way to figuring out these dependencies automatically, as long as the dependencies are explicit inputs/outputs of the simulation.
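A minimal sketch of that histogram example, using the standard library's `concurrent.futures` as a stand-in for a task scheduler such as Dask (an assumption made only to keep the example self-contained). The helper names `add_to_histogram` and `run` are hypothetical. Each analysis task takes the previous task's future as an explicit input, so the ordering is visible in the arguments rather than hidden in mutable state.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def add_to_histogram(previous, value):
    """Return a new histogram; block until the previous one is finished."""
    hist = Counter() if previous is None else Counter(previous.result())
    hist[value] += 1
    return hist


def run(values):
    hook_state = None  # future holding the histogram after the latest step
    with ThreadPoolExecutor(max_workers=2) as pool:
        for value in values:
            # The next simulation step could be submitted here as well;
            # the analysis of this step only waits on the previous analysis.
            hook_state = pool.submit(add_to_histogram, hook_state, value)
        return hook_state.result()


histogram = run([1, 2, 2, 3, 3, 3])
```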

As an aside, if you're interested in the broader ideas around task-based parallelization, I'd encourage you to apply to an upcoming E-CAM ESDW, which will focus on that topic: https://www.cecam.org/workshop-1650.html

@dwhswenson
Copy link
Member Author

@hejung : In order to (eventually) get this merged in, while I'm still a little uncertain about its API, I've moved this to the beta subpackage. See description of that idea in #813. This will involve changing your imports (I hope it doesn't cause any trouble beyond that!)

@dwhswenson dwhswenson added this to the 1.2 milestone Dec 7, 2019
@dwhswenson dwhswenson modified the milestones: 1.2, 1.3 Jan 23, 2020
@hejung hejung mentioned this pull request Mar 31, 2020
@hejung hejung mentioned this pull request May 4, 2020
@dwhswenson dwhswenson removed this from the 1.3 milestone Sep 26, 2020
@dwhswenson dwhswenson removed their assignment Jan 20, 2021
@dwhswenson dwhswenson changed the title [WIP] "Hooks" for ShootFromSnapshotsSimulation (i.e., committor) "Hooks" for ShootFromSnapshotsSimulation (i.e., committor) Jan 20, 2021
@dwhswenson
Copy link
Member Author

This is (FINALLY!) ready for review. @hejung, obviously you've already worked with the code, so I'd especially appreciate if you can check that the description in the docs is an accurate representation.

@hejung
Copy link
Contributor

hejung commented Jan 22, 2021

I will give it a thorough look this weekend. :)

@hejung
Copy link
Contributor

hejung commented Jan 24, 2021

LGTM!

The only thing I still have trouble wrapping my head around is the hook_state dictionary, the one that is returned by the after_step hooks. If I recall correctly and have not overlooked something now, every hook gets the dictionary containing the state of all hooks from the last step . The keys are the respective hook functions, e.g. HookClass.after_step and the values are the returns of the last call to those functions. (?)
If that is correct I think it would be good to document. If you are only coding a hook, it is probably a bit unexpected to get the state back in a different format than what you return in your own after_step method. Maybe we can even add how one can extract only the part of the state that is from the last call to one specific hook?
If I am not mistaken something like the bit below should do the trick:

class HookWithState(PathSimulatorHook):
    def after_step(self, sim, step_number, step_info, state, results,
                   hook_state):
        if hook_state is not None:
            own_state = hook_state[self.after_step]
        else:
            own_state = None
        # here we would do stuff and update own_state
        # to finally return it
        return own_state

Edit: Check for None in the code snippet as I realized that the hook_state is None if no hook has returned a state yet.
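To make the dictionary format concrete, here is a self-contained sketch of how such a state could be collected and read back. The `run_hooks` function below is a simplified imitation based on the lines quoted in this thread (results keyed by the hook callable, `None` results skipped), and `CountingHook` is hypothetical. It uses `.get` rather than indexing, so a hook that has not yet returned anything does not raise a `KeyError`.

```python
def run_hooks(hooks, **kwargs):
    """Call each hook; collect non-None results keyed by the hook callable."""
    hook_name_state = {}
    for hook in hooks:
        result = hook(**kwargs)
        if result is not None:
            hook_name_state[hook] = result
    return hook_name_state or None


class CountingHook:
    """Hypothetical hook that counts steps using only hook_state."""
    def after_step(self, step_number, hook_state):
        previous = None
        if hook_state is not None:
            # bound methods of the same instance compare and hash equal,
            # so self.after_step finds the entry stored by run_hooks
            previous = hook_state.get(self.after_step)
        return (previous or 0) + 1


hook = CountingHook()
hook_state = None
for step_number in range(3):
    hook_state = run_hooks([hook.after_step], step_number=step_number,
                           hook_state=hook_state)

own_state = hook_state[hook.after_step]
```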

Copy link
Member

@sroet sroet left a comment

The main idea here is great! I don't agree fully with some of the design choices/restrictions (thinking of what I might need for #989), but this looks great overall

One general point:
what is the difference between beta and experimental again (I have the feeling this has been explained before, but I can't remember)

So first a warning, I have not looked at #911 so I don't know if my concerns are caught/fixed in there. Also, this is done with:

If you can give an eye to #755 and #911 with a mental focus on "will this work if the hooks use Dask?"

I do not have a lot of recent experience with Dask but I assumed this is going to either use delayed or futures.

The two main points:

  • run_hooks should have the option to carry its state around.
  • all hook functions should accept **kwargs for them to be mixed with hooks that might require more/different kwargs

A couple assumptions:

My comments in run_hooks assume that hook_name_state will be a dict with {hookfunc_object: future} and that dask is smart enough that it does not require the whole dictionary before calculating a hook that does not need it. If the second assumption is not true, it can be easily fixed by popping from the dict and only feeding a hook its own return value.

Other **kwargs than the default: In the future, some hooks might need a client to fire_and_forget, while others (classical ones) don't. Another option to solve this issue is to do some fancy inspection, but that is more work with little in extra return, I guess.
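A small sketch of why the `**kwargs` point matters. The two hook classes, `call_all`, and the `client` keyword are hypothetical; the example only shows standard Python call semantics: a method without `**kwargs` raises `TypeError` when it is handed a keyword it does not know.

```python
class StrictHook:
    def before_simulation(self, sim):
        return "strict saw sim"


class TolerantHook:
    def before_simulation(self, sim, **kwargs):
        # extra keyword arguments (e.g. a client) are accepted and ignored
        return "tolerant saw sim"


def call_all(hooks, **kwargs):
    """Call every hook with the same keywords, recording failures."""
    outcomes = []
    for hook in hooks:
        try:
            outcomes.append(hook.before_simulation(**kwargs))
        except TypeError:
            outcomes.append("TypeError")
    return outcomes


outcomes = call_all([StrictHook(), TolerantHook()],
                    sim=object(), client="fake-client")
```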

I now made a couple comments that are hook_state = self.run_hooks..., if you don't agree with that, at least catch the (possible) return with an underscore _ = self.run

self.output_stream.write("\n")
n_snapshots = len(self.initial_snapshots)
hook_state = None
self.run_hooks('before_simulation', sim=self)
Copy link
Member

hook_states should be carried around

Suggested change
self.run_hooks('before_simulation', sim=self)
hook_state = self.run_hooks('before_simulation', sim=self, hook_name_state=hook_state)


def run_hooks(self, hook_name, **kwargs):
"""Run the hooks for the given ``hook_name``"""
hook_name_state = {}
Copy link
Member

I think it might be good to (have the option) keep a constant state here, so change this into

Suggested change
hook_name_state = {}
hook_name_state = kwargs.get('hook_name_state') or {}

Comment on lines +138 to +139
if hook_name_state != {}:
return hook_name_state
Copy link
Member

this should always return, as each of the hook states should be stored with hook.func as key, so these should never update hook keys that they shouldn't. (It also helps if one hook's state needs the result from another hook, such as before_step and after_step for step timings)

Suggested change
if hook_name_state != {}:
return hook_name_state
return hook_name_state

Comment on lines 136 to 137
if result is not None:
hook_name_state.update({hook: result})
Copy link
Member

this should always update, None is a decent return as well

Suggested change
if result is not None:
hook_name_state.update({hook: result})
hook_name_state.update({hook: result})
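Taken together, the three suggestions on run_hooks (carry the state in, always update, always return) would give behavior roughly like the following. This is a standalone sketch of the reviewer's proposal, not the code that was merged: the free functions `before_step` and `after_step` are hypothetical, and note that the carried dictionary is mutated in place.

```python
def run_hooks(hooks, **kwargs):
    """Variant with carried state, unconditional update and return."""
    hook_name_state = kwargs.get('hook_name_state') or {}
    for hook in hooks:
        result = hook(**kwargs)
        hook_name_state.update({hook: result})  # None is stored as well
    return hook_name_state


def before_step(step_number, **kwargs):
    return f"started {step_number}"


def after_step(step_number, hook_name_state=None, **kwargs):
    # the carried state lets this hook see what before_step returned
    started = (hook_name_state or {}).get(before_step)
    return f"finished {step_number} (before_step said: {started})"


state = run_hooks([before_step], step_number=1, hook_name_state=None)
state = run_hooks([after_step], step_number=1, hook_name_state=state)
```

Because `hook_name_state` is forwarded to every hook as a keyword, this variant only works if hooks accept `**kwargs`, which ties the two main review points together.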

implemented_for = ['before_simulation', 'before_step', 'after_step',
'after_simulation']

def before_simulation(self, sim):
Copy link
Member

if another hook implements more kwargs here it should not error out

Suggested change
def before_simulation(self, sim):
def before_simulation(self, sim, **kwargs):

self.output_stream = output_stream
self.allow_refresh = allow_refresh

def before_simulation(self, sim):
Copy link
Member

Suggested change
def before_simulation(self, sim):
def before_simulation(self, sim, **kwargs):

if self.allow_refresh is None:
self.allow_refresh = sim.allow_refresh

def before_step(self, sim, step_number, step_info, state):
Copy link
Member

Suggested change
def before_step(self, sim, step_number, step_info, state):
def before_step(self, sim, step_number, step_info, state, **kwargs):

Comment on lines +137 to +139
self.run_hooks('before_step', sim=self,
step_number=step_number, step_info=step_info,
state=start_snap)
Copy link
Member

@sroet sroet Mar 3, 2021

hook_states should be carried around

Suggested change
self.run_hooks('before_step', sim=self,
step_number=step_number, step_info=step_info,
state=start_snap)
hook_state = self.run_hooks('before_step', sim=self,
step_number=step_number,
step_info=step_info,
state=start_snap,
hook_name_state=hook_state)

Comment on lines +170 to +171
hook_state=hook_state
)
Copy link
Member

Suggested change
hook_state=hook_state
)
hook_state=hook_state,
hook_name_state = hook_state
)

snap_num += 1

self.run_hooks('after_simulation', sim=self)
Copy link
Member

Suggested change
self.run_hooks('after_simulation', sim=self)
hook_state = self.run_hooks('after_simulation', sim=self, hook_name_state=hook_state)

@dwhswenson
Copy link
Member Author

Thanks a lot for taking a look at this one; it definitely helps to have another pair of eyes on it, especially with trying to plan this to be compatible with Dask.

I'm going to take some time to think about the bigger points, but I'll answer a couple easy things now:

The main idea here is great! I don't agree fully with some of the design choices/restrictions (thinking of what I might need for #989), but this looks great overall

I'm curious why this would make it harder? I think that in #911 @hejung made it so that all the timing info is within the hook that reports that info. In principle, you could even use a totally different hook for estimating time remaining -- say you just wanted to use something like tqdm instead of our more verbose output. You could create a TqdmProgressHook that does that, and then not use the default hook at all.

(I mean, it does mean that the work in #989 needs to be redone in this context, but the hope is that it's easier to do this way than it was the other way!)
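As an illustration of swapping in a different progress reporter, here is a sketch of a very small progress hook that writes to a stream. It deliberately avoids depending on tqdm. `MinimalProgressHook` is hypothetical, and treating `step_info` as a `(current, total)` pair is an assumption made for this example only.

```python
import io


class MinimalProgressHook:
    """Hypothetical terse replacement for a verbose output hook."""
    def __init__(self, output_stream):
        self.output_stream = output_stream

    def before_step(self, sim, step_number, step_info, state, **kwargs):
        current, total = step_info  # assumed layout, see note above
        self.output_stream.write(f"\rstep {current}/{total}")

    def after_simulation(self, sim, **kwargs):
        self.output_stream.write("\ndone\n")


stream = io.StringIO()
hook = MinimalProgressHook(stream)
for i in range(1, 4):
    hook.before_step(sim=None, step_number=i, step_info=(i, 3), state=None)
hook.after_simulation(sim=None)
output = stream.getvalue()
```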

what is the difference between beta and experimental again (I have the feeling this has been explained before, but I can't remember)

The fundamental difference is that beta is included in the standard CI test suite (so (1) it gets tested nightly and against each PR; and (2) it counts toward coverage), while experimental is not. Everything else comes from that:

  • beta modules must have enough testing to increase coverage (ideally 100% diff coverage). experimental does not need to be as well-tested.
  • beta APIs should be nearly stable (part of the reason I've been taking time on this -- @hejung is the only person to really thoroughly test it so far). experimental may have much more significant internal API changes after being part of a release.

The premise was that experimental would be a place to play with large changes in smaller PRs, while still keeping major feature development on the main branch. (Like merging much of storage without even including storable functions! Or OpenMM support!) Mature experimental modules would be moved to beta.

@sroet
Copy link
Member

sroet commented Mar 5, 2021

I'm curious why this would make it harder? I think that in #911 @hejung made it so that all the timing info is within the hook that reports that info.

main point if you want to use hooks with dask: self (or any data stored in a mutable way, like attributes) is not a valid way to reliably transfer data between dask functions, see the following examples:

from dask.distributed import Client
client = Client()

class A():
    def __init__(self, a):
        self.a = a
    
    def set_a(self, b):
        self.a = b
        #return self
        
    def get_a(self):
        return self.a
    
a = A(12)

b = client.submit(a.set_a, 13)
c = client.submit(a.get_a)
c.result()

this returns 12, instead of 13 while the following easy adaptation:

class A():
    def __init__(self, a):
        self.a = a
    
    def set_a(self, b):
        self.a = b
        return self
        
    def get_a(self):
        return self.a
    
a = A(12)

a = client.submit(a.set_a, 13)
c = client.submit(a.get_a)
c.result()

errors out with

AttributeError                            Traceback (most recent call last)
<ipython-input-15-19e9324dce45> in <module>
     13 
     14 a = client.submit(a.set_a, 13)
---> 15 c = client.submit(a.get_a)
     16 c.result()

AttributeError: 'Future' object has no attribute 'get_a'

Via delayed there is technically a way of getting this working, but it is not too friendly.
This does not work (returns 12 instead of 13):

import dask
class A():
    def __init__(self, a):
        self.a = a
    
    @dask.delayed
    def set_a(self, b):
        self.a = b
        #return self
    
    @dask.delayed
    def get_a(self):
        return self.a
    
a = A(12)
_ = a.set_a(13)
c = a.get_a()
c.compute()

but this does (but requires a double compute call):

import dask
class A():
    def __init__(self, a):
        self.a = a
    
    @dask.delayed
    def set_a(self, b):
        self.a = b
        return self
    
    @dask.delayed
    def get_a(self):
        return self.a
    
a = A(12)
a = a.set_a(13)
c = a.get_a()
# Needs 2 computes
c.compute().compute()
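One way to picture the behavior in the first example, without needing a Dask cluster: if the worker operates on a copy of the object, a mutation made there never reaches the caller's instance unless the modified object is returned. The sketch below uses `copy.deepcopy` as a stand-in for shipping the object to a worker. That is an assumption used for illustration, not a statement about Dask internals.

```python
import copy


class A:
    def __init__(self, a):
        self.a = a

    def set_a(self, b):
        self.a = b
        return self

    def get_a(self):
        return self.a


a = A(12)
# stand-in for sending the object to a worker: operate on a copy
remote_copy = copy.deepcopy(a)
returned = remote_copy.set_a(13)

local_value = a.get_a()            # the caller's object is unchanged
returned_value = returned.get_a()  # only the returned object carries the update
```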

@hejung
Copy link
Contributor

hejung commented Mar 5, 2021

I agree with @sroet that the hook state is confusing (I think I already wrote that somewhere).
If I recall correctly the assumption was that only the after_step hooks would have a state because there it could be important that they are done after each other by dask? But I agree that it could be useful to have the option for all run_hooks calls. Would we then need to think about separating the state for the different calls, i.e. should we keep the states of after_step, before_step etc separate?

@sroet : Concerning the dask example: I am not sure I get the issue. How would dask know that you want to use the result of the first operation on the class if you do not pass the result of the first call to the second one (as in the last example that works)? But I also have not thought about dask for a while...

@sroet
Copy link
Member

sroet commented Mar 5, 2021

@sroet : Concerning the dask example: I am not sure I get the issue. How would dask know that you want to use the result of the first operation on the class if you do not pass the result of the first call to the second one (as in the last example that works)? But I also have not thought about dask for a while...

It doesn't, which was the point I was trying to make.

think that in #911 @hejung made it so that all the timing info is within the hook that reports that info.

So the info is stored inside the hook (as an attribute to self), which is not a valid way of transferring data between functions (in this example two separate functions, but it might as well have been two consecutive calls to the same function that update some internal attribute)

Now it might help to share the 'dummy' hook that I think about for most of these discussions:
say I want a hook that calculates the time an MC step took without any of the other overhead so it would:

  1. call the before_step hook to set a time and then
  2. in the after_step hook we would set another time and then calculate the difference.
    (this would not really work in a parallel setting, but is a reasonable model system where we want to transfer data from one function to another)

The point that I was trying to make in my review and the second comment is that this data needs to be returned by the before_step hook and then used as an arg or kwarg in the call to the after_step function without relying on mutable/internal attributes (like when using self).
(The current implementation/limitation relies on the Hooks carrying an internal state which would not be easily adaptable when trying to combine this with dask)
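A sketch of the 'dummy' timing hook written in that style: `before_step` returns its data and `after_step` receives it as a keyword argument, so nothing that changes per step lives on `self`. The class name, the `before_state` keyword, and the injected `clock` are all hypothetical. A fake clock is used so the example is deterministic.

```python
import itertools


class StatelessTimerHook:
    """Hypothetical hook; keeps no per-step data on ``self``."""
    def __init__(self, clock):
        self.clock = clock  # constant configuration, not per-step state

    def before_step(self, step_number, **kwargs):
        return {"start": self.clock()}

    def after_step(self, step_number, before_state, **kwargs):
        return self.clock() - before_state["start"]


fake_clock = itertools.count(start=10, step=5).__next__  # 10, 15, 20, ...
hook = StatelessTimerHook(fake_clock)

before_state = hook.before_step(step_number=1)
elapsed = hook.after_step(step_number=1, before_state=before_state)
```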

@dwhswenson
Copy link
Member Author

main point if you want to use hooks with dask: self (or any data stored in a mutable way, like attributes) is not a valid way to reliably transfer data between dask functions

First, note that the client.submit will have to come from within the hook, so you would see something more like

def after_step(self, *args, **kwargs):
    # hook_state carries this hook's result (possibly a future) from the previous step
    hook_state = kwargs['hook_state']
    a = self.client.submit(self.calculate_a, hook_state)
    hook_state = self.client.submit(self.do_something_with, a, self.const)
    return hook_state

Running the hook itself will stay in the local process, but the hook can certainly launch things with Dask (note: we'll probably actually be launching with a wrapper around Dask, not actual distributed.client.)

As to the specific case, this is the purpose of hook_state. Instead of keeping state that changes from step to step in the object's attributes, keep state in the hook_state for that hook. The hook itself should have the same kind of pseudo-immutability of all OPS simulation objects. Plus, hook_state is intended to explicitly order when each run of after_step gets evaluated. The basic picture I had for the task graph is something like this:

[image: task graph]

The after_step hook cannot begin until the task is completed, because it depends on the state and results from the simulation (in path sampling, that would be a SampleSet representing active, and the MCStep object that was created). It also cannot begin until the previous after_step was completed, because hook_state is used to impose ordering.
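The dependencies described here can be written down as a small graph. The sketch below is inferred from the text rather than from the original figure: it assumes each simulation step depends on the previous step, and the task names and `build_task_graph` helper are hypothetical.

```python
def build_task_graph(n_steps):
    """Map each task name to the names of the tasks it depends on."""
    graph = {}
    for n in range(1, n_steps + 1):
        # assumed: each simulation step needs the previous step's output
        graph[f"step_{n}"] = [f"step_{n - 1}"] if n > 1 else []
        # after_step n needs the result of step n ...
        hook_deps = [f"step_{n}"]
        # ... and the hook_state returned by after_step n-1
        if n > 1:
            hook_deps.append(f"after_step_{n - 1}")
        graph[f"after_step_{n}"] = hook_deps
    return graph


graph = build_task_graph(3)
```

In this picture no simulation step ever waits on a hook, which is what lets a slow analysis run alongside the following step.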

I can tell that not having everything return hook_state is going to confuse people (if it confuses the two of you!) One of the reasons for that was to simplify the task graph above. Plus, the main idea of hook_state is that, in addition to imposing ordering, it can record the hook's response to the most recent step. That is only needed to be returned in the part of the hook that runs immediately after the step -- anywhere else it is redundant.

There's certainly no reason for after_simulation to return it -- you can't use anything returned there. (However, thanks to thoughts inspired by @sroet's review, I am going to make it so after_simulation takes hook_state as input!) before_step is intended to be rarely used: It only exists because you might want to present information to the user before the first step starts; anything it could do with hook_state could just as easily be done in after_step. That does leave before_simulation, although I think that's mostly as an initialization step, so that the hook can get information from the simulation. That information is constant over the simulation, and I can't think of a case where it needs to be set in the hook_state.

I'm definitely concerned about how allowing more stages (especially before_step) to return hook_state might affect the task graph. Although I see how this is causing confusion, the intent was really to keep it simple so that people would be less likely to create something that is blocking. In general, though, my thought is that dask-based tasks would only be done in after_step (and, now I realize, after_simulation for .result() etc.).

@dwhswenson
Copy link
Member Author

say I want a hook that calculates the time an MC step took without any of the other overhead so it would:

Aha... thanks for the clear example case.

So here's the thing about this example: it relies on a hidden variable that is dependent on the environment (i.e., the time). This is part of why it doesn't actually make sense in parallel (we know nothing about the machines each task will be executed on -- even aside from questions of execution timing, they may not have synchronized clocks!)

I suspect this is always the case for any example where you need to pass information from before_step to after_step. All the "deterministic" information available at before_step is also given to after_step.

@sroet
Copy link
Member

sroet commented Mar 5, 2021

Yeah, you are correct in that it is not a good example to be done in parallel

Running the hook itself will stay in the local process, but the hook can certainly launch things with Dask (note: we'll probably actually be launching with a wrapper around Dask, not actual distributed.client.)

Thanks for the more detailed idea here. Would it then still not matter that the data is shuttled around in hook_state? As the hooks themselves would have to make sure to grab only their own blocking data from the hook state dictionary?

Say if you have 2 independent after_step hooks: after_step1 (fast, output to user) and after_step2 (slow, syncing to disc), you would not want after_step1 to be dependent on after_step2 being done, right?

before_step is intended to be rarely used: It only exists because you might want to present information to the user before the first step starts; anything it could do with hook_state could just as easily be done in after_step.

I do understand that before_step should be identical to after_step after the first step, but wouldn't it then be wasteful to even be called after the first step? (So better to be renamed as before_first_step hook and only be called before the first step?)

@dwhswenson
Copy link
Member Author

I made a few small changes to the API here based on some comments from @sroet and @hejung:

  1. before_simulation may take arbitrary kwargs which will depend on the simulation type (anything passed as arguments to the run method should also be available to the hook). Obviously, not all hooks will need to make use of that information, so accepting and ignoring kwargs is also completely reasonable behavior. Also, any hook that requires a specific kwarg will (of course) be limited to simulation types that implement that kwarg.

  2. after_simulation takes the hook_state as well.
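A sketch of what a hook might look like under these two changes. `SummaryHook`, the `n_steps` keyword, and the hand-written driver loop are hypothetical. The hook_state dictionary is built by hand here in the format discussed earlier in this thread (keyed by the hook's bound `after_step` method).

```python
class SummaryHook:
    """Hypothetical hook using both API changes."""
    def __init__(self, report):
        self.report = report

    def before_simulation(self, sim, **kwargs):
        # run-method arguments arrive as keywords; unknown ones are ignored
        self.n_steps = kwargs.get("n_steps")

    def after_step(self, sim, step_number, step_info, state, results,
                   hook_state):
        return step_number  # becomes this hook's entry in hook_state

    def after_simulation(self, sim, hook_state):
        last = None if hook_state is None else hook_state.get(self.after_step)
        self.report.append(f"finished step {last} of {self.n_steps}")


report = []
hook = SummaryHook(report)
hook.before_simulation(sim=None, n_steps=2, unrelated_option=True)
hook_state = None
for step_number in (1, 2):
    result = hook.after_step(None, step_number, None, None, None, hook_state)
    hook_state = {hook.after_step: result}
hook.after_simulation(sim=None, hook_state=hook_state)
```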

For now, I'm going to stick to before_step not being involved in hook_state. This is an API change we can make in the future. I really just want to get this in, get OPS 1.5 released, and then be able to put energy into mixing the ideas here with the parallelization in #1001. Once we have a better sense of how that works, I'll be very happy to entertain the possibility of making that dependency on hook state more complicated.

@hejung and @sroet, please take another look and let me know if there's anything that really needs to be addressed before a beta version of this enters a release. @hejung : Your work in #911 and after will probably require changes based on the API changes listed above (but you can see from d59b306 that it isn't much work to change those).

@sroet
Copy link
Member

sroet commented Jun 21, 2021

please take another look and let me know if there's anything that really needs to be addressed before a beta version of this enters a release.

I don't see anything particularly pressing.

@dwhswenson
Copy link
Member Author

I'll leave this open for another 48 hours to provide time for any other comments (particularly from @hejung?), although I'm looking forward to getting this into the code. I will merge this no earlier than Fri 25 Jun 18:00 GMT (14:00 my local).

@hejung
Copy link
Contributor

hejung commented Jun 25, 2021

Sorry for the late reply!
The changes look very reasonable to me.
I think it makes sense that before_step does not pass around the hook state considering that we are so far only using it to print the Working on step[...] output. And, as you noted, after_step and before_step are the same most of the time anyway, i.e. anything requiring the state could just be done in after_step (I think).
It also means the output could probably be moved to after_step too (would need to take a bit of care for the last step). However, if all calls to dask come from within the hook methods then doing the timing in the local process (with variables attached to self) will just continue to work(?)

@dwhswenson
Copy link
Member Author

Thanks @hejung! I expect that you'll need to update your PRs to reflect changes here once this is merged (merging it now). Then I'll try to review those so that they can be included in the 1.5 release!

It also means the output could probably be moved to after_step too (would need to take a bit of care for the last step).

Also would require changes to the before_simulation in order to output "working on step 1" -- which is why I added the before_step hook stage. It's entirely so that progress indicators can be simpler. (Also, one could imagine a simulation type with a significant startup time before actually starting step 1: in that case, using before_simulation for this might give an unreasonably long time for the first step).

However, if all calls to dask come from within the hook methods then doing the timing in the local process (with variables attached to self) will just continue to work(?)

I'm not entirely sure what you mean here -- for the specific issue of timing, a few things:

  1. The standard procedure already calculates timing of a simulation step (at least for path sampling; if this is not true for calculations based on ShootFromSnapshots, it should be!) This would/should be done as part of the main task of the simulation, which would also be run on a remote machine if using Dask. So the timing returned would be the timing of the process on the remote machine. Since this should be built-in, a hook to calculate timing is not necessary. But I think @sroet's discussion on that point was more about providing a concrete example, as opposed to specifically implementing a timing hook.

  2. In the context of a hook to calculate timing (or something similar), the issue is that "time" is a shared state that wouldn't necessarily be the same on different machines. This could be avoided by using an external resource to manage the shared state (e.g., if the time was requested from an online server instead).

  3. If a hook wanted to calculate its own timing, then the best practice would probably be to wrap each stage of the hook with a timing calculation, and then to add those results up. If that hook sends its work to a remote machine via Dask, the timing calculation should be inside the remote task. You could do a timing that calculates the main thread time (as opposed to remote) in a simulation that is fully using Dask, but that probably would not be very interesting, since you'd just be measuring how long it takes to register tasks with Dask.
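The "wrap each stage and add the results up" idea from point 3 could look something like this. It is a local-process sketch only: the `timed` decorator and the toy stage functions are hypothetical, and for remote work the decorated function would be the one submitted as the task.

```python
import functools
import time


def timed(stage_times):
    """Decorator factory: add each call's duration to ``stage_times[name]``."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                elapsed = time.perf_counter() - start
                name = func.__name__
                stage_times[name] = stage_times.get(name, 0.0) + elapsed
        return wrapper
    return decorator


stage_times = {}


@timed(stage_times)
def before_step(step_number):
    return None


@timed(stage_times)
def after_step(step_number):
    return sum(range(1000))  # placeholder for real hook work


for step_number in range(3):
    before_step(step_number)
    after_step(step_number)

total_hook_time = sum(stage_times.values())
```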

In any case, if anyone else has interest in mixing this with Dask, please do play with it and let me know of any issues! After this is merged, I'll do work on that in #1001.

@dwhswenson dwhswenson merged commit b6da5dd into openpathsampling:master Jun 29, 2021
@dwhswenson dwhswenson deleted the committor_hooks branch June 29, 2021 11:39
@hejung
Copy link
Contributor

hejung commented Jun 29, 2021

Sorry for being unclear about the timing I meant.
You are right that the time per step (which is saved with the MCStep) is calculated as part of the main simulation for PathSampling as well.
The other timing that we do is for the output hook (which for the PathSampling prints the expected time to finish). This is how I did this so far and also why I was asking. If I am not mistaken here the timing output would only be correct if the PathSamplingOutputHook runs in the same (dask) process as the main simulation.
I think I can however also use the timing info that is saved/passed with the MCStep and then would not need to keep the time in the hook (but the estimate for the ETA would become worse as the time the hooks take to run is not taken into account anymore).
PS: I will try to get my PR up-to-date later today, but could become tomorrow morning.

@dwhswenson
Copy link
Member Author

The other timing that we do is for the output hook (which for the PathSampling prints the expected time to finish).

The challenge is that this kind of output doesn't actually make sense when running in parallel. This was one of the main reasons I started on these hooks -- if the main task is run by Dask, then the timing we were using would report that the entire simulation was completed as soon as the tasks had been scheduled! (So, in a fraction of a second, regardless of how large the simulation.) The original goal of this PR was to separate progress reporting from the actual loop for running the simulation.

The default hooks to use for progress reporting will depend on the "scheduler" being used. The current behavior will be wrapped by a SerialScheduler; current hooks will work with that (and that will be the default). Dask will have a DaskScheduler. I expect I'll probably do something like have the scheduler include a default_hooks() method so the path sampler can ask what the default behaviors should be. And this is all assuming I can find a way to make all of this work with one main class whether serial or parallel -- this was the intent, but existing parallel implementations used specialized PathSimulator subclasses.
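Purely as a sketch of that `default_hooks()` idea, which is speculative at this point in the thread: each scheduler reports which hooks make sense for it, and the simulator combines those with anything the user asks for explicitly. The class names with the `Sketch` suffix, the string hook names, and `hooks_for` are hypothetical placeholders.

```python
class SerialSchedulerSketch:
    def default_hooks(self):
        # serial runs keep the current behavior, including progress output
        return ["storage", "verbose_progress"]


class ParallelSchedulerSketch:
    def default_hooks(self):
        # step-by-step progress is left out: it would only time task submission
        return ["storage"]


def hooks_for(scheduler, user_hooks=()):
    """Scheduler defaults first, then anything the user explicitly requested."""
    return list(scheduler.default_hooks()) + list(user_hooks)


serial_hooks = hooks_for(SerialSchedulerSketch())
parallel_hooks = hooks_for(ParallelSchedulerSketch())
explicit_hooks = hooks_for(ParallelSchedulerSketch(),
                           user_hooks=["verbose_progress"])
```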

TLDR: what you have there is fine! 😊 The end goal is that the user would have to specifically request those hooks if running a parallel simulation (and yes, the output would be nonsense, but if the user explicitly asks for nonsense....)

PS: I will try to get my PR up-to-date later today, but could become tomorrow morning.

Thanks! No huge rush, but as soon as we finish those I'll start working on the 1.5.0 release (which will require a little care because very old PRs, like this one and like the spring shooting PR, are sometimes overlooked by my tool to automate writing release notes).

@dwhswenson dwhswenson mentioned this pull request Jul 5, 2021