Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of temperature REMD #64

Merged
merged 35 commits into from
Aug 5, 2022
Merged

Conversation

JaydevSR
Copy link
Contributor

Includes:

  • Types ReplicaWrapper and ReplicaSystem for handling replica exchange simulations. ReplicaSystem is equivalent to System but containing multiple replicas at different temperatures. ReplicaWrapper is a wrapper for a particular replica.

@JaydevSR JaydevSR marked this pull request as ready for review June 14, 2022 17:17
@JaydevSR JaydevSR marked this pull request as draft June 14, 2022 17:17
@Ruibin-Liu
Copy link
Contributor

The current interface seems not friendly to implement other types of replica exchange schemes like pH replica exchange and general Hamiltonian replica exchange simulations.

src/types.jl Outdated Show resolved Hide resolved
@JaydevSR JaydevSR changed the title Implementation of temperature REMD WIP: Implementation of temperature REMD Jun 15, 2022
Copy link
Collaborator

@jgreener64 jgreener64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I agree that there should be a system type to hold the replicas. Since the parametric types are generic it could be made more general by just renaming temp to data or info.

However I think the temperature data should be stored at the level of the simulator rather than at the level of the system, in line with the other thermostats. In this case that would mean removing references to temperature in these two structs and having a TemperatureREMD simulator or similar, which would contain temperature data in its struct along with the thermostat to use. What do you think?

src/types.jl Outdated Show resolved Hide resolved
src/types.jl Outdated Show resolved Hide resolved
src/types.jl Outdated Show resolved Hide resolved
src/types.jl Outdated Show resolved Hide resolved
src/types.jl Outdated Show resolved Hide resolved
@JaydevSR
Copy link
Contributor Author

JaydevSR commented Jun 15, 2022

However I think the temperature data should be stored at the level of the simulator rather than at the level of the system, in line with the other thermostats. In this case that would mean removing references to temperature in these two structs and having a TemperatureREMD simulator or similar, which would contain temperature data in its struct along with the thermostat to use. What do you think?

I was thinking about this as well as we will need different "sub" simulators for each replica with different coupling temperature for temp-REMD, so this will be better. We still need to maintain common indices between the simulator replicas and the system replicas so that there is no ambiguity. I will look into simplifying this.

@JaydevSR
Copy link
Contributor Author

JaydevSR commented Jun 20, 2022

I have added the supporting functions to the ReplicaSystem type in addition to the suggested edits. Many function for System and ReplicaSystem are identical. Should I use a union type for that to simplify the code? Next I am also working of a simulator type TemperatureREMD and a method of simulate! for that.

@codecov
Copy link

codecov bot commented Jun 20, 2022

Codecov Report

Merging #64 (b82d1c7) into master (7cf63d6) will decrease coverage by 0.28%.
The diff coverage is 81.75%.

@@            Coverage Diff             @@
##           master      #64      +/-   ##
==========================================
- Coverage   85.26%   84.98%   -0.29%     
==========================================
  Files          31       31              
  Lines        3190     3323     +133     
==========================================
+ Hits         2720     2824     +104     
- Misses        470      499      +29     
Impacted Files Coverage Δ
src/types.jl 76.83% <73.68%> (-3.87%) ⬇️
src/simulators.jl 96.49% <88.70%> (-2.91%) ⬇️
src/loggers.jl 95.97% <100.00%> (+0.28%) ⬆️
src/chain_rules.jl 46.80% <0.00%> (-5.32%) ⬇️
src/zygote.jl 77.84% <0.00%> (+0.29%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@jgreener64
Copy link
Collaborator

Good changes. Yes you can use a union type, possibly we could introduce an abstract type later.

I think the AtomsBase interface requires that the index given to position and velocity is the atom index, so the replica index plus atom index used here would be confusing. Possibly use the first replica for now in these functions?

@JaydevSR
Copy link
Contributor Author

JaydevSR commented Jun 23, 2022

For making the simulation, I first define a simulation type:

struct TemperatureREMD{N, T, S, SD}
    temps::StaticVector{N, T}
    indices::StaticVector{N, Int}
    simulators::SD
end

A constructor is defined separately for this type that I am omitting here right now. Then a new methods for the simulate! function:

function simulate!(;
    sys::ReplicaSystem,
    sim::TemperatureREMD,
    n_cycles::Int,
    cycle_length::Int;
    parallel=true
    )

    if sys.n_replicas != length(sim.indices)
        error("Number of replicas in ReplicaSystem and simulators in TemperatureREMD must match.")
    end
    for cycle=1:n_cycles
        for idx in sim.indices
            simulate!(sys.replicas[idx], sim.simulators[idx], cycle_length; parallel=parallel)  #flag1
        end

        if cycle != n_cycles
            n = rand(1:sys.n_replicas)
            m = mod(n, sys.n_replicas) + 1
            β_n, β_m = 1/sim.temps[n], 1/sim.temps[m]
            V_n, V_m = potential_energy(sys.replicas[n]), potential_energy(sys.replicas[m])  #flag2
            Δ = (β_m - β_n)*(V_n - V_m)
            if Δ <= 0 || rand() < exp(-Δ)
                # exchange coordinates
                sys.replicas[n].coords, sys.replicas[m].coords = sys.replicas[m].coords, sys.replicas[n].coords
                # scale velocities
                sys.replicas[n].velocities *= sqrt(β_n/β_m)
                sys.replicas[m].velocities *= sqrt(β_m/β_n)
            end
        end
    end
end

Note: Right now the replica simulation is not in parallel.

The main issue here is that sys.replicas[i] has type IndividualReplica for which many of other function will need new methods, but if it had the type System then all of this will work without any problem and we don't need to make any changes to other functions (#flag1 and #flag2) or define new methods. But if we make all the replicas their own systems, then there is issue of redundant information in all of them. Any suggestions?

Reference for T-REMD algorithm that I am using: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6484850/

@jgreener64
Copy link
Collaborator

The simulator outline looks good. In order to follow the simulate!(sys, sim, n_steps) convention it might be better to store an exchange_time in TemperatureREMD and calculate n_cycles and cycle_length in simulate!.

You raise an interesting point about types. I think the most flexible approach might just be to have multiple Systems. This opens up the possibility of setting up the systems differently too, which some people might find use for. Longer term each replica would need a different box_size anyway (barostats change the box size) and I think each replica should have its own neighbour finder since CellListMapNeighborFinder stores some neighbour info in the struct.

That leaves the atoms, atoms_data and interaction fields as duplicated, but since arrays are stored by reference each System can just point to the same memory for this. You could still have ReplicaSystem which just holds a list of Systems and does relevant setup in the constructor.

How does that sound?

@jgreener64 jgreener64 closed this Jun 23, 2022
@jgreener64 jgreener64 reopened this Jun 23, 2022
@jgreener64
Copy link
Collaborator

(Accidentally clicked close sorry)

The simulator outline looks good. In order to follow the simulate!(sys, sim, n_steps) convention it might be better to store an exchange_time in TemperatureREMD and calculate n_cycles and cycle_length in simulate!.

You raise an interesting point about types. I think the most flexible approach might just be to have multiple Systems. This opens up the possibility of setting up the systems differently too, which some people might find use for. Longer term each replica would need a different box_size anyway (barostats change the box size) and I think each replica should have its own neighbour finder since CellListMapNeighborFinder stores some neighbour info in the struct.

That leaves the atoms, atoms_data and interaction fields as duplicated, but since arrays are stored by reference each System can just point to the same memory for this. You could still have ReplicaSystem which just holds a list of Systems and does relevant setup in the constructor.

How does that sound?

@JaydevSR
Copy link
Contributor Author

For the constructor, in order to support multiple sub simulators and couplings, I am having to write multiple checks:

function TemperatureREMD(;
    simulator, exchange_time, temps, dt, 
    kwargs...)
    S = simulator
    T = eltype(temps)
    N = length(temps)
    ET = typeof(exchange_time)
    temps = SA[temps...]
    indices = SA[collect(1:length(temps))...]

    coupling_required = false
    if simulator  [VelocityVerlet, Verlet, StormerVerlet]
        coupling_required = true
        coupling = kwargs[:coupling]
    end

    if coupling_required && coupling  [AndersenThermostat, BerendsenThermostat]
        couplings = [coupling(temps[i], kwargs[:coupling_const]) for i in indices]
    elseif coupling_required && coupling  [RescaleThermostat]
        couplings = [coupling(temps[i]) for i in indices]
    elseif coupling_required
        error("Temperature coupling required for TemperatureREMD using $(simulator) simulator.")
    end

    if simulator==Langevin
        simulators = Dict(
            [i, simulator(dt=dt, temperature=temps[i], friction=kwargs[:friction],
            remove_CM_motion=kwargs[:remove_CM_motion])]
            for i in indices
        )
    elseif simulator  [VelocityVerlet, Verlet]
        simulators = Dict(
            [i, simulator(dt=dt, temperature=temps[i], coupling=couplings[i],
            remove_CM_motion=kwargs[:remove_CM_motion])]
            for i in indices
        )
    elseif simulator == StormerVerlet
        simulators = Dict(
            [i, simulator(dt=dt, temperature=temps[i], coupling=couplings[i])]
            for i in indices
        )
    else
        error("Simulator type $(simulator) not supported.")
    end
    SD = typeof(simulators)

    return TemperatureREMD{N, T, S, SD, ET}(temps, indices, simulators, exchange_time)
end

But when new simulators are added, this function will also need to be changed. This does not seem good.

@jgreener64
Copy link
Collaborator

I agree it doesn't scale well. This could just be left to the user for now, i.e. the constructor takes a list of simulators as an argument and checks that it is the same length as the list of temperatures. The docs would make clear that the simulators should be set up with the appropriate coupling/temperatures.

@JaydevSR
Copy link
Contributor Author

I was looking a bit into how to parallelize the whole replica simulations and would like to make a few points about that:

  1. Right now the parallel=true versions of other functions use Threads.@threads macro, which is statically scheduled. So, If I use Threads.@threads for parallelizing replica simulation, then only one thread will be available for those functions.
  2. My suggestion is we use the @sync and @spawn for all the parallelization, this will change:
    Threads.@threads for i=1:n
        do_some_work()
    end
    to:
    @sync for i=1:n
        Threads.@spawn do_some_work()
    end
  3. Also, from julia version 1.8 onwards, the behavior of Threads.@thread macro is going to change. And it will be dynamically scheduled by default. See: https://github.com/JuliaLang/julia/blob/v1.8.0-rc1/NEWS.md#multi-threading-changes. So maybe these changes won't be necessary after the release of v1.8?

@jgreener64
Copy link
Collaborator

I think target Julia 1.8 behaviour, it may be that we drop support for Julia 1.7 soon anyway depending on advances in the AD landscape. The parallel behaviour here doesn't have to mirror the existing behaviour though since it is more complex.

As you suggest it would be useful to be able to run each simulation on a certain number of threads, e.g. 16 cores could run 4 replicas on 4 cores each. Is this possible using the threads setup? The current parallel keyword argument may be insufficient here, I have considered changing it to n_threads across the whole package previously. Would that help?

@jgreener64
Copy link
Collaborator

Great. I think a Lennard-Jones simulation is okay for now but it would be good to see the graphs you had previously with a clearer temperature profile.

It also would be useful to log when the exchanges took place, then compare the number of exchanges over a simulation to an expected number. Perhaps the ReplicaSystem could have a loggers field, with the default being a new ReplicaExchangeLogger that records the time step, the indices exchanged and Δ for each exchange.

On the topic of loggers, I think ReplicaSystem shouldn't store replica_loggers in its struct. replica_loggers should be used in the constructor as is, then the loggers can be accessed with replica_system.replicas[i].loggers.

@JaydevSR
Copy link
Contributor Author

JaydevSR commented Jul 24, 2022

I performed a test simulation on 3 replicas of Leonard-Jones gas on 8 threads, using DistanceNeighborFinder and a Langevin simulator.

temp_vals = [120.0u"K", 160.0u"K", 200.0u"K"]
simulator = TemperatureREMD(
    dt=0.005u"ps",
    temperatures=temp_vals,
    simulators=[
        Langevin(
            dt=0.005u"ps",
            temperature=tmp,
            friction=0.1u"ps^-1",
        )
        for tmp in temp_vals],
    exchange_time=2.5u"ps",
);

The exchanges were logged with the help of the new ReplicaExchangeLogger and we have the following exchanges in form of (step, exchanged indices) after simulating for 20_000 steps:

 (9500, (2, 3))
 (12500, (1, 2))
 (15500, (1, 2))
 (17000, (3, 1))
 (18000, (2, 3))
 (18500, (2, 3))

These are plots for temperatures of each replica and the x-coordinates of the one of the atoms in these replicas:

  1. Temperatures:
    temperature_of_each_replica

  2. Coordinates:
    x_coordiantes_of_atom_7_in_each_replica

Edit: Changed the plots after fixing the velocity exchange

Copy link
Collaborator

@jgreener64 jgreener64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, the temperature difference is much clearer there. Ideally turn that simulation into a test that looks for the temperatures and exchange count to fall within an expected range.

Not that it would block the PR, but it would be interesting to see how performance scales with replica count. For example, how long does 4 replicas on 8 cores for 1000 atoms take compared to 1 simulation on 2 cores?

src/loggers.jl Outdated Show resolved Hide resolved
src/types.jl Outdated Show resolved Hide resolved
src/types.jl Show resolved Hide resolved
src/simulators.jl Outdated Show resolved Hide resolved
src/simulators.jl Outdated Show resolved Hide resolved
src/simulators.jl Outdated Show resolved Hide resolved
@JaydevSR
Copy link
Contributor Author

Benchmarks

  1. System with 1000 atoms and DistanceNeighborFinder, LeonardJones interactions using Langevin simulator for 5000 steps (2 threads):
    77.253948 seconds (582.38 k allocations: 4.966 GiB, 0.87% gc time)
  2. ReplicaSystem with 4 replicas of above system with same simulator and steps (8 threads):
    173.750242 seconds (2.18 M allocations: 20.255 GiB, 2.47% gc time, 0.45% compilation time)

Ratio of times: 2.25

test/simulation.jl Outdated Show resolved Hide resolved
Copy link
Collaborator

@jgreener64 jgreener64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The benchmark results are interesting - it's not essential for this PR but it would be good to profile and see what is stopping that ratio from being lower. Make sure the neighbour list is being used as well (see other comment).

test/simulation.jl Outdated Show resolved Hide resolved
test/simulation.jl Outdated Show resolved Hide resolved
src/loggers.jl Outdated Show resolved Hide resolved
@JaydevSR
Copy link
Contributor Author

I did profiling on the simulate! function and the bottleneck seems to be the part where the replicas are being exchanged. Most of the part there is type unstable. I will try to make it type stable and see if there are any improvements.

@JaydevSR
Copy link
Contributor Author

It seems that I misinterpreted the profiles before. The actual issue with the drop in performance seems to do with task scheduling. Here are the flame graph of the profiles:
overall_flame_graph
find_neighbors_focus_flame_graph

In the second image we can see this more clearly, more than 60% of time is spent in the wait() in the call stack of find_neighbors!. This as far as I can think should be because of clashing scheduling of @floops and @spawn. I think I may be able to fix this by using one of the different executors for the parallel loop in find_neighbors!. See: https://juliafolds.github.io/FoldsThreads.jl/dev/#FoldsThreads.WorkStealingEx. I will see if this works.

@JaydevSR
Copy link
Contributor Author

JaydevSR commented Aug 5, 2022

So I experimented a bit to see if there can be an improvement or not. After trying different executors and a pure julia implementation also, the case with 2 threads 1 system and 8 threads 4 replicas, gives no improvement. But for a 1 thread 1 system and 4 threads 4 replicas, the ratio was improved a bit to about 1.6, as my machine supports at most 8 threads so I think the wait time was high in the first case because all threads were getting used. So, maybe it is best to use less than the maximum available threads for the replica simulation to avoid such clashes.

But still julia v1.8 will bring some changes to multithreading, so it will be interesting to see if that gives any improvement for nested threaded loops.

Copy link
Collaborator

@jgreener64 jgreener64 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. Yes let's see how it changes when Julia 1.8 comes out.

I think this is ready to merge after some minor changes. Also add TemperatureREMD to the list of simulators in docs/src/docs.md line 679.

src/loggers.jl Show resolved Hide resolved
src/loggers.jl Outdated Show resolved Hide resolved
src/simulators.jl Outdated Show resolved Hide resolved
test/simulation.jl Outdated Show resolved Hide resolved
@JaydevSR JaydevSR marked this pull request as ready for review August 5, 2022 13:20
@JaydevSR JaydevSR changed the title WIP: Implementation of temperature REMD Implementation of temperature REMD Aug 5, 2022
@jgreener64
Copy link
Collaborator

Nice one 🎉

Thinking about next steps, it might be worth considering how to extend this to non-temperature REMD simulators. The system type and the logger are already independent of temperature. Would it be possible to modify the simulate! function to take in a function that runs the lines

if dimension(sys.energy_units) == u"𝐋^2 * 𝐌 * 𝐍^-1 * 𝐓^-2"
    k_b = sys.k * T(Unitful.Na)
else
    k_b = sys.k
end
T_n, T_m = sim.temperatures[n], sim.temperatures[m]
β_n, β_m = 1/(k_b*T_n), 1/(k_b*T_m)
neighbors_n = find_neighbors(sys.replicas[n], sys.replicas[n].neighbor_finder; n_threads=n_threads)
neighbors_m = find_neighbors(sys.replicas[m], sys.replicas[m].neighbor_finder; n_threads=n_threads)
V_n, V_m = potential_energy(sys.replicas[n], neighbors_n), potential_energy(sys.replicas[m], neighbors_m)
Δ = (β_m - β_n)*(V_n - V_m)
should_exchange = Δ <= 0 || rand(rng) < exp(-Δ)
return should_exchange, Δ

and determines whether an exchange should take place? Then simulators such as TemperatureREMD can be a particular case of a generic REMD simulator that takes this function as one of its arguments, and the simulation code can be reused.

@jgreener64 jgreener64 merged commit 3251cfd into JuliaMolSim:master Aug 5, 2022
@JaydevSR
Copy link
Contributor Author

JaydevSR commented Aug 5, 2022

Yay 🎊 .

Your suggestion seem excellent. This will help in generalizing the simulation code. I will go ahead and make these additions. Also, I think I can try implementing the Hamiltonian-REMD next, that would require the replicas to have different potentials. So, I will also need to make some changes to ReplicaSystem to allow for this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants