getting PCA equilibration times for the other components #170

batukav · 2024-02-24T14:37:17Z

Hi all,

I'm interested in calculating the PCA equilibration times not only from (or for) the first principal component but from the first 10 components. What would I need to change in the code to do this?

I would like to create a eq_times.json file as an output that contains the equilibration times calculated for each of the top 10 principal components. @pbuslaev, any suggestions?

pbuslaev · 2024-02-24T19:27:26Z

You will have to rework the PCA computation a bit. Currently, only one PC is taken into account (check this function). Maybe the easiest solution, though not the nicest, is to add an argument coord to the get_proj method of PCA class:

def get_proj(self, data, pc = 0):
        # projection on PC
        proj = np.tensordot(cdata, self.eig_vecs[pc:pc + 1], axes=(1, 1)).T
        proj = np.concatenate(np.array(proj), axis=None)

        self.proj = proj

Then, you will need to iterate over PC components you're interested in in main. So instead of current loop, you'd likely want to have smth like

equilibration_times = {}
# Iterate over trajectories for different lipids
for traj in parser.concatenated_trajs:
    print(f"Main: parsing lipid {traj[4]}")
    # Creat PCA for trajectory
    pca_runner = PCA(traj[0], traj[1], traj[2], traj[3], parser.trjLen, pc)
    # Run PCA
    data = pca_runner.PCA()
    print("Main: PCA: done")
    _tmp_eq = []
    for pc in range(10):
        # Project trajectory on PC1
        pca_runner.get_proj(data, pc)
        print("Main: Projections: done")
        # Calculate autocorrelation
        pca_runner.get_autocorrelations()
        print("Main: Autocorrelations: done")
        # Estimate equilibration time
        te2 = TimeEstimator(pca_runner.autocorrelation).calculate_time()
        # NOTE: if you want pure time, do not divide by trjLen
        _tmp_eq.append(te2/parser.trjLen)
        print("Main: EQ time: done")
    equilibration_times[traj[4]] = _tmp_eq

Hope it helps

batukav · 2024-02-25T17:13:30Z

Thank you, this helps a lot. I will make the necessary changes and will update this post. Do you see any value of adding this functionality to the databank?

pbuslaev · 2024-02-25T19:24:21Z

I am not sure how useful it would be for the whole databank. Out of interest, I would be happy to contribute to the large-scale application of PCA analysis to membranes (e.g. to the whole databank). I think there could be some interesting findings which can be published somewhere, but I am not sure that it would be aligned with the databank goals that much.

batukav · 2024-02-27T21:28:52Z

Hello,

I have made the implementation as the following:

for the final loop

        equilibration_times = {}
        # Iterate over trajectories for different lipids
        for traj in parser.concatenated_trajs:
            print(f"Main: parsing lipid {traj[4]}")
            # Creat PCA for trajectory
            pca_runner = PCA(traj[0], traj[1], traj[2], traj[3], parser.trjLen)
            # Run PCA
            data = pca_runner.PCA()
            print("Main: PCA: done")
            _tmp_eq = []
            for pc in range(10):
                # Project trajectory on PC1
                pca_runner.get_proj(data, pc)
                print("Main: Projections: done")
                # Calculate autocorrelation
                pca_runner.get_autocorrelations()
                print("Main: Autocorrelations: done")
                # Estimate equilibration time
                te2 = TimeEstimator(pca_runner.autocorrelation).calculate_time()
                # NOTE: if you want pure time, do not divide by trjLen
                _tmp_eq.append(te2/parser.trjLen)
                print("Main: EQ time: done")
            equilibration_times[traj[4]] = _tmp_eq

and modified the get_proj() as

    def get_proj(self, cdata, pc = 0):
            # projection on PC
            proj = np.tensordot(cdata, self.eig_vecs[pc:pc + 1], axes=(1, 1)).T
            proj = np.concatenate(np.array(proj), axis=None)

            self.proj = proj

with these modifications, I managed to get 10 projections. When I run the code, I got exactly the same equilibration times for the PC0 as with the previous version of the code.

A curious thing is that when I applied the code to some other data (databank entry 2c6/ae5/2c6ae54d8ce9652f529cd674727a0602d06c8990/2c6ae54d8ce9652f529cd674727a0602d06c8990) it turned out that the first PC was not the slowest one:

{"POPC": [3.3065293017437223, 9.520027080952298, 6.154706020727084, 1.2030953056192617, 1.2769982402263582, 4.590255687551348, 1.3070063956914977, 3.1167834233800424, 1.6524583381178275, 5.383538786484999]}

For the others I tried (36e/333/36e333d9de810fdc8d0009df0e755225f7645161/36e333d9de810fdc8d0009df0e755225f7645161 and 478/b6c/478b6c725d0a406058482c7d80ee728165e4a884/d6eaf162a46bcbce0935cb07cf3927ba2f7bacbc also some other) the first component is indeed the slowest one.

I'm curious if these results make any sense, as I thought that the first component is always the slowest. Could it be that my implementation is wrong, or interpretation wrong, or both with varying degrees?

pbuslaev · 2024-02-27T22:16:09Z

Was the trajectory with strange results with Polarizable ff?

batukav · 2024-02-28T05:25:18Z

With only one of the polarizable trajectory. The other polarizable trajectories (both with Drude and Amoeba) I tried worked as expected. I also edited the atom selection to exclude Drude particles alongside with the hydrogens. pbuslaev ***@***.***> schrieb am Di. 27. Feb. 2024 um 23:16:

…

Was the trajectory with strange results with Polarizable ff? — Reply to this email directly, view it on GitHub <#170 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AB47SNK53PUJDCA73SVDIBLYVZLLLAVCNFSM6AAAAABDYB2PBOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNRXG4ZTAOJUHE> . You are receiving this because you authored the thread.Message ID: ***@***.***>

pbuslaev · 2024-02-28T21:33:46Z

I rememember, that when I tried PCA with polarizable force fields, I also observed this strange behaviour for some of the trajectories. Usually, it was due to insufficient sampling in polarizable simulations. I won't be able to tell more without additional details, unfortunately. My experience, that for such strange cases, running full PCA analysis can be really helpful to detect the origin of the problem.

batukav · 2024-02-28T21:40:42Z

I see, that's also my hunch. What do you mean by the full PCA analysis? What would I need to do there?

pbuslaev · 2024-02-28T21:45:29Z

The analysis implemented in Databank only focuses on fast computation of equilibration times. But as described in the original paper, many more things, including structural ensembles, can be checked. So if you are interested in the behavior of a particular trajectory, full analysis can be helpful. But it will require deep dive into results, computed for a trajectory and thus can not be easily scaled.

ohsOllila · 2024-04-26T20:20:10Z

Should we close this issue?

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

getting PCA equilibration times for the other components #170

getting PCA equilibration times for the other components #170

batukav commented Feb 24, 2024

pbuslaev commented Feb 24, 2024

batukav commented Feb 25, 2024

pbuslaev commented Feb 25, 2024

batukav commented Feb 27, 2024

pbuslaev commented Feb 27, 2024

batukav commented Feb 28, 2024 via email •

edited

pbuslaev commented Feb 28, 2024

batukav commented Feb 28, 2024

pbuslaev commented Feb 28, 2024

ohsOllila commented Apr 26, 2024

getting PCA equilibration times for the other components #170

getting PCA equilibration times for the other components #170

Comments

batukav commented Feb 24, 2024

pbuslaev commented Feb 24, 2024

batukav commented Feb 25, 2024

pbuslaev commented Feb 25, 2024

batukav commented Feb 27, 2024

pbuslaev commented Feb 27, 2024

batukav commented Feb 28, 2024 via email • edited

pbuslaev commented Feb 28, 2024

batukav commented Feb 28, 2024

pbuslaev commented Feb 28, 2024

ohsOllila commented Apr 26, 2024

batukav commented Feb 28, 2024 via email •

edited