Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

United atom simulations with cholesterol #9

Closed
ohsOllila opened this issue Apr 1, 2021 · 12 comments
Closed

United atom simulations with cholesterol #9

ohsOllila opened this issue Apr 1, 2021 · 12 comments
Assignees

Comments

@ohsOllila
Copy link
Member

The idea in the databank is that the mapping file would be give to all molecules, and order parameters would be calculated based on that for all molecules. This includes also, for example, cholesterol. This means that for united atom models we need to construct the hydrogens also for all molecules. This is currently done with buildH code by @patrickfuchs et al., but I am not sure if it can handle cholesterol, and if yes, how the dictionary should be built.

Maybe @patrickfuchs could help here?

Example of simulation containing united atom cholesterol data can be found, for example, from here https://doi.org/10.5281/zenodo.605429

@patrickfuchs
Copy link

Hi @ohsOllila, yes I think buildH can reconstruct H for cholesterol without problem. I can add it to the current supported molecules. I keep you updated on that.
The only thing to bear in mind is that buildH has to be launched once for each molecule. Currently, it cannot calculate new H and the order parameter for more than one molecule at a time. Would that be OK with your goals in the databank project?

@ohsOllila
Copy link
Member Author

ohsOllila commented Apr 9, 2021

You can check how we use the buildH code now starting from line 1059 in AddData.py.
There we first construct the def file based on mapping file and then give that to the buildH

Currently we have the modified version of buildH in the databank folder which we then import to AddData.py.

So I think that it works for us now, but there might be a nicer way to implement this.

Anyway, a example how to include cholesterol into dic_lipids.py would be good.

@patrickfuchs
Copy link

Thanks @ohsOllila, this is instructive for me. I realize a few things :

  • buildH has changed massively since you cloned it in the Databank repo. It's no longer a single .py program, it has many modules and a more complex tree view.
  • We no longer use dic_lipids.py, now we use some json files, here's an example. The user can provide it's own json file if it's not in the default lipids present in buildH.
  • So far we've been using a def file to : i) give a generic name to each H, ii) to guess what Hs the user wants to build and calculate the OP on. This use of this def file is inherited from the initial script of J. Melcr, that is why we implemented it in buildH. I thought it would be useful to use the same input. But I realize you don't use it anymore, only mapping files. And then when you want to use buildH I see that you have to generate a def file on the fly, which is not very convenient. This is something we could simplify for sure.

I think that would be good if we talked with Anne sometime to see how we could simplify this procedure. Clearly, I see that it'll be convenient for you to use buildH as a module. I'll check with my colleagues whether it's easily doable.

For now, I come back to you (ASAP) once cholesterol is implemented in buildH.

@patrickfuchs
Copy link

For info, we created an issue in the buildH project so that it is usable as a module as you do now. I think that'll be useful to upgrade to the new version of buildH in the databank project, there many new optimizations and features. We expect to have that ready in a couple of weeks.

@patrickfuchs
Copy link

For info, we added a function buildh.launch() which allows the use of buildH as a module. The new version is now accelerated with Numba. On my workstation, a traj of 2500 frames with 128 POPC took more than 30', now takes ~ 7'. So I advice you to upgrade. If you need help, don't hesitate to contact me.
Regarding cholesterol, I have been able to create the json file automatically using autoLipMap. Thus I'll add it soon to the supported lipids of buildH. I'll make you know about this.

patrickfuchs added a commit to patrickfuchs/buildH that referenced this issue Jun 10, 2021
The molecule has been taken from https://doi.org/10.5281/zenodo.605429 as suggested by
@ohsOllila in the databank project of NMRlipids (NMRLipids/Databank#9)
The json and def files were automatically generated using autoLipMap (the scripts will be
added soon)
Fix #9
@patrickfuchs
Copy link

Hello @ohsOllila, I could finally implement the Berger cholesterol in buildH. It took more time than anticipated because I wanted to do it automatically with autoLipMap, and I did some tests to ensure the implementation was correct. Now the nice thing is that I can generate json (which replace dic_lipids.py) and def files quite rapidly thanks to autoLipMap. I plan to write the procedure on the autoLipMap repo.
Don't hesitate if you have questions about cholesterol or upgrading buildH (which I highly recommend).

@ohsOllila
Copy link
Member Author

ohsOllila commented Dec 6, 2021

I finally had some time to look at this. Currently the key for outdated dic_lipids.py is stored in the README.yaml file. The calcOrderParameters.py then creates a def file starting from the line:

for key in system['UNITEDATOM_DICT']:
and calls some modifed buildH version.

However, in buildH, dic_lipids.py have been replaced with def and json files of which only def file would be necessary in my understanding. Therefore, currently the best option to proceed would be to store the name of def/json file (e.g.,
Berger_POPC) into the README.yaml, and then call the new buildH to calculate order parameters. The def file would be readily
available from buildH, so the part that creates those could be removed from the code.

Only thing that remains to be done is to store the results using the mapping file names for atoms.

Other thing is that the def files currently contains also CH3 bond hydrogens. I think that we should not calculate order parameters for those because hydrogen locations are not uniquely defined for CH3 groups in united atom simulations. I filed an issue on this also to the buildH repo: patrickfuchs/buildH#157

@patrickfuchs
Copy link

However, in buildH, dic_lipids.py have been replaced with def and json files of which only def file would be necessary in my understanding. Therefore, currently the best option to proceed would be to store the name of def/json file (e.g., Berger_POPC) into the README.yaml, and then call the new buildH to calculate order parameters. The def file would be readily available from buildH, so the part that creates those could be removed from the code.

Great if this simplifies the code.

Only thing that remains to be done is to store the results using the mapping file names for atoms.

OK, I guess it should be quite easy.

Other thing is that the def files currently contains also CH3 bond hydrogens. I think that we should not calculate order parameters for those because hydrogen locations are not uniquely defined for CH3 groups in united atom simulations. I filed an issue on this also to the buildH repo: patrickfuchs/buildH#157

As stated in this issue we can probably implement a flag so that CH3 order parameters are not written to the output file. We keep you updated.

@ohsOllila
Copy link
Member Author

I have now implemented this into the order parameter code, but in a slightly different way than I described above. The code is almost the same as before, but instead of calling modified outdated buildH version, the buildH is ran at line

os.system('buildH -t ' + xtcwhole + ' -c ' + topfile + ' -d ' + def_fileNAME + ' -l ' + system['UNITEDATOM_DICT'][key] + ' -o ' + ordPfile + '.buildH' )

Naturally, this requires installation of buildH.

The def file is still generated by the code based on the mapping file. This helps to keep the atom names correct in the output. The name of the lipid topology (-l option) is read from the UNITEDATOM_DICT in a databank README.yaml file. This name has to be defined correctly when adding new datasets into the NMRlipids databank.

When we get a buildH version with the flag ignoring CH3 groups, it should be easy to use that flag here.

@patrickfuchs
Copy link

patrickfuchs commented Jan 17, 2022

buildH has now a new option -igch3 to ignore order parameters belonging to a methyl group even if they are present in the def file. It is also usable when launching buildH as a module with the option ignore_CH3s, we added an example in the doc here.

Regarding your example :

os.system('buildH -t ' + xtcwhole + ' -c ' + topfile + ' -d ' + def_fileNAME + ' -l ' + system['UNITEDATOM_DICT'][key]  + ' -o ' + ordPfile + '.buildH' ) 

I would recommand to launch buildH as a module that way:

import buildH
[...]
buildh.launch(coord_file=topfile, def_file=def_fileNAME, lipid_type=system['UNITEDATOM_DICT'][key], traj_file=xtcwhole , out_file=f"{ordPfile}.buildH", ignore_CH3s=True)

I think it makes the code more readable and avoids a call to os.system().

@patrickfuchs
Copy link

I just released a new version of buildH (1.6.1), you can update it, it'll include the new flag -igch3 as well as a fix of H reconstruction on PS.

ohsOllila pushed a commit that referenced this issue Feb 9, 2022
@ohsOllila
Copy link
Member Author

This is now implemented as suggested and seems to work fine, so I close this issue. Thanks a lot @patrickfuchs and others!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants