For the --init process, yes, I noticed a long time ago that compounds are initialized very slowly, because some molecules take a long time to generate isomers. To speed up the process, I multiply the requested ncpu by the cpu value in the config.yml for docking (I hardcoded this because I did not want to add another argument to --init_db at that time). If I remember correctly, this speeds up the process roughly linearly: it takes around 3 hours to initialize ~600k compounds, including isomers, with 150 CPUs.
Currently the init_db function takes an ncpu argument, which comes from the command-line argument ncpu. The issue is that this command-line argument has a different meaning depending on whether docking is launched on a single server or with dask on multiple servers. In dask mode, it is the number of CPUs used for all processing other than docking. When docking on a single server, it is additionally the number of molecules docked in parallel.
The obvious solution is to set ncpu in all functions to multiprocessing.cpu_count(). The user would then lose control over those parts of the program, and only control over docking would remain. I am not sure this is the best solution, but I do not see another option currently.
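To make the trade-off concrete, here is a minimal sketch of how the worker count for the non-docking steps could be resolved. The helper name resolve_ncpu and its optional override are hypothetical and not part of the current code; with no argument it simply falls back to multiprocessing.cpu_count(), as proposed above.

```python
import multiprocessing


def resolve_ncpu(requested=None):
    """Return the number of worker processes for non-docking steps.

    Hypothetical helper: `requested` is an optional user override.
    When it is None, all CPUs reported by the OS are used.
    """
    available = multiprocessing.cpu_count()
    if requested is None:
        return available
    # Clamp the override to the range [1, available].
    return max(1, min(int(requested), available))
```

Keeping an optional override would leave the user some control, at the cost of one more argument, which is exactly what the hardcoded approach tried to avoid.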
Another slowdown is caused by the non-parallelized post-processing of molecules after protonation (in add_protonation) when molecules are submitted as 3D structures: there is an additional, time-consuming step of assigning correct bond orders. This can also be addressed in the context of this issue. I have a draft implementation to solve this, but I have not tested it yet.
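For illustration only, the general pattern for parallelizing such a per-molecule step with multiprocessing.Pool might look like the sketch below. This is not the draft implementation mentioned above: postprocess_stub is a hypothetical stand-in for the real bond-order assignment, and the input is assumed to be a list of molecule blocks as strings.

```python
from multiprocessing import Pool


def postprocess_stub(mol_block):
    # Hypothetical stand-in for the per-molecule work done after
    # protonation (e.g. assigning bond orders); here it only strips
    # surrounding whitespace so the sketch stays self-contained.
    return mol_block.strip()


def postprocess_all(mol_blocks, ncpu=1, chunksize=100):
    """Apply postprocess_stub to every item, optionally in parallel."""
    if ncpu <= 1:
        # Serial fallback avoids process start-up overhead.
        return [postprocess_stub(m) for m in mol_blocks]
    with Pool(ncpu) as pool:
        # imap preserves input order; chunksize batches items per task
        # to reduce inter-process communication overhead.
        return list(pool.imap(postprocess_stub, mol_blocks, chunksize=chunksize))
```

The worker function has to be defined at module level so that it can be pickled and sent to the worker processes.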
Originally posted by @Feriolet in #35 (comment)