Speedup identify_connections #741

Merged
merged 4 commits into from
Jul 7, 2023

Conversation

daico007
Member

The current Topology.identify_connections method is quite slow for large systems. This PR makes several changes to gmso/utils/connectivity.py to improve its performance: sites are now compared by their index only, IndexedSet is used instead of list, etc.
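To illustrate why these two changes tend to help, here is a minimal sketch that is not taken from the PR diff. It assumes connections are represented as tuples of integer site indices and that a connection and its reverse count as the same connection; how GMSO actually canonicalizes connections is not shown in this thread. A plain `dict` stands in for `IndexedSet` as an insertion-ordered set so the example needs no third-party package, and both function names are hypothetical.

```python
def dedupe_with_list(candidates):
    """List-based version: every membership test scans the whole list."""
    seen = []
    for conn in candidates:
        if conn not in seen and conn[::-1] not in seen:
            seen.append(conn)
    return seen


def dedupe_with_ordered_set(candidates):
    """Hash-based version: membership is an average O(1) lookup on index tuples."""
    seen = {}  # dict keys act as an insertion-ordered set (stand-in for IndexedSet)
    for conn in candidates:
        # Pick one orientation so (0, 1, 2) and (2, 1, 0) map to the same key.
        key = conn if conn[0] <= conn[-1] else conn[::-1]
        seen.setdefault(key, None)
    return list(seen)


# Angles given as (site index, site index, site index), with duplicates.
candidates = [(0, 1, 2), (2, 1, 0), (1, 2, 3), (3, 2, 1), (0, 1, 2)]
print(dedupe_with_list(candidates))         # [(0, 1, 2), (1, 2, 3)]
print(dedupe_with_ordered_set(candidates))  # [(0, 1, 2), (1, 2, 3)]
```

Hashing small integer tuples is cheap, whereas comparing full site objects inside a list scan repeats the comparison for every stored element, so the gap should widen as the number of connections grows.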

@codecov

codecov bot commented Jun 27, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (727d32a) 91.99% compared to head (57aa644) 91.99%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #741   +/-   ##
=======================================
  Coverage   91.99%   91.99%           
=======================================
  Files          67       67           
  Lines        6460     6460           
=======================================
  Hits         5943     5943           
  Misses        517      517           
Impacted Files Coverage Δ
gmso/utils/connectivity.py 98.38% <100.00%> (ø)


@chrisjonesBSU
Contributor

Speeding this up will be great. Thanks for doing this!

Have you benchmarked it yet to get an idea of the performance improvement?

@daico007
Member Author

daico007 commented Jun 28, 2023

I haven't, but I can whip something up.

@CalCraven
Contributor

CalCraven commented Jul 6, 2023

I ran this code locally on a test system (a box of 100 polymer molecules).

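import sys
import timeit

import numpy as np

sys.path.append("..")
# basic_testing is a local helper module, not part of GMSO.
from basic_testing import gmso_water_graph, box_polymer

n_mols = 100
n_repeats = 3

# timeit runs this setup in its own namespace before each repeat,
# so every repeat starts from a freshly built topology.
setupStr = """
import sys
sys.path.append("..")
from basic_testing import gmso_water_graph, box_polymer
n_mols = 100
poly_box = box_polymer(n_mols)
gmso_top = gmso_water_graph(poly_box)
"""
# Each entry in outList is the total time for number=3 calls.
outList = timeit.repeat(
    "gmso_top.identify_connections()", setup=setupStr, number=3, repeat=n_repeats
)

# The topology built inside setupStr is not visible here, so build one
# in this namespace to report the number of connections found.
gmso_top = gmso_water_graph(box_polymer(n_mols))
gmso_top.identify_connections()
print(
    f"Identify {gmso_top.n_angles + gmso_top.n_dihedrals} Connection in"
    f" {np.mean(outList):.2f} +- {1.96 * np.std(outList) / np.sqrt(n_repeats):.3f} seconds"
)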

Output with this PR

Identify 1491 Connection in 0.92 +- 0.002 seconds

Output with current GMSO main

Identify 1491 Connection in 1.24 +- 0.002 seconds

That is roughly a 25% reduction in runtime for this system (1.24 s down to 0.92 s, about 1.35x faster).
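For anyone repeating the comparison, the arithmetic can be sketched as below. The function names are hypothetical helpers, not part of GMSO or the benchmark script; `ci_half_width` mirrors the `1.96 * std / sqrt(n)` expression in the print statement, using the population standard deviation that `np.std` computes by default.

```python
import math
import statistics


def ci_half_width(samples, z=1.96):
    """Half-width z * std / sqrt(n), with population std (np.std's default)."""
    return z * statistics.pstdev(samples) / math.sqrt(len(samples))


def runtime_reduction(old_seconds, new_seconds):
    """Fraction of the old runtime that was saved."""
    return (old_seconds - new_seconds) / old_seconds


def speedup_factor(old_seconds, new_seconds):
    """How many times faster the new code is."""
    return old_seconds / new_seconds


old_mean, new_mean = 1.24, 0.92  # mean timings reported in this comment
print(f"runtime reduction: {runtime_reduction(old_mean, new_mean):.1%}")  # 25.8%
print(f"speedup factor: {speedup_factor(old_mean, new_mean):.2f}x")      # 1.35x
```

Note that with `number=3`, each timing is the total for three consecutive calls, so the per-call time is one third of the reported mean.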

Contributor

@CalCraven CalCraven left a comment


LGTM. I also did a little plotting locally to verify, and these changes seem to help even more at larger numbers of connections.
[image: plot attached to the original comment, not preserved]

Contributor

@chrisjonesBSU chrisjonesBSU left a comment


LGTM. The performance improvement looks awesome, thanks @daico007!

@daico007
Member Author

daico007 commented Jul 7, 2023

I will merge this after all tests pass. Thanks for doing the profiling, Cal!

@daico007 daico007 merged commit 2f3c7f3 into mosdef-hub:main Jul 7, 2023
13 of 14 checks passed