speed up minimum spanning tree step #336

Closed
zktuong opened this issue Nov 18, 2023 · 1 comment · Fixed by #335
Labels
enhancement New feature or request

Comments

@zktuong
Owner

zktuong commented Nov 18, 2023

Is your feature request related to a problem?

Currently the minimum spanning tree step doesn't scale well when there are many nodes in a single clone. The main bottleneck is the nx.from_pandas_adjacency step.

def mst(mat: dict) -> Tree:
    """
    Construct minimum spanning tree based on supplied matrix in dictionary.

    Parameters
    ----------
    mat : dict
        Dictionary containing numpy ndarrays.

    Returns
    -------
    Tree
        Dandelion `Tree` object holding DataFrames of constructed minimum spanning trees.
    """
    mst_tree = Tree()
    for c in mat:
        tmp = mat[c] + 1
        tmp[np.isnan(tmp)] = 0
        G = nx.from_pandas_adjacency(tmp)
        mst_tree[c] = nx.minimum_spanning_tree(G)
    return mst_tree

Describe the solution you'd like

Can we create the graph in batches?
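Instead of batching, another possible direction is to skip the dense NetworkX graph and compute the tree on a sparse matrix with SciPy, then build a NetworkX graph from only the tree edges. The sketch below is not the fix that was merged. It assumes each `mat[c]` is a square pandas DataFrame with the same labels on rows and columns (which `nx.from_pandas_adjacency` already requires), and `mst_single` is a hypothetical helper name.

```python
import networkx as nx
import numpy as np
import pandas as pd
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import minimum_spanning_tree


def mst_single(dist: pd.DataFrame) -> nx.Graph:
    """Hypothetical helper: MST for one clone's square distance DataFrame."""
    labels = list(dist.index)
    # Same preprocessing as the original loop: shift by 1 so that true zero
    # distances survive, then map NaN to 0, which both NetworkX and SciPy
    # interpret as "no edge".
    arr = dist.to_numpy(dtype=float) + 1
    arr[np.isnan(arr)] = 0
    # Self-loops can never be part of a spanning tree, so drop the diagonal.
    np.fill_diagonal(arr, 0)
    # SciPy computes the spanning tree directly on the sparse matrix.
    tree = minimum_spanning_tree(csr_matrix(arr)).tocoo()
    G = nx.Graph()
    # Keep every node, including isolated ones, as nx.minimum_spanning_tree does.
    G.add_nodes_from(labels)
    G.add_weighted_edges_from(
        (labels[i], labels[j], float(w))
        for i, j, w in zip(tree.row, tree.col, tree.data)
    )
    return G
```

Inside the loop this would be used as `mst_tree[c] = mst_single(mat[c])`. When several edges share the same weight, SciPy may pick a different but equally light tree than NetworkX, so the edge sets are not guaranteed to be identical.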

Describe alternatives you've considered

No response

Additional context

No response

@zktuong
Owner Author

zktuong commented Nov 18, 2023

This part is also very slow.

for i, row in edge_list_final.iterrows():
    edge_list_final.at[i, "weight"] = tmp_totaldist.loc[
        row["source"], row["target"]
    ]
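The row-by-row `.loc` lookup above could probably be replaced by a single vectorized index into the underlying array. This is a minimal sketch, assuming `tmp_totaldist` is a DataFrame with unique row and column labels and that `edge_list_final` has `source` and `target` columns; `lookup_weights` is a hypothetical helper name.

```python
import numpy as np
import pandas as pd


def lookup_weights(edges: pd.DataFrame, dist: pd.DataFrame) -> np.ndarray:
    """Hypothetical helper: fetch dist[source, target] for every edge at once."""
    # Translate labels to integer positions (requires unique labels).
    rows = dist.index.get_indexer(edges["source"])
    cols = dist.columns.get_indexer(edges["target"])
    # get_indexer returns -1 for labels that are not found.
    if (rows < 0).any() or (cols < 0).any():
        raise KeyError("some source/target labels are missing from dist")
    # NumPy fancy indexing picks one value per (row, col) pair.
    return dist.to_numpy()[rows, cols]
```

The loop would then reduce to `edge_list_final["weight"] = lookup_weights(edge_list_final, tmp_totaldist)`.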
