
used queue instead of ordinary list #5217

Merged
merged 2 commits into networkx:main on Dec 9, 2021
Conversation

@kryuki (Contributor) commented Dec 6, 2021

I noticed that you are using an ordinary list when performing BFS. Popping from the front of a list with pop(0) takes O(N) time, since all the elements in the list have to shift one position to the left. However, if you use the Queue class from the queue module, the dequeue operation (get()) runs in O(1) time, so it should be faster.
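
As a rough illustration of this point, here is a minimal timing sketch (added for this write-up, not taken from the PR) that drains a FIFO with list.pop(0) versus queue.Queue.get():

# Minimal timing sketch (not part of the PR): draining a FIFO with list.pop(0)
# versus queue.Queue.get(). N is kept small because the list version is
# quadratic overall; absolute numbers will vary by machine and Python version.
import time
from queue import Queue

N = 50_000

def drain_list():
    q = list(range(N))
    while q:
        q.pop(0)        # O(N) per pop: the remaining elements shift one slot left

def drain_queue():
    q = Queue()
    for i in range(N):
        q.put(i)
    while not q.empty():
        q.get()         # O(1) dequeue, FIFO order

for fn in (drain_list, drain_queue):
    t0 = time.time()
    fn()
    print(f'{fn.__name__}: {time.time() - t0:.3f} s')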

@rossbar (Contributor) left a comment

It's my impression that the queue module is more focused on threading support than on "queue" objects in the data structure sense - maybe the collections.deque container would be more suitable? It has the properties that you've identified (e.g. O(1) insertion/popping from the left) and a (IMO at least) nicer API, i.e. popleft(). What do you think?

At face value I think this change makes sense, but it'd be great to also have some benchmarking numbers (even if they're very basic) to evaluate what effect this has on performance, if any.
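
For context (an illustration added here, not part of rossbar's comment), the two APIs he compares look like this side by side:

# queue.Queue is a thread-safe FIFO: put()/get() acquire a lock on every call,
# which is overhead a single-threaded BFS does not need.
from collections import deque
from queue import Queue

q = Queue()
q.put("a")
q.put("b")
print(q.get())      # -> a

# collections.deque is a plain double-ended container: O(1) appends and pops at
# both ends, no locking, and popleft() makes the FIFO intent explicit.
d = deque(["a", "b"])
d.append("c")
print(d.popleft())  # -> a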

@kryuki (Contributor, Author) commented Dec 8, 2021

Thank you for your feedback! Yes, you are right: deque looks like the right choice if we just need a "queue" in the data-structure sense.
I tested the performance on several kinds of graphs with Python 3.9.6; here are the benchmark results.

         polbooks.mtx   mycielskian8.mtx   GD00_a.mtx   Vehicle_10NN.mtx   Erdos992.mtx
nodes    105            191                352          846                6100
edges    441            2360               458          5447               7515
list     0.078 s        0.338 s            0.079 s      5.195 s            196.86 s
queue    0.142 s        0.436 s            0.119 s      6.616 s            179.035 s
deque    0.069 s        0.314 s            0.081 s      4.48 s             107.628 s

It looks like deque has a 10~20% speed advantage over the ordinary list in many cases, and the larger the graph, the more noticeable the difference. It also turned out that using queue was generally no faster.

Therefore, using deque looks good to me. What do you think? I have pushed another commit.

The graphs I used are the five .mtx files listed in the table above.

The benchmark code I used is:

import time
import networkx as nx
from scipy.io import mmread
from networkx.algorithms.centrality.betweenness import betweenness_centrality

# note: 'mode' is not a NetworkX keyword; it is threaded through the locally
# modified betweenness.py (shown below) so the three containers can be compared
graph_name = 'graph_name.mtx'  # placeholder: one of the .mtx files from the table above
G = nx.from_scipy_sparse_matrix(mmread(graph_name))

print(f'Graph name: {graph_name}')
print(f'Number of nodes: {G.number_of_nodes()}')
print(f'Number of edges: {G.number_of_edges()}')

#Use ordinary list
t0 = time.time()
betweenness = betweenness_centrality(G, normalized=True, endpoints=True, mode="list")
print('-' * 20)
print("ordinary list")
print(f'Time taken: {round(time.time() - t0, 3)} s')

#Use Queue
t1 = time.time()
betweenness = betweenness_centrality(G, normalized=True, endpoints=True, mode="queue")
print('-' * 20)
print('queue')
print(f'Time taken: {round(time.time() - t1, 3)} s')

#Use deque
t2 = time.time()
betweenness = betweenness_centrality(G, normalized=True, endpoints=True, mode="deque")
print('-' * 20)
print("deque")
print(f'Time taken: {round(time.time() - t2, 3)} s')

In the modified betweenness.py:

# added for the benchmark: imports for the two extra container types
from collections import deque
from queue import Queue

# added: queue.Queue has no len(), so the loop condition depends on the mode
def keep_loop(Q, mode):
    if mode == 'queue':
        return not Q.empty()
    else:
        return len(Q) != 0

def _single_source_shortest_path_basic(G, s, mode=None):
    S = []
    P = {}
    for v in G:
        P[v] = []
    sigma = dict.fromkeys(G, 0.0)  # sigma[v]=0 for v in G
    D = {}
    sigma[s] = 1.0
    D[s] = 0

    # added: choose the BFS container according to mode (a plain list by default)
    Q = [s]
    if mode == 'queue':
        Q = Queue()
        Q.put(s)
    elif mode == 'deque':
        Q = deque([s])

    while keep_loop(Q, mode):  # use BFS to find shortest paths
        # added: dequeue the next node with the operation each container provides
        if mode == 'queue':
            v = Q.get()
        elif mode == 'deque':
            v = Q.popleft()
        else:
            v = Q.pop(0)
        S.append(v)
        Dv = D[v]
        sigmav = sigma[v]
        for w in G[v]:
            if w not in D:
                # added: enqueue the neighbor (put() for Queue, append() otherwise)
                if mode == 'queue':
                    Q.put(w)
                else:
                    Q.append(w)
                D[w] = Dv + 1
            if D[w] == Dv + 1:  # this is a shortest path, count paths
                sigma[w] += sigmav
                P[w].append(v)  # predecessors
    return S, P, sigma, D
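
For reference, here is a minimal sketch of what the helper presumably reduces to after the follow-up commit ("use deque instead of queue"), i.e. with the benchmark-only mode parameter removed and the list simply replaced by a deque. This is a reconstruction from the code above, not a copy of the merged diff:

# Sketch (assumption: the merged version keeps the original signature and only
# swaps the list for a collections.deque as the BFS queue).
from collections import deque

def _single_source_shortest_path_basic(G, s):
    S = []
    P = {}
    for v in G:
        P[v] = []
    sigma = dict.fromkeys(G, 0.0)  # sigma[v]=0 for v in G
    D = {}
    sigma[s] = 1.0
    D[s] = 0
    Q = deque([s])
    while Q:  # use BFS to find shortest paths
        v = Q.popleft()
        S.append(v)
        Dv = D[v]
        sigmav = sigma[v]
        for w in G[v]:
            if w not in D:
                Q.append(w)
                D[w] = Dv + 1
            if D[w] == Dv + 1:  # this is a shortest path, count paths
                sigma[w] += sigmav
                P[w].append(v)  # predecessors
    return S, P, sigma, D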

@rossbar (Contributor) left a comment

Awesome, thanks for the benchmarking and sharing all your work! I was surprised by the Erdos992 values so I tried that one locally - I saw more modest improvements but IMO that's not surprising since benchmarking can be notoriously difficult (also: your computer is much faster than mine 😂 ). Regardless, I think this is a very nice improvement both in terms of pedagogical value (i.e. leveraging the correct data structures for BFS) and performance.

Thanks @kryuki !

@kryuki (Contributor, Author) commented Dec 9, 2021

It's my pleasure. Yes, actually, the benchmark results vary even from run to run on my computer, but I am happy that they show at least some improvement.
Thank you as well for your wonderful library! I often use networkx in my research involving graphs, and it's really nice and robust. I really appreciate your work. Thanks @rossbar !

@dschult (Member) left a comment

Thank you for the PR, and especially for the quick adaptations to deque and for your performance exploration. This looks good!

@dschult dschult merged commit 96a236e into networkx:main Dec 9, 2021
@jarrodmillman jarrodmillman added this to the networkx-2.7 milestone Dec 13, 2021
dschult pushed a commit to dschult/networkx that referenced this pull request Jan 25, 2022
* used queue instead of ordinary list

* use deque instead of queue
MridulS pushed a commit to MridulS/networkx that referenced this pull request Feb 4, 2023
* used queue instead of ordinary list

* use deque instead of queue
cvanelteren pushed a commit to cvanelteren/networkx that referenced this pull request Apr 22, 2024
* used queue instead of ordinary list

* use deque instead of queue