Pandana network initiation slow for large network #174

Open
wendy-ngan opened this issue Nov 26, 2021 · 4 comments

@wendy-ngan

I am initializing a pandana network for the Canadian road network. My HDF5 file is just under 200 MB and the network has around 2.2 million edges. The code below has been running for almost 5 hours and is still going.

I am quite new to this domain and am wondering if anyone can give me suggestions for speeding up the computation.
Would it be possible to use multiprocessing here? Or should I break the road network into smaller pieces, such as census divisions or provinces, and somehow merge them back together? Would that make it run faster?

Thanks in advance!

Environment

  • Windows 10
  • Operating system: 64-bit operating system, x64-based processor
  • Processor: Intel(R) Core(TM) i5-1035G4 CPU @ 1.10GHz 1.50 GHz
  • 16GB RAM, 4 CPU core
  • Python version: 3.9.7
  • Pandana version: 0.6.1
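
import pandana as pdna

# nodes, edges and distance are defined earlier (not shown in the original post).
network = pdna.Network(nodes.x, nodes.y, edges['from'], edges['to'], edges[['dist']], twoway=False)
network.precompute(distance + 1)
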
@double-u-a

Pandana already utilises parallel processing as it has a C++ backend.
I'm not too sure why, but it's always been quicker for me to not use network.precompute and just run the analyses.
My 16-core Ryzen can initialize a 2-million-edge pdna.Network in around 2 minutes, so it's definitely not something you should break up into chunks if you can help it.

@wendy-ngan

wendy-ngan commented Nov 29, 2021

Thank you for the reply. I removed the precompute line as suggested and ran just pdna.Network. It's been one hour and it is still computing. Is it because I only have 4 cores and 8 threads?

Edit:
After I broke the network down into provinces, the computation time was a matter of seconds, even with the precompute call added!
However, I have another question: is it possible to convert the final result from the plot into a shapefile? Or is there a way to save the data for each node (lat, long, computed accessibility distance) into a GeoDataFrame?
There is a save_hdf5 option, but it doesn't save the associated data with the network.

@double-u-a

Oh yes, that sounds like a CPU limitation. Breaking things up does create a lot of issues with the validity of the analysis near provincial borders, for example around Ottawa and Gatineau.
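
One way to reduce the border problem when splitting is to give each province's subnetwork a buffer: keep every node that lies within the analysis distance of any node in the province, measured along the network. The sketch below is an illustration only, not something pandana does for you. The function names `nodes_within` and `buffered_subnetwork`, the `(from, to, dist)` tuple layout, and the toy data are all assumptions, and edges are treated as two-way while growing the buffer so that the kept set errs on the large side.

```python
import heapq
from collections import defaultdict


def nodes_within(edges, seed_nodes, max_dist):
    """Return all nodes whose network distance from any seed node is <= max_dist.

    edges: iterable of (from_node, to_node, dist) tuples (hypothetical layout).
    Edges are treated as two-way here so the buffer is a superset.
    """
    adjacency = defaultdict(list)
    for u, v, w in edges:
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))

    # Multi-source Dijkstra with a distance cutoff.
    best = {node: 0.0 for node in seed_nodes}
    heap = [(0.0, node) for node in seed_nodes]
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > best.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in adjacency[u]:
            nd = d + w
            if nd <= max_dist and nd < best.get(v, float("inf")):
                best[v] = nd
                heapq.heappush(heap, (nd, v))
    return set(best)


def buffered_subnetwork(edges, region_nodes, max_dist):
    """Keep the edges whose endpoints both fall inside the buffered node set."""
    keep = nodes_within(edges, region_nodes, max_dist)
    return [(u, v, w) for u, v, w in edges if u in keep and v in keep]


# Toy chain A-B-C-D-E with 100 m edges; the "province" is {A, B}.
edges = [("A", "B", 100.0), ("B", "C", 100.0), ("C", "D", 100.0), ("D", "E", 100.0)]
sub = buffered_subnetwork(edges, {"A", "B"}, 150.0)
```

Results would then be kept only for the nodes that belong to the province itself; the buffer nodes exist just so that paths crossing the border are not cut off.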

Just join the results of the analysis back to the input GeoDataFrame and export that. They should both be in the same order, so it's a straight join.
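
If a GeoDataFrame is not at hand, the same join can be sketched with the standard library alone by writing one GeoJSON point per node. This is a minimal sketch under assumptions: the function name `nodes_to_geojson`, the property name `access_dist`, and the toy coordinates are made up for illustration, and the two inputs are plain dicts keyed by node id (for example built from the node table and the analysis result with `.to_dict()`).

```python
import json


def nodes_to_geojson(node_xy, node_values, value_name="access_dist"):
    """Build a GeoJSON FeatureCollection with one Point feature per node.

    node_xy:     dict of node_id -> (x, y), i.e. (longitude, latitude)
    node_values: dict of node_id -> value computed by the analysis
    Node ids should be plain Python ints or strings so json can serialize them.
    """
    features = []
    for node_id, (x, y) in node_xy.items():
        # Join on node id; nodes without a result get None (null in JSON).
        value = node_values.get(node_id)
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [float(x), float(y)]},
            "properties": {"node_id": node_id, value_name: value},
        })
    return {"type": "FeatureCollection", "features": features}


# Toy data standing in for the node coordinates and the analysis output.
node_xy = {1: (-75.70, 45.42), 2: (-75.71, 45.43)}
node_values = {1: 120.0, 2: 340.5}

collection = nodes_to_geojson(node_xy, node_values)
geojson_text = json.dumps(collection)  # write this string to a .geojson file
```

A file written this way can be opened in most GIS tools, or loaded with `geopandas.read_file` and saved as a shapefile from there.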

@Fardin3303

Fardin3303 commented Aug 16, 2022

I am trying to create a network. With 480000 rows of my dataset, which gives 482801 nodes and 960000 edges, it works (in 2 seconds), but when I add a few more (490000 rows) it gets stuck at:
Generating contraction hierarchies with 16 threads.
Setting CH node vector of size 492730
Setting CH edge vector of size 980000
Range graph removed 980000 edges of 1960000

and nothing happens. This is my system configuration:
Docker container
Windows 10, wsl2
Operating system: 64-bit operating system, x64-based processor
Processor: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
32GB RAM, 8 CPU core
Python version: 3.10.2
Pandana version: 0.6.1

Is it related to my machine resources?
