Unable to close network in memory #52

Closed
pksohn opened this issue Jun 1, 2016 · 7 comments
@pksohn
Contributor

pksohn commented Jun 1, 2016

There doesn't seem to be a good way to "close" a pandana network instance in memory. I would like to get the following steps working:

  • Load a pandana network from its precomputed saved state as an HDF5 file (i.e. using from_hdf5)
  • Do some things with it (join data to it, run aggregation query, return data)
  • Close the network so I can repeat these steps

Since I can't seem to "close" the pandana network loaded in memory (the last step above), repeating these steps fails with: AssertionError: Adding more networks than have been reserved. I'm aware that pandana can be configured to hold more than one network in memory, but I'd like to avoid that solution.
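For concreteness, here is a minimal sketch of the loop I'd like to run (the file names and the `load_jobs_for` helper are hypothetical; the exact spelling of the `from_hdf5` loader may differ by version):

```python
import pandana.network as pdna

for path in ["network_a.h5", "network_b.h5"]:        # hypothetical input files
    net = pdna.Network.from_hdf5(path)               # step 1: load precomputed state
    jobs = load_jobs_for(path)                       # hypothetical: Series indexed by node id
    net.set(jobs.index, variable=jobs, name="jobs")  # step 2: join data to the network
    result = net.aggregate(3000, type="sum",
                           decay="linear", name="jobs")  # step 2: aggregation query
    # step 3: there is no close()/release for `net`, so the second pass
    # through this loop raises:
    #   AssertionError: Adding more networks than have been reserved
```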

@jiffyclub
Member

This would be really nice. Unfortunately this limitation is pretty deep in the contraction hierarchies code that underlies Pandana.

@waddell
Member

waddell commented Jun 1, 2016

That's what I thought might be the case. We'll have to get Juan or Fede to look into this.

@fscottfoti
Contributor

That's a good question. Unfortunately there is no quick solution. Pandana isn't really designed to be run in that mode, and it's a limitation I've known about for a while. Technically the limitation is in the nearest-neighbor part of Pandana (the ANN library, which I did not write), but because it was there I set Pandana up the same way. Basically it expects you to load your networks once and then run a bunch of queries against them, rather than reprocessing networks over and over.

Now that scipy has a nearest-neighbor implementation (added about 18 months ago, I think), this limitation could be removed by using scipy's NN instead. It would take a bit of coding down in the C internals to remove the limitation in Pandana, but it's possible.

The global memory in question is here: https://github.com/UDST/pandana/blob/master/src/pyaccesswrap.cpp#L8. There's just one pointer that needs to be moved inside the object that gets passed to Python. (And then the nearest-neighbor library needs to be removed and scipy used instead.)

You have a couple of options for workarounds. The most likely to work is to start a new process every time you need a new network. This would be pretty easy to write as a script: you could run a script that does exactly what you describe, and because each run is a totally separate process, the networks can be initialized fresh every time. The argument against this is that spawning a process per request is expensive, but reading the file from HDF5 is going to be slow anyway, so I think this will be a non-issue for your use case.

The only tricky part will be communicating between your code and the separate process/script. That script could read in a file and write out a file as you describe, or communicate over stdin and stdout; I would look at os.popen and similar functions for that. But in general the short-term solution, I think, is to start a new process for each request.
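A minimal sketch of that per-process workaround (using subprocess, which supersedes os.popen; worker.py and the file names are made up):

```python
# driver.py -- launches a fresh interpreter per network, so each run starts
# with clean global state in Pandana's C extension.
import subprocess
import sys

for network_file in ["network_a.h5", "network_b.h5"]:   # hypothetical inputs
    result_file = network_file.replace(".h5", "_result.csv")
    subprocess.run([sys.executable, "worker.py", network_file, result_file],
                   check=True)
```

```python
# worker.py -- loads exactly one network, runs the queries, writes the result,
# and exits; process teardown releases the reserved network slot.
import sys
import pandana.network as pdna

network_file, result_file = sys.argv[1:3]
net = pdna.Network.from_hdf5(network_file)
# ... net.set(...) and net.aggregate(...) as in the original post ...
# results.to_csv(result_file)
```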

The long-term solution would be to remove the few uses of global memory, which is not a huge deal but would take some time to do right.

@federicofernandez
Contributor

I have two comments / questions about this issue.

On one hand, I have been analyzing whether libANN is actually needed in the code. From what I can see, it isn't directly used in the C++ CH code; it's only used for generating the POI categories, and that part could be extracted and implemented on the Python side. Following that reasoning, it shouldn't be too difficult to replace the KD-tree with the scipy version (see the sketch at the end of this comment).

On the other hand, I haven't yet found what actually prevents closing an ANN KD-tree. The annClose() method can't be called because it frees global memory, so it makes sense that this line [1] is commented out. However, there are functions to delete the memory allocated by annAllocPts and the like. Perhaps I'm missing something, since I've only looked at the code for a couple of hours.

[1] https://github.com/UDST/pandana/blob/master/src/nearestneighbor.cpp#L21
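For what it's worth, here is roughly what the replacement could look like on the Python side with scipy.spatial.cKDTree (the coordinates below are random placeholders):

```python
import numpy as np
from scipy.spatial import cKDTree

node_xy = np.random.rand(10_000, 2)  # hypothetical network node coordinates
poi_xy = np.random.rand(200, 2)      # hypothetical POI locations for one category

tree = cKDTree(poi_xy)
# Distances to, and indices of, the 3 nearest POIs for every node.
dists, idx = tree.query(node_xy, k=3)

# Unlike the libANN globals, the tree is an ordinary Python object:
# dropping the reference frees it, so it can be rebuilt per network/category.
del tree
```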

@fscottfoti
Contributor

I think this is all correct.

I think there might be a way to free up parts of the kdtree and reallocate, but I don't think I ever got that far.

I think replacing libANN with scipy (you're absolutely correct that libANN and the accessibility code are separable) is a very promising path. You might want to do a benchmark first, though, because I remember libANN being a bit slower (though maybe not enough to matter).
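The scipy half of that benchmark is easy to sketch (the problem sizes below are arbitrary; the libANN numbers would have to come from an equivalent query through the existing extension):

```python
import time
import numpy as np
from scipy.spatial import cKDTree

nodes = np.random.rand(500_000, 2)   # arbitrary problem sizes
pois = np.random.rand(10_000, 2)

t0 = time.perf_counter()
tree = cKDTree(pois)
print(f"build: {time.perf_counter() - t0:.3f}s")

t0 = time.perf_counter()
tree.query(nodes, k=5)
print(f"query: {time.perf_counter() - t0:.3f}s")
```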

@federicofernandez
Contributor

Thanks Fletcher! I'll do the benchmarking and then decide what to do. However, I suspect that going with scipy might be the best idea.

@fscottfoti fscottfoti mentioned this issue May 24, 2017
@fscottfoti
Contributor

Fixed in #87
