ph5toms memory usage #100

Closed
nick-falco opened this issue Aug 3, 2017 · 12 comments

@nick-falco (Collaborator) commented Aug 3, 2017

ph5toms uses a lot of memory (>1 GB per request). This affects our ability to scale the web services. We need to work on making it more efficient.

The memory usage graph looks like this for the following request on my local machine:
mprof run --include-children ph5toms --nickname master --ph5path /hdf5-data2/PH5_Experiments/pn4/16-015/ -o . --station 100? --channel DPZ --starttime 2016-06-26T00:00:00.0 --stoptime 2016-06-27T00:00:00.0
[attached memory usage graph: yw 100 dpz 2016_ph5dataselect]

@derick-hess (Contributor)

Still running some tests. When I run a really big request, say 10 GB of data, it is clearer that memory usage goes up by the amount of data yielded each time it yields an ObsPy Stream.

I have already added code to clear out PH5 tables after they are done being used. That helped a little, but those tables really aren't that big. I'm investigating why the yielded object, or data related to it, isn't being garbage collected automatically after it has been yielded.

At least that is my theory so far. Using mprof and guppy right now.
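
For reference, a minimal guppy snippet for spotting which object types grow between iterations might look like the following (a sketch only, assuming the guppy package is installed; on Python 3 the fork is guppy3):

# Sketch only, not the exact commands used here.
from guppy import hpy

hp = hpy()
hp.setrelheap()      # measure heap growth relative to this point

# ... run one request / one pass of the generator here ...

print(hp.heap())     # live objects allocated since setrelheap(), grouped by type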

@nick-falco (Collaborator, Author)

I think you may be on to something. The memory might not be freed until StopIteration is reached. See the following Stack Overflow post:
https://stackoverflow.com/questions/15490127/will-a-python-generator-be-garbage-collected-if-it-will-not-be-used-any-more-but
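
The gist of that answer, as a minimal sketch (the bytearray is just a stand-in for a large ObsPy Stream; nothing here is ph5toms code):

import gc

def make_streams():
    big = bytearray(100 * 1024 * 1024)   # stand-in for a large Stream
    while True:
        yield big

gen = make_streams()
next(gen)      # 'big' is now referenced by the suspended generator frame
gc.collect()   # frees nothing: gen -> frame -> big is still reachable

del gen        # dropping the generator releases its frame...
gc.collect()   # ...and only then can 'big' be collected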

@derick-hess (Contributor)

Yeah, so it looks like while the iteration is in progress, no garbage collection happens anywhere in the script. I tried clearing out and deleting all the tables as soon as they are unneeded, but mprof shows the memory isn't actually released until the end of the script, even if I explicitly call gc.collect().

@derick-hess (Contributor)

One solution I will look into is multiprocessing, to see whether that allows proper garbage collection.
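
Roughly the shape of that idea, as a sketch (cut_traces here is a dummy stand-in for the heavy generator, not the actual ph5toms code): the heavy work runs in a child process, so its memory goes back to the OS when the child exits.

import multiprocessing as mp

def cut_traces(request):                 # dummy stand-in for the heavy work
    for _ in range(4):
        yield bytearray(50 * 1024 * 1024)

def worker(request, queue):
    for stream in cut_traces(request):
        queue.put(stream)
    queue.put(None)                      # sentinel: no more data

def run(request):
    queue = mp.Queue()
    proc = mp.Process(target=worker, args=(request, queue))
    proc.start()
    while True:
        stream = queue.get()
        if stream is None:
            break
        yield stream
    proc.join()                          # child exits; its memory is released

One caveat: anything passed through the queue has to be picklable, so there is some serialization overhead per yielded object.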

@nick-falco (Collaborator, Author) commented Aug 4, 2017

I also read that you can potentially force garbage collection by running memory-intensive work in a separate thread; once the thread finishes running, the memory is freed.

Do we know what is using so much memory? It seems like the first step is to figure out exactly what is causing the spike in memory usage, before figuring out a fix.

@derick-hess (Contributor)

I believe it is the yielded ObsPy Stream not being collected. When I run it on large data sets, memory looks like it jumps by the size of the ObsPy Stream every time. A little bit of it also looks like the das_t. I added a line in create_trace to free that memory every time it is done with a das_t, but it currently doesn't do anything since garbage collection is halted.

I also added code to clear all other tables when they are done, but those only amount to <10 MB of memory in total.

@derick-hess (Contributor)

I am going to test this by modifying the script on my local machine to immediately write out the trace object instead of yielding it, to see how that changes things and to get a better idea of what is going on.

@nick-falco (Collaborator, Author) commented Aug 17, 2017

I wonder if we have too many open HDF5 dataspaces that are causing a large memory leak:

The HDF5 docs state the following:

Excessive Memory Usage

Open Objects

Open objects use up memory. The amount of memory used may be substantial when many objects are left open. You should:

  • Delay opening of files and datasets as close to their actual use as is feasible.
  • Close files and datasets as soon as their use is completed.
  • If writing to a portion of a dataset in a loop, be sure to close the dataspace with each iteration, as this can cause a large temporary "memory leak".

There are APIs to determine if datasets and groups are left open. H5Fget_obj_count will get the number of open objects in the file, and H5Fget_obj_ids will return a list of the open object identifiers.

@rsdeazevedo Do you think this could be the case?
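
(For reference, those two calls are exposed through h5py's low-level API. PH5 itself goes through PyTables, so the snippet below is only an illustrative sketch, and the filename is a placeholder.)

import h5py
from h5py import h5f

f = h5py.File("master.ph5", "r")   # placeholder filename
print("open objects:", f.id.get_obj_count(h5f.OBJ_ALL))
for oid in f.id.get_obj_ids(h5f.OBJ_ALL):
    print(oid)                     # FileID / GroupID / DatasetID handles
f.close()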

@derick-hess (Contributor) commented Aug 18, 2017

I don't think that's the case. I now explicitly close and delete each table after it is read. Well, I try to at least, but it won't garbage collect and free the memory until the counter on the iterator is 0. Removing the yielding completely fixes the problem. I'm trying to get it to work while still yielding the final Stream object.

I think that will work once I rewrite the code that yields the stations to cut so that it no longer yields. I think the issue is happening because we have two iterators going at the same time.

I think we changed it to yield those in order to reduce the time until the first Stream object is yielded. This change will make it take a little longer to yield the first Stream object, but I'm going to try to minimize that.

That does remind me: I will also update ph5tostationxml to free the table memory as soon as it can.

@nick-falco (Collaborator, Author) commented Aug 18, 2017

Thanks Derick, the nested yielding could very well be the cause of the issue.

Another improvement I want to make is to have the ph5tostationxml.run_ph5_to_stationxml(sta_xml_obj) method accept a list of request objects for a given network. This will vastly speed up the repetitive POST requests that currently time out. If you are refactoring the ph5tostationxml.py module, please keep this in mind.

For example, currently the ObsPy Fed Catalog client makes POST requests, formatted like the example below, that can be hundreds of lines.

level = 'station'
YW 1001 * * <start-time> <end-time>
YW 1002 * * <start-time> <end-time>
YW 1003 * * <start-time> <end-time>
YW 1004 * * <start-time> <end-time>
... etc.

Long requests currently time out largely because we process each request line independently (repeating the work of extracting all stations/channels more than once). Adding support for lists of requests to the ph5tostationxml.py API will fix this problem, since the data for each requested network will only have to be read one time.
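
Very roughly, the shape I have in mind (StationRequest, parse_post_body, and a run_ph5_to_stationxml that takes a request list are all hypothetical, not the current API):

# Hypothetical sketch: parse the POST body into per-network request lists
# so a single batched ph5tostationxml call can serve all lines for a network.
from collections import defaultdict, namedtuple

StationRequest = namedtuple(
    "StationRequest", ["net", "sta", "loc", "cha", "start", "end"])

def parse_post_body(lines):
    by_network = defaultdict(list)
    for line in lines:
        if not line.strip() or "=" in line:   # skip blanks and key=value lines
            continue
        net, sta, loc, cha, start, end = line.split()
        by_network[net].append(StationRequest(net, sta, loc, cha, start, end))
    return by_network

# Then, per network, one hypothetical batched call such as
#   run_ph5_to_stationxml(sta_xml_obj, requests=by_network["YW"])
# would read the station/channel tables once and filter per request line.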

@derick-hess (Contributor) commented Aug 18, 2017

Okay, got ph5toms fixed. Going to add some more memory cleanup to see if I can get it even lower and make sure it's cleaning up everywhere possible. I'll create a PR in a bit.

The issue was the nested yields. Getting rid of the yield in the station-cut code allowed it to garbage collect after every Stream is yielded.
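
In outline, the change looks something like this (simplified stand-ins, not the actual ph5toms code):

def lookup_stations(request):        # stand-in for the station lookup
    return ["100A", "100B", "100C"]

def build_stream(station):           # stand-in for building an obspy Stream
    return bytearray(10 * 1024 * 1024)

# Before: nested generators. The inner generator's frame stays suspended, so
# objects it references are not freed while the outer loop is still running.
def create_cut_old(request):
    def stations_to_cut(req):
        for station in lookup_stations(req):
            yield station
    for station in stations_to_cut(request):
        yield build_stream(station)

# After: the station list is built eagerly and only the outer yield remains,
# so each yielded object can be collected as soon as the consumer moves on.
def create_cut_new(request):
    for station in list(lookup_stations(request)):
        yield build_stream(station)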

@nick-falco added this to the v4.0.2 milestone Aug 18, 2017
@derick-hess (Contributor)

Addressed in PR #108.
