Add multi-process support #66

Merged
merged 3 commits into master from multiproc on Oct 10, 2016
Conversation

brian-brazil (Contributor)

This works by having a shelve per process that is continuously
updated with metrics, and a collector that reads from them.

Histogram buckets need accumulation and special handling for _count,
and for gauges we need to offer a number of options.

Fixes #30

# This needs to be chosen before the first metric is constructed,
# and as that may be in some arbitrary library the user/admin has
# no control over we use an environment variable.
if 'prometheus_multiproc_dir' in os.environ:

Rather than hard-coding the choice of _ValueClass here, could the class be defined by the users and loaded from anywhere on the path? #70 could be implemented in a separate package entirely and be developed/updated on its own schedule. The shelf implementation can certainly ship with client_python, but users of other multi-process Python servers would be free to experiment without needing to land code here.

Something like the following, shamelessly stolen from django/utils/module_loading.py with error handling stripped dangerously:

from importlib import import_module
import os

module_path, class_name = os.environ.get('PROMETHEUS_VALUE_CLASS',
                                         'prometheus_client.core._MutexValue').rsplit('.', 1)
module = import_module(module_path)
_ValueClass = getattr(module, class_name)

Then if #70 was in its own package prometheus_uwsgi, you might start the server with PROMETHEUS_VALUE_CLASS=prometheus_uwsgi.UwsgiCacheValue, or with PROMETHEUS_VALUE_CLASS=prometheus_client.core.MultiProcessValue and prometheus_multiproc_dir=/path/to/multiproc_dir to use the shelf implementation described in this PR (removing the _ since it would become part of the public API).

Those _ValueClass implementations would be responsible for configuring themselves from the environment or wherever, which feels non-pythonic. Somehow configuring the system explicitly at process startup seems like the best path, and developers could pass in whatever class they want, but I haven't dug into the rest of the implementation enough to suggest what that API should look like.

brian-brazil (Contributor, Author)

This seems like a reasonable idea.

Somehow configuring the system explicitly at process startup seems like the best path

That won't work, as by the time your code is running all this code has likely already been imported and many metrics already exist. That's why I'm using environment variables, as they're set at exec() time before any python code has a chance to run.

That won't work, as by the time your code is running all this code has likely already been imported and many metrics already exist.

That's true assuming you want to set the _ValueClass once and only once and never have it specified in code. You could circumvent this with a client pattern or by parameterizing metric instantiation with a _ValueClass as well. I think it would make the interface nicer to use the environment variable by default, but also to allow:

REQUEST_TIME = Summary('request_processing_seconds', 'Time spent processing request', value_class=prometheus_client.ShelveValue)

or something of the sort.

@tlc commented Apr 11, 2016

What is the status of this? Is it going to be pulled?

I need the functionality and will have to download the branch if not.

Thanks!

@brian-brazil (Contributor, Author)

The core functionality is fine; however, the performance isn't good enough for production use. I haven't had the spare time to find a better backing store for it.

This works by having a shelve per process that is continuously
updated with metrics, and a collector that reads from them.

Histogram buckets need accumulation and special handling for _count,
and for gauges we need to offer a number of options.

This uses an mmapped file to store the data. Fsync is not called,
and concurrent readers are permitted (which some shelve backends
did not permit).

Microbenchmarks indicate 1.2-1.5us per instrumentation event,
compared to 3-13ms for the shelve solution: roughly three orders
of magnitude better.

The 1.2-1.5us is about double that of normal (non-multiprocess)
instrumentation, due to the second mutex required. Python doesn't
offer an RWLock, which would let this be reduced.

Overall this performance is acceptable: anyone doing something
extremely performance-critical with instrumentation wouldn't be
using Python anyway.
@brian-brazil brian-brazil deleted the multiproc branch October 10, 2016 15:33
@brian-brazil brian-brazil restored the multiproc branch October 10, 2016 15:33
@brian-brazil brian-brazil reopened this Oct 10, 2016
@brian-brazil brian-brazil merged commit ac77d9b into master Oct 10, 2016
@brian-brazil brian-brazil deleted the multiproc branch October 10, 2016 15:34
@jonashaag

FWIW, I started working on performance improvements to your branch last week and simply went with JSON instead of shelve/bsddb, which also gave an improvement of a few orders of magnitude. (I don't know the exact number right now, but it was somewhere around a 2x slowdown compared to non-multiprocessing.) Maybe that's an alternative that involves less custom code.

@brian-brazil (Contributor, Author)

That's about the same slowdown as this code, which is presumably dominated by the two mutexes in use (why does taking a mutex cost most of a microsecond?). Hopefully we'll get that faster at some point, but I doubt you managed it in much less than the 80 lines of code it took me.

@jonashaag

I doubt you got too much better than the 80 lines of code I managed it in.

Sorry, I didn't mean to say that your code was bad. Actually, the binary format is very simple and the code is easy to understand. My point is that if it doesn't make a difference in terms of performance, it may be more future-proof and pragmatic to simply go with JSON, if only so that future readers of the code don't have to learn a new serialisation format. It's not a big deal at this point, but it could become one if features are added to the serialisation format.

@brian-brazil (Contributor, Author)

Let's see where it goes. It's all self-contained, and we don't need to worry about backwards compatibility. The tricky stuff is all in the filenames anyway.
