New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Request for comments: python: a new plugin to use arbitrary Python as a data source #1337
Conversation
Interesting idea. Certainly deserves the request number it got ;-) |
Impressive work. I've also played around with calling into Javascript from mapnik using node.js (mapnik/node-mapnik#28), but ultimately the benefit (rapid prototyping) did not outweigh the risks (immediate crashes if used concurrently). However my implementation was much more basic than what you have come up with! I wonder if you could, upon datasource initialization, create a pool of python processes, to work around the potential gil bottleneck? Also, can you describe more what you think is causing the deadlock when loading XML? Is |
I'm not sure what's causing the deadlock as I'm not making use of XML loading currently and so haven't looked. Assuming there's not likely to be an immediate NACK to it, I'm willing to do the work to turn this into a 'proper' implementation. My gut feeling is that I'm probably double-acquiring the GIL at some point. The deadlock happens with a trivial test program and is almost certainly the fault of the somewhat 'sprinkle magic until the prototype worked' approach to acquiring the GIL I went for :). Creating a Python process pool is certainly possible although the 'right' solution would be to find a way to do that on the Python side I imagine. That way, at least, the interface is that of least surprise to the programmer. This could be done by making the interface on the Python side more asynchronous (allowing, e.g, concurrent.future process pools) rather than staying as faithful as possible to the C++ side. This would make 'naive' use of the plugin crash-safe but slow when used concurrently and multi-process aware use crash-safe and fast. Since there appears to be some appetite for this, I'll work on another rev of the plugin and update the pull request with my progress. |
@rjw57 - really cool! would be great to have this functionality |
OK. I've tidied up the patch, fixed the deadlock and added some rendering tests for the plugin. I've also added the functionality to pass (string) parameters into the Python datasource. I think this is now at a point where it can be considered for merging. Edit: I've rebased the commit for merging. Development can be seen in https://github.com/rjw57/mapnik/tree/rjw57-python-plugin-devel Edit 2: Fixed a stupid path bug with the tests. Any further development will now take place as separate commits. |
@@ -0,0 +1,34 @@ | |||
CXX = g++ | |||
|
|||
CXXFLAGS = $(shell mapnik-config --cflags) -I/usr/include/python2.7 -fPIC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense to either remove this Makefile or make generic the python version include?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'll remove
This plugin allows you to write data sources in the Python programming language. This is useful if you want to rapidly prototype a plugin, perform some custom manipulation on data or if you want to bind mapnik to a datasource which is most conveniently accessed through Python. The plugin may be used from the existing mapnik Python bindings or it can embed the Python interpreter directly allowing it to be used from C++, XML or even JavaScript. Mapnik already has excellent Python bindings but they only directly support calling *into* mapnik *from* Python. This forces mapnik and its input plugins to be the lowest layer of the stack. The role of this plugin is to allow mapnik to call *into* Python itself. This allows mapnik to sit as rendering middleware between a custom Python frontend and a custom Python datasource. This increases the utility of mapnik as a component in a larger system. There already exists MemoryDatasource which can be used to dynamically create geometry in Python. It suffers from the problem that it does not allow generating only the geometry which is seen by a particular query. Similarly the entire geometry must exist in memory before rendering can progress. By using a custom iterator object or by using generator expressions this plugin allows geometry to be created on demand and to be destroyed after use. This can have a great impact on memory efficiency. Since geometry is generated on-demand as rendering progresses there can be arbitrarily complex 'cleverness' optimising the geometry generated for a particular query. Obvious examples of this would be generating only geometry within the query bounding box and generating geometry with an appropriate level of detail for the output resolution.
) | ||
|
||
""" | ||
ctx = Context() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the Context should be a member of the PythonDatasource?. The shared pointer under the hood I guess should handle this automatically, but why not?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it, the Context is the place where the knowledge about what fields/labels will be present in the features is vested. wkb_features() is merely a convenience function to collect that information into a single function call when returning the features. You could make your own derived class which keeps the Context inside itself if you know ahead of time what the field names will be.
If the Context were placed into the PythonDatasource class itself, then you'd either have to pre-declare the fields you'll be using elsewhere when it is initialised or this convenience function would have to override you. This can of course be done but I thought collecting them together like this allows naive code to work without stopping someone doing it 'properly'.
I suppose the rationale for this is that the PythonDatasource class and wkb_features classmethod are just there to make it quick and 'Pythonic' to generate a Datasource. You could implement your own if you 'know better'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okay, sounds reasonable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've just had a quick look at some other plugins and, for example, the PostGIS plugin also creates one Context for each call to features() like the PythonDatasource.wkb_features() implementation does.
Edit: /me stops wittering now it's deemed reasonable :).
Request for comments: python: a new plugin to use arbitrary Python as a data source
@rjw57 - after the merge I'm seeing one test failure on os x 10.7. Perhaps the expected test image just needs updated? http://cl.ly/image/2N3N1O2J1E1F (actual is the image from /tmp)
|
@rjw57 - it would be great to avoid the test dependency on Shapely, simply to ensure the tests are easy to run on all systems. Mapnik itself has no dependency on GEOS (except for a plugin we plan to remove). Would you be amenable to re-writing the test to use just Mapnik? Obviously Shapely would be good to use in real-world example of plugin usage, and to that goal it would be great if you would add the plugin to https://github.com/mapnik/mapnik/wiki/PluginArchitecture and create a wiki page at https://github.com/mapnik/mapnik/wiki/Python-Plugin. |
@springmeyer I'll address these things over the weekend. Thanks. |
See pull request 1414 for removal of shapely dependency. I opened a new request because this one has already been merged. |
Wiki updated. |
Sorry, I'm new to all this and so this might not be the place to comment on this because this is CLOSED, but I've got a fail in the tests when installing this on CentOS 6 using nose 1.2.1 and Mapnik 2.1.0:
Links to mapnik-python-point-render1.png image: expected (from tests/python_tests/images/support) Thank you! |
Hi @jargylplath - those images looks basically the same so you are likely seeing a non-critical error. Are you trying to use the plugin and having trouble? |
Nope. Not even using the plugin, just saw the error and wanted to report it so that either the test or the code (depending on who is the culprit) could be revised in a future release. For my purposes, Mapnik continues to perform splendidly. ^_^ |
- now at https://github.com/mapnik/non-core-plugins - refs #2773, #1875, #1337,
Edit: This is now a candidate for merging.
This pull request is a RFC about whether this feature would be useful. It is not yet ready for merging. (Or, perhaps, it is but it needs a lot of work before it should be built by default. The current issues are noted in the README.md file.) I've posted to the mailing list about this as well since I'm not sure which is the best place to discuss feature requests like this.
This plugin allows you to write data sources in the Python programming language. This is useful if you want to rapidly prototype a plugin, perform some custom manipulation on data or if you want to bind mapnik to a data source which is most conveniently accessed through Python. Finally it is of utility if you have theoretically 'infinite' data sources which can be rendered in finite areas. The README.md file associated with this patch has the example of a data source which consists of infinite concentric circles centred on a point.
Mapnik already has excellent Python bindings but they only directly support calling into mapnik from Python. This forces mapnik and its input plugins to be the lowest layer of the stack. The role of this plugin is to allow mapnik to call into Python itself. This allows mapnik to sit as rendering middleware between a custom Python frontend and a custom Python datasource. This increases the utility of mapnik as a component in a larger system.
Please see the README.md file for more information.