I configured Mapnik master with ./configure RENDERING_STATS=True, picked a single tile at z13 with a bunch of data (http://tile.osm.org/13/4052/2688.png) and made a first crack at finding performance differences between this style and the osm.xml standard style.
I've found that this style is 1.5 to 2x slower than the osm.xml even though the layers are identical. The culprit is the usage of Carto attachments which generate more <Style> objects in XML and therefore trigger more queries against the database (many of which are less efficient because they pull data that is later thrown away).
The best example of the problem is the landcover layer. In the osm.xml it is a single layer and style, while in this style it is 9 separate styles such that 9 database queries are issued for the same exact data for a single layer. For the sample tile this meant osm.xml rendered landcover in 200 ms while this style rendered landcover in 1.4s. It looks like the cause of the attachment usage in this case was to work around this carto bug: mapbox/carto#20 as discussed in #15.
Other expensive queries like buildings and minor-roads also suffer slightly from the same problem:
This style renders buildings in 7.5 seconds using 2 styles while osm.xml renders with a single style in 3.5 seconds. The second style in question here is buildings-aeroway and renders no data, while over 100,000 rows are queried. So it makes complete sense here that rendering of buildings would be 2x as slow: actual rendering is only happening once, but the data is fetched twice before other layers can be rendered.
Also for minor-roads this style renders in 2s (vs 1.4s) and uses 2 styles (vs 1). The second style in question is minor-roads-fill-railway and for this tile ends up pulling 31654 features but only renders 871.
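To put the per-layer numbers above side by side, here is a minimal sketch that just does the arithmetic on the timings and feature counts quoted in this comment. The dictionary layout and the `slowdown` helper are my own illustration and are not part of Mapnik or carto.

```python
# Timings in seconds as reported above for the single z13 tile:
# (this carto style, osm.xml).
timings = {
    "landcover": (1.4, 0.2),
    "buildings": (7.5, 3.5),
    "minor-roads": (2.0, 1.4),
}


def slowdown(carto_s, osm_s):
    """Return how many times slower the carto style is than osm.xml."""
    return carto_s / osm_s


for layer, (carto_s, osm_s) in timings.items():
    print(f"{layer}: {slowdown(carto_s, osm_s):.1f}x slower")

# minor-roads-fill-railway: features fetched vs. features actually rendered.
fetched, rendered = 31654, 871
discarded_fraction = (fetched - rendered) / fetched
print(f"minor-roads-fill-railway discards {discarded_fraction:.1%} of fetched features")
```

On these numbers landcover is by far the worst offender (about 7x), and nearly all of the features pulled for minor-roads-fill-railway are thrown away.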
For reference, here is the output for the carto style (https://gist.github.com/4323399) and for the standard osm.xml (https://gist.github.com/4323395).
Awesome stuff - I didn't realise how much of an impact the attachment-workaround was having.
Perhaps it's not the actual fetching from the database that's the problem, so much as repeatedly iterating over the features? I seem to remember work on this at the Mapnik Hack Weekend in London a few years ago, and finding that mapnik-level caching of layer datasets didn't help much.
Still, it shows the obvious place to start for refactoring, in order to improve performance. Cheers!
Yes, attachments create multiple styles per layer and this is rarely a good thing unless you truly need to query the data more than once (like for road casings) or to get labels on top. In the case of landuse, in particular, attachments really are suboptimal because so many features are being thrown out.
To answer your question: I think it's both fetching and iterating, but particularly the former. And ya, caching features for reuse is rarely helpful because it blows up memory usage and refetching is usually just as fast. If you have a crazy complex and CPU intensive SQL query that does not return many rows and you want to render it more than once, that is likely a case where Mapnik level caching would be a big win. This landuse case is more of the opposite.
@gravitystorm (and others interested) - one way to temporarily work around #15 (and the only really significant performance difference) in the short term would be to patch in the old landcover style XML. So, just switch out the generated XML with the actual style XML for the landcover style from the old style. This is obviously not a long term solution, but as a short term fix, after this is done, I presume that performance overall would be much closer to the old style and therefore a bunch closer to being initially deployable. This also makes me wonder if carto.js would benefit from formalizing a way to have a style be pure XML. Not sure at all, but potentially worth some consideration.
I have started doing some initial generic performance comparisons between the two styles.
I took a small region (Karlsruhe, Germany and surroundings) and rendered all tiles in that region from Z14 - Z18 in both styles. I did this comparison on the OSM devserver (errol) on a planet-wide PostgreSQL 9.1 / PostGIS 2.0 database with the Mapnik 2.1 PPA packages. The command to render was
"./render_list --all -f -m carto -s ~/tiles/rednderd.sock --min-lon 8.1 --max-lon 8.6 --min-lat 48.9 --max-lat 49.2 -z 14 -n 4"
In these initial tests, there was still about a 25% slowdown. The OSM mapnik style took 571 seconds whereas the carto style took 724 seconds. As this was a small area systematically rendered, most of the database access was presumably from cache.
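For anyone wanting to compare further runs on the same basis, the relative slowdown from the two totals above can be worked out like this. It is only arithmetic on the figures quoted here, and the variable names are mine.

```python
# Total render times in seconds for the Karlsruhe region, as quoted above.
osm_seconds = 571
carto_seconds = 724

# Slowdown of the carto style relative to the OSM mapnik style.
relative_slowdown = (carto_seconds - osm_seconds) / osm_seconds
print(f"carto style is {relative_slowdown:.1%} slower")
```

That comes out slightly above the rounded 25% figure, at a little under 27%.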
I don't currently have any details of where the slowdown is occurring, but if these numbers are of use to anyone, I can try to do some more generic testing, e.g. comparing different regions or zoom levels.
Some testing on a set of tiles from yevaud's log at peak hours put carto at 22% slower when all postgres data is in cache (rendering a subset of the tiles) and 12% slower on a larger set which has to access the disk.
They also show that more time is spent on z13 than any other two zooms combined for both osm.xml and the carto port.
This disagrees with munin, which I cannot explain. Non-standard indices speeding it up?
In any case, z12 is actually faster per meta than z13 and there are far fewer low-zoom requests, so it's z13+ that needs optimizing. I've started work on seeing what layers contribute to the slowness.
Closing as these particular performance issues are old at this point
btw, I also removed the RENDERING_STATS=True option from upcoming Mapnik 3.x as I've got something better planned to help grab rendering stats at runtime: mapnik/mapnik#1956