Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

csv plugin doesn't compile with boost < 1.47 #1456

Closed
wants to merge 3 commits into
from

Conversation

Projects
None yet
2 participants
Contributor

lightmare commented Sep 1, 2012

... because it uses

mapnik::wkt_parser parse_wkt
mapnik::json::geometry_parser<std::string::const_iterator> parse_json;

whose definitions are preprocessed out with boost < 1.47.

It appears that spirit in boost 1.46 doesn't play well with ptr_vector. I rewrote the rules in wkt_grammar so that instead of returning a vector, they take a reference and append to it. This, apart from working with older boost versions, could also reduce the amount of allocations (I assume that originally when parsing a collection, each geom rule returned a ptr_vector, whose elements were then transferred to the collection, and the empty vector dropped; but I wouldn't bet on that, every now and again I'm learning how little I understand the inner workings of spirit).

Next I tried compiling with boost 1.42, and surprisingly succeeded after only commenting-out multi_pass iterator instantiation of json grammar (I guess it's only used by geojson plugin, which was disabled).

So much for the first commit, the other two are self-explanatory.

With these, tests/cpp_tests/csv_parse_test-bin passes with boost 1.46 and 1.42
(well it always passes, here I mean it didn't produce any parse errors for me)

There's a potential problem with appending geoms directly to the output, though. When parsing a multi/collection fails, some geoms may have already been appended. It's easy to fix with an extra local ptr_vector, I just don't know whether it's really a problem, or after a failed parse the vector is discarded anyway.

Owner

springmeyer commented Sep 3, 2012

thanks! I will look to test allocations and speed today or tomorrow and will let you know what I find. RE: partial appends before fails - yeah, that does not seem like a problem.

Owner

springmeyer commented Oct 5, 2012

took a look at this finally. wrote a small benchmark. I'm seeing that with the changes there is consistent slowdown of ~70 milliseconds - tested on osx 10.7 with boost 1.51. But I like the idea of cleanly supporting older boost. Hrm....

#include <mapnik/geometry.hpp>
#include <mapnik/wkt/wkt_factory.hpp>
#define BOOST_CHRONO_HEADER_ONLY
#include <boost/chrono/process_cpu_clocks.hpp>
#include <boost/chrono.hpp>

/*
clang++ -o test-wkt-alloc test-wkt-alloc.cpp `mapnik-config --cflags --libs` -L/opt/boost-51/lib -Wl,-undefined,dynamic_lookup
*/

using namespace boost::chrono;

int main() {
    boost::ptr_vector<mapnik::geometry_type> paths;
    mapnik::wkt_parser parse_wkt;
    std::string value("MULTIPOLYGON (((40 40, 20 45, 45 30, 40 40)),((20 35, 45 20, 30 5, 10 10, 10 30, 20 35),(30 20, 20 25, 20 15, 30 20)))");
    if (!parse_wkt.parse(value, paths)) std::clog << "failed to parse\n";
    int iterations = 10000;
    typedef process_cpu_clock clock_type;
    process_real_cpu_clock::time_point start = process_real_cpu_clock::now();
    for (int i=0;i<iterations;++i) {
        parse_wkt.parse(value, paths);
    }
    clock_type::duration elapsed = process_real_cpu_clock::now() - start;
    std::clog << "elapsed: " << boost::chrono::duration_cast<milliseconds>(elapsed) << "\n";
    return 0;
}
Owner

springmeyer commented Oct 5, 2012

ah, re: perf now seeing your note about maybe improving allocations (speed) for collections, so tested that now too.

"GEOMETRYCOLLECTION(MULTIPOLYGON (((40 40, 20 45, 45 30, 40 40)),((20 35, 45 20, 30 5, 10 10, 10 30, 20 35),(30 20, 20 25, 20 15, 30 20))),POINT(2 3),LINESTRING(2 3,3 4),MULTIPOLYGON (((40 40, 20 45, 45 30, 40 40)),((20 35, 45 20, 30 5, 10 10, 10 30, 20 35),(30 20, 20 25, 20 15, 30 20))),MULTIPOLYGON (((40 40, 20 45, 45 30, 40 40)),((20 35, 45 20, 30 5, 10 10, 10 30, 20 35),(30 20, 20 25, 20 15, 30 20))),MULTIPOLYGON (((40 40, 20 45, 45 30, 40 40)),((20 35, 45 20, 30 5, 10 10, 10 30, 20 35),(30 20, 20 25, 20 15, 30 20))))"

Using that geometry collection I see that current master's fastest run is 3728 milliseconds while with the patch we get a fastest run of 4007 milliseconds. Not much, but still measurably faster without the patch.

Contributor

lightmare commented Oct 5, 2012

Tested with boost 1.49 on debian squeeze, Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
First, there must be something wrong with the clock, I'm only getting timings rounded to 10ms.
I increased the number of iterations to 100k.

First run after make with patch 4730ms, later runs about 4330-4530ms.
First run after make without patch 4380ms, later runs about 4270-4570ms.
Now after several runs I'm getting about 4120ms from both.
These results are twisted by allocations (with 100k iterations it's like 7GB).

Adding paths.reserve(100) before start, and paths.clear() into the loop, I'm constantly getting 910-950ms from both versions.

Owner

springmeyer commented Oct 5, 2012

sorry, I'm an idiot. I was running in debug mode. However, in short, in master I feel like it would be better to only support recent boost unless there is a performance boost to this patch - which is seems not (from your timings). Would it be hard to maintain a branch with this fix?

Contributor

lightmare commented Oct 8, 2012

Shouldn't be hard, the grammar is not going to change, and the modifications are much simpler than my journey towards them :)

Owner

springmeyer commented Mar 13, 2013

not going to apply, boost is now already at 1.53.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment