Optionally parse list of tags to filter indexing by #25

karenzshea · 2018-01-24T23:25:09Z

Closes #24

First-pass implementation of filtering an OSM file by certain tags.

Remaining tasks

Figure out segfault
Write tests
Fix TODOs around optional handling
Figure out handling when no ways are returned
Allow loadOSMExtract to accept an array of files as well as one file string

daniel-j-h · 2018-02-09T21:21:38Z

src/extractor.cpp

+    {
+        tags_filter.add_rule(true, osmium::TagMatcher(line));
+    }
+    tagfile.close();


Remove; the ifstream gets closed automatically

daniel-j-h · 2018-02-09T21:22:05Z

src/extractor.cpp

+            throw std::runtime_error(strerror(errno));
+        }
+        ParseTags(tagfile);
+    }


This ending scope closes the ifstream (in its destructor)

daniel-j-h · 2018-02-09T21:23:31Z

src/extractor.cpp

+             std::strcmp(highway, "living_street") == 0 ||
+             std::strcmp(highway, "unclassified") == 0 || std::strcmp(highway, "service") == 0 ||
+             std::strcmp(highway, "ferry") == 0 || std::strcmp(highway, "movable") == 0 ||
+             std::strcmp(highway, "shuttle_train") == 0 || std::strcmp(highway, "default") == 0);


What about creating a function taking a vector of tags and checking all of them?

Or better: put all the tags here into a hash set and do constant-time hset.count(highway) > 0 checks (count on a set returns only 0 or 1)
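
A rough sketch of the hash set suggestion, assuming std::unordered_set is acceptable here. The function name is_listed_highway is hypothetical, and the set holds only the values visible in the diff excerpt above; the condition starts before the excerpt, so the real list is longer.

```cpp
#include <string>
#include <unordered_set>

// Hypothetical helper. The values are only those visible in the reviewed
// excerpt; the full condition in the PR contains more entries.
bool is_listed_highway(const char *highway)
{
    static const std::unordered_set<std::string> listed = {
        "living_street", "unclassified", "service", "ferry",
        "movable",       "shuttle_train", "default"};

    // count() on a set returns 0 or 1, so the check compares against 0.
    return highway != nullptr && listed.count(highway) > 0;
}
```

The null check stands in for whatever guard the surrounding code applies before the strcmp chain; a way without a highway tag would otherwise pass a null pointer to the std::string constructor.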

danpat

Still needs:

Changelog
package.json bump
Maybe some docs on the format for the tagfilter file (in the README?)

danpat · 2018-02-14T19:33:28Z

example-server.js

+      async.each(wayIds, (way_id, next) => {
+        if (way_id)
+        {
+          annotator.getAllTagsForWayId(way_id, (err, tags) => {


Should probably handle err here?

danpat · 2018-02-14T19:35:42Z

src/extractor.cpp

+    while (std::getline(tagfile, line))
+    {
+        tags_filter.add_rule(true, osmium::TagMatcher(line));
+        std::cout << "tag added: " << line << std::endl;


Some logging is done with cout and some with cerr - should probably make this consistent so that upstream usage doesn't have to jump through hoops to capture everything (cout is usually stdout and cerr is usually stderr).

Ok, does it make sense then to switch to using cout for info type log lines and cerr for error specific messages?

This is a good writeup on the philosophy of stderr vs stdout:
https://www.jstorimer.com/blogs/workingwithcode/7766119-when-to-use-stderr-instead-of-stdout

TL;DR - normal messages from the program, and things explicitly requested by adding parameters should go to stdout, out-of-the-ordinary messages and errors should go to stderr.

danpat · 2018-02-14T19:35:59Z

src/extractor.cpp

+    // add tags to tag filter object for use in way parsing
+    if (!tagfilename.empty())
+    {
+        std::cerr << "Parsing " << tagfilename << " ... " << std::flush;


Same as above, should be consistent with which stream we log to.

danpat · 2018-02-14T19:40:25Z

src/extractor.cpp

+    }
+}
+
+Extractor::Extractor(const std::string &osmfilename, Database &db, const std::string &tagfilename)


I think it's probably time to refactor the constructors so the code isn't repeated multiple times - I already felt bad about the duplication in the existing constructors. Want to have a go at rejiggering these so we don't have 3 copies of almost-identical logic?
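
A sketch of one possible refactor using C++11 delegating constructors, under the assumption that the existing constructors differ only in how their arguments are packaged. ExtractorSketch is a hypothetical stand-in that records its inputs instead of reading OSM data; the real constructor signatures may differ.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for Extractor, for illustration only.
struct ExtractorSketch
{
    // Full constructor: the only place the shared setup lives.
    ExtractorSketch(std::vector<std::string> osmfilenames, std::string tagfilename)
        : files(std::move(osmfilenames)), tagfile(std::move(tagfilename))
    {
        // Shared setup (parsing the tag file, reading each OSM file) would go here.
    }

    // Single file, no tag filter: delegates instead of repeating the setup.
    explicit ExtractorSketch(const std::string &osmfilename)
        : ExtractorSketch(std::vector<std::string>{osmfilename}, std::string{})
    {
    }

    // Single file plus a tag filter file: also delegates.
    ExtractorSketch(const std::string &osmfilename, const std::string &tagfilename)
        : ExtractorSketch(std::vector<std::string>{osmfilename}, tagfilename)
    {
    }

    std::vector<std::string> files;
    std::string tagfile; // empty means "no filter"
};
```

Each convenience constructor only normalises its arguments, so a change to the shared setup has to be made in one place.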

danpat · 2018-02-14T19:46:18Z

src/extractor.hpp

@@ -22,6 +25,7 @@ struct Extractor final : osmium::handler::Handler
     * @param d the Database object where everything will end up
     */
    Extractor(const std::string &osmfilename, Database &d);
+    Extractor(const std::string &osmfilename, Database &d, const std::string &tagfilename);


Instead of an additional constructor, maybe just have a single:

Extractor(const std::string &osmfilename, Database &d, const boost::optional<std::string> &tagfilename = boost::none);

I think we've already got all the needed Boost libraries available.

danpat · 2018-02-14T20:28:34Z

src/extractor.hpp

    void way(const osmium::Way &way);
+    void ParseTags(std::ifstream &tagfile);


These two new methods might be better off being private (or at least protected). Not sure we'd ever want to call them from outside the class constructor, they're mostly just internal helpers.

danpat · 2018-03-21T05:33:45Z

@karenzshea I've rebased this on master, which has pulled in the changes from #30.

I've tagged and published 0.1.0-rc3 from this branch.

There's only one outstanding question: should all tags from a matching way be indexed, or only the tags indicated by the filter?

Currently, master separates the "tags to use to filter ways" from the "tags to store in the index". I've duplicated that logic on this branch, but it caused one test to fail:

https://github.com/mapbox/route-annotator/pull/25/files#diff-879eb6ab79a85f13ac84c620132435e1R100

I've commented it out so that tests pass.

If you're OK with this behaviour (only store the tags that we mention in the filter), then I think this PR is ready to merge to master, and we can release 0.1.0 from it.

If you think we should index all tags on matching ways, then this if statement:

https://github.com/mapbox/route-annotator/pull/25/files#diff-3cb06dc074df26ca8f412b291b6902ccR171

should be removed, and we should index all tags. For the 0.1.0 release, I don't think it matters if we only index tags listed by the filter; we can certainly make this more flexible in a future release.
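
The two behaviours being weighed can be sketched as follows. This is a simplified model, assuming the filter matches on tag keys only; the PR builds its filter from osmium::TagMatcher rules, which may be richer than that. Tag and tags_to_index are hypothetical names.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical representation of one key/value tag on a way.
using Tag = std::pair<std::string, std::string>;

// Hypothetical helper: given the tags of a way that already matched the filter,
// return the tags that would be stored in the index.
// index_all_tags == false keeps only keys named in the filter;
// index_all_tags == true keeps every tag on the way.
std::vector<Tag> tags_to_index(const std::vector<Tag> &way_tags,
                               const std::set<std::string> &filter_keys,
                               bool index_all_tags)
{
    std::vector<Tag> stored;
    for (const auto &tag : way_tags)
    {
        if (index_all_tags || filter_keys.count(tag.first) > 0)
        {
            stored.push_back(tag);
        }
    }
    return stored;
}
```

In this model, removing the if statement mentioned above corresponds to always passing index_all_tags as true.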

karenzshea · 2018-03-21T12:11:18Z

@danpat I'm OK with the behavior of only storing tags that are in the filter. I only wrote the test that way to check that the way I anticipated was being detected.

Thanks for rebasing this after the other PR was merged!

karenzshea force-pushed the tags-filter branch 3 times, most recently from 54c0e7b to 14d72b1 Compare February 5, 2018 21:57

daniel-j-h reviewed Feb 9, 2018

karenzshea requested review from danpat and TheMarex February 9, 2018 23:17

karenzshea force-pushed the tags-filter branch from 465ba7d to 9fe9371 Compare February 13, 2018 15:57

karenzshea self-assigned this Feb 14, 2018

danpat suggested changes Feb 14, 2018

karenzshea force-pushed the tags-filter branch 2 times, most recently from 9e20470 to a8cae2a Compare February 20, 2018 14:35

karenzshea force-pushed the tags-filter branch 2 times, most recently from 9717d38 to 840df60 Compare March 9, 2018 15:41

karenzshea removed the request for review from TheMarex March 9, 2018 16:53

danpat mentioned this pull request Mar 16, 2018

Various performance improvements #30

Merged

parse list of tags to filter by

55e3b15

danpat force-pushed the tags-filter branch from 009c8c7 to 55e3b15 Compare March 21, 2018 05:11

danpat added 2 commits March 20, 2018 22:12

Bump to rc3 after rebasing on master.

10110f1

0.1.0-rc3 [publish binary]

d2c0479

danpat approved these changes Mar 21, 2018

bump package.json and add changelog entry

a01b8ca

karenzshea merged commit 913af6e into master Mar 21, 2018

karenzshea deleted the tags-filter branch March 21, 2018 13:02
