Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix osm_line2poly #29

Open
mpadge opened this issue Oct 5, 2017 · 17 comments
Open

fix osm_line2poly #29

mpadge opened this issue Oct 5, 2017 · 17 comments
Labels

Comments

@mpadge
Copy link
Member

mpadge commented Oct 5, 2017

@richardbeare I've gone through most of your code and vignette with this commit. Note the following in relation to your vignette:

  1. osmdata::opq() now accepts explicit timeout (and memsize) arguments, but these actually aren't needed here. It should just work as is, and does for me, so no need to distract readers of vignette with that stuff
  2. I've removed most of the data you previously had (all of the highways), because this made the plot too large to include in an R package. We really only need the coastal data to illustrate the functionality, and I actually think this really aids the focus of the whole thing.
  3. This should actually make it small enough to include as data within the package, thereby enabling the vignette to actually be run without the graphs.

Now to the issue here which I'd love your help with:

tidy line2poly.R

I'm guessing you don't work with a code linter? Most of the changes I've made are just standard lintr suggestions, with a few extra personal things (more whitespace; vertical alignment of curly braces). I also put all separate functions out into the main function space. However, in it's current form this function no longer works. Can you just have a fish around to ensure that it all works. I think one problem is with the bbox interpretation - you often presume it has a fixed, matrix form with names rows and columns, yet there are no checks for this. Could you just write a small function to force any differently-formed bbox args into this form? (Or some equivalent way of dealing with this). Thanks in advance, and thanks for all the help here - very much appreciated!

@mpadge mpadge added the must do label Oct 5, 2017
@richardbeare
Copy link
Contributor

I'll look into it. I haven't used lintr in the past. I normally conform to many of the standard recommendations, but don't agree with all of them. I was copying your other code as far as brackets were concerned, which is where most of my lint warnings are coming from. I'll check into the scoping stuff. I try to isolate functions that aren't going to be used anywhere else inside the function using them, unless they get too big. I'll see if there's a sensible balance.

@mpadge
Copy link
Member Author

mpadge commented Oct 6, 2017

I'm entirely sympathetic with isolating functions inside other functions, but for better or worse that runs contrary to what rOpenSci expects and encourages. Putting them outside helps avoid messy implicit scoping, of which I found a few instances once i moved them outside. In short: Please avoid nesting function defs.

@richardbeare
Copy link
Contributor

richardbeare commented Oct 6, 2017 via email

@mpadge
Copy link
Member Author

mpadge commented Oct 6, 2017

I put this question out to the rOpenSci slack group:

Is there an rOpenSci stance on nested functions? I've got a PR with lots of nested fns and have asked for them to be un-nested, but rightfully got asked in return: Why? And got asked more pointedly whether there was any official documentation/justification for avoiding nests (other than scoping, which is a non-issue here)?

so far only reply suggests "Don't see a clear answer", but it's still the wee hours for most. I'll let you know of any more responses.

@richardbeare
Copy link
Contributor

richardbeare commented Oct 6, 2017 via email

@mpadge
Copy link
Member Author

mpadge commented Oct 7, 2017

Yeah, it's an interesting minefield all this stuff, and at the end all one can do is be opinionated one way or the other. Which made me realise that osmplotr didn't have a CONTRIBUTING.md file, so I've copied the one from bikedata. Check it out - I really do pretty much entirely share your opinions, especially with regard to vertical alignment of curly braces. The only place C++ can't translate directly to R is

if ()
{
}
else
{
}

I would prefer that but R will only interpret it with

if ()
{
} else
{
}

As for nesting - it seems very clear that there is no clarity, so that's also just a matter of opinion, of which i'd have just two:

  1. Avoiding nesting absolutely ensures no implicit scoping, which is a good thing
  2. I like to document all fns in a consistent and explicit way, which means doxygen/roxygen comments throughout, yet these are only highlighted when not indented. That means comments for nested fn defs are not highlighted the way non-nested fns are. Alternative solution would be to customise my (vim) syntax highlighter, but much easier to just avoid indentation.

Thanks for opening this can o'worms!

@richardbeare
Copy link
Contributor

richardbeare commented Oct 7, 2017 via email

@mpadge
Copy link
Member Author

mpadge commented Oct 7, 2017

I still get an error on this:

> bbox <- osmdata::getbb ("greater melbourne, australia")
> coast <- opq (bbox = bbox) %>%
      add_osm_feature (key = "natural", value = "coastline") %>%
      osmdata_sf (quiet = FALSE)
> coast_poly <- osm_line2poly (coast$osm_lines, bbox)
Error in FUN(X[[i]], ...) : 
  unused argument (V = c(53, 39, 4, ...

Without even worrying about exactly why this happens, I note that the lapply that causes it is the one on current L#84:

linkorders <- lapply (startidx, function (X) unroll_rec (X), V = m2)

but startidx is a simple vector, so there's no need for an lapply. Simply unroll_rec (startidx, V = m2) will give the desired result, right? Same for next line, because linkorders is a vector, not a list. Also a couple of lines later, links <- lapply (linkorders, ...) is also just sub-setting a matrix as headtail [linkorders, , drop = FALSE]. And the call after that of,

links <- lapply (links, function (X) lookup_ways (X), g)

is also just

links <- lookup_ways (links, g)

... and so on. I've accepted the PR as you've seen, but could you please re-do it and vectorise all fns where possible, and ensure that the above code works. Thanks!

@richardbeare
Copy link
Contributor

The lapply calls are there for cases when there are multiple of what I've called "chains". This requires more than a single lookups, and the results can be variable length, thus the need for a list. I think this happens in the greater Melbourne example, with one or more of the islands being constructed from more than one way and the mainland coast being made of lots of other ways.

My memory is already struggling, but I'll check again. Pretty sure if wasn't simple all the time. Will be neater if it is.

The most complex/least intuitive part is the "unroll_rec", which follows a trail of indexes. I couldn't think of a way of doing this with functional programming tools.

@mpadge
Copy link
Member Author

mpadge commented Oct 8, 2017

okay, i see. Then there are a few options, but the easiest is likely a simple

if (!is.list (startidx))
  startidx <- list (startidx)

Then the lapply calls will work in effective vectorised form. However, startidx can only be a vector, i'd think, so i don't see where these lists arise. Can you dig out two contrasting examples of simple (vector) and non-simple (list) results so we can unpick this a bit better?

@richardbeare
Copy link
Contributor

richardbeare commented Oct 8, 2017 via email

@richardbeare
Copy link
Contributor

It turns out that the greater melbourne map is a surprisingly good example of the odd formats! The western edge of the bbox cuts through the Bellarine Peninsula, and that leads to two groups of "ways", which can be observed by placing a breakpoint inside osm_line2poly as follows:

coast_poly <- osm_line2poly (coast$osm_lines, bbox)

Browse[2]> startidx
[1] 15 30

Browse[2]> linkorders
[[1]]
 [1] 15 71 56 37 36 48 49 47 46 45 10 11 12 13 14 73  2 39 38  9 40 41  3  4  5  6 19 27 17 26 25 23 21 28 18 24 22
[38] 20 16

[[2]]
 [1] 30 32 31 35 29 33 44 34 58 61 60 59 55 54 63 50 43 51 52  1 53

Presumably the short one is the Bellarine peninusla part and the long one is the rest of the mainland.

The next step removes those ways that have been linked from the original set, because those that aren't grouped must form closed polygons:

        head_tail <- head_tail [-unlist (linkorders), , drop = FALSE] #nolint

So, apart from tidying up the function that follows indexes (unroll_rec), I can't see an alternative to lapply that doesn't involve loops.

I'm still looking for a functional programming way of doing the unrolling, but only because I think there's a neat trick I'm missing.

@mpadge
Copy link
Member Author

mpadge commented Oct 10, 2017

great stuff! Thanks so much for all the work here

@richardbeare
Copy link
Contributor

I've done some thinking about the unrolling functions - probably best to go back and do them with loops. I have the prototypes ready when you have integrated the other changes to your satisfaction.

@mpadge
Copy link
Member Author

mpadge commented Oct 10, 2017

Please just send another PR as soon as you can - I've already accepted the last one, so am just waiting for the improved version of the unrolling functions

@homer3018
Copy link

Hello there,
I've just bumped into the same issue you're mentioning while trying to transform some coast lines into polygon so that it can be filled.
Is there any chance someone could have another look at that ?
Anyhow, good work there! Thanks a lot.

@richardbeare
Copy link
Contributor

There is some very ancient work in these branches

https://github.com/richardbeare/osmplotr/tree/SFExperiments
https://github.com/richardbeare/osmplotr/tree/SFSeaLand

There's a lot of work involved in reliably doing water/land shading, and the quickest way to get something reliable was to pull in the power of simplefeatures, allowing reliable intersection computation and so on. Our original experiments were rapidly heading towards reimplementing a lot of functionality that is now in sf, so building on sf is the way to go.

It was a long time ago, so I don't remember the details, but there should be something you can work on in those two branches.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants