Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use lwgeom::st_astext() #800

Closed
wants to merge 2 commits into from

Conversation

etiennebr
Copy link
Member

This accelerates conversion to WKT and incidentally makes printing geometries (which uses WKT quite a lot) much faster. I don't think we can get more effect for such a small effort. I just bind the lwgeom function to get a significant improvement for very cheap.

This PR is conditional on r-spatial/lwgeom#25.

Comparisons

Outputs are similar

Note that lwgeom::st_astext spares the spaces between the geometry type and parenthesis (<GEOMETRY> () as well as after commas. I find it's an acceptable trade-off and makes the output even more compact.

# (Note that it uses `st_as_text` before 95edae7)
nc <- st_examples("nc.shp")  # from PR#633
expect_equal(
  # match spaces preceded of a comma and spaces followed by an opening parenthesis
 # to make the two formats similar 
  gsub("(?<=,)\\s|\\s(?=\\()", "", st_as_text(nc$geometry), perl = TRUE),
  lwgeom::st_astext(nc$geometry, 7)
)

One other thing to mention is that the precisionargument of st_as_text() is not comparable to the precision attribute of an sfc, which can be confusing since they share the same name. The precision argument is in fact the length of the whole output, not only of the decimal portion. This can be disturbing when dealing with coordinates that have very different values in x and y, but is no different than the previous behavior. Furthermore, I find that having an explicit and deterministic precisionargument (that could be named differently) makes it easier to control the output.

# (Note that it uses `st_as_text` before 95edae7)
lwgeom::st_astext(st_sfc(st_point(c(1.01, 333333.01))), precision = 6)
#> "POINT(1.01 333333)"
st_as_text(st_sfc(st_point(c(1.01, 333333.01))))
#> POINT (1.01 333333)
lwgeom::st_astext(st_sfc(st_point(c(1.01, 333333.01))), precision = 8)
#> "POINT(1.01 333333.01)"

Much faster

# (Note that it uses `st_as_text` before 95edae7)
microbenchmark::microbenchmark(st_as_text(nc$geometry), lwgeom::st_astext(nc$geometry))
#> Unit: milliseconds
#>                            expr        min         lq       mean     median         uq        max neval
#>         st_as_text(nc$geometry) 234.434879 237.065610 240.827458 239.061423 241.509103 316.328422   100
#>  lwgeom::st_astext(nc$geometry)   4.676498   4.819702   4.987154   4.960423   5.024772   9.740585   100

As always, I'm happy to make changes.

Closes #703, #747

@etiennebr
Copy link
Member Author

I guess we'll need to wait for r-spatial/lwgeom#25 to reach CRAN before merging. Or is the dev version on github somewhat available from travis and appveyor?

@edzer
Copy link
Member

edzer commented Aug 29, 2018

Re: precision ambiguity: it makes more sense to name this argument digits in lwgeom::st_astext, and explain in the docs that it maps to postgis::st_astext precision, but has nothing to do with GEOS' (and sf's) precision. This will be more familiar to R users and avoid confusion.

edzer added a commit that referenced this pull request Dec 7, 2018
@edzer
Copy link
Member

edzer commented Dec 7, 2018

st_astext is now in lwgeom 0.1-5, which is on CRAN. The good news is that e.g. the weird shape2 of #901 prints in about 1 sec, instead of 20 sec. The bad news is that checking sf now needs 1.5 hrs instead of 3 mins, and I have no clue why. Changed output (many!) indicate something weird goes on with digits. With this commit, put st_astext on with setting env variable LWGEOM_WKT to true. Run check e.g. with

LWGEOM_WKT=true R CMD check sf_*gz

@etiennebr
Copy link
Member Author

Could it be cyclic namespace dependency detected when loading ‘sf’, already loading ‘lwgeom’, ‘sf’?

@edzer
Copy link
Member

edzer commented Dec 11, 2018

How do I trigger that?

This accelerates conversion to WKT and incidentally makes priting much faster
* Integrates 90821d8 from @edzer
@etiennebr
Copy link
Member Author

I squashed all the commits and rebased (I don't know how to pull the edits you made from the pull request). So we have access to the same (buggy) pull request now.

When building from 9d093bd with

devtools::build()

I get

[...]
** byte-compile and prepare package for lazy loading
Error in loadNamespace(i, c(lib.loc, .libPaths()), versionCheck = vI[[i]]) : 
  cyclic namespace dependency detected when loading ‘sf’, already loading ‘lwgeom’, ‘sf’
ERROR: lazy loading failed for package ‘sf’

When running devtools::document(roclets=c('rd', 'collate', 'namespace'))

I get this warning

#> Warning message:
#>     object 'st_as_sfc' not found whilst loading namespace 'lwgeom' 

It's a weird coincidence that it searches for st_as_sfc and that st_as_sfc.character follows st_as_text.sfc in wkt.R. Maybe it's not the right track, but maybe it's something to do with roxygen? There is probably something with the requireNamespace as well, but I don't know why?

@etiennebr
Copy link
Member Author

Sorry @edzer, forget about my last messages I was confused. I see you merged the changes to master without merging the pull request, so I mixed up versions when rebasing the branches.

I see what you mean about the build time. It was around 30 minutes before 2166ebf and now is up to 60 minutes. Could it be something with caching? I don't have any issue building master (5a48393) on my machine; it runs in a reasonable amount of time.

@etiennebr
Copy link
Member Author

etiennebr commented Dec 14, 2018

To expand a bit as I'm searching, the check time for LWGEOM_WKT=false or true are the same for me locally, however some tests fail when using LWGEOM_WKT=false because the WKT are slightly different (we'd need to adapt the regex's in these cases)

  1. Failure: MtrxSet is being called (@test_sfg.R#9) 
  2. Failure: st_multilinestring works (@test_sfg.R#27) 
  3. Failure: format works (@test_sfg.R#44) 
  4. Failure: format works (@test_sfg.R#45) 
  5. Failure: format works (@test_sfg.R#48) 
  6. Failure: well-known text (@test_wkt.R#5) 
  7. Failure: well-known text (@test_wkt.R#7) 
  8. Failure: well-known text (@test_wkt.R#12)  

Since travis build is passing, I deduce that it's not using LWGEOM_WKT, so the issue with check time is maybe related to something else?

@etiennebr
Copy link
Member Author

I'll work on it during the dev day. @batpigandme, take note!

@batpigandme
Copy link

@etiennebr I can't add the tag because I don't have access to this repo, but the tag is tidy-dev-day :nerd_face: and the colour is #CBBAB8

@etiennebr etiennebr added the tidy-dev-day 🤓 Tidyverse Developer Day rstd.io/tidy-dev-day label Jan 19, 2019
@etiennebr
Copy link
Member Author

@batpigandme, can you now see the 🤓 label?

@batpigandme
Copy link

@etiennebr nailed it!

etiennebr added a commit to etiennebr/sf that referenced this pull request Jan 19, 2019
@etiennebr etiennebr mentioned this pull request Jan 19, 2019
@etiennebr
Copy link
Member Author

Closing in favor of #957

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tidy-dev-day 🤓 Tidyverse Developer Day rstd.io/tidy-dev-day
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants