Accelerate printing #747
- accelerate wkt writing
- accelerate `print` for `sf` objects
- use the ellipsis rather than three dots

close r-spatial#703
I like the ellipsis, but it now does print a whole lot of digits:
> nc$geom
Geometry set for 100 features
geometry type: MULTIPOLYGON
dimension: XY
bbox: xmin: -84.3 ymin: 33.9 xmax: -75.5 ymax: 36.6
epsg (SRID): 4267
proj4string: +proj=longlat +datum=NAD27 +no_defs
First 5 geometries:
MULTIPOLYGON (((-81.4727554321289 36.234355926513…
MULTIPOLYGON (((-81.2398910522461 36.365364074707…
MULTIPOLYGON (((-80.4563446044922 36.242557525634…
MULTIPOLYGON (((-76.0089721679688 36.319595336914…
MULTIPOLYGON (((-77.2176666259766 36.240982055664…
and also no longer respects e.g. `options(digits = 3)` to reduce those. Given that there's hardly anything we want to print, giving up digits control is a pretty big loss; there must be a simpler way to gain these 30 seconds.
Thanks for the feedback, I agree. I'll look into it.
@etiennebr maybe something along the lines of this: https://github.com/JanMarvin/sf/commit/4a595ee9282a02f7e7c94d6f28d7fda290ea4842 Though in general I'm in favor of ignoring the values (as data.table would do) or converting only a few selected values. Converting nested lists within lists will never be fast. I just came across this pull request, as I was bugged by the slow print a little while ago.
@JanMarvin thanks for the suggestion. My issue with MACROAREA geom
1 CENTRE MULTIPOLYGON (((( 625766 4754…
^ Which is why I was trying to use I just did a quick test with
I was going for a quick fix for now, but maybe it's more complex than I thought. I think once we have the right behavior it will be worth using C++ for the bottlenecks.
For the whitespace we could add
I also think that printing the first few coordinates only should be the way to solve this issue; I just didn't come up with a bug-free way of doing this, so went back to printing all.
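The "first few coordinates only" idea can be sketched in base R. This is a minimal illustration, not code from sf or from this pull request: `wkt_preview` and its arguments are hypothetical names, and it assumes the geometry is available as a plain two-column numeric matrix. Only the first `n` rows are ever formatted, and `digits` defaults to `getOption("digits")`, so `options(digits = 3)` would still be respected.

```r
# Hypothetical helper (not part of sf): preview a geometry by formatting
# only its first `n` coordinate pairs instead of the whole geometry.
wkt_preview <- function(coords, type = "LINESTRING", n = 3,
                        digits = getOption("digits")) {
  stopifnot(is.matrix(coords), is.numeric(coords))
  if (nrow(coords) == 0) return(paste(type, "EMPTY"))
  k <- min(n, nrow(coords))
  head_rows <- coords[seq_len(k), , drop = FALSE]
  # Format each number on its own so `digits` applies per value.
  pairs <- apply(head_rows, 1, function(p) {
    paste(vapply(p, format, character(1), digits = digits), collapse = " ")
  })
  body <- paste(pairs, collapse = ", ")
  # Close the parenthesis only when nothing was cut off.
  end_mark <- if (k < nrow(coords)) "\u2026" else ")"
  paste0(type, " (", body, end_mark)
}

# Example with three points taken from the output quoted above.
m <- cbind(x = c(-81.4727554321289, -81.2398910522461, -80.4563446044922),
           y = c(36.234355926513, 36.365364074707, 36.242557525634))
preview <- wkt_preview(m, n = 2, digits = 3)
print(preview)
```

Because the cost depends on `n` rather than on the size of the geometry, a sketch like this stays cheap for large features; handling nested geometry types such as MULTIPOLYGON would need extra unwrapping that is not shown here.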
Forgot to push a follow-up commit, which fetches the first three coordinates for printing. Should be easy to integrate an if clause to check for
Closing in favor of #957
@edzer, this is a first attempt to accelerate printing of geometries. For me, it really boosted the speed. Using the example from #703, a `tbl_df` (from `read_sf`) printed in around 30s, now below 3s. A `data.frame` prints also way faster.

There were not many tests for printing, so I don't know if it actually breaks something, but it seems fine from what I could test.
close #703, related #713