Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

st_write(obj, "file.shp", update = TRUE) gives file name "file.shp.shp" #307

Closed
eivindhammers opened this issue Apr 18, 2017 · 17 comments
Closed

Comments

@eivindhammers
Copy link

Windows 7 64 bit, MRO 3.3.2, latest Github version of sf.

nc = st_read(system.file("shape/nc.shp", package="sf"))
st_write(nc, "test.shp")
st_write(nc, "test.shp", update = TRUE)
list.files(pattern = "test.shp")

returns

[1] "test.shp"     "test.shp.dbf" "test.shp.prj" "test.shp.shp" "test.shp.shx"

where only "test.shp" was expected.

@eivindhammers eivindhammers changed the title st_write(obj, "file.shp", update = TRUE) gives extra "shp" st_write(obj, "file.shp", update = TRUE) gives file name "file.shp.shp" Apr 18, 2017
@edzer
Copy link
Member

edzer commented Apr 19, 2017

I'm afraid your expectations were too optimistic. sf still has no support for deleting data sources before writing.

@eivindhammers
Copy link
Author

Well then isn't this description of the update option in the documentation slightly misleading?

For non database-type drivers that save a single layer (e.g. ESRI Shapefile and GeoJSON) this is roughly equivalent to overwrite in functions such as writeRaster from the raster package.

For reference, overwrite in writeRaster:

overwrite: Logical. If TRUE, "filename" will be overwritten if it exists

In any case, is the addition of an extra .shp in the filename intentional, and if so, why?

@edzer
Copy link
Member

edzer commented Apr 20, 2017

This description comes from @Robinlovelace , see #274, and might still reflect the same optimism that update would automatically overwrite. Noone has implemented a delete operation (which gdal supports) so far; rgdal::writeOGR has it.

The addition of the extra .shp is what GDAL does; see http://www.gdal.org/drv_shapefile.html

You might get further by using the dsn, layer and driver arguments, but I don't see how we can get st_write cause the removal of shapefiles before writing.

@tim-salabim
Copy link
Member

if (file.exists(dsn) && update) unlink(dsn) ?
That's what I do in functions that would otherwise fail. Though this is not quite correct for multi-file types such as .shp, GDAL seems to have no problems with overwriting the associated files if they exist. Though I have not extensively tested this!

@rsbivand
Copy link
Member

This approach has typically had issues on Windows platforms - if you care to look at rgrass7 and earlier attempts to unlink files on Windows that may have been owned by other processes. You need to use the GDAL functions, as in rgdal, and be very careful with destroying a whole directory when dsn is not a file. It is not at all simple to get right cross-platform.

@edzer
Copy link
Member

edzer commented Apr 20, 2017

Please also go through #274 to understand why such a solution can't be part of sf.

@tim-salabim
Copy link
Member

@rsbivand thanks for sharing this insight! I'll make sure to test whether dsn is a directory before unlinking.
file_test("-d", dsn) would be sufficient I assume.
@edzer I agree that a GDAL based solution should be the way to achieve this. I personally have no problems with handling this myself.

@eivindhammers
Copy link
Author

@tim-salabim I applaud any effort to implement this! In the meantime, should the docs be made more consistent with actual behavior?

@tim-salabim
Copy link
Member

@eivindhammers don't misunderstand me, I am not planning to implement this in sf. I merely mean that I have no problems with handling this in my own scripts/functions.

@Robinlovelace
Copy link
Contributor

Robinlovelace commented Apr 20, 2017

Just to say: yes that was me who wrote the demonstrably incorrect documentation. Apologies. I did not test on Windows and subsequently realised that does not behave as stated for .geojsons either I don't think. @tim-salabim good timing with you latest comment: I was about to say I would be up for helping you implement that. I think a GDAL delete interface would be best but do not currently have the skill/capacity to do this. Heads up @Nowosad who has also discussed this with me and Edzer.

@edzer edzer closed this as completed in 4ae776a Apr 20, 2017
@edzer
Copy link
Member

edzer commented Apr 20, 2017

st_write(nc, "test.shp", delete_dsn = TRUE)

will now first delete data source test.shp (which might be a directory) before attempting to write. I added a parameter delete_layer to only delete the layer, but haven't implemented it. For this use case, deleting the dsn seems to work.

@Robinlovelace
Copy link
Contributor

Great stuff. From a user's perspective it may be more useful to have delete_dsn = TRUE if overwrite = TRUE (doable?) but can see why this new parameter has been added: to keep sf as faithful to GDAL as possible.

Out of interest, where are GDAL's deleting capabilities/args documented? This is the best I can find (always struggle to find any docs about GDAL - probably not looking hard enough!): http://www.gdal.org/gdalmanage.html

edzer added a commit that referenced this issue Apr 20, 2017
edzer added a commit that referenced this issue Apr 20, 2017
modifies the default layer name in case of ESRI Shapefile by stripping a trailing .shp. The GDAL driver does this, but in case we don't do it, delete_layer will not work, as the layer name GDAL writes (without .shp) does not match the one we used to pass to GDAL (with the .shp).
@edzer
Copy link
Member

edzer commented Apr 20, 2017

We now have tidy versions

> library(sf)
Linking to GEOS 3.5.1, GDAL 2.1.3, proj.4 4.9.2, lwgeom 2.3.2 r15302
> nc = read_sf(system.file("shape/nc.shp", package="sf"))
> write_sf(nc, "test.shp")
> write_sf(nc, "test.shp")

that are silent and overwrite (have delete_layer=TRUE) by default.

The original construct with update = TRUE now leads to an error; update is for adding layers to existing datasources (like adding a table to an existing database), not for overwriting.

@tim-salabim
Copy link
Member

Nice! Great work, thanks!

@eivindhammers
Copy link
Author

Great stuff!

@Robinlovelace
Copy link
Contributor

Awesome work, that will make life much easier for many people!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants