Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reading shapefile from FAT-formatted keydrive on OSX #252

Closed
rsbivand opened this issue Mar 8, 2017 · 17 comments

Comments

Projects
None yet
3 participants
@rsbivand
Copy link
Member

commented Mar 8, 2017

These threads 1 and 2 show odd behaviour when reading shapefiles from FAT-formatted keydrives on OSX with CRAN sfr and rgdal binaries. I'll ask the affected people to join in here with more details, for now:

> sf::st_read(dsn="/Volumes/DRENTHE/ds/NEweather", layer="frz28_7100j.shp")
Cannot open data source /Volumes/DRENTHE/ds/NEweather
Error in CPL_read_ogr(dsn, layer, as.character(options), quiet, iGeomField -
: 
  Open failed.
> traceback()
4: stop(list(message = "Open failed.\n", call = CPL_read_ogr(dsn, 
       layer, as.character(options), quiet, iGeomField - 1L, type, 
       promote_to_multi, int64_as_string), cppstack = NULL))
3: .Call("sf_CPL_read_ogr", PACKAGE = "sf", datasource, layer, options, 
       quiet, iGeomField, toTypeUser, promote_to_multi, int64_as_string)
2: CPL_read_ogr(dsn, layer, as.character(options), quiet, iGeomField - 
       1L, type, promote_to_multi, int64_as_string)
1: sf::st_read(dsn = "/Volumes/DRENTHE/ds/NEweather", layer =
"frz28_7100j.shp”)
> setwd("/Volumes/DRENTHE/ds/NEweather/")
> list.files(".", pattern=".shp$")
[1] "frz28_7100j.shp" "frz32_7100j.shp" "gdd50_7100j.shp" "map7100.shp"    
> file.exists("./frz28_7100j.shp")
[1] TRUE
@cyrus621

This comment has been minimized.

Copy link

commented Mar 8, 2017

I followed Roger's request to see what happens when the dsn is specified as the full shapefile name. Indeed this solves the problem; it remains in the original case:

> setwd("/Volumes/DRENTHE/ds/NEweather")
> list.files(".", pattern=".shp$")
[1] "frz28_7100j.shp" "frz32_7100j.shp" "gdd50_7100j.shp" "map7100.shp"    
> file.exists("./frz28_7100j.shp")
[1] TRUE
> head(rgdal::readOGR(dsn=".", layer="frz28_7100j"),2)
Error in ogrInfo(dsn = dsn, layer = layer, encoding = encoding, use_iconv = use_iconv,  : 
  Cannot open data source
> head(rgdal::readOGR(dsn="frz28_7100j.shp", layer="frz28_7100j"),2)
OGR data source with driver: ESRI Shapefile 
Source: "frz28_7100j.shp", layer: "frz28_7100j"
with 5556 features
It has 16 fields
Integer64 fields read as strings:  AVG_LENGTH NUM_YR_OCC NUM_YR_NOM 
      coordinates STATION_ID STATE     STATION_NA LATITUDE_D LONGITUDE_ ELEVATION_ OID_
1 (-85.95, 32.95)      10160    AL ALEXANDER CITY      32.95     -85.95        640    0
2 (-88.13, 33.13)      10178    AL     ALICEVILLE      33.13     -88.13        240    1
  COOP_ID STATE_1       STN_NAME LAT_DD LONG_DD ELEV_FT AVG_LENGTH NUM_YR_OCC NUM_YR_NOM
1   10160      AL ALEXANDER CITY     33     -86     640        256         27         27
2   10178      AL     ALICEVILLE     33     -88     240        257         17         17
> head(sf::st_read(dsn=".", layer="frz28_7100j"),2)
Cannot open data source /Volumes/DRENTHE/ds/NEweather
Error in CPL_read_ogr(dsn, layer, as.character(options), quiet, iGeomField -  : 
  Open failed.
> head(sf::st_read(dsn="frz28_7100j.shp", layer="frz28_7100j"),2)
Reading layer `frz28_7100j' from data source `/Volumes/DRENTHE/ds/NEweather/frz28_7100j.shp' using driver `ESRI Shapefile'
Simple feature collection with 5556 features and 16 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: -177.35 ymin: -14.33 xmax: 174.1 ymax: 71.28
epsg (SRID):    4267
proj4string:    +proj=longlat +datum=NAD27 +no_defs
Simple feature collection with 2 features and 16 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: -88.13 ymin: 32.95 xmax: -85.95 ymax: 33.13
epsg (SRID):    4267
proj4string:    +proj=longlat +datum=NAD27 +no_defs
  STATION_ID STATE     STATION_NA LATITUDE_D LONGITUDE_ ELEVATION_ OID_ COOP_ID STATE_1
1      10160    AL ALEXANDER CITY      32.95     -85.95        640    0   10160      AL
2      10178    AL     ALICEVILLE      33.13     -88.13        240    1   10178      AL
        STN_NAME LAT_DD LONG_DD ELEV_FT AVG_LENGTH NUM_YR_OCC NUM_YR_NOM
1 ALEXANDER CITY     33     -86     640        256         27         27
2     ALICEVILLE     33     -88     240        257         17         17
             geometry
1 POINT(-85.95 32.95)
2 POINT(-88.13 33.13)
@rsbivand

This comment has been minimized.

Copy link
Member Author

commented Mar 8, 2017

Good. Did this also work without changing the working directory?

dsn="/Volumes/DRENTHE/ds/NEweather/frz28_7100j.shp", layer="frz28_7100j"

@cyrus621

This comment has been minimized.

Copy link

commented Mar 8, 2017

Yes:

> getwd() # not the USB
[1] "/Users/rossiter"
> head(sf::st_read(dsn="/Volumes/DRENTHE/ds/NEweather/frz28_7100j", layer="frz28_7100j.shp"),2)
Cannot open data source /Volumes/DRENTHE/ds/NEweather/frz28_7100j
Error in CPL_read_ogr(dsn, layer, as.character(options), quiet, iGeomField -  : 
  Open failed.
> head(sf::st_read(dsn="/Volumes/DRENTHE/ds/NEweather/frz28_7100j.shp", layer="frz28_7100j"),2)
Reading layer `frz28_7100j' from data source `/Volumes/DRENTHE/ds/NEweather/frz28_7100j.shp' using driver `ESRI Shapefile'
Simple feature collection with 5556 features and 16 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: -177.35 ymin: -14.33 xmax: 174.1 ymax: 71.28
epsg (SRID):    4267
proj4string:    +proj=longlat +datum=NAD27 +no_defs
Simple feature collection with 2 features and 16 fields
geometry type:  POINT
dimension:      XY
bbox:           xmin: -88.13 ymin: 32.95 xmax: -85.95 ymax: 33.13
epsg (SRID):    4267
proj4string:    +proj=longlat +datum=NAD27 +no_defs
  STATION_ID STATE     STATION_NA LATITUDE_D LONGITUDE_ ELEVATION_ OID_ COOP_ID STATE_1
1      10160    AL ALEXANDER CITY      32.95     -85.95        640    0   10160      AL
2      10178    AL     ALICEVILLE      33.13     -88.13        240    1   10178      AL
        STN_NAME LAT_DD LONG_DD ELEV_FT AVG_LENGTH NUM_YR_OCC NUM_YR_NOM
1 ALEXANDER CITY     33     -86     640        256         27         27
2     ALICEVILLE     33     -88     240        257         17         17
             geometry
1 POINT(-85.95 32.95)
2 POINT(-88.13 33.13)
@rsbivand

This comment has been minimized.

Copy link
Member Author

commented Mar 8, 2017

So this is the clearest resolution; with sf and (I hope) rgdal, you should now be able to get away with just giving the dsn, and leaving layer missing:

st_read(dsn="/Volumes/DRENTHE/ds/NEweather/frz28_7100j.shp")

which will read the dsn, and choose the only or first layer. Shapefiles only have one layer, so that should be watertight. Please also try the rgdal::readOGR() equivalent.

@cyrus621

This comment has been minimized.

Copy link

commented Mar 8, 2017

Indeed. These both succeed:

sf::st_read(dsn="/Volumes/DRENTHE/ds/NEweather/frz28_7100j.shp")
rgdal::readOGR(dsn="/Volumes/DRENTHE/ds/NEweather/frz28_7100j.shp")

Although this goes against the usual practice for shapefiles. The documentation for st_read does say "in case layer is missing, st_read will read the first layer of dsn" and for readOGR "From rgdal 1.2.*, layer may be missing, in which case ogrListLayers examines the dsn, and fails if there are no layers, silently reads the only layer if only one layer is found, and reads the first layer if multiple layers are present, issuing a warning that layer should be given explicitly." And the documentation for dsn does say "for some dirvers, dsn is a file name".

However, having different behaviour on FAT and Mac formatted external drives is certainly confusing to the user.

@rsbivand

This comment has been minimized.

Copy link
Member Author

commented Mar 8, 2017

@edzer : do you still have access to an OSX machine (as in December)? Is there anyone we could ask who knows both OSX and GDAL?

@cyrus621 Are other drivers more robust? Could you consider migrating to a different format for your students (GPKG)? Do we need to think about changing usual practice? I could edit our book code (not in the book, but from the website). I agree that we should continue to use dsn=, layer=, and not rely on "automatic" layer choice.

@cyrus621

This comment has been minimized.

Copy link

commented Mar 8, 2017

The issue is that many datasets, including the one in this exercise (USA climate stations) are provided as shapefiles, and in student projects they go get more shapefiles on their own. Also we use some ASDAR2 exercises which I've worked into tutorials, and many of these use shapfiles. Hence the need to read shapefiles. Now that the students are aware of the issue I can show them how to use readOGR with just the full file name as dsn and no layer. Then the developers can consider if it's possible to make the 'old' behaviour of dsn/layer also work on FAT volumes on OS/X.

@edzer

This comment has been minimized.

Copy link
Member

commented Mar 8, 2017

@cyrus621

This comment has been minimized.

Copy link

commented Mar 8, 2017

Yes, a USB stick, or any USB-connected external drive, mounted at /Volumes

@edzer

This comment has been minimized.

Copy link
Member

commented Mar 8, 2017

Yes, I can confirm that things work as documented on the hard drive, but not on a usb stick mounted on /Volumes, and only reading using the single file path to the .shp does work from the /Volumes.

@cyrus621

This comment has been minimized.

Copy link

commented Mar 8, 2017

It will work properly on a Mac-formatted USB stick/hard drive. The issue is with FAT-formatted sticks/drives.

@rsbivand

This comment has been minimized.

Copy link
Member Author

commented Mar 8, 2017

I've changed to code examples using readOGR() on shapefiles on www.asdar-book.org to use only the file dsn rather than the directory - running tests now.

@edzer

This comment has been minimized.

Copy link
Member

commented Mar 8, 2017

Funny case. Should we add something to the documentation of dsn?

@rsbivand

This comment has been minimized.

Copy link
Member Author

commented Mar 8, 2017

I guess we might do that. I can't see how both sf and rgdal (with the same underlying static linked GDAL) and very different compiled code are being misled. It could be something in the GDAL build, but that is hard to test.

@edzer

This comment has been minimized.

Copy link
Member

commented Mar 8, 2017

Time to stop using shapefiles?

@cyrus621

This comment has been minimized.

Copy link

commented Mar 8, 2017

I remember being at an ESRI demonstration in Syracuse in the early 1990s when the ESRI rep was promoting shapefiles as the new technology to replace coverages. I could never understand why they wanted to do away with proper topology. Anyway, shapefiles will be with us for a long time, for example the (US) Library of Congress has it as a "sustainable" format: http://www.digitalpreservation.gov/formats/fdd/fdd000280.shtml.

@edzer

This comment has been minimized.

Copy link
Member

commented Mar 8, 2017

You're right; we'd better document how they can be read.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.