OGR corrupts dbase IV files #2486

Closed
Infiziert90 opened this issue May 7, 2020 · 17 comments

Comments

@Infiziert90

Expected behavior and actual behavior.

Expected:
[Screenshot: dbase_issue1]

Actual:
This is after editing the file via QGIS with the "ogr" driver:
[Screenshot: dbase_issues2]

The problem seems to be related to the dBase IV and dBase VII formats of the file used.

Steps to reproduce the problem.

  1. Edit a dBase IV file with "." fields
  2. Open it with the original dBase IV program

Operating system

Windows 10 64-bit
QGIS 3.12.2

GDAL version and provenance

Compiled against GDAL/OGR 3.0.4

@jratike80
Collaborator

Does "." stand for a NULL attribute?

@Infiziert90
Author

dBase IV has no NULL; NULL was introduced with dBase VII.
The "." is a blank field.

@jratike80
Collaborator

What I remembered is that a field in a .dbf file may start out completely empty.
How does dBase IV show the attached nulltest.txt? The second record in this .dbf file should have the "test" field in its native state.
[Attachment: nulltest.txt]

@Infiziert90
Author

Infiziert90 commented May 7, 2020

Both fields are empty:
[Screenshot: Unbenannt]
(my cursor is in the third line)

@Infiziert90
Author

The problem occurs only in float fields that are empty.

@jratike80
Collaborator

It seems that dBase IV somehow initializes the field with a plain ".".
According to http://independent-software.com/dbase-dbf-dbt-file-format.html, the field should be filled with spaces up to the field length:
"Floating point number, stored as string, padded with spaces if shorter than the field length".
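As a rough illustration of the space-padding rule quoted above, here is a minimal Python sketch. The function name `encode_numeric`, the choice to right-justify the digits, and the use of an all-space field for a missing value are assumptions made for illustration; this is not code from dBase, OGR, or shapelib.

```python
def encode_numeric(value, width, decimals):
    """Format a number as fixed-width ASCII text, padded with spaces.

    Assumption: digits are right-justified and a missing value (None)
    is written as a field made entirely of spaces.
    """
    if value is None:
        return b" " * width
    text = f"{value:{width}.{decimals}f}"
    if len(text) > width:
        # The number does not fit into the declared field width.
        raise ValueError(f"{value!r} does not fit in {width} characters")
    return text.encode("ascii")


if __name__ == "__main__":
    print(encode_numeric(1.5, 10, 3))
    print(encode_numeric(None, 10, 3))
```

With a width of 10 and 3 decimals, the value 1.5 takes up five characters and the remaining five are leading spaces.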

@Infiziert90
Author

"Floating point number, stored as string, padded with spaces if shorter than the field length".

yeah, that is what should happen.
OGR is not doing this, instead it fills it with probably NULL?

@rouault
Member

rouault commented May 7, 2020

I don't think we really care about the original dBase IV program... I'd say that if there's no interoperability problem with current ESRI tools that deal with Shapefiles, which is the reason for DBF support in GDAL, then doing nothing is probably the best action.

@Infiziert90
Author

Infiziert90 commented May 7, 2020

The problem is that programs use the dBase IV standard and not some pseudo-standard.
We noticed this problem because it crashes other programs that use the generated dBase file.

> I'd say that if there's no interoperability problem with current ESRI tools that deal with Shapefile

ArcGIS handles it correctly; the DBF that comes out of ArcGIS does not change the values.

@rouault
Member

rouault commented May 7, 2020

> because the results is a crashes with other programs that use the generated dBase file.

Which ones?

> ArcGIS handles it correctly; the DBF that comes out of ArcGIS does not change the values.

Can you attach a DBF created by ArcGIS with a NULL value in a floating-point field?

@jratike80
Collaborator

jratike80 commented May 7, 2020

ESRI products read shapefiles written with GDAL without problems, so the interoperability problems must be mostly outside the pure GIS scope. I do not mean that the problem you have is not real and annoying for you. By the way, what are those other programs that are crashing?

@Infiziert90
Author

Infiziert90 commented May 7, 2020

> Which ones?

Pcgeofim, a program that lets you simulate and calculate groundwater for input areas.

> Can you attach a DBF created by ArcGIS with a NULL value in a floating-point field?

I can post it tomorrow, when I'm back at work.

Edit:
Ah wait, the screenshot above should already be output from ArcGIS... I'm 98% sure, because it is an active project that I used for testing.

@jratike80
Collaborator

Could you tell me how to create a shapefile with null floating-point values in ArcMap 10.8? I tried creating a new shapefile and digitizing a few points, but ArcMap automatically writes the value 0.000000000e+00 into the field. Should I update the value to NULL with Python or something?

With GDAL I could update the values to NULL with
ogrinfo -dialect sqlite -sql "update floatnulltest_gdal set test=NULL" floatnulltest_gdal.shp

The .dbf file now has lots of stars (****). If I open this shapefile with ArcMap, which succeeds without errors, and save the layer as a new copy it turns the row of stars into 0.0000000000.

OpenJUMP also saves floating-point values that I have not edited into the shapefile as zeroes with lots of decimals: 0.00000000.

Should I take it that ESRI and OpenJUMP developers assumed that a shapefile does not even support NULL as a value of a numeric field?

@Infiziert90
Author

dBase IV does not support NULL. Can you check the version that is used for your dbf file?
You can get it from the headers or with programs that can show you the info.

@rouault
Member

rouault commented May 7, 2020

> If I open this shapefile with ArcMap, which succeeds without errors, and save the layer as a new copy it turns the row of stars into 0.0000000000.

So, there's no specific way with ArcMap to have a dedicated NULL value in a floating-point field?

AFAICS, the current behaviour of GDAL / shapelib dates back to the very first GDAL release from 2001: https://github.com/OSGeo/gdal/blob/v1.1.5/ogr/ogrsf_frmts/shape/dbfopen.c#L968
As people haven't loudly complained in the last 20 years, I don't think there's any urgency to change this. GDAL has also somehow become a reference in the meantime... Maybe others should adapt to gracefully accept ****** :-), like pyshp does https://github.com/GeospatialPython/pyshp/blob/master/shapefile.py#L941 . On the reading side, OGR/shapelib also accepts a decimal field filled with space characters as meaning NULL, so the writer could potentially be changed to emit that, but I don't think that would help in the specific use case of this ticket.
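A reader that wants to tolerate these conventions could look roughly like the sketch below. It is not pyshp's or shapelib's actual code; `parse_numeric` is a hypothetical helper that treats an all-space field, an all-asterisk field, and anything else that does not parse as a number (such as a lone ".") as a missing value.

```python
def parse_numeric(raw: bytes):
    """Parse the raw bytes of a DBF numeric field into a float or None.

    Hypothetical helper: blank fields, fields filled with '*', and
    unparsable text such as a lone '.' all map to None.
    """
    text = raw.decode("ascii", errors="replace").strip()
    if not text or set(text) == {"*"}:
        return None
    try:
        return float(text)
    except ValueError:
        return None


if __name__ == "__main__":
    for field in (b"     1.500", b"**********", b"          ", b"         ."):
        print(field, "->", parse_numeric(field))
```

Returning None instead of raising keeps a single odd field from aborting the whole read, at the cost of hiding genuinely corrupt values.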

@jratike80
Collaborator

jratike80 commented May 7, 2020

> So, there's no specific way with ArcMap to have a dedicated NULL value in a floating-point field?

Right. ESRI is storing either -1.7976931348623158e+308 or zero to represent NULL.

http://resources.esri.com/help/9.3/ArcGISDesktop/com/Gp_ToolRef/geoprocessing_tool_reference/geoprocessing_considerations_for_shapefile_output.htm

> Data Type containing null value: Shapefile representation
>
> When tool requires a NULL, infinity, or NaN (Not a Number) to be output.
> Representation: -1.7976931348623158e+308 (IEEE standard for the maximum negative value)
>
> Number (all other geoprocessing tools)
> Representation: 0
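For consumers of such files, the sentinel can be mapped back to a missing value. The snippet below is only a sketch: `ESRI_NULL` and `esri_to_optional` are illustrative names, and the comparison with `sys.float_info.max` simply shows that the documented literal is the most negative finite double.

```python
import sys

# Sentinel quoted in the ESRI documentation above.
ESRI_NULL = -1.7976931348623158e+308


def esri_to_optional(value: float):
    """Return None for the ESRI sentinel, otherwise the value unchanged.

    Zero is passed through as-is: it cannot be told apart from a real 0.
    """
    return None if value == ESRI_NULL else value


if __name__ == "__main__":
    print(ESRI_NULL == -sys.float_info.max)
    print(esri_to_optional(ESRI_NULL), esri_to_optional(0.0))
```

The second representation (0) is not recoverable this way, since a stored zero may be either a real measurement or a substituted NULL.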

I wonder if they have been thinking that values which mean NULL should be something other than the uninitialized value of a dbf field.

http://independent-software.com/dbase-dbf-dbt-file-format.html

> Note that fields may be filled entirely with spaces to indicate an uninitialized value (not a NULL value – these had not been invented yet).

From the history I could find that Jim Matthews is behind the all-stars solution in shapelib version 1.2.9 http://shapelib.maptools.org/release.html but not the background on where the stars came from. Later GeoTools also adopted this feature https://osgeo-org.atlassian.net/browse/GEOT-5617.

I don't really see this as a bug.

@jratike80
Collaborator

jratike80 commented May 7, 2020

> dBase IV does not support NULL. Can you check the version that is used for your dbf file?
> You can get it from the headers or with programs that can show you the info.

ArcMap, GDAL and OpenJUMP are creating .dbf with "0x03" as file type byte which means "FoxBASE+/Dbase III plus, no memo".
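For anyone who wants to check this on their own file, the file type is the first byte of the .dbf header. The sketch below assumes the commonly documented header layout (type byte, three date bytes, then little-endian record count, header length, and record length); `read_dbf_header` is an illustrative name, and only the 0x03 label mentioned above is filled in.

```python
import struct

# Only the label confirmed in this thread; other codes are left unnamed.
KNOWN_TYPES = {0x03: "FoxBASE+/Dbase III plus, no memo"}


def read_dbf_header(data: bytes) -> dict:
    """Decode the first 12 bytes of a .dbf file (assumed common layout)."""
    if len(data) < 12:
        raise ValueError("need at least 12 header bytes")
    version = data[0]
    # Bytes 1-3: last update as years since 1900, month, day.
    yy, mm, dd = data[1], data[2], data[3]
    n_records, header_len, record_len = struct.unpack_from("<IHH", data, 4)
    return {
        "version_byte": version,
        "description": KNOWN_TYPES.get(version, "unknown"),
        "last_update": (1900 + yy, mm, dd),
        "records": n_records,
        "header_length": header_len,
        "record_length": record_len,
    }


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        print(read_dbf_header(fh.read(32)))
```

Run it with the path to a .dbf file as the only argument; the `version_byte` entry is the value discussed in this comment.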
