Column wkb_geometry appears when importing #2107

Closed
iriberri opened this issue Feb 6, 2015 · 9 comments

@iriberri
Contributor

iriberri commented Feb 6, 2015

Over the last few days I've noticed that an empty "wkb_geometry" column appears when uploading some files (previously exported from CartoDB).

AFAIK, this is caused by regular CSV imports and not by type guessing, or at least the files I'm using "cannot be guessed", since they already contain a geometry and type guessing does not apply. It also happens when the file is guessed, though.

An example CSV that triggers this is here. Importing it into CartoDB makes the column appear. You can remove the_geom to force guessing; the column also appears in that scenario.

cc @juanignaciosl @javisantana @rafatower

@juanignaciosl juanignaciosl added this to the Guarrate milestone Feb 6, 2015
@juanignaciosl juanignaciosl self-assigned this Feb 6, 2015
@javisantana
Contributor

wkb_geometry is the geometry column name ogr2ogr uses by default. Sounds like either:

  • something wrong in the importer
  • a regression in ogr2ogr

cc @rafatower

@juanignaciosl
Contributor

After upgrading ogr2ogr2 from 2.0.0+svn.27830-precise1 to 2.0.0+svn.28335-precise1ubuntu1 and doing some testing, I can confirm that the new version of ogr2ogr always creates the wkb_geometry column, whether or not type guessing is enabled, and I've seen no way to configure this. There's a param that looked promising, NONE_AS_UNKNOWN, but it doesn't seem to do what we want.

So, @rafatower, do you think we should file an issue or a fix for ogr2ogr2? We could delete the column if it's empty, but that looks like a dirty patch to me, unless this is the (new) expected behaviour of ogr2ogr2.

@rafatower
Contributor

I'd file a ticket in the GDAL project with some easy steps to reproduce (i.e. the smallest GeoJSON file and the command line needed to create the table, without all the CartoDB code surrounding it) and notify Even Rouault.

cc/ @javisantana

PS: if it is not documented, then assume it is a regression bug rather than undocumented behavior.

@juanignaciosl
Contributor

Update: it only happens when using the -nlt PROMOTE_TO_MULTI parameter. Explanation:

PROMOTE_TO_MULTI can be used to automatically promote layers that mix polygons or multipolygons to multipolygons, and layers that mix linestrings or multilinestrings to multilinestrings. Can be useful when converting shapefiles to PostGIS (and other target drivers) that implement strict checks for geometry type.

I've opened an issue at GDAL.
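
For anyone wanting to reproduce this outside CartoDB, a minimal sketch of the kind of standalone reproduction suggested earlier in the thread could look like the following. It only writes a tiny GeoJSON file and prints the two command lines to compare (with and without `-nlt PROMOTE_TO_MULTI`). The connection string `PG:dbname=test_db`, the table name `repro_table`, the file name and the helper `ogr2ogr_command` are all assumptions made for illustration, and whether this exact input triggers the empty column has not been verified here.

```ruby
require 'json'
require 'shellwords'
require 'tmpdir'

# Smallest plausible input: a single point feature (illustrative data only).
MINIMAL_GEOJSON = {
  'type' => 'FeatureCollection',
  'features' => [
    {
      'type' => 'Feature',
      'properties' => { 'name' => 'test' },
      'geometry' => { 'type' => 'Point', 'coordinates' => [0.0, 0.0] }
    }
  ]
}.freeze

# Builds the argument list for an import into PostgreSQL.
# The connection string and table name are placeholders, not CartoDB's real settings.
def ogr2ogr_command(input_path, pg_connection: 'PG:dbname=test_db',
                    table: 'repro_table', promote_to_multi: true)
  cmd = ['ogr2ogr', '-f', 'PostgreSQL', pg_connection, input_path, '-nln', table]
  cmd += ['-nlt', 'PROMOTE_TO_MULTI'] if promote_to_multi
  cmd
end

path = File.join(Dir.tmpdir, 'repro.geojson')
File.write(path, JSON.generate(MINIMAL_GEOJSON))

# Print both variants so the resulting tables can be compared by hand.
puts Shellwords.join(ogr2ogr_command(path))
puts Shellwords.join(ogr2ogr_command(path, promote_to_multi: false))
```

Running each printed command against a scratch database and comparing the resulting table's columns should show whether the flag is what introduces wkb_geometry.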

Importing one of our fixtures, SHP1.zip, fails without that parameter, so it doesn't seem that we can freely remove it. What do you think about adding the "drop wkb_geometry column after ogr2ogr if it is empty" workaround until it gets fixed?
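
To make the proposed workaround concrete, here is a rough sketch of "drop the column if it is empty". It is not the importer's actual code: `drop_column_if_empty`, the SQL helpers and the `run_query` callable are hypothetical names, the schema and table names are assumed to be trusted importer-generated identifiers, and the column is assumed to exist when the check runs.

```ruby
# SQL counting rows where the column actually holds a value.
def non_null_count_sql(schema, table, column = 'wkb_geometry')
  %(SELECT COUNT(*) AS n FROM "#{schema}"."#{table}" WHERE "#{column}" IS NOT NULL)
end

# SQL removing the column.
def drop_column_sql(schema, table, column = 'wkb_geometry')
  %(ALTER TABLE "#{schema}"."#{table}" DROP COLUMN IF EXISTS "#{column}")
end

# `run_query` is any callable that executes SQL and returns rows as hashes
# (hypothetical interface; a real database handle would need a small adapter).
# Returns true if the column was dropped, false if it held data and was kept.
def drop_column_if_empty(run_query, schema, table, column = 'wkb_geometry')
  rows = run_query.call(non_null_count_sql(schema, table, column))
  return false unless rows.first[:n].to_i.zero?

  run_query.call(drop_column_sql(schema, table, column))
  true
end

# Usage with a fake executor that pretends the column is entirely NULL.
executed = []
fake_db = lambda do |sql|
  executed << sql
  sql.start_with?('SELECT') ? [{ n: 0 }] : []
end

puts drop_column_if_empty(fake_db, 'cdb_importer', 'importer_example')
puts executed
```

Checking for non-null values first means a table that really does carry data in wkb_geometry is left untouched, which limits how "dirty" the patch is.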

@juanignaciosl
Contributor

Already fixed in GDAL trunk; a new release of the ogr2ogr binary should fix it.

@rafatower
Contributor

I just created a new version of the ogr2ogr2 package, updated from GDAL trunk following these instructions:
https://github.com/CartoDB/cartodb/wiki/How-to-build-gdal-and-ogr2ogr2

The new package is being built on Launchpad: https://launchpad.net/~cartodb/+archive/ubuntu/gis-testing/+packages

After that, I'll install it in staging and we'll need to regression test imports related to ogr2ogr2:

  • csv with and without type guessing
  • geojson

@rafatower
Contributor

The new version has a regression:

$ make check-3
#...
csv regression tests
  imports records with cell line breaks (FAILED - 1)

Failures:

  1) csv regression tests imports records with cell line breaks
     Failure/Error: runner.db[%Q{
     Sequel::DatabaseError:
       PG::Error: ERROR:  relation "cdb_importer.importer_b9f46576c19611e499273e06c05daa57" does not exist
       LINE 3:       FROM cdb_importer.importer_b9f46576c19611e499273e06c05...
                          ^
     # ./services/importer/spec/acceptance/csv_spec.rb:140:in `block (2 levels) in <top (required)>'

Finished in 0.60066 seconds
1 example, 1 failure

Failed examples:

rspec ./services/importer/spec/acceptance/csv_spec.rb:128 # csv regression tests imports records with cell line breaks

It can be easily reproduced by importing this file: https://github.com/CartoDB/cartodb/blob/master/services/importer/spec/fixtures/in_cell_line_breaks.csv

@rafatower
Contributor

It doesn't work in production. A trace from local setup:

2015-03-03 11:18:29 UTC: Filename: /tmp/imports/20150303-7340-1mabedw/in_cell_line_breaks.csv Size (bytes): 1497
2015-03-03 11:18:29 UTC: Importing data from /tmp/imports/20150303-7340-1mabedw/in_cell_line_breaks.csv
2015-03-03 11:18:29 UTC: File-based import load
2015-03-03 11:18:29 UTC: Errored importing data from /tmp/imports/20150303-7340-1mabedw/in_cell_line_breaks.csv:
2015-03-03 11:18:29 UTC: Encoding::UndefinedConversionError: "\x8D" to UTF-8 in conversion from Windows-1252 to UTF-8
2015-03-03 11:18:29 UTC: ----------------------------------------------------
2015-03-03 11:18:29 UTC: ["/home/developer/src/cartodb/services/importer/lib/importer/csv_normalizer.rb:128:in `each_line'"
 "/home/developer/src/cartodb/services/importer/lib/importer/csv_normalizer.rb:128:in `normalize'"
 "/home/developer/src/cartodb/services/importer/lib/importer/csv_normalizer.rb:52:in `run'"
 "/home/developer/src/cartodb/services/importer/lib/importer/loader.rb:117:in `block (2 levels) in normalize'"
 "/home/developer/src/cartodb/services/importer/lib/importer/importer_stats.rb:56:in `timing'"
 "/home/developer/src/cartodb/services/importer/lib/importer/loader.rb:110:in `block in normalize'"
 "/home/developer/src/cartodb/services/importer/lib/importer/loader.rb:107:in `each'"
 "/home/developer/src/cartodb/services/importer/lib/importer/loader.rb:107:in `inject'"
 "/home/developer/src/cartodb/services/importer/lib/importer/loader.rb:107:in `normalize'"

@rafatower
Contributor

Solved in ogr2ogr2 2.0.0+svn.28596-precise1ubuntu2; it will be deployed to production in the next chef pass.
