New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Impoved reading and writing from/to PostGIS (SQL in general?) support #595

Open
jorisvandenbossche opened this Issue Oct 22, 2017 · 8 comments

Comments

Projects
None yet
4 participants
@jorisvandenbossche
Copy link
Member

jorisvandenbossche commented Oct 22, 2017

Currently we only have a very basic read_postgis function, and we certainly want a write function as well (#189). But, we currently also have some different open (overlapping) PRs and issues related to improving the IO support for PostGIS. Therefore I thought to open a new general issue to get some overview.

Open PRs:

  • #440 PR adding to_postgis
  • #457 PR with both read/write for postgis
  • #546 PR to use geoalchemy in from_postgis
  • one not related to postgis: #101 PR to add support for sqlite

Does somebody have an insight in what the main differences are between the postgis PRs? How to proceed with those?

Some questions related to this that we might need to answer:

  • Do we want to use geoalchemy (https://geoalchemy-2.readthedocs.io/en/latest/)? (and thus add it as an optional requirement) What does it bring?
  • Can we actually support more than PostGIS? More general SQL support? (#490) Eg also MySQL has spatial data data types (https://dev.mysql.com/doc/refman/5.7/en/spatial-datatypes.html) But eg geoalchemy does not seem to support that.
  • Naming of the functions (#161): currenlty GeoDataFrame.from_postgis and read_postgis. Depending on the question above, we might want to make it more general (read_sql, to_sql). Personally I would retire the 'from_postgis' for read_postgis (or read_sql) anyhow.

There is some relevant discussion in #161 as well.

Other related issues: #451 on adding SRID support in read_postgis

cc @jdmcbr @dimitri-justeau @showjackyang @adamboche @kuanb @emiliom @perrygeo @carsonfarmer

@jdmcbr

This comment has been minimized.

Copy link
Member

jdmcbr commented Dec 3, 2017

@jorisvandenbossche Well, my answer is biased, but not being aware of significant advantages geoalchemy2 would provide, I'd be inclined to keep the geopandas writing functionality a little lighter weight. That said, if someone were to point out some real advantages of geoalchemy2, my opinion would likely change.

I don't have any personal experience with spatial MySQL, but if someone has a test database set up they'd be willing to give access to, I'd be happy to try to connect with geopandas and see what breaks.

I'm fine with retiring from_postgis. I remembering finding the pandas choice regarding to and read slightly unintuitive at first, but we might as well be consistent with that standard.

@mrocklin

This comment has been minimized.

Copy link
Member

mrocklin commented Dec 3, 2017

@jdmcbr

This comment has been minimized.

Copy link
Member

jdmcbr commented Dec 3, 2017

@mrocklin I would guess mainly that tacking light geometry handling on top of pandas SQL support offered roughly the same capabilities with considerably less work. I've never looked seriously into using GDAL for postGIS access, but I don't see anything at http://www.gdal.org/drv_pg.html that pops out as motivating a move to rely on GDAL.

@mrocklin

This comment has been minimized.

Copy link
Member

mrocklin commented Dec 3, 2017

@jdmcbr

This comment has been minimized.

Copy link
Member

jdmcbr commented Dec 3, 2017

Well, the fiona manual has this to say: "If your data is in a RDBMS like PostGIS, use a Python DB package or ORM like SQLAlchemy or GeoAlchemy. Maybe you’re using GeoDjango already. If so, carry on."

@mrocklin

This comment has been minimized.

Copy link
Member

mrocklin commented Dec 3, 2017

@dimitri-justeau

This comment has been minimized.

Copy link

dimitri-justeau commented Dec 3, 2017

Hello everyone, sorry I am answering a bit late to this topic, I have been very busy lately.

Here are the reasons of my choice to rely on geoalchemy2 in the PR I made:

  • Pandas relies on sqlalchemy for its to_sql method. The advantage is that Pandas does not have to worry about building sql statements. It just binds a dataframe's datatypes to sqlalchemy's datatypes. I think that this strategy is a good one since each library deals with what it had been designed for: pandas with dataframes, sqlalchemy with RDBMs. If new backends are supported by sqlalchemy, they are automatically available within Pandas, as well as API changes.

  • I wanted to keep close to this approach, since geoalchemy2 provides an abstraction layer over PostGIS datatypes. If the code I added in #546 can seems a lot, it is actually more or less a replication of what exists in Pandas. I just had to fully rewrite some methods of the subclassed objects to use the appropriate object types.

@jdmcbr Your approach in #440 is in fact much simpler and does not introduce a new dependency, but is also in its current state more limited. In fact, in the PostGIS tables that are created, the geometry type is just geometry, without any SRID information. The SRID could be easily informed relying on EWKT instead of WKT (c.f. #546). The datatypes could also be inferred the same way I did by subclassing the SQLTable to GeoSQLTable.

From this point, I think that the choice to rely on geoalchemy2 or not is mainly a design decision since the same functionalities could be achieved with both approaches. I personally don't have a particular preference, but maybe extending @jdmcbr PR to be more complete would prevent the addition of a new dependency. On the other hand, if geolachemy2 supports more backends in the future, it also might be interesting to stick to it.

Cheers,

Dim'

@jorisvandenbossche

This comment has been minimized.

Copy link
Member Author

jorisvandenbossche commented Dec 8, 2017

Would using geoalchemy2 take care of updating the "geometry_columns" table in postgis?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment