Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GeoPandas as dependency? #169

Open
tfardet opened this issue Aug 21, 2023 · 6 comments
Open

Add GeoPandas as dependency? #169

tfardet opened this issue Aug 21, 2023 · 6 comments
Labels
question Further information is requested

Comments

@tfardet
Copy link
Collaborator

tfardet commented Aug 21, 2023

EDIT: given the current state of geopandas packaging via wheels, I propose adding it as a real dependency and removing GeoFrDataFrame


I see that geopandas was removed on purpose as a dependency from pynsee in from #6
Though I understand the rationale, I do wonder whether, in the long run, it would not make sense to make it an optional dependency such that GeoFrDataFrame is a subset of GeoDataFrame if geopandas is not available but is actually a GeoDataFrame if geopandas is available.

In terms of import, one would replace the import in pynsee/geodata/__init__.py by

try:
    from geopandas import GeoDataFrame as GeoFrDataFrame
except ImportError:
    from .GeoFrDataFrame import GeoFrDataFrame

This does not imply any breaking change as far as I can tell: we could keep the get_geom/translate/zoom functions that do not exist in geopandas (though it might be worth making a custom GeoFrSeries for improved compatibility).

Opinions on that?

@tfardet tfardet added the question Further information is requested label Aug 21, 2023
@hadrilec
Copy link
Contributor

So far I tried not to use geopandas as a dependency, as it is sometimes not easy to install.
I am not really keen on having it as a dependency.
What would be the advantages of having it as a dependency?

@tfardet
Copy link
Collaborator Author

tfardet commented Aug 21, 2023

Here it would be an optional dependency, i.e. it is not required to install pynsee, but if available, it is used.
The advantage is that people that have geopandas installed directly get a GeoDataFrame with all its capabilities, without having to do the extra conversion step.
It may seem somewhat minor but the extra maintenance cost is also zero for the minimal implementation.

This also comes from the fact that I would assume that most people getting a DataFrame with a "geometry" column expect it to be a GeoDataFrame or a subclass, and thus expect all the GeoDataFrame/GeoSeries methods to work.
With this implementation, we do not break any expectations (people not expecting anything do not get less than they expected, people expecting the GeoDataFrame get it).

@hadrilec
Copy link
Contributor

I would find it a bit confusing to have sometimes an output that is a geopandas.GeoDataFrame and sometimes an output that is pynsee.GeoFrDataFrame depending on the dependencies. Such feature would mean that if a bug happens it is not straightforward to know whether it is related to geopandas or not. As a maintainer I would prefer to not have to answer this question. I think the extra step is not a big one, at least it makes it explicit that we are not handling geopandas object.

In the long run, if this need is expressed by more users, why not considering it, but tbh for now I think there are already enough issues to tackle.

@tfardet tfardet added the wontfix This will not be worked on label Aug 21, 2023
@tgrandje
Copy link
Collaborator

Hi @hadrilec , I'm not sure your statement about geopandas is still relevant. I found that geopandas installation was efficiently simplified due to the release of whl files for Fiona on pypi less than 6 months ago. It sure could be a pain before that (particularly on windows), but I don't think that's the case anymore.

The only problems I'm still encountering with geopandas are linked to conflicts between gdal (osgeo) and fiona (which is another GDAL binding for python). There are cases where gdal is mandatory (for instance, to activate geodjango) and depending on geopandas could indeed be problematic. (I still have to see the consequences of using both libraries...)

In any case, this is very specific and I'm not sure we need to concern ourselves with geopandas's installation anymore...

@tfardet tfardet removed the wontfix This will not be worked on label Nov 15, 2023
@tfardet tfardet changed the title GeoPandas as optional dependency? Add GeoPandas as dependency? Jan 8, 2024
@tfardet
Copy link
Collaborator Author

tfardet commented Aug 9, 2024

I'm bumping this issue once more, pushing for the addition of geopandas because:

  1. it would give us a proper geographic object to return for geodata (which GeoFrDataFrame is not), so people can work with all geofunctions
  2. it would solve most of the issues associated to Bugfix/geoparquet storage #204 and let us simplify the code
  3. it would simplify save_df and making it faster because we could pass the library instead and load geoparquet directly with geopandas, as GeoDataFrame, rather than having to load via pandas and then cast into another class.

@hadrilec could you please check again on the computer from the INSEE so we can move forward with this if you confirm there are no installation issues?

@hadrilec
Copy link
Contributor

I will give it another try

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants