Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add greedy (topological) coloring #61

Merged
merged 5 commits into from May 17, 2020
Merged

Conversation

martinfleis
Copy link
Member

@martinfleis martinfleis commented Apr 9, 2020

Implementation of greedy package for topological (greedy) colouring of GeoPandas GeoDataFrames. This is the first draft to have something concrete to work on. It is essentially copy&paste of the original code from https://github.com/martinfleis/greedy with few changes here and there to mention that greedy.greedy is now mapclassify.greedy and to make deps optional. Three notebooks used as docs for greedy are now merged to one to become part of Pysal Notebooks.

Apart from the obvious questions whether this is the optimal way how to implement it within mapclassify, there are some additional which need to be resolved.

  1. License. The core of _balanced algorithm is ported from QGIS where it is licensed under GPL-3.0. To make my life easier, I have used the same license for whole greedy. We need to figure out how to resolve this as GPL-3.0 is not compatible with BSD-3-Clause afaik. I don't mind for my code and @nyalldawson mentioned that he's also willing to relicense that part coming from QGIS if needed.

  2. Terminology. QGIS uses topological colouring, I am using mostly greedy colouring. What do you prefer? It would be good to have continuity from existing greedy, but I don't see it as a crucial thing if we decide that topological coloring is a preferred way. We can change greedy to something else.

  3. _geos_sw is a function which generates libpysal.W object based on intersection between objects or their buffered versions. Naturally it is significantly slower that Queen or Rook weights (but that will get better with pygeos within GeoPandas eventually), but it does not require topological correctness of dataset and allows tolerance (i.e. buffer). I feel that the right place for it could be libpysal as it may be useful elsewhere apart from greedy colouring. Should we implement it here with a note that it should go to libpysal and remove it from here once it is released there?

For details on what greedy brings, see https://martinfleis.github.io/greedy/.

xref pysal/splot#94, geopandas/geopandas#1165

@knaaptime
Copy link
Member

@knaaptime knaaptime commented Apr 9, 2020

sweet, thanks! Others may have opinions here, but my quick responses would be

  1. agree this will need to be resolved. we're kinda sticklers for BSD-3, so if you're willing to relicense, that would be the path of least resistance

  2. i vote for topological coloring, as I personally find it far more intuitive but i dont have a terribly strong preference

  3. how does _geos_sw compare to libpysal.weights.fuzzy_contiguity?

@martinfleis
Copy link
Member Author

@martinfleis martinfleis commented Apr 9, 2020

@knaaptime I had no idea fuzzy_contiguity exists, to be honest. I see only one difference and that is in the way how tolerance is set. While in fuzzy_contiguity that is a fraction of gdf bbox length, in _geos_sw the actual distance between features. Otherwise the algorithm is essentially the same, just buffer in here is vectorised unlike in libpysal. I guess that the best way forward is proposing minor enhancement to fuzzy_contiguity to allow both ways of setting tolerance.

@martinfleis
Copy link
Member Author

@martinfleis martinfleis commented Apr 9, 2020

I have marked most of the tests with pytest.xfail as there is no rtree on CI. That would be ideal to get from conda-forge, but apart from numba everything now comes from pip (I tried to get rtree but it resulted in error https://travis-ci.org/github/pysal/mapclassify/builds/673150902). Do you have a preference how to go around that? We may even keep it xfailing if this functionality will eventually come from fuzzy_contiguity.

@knaaptime
Copy link
Member

@knaaptime knaaptime commented Apr 9, 2020

we should just install everything from conda (like we do elsewhere) and avoid pip on CI altogether

@nyalldawson
Copy link

@nyalldawson nyalldawson commented Apr 9, 2020

Just to confirm that I'm happy to relicense any part of this I've written to whatever license works for your needs. Even WTFPL if you want.

@knaaptime
Copy link
Member

@knaaptime knaaptime commented Apr 9, 2020

@nyalldawson 😁 appreciate it!

@martinfleis
Copy link
Member Author

@martinfleis martinfleis commented Apr 9, 2020

I have opened a small PR in libpysal pysal/libpysal#280 extending functionality of existing fuzzy_contiguity to accommodate the needs of greedy (specific buffer distance). That resolves the question of _geos_sw which will be replaced by libpysal.fuzzy_contiguity as soon as this update is released.

Copy link
Member

@sjsrey sjsrey left a comment

This looks very nice.

I left one question about the license.

mapclassify/greedy.py Outdated Show resolved Hide resolved
@sjsrey sjsrey added this to In progress in mapclassify 2.3.0 May 17, 2020
@martinfleis
Copy link
Member Author

@martinfleis martinfleis commented May 17, 2020

Thanks @sjsrey. I have removed the explicit license from the file as it will inherit the license of the whole package now.

@sjsrey sjsrey merged commit b61e0b3 into pysal:master May 17, 2020
2 checks passed
@jGaboardi jGaboardi moved this from In progress to Done in mapclassify 2.3.0 May 17, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

None yet

5 participants