support for multiple gdf and geojson as input #169

Merged: 17 commits, Aug 26, 2022
63 changes: 63 additions & 0 deletions docs/example/input-types.md
@@ -366,6 +366,69 @@ tp.Topology(dict_in, prequantize=False).to_json()
</div>
</div>

* * *

## `list` of GeoDataFrames
From the package `geopandas` (not a hard dependency).

<div class="code-example mx-1 bg-example">
<div class="example-label" markdown="1">
Example 🔧
{: .label .label-blue-000 }
</div>
<div class="example-text" markdown="1">

```python
import topojson as tp
import geopandas as gpd
from shapely import geometry

gdf_1 = gpd.GeoDataFrame({
    "uniq_name": ["abc", "def"],
    "shrd_name": ["rect", "rect"],
    "geometry": [
        geometry.Polygon([[1, 1], [2, 1], [2, 2], [1, 2], [1, 1]]),
        geometry.Polygon([[0, 1], [1, 1], [1, 2], [0, 2], [0, 1]])
    ]
})
# dissolve both rectangles into a single feature per `shrd_name`
gdf_2 = gdf_1.dissolve(by='shrd_name', as_index=False)

# one object name per input GeoDataFrame
topo = tp.Topology(data=[gdf_1, gdf_2], object_name=['geom_1', 'geom_2'], prequantize=False)
topo.to_dict()

```

```python
{'type': 'Topology',
'objects': {'geom_1': {'geometries': [{'properties': {'uniq_name': 'abc',
'shrd_name': 'rect'},
'type': 'Polygon',
'arcs': [[-1, 2]],
'id': 0},
{'properties': {'uniq_name': 'def', 'shrd_name': 'rect'},
'type': 'Polygon',
'arcs': [[1, 0, 3]],
'id': 1}],
'type': 'GeometryCollection'},
'geom_2': {'geometries': [{'properties': {'shrd_name': 'rect',
'uniq_name': 'abc'},
'type': 'Polygon',
'arcs': [[1, 2, 3]],
'id': 0}],
'type': 'GeometryCollection'}},
'bbox': (0.0, 1.0, 2.0, 2.0),
'arcs': [[[1.0, 2.0], [1.0, 1.0]],
[[0.0, 1.0], [0.0, 2.0], [1.0, 2.0]],
[[1.0, 2.0], [2.0, 2.0], [2.0, 1.0], [1.0, 1.0]],
[[1.0, 1.0], [0.0, 1.0]]]}
```
```python
topo.to_gdf(object_name='geom_2').plot(column='shrd_name')
topo.to_gdf(object_name='geom_1').plot(column='uniq_name')
```
<img src="../images/multiple_objects.png">

</div>
</div>
<script>
window.addEventListener("DOMContentLoaded", event => {
var opt = {
7 changes: 4 additions & 3 deletions docs/example/output-types.md
@@ -263,7 +263,8 @@ print(topo.to_json(pretty=True))
</pre>
The `pretty` option depends on the settings `indent` and `maxlinelength`, which default to `4` and `88` respectively.

-More options in generating the GeoJSON from the computed Topology are `validate` (`True` or `False`), `winding_order` and `decimals`. Where the TopoJSON standard defines a winding order of clock-wise orientation for outer polygons and counter-clockwise orientation for inner polygons is the winding order in the GeoJSON standard the opposite (`CCW_CW`). The `decimals` option defines the number of decimals for the output coordinates.
+More options for generating the GeoJSON from the computed Topology are `validate` (`True` or `False`), `winding_order`, `decimals` and `object_name`. Whereas the TopoJSON standard defines a clockwise winding order for outer polygons and a counter-clockwise winding order for inner polygons, the GeoJSON standard defines the opposite (`CCW_CW`). The `decimals` option defines the number of decimals for the output coordinates. The `object_name` option specifies which object to serialize to GeoJSON when the input data contains multiple objects; it defaults to index `0`.

</div>
</div>
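
The winding-order distinction described in this hunk can be checked with the shoelace formula. The following is only an illustrative sketch: `signed_area` and `winding` are hypothetical helper names, not part of topojson, and the rings are assumed to be closed lists of `[x, y]` pairs in a coordinate system whose y axis points up.

```python
def signed_area(ring):
    """Shoelace formula: positive for counter-clockwise rings (y axis pointing up)."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:]):
        total += x0 * y1 - x1 * y0
    return total / 2.0


def winding(ring):
    """Return 'CCW' or 'CW' for a closed ring (hypothetical helper)."""
    return "CCW" if signed_area(ring) > 0 else "CW"


# Rings borrowed from the hole-serialization test in this pull request.
outer = [[0, 0], [20, 0], [10, 20], [0, 0]]
hole = [[8, 2], [12, 12], [17, 2], [8, 2]]
print(winding(outer), winding(hole))  # CCW CW
```

Under the `CCW_CW` convention the outer ring is counter-clockwise and the hole is clockwise, which matches the `# CCW` and `# CW` comments in the test data.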

@@ -299,7 +300,7 @@ topo.to_alt()
```
<div id="embed_output_mesh_altair"></div>

-A few more convenience options are included for Altair visualizations, such as assigning a color property for individual features and using geographic projections.
+A few more convenience options are included for Altair visualizations, such as assigning a color property for individual features, using geographic projections and defining the `object_name`. The `object_name` option specifies which object to serialize to Altair when the input data contains multiple objects; it defaults to index `0`.

Per the TopoJSON specification, information about individual features is stored as a nested object within `properties`. For example, here are the properties of the feature at index 0:

@@ -339,7 +340,7 @@ topo.to_alt(color='properties.name:N', projection='equalEarth')

## .to_gdf()

-Serialize the Topology object into a GeoPandas GeoDataFrame. This destroys the Topology. GeoPandas is an optional dependency and not automatically installed.
+Serialize the Topology object into a GeoPandas GeoDataFrame. This destroys the Topology. GeoPandas is an optional dependency and not automatically installed. The `object_name` option specifies which object to serialize into a GeoDataFrame when the input data contains multiple objects; it defaults to index `0`.

**Note:** There is no winding-order enforcement in the OGR model, so this routine uses the `.to_geojson()` function rather than the Fiona/OGR `TopoJSON` driver.

Binary file added docs/images/multiple_objects.png
12 changes: 12 additions & 0 deletions tests/test_extract.py
@@ -476,3 +476,15 @@ def test_extract_read_geojson_from_json_dict():
     topo = Extract(data).to_dict()

     assert len(topo["linestrings"]) == 287
+
+
+def test_extract_read_multiple_gdf_object_name():
+    world = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
+    world = world[["continent", "geometry", "pop_est"]]
+    continents = world.dissolve(by="continent", aggfunc="sum")
+
+    topo = Extract(
+        data=[world, continents], options={"object_name": ["world", "continents"]}
+    ).to_dict()
+
+    assert len(topo["objects"]) == len(world) + len(continents)
40 changes: 27 additions & 13 deletions tests/test_hashmap.py
@@ -30,8 +30,8 @@ def test_hashmap_geomcol_multipolygon_polygon():
     topo = Hashmap(data).to_dict()

     assert topo["objects"]["data"]["geometries"][0]["geometries"][0]["arcs"] == [
-        [[4, 0], [1]],
-        [[2]]
+        [[4, 0], [1]],
+        [[2]],
     ]


@@ -276,22 +276,36 @@ def test_hashmap_fiona_gpkg_to_dict():

     assert len(topo["linestrings"]) == 4

+
 # issue #148 and issue #167
 def test_hashmap_serializing_holes():
-    mp = geometry.shape({
-        "type": "MultiPolygon",
-        "coordinates": [
-            [
-                [[0, 0], [20, 0], [10, 20], [0, 0]],  # CCW
-                [[8, 2], [12, 12], [17, 2], [8, 2]],  # CW
-                [[3, 2], [5, 6], [7, 2], [3, 2]],  # CW
+    mp = geometry.shape(
+        {
+            "type": "MultiPolygon",
+            "coordinates": [
+                [
+                    [[0, 0], [20, 0], [10, 20], [0, 0]],  # CCW
+                    [[8, 2], [12, 12], [17, 2], [8, 2]],  # CW
+                    [[3, 2], [5, 6], [7, 2], [3, 2]],  # CW
+                ],
+                [[[10, 3], [15, 3], [12, 9], [10, 3]]],  # CCW
             ],
-            [[[10, 3], [15, 3], [12, 9], [10, 3]]],  # CCW
-        ]
-    })
+        }
+    )
     topo = Hashmap(mp)
     topo = topo.to_dict()

-    arc = topo['objects']['data']['geometries'][0]['arcs']
+    arc = topo["objects"]["data"]["geometries"][0]["arcs"]
     assert arc == [[[0], [1], [2]], [[3]]]

+
+def test_hashmap_read_multiple_gdf_object_name():
+    world = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
+    world = world[["continent", "geometry", "pop_est"]]
+    continents = world.dissolve(by="continent", aggfunc="sum")
+
+    topo = Hashmap(
+        data=[world, continents], options={"object_name": ["world", "continents"]}
+    ).to_dict()
+
+    assert len(topo["objects"]) == 2
14 changes: 14 additions & 0 deletions tests/test_topology.py
@@ -608,6 +608,7 @@ def test_topology_topoquantize():

     assert len(topo["arcs"]) == 149

+
 # test for https://github.com/mattijn/topojson/issues/164
 def test_topology_gdf_keep_index():
     gdf = (
@@ -619,3 +620,16 @@ def test_topology_gdf_keep_index():
     gdf_idx = topo.to_gdf().index.to_list()

     assert gdf_idx == [1, 2, 11, 12, 13]
+
+
+def test_topology_write_multiple_object_json_dict():
+    world = geopandas.read_file(geopandas.datasets.get_path("naturalearth_lowres"))
+    world = world[["continent", "geometry", "pop_est"]]
+    continents = world.dissolve(by="continent", aggfunc="sum")
+
+    topo = topojson.Topology(
+        data=[world, continents], object_name=["world", "continents"]
+    )
+    topo_dict = topo.to_dict()
+
+    assert len(topo_dict["objects"]) == 2
6 changes: 5 additions & 1 deletion topojson/core/cut.py
@@ -1,9 +1,11 @@
 import itertools
 import pprint
 import copy
+import warnings
 import numpy as np
 from shapely import geometry
 from shapely.strtree import STRtree
+from shapely.errors import ShapelyDeprecationWarning
 from .join import Join
 from ..ops import insert_coords_in_line
 from ..ops import np_array_bbox_points_line
@@ -109,7 +111,9 @@ def _cutter(self, data):
                 mp = geometry.MultiPoint([mp])

             # create spatial index on junctions
-            tree_splitter = STRtree(mp)
+            with warnings.catch_warnings():
+                warnings.filterwarnings("ignore", category=ShapelyDeprecationWarning)
+                tree_splitter = STRtree(mp)
             slist = []
             # junctions are only existing in coordinates of linestring
             if self.options.shared_coords:
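
The `warnings.catch_warnings()` block in this hunk silences one warning category for a single call and restores the previous filters afterwards. The sketch below shows that scoping with the standard library only; `LegacyAPIWarning` and `build_index` are hypothetical stand-ins for `ShapelyDeprecationWarning` and the `STRtree` constructor.

```python
import warnings


class LegacyAPIWarning(UserWarning):
    """Hypothetical stand-in for a library deprecation warning."""


def build_index(items):
    # Stand-in for a constructor that emits a deprecation warning.
    warnings.warn("legacy constructor", LegacyAPIWarning)
    return list(items)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning by default
    with warnings.catch_warnings():
        # Only inside this block is LegacyAPIWarning ignored.
        warnings.filterwarnings("ignore", category=LegacyAPIWarning)
        index = build_index([1, 2, 3])
    # The filter state is restored on exit, so this call is reported again.
    build_index([4])

print(len(caught))  # 1
```

Scoping the filter this way leaves the caller's own warning configuration untouched, at the cost of also hiding that category for anything else executed inside the block.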
31 changes: 29 additions & 2 deletions topojson/core/extract.py
@@ -60,6 +60,8 @@ def __init__(self, data, options={}):
         self._is_single = True
         self._invalid_geoms = 0
         self._tried_geojson = False
+        self._is_multi_geom = False
+        self._geom_offset = 0

         self.output = self._extractor(data)

@@ -165,6 +167,7 @@ def _serialize_geom_type(self, geom):
             - geopandas.GeoSeries
             - dict of objects that provide a __geo_interface__
             - list of objects that provide a __geo_interface__
+            - list of geopandas.GeoDataFrames
             - object that provide a __geo_interface__
             - TopoJSON dict
             - TopoJSON string
@@ -574,8 +577,32 @@ def _extract_list(self, geom):
         geom : list
             List instance
         """
-        # convert list to indexed-dictionary
-        data = dict(enumerate(geom))
+        # check if there are multiple entries in the `object_name` in settings.
+        # currently only supports multiple GeoDataFrames as input entries
+        if len(self.options.object_name) > 1:
+            # list consist of objects
+            if len(self.options.object_name) != len(geom):
+                raise LookupError(
+                    "the number of data objects does not match the number of object_name"
+                )
+            geom_offset = np.cumsum([len(gdf) for gdf in geom]).tolist()
+            geom_offset.pop()
+            geom_offset.insert(0, 0)
+            self._geom_offset = geom_offset
+            for ix, gdf in enumerate(geom):
+                gdf = gdf.copy()
+                start = geom_offset[ix]
+                gdf["__geom_name"] = self.options.object_name[ix]
+                geom[ix] = dict(enumerate(gdf.to_dict(orient="records"), start))
+
+            for ix in range(1, len(geom)):
+                geom[0].update(geom.pop(ix))
+            data = geom[0]
+            self._is_multi_geom = True
+        else:
+            # list consist of features
+            # convert list to indexed-dictionary
+            data = dict(enumerate(geom))

         # new data dictionary is created, throw the geometries back to main()
         self._is_single = False
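
The offset bookkeeping in `_extract_list` can be illustrated without GeoPandas. The sketch below is an assumption-laden simplification, not the code from this pull request: `merge_with_offsets` is a hypothetical helper, plain lists of dicts stand in for GeoDataFrames, and `itertools.accumulate` replaces `np.cumsum`.

```python
from itertools import accumulate


def merge_with_offsets(collections, names):
    """Merge lists of records into one dict keyed by a running global index."""
    if len(collections) != len(names):
        raise LookupError("number of collections does not match number of names")
    sizes = [len(c) for c in collections]
    # Start index of each collection: [0, n0, n0 + n1, ...]
    offsets = [0] + list(accumulate(sizes))[:-1]
    merged = {}
    for name, start, records in zip(names, offsets, collections):
        for key, record in enumerate(records, start):
            # Tag a copy so the caller's records are left untouched.
            merged[key] = {**record, "__geom_name": name}
    return merged, offsets


merged, offsets = merge_with_offsets(
    [[{"v": "a"}, {"v": "b"}], [{"v": "c"}]], ["geom_1", "geom_2"]
)
print(offsets)         # [0, 2]
print(sorted(merged))  # [0, 1, 2]
```

This variant builds a fresh dictionary instead of popping entries from the input list while iterating, so it behaves the same for any number of inputs.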
38 changes: 26 additions & 12 deletions topojson/core/hashmap.py
@@ -92,22 +92,36 @@ def _hashmapper(self, data):
         # resolve bookkeeping of coordinates in objects, including delta-encoding
         list(self._resolve_objects(["arcs", "coordinates"], self._data["objects"]))

-        objects = {}
-        objects["geometries"] = []
-        objects["type"] = "GeometryCollection"
-        for feature in data["objects"]:
-            feat = data["objects"][feature]
-            feat["id"] = feature
-
-            if "geometries" in feat and len(feat["geometries"]) == 1:
-                feat["type"] = feat["geometries"][0]["type"]
+        resolved_data_objects = {}
+        for object_ix, object_name in enumerate(self.options.object_name):
+            objects = {}
+            objects["geometries"] = []
+            objects["type"] = "GeometryCollection"
+            for feature in data["objects"]:
+                feat = data["objects"][feature]
+                if not self._is_multi_geom:
+                    do_resolve = True
+                    feat["id"] = feature
+                elif (
+                    "__geom_name" in feat["properties"]
+                    and feat["properties"]["__geom_name"] == object_name
+                ):
+                    do_resolve = True
+                    feat["id"] = feature - self._geom_offset[object_ix]
+                    del feat["properties"]["__geom_name"]
+                else:
+                    do_resolve = False

-            self._resolve_arcs(feat)
+                if do_resolve:
+                    if "geometries" in feat and len(feat["geometries"]) == 1:
+                        feat["type"] = feat["geometries"][0]["type"]

-            objects["geometries"].append(feat)
+                    self._resolve_arcs(feat)

+                    objects["geometries"].append(feat)
+            resolved_data_objects[object_name] = objects
         data["objects"] = {}
-        data["objects"][self.options.object_name] = objects
+        data["objects"] = resolved_data_objects

         # prepare to return object
         data = self._data
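
The regrouping step in `_hashmapper` splits the merged features back into one `GeometryCollection` per object name and rebases each feature id by its object's offset. The sketch below mirrors that idea with plain dictionaries; `split_by_object` is a hypothetical helper, arc resolution is left out, and features are copied rather than modified in place.

```python
def split_by_object(features, names, offsets):
    """Regroup tagged features per object name and rebase ids to start at 0."""
    objects = {}
    for name, offset in zip(names, offsets):
        geometries = []
        for key, feat in features.items():
            props = feat["properties"]
            if props.get("__geom_name") != name:
                continue
            # Copy so the tag can be dropped without touching the input.
            clean = {k: v for k, v in props.items() if k != "__geom_name"}
            geometries.append({**feat, "properties": clean, "id": key - offset})
        objects[name] = {"type": "GeometryCollection", "geometries": geometries}
    return objects


features = {
    0: {"type": "Polygon", "properties": {"__geom_name": "geom_1", "n": "abc"}},
    1: {"type": "Polygon", "properties": {"__geom_name": "geom_1", "n": "def"}},
    2: {"type": "Polygon", "properties": {"__geom_name": "geom_2", "n": "abc"}},
}
objects = split_by_object(features, ["geom_1", "geom_2"], [0, 2])
print([g["id"] for g in objects["geom_2"]["geometries"]])  # [0]
```

With offsets `[0, 2]`, the single feature of `geom_2` gets id `0`, matching the `'id': 0` shown for `geom_2` in the documentation example of this pull request.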
4 changes: 0 additions & 4 deletions topojson/core/join.py
@@ -6,7 +6,6 @@
 from shapely.wkb import loads
 from shapely.ops import shared_paths
 from shapely.ops import linemerge
-from shapely import speedups
 from ..ops import select_unique_combs
 from ..ops import simplify
 from ..ops import quantize
@@ -15,9 +14,6 @@
 from ..utils import serialize_as_svg
 from .extract import Extract

-if speedups.available:
-    speedups.enable()
-

 class Join(Extract):
     """