Commit

Merge pull request #13 from HTenkanen/support_all_filters
ENH: Support all filters
HTenkanen committed Apr 16, 2020
2 parents f0ff4f6 + 26c8e62 commit 09b1994
Showing 28 changed files with 1,488 additions and 190 deletions.
17 changes: 17 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,6 +1,23 @@
Changelog
=========

v0.4.0
------

- read PBF using custom queries (allows anything to be fetched)
- read landuse from PBF
- read natural from PBF
- improve geometry parsing so that geometry type is read automatically according to OSM rules
- modularize code-base
- improve test coverage


v0.3.1
------

- generalize code base
- read Points of Interest (POI) from PBF

v0.2.0
------

17 changes: 10 additions & 7 deletions README.md
@@ -2,11 +2,12 @@
[![PyPI version](https://badge.fury.io/py/pyrosm.svg)](https://badge.fury.io/py/pyrosm)[![build status](https://api.travis-ci.org/HTenkanen/pyrosm.svg?branch=master)](https://travis-ci.org/HTenkanen/pyrosm)[![Coverage Status](https://codecov.io/gh/HTenkanen/pyrosm/branch/master/graph/badge.svg)](https://codecov.io/gh/HTenkanen/pyrosm)

**Pyrosm** is a Python library for reading OpenStreetMap data from `protobuf` files (`*.osm.pbf`) into Geopandas GeoDataFrames.
Pyrosm makes it easy to extract various datasets from OpenStreetMap pbf-dumps including e.g. road networks and buildings (points of interest in progress).
Pyrosm makes it easy to extract various datasets from OpenStreetMap PBF dumps, including road networks, buildings,
Points of Interest (POI), landuse and natural elements. Fully customized queries are also supported, which makes it possible
to parse OSM data with more specific filters.


**Pyrosm** is easy to use and it provides a somewhat similar user interface as another popular Python library [OSMnx](https://github.com/gboeing/osmnx)
for parsing different datasets from the OpenStreetMap pbf-dump including road networks, buildings and Points of Interest (later also landuse and possibility to make customized calls).
**Pyrosm** is easy to use and provides a user interface somewhat similar to [OSMnx](https://github.com/gboeing/osmnx).
The main difference between pyrosm and OSMnx is that OSMnx reads the data over the internet using the Overpass API, whereas pyrosm reads the data from local OSM data dumps
that can be downloaded e.g. from [Geofabrik's website](http://download.geofabrik.de/). This makes it possible to read data much faster, thus
allowing e.g. parsing street networks for a whole country in a matter of minutes instead of hours (however, see [caveats](#caveats)).
@@ -25,16 +26,18 @@ which is also used by OpenStreetMap contributors to distribute the OSM data in P
- read street networks (separately for driving, cycling, walking and all-combined)
- read buildings from PBF
- read Points of Interest (POI) from PBF
- read landuse from PBF
- read "natural" from PBF
- read any other data from PBF by using a custom user-defined filter
- filter data based on bounding box
- apply custom filter to filter data
- e.g. keeping only specific type of buildings can be done by applying a filter: `{'building': ['residential', 'retail']}`
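
To illustrate how a filter dictionary such as `{'building': ['residential', 'retail']}` can be read, here is a minimal sketch in plain Python. It is not pyrosm's actual (Cython) filtering code: the function `matches_filter` and the sample records are hypothetical, and the sketch assumes that a value of `True` means "any value of this key", as in the default `{"building": True}` filter.

```python
def matches_filter(tags, custom_filter):
    """Return True if the element's tags match any key of the filter."""
    for key, accepted in custom_filter.items():
        if key not in tags:
            continue
        # Assumption: True accepts any value of the key,
        # otherwise the value must be in the list of accepted values.
        if accepted is True or tags[key] in accepted:
            return True
    return False


# Hypothetical OSM-like elements (not real data)
elements = [
    {"id": 1, "tags": {"building": "residential"}},
    {"id": 2, "tags": {"building": "church"}},
    {"id": 3, "tags": {"highway": "primary"}},
]

custom_filter = {"building": ["residential", "retail"]}
kept_ids = [e["id"] for e in elements if matches_filter(e["tags"], custom_filter)]
print(kept_ids)
```

Only the element tagged `building=residential` is kept here; with `{"building": True}` the church would be kept as well.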


## Roadmap

- add parsing of landuse
- improve docs and make simple website
- run benchmarks against other tools
- add possibility to crop PBF and save a subset into new PBF.
- add more tests
- automate PBF downloading from Geofabrik (?)

## Install

32 changes: 32 additions & 0 deletions make.bat
@@ -0,0 +1,32 @@
@echo off

REM Cython building utility commands for Windows

REM Clean all C-files, pyd-files, pyrobuf-directory, build-directory, and egg-info
if "%1" == "clean" (
    IF EXIST *.pyd (
        del /S *.pyd
    )
    REM For some reason C-files are not detected automatically with if-exist
    del /S *.c

    IF EXIST .coverage (
        del /S .coverage
    )

    IF EXIST pyrosm.egg-info (
        RMDIR /S /Q pyrosm.egg-info
    )

    IF EXIST pyrobuf (
        RMDIR /S /Q pyrobuf
    )

    IF EXIST build (
        RMDIR /S /Q build
    )

    IF EXIST .pytest_cache (
        RMDIR /S /Q .pytest_cache
    )
)
38 changes: 8 additions & 30 deletions pyrosm/buildings.py
@@ -1,7 +1,6 @@
from pyrosm.data_manager import get_osm_data
from pyrosm.geometry import create_polygon_geometries
from pyrosm.frames import create_gdf
from pyrosm.relations import prepare_relations
from pyrosm.frames import prepare_geodataframe
from pyrosm.utils import validate_custom_filter
import geopandas as gpd
import warnings

@@ -12,9 +11,7 @@ def get_building_data(node_coordinates, way_records, relations, tags_as_columns,
        custom_filter = {"building": True}
    else:
        # Check that the custom filter is in correct format
        if not isinstance(custom_filter, dict):
            raise ValueError(f"'custom_filter' should be a Python dictionary. "
                             f"Got {custom_filter} with type {type(custom_filter)}.")
        validate_custom_filter(custom_filter)

        # Ensure that the "building" tag exists
        if "building" not in custom_filter.keys():
@@ -31,32 +28,13 @@ def get_building_data(node_coordinates, way_records, relations, tags_as_columns,
    )

    # If no data was found, return an empty GeoDataFrame
    if ways is None:
        warnings.warn("Could not find any buildings for given area.",
    if nodes is None and ways is None and relations is None:
        warnings.warn("Could not find any buildings for given area.",
                      UserWarning,
                      stacklevel=2)
        return gpd.GeoDataFrame()

    # Create geometries for normal ways
    geometries = create_polygon_geometries(node_coordinates,
                                           ways)

    # Convert to GeoDataFrame
    way_gdf = create_gdf(ways, geometries)
    way_gdf["osm_type"] = "way"

    # Prepare relation data if it is available
    if relations is not None:
        relations = prepare_relations(relations, relation_ways,
                                      node_coordinates,
                                      tags_as_columns)
        relation_gdf = gpd.GeoDataFrame(relations)
        relation_gdf["osm_type"] = "relation"

        gdf = way_gdf.append(relation_gdf, ignore_index=True)
    else:
        gdf = way_gdf

    gdf = gdf.dropna(subset=['geometry']).reset_index(drop=True)

    # Prepare GeoDataFrame
    gdf = prepare_geodataframe(nodes, node_coordinates, ways,
                               relations, relation_ways, tags_as_columns)
    return gdf
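
The body of `validate_custom_filter` (imported from `pyrosm.utils`) is not part of the lines shown here. Based only on the inline check it replaces, a rough sketch of such a validator could look like the following. The name `validate_custom_filter_sketch` is hypothetical, and the checks on keys and values are an assumption that goes beyond what the diff shows.

```python
def validate_custom_filter_sketch(custom_filter):
    """Raise ValueError unless the filter maps tag keys to True or a list of values."""
    # From the removed inline check: the filter itself must be a dictionary.
    if not isinstance(custom_filter, dict):
        raise ValueError(
            f"'custom_filter' should be a Python dictionary. "
            f"Got {custom_filter} with type {type(custom_filter)}."
        )
    # Assumption (not shown in the diff): keys are strings,
    # values are either True or a list of accepted tag values.
    for key, values in custom_filter.items():
        if not isinstance(key, str):
            raise ValueError(f"Filter keys should be strings. Got {key!r}.")
        if values is not True and not isinstance(values, list):
            raise ValueError(
                f"Filter values should be True or a list. Got {values!r} for key {key!r}."
            )


# Valid filters pass silently
validate_custom_filter_sketch({"building": ["residential", "retail"]})
validate_custom_filter_sketch({"building": True})

# An invalid filter raises ValueError
try:
    validate_custom_filter_sketch(["building"])
except ValueError as error:
    print(error)
```

Moving the check into one shared helper means each data-type module (buildings, landuse, natural, and so on) can reuse the same validation instead of repeating the `isinstance` test.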
3 changes: 2 additions & 1 deletion pyrosm/config/default_tags.py
@@ -856,7 +856,8 @@
# PUBLIC_TRANSPORT TAGS
# ========================
# See: https://wiki.openstreetmap.org/wiki/Key%3Apublic_transport
public_transport_columns = ["stop_position",
public_transport_columns = basic_info_tags + \
["stop_position",
"platform",
"station",
"stop_area",
2 changes: 0 additions & 2 deletions pyrosm/data/__init__.py
@@ -24,8 +24,6 @@ def get_path(dataset):
    """
    if dataset in _package_files:
        return os.path.abspath(os.path.join(_module_path, _package_files[dataset]))
    elif dataset in _temp_files:
        return os.path.join(_temp_path, _temp_files[dataset])
    else:
        msg = "The dataset '{data}' is not available. ".format(data=dataset)
        msg += "Available datasets are {}".format(", ".join(available))
5 changes: 3 additions & 2 deletions pyrosm/data_filter.pyx
@@ -93,8 +93,9 @@ cdef filter_osm_records(data_records,
    if not isinstance(osm_data_type, list):
        osm_data_type = [osm_data_type]

    if len(data_filter) == 0:
        data_filter = None
    if data_filter is not None:
        if len(data_filter) == 0:
            data_filter = None

    if data_filter is not None:
        filter_keys = list(data_filter.keys())
12 changes: 10 additions & 2 deletions pyrosm/data_manager.pyx
@@ -85,8 +85,9 @@ cdef get_way_arrays(way_records, relation_way_ids, osm_keys, tags_as_columns, da
    if relation_way_ids is not None:
        # Separate ways that are part of a relation
        ways, relation_ways = separate_relation_ways(ways, relation_way_ids)
        relation_ways = convert_way_records_to_lists(relation_ways, tags_as_columns)
        relation_arrays = convert_to_arrays_and_drop_empty(relation_ways)
        if len(relation_ways) > 0:
            relation_ways = convert_way_records_to_lists(relation_ways, tags_as_columns)
            relation_arrays = convert_to_arrays_and_drop_empty(relation_ways)

    # Process separated ways
    ways = convert_way_records_to_lists(ways, tags_as_columns)
@@ -99,6 +100,10 @@ cdef get_osm_ways_and_relations(way_records, relations, osm_keys, tags_as_column
    # Tags that should always be kept
    tags_as_columns += ["id", "nodes", "timestamp", "version"]

    # If no way records were passed in, nothing can be parsed
    if way_records is None:
        return None, None, None

    # Get relations for specified OSM keys (one or multiple)
    if relations is not None:
        filtered_relations = get_relation_arrays(relations, osm_keys, data_filter)
@@ -122,6 +127,9 @@ cdef get_osm_ways_and_relations(way_records, relations, osm_keys, tags_as_column
tags_as_columns,
data_filter,
filter_type)
# If relation ways could not be parsed, also relations should be returned as None
if relation_ways is None:
filtered_relations = None

# If there weren't any ways return None
if ways is None:
6 changes: 5 additions & 1 deletion pyrosm/frames.pxd
@@ -1,2 +1,6 @@
cpdef create_nodes_gdf(node_dict_list)
cpdef create_gdf(data_records, geometry_array)
cpdef create_gdf(data_records, geometry_array)
cpdef prepare_way_gdf(node_coordinates, ways)
cpdef prepare_node_gdf(nodes)
cpdef prepare_geodataframe(nodes, node_coordinates, ways,
                           relations, relation_ways, tags_as_columns)
57 changes: 54 additions & 3 deletions pyrosm/frames.pyx
@@ -2,9 +2,11 @@ import pandas as pd
import geopandas as gpd
from pyrosm._arrays cimport concatenate_dicts_of_arrays
from pyrosm.geometry cimport _create_point_geometries

from pyrosm.geometry cimport create_way_geometries
from pyrosm.relations import prepare_relations

cpdef create_nodes_gdf(nodes):
    cdef str k
    if isinstance(nodes, list):
        nodes = concatenate_dicts_of_arrays(nodes)
    df = pd.DataFrame()
@@ -13,10 +15,9 @@ cpdef create_nodes_gdf(nodes):
    df['geometry'] = _create_point_geometries(nodes['lon'], nodes['lat'])
    return gpd.GeoDataFrame(df, crs='epsg:4326')


cpdef create_gdf(data_arrays, geometry_array):
    cdef str key
    df = pd.DataFrame()

    for key, data in data_arrays.items():
        # When inserting nodes,
        # those should be converted
@@ -28,3 +29,53 @@ cpdef create_gdf(data_arrays, geometry_array):

    df['geometry'] = geometry_array
    return gpd.GeoDataFrame(df, crs='epsg:4326')

cpdef prepare_way_gdf(node_coordinates, ways):
    if ways is not None:
        geometries = create_way_geometries(node_coordinates,
                                           ways)
        # Convert to GeoDataFrame
        way_gdf = create_gdf(ways, geometries)
        way_gdf['osm_type'] = "way"
    else:
        way_gdf = gpd.GeoDataFrame()
    return way_gdf

cpdef prepare_node_gdf(nodes):
    if nodes is not None:
        # Create GeoDataFrame from nodes
        node_gdf = create_nodes_gdf(nodes)
        node_gdf['osm_type'] = "node"
    else:
        node_gdf = gpd.GeoDataFrame()
    return node_gdf

cpdef prepare_relation_gdf(node_coordinates, relations, relation_ways, tags_as_columns):
    if relations is not None:
        relations = prepare_relations(relations, relation_ways,
                                      node_coordinates,
                                      tags_as_columns)

        relation_gdf = gpd.GeoDataFrame(relations)
        relation_gdf['osm_type'] = "relation"

    else:
        relation_gdf = gpd.GeoDataFrame()
    return relation_gdf

cpdef prepare_geodataframe(nodes, node_coordinates, ways,
                           relations, relation_ways,
                           tags_as_columns):
    # Prepare nodes
    node_gdf = prepare_node_gdf(nodes)

    # Prepare ways
    way_gdf = prepare_way_gdf(node_coordinates, ways)

    # Prepare relation data
    relation_gdf = prepare_relation_gdf(node_coordinates, relations, relation_ways, tags_as_columns)

    # Merge all
    gdf = pd.concat([node_gdf, way_gdf, relation_gdf])
    gdf = gdf.dropna(subset=['geometry']).reset_index(drop=True)
    return gdf
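
The new `prepare_geodataframe` follows a simple pattern: each `prepare_*` helper returns an empty frame when its input is `None`, so the final merge never has to branch on which element types were found. A rough standard-library sketch of that pattern is shown below, using plain lists of dicts in place of GeoDataFrames. The helper names and sample records are hypothetical and are not pyrosm code.

```python
def prepare_records(records, osm_type):
    """Tag each record with its OSM element type; None yields an empty list."""
    if records is None:
        return []
    return [dict(record, osm_type=osm_type) for record in records]


def merge_records(nodes, ways, relations):
    """Concatenate all element types and drop records without a geometry."""
    merged = (
        prepare_records(nodes, "node")
        + prepare_records(ways, "way")
        + prepare_records(relations, "relation")
    )
    # Mirrors the dropna(subset=['geometry']) step of the real function
    return [record for record in merged if record.get("geometry") is not None]


# Hypothetical inputs: one node, two ways (one without geometry), no relations
nodes = [{"id": 1, "geometry": "POINT (24.9 60.2)"}]
ways = [
    {"id": 2, "geometry": None},
    {"id": 3, "geometry": "LINESTRING (24.9 60.2, 25.0 60.3)"},
]
merged = merge_records(nodes, ways, None)
print([(record["id"], record["osm_type"]) for record in merged])
```

Because missing inputs collapse to empty containers, the caller only needs a single check up front (all three inputs being `None`) to decide whether to warn and return early.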
2 changes: 1 addition & 1 deletion pyrosm/geometry.pxd
@@ -1,8 +1,8 @@
cpdef create_point_geometries(xarray, yarray)
cdef _create_point_geometries(xarray, yarray)
cdef _create_way_geometries(node_coordinates, way_elements)
cpdef create_way_geometries(node_coordinates, way_elements)
cdef create_pygeos_polygon_from_relation(node_coordinates, relation_ways, member_roles)
cpdef create_polygon_geometries(node_coordinates, way_elements)
cdef create_linear_ring(coordinates)
cpdef create_node_coordinates_lookup(nodes)
cdef pygeos_to_shapely(geom)