Commit

Merge remote-tracking branch 'origin/master'
HTenkanen committed Oct 27, 2023
2 parents b196aa1 + 696861a commit b618445
Showing 3 changed files with 9 additions and 35 deletions.
36 changes: 5 additions & 31 deletions README.md
@@ -11,16 +11,15 @@
**Pyrosm** is a Python library for reading OpenStreetMap data from Protocolbuffer Binary Format -files (`*.osm.pbf`) into Geopandas GeoDataFrames.
Pyrosm makes it easy to extract various datasets from OpenStreetMap pbf-dumps including e.g. road networks, buildings,
Points of Interest (POI), landuse and natural elements. Also fully customized queries are supported which makes it possible
-to parse the data from OSM with more specific filters.
-
+to parse the data from OSM with more specific filters.

**Pyrosm** is easy to use and it provides a somewhat similar user interface as [OSMnx](https://github.com/gboeing/osmnx).
The main difference between pyrosm and OSMnx is that OSMnx reads the data over internet using OverPass API, whereas pyrosm reads the data from local OSM data dumps
that can be downloaded e.g. from [GeoFabrik's website](http://download.geofabrik.de/). This makes it possible to read data faster thus
allowing e.g. parsing street networks for the whole country fairly efficiently (however, see [caveats](#caveats)).

The library has been developed by keeping performance in mind, hence, it is mainly written in Cython (*Python with C-like performance*)
-which makes it probably faster than any other Python alternatives for parsing OpenStreetMap data.
+which makes it fast to parse OpenStreetMap data from PBF files.
Pyrosm is built on top of another Cython library called [Pyrobuf](https://github.com/appnexus/pyrobuf) which is a faster Cython alternative
to Google's Protobuf library: It provides 2-4x boost in performance for deserializing the protocol buffer messages compared to
Google's version with C++ backend. Google's Protocol Buffers is a commonly used and efficient method to serialize and compress structured data
@@ -41,13 +40,6 @@ which is also used by OpenStreetMap contributors to distribute the OSM data in P
- filter data based on bounding box
- export networks as a directed graph to `igraph`, `networkx` and `pandana`

-## Roadmap
-
-- add possibility to optimize memory usage (see #87)
-- add possibility to simplify graph (see #89)
-- add possibility to crop PBF and save a subset into new PBF.
-- add Cython specific tests
-
## Install

Pyrosm is distributed via PyPi and conda-forge.
@@ -83,23 +75,6 @@ That being said, it is also possible to extract neighborhood level information w
Using `pyrosm` is straightforward. See [docs](https://pyrosm.readthedocs.io/en/latest/basics.html)
for instructions how to use the library.

-## Performance
-
-See [docs](https://pyrosm.readthedocs.io/en/latest/benchmarking.html) for more comprehensive benchmarking tests. Reading all drivable roads in Helsinki Region (approx. 85,000 roads)
-takes approximately **12 seconds** (laptop with 16GB memory, SSD drive, and Intel Core i5-8250U CPU 1.6 GHZ). And the result looks something like:
-
-![Helsinki_driving_net](resources/img/Helsinki_driving_net.PNG)
-
-Parsing all buildings from the same area (approx. 180,000) takes approximately **17 seconds**. And the result looks something like:
-
-![Helsinki_building_footprints](resources/img/Helsinki_building_footprints.png)
-
-Parsing all Points of Interest (POIs) with defaults elements (amenities, shops and tourism)
-takes approximately **14 seconds** (approx. 32,000 features).
-And the result looks something like:
-
-![Helsinki_POIs](resources/img/Helsinki_POIs_amenity_shop_tourism.png)
-
## Get in touch + contributions

If you find a bug from the tool, have question, or would like to suggest a new feature to it, you can [make a new issue here](https://github.com/HTenkanen/pyrosm/issues).
@@ -111,18 +86,17 @@ please check the [contribution guidelines](https://pyrosm.readthedocs.io/en/late

You can install a local development version of the tool by 1) installing necessary packages with conda and 2) building pyrosm from source:

-1. install conda-environment for Python 3.7 or 3.8 by:
+1. install conda-environment for Python 3.12 by:

-- Python 3.7 (you might want to modify the env-name which is `test` by default): `$ conda env create -f ci/37-conda.yaml`
-- Python 3.8: `$ conda env create -f ci/38-conda.yaml`
+- Python 3.12 (you might want to modify the env-name which is `test` by default): `$ conda env create -f ci/312-conda.yaml`

2. build pyrosm development version from master (activate the environment first):

- `pip install -e .`

You can run tests with `pytest` by executing:

-`$ pytest -v`
+`$ pytest . -v`


## License and copyright
6 changes: 3 additions & 3 deletions pyrosm/frames.pyx
@@ -98,7 +98,7 @@ cpdef prepare_way_gdf(node_coordinates, ways, parse_network, calculate_seg_lengt
way_gdf["v"] = v

# Calculate the length of the geometries
-way_gdf["length"] = calculate_geom_array_length(way_gdf.geometry.values.data)
+way_gdf["length"] = calculate_geom_array_length(way_gdf.geometry.values.to_numpy())

# For cases not related to networks
else:
@@ -126,7 +126,7 @@ cpdef prepare_relation_gdf(node_coordinates, relations, relation_ways, tags_as_c
node_coordinates,
tags_as_columns)

-relation_gdf = gpd.GeoDataFrame(relations)
+relation_gdf = gpd.GeoDataFrame(relations, crs="epsg:4326")
relation_gdf['osm_type'] = "relation"

else:
@@ -154,7 +154,7 @@ cpdef prepare_geodataframe(nodes, node_coordinates, ways,
# Prepare nodes
node_gdf = prepare_node_gdf(nodes)
else:
-node_gdf = pd.DataFrame()
+node_gdf = gpd.GeoDataFrame()

# Merge all
gdf = pd.concat([node_gdf, way_gdf, relation_gdf])
2 changes: 1 addition & 1 deletion tests/test_graph_exports.py
@@ -334,7 +334,7 @@ def test_igraph_connectivity(immutable_nodes_and_edges):

# Test that finding shortest paths works for all nodes
N = g.vcount()
-shortest_paths = g.shortest_paths_dijkstra(
+shortest_paths = g.distances(
source=5, target=[i for i in range(N)], weights="length"
)
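
Reviewer note on the hunk above: the test swaps `g.shortest_paths_dijkstra(...)` for `g.distances(...)`, which, as far as I know, is the name newer python-igraph releases use for the same weighted shortest-path-length query. The sketch below is only an illustration of what that query computes. It is not igraph's implementation: it uses the standard library only, and the function name `dijkstra_distances`, the `(u, v, length)` edge-tuple layout, and the choice to treat edges as directed are assumptions made for this example.

```python
import heapq


def dijkstra_distances(n_vertices, edges, source):
    """Shortest weighted distance from `source` to every vertex.

    `edges` is an iterable of (u, v, length) tuples, treated as directed
    (an assumption for this sketch). Unreachable vertices get float("inf").
    """
    # Build an adjacency list: adjacency[u] holds (neighbor, length) pairs.
    adjacency = [[] for _ in range(n_vertices)]
    for u, v, length in edges:
        adjacency[u].append((v, length))

    dist = [float("inf")] * n_vertices
    dist[source] = 0.0
    queue = [(0.0, source)]  # min-heap of (distance so far, vertex)

    while queue:
        d, u = heapq.heappop(queue)
        if d > dist[u]:
            continue  # stale heap entry; a shorter path was already found
        for v, length in adjacency[u]:
            candidate = d + length
            if candidate < dist[v]:
                dist[v] = candidate
                heapq.heappush(queue, (candidate, v))
    return dist


# Hypothetical toy graph: 4 vertices, vertex 3 is disconnected.
toy_edges = [(0, 1, 2.0), (1, 2, 3.0), (0, 2, 10.0)]
toy_distances = dijkstra_distances(4, toy_edges, source=0)
```

In the test, the equivalent call asks for distances from vertex 5 to every vertex using the `length` edge attribute as the weight, so a connectivity check can amount to verifying that no returned distance is infinite.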
