## OpenStreetMap Data for China

The OpenStreetMap (OSM) data for China, provided by Geofabrik GmbH at https://download.geofabrik.de/asia/china.html, is an openly accessible, regularly updated geospatial dataset derived from the global OpenStreetMap project. It offers vector data describing China’s road network, land use, buildings, natural features, and a wide range of other geographic entities. The dataset is widely used in urban studies, transportation research, and spatial econometrics because of its free access, extensive coverage, and high level of detail in metropolitan regions.


In [1]:
%pip install geopandas networkx shapely pyproj

Collecting geopandas
  Using cached geopandas-1.1.1-py3-none-any.whl.metadata (2.3 kB)
Collecting networkx
  Using cached networkx-3.4.2-py3-none-any.whl.metadata (6.3 kB)
Collecting shapely
  Using cached shapely-2.1.2-cp310-cp310-manylinux2014_x86_64.manylinux_2_17_x86_64.whl.metadata (6.8 kB)
Collecting pyproj
  Using cached pyproj-3.7.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (31 kB)
Collecting numpy>=1.24 (from geopandas)
  Using cached numpy-2.2.6-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (62 kB)
Collecting pyogrio>=0.7.2 (from geopandas)
  Using cached pyogrio-0.12.0-cp310-cp310-manylinux_2_28_x86_64.whl.metadata (5.3 kB)
Collecting pandas>=2.0.0 (from geopandas)
  Using cached pandas-2.3.3-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl.metadata (91 kB)
Collecting certifi (from pyproj)
  Using cached certifi-2025.11.12-py3-none-any.whl.metadata (2.5 kB)
Collecting pytz>=2020.1 (from pandas>=2.0.0->geopandas)
  Us

In [3]:
import pandas as pd
import geopandas as gpd

# Geofabrik shapefile layers for the Beijing extract (longitude/latitude coordinates).
roads = gpd.read_file("data/beijing-251127-free/gis_osm_roads_free_1.shp")
# The transport layer mixes bus stops with other classes such as railway stations.
bus_stops = gpd.read_file("data/beijing-251127-free/gis_osm_transport_free_1.shp")
rental_df = pd.read_csv("data/retailrent.csv")

# Turn the longitude/latitude columns of the rental table into point geometries.
rental = gpd.GeoDataFrame(
    rental_df,
    geometry=gpd.points_from_xy(rental_df.longitude, rental_df.latitude),
    crs="EPSG:4326",
)


In [26]:
roads.head(20)

| | osm_id | code | fclass | name | ref | oneway | maxspeed | layer | bridge | tunnel | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 4231222 | 5114 | secondary | | | F | 0 | 0 | F | F | LINESTRING (116.38943 39.90626, 116.38945 39.9... |
| 1 | 4231223 | 5114 | secondary | 广场东侧路 | | F | 70 | 0 | F | F | LINESTRING (116.39346 39.89888, 116.39344 39.8... |
| 2 | 4263917 | 5112 | trunk | 复兴路 | G108 | F | 0 | 0 | F | F | LINESTRING (116.3251 39.9063, 116.32474 39.906... |
| 3 | 4263918 | 5115 | tertiary | 兵部洼胡同 | | B | 0 | 0 | F | F | LINESTRING (116.38186 39.89886, 116.38187 39.8... |
| 4 | 4484480 | 5115 | tertiary | 军博西路 | | F | 0 | 0 | F | F | LINESTRING (116.31552 39.9065, 116.31558 39.90... |
| 5 | 4484630 | 5112 | trunk | 东长安街 | | F | 0 | 0 | F | F | LINESTRING (116.41165 39.90694, 116.41152 39.9... |
| 6 | 4493469 | 5114 | secondary | 南新华街 | | F | 0 | 0 | F | F | LINESTRING (116.37797 39.89737, 116.37803 39.8... |
| 7 | 4822336 | 5115 | tertiary | 慧忠路 | | F | 0 | 0 | F | F | LINESTRING (116.39472 39.99325, 116.39496 39.9... |
| 8 | 4822342 | 5122 | residential | | | B | 0 | 0 | F | F | LINESTRING (116.39718 39.99328, 116.39722 39.9... |
| 9 | 4822362 | 5112 | trunk | 北四环中路 | | F | 80 | 0 | F | F | LINESTRING (116.36406 39.98593, 116.36562 39.9... |


In [4]:
railways = gpd.read_file("data/beijing-251127-free/gis_osm_railways_free_1.shp")
subway_line = railways[railways['fclass'] == 'subway']

In [5]:
subway_line.head()

| | osm_id | code | fclass | name | layer | bridge | tunnel | geometry |
|---|---|---|---|---|---|---|---|---|
| 3 | 24818659 | 6103 | subway | 北京地铁1号线 | -4 | F | T | LINESTRING (116.16492 39.93855, 116.16533 39.9... |
| 32 | 25041427 | 6103 | subway | | -1 | F | T | LINESTRING (116.29349 40.06402, 116.29359 40.0... |
| 33 | 25041428 | 6103 | subway | | 0 | F | F | LINESTRING (116.29822 40.07002, 116.2988 40.07... |
| 47 | 25197624 | 6103 | subway | 北京地铁13号线 | -2 | F | T | LINESTRING (116.42756 39.95245, 116.42757 39.9... |
| 65 | 26141450 | 6103 | subway | | 1 | T | F | LINESTRING (116.31185 40.03598, 116.31143 40.0... |


In [22]:
bus_stops = gpd.read_file("data/beijing-251127-free/gis_osm_transport_free_1.shp")
bus_stops.head()

| | osm_id | code | fclass | name | geometry |
|---|---|---|---|---|---|
| 0 | 32618803 | 5621 | bus_stop | 铸钟厂 | POINT (116.38547 39.94023) |
| 1 | 32618809 | 5621 | bus_stop | 德内甘水桥 | POINT (116.38149 39.94313) |
| 2 | 124205138 | 5601 | railway_station | 北京丰台 | POINT (116.29534 39.85) |
| 3 | 240421944 | 5621 | bus_stop | 海淀黄庄南 | POINT (116.3126 39.97227) |
| 4 | 268389725 | 5621 | bus_stop | 大石桥北 | POINT (116.32224 40.0122) |
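
The output above shows that the transport layer also contains features other than bus stops (row 2 is a `railway_station`), so a variable named `bus_stops` probably needs a filter on `fclass`, in the same way `subway_line` was filtered from the railways layer. The following is a minimal sketch of that filter. It uses a plain pandas DataFrame with three rows copied from the output so that it runs without the shapefile; the names `transport`, `bus_only`, and `class_counts` are illustrative and not part of the original notebook. The same boolean mask should work unchanged on the GeoDataFrame.

```python
import pandas as pd

# Three rows copied from the head() output above; geometry is omitted for brevity.
transport = pd.DataFrame(
    {
        "osm_id": [32618803, 124205138, 240421944],
        "fclass": ["bus_stop", "railway_station", "bus_stop"],
        "name": ["铸钟厂", "北京丰台", "海淀黄庄南"],
    }
)

# Count the feature classes first to see what the layer actually contains.
class_counts = transport["fclass"].value_counts()

# Keep only bus stops, mirroring the subway filter on the railways layer.
bus_only = transport[transport["fclass"] == "bus_stop"].reset_index(drop=True)
```

In the notebook itself, the equivalent line would presumably be `bus_stops = bus_stops[bus_stops["fclass"] == "bus_stop"]`.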


In [10]:
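import networkx as nx
from shapely.geometry import LineString, Point

G = nx.Graph()   # an undirected graph is usually sufficient for a road network

for idx, row in roads.iterrows():
    geom = row.geometry
    # Skip records without a usable geometry.
    if geom is None or geom.is_empty:
        continue
    attrs = row.drop("geometry").to_dict()   # road attributes copied onto every edge

    # Treat single-part and multi-part geometries uniformly.
    lines = geom.geoms if geom.geom_type == "MultiLineString" else [geom]

    for line in lines:
        coords = list(line.coords)

        # Each consecutive pair of vertices becomes one edge; nodes are keyed by coordinate tuple.
        for u, v in zip(coords[:-1], coords[1:]):
            u_pt = Point(u)
            v_pt = Point(v)
            # Planar distance in the layer's coordinate units; the coordinates shown
            # above are longitude/latitude, so this value is in degrees, not meters.
            length = u_pt.distance(v_pt)

            G.add_node(u, node_type="road")
            G.add_node(v, node_type="road")

            G.add_edge(u, v, length=length, geometry=LineString([u, v]), **attrs)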
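
Because the road coordinates are longitude/latitude pairs, the `length` stored on each edge by `Point.distance` is expressed in degrees, which is hard to interpret as a travel distance. One possible alternative, sketched below, is the haversine great-circle distance in meters. This is an assumption about what the analysis needs rather than the notebook's own method; the helper name `haversine_m` and the spherical Earth radius constant are illustrative choices, and reprojecting the layer to a metric CRS would be another option.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; a spherical approximation


def haversine_m(u, v):
    """Great-circle distance in meters between two (lon, lat) tuples in degrees."""
    lon1, lat1 = math.radians(u[0]), math.radians(u[1])
    lon2, lat2 = math.radians(v[0]), math.radians(v[1])
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = (
        math.sin(dlat / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    )
    # Clamp to 1.0 to guard against rounding error before taking the arcsine.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


# Start vertices of the first two roads listed in the roads output.
u = (116.38943, 39.90626)
v = (116.39346, 39.89888)
distance_m = haversine_m(u, v)
```

Inside the graph-building loop, `length=haversine_m(u, v)` could then replace the planar distance while keeping the node keys unchanged.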