## Basic imports

In [None]:
import geopandas as gpd
import pandas as pd
from shapely.geometry import Point
from shapely.strtree import STRtree

## 1. GVI

The GVI dataset used in this study comes from the Treepedia project (MIT, 2015) and is based on Google Street View images captured around 2015. It may therefore miss recent greening interventions or streetscape changes, but GVI is used here as a general proxy for the visual green structure of urban space. Limitations related to data currency are discussed in Section X.

To address gaps in GVI coverage along street segments, a borough-level imputation strategy was applied. Each road segment was first assigned the mean GVI of all GVI points within a 20-meter buffer. For segments with no GVI points inside the buffer, the mean GVI of the Greater London borough containing the segment was used instead. All imputed segments were flagged to keep the analysis transparent and to support later interpretation.
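
The two-stage rule described above can be illustrated with a minimal, dependency-free sketch. It is not the notebook's implementation: the function name `impute_gvi`, the tuple layouts, and the toy coordinates are assumptions made for illustration, and it uses brute-force straight-line distance on projected coordinates rather than a spatial index.

```python
from math import hypot
from statistics import mean

def impute_gvi(midpoints, gvi_points, borough_means, radius=20.0):
    """Return one (gvi, flag) pair per midpoint; flag is 1 when the borough mean was used.

    midpoints:     list of (x, y, borough) tuples in a metric CRS (hypothetical layout)
    gvi_points:    list of (x, y, gvi) tuples in the same CRS (hypothetical layout)
    borough_means: dict mapping borough name to its mean GVI
    radius:        buffer radius in CRS units (20 m, as stated in the text)
    """
    results = []
    for mx, my, borough in midpoints:
        # Collect GVI values of all points within the buffer radius
        nearby = [g for x, y, g in gvi_points if hypot(x - mx, y - my) <= radius]
        if nearby:
            results.append((mean(nearby), 0))   # observed: buffer mean
        else:
            # Fallback: borough mean (None if the borough is unknown), flagged as imputed
            results.append((borough_means.get(borough), 1))
    return results

# Toy data, made up for illustration only
gvi_points = [(0.0, 0.0, 10.0), (5.0, 0.0, 20.0), (500.0, 500.0, 40.0)]
borough_means = {"A": 15.0, "B": 40.0}
midpoints = [(2.0, 0.0, "A"), (100.0, 100.0, "A"), (900.0, 900.0, "C")]

result = impute_gvi(midpoints, gvi_points, borough_means)
```

In this sketch a midpoint whose borough has no known mean stays missing but is still flagged, so it can be found and handled separately later.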

### Load the data

In [None]:
boroughs_gdf = gpd.read_file("../data/BoroughShp/borough/borough.shp")
print(boroughs_gdf.columns)

edges_gdf = gpd.read_file("../data/Roads/london_edges_FIXED.gpkg")
print(edges_gdf.columns)

gvi_points = gpd.read_file("../data/Env/greenview_london.json/greenview_london.json")
print(gvi_points.columns)

In [None]:
projected_crs = "EPSG:27700"
edges_gdf = edges_gdf.to_crs(projected_crs)
boroughs_gdf = boroughs_gdf.to_crs(projected_crs)
gvi_points = gvi_points.to_crs(projected_crs)

In [None]:
boroughs_gdf = boroughs_gdf[['NAME', 'geometry']].rename(columns={'NAME': 'name'})

In [None]:
# Assign each GVI point to a borough (needed later for the fallback)
gvi_points = gvi_points.to_crs(boroughs_gdf.crs)
gvi_with_borough = gpd.sjoin(gvi_points, boroughs_gdf, how='inner', predicate='within')

# Mean GVI per borough
borough_gvi_mean = gvi_with_borough.groupby('name')['greenView'].mean().to_dict()
print(borough_gvi_mean)

In [None]:
import geopandas as gpd
import pandas as pd
from tqdm.notebook import tqdm
tqdm.pandas()

# === 0. Optionally restrict to a subset of edges for testing (disabled: all edges are used) ===
# edges_sample = edges_gdf.iloc[:10000].copy()
edges_sample = edges_gdf

# === 1. Create a midpoint column for each road segment ===
print("1begin")
edges_sample['midpoint'] = edges_sample.geometry.interpolate(0.5, normalized=True)
midpoints_gdf = gpd.GeoDataFrame(edges_sample[['midpoint']], geometry='midpoint', crs=edges_sample.crs)

# === 2. Build a spatial index on the GVI points for faster lookups ===
print("2begin")
gvi_points = gvi_points.to_crs(edges_sample.crs)
gvi_sindex = gvi_points.sindex

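# === 3. Buffer each segment midpoint by 30 m and average the GVI points inside ===
# The radius is in CRS units (metres in EPSG:27700); note that the method text above states 20 m.
print("3begin")
def compute_gvi_buffer_mean(point):
    buffer = point.buffer(30)
    # Coarse bounding-box filter via the spatial index, then an exact geometric test
    possible = list(gvi_sindex.intersection(buffer.bounds))
    near = gvi_points.iloc[possible]
    near = near[near.intersects(buffer)]
    if len(near) > 0:
        return near['greenView'].mean()
    # No GVI point inside the buffer: leave missing so the borough fallback applies
    return float('nan')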

midpoints_gdf['gvi_buffer_mean'] = midpoints_gdf['midpoint'].progress_apply(compute_gvi_buffer_mean)

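# === 4. Tag each midpoint with its borough, used for the fallback ===
print("4begin")
midpoints_with_borough = gpd.sjoin(midpoints_gdf, boroughs_gdf, how='left', predicate='within')
# If borough polygons overlap, a midpoint can match more than one of them; keep one row
# per edge so the index still aligns with edges_sample in step 6
midpoints_with_borough = midpoints_with_borough[~midpoints_with_borough.index.duplicated(keep='first')]
midpoints_with_borough['borough'] = midpoints_with_borough['name']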

# === 5. Build the final GVI value and the imputation flag (0 = buffer mean, 1 = borough mean) ===
print("5begin")
def assign_final_gvi(row):
    if pd.notnull(row['gvi_buffer_mean']):
        return row['gvi_buffer_mean'], 0
    else:
        borough = row['borough']
        return borough_gvi_mean.get(borough, None), 1

midpoints_with_borough[['gvi_final', 'gvi_flag']] = midpoints_with_borough.apply(assign_final_gvi, axis=1, result_type='expand')

# === 6. Merge the results back into edges_sample ===
print("6begin")
edges_sample['gvi_final'] = midpoints_with_borough['gvi_final']
edges_sample['gvi_flag'] = midpoints_with_borough['gvi_flag']

In [None]:
print(edges_sample.shape)
print(edges_sample['gvi_final'].value_counts(dropna=False))

print(gvi_points.crs)
print(edges_gdf.crs)

In [None]:
import matplotlib.pyplot as plt
import pandas as pd

# Custom colours for the GVI classes (low to high)
gvi_bins = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
gvi_colors = ['#d62728', '#ff7f0e', '#ffdb58', '#2ca02c', '#006400']
gvi_labels = ['0–0.2', '0.2–0.4', '0.4–0.6', '0.6–0.8', '0.8–1.0']

# Step 1: rescale by the observed maximum so values fall in [0, 1]
edges_sample['gvi_final_norm'] = edges_sample['gvi_final'] / edges_sample['gvi_final'].max()

# Step 2: bin into classes
edges_sample['gvi_group'] = pd.cut(
    edges_sample['gvi_final_norm'],
    bins=gvi_bins,
    labels=gvi_labels,
    include_lowest=True
)

# Step 3: plot each class in its own colour
fig, ax = plt.subplots(figsize=(12, 12))
for label, color in zip(gvi_labels, gvi_colors):
    subset = edges_sample[edges_sample['gvi_group'] == label]
    if not subset.empty:
        subset.plot(ax=ax, color=color, linewidth=0.8, label=f'GVI {label}')

# Title, legend, and layout
ax.set_title("Sample Road Segments – Green View Index (GVI)", fontsize=16)
ax.set_axis_off()
ax.legend(title="GVI Range", loc='lower left')
plt.tight_layout()
plt.show()

In [None]:
print(edges_sample['gvi_final_norm'].value_counts(dropna=False))

In [None]:
ax = edges_gdf.sample(3000).plot(color='gray', linewidth=0.5)
gvi_points.sample(3000).plot(ax=ax, color='green', markersize=1)
midpoints_gdf.sample(3000).plot(ax=ax, color='red', markersize=5)