# Some features are meaningful not within a single dong but across several neighboring dongs

- e.g., universities

To handle this, find the nearest dongs for each dong.

In [1]:
from google.colab import drive
drive.mount('/content/drive')

drive_path = '/content/drive/MyDrive/new_project1/'

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [2]:
import pandas as pd
import numpy as np

import re

import geopandas as gpd
from shapely.geometry import Point

In [3]:
dong_shp = gpd.read_file('/content/drive/MyDrive/new_project1/data/seoul_shp/dong/bnd_dong_11_2023_2023_2Q.shp')
dong_shp.head()

Unnamed: 0,BASE_DATE,ADM_NM,ADM_CD,geometry
0,20230701,사직동,11010530,"POLYGON ((953553.932 1953335.741, 953555.211 1..."
1,20230701,삼청동,11010540,"POLYGON ((954025.242 1953916.389, 954026.972 1..."
2,20230701,부암동,11010550,"POLYGON ((952490.380 1956548.821, 952497.594 1..."
3,20230701,평창동,11010560,"POLYGON ((953683.828 1959209.871, 953665.283 1..."
4,20230701,한남동,11030740,"POLYGON ((956238.296 1950166.610, 956237.942 1..."


In [4]:
# Compute each dong's centroid, used later to measure dong-to-dong distances
dong_shp['centroid'] = dong_shp['geometry'].centroid
dong_shp.head()

Unnamed: 0,BASE_DATE,ADM_NM,ADM_CD,geometry,centroid
0,20230701,사직동,11010530,"POLYGON ((953553.932 1953335.741, 953555.211 1...",POINT (953232.514 1952855.551)
1,20230701,삼청동,11010540,"POLYGON ((954025.242 1953916.389, 954026.972 1...",POINT (954197.713 1954457.198)
2,20230701,부암동,11010550,"POLYGON ((952490.380 1956548.821, 952497.594 1...",POINT (952643.871 1955418.736)
3,20230701,평창동,11010560,"POLYGON ((953683.828 1959209.871, 953665.283 1...",POINT (953060.025 1957493.939)
4,20230701,한남동,11030740,"POLYGON ((956238.296 1950166.610, 956237.942 1...",POINT (956284.258 1948804.661)


In [5]:
region_code = pd.read_csv('/content/drive/MyDrive/new_project1/data/complete/region_code.csv')
region_code.head()

Unnamed: 0,ADM_CD,dong_name,통계청코드,도로명코드10,통계청코드8,도로명코드8,도로명코드,시도명,시군구,읍면동
0,11010530,서울특별시 종로구 사직동,1101053,1111053000,11010530,11110530,11110530,서울특별시,종로구,사직동
1,11010540,서울특별시 종로구 삼청동,1101054,1111054000,11010540,11110540,11110540,서울특별시,종로구,삼청동
2,11010550,서울특별시 종로구 부암동,1101055,1111055000,11010550,11110550,11110550,서울특별시,종로구,부암동
3,11010560,서울특별시 종로구 평창동,1101056,1111056000,11010560,11110560,11110560,서울특별시,종로구,평창동
4,11030740,서울특별시 용산구 한남동,1103074,1117068500,11030740,11170685,11170685,서울특별시,용산구,한남동


In [6]:
region_code['ADM_CD'] = region_code['ADM_CD'].astype('int')
dong_shp['ADM_CD'] = dong_shp['ADM_CD'].astype('int')

In [7]:
# Strip the middle dot (·) and the period (.) from dong names
dong_shp['ADM_NM'] = dong_shp['ADM_NM'].str.replace(r'[·.]', '', regex=True)
# Sanity check: no name should still contain either character (expect an empty frame)
dong_shp[dong_shp['ADM_NM'].str.contains(r'[·.]', regex=True)]

Unnamed: 0,BASE_DATE,ADM_NM,ADM_CD,geometry,centroid
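
The cell above removes `·` and `.` from dong names so that names match across data sources. As a minimal sketch of the same normalization on a single string, the helper below uses one character class. The name `normalize_dong_name` and the raw spelling `종로5·6가동` are assumptions made for illustration; the shapefile may spell the separator differently.

```python
import re


def normalize_dong_name(name: str) -> str:
    """Remove the middle dot (·) and the period (.) from a dong name."""
    # Inside a character class the period is literal, so no escaping is needed.
    return re.sub(r"[·.]", "", name)


print(normalize_dong_name("종로5·6가동"))  # 종로56가동
print(normalize_dong_name("사직동"))       # 사직동 (unchanged)
```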


In [8]:
map_base = dong_shp.merge(region_code[['ADM_CD', '시군구']],
                          how = 'left',
                          on = 'ADM_CD')
map_base.head()

Unnamed: 0,BASE_DATE,ADM_NM,ADM_CD,geometry,centroid,시군구
0,20230701,사직동,11010530,"POLYGON ((953553.932 1953335.741, 953555.211 1...",POINT (953232.514 1952855.551),종로구
1,20230701,삼청동,11010540,"POLYGON ((954025.242 1953916.389, 954026.972 1...",POINT (954197.713 1954457.198),종로구
2,20230701,부암동,11010550,"POLYGON ((952490.380 1956548.821, 952497.594 1...",POINT (952643.871 1955418.736),종로구
3,20230701,평창동,11010560,"POLYGON ((953683.828 1959209.871, 953665.283 1...",POINT (953060.025 1957493.939),종로구
4,20230701,한남동,11030740,"POLYGON ((956238.296 1950166.610, 956237.942 1...",POINT (956284.258 1948804.661),용산구
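
The left join above attaches the gu (`시군구`) to each dong through `ADM_CD`. A dong whose code has no match would silently get a missing gu and then be skipped by the per-gu loop later on. The sketch below shows one way to check for that, using made-up frames: `left`, `right`, and the code `99999999` are illustrative and not taken from the project data.

```python
import pandas as pd

# Toy stand-ins for dong_shp and region_code; 99999999 is a deliberately unmatched code.
left = pd.DataFrame({"ADM_CD": [11010530, 11010540, 99999999]})
right = pd.DataFrame({"ADM_CD": [11010530, 11010540], "시군구": ["종로구", "종로구"]})

# validate="many_to_one" raises if ADM_CD is duplicated on the right side;
# indicator=True adds a _merge column recording where each row was found.
checked = left.merge(right, how="left", on="ADM_CD",
                     validate="many_to_one", indicator=True)

unmatched = checked[checked["_merge"] == "left_only"]
print(unmatched["ADM_CD"].tolist())  # [99999999]
```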


### Computing the nearest dongs

In [9]:
# Build an empty DataFrame with a name/code/distance column triple for each of the dong_cnt neighbors
column_list = ['ADM_CD', "ADM_NM"]
dong_cnt = 10

for i in range(1, dong_cnt+1):
    column_list.append(f'dong_{i}_name')
    column_list.append(f'dong_{i}_code')
    column_list.append(f'dong_{i}_distance')

dong_distance = pd.DataFrame(columns=column_list)

In [10]:
# Running counter used only to build a unique row index.
# Note: it is incremented once per dong, not once per gu.
gu_num = 0

for gu in map_base['시군구'].unique():
    # Neighbors are searched only among dongs in the same gu
    temp = map_base[map_base['시군구'] == gu].reset_index(drop=True)

    for idx in range(len(temp)):
        centroid_point = temp['centroid'][idx]

        # Centroid-to-centroid distance to every dong in this gu (itself included, at distance 0)
        distances = [centroid_point.distance(centroid) for centroid in temp['centroid']]

        # Sort by distance and skip position 0, which is the dong itself;
        # a gu with fewer than dong_cnt + 1 dongs leaves the remaining columns empty
        sorted_indices = sorted(range(len(distances)), key=lambda i: distances[i])[1:dong_cnt+1]

        # Fill in the output row for this dong
        gu_num += 1
        new_idx = gu_num * 100 + idx
        for i, dong_num in enumerate(sorted_indices):
            dong_distance.loc[new_idx, 'ADM_NM'] = temp['ADM_NM'][idx]
            dong_distance.loc[new_idx, 'ADM_CD'] = temp['ADM_CD'][idx]
            dong_distance.loc[new_idx, f'dong_{i+1}_name'] = temp['ADM_NM'][dong_num]
            dong_distance.loc[new_idx, f'dong_{i+1}_code'] = temp['ADM_CD'][dong_num]
            dong_distance.loc[new_idx, f'dong_{i+1}_distance'] = distances[dong_num]

In [11]:
dong_distance.head()

Unnamed: 0,ADM_CD,ADM_NM,dong_1_name,dong_1_code,dong_1_distance,dong_2_name,dong_2_code,dong_2_distance,dong_3_name,dong_3_code,...,dong_7_distance,dong_8_name,dong_8_code,dong_8_distance,dong_9_name,dong_9_code,dong_9_distance,dong_10_name,dong_10_code,dong_10_distance
100,11010530,사직동,교남동,11010580,635.233236,무악동,11010570,1100.795535,청운효자동,11010720,...,2629.908256,혜화동,11010730,2824.294295,이화동,11010640,2954.444825,종로56가동,11010630,3006.574146
201,11010540,삼청동,가회동,11010600,803.025763,청운효자동,11010720,1089.234297,혜화동,11010730,...,2152.179477,무악동,11010570,2355.899387,교남동,11010580,2442.123675,종로56가동,11010630,2675.205807
302,11010550,부암동,청운효자동,11010720,1501.35695,삼청동,11010540,1827.288226,평창동,11010560,...,2873.494251,혜화동,11010730,3171.345321,종로1234가동,11010610,3351.85061,이화동,11010640,3975.779966
403,11010560,평창동,부암동,11010550,2116.518779,삼청동,11010540,3242.858983,청운효자동,11010720,...,4641.594097,종로1234가동,11010610,4885.944016,교남동,11010580,4963.20854,이화동,11010640,5041.078795
504,11010570,무악동,교남동,11010580,829.797726,사직동,11010530,1100.795535,청운효자동,11010720,...,2769.861402,혜화동,11010730,3617.155481,이화동,11010640,3934.409421,종로56가동,11010630,4072.88976
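
The nearest-dong search can also be written as a single pairwise distance matrix instead of nested loops. The sketch below is an alternative formulation, not the notebook's own method. It uses only the four centroids printed in the centroid output earlier (사직동, 삼청동, 부암동, 평창동), so the rankings cover just those four dongs. The helper name `k_nearest` is an assumption.

```python
import numpy as np

names = np.array(["사직동", "삼청동", "부암동", "평창동"])
# Centroid coordinates copied from the centroid column shown earlier.
xy = np.array([
    [953232.514, 1952855.551],
    [954197.713, 1954457.198],
    [952643.871, 1955418.736],
    [953060.025, 1957493.939],
])


def k_nearest(xy, k):
    """Return (indices, distances) of the k nearest other points for each point."""
    diff = xy[:, None, :] - xy[None, :, :]       # shape (n, n, 2)
    dist = np.sqrt((diff ** 2).sum(axis=-1))     # Euclidean distance matrix
    np.fill_diagonal(dist, np.inf)               # exclude each point itself explicitly
    order = np.argsort(dist, axis=1)[:, :k]      # k smallest distances per row
    return order, np.take_along_axis(dist, order, axis=1)


order, nearest_dist = k_nearest(xy, k=2)
for name, idx, d in zip(names, order, nearest_dist):
    print(name, names[idx].tolist(), d.round(2).tolist())
```

The 부암동 to 삼청동 distance computed this way comes out at about 1827.29, in line with the `dong_2_distance` value in the 부암동 row of the table above. Setting the diagonal to infinity avoids relying on the dong itself always sorting into first place.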


In [12]:
dong_distance.to_csv(f'{drive_path}final/data/dong_distance.csv', index = False)
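
As a rough sketch of how the saved table might serve the original goal (features such as universities that matter across several dongs), the code below sums a per-dong feature over each dong's nearby neighbors. The two rows of `wide` are cut down from the `dong_distance` output above. The feature table, its `univ_cnt` values, the helper `neighbor_sum`, and the distance cutoff are all hypothetical. Distances are in the shapefile's coordinate units, assumed here to be meters.

```python
import pandas as pd

# Two rows from dong_distance, keeping only the two nearest neighbors.
wide = pd.DataFrame({
    "ADM_CD": [11010530, 11010570],             # 사직동, 무악동
    "dong_1_code": [11010580, 11010580],        # 교남동
    "dong_1_distance": [635.233236, 829.797726],
    "dong_2_code": [11010570, 11010530],
    "dong_2_distance": [1100.795535, 1100.795535],
})

# Hypothetical per-dong feature; the counts are made up for illustration.
feature = pd.DataFrame({
    "ADM_CD": [11010530, 11010570, 11010580],
    "univ_cnt": [0, 1, 2],
})


def neighbor_sum(wide, feature, value_col, k, max_dist):
    """Sum value_col over each dong's k nearest neighbors lying within max_dist."""
    parts = []
    for i in range(1, k + 1):
        part = wide[["ADM_CD", f"dong_{i}_code", f"dong_{i}_distance"]].copy()
        part.columns = ["ADM_CD", "neighbor_cd", "distance"]
        parts.append(part)
    # Long format: one row per (dong, neighbor) pair; drop unfilled neighbor slots.
    pairs = pd.concat(parts, ignore_index=True).dropna(subset=["neighbor_cd"])
    pairs = pairs[pairs["distance"] <= max_dist]
    values = feature.rename(columns={"ADM_CD": "neighbor_cd"})
    pairs = pairs.merge(values, on="neighbor_cd", how="left")
    return pairs.groupby("ADM_CD")[value_col].sum()


nearby = neighbor_sum(wide, feature, "univ_cnt", k=2, max_dist=1000)
print(nearby)
```

Dongs with no neighbor inside the cutoff do not appear in the result, so a left join back onto the full dong list followed by filling missing values with 0 would be needed before using it as a model feature.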