## From The standard Node-Link data to Network
Data Source : http://nodelink.its.go.kr/data/data01.aspx

The standard node-link files consist of two kinds of shape files(node, link).

The _shapefile_ format is a popular geospatial vector data format for geographic information system (GIS) software. It is developed and regulated by Esri as a (mostly) open specification for data interoperability among Esri and other GIS software products. [wiki](https://en.wikipedia.org/wiki/Shapefile)

To construct network, we should modifiy shpfiles.

In [52]:
import shapefile
import pandas as pd
from pyproj import Proj, transform
import networkx as nx

If you see an error message that `No module named modulename`, you should install module. To resolve this problem, we need to istall 'pyshp' and 'pyproj'. To do this, we use `pip`. We can install module easily by typing `pip install modulename`. To install `pyshp` and `pyproj`, we type the below codes in `terminal` or `anaconda prompt`.

```
pip install pyshp
pip install pyproj
```
If there is a problem to install pyproj, then do the following; conda install -c anaconda pyproj 

## Construct dataframe by using shp file

In [53]:
shp_path_node = './[2023-07-17]NODELINKDATA/MOCT_LINK.shp'
sf_node = shapefile.Reader(shp_path_node, encoding='cp949')
shp_path_link = './[2023-07-17]NODELINKDATA/MOCT_NODE.shp'
sf_link = shapefile.Reader(shp_path_link, encoding='cp949')

In [54]:
# construct pandas dataframe

#grab the shapefile's field names
# node
fields_node = [x[0] for x in sf_node.fields][1:]
records_node = sf_node.records()
shps = [s.points for s in sf_node.shapes()] # node has coordinate data.
# link
fields_link = [x[0] for x in sf_link.fields][1:]
records_link = sf_link.records()


#write the records into a dataframe
# node
df_node = pd.DataFrame(columns=fields_node, data=records_node)
#add the coordinate data to a column called "coords"
df_node = node_dataframe.assign(coords=shps)
# link
df_link = pd.DataFrame(columns=fields_link, data=records_link)

In [55]:
df_node.head()

Unnamed: 0,LINK_ID,F_NODE,T_NODE,LANES,ROAD_RANK,ROAD_TYPE,ROAD_NO,ROAD_NAME,ROAD_USE,MULTI_LINK,...,REST_VEH,REST_W,REST_H,C-ITS,LENGTH,UPDATEDATE,REMARK,HIST_TYPE,HISTREMARK,coords
0,3620033001,3620016502,3620012200,1,106,0,927,봉호로,0,0,...,0,0,0,0,10388.784111,20230519,,,,"[(335314.2438000003, 429868.4894999992), (3353..."
1,3620038802,3620008701,3620019000,2,107,0,-,홍술로,0,0,...,0,0,0,0,189.852452,20230519,,,,"[(351601.4249, 418763.5850000009), (351753.241..."
2,3620149100,3620063200,3620067900,2,107,0,-,삼분2길,0,0,...,0,0,0,0,156.785095,20230519,,,,"[(327014.95380000025, 423936.93579999916), (32..."
3,3670134000,3670061100,3670061200,2,103,3,26,동고령로,0,0,...,0,0,0,0,293.967615,20230519,,,,"[(317691.6010999996, 349552.37069999985), (317..."
4,3670016500,3670007400,3670007800,1,106,0,905,성암로,0,0,...,0,0,0,0,1939.169843,20230519,,,,"[(326245.5685999999, 355726.0282000005), (3262..."


In nodelink data, all positions in nodes are assigned based on **korea 2000 좌표계**. Their positions are changed based on 
**wgs84 (위도/경도)** by using Proj package.

In [56]:
# Change coordinate system
# korea 2000/central belt 2010 (epsg:5186) to wgs84(epsg:4326)
inProj = Proj(init = 'epsg:5186')
outProj= Proj(init = 'epsg:4326')
latitude = []
longitude= []
for idx,row in df_node.iterrows():
    x,y  = row.coords[0][0],row.coords[0][1]  # korea 2000 좌표계
    nx,ny = transform(inProj,outProj,x,y)     # 새로운 좌표계
    latitude.append(ny)
    longitude.append(nx)
df_node['latitude'] = latitude
df_node['longitude']= longitude
del df_node['coords'] # delete coords
print(df_node)

  in_crs_string = _prepare_from_proj_string(in_crs_string)
  in_crs_string = _prepare_from_proj_string(in_crs_string)
  nx,ny = transform(inProj,outProj,x,y)     # 새로운 좌표계


KeyboardInterrupt: 

In order to use Gephi, it is essential to have two files node and line file. Also each one has a special properties. That is, node name has to be indexed as an **ID** and link has two names **Source** and **Target**. File in below show that how to change the given form to gephi-type file form.

In [None]:
# Change column name to draw network in Gephi
df_node.rename(columns={'NODE_ID':'Id'},inplace = True)
df_link.rename(columns={'F_NODE':'Source','T_NODE':'Target'},inplace = True)

In [None]:
df_node.head()

Unnamed: 0,LINK_ID,F_NODE,T_NODE,LANES,ROAD_RANK,ROAD_TYPE,ROAD_NO,ROAD_NAME,ROAD_USE,MULTI_LINK,...,REST_VEH,REST_W,REST_H,C-ITS,LENGTH,UPDATEDATE,REMARK,HIST_TYPE,HISTREMARK,coords
0,3620033001,3620016502,3620012200,1,106,0,927,봉호로,0,0,...,0,0,0,0,10388.784111,20230519,,,,"[(335314.2438000003, 429868.4894999992), (3353..."
1,3620038802,3620008701,3620019000,2,107,0,-,홍술로,0,0,...,0,0,0,0,189.852452,20230519,,,,"[(351601.4249, 418763.5850000009), (351753.241..."
2,3620149100,3620063200,3620067900,2,107,0,-,삼분2길,0,0,...,0,0,0,0,156.785095,20230519,,,,"[(327014.95380000025, 423936.93579999916), (32..."
3,3670134000,3670061100,3670061200,2,103,3,26,동고령로,0,0,...,0,0,0,0,293.967615,20230519,,,,"[(317691.6010999996, 349552.37069999985), (317..."
4,3670016500,3670007400,3670007800,1,106,0,905,성암로,0,0,...,0,0,0,0,1939.169843,20230519,,,,"[(326245.5685999999, 355726.0282000005), (3262..."


In [None]:
df_link[1:5]

Unnamed: 0,NODE_ID,NODE_TYPE,NODE_NAME,TURN_P,UPDATEDATE,REMARK,HIST_TYPE,HISTREMARK
1,1080010400,101,쌈지마당길-인수봉길진,0,20230519,,,
2,1080010500,101,신망애피아노,0,20230519,,,
3,1080010600,101,네네치킨,1,20230519,,,
4,1080010700,101,쌈지마당길-인수봉길진,0,20230519,,,


In [None]:
df_node.to_csv('node_data.csv')
df_link.to_csv('link_data.csv')