# Lab 5: API

In this lab, you will retrieve tertiary general hospital (상급종합병원) data and administrative district (시도) geometry from the VWorld API. You will then count the hospitals in each administrative district (시도) and create a choropleth map to examine the distribution of hospitals across South Korea.


The data for this lab comes from the following resources.
* Hospital data (상급종합병원): VWorld Search API 2.0 Reference (검색 API 2.0 레퍼런스) (https://www.vworld.kr/dev/v4dv_search2_s001.do)
* 시도 geometry: VWorld WMS/WFS API 2.0 Reference (WMS/WFS API 2.0 레퍼런스) (https://www.vworld.kr/dev/v4dv_wmsguide2_s001.do)

## Notes:
**Before you submit your lab, make sure everything runs as expected WITHOUT ANY ERROR.** <br>
**Make sure you fill in any place that says `YOUR CODE HERE` or `YOUR ANSWER HERE`:**

In [None]:
FULL_NAME = ""

In [None]:
# Import necessary packages
import requests
import geopandas as gpd
import pandas as pd
from urllib.parse import urlencode

## 1. Hospital Data Collection

**1.1.** Visit the API documentation page (https://www.vworld.kr/dev/v4dv_search2_s001.do) and review the information on how to retrieve hospital data for tertiary general hospitals (상급종합병원). You will need to use the `TYPE` and `CATEGORY` parameters.

**1.2.** (2 points) From the file linked below, locate the code for tertiary general hospitals (상급종합병원) and store it in the variable `hospital_code`.
Save it as a string, because the code begins with '0' and the leading zero would be lost in a numeric type. <br>
Place classification codes (장소분류코드): https://www.vworld.kr/contents/%EB%B8%8C%EC%9D%B4%EC%9B%94%EB%93%9C_%EC%9E%A5%EC%86%8C%EB%B6%84%EB%A5%98%EC%BD%94%EB%93%9C_20240712.xlsx

**1.3.** (1 point) Use the `requests` library to call the API and store the response in the variable `response_poi`.
Note that there are 45 such hospitals in South Korea, so set the optional `size` parameter to a value of at least 45.
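
As a rough illustration of how a parameter dictionary ends up in a request URL, the sketch below builds a query string with the standard library and reads it back. The endpoint, the key, and the category value `0123` are placeholders, not real VWorld values. `requests.get(url, params=...)` performs a comparable encoding step when it builds `response.url`.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Placeholder endpoint and values, for illustration only.
demo_url = "https://example.com/req/search"
demo_params = {
    "query": "hospital",
    "category": "0123",  # kept as a string so the leading zero survives
    "size": 50,          # any value of at least 45 covers all 45 hospitals
    "key": "DEMO-KEY",
}

full_url = f"{demo_url}?{urlencode(demo_params)}"
print(full_url)

# Reading the query string back returns every value as a list of strings.
query = parse_qs(urlsplit(full_url).query)
print(query["category"], query["size"])

# Converting the code to an integer would silently drop the leading zero.
print(int("0123"))
```

The same kind of `name=value` pairs are what the test cell looks for in `response_poi.url`.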

In [None]:
# Your code here
YOUR_API_KEY = ''


response_poi = requests.get()


In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
import re

assert isinstance(hospital_code, str)
assert hospital_code == '020315601010'
_size = re.search(r'size=(\d+)', response_poi.url)
assert int(_size.group(1)) >= 45
_category = re.search(r'category=(\d+)', response_poi.url)
assert _category.group(1) == hospital_code
assert response_poi.status_code == 200

print('Success!')

**1.4.** (4 points) Convert the response to a DataFrame and store it in the variable `hospital_df`. Note that you need to examine the structure of the response to extract the relevant data. 
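
One way to examine a nested JSON structure is to print an outline of its keys before deciding which level holds the records. The sketch below does this on a made-up payload. The key names (`payload`, `data`, `records`, `pos`) and the `outline` helper are hypothetical, so check the real response with `response_poi.json()`.

```python
import json

# Made-up payload; the real API response uses different key names.
raw = """
{"payload": {"status": "OK",
             "data": {"records": [
                 {"id": "a1", "name": "Alpha", "pos": {"x": "127.0", "y": "37.5"}},
                 {"id": "b2", "name": "Beta", "pos": {"x": "129.1", "y": "35.2"}}
             ]}}}
"""
parsed = json.loads(raw)


def outline(node, indent=0):
    """Return indented lines describing the keys and value types of a JSON tree."""
    lines = []
    if isinstance(node, dict):
        for key, value in node.items():
            lines.append(" " * indent + f"{key}: {type(value).__name__}")
            lines.extend(outline(value, indent + 2))
    elif isinstance(node, list) and node:
        # Only the first element is described, assuming the items share a shape.
        lines.append(" " * indent + f"[{len(node)} items, first shown]")
        lines.extend(outline(node[0], indent + 2))
    return lines


print("\n".join(outline(parsed)))

# Once the right level is known, index down to the list of records.
records = parsed["payload"]["data"]["records"]
print(len(records), records[0]["name"])
```

A list of dictionaries like `records` can be passed directly to `pd.DataFrame(...)`.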

In [None]:
# Your code here

hospital_df = 



In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""

assert hospital_df.shape[0] == 45
assert '강릉아산병원' in hospital_df['title'].values

print('Success!')

You might have noticed that the `point` column stores both the x and y coordinates together as a dictionary.
<br>

**1.5.** (4 points) Split the `point` column into two separate columns, `long` and `lat`. The `long` column should store the value of `x`, and the `lat` column should store the value of `y`.
<br>

**1.6.** (1 point) Change the data type of the `long` and `lat` columns to float. This is important for later calculations.

**1.7.** (1 point) Drop all columns except `id`, `title`, `long`, and `lat` from the DataFrame.
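
The three steps above (split, cast, drop) can be pictured on plain Python dictionaries before doing the same with pandas. This is only a sketch on made-up records, assuming the coordinates arrive as strings. In pandas, comparable operations are available through `Series.apply`, `Series.astype(float)`, and column selection.

```python
# Made-up records shaped like rows with a nested coordinate dictionary.
rows = [
    {"id": "a1", "title": "Alpha", "category": "demo", "point": {"x": "127.10", "y": "37.50"}},
    {"id": "b2", "title": "Beta", "category": "demo", "point": {"x": "129.05", "y": "35.20"}},
]

flat_rows = [
    {
        "id": row["id"],
        "title": row["title"],
        # Split the nested dictionary and cast the strings to floats.
        "long": float(row["point"]["x"]),
        "lat": float(row["point"]["y"]),
    }
    for row in rows  # keys not listed above (category, point) are dropped
]

for row in flat_rows:
    print(row)
```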


In [None]:
# Your code here

hospital_df.head(3)

In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""

assert hospital_df.shape == (45, 4)
assert isinstance(hospital_df['long'][0], float)
assert isinstance(hospital_df['lat'][0], float)
assert 'address' not in hospital_df.columns and 'point' not in hospital_df.columns

print('Success!')

**1.8.** (2 points) Convert the DataFrame to a GeoDataFrame, using the `long` and `lat` columns to build the point geometry and setting the CRS to `EPSG:4326`. Save the GeoDataFrame as `hospital_gdf`.

In [None]:
hospital_gdf = 
hospital_gdf

In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""

assert hospital_gdf.shape == (45, 5)
assert hospital_gdf.crs == 'EPSG:4326'
assert 'geometry' in hospital_gdf.columns
assert hospital_gdf['geometry'][0].geom_type == 'Point'

print('Success!')

## 2. Administrative District (시도) Geometry Collection

**2.1.** Visit the API documentation page (https://www.vworld.kr/dev/v4dv_wmsguide2_s001.do) and review the information on how to retrieve administrative district (시도) geometry data. In this case, you will need to use the `typename` parameter.

**2.2.** (2 points) From the file linked below, locate the code corresponding to *시도* and store it in the variable `layer`. Additionally, it is recommended to set the `output` parameter to `json` for faster data retrieval.  
WFS column information file (WFS 칼럼정보 파일):
https://www.vworld.kr/contents/%EB%B8%8C%EC%9D%B4%EC%9B%94%EB%93%9C_WFS_%EC%BB%AC%EB%9F%BC%EC%A0%95%EB%B3%B4.xlsx

**2.3.** (3 points) Use the `urlencode` function to encode the parameters for the API request, and save the resulting URL in the variable `url_sido`.
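
For reference, `urlencode` turns a dictionary into a percent-encoded query string that can be appended to a base URL after a `?`. The sketch below uses a placeholder endpoint, layer name, and key. The `srsname` entry is included only to show how a reserved character such as `:` is encoded and is an assumption, not a required parameter.

```python
from urllib.parse import urlencode

# Placeholder values for illustration; substitute the real endpoint, layer, and key.
demo_base = "https://example.com/req/wfs"
demo_wfs_params = {
    "REQUEST": "GetFeature",
    "typename": "demo_layer",
    "output": "json",
    "srsname": "EPSG:4326",  # the colon becomes %3A in the encoded string
    "key": "DEMO-KEY",
}

# Dictionary order is preserved, and pairs are joined with '&'.
demo_wfs_url = f"{demo_base}?{urlencode(demo_wfs_params)}"
print(demo_wfs_url)
```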


In [None]:
# Your code here

url_sido = 

print(url_sido)

In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""
import re

assert url_sido.split('?')[0] == 'https://api.vworld.kr/req/wfs'
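assert 'request=' in url_sido.lower()
# Match parameter names case-insensitively so that both `REQUEST` and `request` are accepted
_request = re.search(r'REQUEST=([^&]+)', url_sido, flags=re.IGNORECASE)
assert _request.group(1) == 'GetFeature'
_layer = re.search(r'typename=([^&]+)', url_sido, flags=re.IGNORECASE)
assert _layer.group(1) == layer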

print('Success!')

**2.4.** (2 points) Load the geometry data by passing `url_sido` to `gpd.read_file()`. Store the result in the variable `sido_gdf`.

**2.5.** (1 point) If the CRS of `sido_gdf` is not `EPSG:4326`, convert it using the `to_crs()` method.

**2.6.** (2 points) Calculate the number of hospitals in each administrative district (*시도*), and store the result in the `hospital` column of the `sido_gdf` GeoDataFrame.
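
Counting points per polygon is commonly done with a spatial join (for example `geopandas.sjoin`) followed by a group count. The sketch below is not that implementation. It swaps in a plain-Python ray-casting point-in-polygon test on made-up square districts to show the underlying idea. All names and coordinates are hypothetical.

```python
def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            # Toggle when the edge crosses the ray running to the right of the point.
            if x < x_cross:
                inside = not inside
    return inside


# Two made-up square districts and four made-up points.
districts = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
points = [(1.0, 1.0), (0.5, 1.5), (3.0, 1.0), (5.0, 5.0)]

# Start every district at zero so districts without points still appear.
counts = {name: 0 for name in districts}
for px, py in points:
    for name, ring in districts.items():
        if point_in_polygon(px, py, ring):
            counts[name] += 1
            break  # each point is assigned to at most one district

print(counts)
```

Points that fall outside every polygon, such as `(5.0, 5.0)` here, are not counted, so it is worth checking that the counts add up to the expected total.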


In [None]:
# Your code here


sido_gdf
    

In [None]:
""" Test code for the previous code. This cell should NOT give any errors when it is run."""

assert sido_gdf.shape == (17, 8)
assert sido_gdf.crs == 'EPSG:4326'
assert sido_gdf['hospital'].sum() == 45

print('Success!')

# Done