## Set up area of interest

The first step is to show the area of interest, which here is Bali Province, Indonesia. We use the `folium` library to draw the map.

In [1]:
# import library
import folium

# define coordinates
main_coords = [-8.3304977, 115.0906401]  # Bali coordinates
radius = 80000  # radius in meter

main_map = folium.Map(main_coords, zoom_start=9)

folium.Circle(main_coords, radius=radius, color='#ff0000', fill=True, fill_color='#ffff00', fill_opacity=0.2
            ).add_to(main_map)

main_map
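
The circle above is only drawn on the map. To check numerically whether a point falls inside the same radius, a great-circle distance can be used. The following is a minimal sketch, assuming a spherical Earth with a mean radius of 6,371 km; `haversine_m` is a hypothetical helper written for this illustration, and the test point is the first hospital node that appears in the scraped data later in this notebook.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, earth_radius_m=6371000.0):
    """Great-circle distance in meters between two (lat, lon) points in degrees.

    Hypothetical helper; assumes a spherical Earth.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


main_coords = [-8.3304977, 115.0906401]  # Bali coordinates, as above
radius = 80000                           # radius in meters, as above

# coordinates of one hospital node taken from the scraped data
point = (-8.354431, 114.621454)

distance = haversine_m(main_coords[0], main_coords[1], point[0], point[1])
inside = distance <= radius
print(round(distance), inside)
```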

## Scraping the data

Next, we scrape data for the area from OpenStreetMap through the Overpass API. Here we use the libraries `pandas`, `requests`, and `json`.

In [6]:
# libraries
import pandas as pd
import requests
import json

# API url
overpass_api = "http://overpass-api.de/api/interpreter"

# query
query_medical = """
[out:json];
(area['name:id'='Provinsi Bali'];
node[amenity~'(clinic|dentist|doctors|hospital|pharmacy)'](area);
);
out;
"""

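# send the query; give up after 3 minutes and fail early on an HTTP error
response_medical = requests.get(overpass_api, params={'data': query_medical}, timeout=180)
response_medical.raise_for_status()
data_medical = response_medical.json()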
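
The query string above hard-codes the area name and the amenity list. If the same request is needed for another area or another set of amenities, the string can be assembled from parameters. This is a minimal sketch, not part of the original notebook: `build_overpass_query` and its parameters are hypothetical names, and it simply reproduces the query layout used above.

```python
def build_overpass_query(area_name, amenities, name_key="name:id"):
    """Build an Overpass QL query for nodes with the given amenity values
    inside the named area.

    Hypothetical helper, not part of any library. It mirrors the query
    layout used in this notebook.
    """
    amenity_regex = "|".join(amenities)
    return f"""
[out:json];
(area['{name_key}'='{area_name}'];
node[amenity~'({amenity_regex})'](area);
);
out;
"""


query = build_overpass_query(
    "Provinsi Bali",
    ["clinic", "dentist", "doctors", "hospital", "pharmacy"],
)
print(query)
```

The resulting string can be passed to `requests.get` in place of `query_medical`.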

Then we show the output of the scraping process. It is in **JSON** format.

In [None]:
data_medical

We only need the data under the key `elements`, so we drop the header fields from the JSON output.

In [8]:
list_medical = data_medical['elements']

We can show the data again. Now we have a list containing the data of each medical facility node.

In [None]:
list_medical

Convert the list into a dataframe.

In [10]:
# json_normalize already returns a DataFrame and flattens nested tags into 'tags.*' columns
df_medical_raw = pd.json_normalize(list_medical)
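
To see what the flattening does, it helps to run `pd.json_normalize` on a very small input. The sketch below uses two made-up elements shaped like the Overpass output (the values are illustrative, not real data). Nested keys under `tags` become columns named `tags.<key>`, and a key missing from one element becomes `NaN` in that row.

```python
import pandas as pd

# two made-up elements shaped like Overpass output (illustrative values only)
sample_elements = [
    {"type": "node", "id": 1, "lat": -8.65, "lon": 115.21,
     "tags": {"amenity": "pharmacy", "name": "Apotek Contoh"}},
    {"type": "node", "id": 2, "lat": -8.70, "lon": 115.17,
     "tags": {"amenity": "hospital"}},  # no 'name' tag
]

df_sample = pd.json_normalize(sample_elements)

print(df_sample.columns.tolist())
print(df_sample["tags.name"].isna().tolist())
```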

Show the first 5 rows of the dataframe.

In [11]:
df_medical_raw.head()

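| | type | id | lat | lon | tags.addr:city | tags.addr:postcode | tags.addr:street | tags.amenity | tags.healthcare | tags.is_in | ... | tags.name:lt | tags.name:uk | tags.name:vi | tags.name:zh-sg | tags.population | tags.ref | tags.source:population | tags.timezone | tags.type | tags.wikipedia |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | node | 429250405 | -8.354431 | 114.621454 | Jembrana | 82218.0 | Wijaya Kusuma | hospital | hospital | West Bali,Bali,Indonesia | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | node | 600488283 | -8.76022 | 115.175837 | Badung, Bali | 80361.0 | Jalan Uluwatu | hospital | NaN | NaN | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | node | 964939213 | -8.712574 | 115.173211 | NaN | NaN | NaN | pharmacy | pharmacy | Legian,South Bali,Bali,Indonesia | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | node | 965148761 | -8.717328 | 115.174146 | NaN | NaN | NaN | pharmacy | NaN | Kuta,South Bali,Bali,Indonesia | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | node | 1067112056 | -8.736851 | 115.167952 | NaN | NaN | NaN | hospital | NaN | South Bali,Bali,Indonesia | ... | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |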


As shown above, there are 62 columns, but we only need 4 of them: `lat` (latitude), `lon` (longitude), `tags.name` (name), and `tags.amenity` (facility type). We keep only these 4 columns and drop the others, then rename the columns, sort the data (here by name), and reset the index.

In [12]:
# take only columns needed, others are dropped. rename columns and sort.

df_medical = df_medical_raw[['lat', 'lon', 'tags.name', 'tags.amenity']]\
                .rename(columns={'tags.name':'name', 'tags.amenity':'type'})\
                .sort_values(by=['name']).reset_index()

Here is our dataframe now.

In [13]:
df_medical

| | index | lat | lon | name | type |
|---|---|---|---|---|---|
| 0 | 155 | -8.657657 | 115.147088 | Apotek | pharmacy |
| 1 | 210 | -8.683945 | 115.157051 | Apotek Guardian Seminyak Square | pharmacy |
| 2 | 13 | -8.540324 | 115.126398 | Apotek Karunia | pharmacy |
| 3 | 211 | -8.690060 | 115.172972 | Apotek Kimia Farma | pharmacy |
| 4 | 82 | -8.766855 | 115.178133 | Apotek Kimia Farma By Pass Ngurah Rai | pharmacy |
| ... | ... | ... | ... | ... | ... |
| 223 | 198 | -8.122974 | 115.071388 | NaN | hospital |
| 224 | 199 | -8.111922 | 115.089568 | NaN | hospital |
| 225 | 200 | -8.508834 | 115.267787 | NaN | doctors |
| 226 | 201 | -8.508842 | 115.268091 | NaN | pharmacy |


There are many `NaN` values in the column `name`, so we fill them with the value from the column `type`.

In [14]:
# fill NaN value in column 'name' with value from column 'type'

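# assign the result back; an in-place fillna on a selected column may not update the dataframe in recent pandas versions
df_medical['name'] = df_medical['name'].fillna(df_medical['type'])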
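
The fill step can be tried on a tiny frame first. This is a minimal sketch on made-up rows: passing a Series to `fillna` fills each missing value with the value at the same index label in that Series, so a facility without a name takes its own type as its name.

```python
import pandas as pd

# made-up rows; the second one has no name
df_demo = pd.DataFrame({
    "name": ["Apotek Contoh", None],
    "type": ["pharmacy", "hospital"],
})

# fill missing names row by row with the value from 'type'
df_demo["name"] = df_demo["name"].fillna(df_demo["type"])

print(df_demo["name"].tolist())
```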

Then remove the column `index`, which holds the old row positions; after sorting and resetting, the dataframe has a new index.

In [15]:
# remove column 'index'

df_medical = df_medical.drop(columns=['index'])
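
The `index` column exists because `reset_index()` keeps the old index as a regular column by default. As an alternative sketch on made-up data, `reset_index(drop=True)` discards the old index in the same call, which would make the separate drop step unnecessary.

```python
import pandas as pd

# made-up rows, deliberately out of alphabetical order
df_demo = pd.DataFrame({"name": ["b", "a"], "type": ["hospital", "pharmacy"]})

# default: the old index is kept as a column named 'index'
kept = df_demo.sort_values(by=["name"]).reset_index()

# drop=True: the old index is discarded
dropped = df_demo.sort_values(by=["name"]).reset_index(drop=True)

print(kept.columns.tolist())
print(dropped.columns.tolist())
```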

This is our final dataframe.

In [16]:
df_medical

| | lat | lon | name | type |
|---|---|---|---|---|
| 0 | -8.657657 | 115.147088 | Apotek | pharmacy |
| 1 | -8.683945 | 115.157051 | Apotek Guardian Seminyak Square | pharmacy |
| 2 | -8.540324 | 115.126398 | Apotek Karunia | pharmacy |
| 3 | -8.690060 | 115.172972 | Apotek Kimia Farma | pharmacy |
| 4 | -8.766855 | 115.178133 | Apotek Kimia Farma By Pass Ngurah Rai | pharmacy |
| ... | ... | ... | ... | ... |
| 223 | -8.122974 | 115.071388 | hospital | hospital |
| 224 | -8.111922 | 115.089568 | hospital | hospital |
| 225 | -8.508834 | 115.267787 | doctors | doctors |
| 226 | -8.508842 | 115.268091 | pharmacy | pharmacy |


Save the dataframe to a CSV file.

In [20]:
df_medical.to_csv('list_medical.csv', sep=',')
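
By default `to_csv` also writes the row index as an unnamed first column, which shows up as `Unnamed: 0` when the file is read back. The sketch below illustrates this with one row from the dataframe and an in-memory buffer instead of a file; `index=False` is an optional alternative, not what the cell above does.

```python
import io

import pandas as pd

df_demo = pd.DataFrame({
    "lat": [-8.657657], "lon": [115.147088],
    "name": ["Apotek"], "type": ["pharmacy"],
})

# default: the index is written as an unnamed first column
buffer = io.StringIO()
df_demo.to_csv(buffer)
buffer.seek(0)
with_index = pd.read_csv(buffer)

# index=False: only the data columns are written
buffer = io.StringIO()
df_demo.to_csv(buffer, index=False)
buffer.seek(0)
without_index = pd.read_csv(buffer)

print(with_index.columns.tolist())
print(without_index.columns.tolist())
```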

## Visualize the data 

Next, we visualize the data using `folium`. We will map each entity based on its geolocation. First we set up an empty map.

In [17]:
island_coords = [-8.455, 115.09]     # coordinates of the island area

island_map = folium.Map(island_coords, zoom_start=10)

island_map

After that, we group the entities by type: `clinic`, `dentist`, `doctors`, `hospital`, and `pharmacy`. Then we iterate over the rows of each group and add a marker for each facility.

In [18]:
# marker color for each facility type
type_colors = {
    'clinic': 'purple',
    'dentist': 'green',
    'doctors': 'darkblue',
    'hospital': 'red',
    'pharmacy': 'orange',
}

for group_name, group in df_medical.groupby('type'):
    feature_group = folium.FeatureGroup(group_name)
    # any other type falls back to the pharmacy color
    color = type_colors.get(group_name, 'orange')

    for row in group.itertuples():
        folium.Marker(location=[row.lat, row.lon], tooltip=row.name,
                      icon=folium.Icon(color=color, icon_color='white', prefix='fa', icon='medkit')
                     ).add_to(feature_group)
    feature_group.add_to(island_map)
    
folium.LayerControl().add_to(island_map)

island_map

As the last step, save the map to an HTML file. We now have a map containing our data, grouped by facility type.

In [19]:
island_map.save(outfile='Bali_medical_facilities.html')