# Worldwide Nuclear Power Plants (NPPs) Analysis
## 1. Worldwide NPPs Dataset Acquisition
The worldwide NPP datasets used in this analysis were acquired from the following sources:

1. Nuclear Power Reactors in the World, 2020 edition (40th edition of Reference Data Series No. 2). Downloaded from: http://www-pub.iaea.org/MTCD/Publications/PDF/RDS-2-40_web.pdf. The tables in the PDF were extracted with Tabula, an open-source tool for extracting tables from PDF files, and saved as CSV files. Tabula can be downloaded from: https://tabula.technology.
2. Power Reactor Information System (PRIS), International Atomic Energy Agency. Accessed from: https://pris.iaea.org/PRIS/ (last update: September 2020).

### 1.1. Creating a pandas DataFrame from the Operational NPPs Dataset

In [1]:
import pandas as pd

OP_df = pd.read_csv("file:///D:/Projects/Worldwide_NPP_Analysis/tabula-NPP_operational_detail.csv")
print(OP_df.shape)
OP_df.tail()

(442, 17)


Unnamed: 0,CC,COUNTRY,REACTOR_CODE,REACTOR_NAME,TYPE,MODEL,MWT,MWE_G,MWE_N,OPERATOR,NSSS_SUPPLIER,CONST_START,GRID_CONN,OP_START,EAF_PCT,UCF_PCT,NON_ELEC
437,us,USA,US-425,VOGTLE-2,PWR,WH 4LP (DRYAMB,3626,1229,1152,SOUTHERN,WH,08-1976,04-1989,05-1989,94.5,94.5,-
438,us,USA,US-382,WATERFORD-3,PWR,CE 2LP (DRYAMB,3716,1250,1168,ENTERGY,CE,11-1974,03-1985,09-1985,89.1,89.3,-
439,us,USA,US-390,WATTS BAR-1,PWR,WH 4LP (ICECND,3459,1210,1157,TVA,WH,07-1973,02-1996,05-1996,90.2,90.2,-
440,us,USA,US-391,WATTS BAR-2,PWR,WH 4LP (ICECND,3411,1218,1164,TVA,WH,09-1973,06-2020,10-2020,79.8,79.8,-
441,us,USA,US-482,WOLF CREEK,PWR,WH 4LP (DRYAMB,3565,1285,1200,WCNOC,WH,05-1977,06-1985,09-1985,83.3,83.3,-


### 1.2. Creating a pandas DataFrame from the Permanently Shut Down NPPs Dataset

In [2]:
SD_df = pd.read_csv("file:///D:/Projects/Worldwide_NPP_Analysis/tabula-NPP_permanent_shutdown.csv")
print(SD_df.shape)
SD_df.head()

(189, 14)


Unnamed: 0,CC,COUNTRY,REACTOR_CODE,REACTOR_NAME,TYPE,MWT,MWE_G,MWE_N,OPERATOR,NSSS_SUPPLIER,CONST_START,GRID_CONN,OP_START,SHUTDOWN
0,am,ARMENIA,AM-18,ARMENIAN-1,PWR,1375,408,376,ANPPCJSC,FAEA,07-1969,12-1976,10-1977,02-1989
1,be,BELGIUM,BE-1,BR-3,PWR,41,12,10,CEN/SCK,WH,11-1957,10-1962,10-1962,06-1987
2,bg,BULGARIA,BG-1,KOZLODUY-1,PWR,1375,440,408,KOZNPP,AEE,04-1970,07-1974,10-1974,12-2020
3,bg,BULGARIA,BG-2,KOZLODUY-2,PWR,1375,440,408,KOZNPP,AEE,04-1970,08-1975,11-1975,12-2020
4,bg,BULGARIA,BG-3,KOZLODUY-3,PWR,1375,440,408,KOZNPP,AEE,10-1973,12-1980,01-1981,12-2020


### 1.3. Description of the Column Names in the Dataset
#### <center>Description of Column Names</center>
|Column name|Description|
|--- |:-- |
|CC|ISO two-letter country code|
|COUNTRY|Country where the nuclear reactor is located|
|REACTOR_CODE|Reactor code (country code plus a number, e.g. US-425)|
|REACTOR_NAME|Reactor name|
|TYPE|Reactor type|
|MODEL|Reactor model|
|MWT|Maximum thermal power generated by the reactor (MW thermal)|
|MWE_G|Maximum gross electrical power generated by the reactor's turbine (MW electric)|
|MWE_N|Maximum net electrical power supplied to the grid (MW electric)|
|OPERATOR|Nuclear power plant operator|
|NSSS_SUPPLIER|Supplier of the nuclear steam supply system|
|CONST_START|Month and year (MM-YYYY) when NPP construction started|
|GRID_CONN|Month and year (MM-YYYY) when the NPP was first connected to the electrical grid|
|OP_START|Month and year (MM-YYYY) when the NPP started commercial operation|
|EAF_PCT|Energy availability factor of the NPP (%)|
|UCF_PCT|Unit capability factor of the NPP (%)|
|NON_ELEC|NPP applications other than electricity generation|
|SHUTDOWN|Month and year (MM-YYYY) when the NPP was permanently shut down|
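
The date columns (CONST_START, GRID_CONN, OP_START, SHUTDOWN) are stored as month-year strings, so they have to be parsed before any duration can be computed. The sketch below shows one possible way to do this with the standard library. The helper names parse_month_year and months_between are hypothetical and not part of the original notebook.

```python
from datetime import datetime

def parse_month_year(text):
    """Parse an 'MM-YYYY' string such as '08-1976' into a datetime (day set to 1)."""
    return datetime.strptime(text, "%m-%Y")

def months_between(start, end):
    """Whole calendar months from `start` to `end`, both given as 'MM-YYYY' strings."""
    s, e = parse_month_year(start), parse_month_year(end)
    return (e.year - s.year) * 12 + (e.month - s.month)

# VOGTLE-2 from the operational dataset: construction start to grid connection
print(months_between("08-1976", "04-1989"))  # 152
```

The same helpers could be applied to a whole column, for example with OP_df.CONST_START.map(parse_month_year), provided the column contains no placeholder values.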

## 2. Web Scraping NPP Location Coordinates with the Wikipedia API
This step uses the function wiki_get_coordinates from wiki_scrap.py, which was written to retrieve the location coordinates of every NPP in the dataset from its Wikipedia page through the Wikipedia API. The module can be viewed at https://github.com/abedkristanto/Worldwide_NPP_Analysis/blob/master/wiki_scrap.py. Requesting a large amount of data at once risks raising a ConnectionError before the results can be saved with the Jupyter notebook magic command %store. The list of reactor names is therefore split into sublists of 20 names, with one data request per sublist. The resulting coordinate lists are then joined into a single list called coordinate_list.
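
The batching idea described above can be sketched in plain Python as follows. This is a minimal illustration rather than the notebook's actual code: chunked, collect_in_batches and fake_fetch are hypothetical names, and fake_fetch merely stands in for wiki_get_coordinates, which is assumed to return a pair of lists (coordinates and unknown names).

```python
def chunked(items, size):
    """Split `items` into consecutive sublists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def collect_in_batches(names, fetch, size=20):
    """Call `fetch(batch)` once per batch and accumulate the results.

    `fetch` is assumed to return (coordinates, unknown_names). If a
    ConnectionError occurs, the results gathered so far are returned
    together with the index of the batch that failed.
    """
    found, missing = [], []
    for batch_no, batch in enumerate(chunked(names, size)):
        try:
            coord, unknown = fetch(batch)
        except ConnectionError:
            return found, missing, batch_no
        found.extend(coord)
        missing.extend(unknown)
    return found, missing, None

# Stand-in for the real request function, used only to demonstrate the flow
def fake_fetch(batch):
    return [[name, 0.0, 0.0] for name in batch], []

names = [f"REACTOR-{i}" for i in range(442)]
batches = chunked(names, 20)
print(len(batches), len(batches[-1]))  # 23 2
found, missing, failed_at = collect_in_batches(names, fake_fetch)
print(len(found), failed_at)  # 442 None
```

Returning the index of the failed batch lets the run resume from that batch without repeating the requests that already succeeded.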

### 2.1. Operational NPPs Location Coordinates

In [59]:
# Convert the OP_df['REACTOR_NAME'] column to a list, then split it into sublists of 20 names
import numpy as np

reactor_list = OP_df.REACTOR_NAME.to_list()
list_of_list = []
length = len(reactor_list)
num_ingroup = 20
idx_range = list(np.arange(0, length, num_ingroup))
for idx in idx_range:
    if idx < idx_range[-1]:
        temp = reactor_list[idx:idx+num_ingroup]
        list_of_list.append(temp)
    else:
        temp = reactor_list[idx::]
        list_of_list.append(temp)
print(list_of_list[22])

['WATTS BAR-2', 'WOLF CREEK']


In [48]:
coordinate_list = []
unknown_list = []

In [110]:
from wiki_scrap import wiki_get_coordinates

# wiki_get_coordinates returns two lists: the coordinates found and
# the reactor names that have no Wikipedia page.
# Run this cell once for each sublist, from list_of_list[0] to list_of_list[22].
coord, unknown = wiki_get_coordinates(list_of_list[22])

In [111]:
coordinate_list = coordinate_list + coord
unknown_list = unknown_list + unknown
%store coordinate_list
%store unknown_list
print("Number of unknown data:", len(unknown_list))
print("Number of coordinate data found", len(coordinate_list))
print("Unknown data: ", unknown_list)

Stored 'coordinate_list' (list)
Stored 'unknown_list' (list)
Number of unknown data: 10
Number of coordinate data found 442
Unknown data:  ['NARORA-2', 'HIGASHI DORI-1 (TOHOKU)', 'SIZEWELL B', 'KHMELNITSKI-1', 'KHMELNITSKI-2', 'ROVNO-1', 'ROVNO-3', 'ROVNO-4', 'LASALLE-1', 'LASALLE-2']


In [114]:
# convert coordinate list to pandas dataframe
OP_df_coord = pd.DataFrame(coordinate_list, columns = ['REACTOR_NAME', 'LATITUDE', 'LONGITUDE'])
%store OP_df_coord
print('Unknown coordinates:', unknown_list)
OP_df_coord.shape

Stored 'OP_df_coord' (DataFrame)
Unknown coordinates: ['NARORA-2', 'HIGASHI DORI-1 (TOHOKU)', 'SIZEWELL B', 'KHMELNITSKI-1', 'KHMELNITSKI-2', 'ROVNO-1', 'ROVNO-3', 'ROVNO-4', 'LASALLE-1', 'LASALLE-2']


(442, 3)

In [115]:
OP_df_loc = OP_df.copy()
OP_df_loc['LATITUDE'] = OP_df_coord.iloc[:,1]
OP_df_loc['LONGITUDE'] = OP_df_coord.iloc[:,2]
OP_df_loc.head()

Unnamed: 0,CC,COUNTRY,REACTOR_CODE,REACTOR_NAME,TYPE,MODEL,MWT,MWE_G,MWE_N,OPERATOR,NSSS_SUPPLIER,CONST_START,GRID_CONN,OP_START,EAF_PCT,UCF_PCT,NON_ELEC,LATITUDE,LONGITUDE
0,ar,ARGENTINA,AR-1,ATUCHA-1,PHWR,PHWR KWU,1179,362,340,NASA,SIEMENS,06-1968,03-1974,06-1974,81.6,81.8,-,-33.967222,-59.2075
1,ar,ARGENTINA,AR-3,ATUCHA-2,PHWR,PHWR KWU,2160,745,693,NASA,SIEMENS,07-1981,06-2020,05-2020,56.8,56.8,-,-33.967222,-59.2075
2,ar,ARGENTINA,AR-2,EMBALSE,PHWR,CANDU 6,2064,656,608,NASA,AECL,04-1974,04-1983,01-1984,54.8,54.9,-,-32.232,-64.443
3,am,ARMENIA,AM-19,ARMENIAN-2,PWR,VVER V-270,1375,408,375,ANPPCJSC,FAEA,07-1975,01-1980,05-1980,66.9,68.8,-,40.180844,44.148908
4,be,BELGIUM,BE-2,DOEL-1,PWR,WH 2LP,1311,454,445,EBL+EDF,ACECOWEN,07-1969,08-1974,02-1975,80.9,81.3,-,51.324722,4.258611
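
The cell above copies LATITUDE and LONGITUDE into OP_df_loc by row position, which is only correct if coordinate_list is in the same order as OP_df. A quick check along the following lines could confirm that before the assignment. It assumes each entry of coordinate_list is a [name, latitude, longitude] sequence, as the DataFrame construction above suggests, and the function name find_misaligned is hypothetical.

```python
def find_misaligned(reactor_names, coordinate_rows):
    """Return (position, expected_name, found_name) for every row whose
    name differs between the dataset order and the coordinate order."""
    if len(reactor_names) != len(coordinate_rows):
        raise ValueError("the two lists have different lengths")
    return [
        (pos, expected, row[0])
        for pos, (expected, row) in enumerate(zip(reactor_names, coordinate_rows))
        if expected != row[0]
    ]

# Small illustration with values taken from the table above
names = ["ATUCHA-1", "ATUCHA-2", "EMBALSE"]
rows = [
    ["ATUCHA-1", -33.967222, -59.2075],
    ["ATUCHA-2", -33.967222, -59.2075],
    ["EMBALSE", -32.232, -64.443],
]
print(find_misaligned(names, rows))  # []
```

In the notebook this might be called as find_misaligned(OP_df.REACTOR_NAME.to_list(), coordinate_list). An empty result would indicate that the positional assignment is safe, assuming the scraper returns names in the same spelling as the dataset.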


In [116]:
#OP_df_loc.to_csv("NPP_operational_2019_&loc.csv", index=False)

##### Manual search results on wikipedia.org for NPPs with unknown location coordinates:

|Reactor name|LATITUDE|LONGITUDE|
|--- |:-- |:-- |
|NARORA-2|28.15805556|78.40944444|
|HIGASHI DORI-1|41.18805556|141.39027778|
|SIZEWELL B|52.215|1.61972|
|KHMELNITSKI 1-2|50.30138889|26.64972222|
|ROVNO 1-4|51.32777778|25.89166667|
|LASALLE 1-2|41.24555556|-88.66916667|
|DARLINGTON 1-4|43.88944444|-78.71972222|
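
The manually collected coordinates still have to be written back into the dataset. One possible way is sketched below on plain [name, latitude, longitude] rows. The unit names are assumed to follow the SITE-N pattern seen in REACTOR_NAME, the None placeholder for a missing coordinate is an assumption, and the names MANUAL_SITES, expand_units and apply_manual are hypothetical. Only three of the sites from the table are included to keep the example short.

```python
# Site-level coordinates from the table above, with the unit numbers they cover
MANUAL_SITES = {
    "KHMELNITSKI": ((1, 2), (50.30138889, 26.64972222)),
    "ROVNO": ((1, 4), (51.32777778, 25.89166667)),
    "LASALLE": ((1, 2), (41.24555556, -88.66916667)),
}

def expand_units(sites):
    """Turn site-level entries into a {unit_name: (lat, lon)} lookup."""
    lookup = {}
    for site, ((first, last), coords) in sites.items():
        for unit in range(first, last + 1):
            lookup[f"{site}-{unit}"] = coords
    return lookup

def apply_manual(rows, lookup):
    """Return new rows where any name found in `lookup` gets its manual coordinates."""
    return [
        [name, *lookup[name]] if name in lookup else [name, lat, lon]
        for name, lat, lon in rows
    ]

# The None values below are an assumed placeholder for a missing coordinate
rows = [["ROVNO-1", None, None], ["EMBALSE", -32.232, -64.443]]
fixed = apply_manual(rows, expand_units(MANUAL_SITES))
print(fixed)
# [['ROVNO-1', 51.32777778, 25.89166667], ['EMBALSE', -32.232, -64.443]]
```

Because the lookup overrides any existing value, it also covers sites whose scraped coordinates were found but need correcting.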

### 2.2. Permanently Shutdown NPPs Location Coordinates

In [6]:
import numpy as np

reactor_list_SD = SD_df.REACTOR_NAME.to_list()
list_of_list_SD = []
length = len(reactor_list_SD)
num_ingroup = 20
idx_range = list(np.arange(0, length, num_ingroup))
for idx in idx_range:
    if idx < idx_range[-1]:
        temp = reactor_list_SD[idx:idx+num_ingroup]
        list_of_list_SD.append(temp)
    else:
        temp = reactor_list_SD[idx::]
        list_of_list_SD.append(temp)
print(len(list_of_list_SD))
print(list_of_list_SD[9])

10
['SHOREHAM', 'THREE MILE ISLAND-1', 'THREE MILE ISLAND-2', 'TROJAN', 'VERMONT YANKEE', 'YANKEE NPS', 'ZION-1', 'ZION-2', 'INDIAN POINT-2']


In [7]:
coordinate_list_SD = []
unknown_list_SD = []

In [30]:
from wiki_scrap import wiki_get_coordinates

# wiki_get_coordinates returns two lists: the coordinates found and
# the reactor names that have no Wikipedia page.
# Run this cell once for each sublist, from list_of_list_SD[0] to list_of_list_SD[9].
coord, unknown = wiki_get_coordinates(list_of_list_SD[9])

In [31]:
coordinate_list_SD = coordinate_list_SD + coord
unknown_list_SD = unknown_list_SD + unknown
%store coordinate_list_SD
%store unknown_list_SD
print("Number of unknown data:", len(unknown_list_SD))
print("Number of coordinate data found", len(coordinate_list_SD))
print("Unknown data: ", unknown_list_SD)

Stored 'coordinate_list_SD' (list)
Stored 'unknown_list_SD' (list)
Number of unknown data: 9
Number of coordinate data found 189
Unknown data:  ['SUPER-PHENIX', 'KNK II', 'MZFR', 'WUERGASSEN', 'CHINSHAN-1', 'CHINSHAN-2', 'SIZEWELL A-1', 'SIZEWELL A-2', 'CVTR']


In [32]:
# convert coordinate list to pandas dataframe
SD_df_coord = pd.DataFrame(coordinate_list_SD, columns = ['REACTOR_NAME', 'LATITUDE', 'LONGITUDE'])
%store SD_df_coord
print('Unknown coordinates:', unknown_list_SD)
SD_df_coord.shape

Stored 'SD_df_coord' (DataFrame)
Unknown coordinates: ['SUPER-PHENIX', 'KNK II', 'MZFR', 'WUERGASSEN', 'CHINSHAN-1', 'CHINSHAN-2', 'SIZEWELL A-1', 'SIZEWELL A-2', 'CVTR']


(189, 3)

In [33]:
SD_df_loc = SD_df.copy()
SD_df_loc['LATITUDE'] = SD_df_coord.iloc[:,1]
SD_df_loc['LONGITUDE'] = SD_df_coord.iloc[:,2]
SD_df_loc.head()

Unnamed: 0,CC,COUNTRY,REACTOR_CODE,REACTOR_NAME,TYPE,MWT,MWE_G,MWE_N,OPERATOR,NSSS_SUPPLIER,CONST_START,GRID_CONN,OP_START,SHUTDOWN,LATITUDE,LONGITUDE
0,am,ARMENIA,AM-18,ARMENIAN-1,PWR,1375,408,376,ANPPCJSC,FAEA,07-1969,12-1976,10-1977,02-1989,40.180844,44.148908
1,be,BELGIUM,BE-1,BR-3,PWR,41,12,10,CEN/SCK,WH,11-1957,10-1962,10-1962,06-1987,47.512222,34.585833
2,bg,BULGARIA,BG-1,KOZLODUY-1,PWR,1375,440,408,KOZNPP,AEE,04-1970,07-1974,10-1974,12-2020,43.746111,23.770556
3,bg,BULGARIA,BG-2,KOZLODUY-2,PWR,1375,440,408,KOZNPP,AEE,04-1970,08-1975,11-1975,12-2020,43.783333,23.733333
4,bg,BULGARIA,BG-3,KOZLODUY-3,PWR,1375,440,408,KOZNPP,AEE,10-1973,12-1980,01-1981,12-2020,43.746111,23.770556


In [34]:
#SD_df_loc.to_csv("NPP_permanent_shutdown_2019_&loc.csv", index=False)

##### Manual search results on wikipedia.org for NPPs with unknown location coordinates:

|Reactor name|LATITUDE|LONGITUDE|
|--- |:-- |:-- |
|SUPER-PHENIX|45.75833333|5.47222222|
|KNK II|49.09944444|8.43277778|
|MZFR|49.10416667|8.43277778|
|WUERGASSEN|51.63916667|9.39138889|
|CHINSHAN 1-2|25.28583333|121.58611111|
|SIZEWELL A-1 & A-2|52.2155|1.6198|
|CVTR|34.26250000|-81.32916667|

## 3. Plot Worldwide NPPs Locations

In [5]:
import pandas as pd
import folium
OP_loc_plot_df = pd.read_csv("file:///D:/Projects/Worldwide_NPP_Analysis/NPP_operational_2019_&loc.csv")
SD_loc_plot_df = pd.read_csv("file:///D:/Projects/Worldwide_NPP_Analysis/NPP_permanent_shutdown_2019_&loc.csv")
world_npp_map = folium.Map(location=[0,0], zoom_start=2, tiles='OpenStreetMap')
#world_npp_map.save('index.html')
#world_npp_map

In [7]:
# Operational NPPs: blue circle markers
for lat, lng, label in zip(OP_loc_plot_df.LATITUDE,
                           OP_loc_plot_df.LONGITUDE,
                           OP_loc_plot_df.REACTOR_NAME):
    folium.CircleMarker(
        [lat, lng],
        radius=5, # marker radius in pixels
        color='blue',
        fill=True,
        popup=folium.Popup(label, parse_html=True),
        fill_color='white',
        fill_opacity=0.6
    ).add_to(world_npp_map)

# Permanently shut down NPPs: red circle markers
for lat2, lng2, label2 in zip(SD_loc_plot_df.LATITUDE,
                              SD_loc_plot_df.LONGITUDE,
                              SD_loc_plot_df.REACTOR_NAME):
    folium.CircleMarker(
        [lat2, lng2],
        radius=5, # marker radius in pixels
        color='red',
        fill=True,
        popup=folium.Popup(label2, parse_html=True),
        fill_color='white',
        fill_opacity=0.6
    ).add_to(world_npp_map)

# show map
world_npp_map.save('index.html')
world_npp_map
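
If any coordinates are still missing or out of range when the CSV files are loaded, it may be safer to skip those rows than to pass them to the map. The helper below is a minimal sketch of such a filter on plain (lat, lng, label) tuples like the ones produced by the zip calls above. The name valid_points is hypothetical, and the NaN row in the example is made up for illustration.

```python
import math

def valid_points(points):
    """Yield (lat, lng, label) tuples whose coordinates are real numbers
    within the valid latitude and longitude ranges."""
    for lat, lng, label in points:
        if lat is None or lng is None:
            continue
        if math.isnan(lat) or math.isnan(lng):
            continue
        if -90 <= lat <= 90 and -180 <= lng <= 180:
            yield lat, lng, label

points = [
    (51.324722, 4.258611, "DOEL-1"),
    (float("nan"), float("nan"), "NARORA-2"),  # illustrative missing coordinate
    (40.180844, 44.148908, "ARMENIAN-2"),
]
print([label for _, _, label in valid_points(points)])
# ['DOEL-1', 'ARMENIAN-2']
```

The plotting loops could then iterate over valid_points(zip(...)) instead of the raw zip.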