## Import the Required Python Modules
The following modules are used:
1. `requests` fetches pages from http://mounts-project.com
2. `json` parses and formats JSON data
3. `re` extracts the JSON data from a JavaScript variable using a regular expression
4. `pandas` loads the data into a DataFrame and converts it to Excel or CSV
5. `os` handles directories and file paths
6. `JSON` displays JSON data (it is not imported in the cell below)
7. `matplotlib` plots the time series and formats the date axis

In [159]:
import requests
import json
import re
import pandas as pd
import os

# To plot
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

## Initialize the Required Variables
Initialize the variables for the http://mounts-project.com website

In [160]:
mounts_url: str = 'http://mounts-project.com/timeseries/'
include_anomaly: bool = False
filter_value: float = 0.1

Initialize `output_directory`, the directory where the extracted data is saved

In [161]:
output_directory = os.path.join(os.getcwd(), 'output')

output_directory

'D:\\Projects\\extract-mounts\\output'

In [162]:
if not os.path.exists(output_directory):
    os.mkdir(output_directory)

The `volcanoes` variable holds the volcano codes, which are Smithsonian volcano numbers (https://volcano.si.edu/).  
To fetch data for several volcanoes, put their codes in a `list`

In [163]:
volcanoes = [
    {
        "name" : "Lewotobi Laki-laki",
        "code" : 264180,
    },
    {
        "name" : "Marapi",
        "code" : 261140,
    },
    {
        "name" : "Anak Krakatau",
        "code" : 262000,
    },
    {
        "name" : "Kerinci",
        "code" : 261170,
    },
    {
        "name" : "Karangetang",
        "code" : 267020,
    },
    {
        "name" : "Dukono",
        "code" : 268010,
    },
    {
        "name" : "Ili Lewotolok",
        "code" : 264230,
    },
    {
        "name" : "Ibu",
        "code" : 268030,
    },
    {
        "name" : "Semeru",
        "code" : 263300,
    },
    {
        "name" : "Raung",
        "code" : 263340,
    },
    {
        "name" : "Ijen",
        "code" : 263350,
    },
    {
        "name" : "Slamet",
        "code" : 263180
    }
]

In [164]:
for index, volcano in enumerate(volcanoes):
    print(volcano['name'], volcano['code'])

Lewotobi Laki-laki 264180
Marapi 261140
Anak Krakatau 262000
Kerinci 261170
Karangetang 267020
Dukono 268010
Ili Lewotolok 264230
Ibu 268030
Semeru 263300
Raung 263340
Ijen 263350
Slamet 263180
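
Each entry's `code` is later appended to `mounts_url` to form the page address. A minimal sketch of that mapping, using two entries from the list above (the `urls` name is introduced here only for illustration):

```python
# Base address taken from the notebook's `mounts_url` variable
mounts_url = 'http://mounts-project.com/timeseries/'

# Two entries copied from the `volcanoes` list above
volcanoes = [
    {"name": "Lewotobi Laki-laki", "code": 264180},
    {"name": "Marapi", "code": 261140},
]

# `urls` is an illustrative name: one timeseries URL per volcano code
urls = [mounts_url + str(volcano['code']) for volcano in volcanoes]

for volcano, url in zip(volcanoes, urls):
    print(volcano['name'], url)
```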


Initialize an empty dictionary to store the extracted DataFrames, keyed by volcano name. It will hold the Thermal and SO2 data for each volcano

In [165]:
dataframes = {}

---
## Define the Functions Used to Extract, Transform, and Export Data
This function extracts the Thermal and SO2 JSON from the JavaScript on the Mounts Project page. The data is stored in a JavaScript variable named `graph`

In [166]:
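def get_json_from_javascript(response):
    # Capture the object literal assigned to `var graph = {...}` in the page source
    var_graph = re.search(r"(?:^|\s|;)var\s+graph\s*=\s*([^']+})", response.text)
    if var_graph is None:
        raise ValueError('JavaScript variable `graph` not found in {}'.format(response.url))
    string_graph = var_graph.group(1)
    json_graph = json.loads(string_graph)
    return json_graph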
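
To see what the regular expression captures, here is a small sketch that runs the same pattern against a made-up page fragment. The `sample_html` string is an assumption written for illustration; it is not the real page content, and the real `graph` object is much larger.

```python
import json
import re

# Made-up page fragment for illustration only (not the real Mounts page)
sample_html = """
<script>
var graph = {"data": [{"x": ["2024-01-01 00:00:00"], "y": [1.5], "text": ["a"]}], "layout": {}};
</script>
"""

# Same pattern as in get_json_from_javascript: the group greedily takes
# every character that is not a single quote, then backs up to the last `}`
match = re.search(r"(?:^|\s|;)var\s+graph\s*=\s*([^']+})", sample_html)

graph = json.loads(match.group(1))
print(graph['data'][0]['y'])
```

Because the group stops at the last `}` before any single quote, the pattern relies on the page not placing another closing brace after the `graph` object and before the next single quote.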

The JSON returned by `get_json_from_javascript` is then used to extract the **SO2** and **Thermal** values

In [167]:
def get_so2_values(graph_json):
    so2_values = {
        'Date time': graph_json['data'][2]['x'],
        'Value': graph_json['data'][2]['y'],
        'Graph': graph_json['data'][2]['text'],
    }
    so2_df = pd.DataFrame.from_dict(so2_values)
    so2_df['Type'] = 'SO2'
    return so2_df

In [168]:
def get_thermal_values(graph_json):
    thermal_values = {
        'Date time': graph_json['data'][0]['x'],
        'Value': graph_json['data'][0]['y'],
        'Graph': graph_json['data'][0]['text'],
    }
    thermal_df = pd.DataFrame.from_dict(thermal_values)
    thermal_df['Type'] = 'Thermal'
    return thermal_df
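
The two functions differ only in the trace index they read (`2` for SO2, `0` for Thermal) and the label they assign. A possible way to express that once is sketched below; `trace_to_frame` and `sample_graph` are hypothetical names and data introduced for illustration, and the middle trace is left empty because the notebook does not use it.

```python
import pandas as pd

def trace_to_frame(graph_json, trace_index, type_label):
    """Hypothetical helper: build one DataFrame from a single trace."""
    trace = graph_json['data'][trace_index]
    frame = pd.DataFrame({
        'Date time': trace['x'],
        'Value': trace['y'],
        'Graph': trace['text'],
    })
    frame['Type'] = type_label
    return frame

# Made-up data shaped like the fields the notebook reads
sample_graph = {
    'data': [
        {'x': ['2024-01-01 10:00:00', '2024-01-02 10:00:00'], 'y': [3, 5], 'text': ['t1', 't2']},
        {'x': [], 'y': [], 'text': []},
        {'x': ['2024-01-01 12:00:00'], 'y': [42.0], 'text': ['s1']},
    ]
}

thermal = trace_to_frame(sample_graph, 0, 'Thermal')
so2 = trace_to_frame(sample_graph, 2, 'SO2')
print(pd.concat([so2, thermal]))
```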

Split the `Date time` column into `Date` and `Time`

In [169]:
def convert_to_date(date_time):
    return date_time.strftime("%Y-%m-%d")

In [170]:
def convert_to_time(date_time):
    return date_time.strftime("%H:%M:%S")

This function exports the extracted data to Excel and CSV files

In [171]:
def export_to_excel(filtered_df, volcano_code, volcano_name, csv_dir):
    filename = '{} - {}'.format(volcano_name, volcano_code)
    path_excel = os.path.join(output_directory, filename)
    path_csv = os.path.join(csv_dir, filename)
    filtered_df.to_csv('{}.csv'.format(path_csv))
    filtered_df.to_excel('{}.xlsx'.format(path_excel), sheet_name='Join Data')
    return (
        '{}.xlsx'.format(path_excel),
        '{}.csv'.format(path_csv)
    )
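
The CSV written here is read back later with `index_col='Date time'` and `parse_dates=True`. Below is a sketch of that round trip in a temporary directory. The volcano name `Example`, the code `123456`, and the data are placeholders, and the Excel step is left out because `to_excel` needs an Excel engine such as openpyxl to be installed.

```python
import os
import tempfile

import pandas as pd

# Made-up frame indexed by 'Date time', like the notebook's filtered_df
frame = pd.DataFrame(
    {'Value': [1.2, 3.4], 'Type': ['SO2', 'Thermal']},
    index=pd.to_datetime(['2024-01-01 10:00:00', '2024-01-02 11:30:00']).rename('Date time'),
)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Same '<name> - <code>' pattern as export_to_excel; values are placeholders
    filename = '{} - {}'.format('Example', 123456)
    path_csv = os.path.join(tmp_dir, filename + '.csv')

    frame.to_csv(path_csv)

    # Read it back the way the plotting cell does
    restored = pd.read_csv(path_csv, index_col='Date time', parse_dates=True)

print(restored)
```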

---
## Main Code
The main data extraction code

In [172]:
print('🏃‍ Extracting....')
print('==================')
for index, volcano in enumerate(volcanoes):
    volcano_code = volcano['code']
    volcano_name = volcano['name']
    
    print('🌋 Extracting {} volcano'.format(volcano_name))
    # http://mounts-project.com/timeseries/262000
    url = mounts_url+str(volcano_code) 
    
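    # Open the page, e.g. http://mounts-project.com/timeseries/262000
    response = requests.get(url)
    # Stop early on an HTTP error instead of parsing an error page
    response.raise_for_status()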
    
    # Get data JSON
    graph_json = get_json_from_javascript(response)
    
    # save json
    json_dir = os.path.join(os.getcwd(), 'json')
    os.makedirs(json_dir, exist_ok = True)
    json_file = os.path.join(json_dir, '{}.json'.format(volcano['code']))
                             
    with open(json_file, "w") as write_file:
        json.dump(graph_json['data'], write_file, indent=2)
    
    so2 = get_so2_values(graph_json)
    thermal = get_thermal_values(graph_json)
    
    df = pd.concat([
        so2,
        thermal
    ])
    
    df['Date time'] = pd.to_datetime(df['Date time'])
    df['Date'] = df['Date time'].apply(convert_to_date)
    df['Time'] = df['Date time'].apply(convert_to_time)
    df['Code'] = volcano_code
    df['Volcano Name'] = volcano_name
    df.set_index('Date time', inplace=True)
    
    if include_anomaly:
        filter_value = 0
    
    filtered_df = df[df['Value'] > filter_value]
    dataframes[volcano_name] = filtered_df
    print('👌 {} Extracted!'.format(volcano_name))

print('==================')
print('✅ Finish!')

🏃‍ Extracting....
🌋 Extracting Lewotobi Laki-laki volcano
👌 Lewotobi Laki-laki Extracted!
🌋 Extracting Marapi volcano
👌 Marapi Extracted!
🌋 Extracting Anak Krakatau volcano
👌 Anak Krakatau Extracted!
🌋 Extracting Kerinci volcano
👌 Kerinci Extracted!
🌋 Extracting Karangetang volcano
👌 Karangetang Extracted!
🌋 Extracting Dukono volcano
👌 Dukono Extracted!
🌋 Extracting Ili Lewotolok volcano
👌 Ili Lewotolok Extracted!
🌋 Extracting Ibu volcano
👌 Ibu Extracted!
🌋 Extracting Semeru volcano
👌 Semeru Extracted!
🌋 Extracting Raung volcano
👌 Raung Extracted!
🌋 Extracting Ijen volcano
👌 Ijen Extracted!
🌋 Extracting Slamet volcano
👌 Slamet Extracted!
✅ Finish!
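
The filtering step keeps only rows whose `Value` is strictly greater than the threshold, and `include_anomaly = True` lowers that threshold to `0`. A small sketch with made-up values shows the effect; `filter_values` is a hypothetical wrapper, not a function from the notebook.

```python
import pandas as pd

def filter_values(df, filter_value=0.1, include_anomaly=False):
    """Hypothetical wrapper around the notebook's filtering logic."""
    threshold = 0 if include_anomaly else filter_value
    # Strict comparison: rows equal to the threshold are dropped
    return df[df['Value'] > threshold]

# Made-up values for illustration
sample = pd.DataFrame({'Value': [0.0, 0.05, 0.1, 2.0]})

default_rows = filter_values(sample)
with_anomaly = filter_values(sample, include_anomaly=True)

print(default_rows['Value'].tolist())
print(with_anomaly['Value'].tolist())
```

With the default threshold of `0.1`, a value of exactly `0.1` is excluded because the comparison is strict.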


## Saving output

In [173]:
df_csv = pd.DataFrame()
concat_df = []

csv_directory = os.path.join(output_directory, 'csv')
os.makedirs(csv_directory, exist_ok = True)

print('⌚ Saving output!')
print('==================')

for index, volcano in enumerate(volcanoes):
    volcano_code = volcano['code']
    volcano_name = volcano['name']
    
    excel_file, csv_file = export_to_excel(dataframes[volcano_name], volcano_code, volcano_name, csv_directory)
    concat_df.append(dataframes[volcano_name])
    
    df_csv = pd.concat([
        df_csv, pd.DataFrame([
            {
                "code" : volcano_code,
                "volcano_name" : volcano_name,
                "filename" : excel_file,
                "csv": csv_file,
                "updated_at" : dataframes[volcano_name].index.max()
            }]
    )], ignore_index=True)
    print('💾 {} saved to: {}'.format(volcano_name, excel_file))

all_volcano = os.path.join(output_directory, 'All Volcano.xlsx')
merged = pd.concat(concat_df)
merged.to_excel(all_volcano, sheet_name='Join Data')

df_csv.to_csv('output.csv', index=False)

print('💾 All Volcano saved into: {}'.format(all_volcano))
print('==================')
print('✅ Done!')

⌚ Saving output!
💾 Lewotobi Laki-laki saved to: D:\Projects\extract-mounts\output\Lewotobi Laki-laki - 264180.xlsx
💾 Marapi saved to: D:\Projects\extract-mounts\output\Marapi - 261140.xlsx
💾 Anak Krakatau saved to: D:\Projects\extract-mounts\output\Anak Krakatau - 262000.xlsx
💾 Kerinci saved to: D:\Projects\extract-mounts\output\Kerinci - 261170.xlsx
💾 Karangetang saved to: D:\Projects\extract-mounts\output\Karangetang - 267020.xlsx
💾 Dukono saved to: D:\Projects\extract-mounts\output\Dukono - 268010.xlsx
💾 Ili Lewotolok saved to: D:\Projects\extract-mounts\output\Ili Lewotolok - 264230.xlsx
💾 Ibu saved to: D:\Projects\extract-mounts\output\Ibu - 268030.xlsx
💾 Semeru saved to: D:\Projects\extract-mounts\output\Semeru - 263300.xlsx
💾 Raung saved to: D:\Projects\extract-mounts\output\Raung - 263340.xlsx
💾 Ijen saved to: D:\Projects\extract-mounts\output\Ijen - 263350.xlsx
💾 Slamet saved to: D:\Projects\extract-mounts\output\Slamet - 263180.xlsx
💾 All Volcano saved into: D:\Projects\extra

## Plotting and Saving Figures

In [177]:
figures_directory = os.path.join(output_directory, 'figures')
os.makedirs(figures_directory, exist_ok = True)

In [178]:
for index, row in df_csv.iterrows():
    volcano_name = row['volcano_name']
    csv_file = row['csv']
    
    df_mounts = pd.read_csv(csv_file, index_col='Date time', parse_dates=True)
    df_thermal = df_mounts[df_mounts['Type'] == 'Thermal'].loc[:, "Value"]
    df_so2 = df_mounts[df_mounts['Type'] == 'SO2'].loc[:, "Value"]
    
    labels = [
        {
            'name': 'SO2',
            'y_label': r'$SO_{2}$ mass [tons]',
            'df': df_so2,
            'df_smoothed': df_so2.rolling('3d').median(),
        },
        {
            'name': 'Thermal',
            'y_label': 'Thermal anomalies [N pix]',
            'df': df_thermal,
            'df_smoothed': df_thermal.rolling('3d').median(),
        }
    ]
    
    fig, axs = plt.subplots(nrows=2, ncols=1, figsize=(12, 6),
                        layout="constrained", sharex=True)
    
    fig.suptitle(volcano_name, fontsize=14)
    
    for axs_index, label in enumerate(labels):
        
        color = 'orange' if label['name'] == 'SO2' else 'red'
        log = label['df'].values.max() >= 1000
        
        axs[axs_index].bar(label['df'].index, label['df'].values,
                           width=0.9, edgecolor=None, linewidth=0, label=label['name'], color=color)
        
        # axs[axs_index].scatter(label['df'].index, label['df'].values, color=color, alpha=0.6, s=10, label=label['name'])
        # axs[axs_index].plot(label['df'].index, label['df_smoothed'].values, color=color, label=label['name'], alpha=1)
        
        if log:
            axs[axs_index].set_yscale('log')
        
        # Plot label only for the last subplot
        # if axs_index == 1:
        #     axs[axs_index].set_xlabel('Date')
        
        axs[axs_index].set_ylabel(label['y_label'])
            
        axs[axs_index].tick_params(axis='both', which='major', labelsize=8)
        axs[axs_index].xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m-%d'))
        axs[axs_index].set_xlim(label['df'].first_valid_index(), label['df'].last_valid_index())
             
        axs[axs_index].annotate(
            text=label['name'],
            xy=(0.02, 0.85),
            xycoords='axes fraction',
            fontsize='8',
            bbox=dict(facecolor='white', alpha=0.5)
        )
        
        # Rotate x label
        for _label in axs[axs_index].get_xticklabels(which='major'):
            _label.set(rotation=30, horizontalalignment='right')

    figures_name = os.path.join(figures_directory, '{}.jpg'.format(volcano_name))
    fig.savefig(figures_name)
    
    print('📊 Plot saved into: {}'.format(figures_name))
    plt.close()

📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Lewotobi Laki-laki.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Marapi.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Anak Krakatau.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Kerinci.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Karangetang.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Dukono.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Ili Lewotolok.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Ibu.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Semeru.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Raung.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Ijen.jpg
📊 Plot saved into: D:\Projects\extract-mounts\output\figures\Slamet.jpg
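
The plotting cell computes `df_smoothed` with a 3-day rolling median, although only the commented-out `plot` line uses it. The sketch below shows what that smoothing does on a made-up daily series; it writes the window as `'3D'`, the uppercase spelling of the same 3-day offset.

```python
import pandas as pd

# Made-up daily series with one spike, for illustration only
index = pd.date_range('2024-01-01', periods=5, freq='D')
series = pd.Series([1, 3, 5, 100, 7], index=index, name='Value')

# Time-based window: each point uses the values from the 3 days ending at it
smoothed = series.rolling('3D').median()

print(smoothed)
```

The median reduces the 100 spike to 5 on that day, which is why a rolling median is often preferred over a rolling mean for data with isolated outliers.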
