# Designing an Infrasound-based Early Warning System for Merapi Volcano

### Introduction

Volcanic eruptions have killed more than 300,000 people worldwide since the 16th century, with Southeast Asia accounting for more than half of this total (Brown et al., 2017). Merapi Volcano (2968 m asl) is an active stratovolcano located in Sleman Regency, Yogyakarta Special Territory. About 1.1 million people currently live on its slopes, 444,000 of them in areas at high risk of pyroclastic flows, surges and lahars (Thouret et al., 2000).

The aims of this project are to:
- Develop a machine learning model that gives timely predictions of the likelihood of a volcanic eruption at Mount Merapi from infrasound data.
- Evaluate the model on a test dataset and compare its performance with existing methods for predicting volcanic eruptions.

In [2]:
%matplotlib inline
import matplotlib.pyplot as plt
from obspy import UTCDateTime
from obspy.clients.fdsn import Client

plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = 12, 8

### Data Ingestion

We will gather infrasound signals from Mount Merapi recorded by the I06AU, I04AU, I52GB, I07AU and I39PW International Monitoring System (IMS) infrasound arrays, accessed through IRIS Data Services. The start time and end time of each array element are listed in the table below:

| No | Network | Station | Latitude (°) | Longitude (°) | Elevation (m) | Site Name | Start Time | End Time |
|:--:|:-------:|:-------:|:----------:|:----------:|:---------:|:---------------------------------------------------------:|:-------------------:|:-------------------:|
|  1 |    IM   |  I06H1  |  -12.14645 |  96.82032  |    16.8   |     Cocos Islands Infrasound Array, Site H1, Australia    | 2011-10-31T00:00:00 | 2599-12-31T23:59:59 |
|  2 |    IM   |  I06H2  |  -12.14752 |  96.81855  |    17.6   |     Cocos Islands Infrasound Array, Site H2, Australia    | 2011-10-31T00:00:00 | 2599-12-31T23:59:59 |
|  3 |    IM   |  I06H3  |  -12.14509 |  96.817741 |    20.8   |     Cocos Islands Infrasound Array, Site H3, Australia    | 2011-10-31T00:00:00 | 2599-12-31T23:59:59 |
|  4 |    IM   |  I06H4  |  -12.14354 |  96.818489 |    16.0   |     Cocos Islands Infrasound Array, Site H4, Australia    | 2011-10-31T00:00:00 | 2599-12-31T23:59:59 |
|  5 |    IM   |  I06H5  |  -12.1447  |  96.820847 |    19.3   |     Cocos Islands Infrasound Array, Site H5, Australia    | 2011-10-31T00:00:00 | 2599-12-31T23:59:59 |
|  6 |    IM   |  I06H6  |  -12.14585 |  96.82428  |    29.2   |     Cocos Islands Infrasound Array, Site H6, Australia    | 2011-10-31T00:00:00 | 2599-12-31T23:59:59 |
|  7 |    IM   |  I06H7  |  -12.15422 |  96.826881 |    14.0   |     Cocos Islands Infrasound Array, Site H7, Australia    | 2011-10-31T00:00:00 | 2599-12-31T23:59:59 |
|  8 |    IM   |  I06H8  |  -12.15724 |  96.821693 |    11.9   |     Cocos Islands Infrasound Array, Site H8, Australia    | 2011-10-31T00:00:00 | 2599-12-31T23:59:59 |
|  9 |    IM   |  I04H1  |  -34.59761 | 116.356689 |   167.1   |       Narrogin Infrasound Array, Site H1, Australia       | 2006-02-01T00:00:00 | 2599-12-31T23:59:59 |
| 10 |    IM   |  I04H2  | -34.596169 | 116.367142 |   155.7   |       Narrogin Infrasound Array, Site H2, Australia       | 2006-02-01T00:00:00 | 2599-12-31T23:59:59 |
| 11 |    IM   |  I04H3  | -34.607571 | 116.351547 |   123.2   |       Narrogin Infrasound Array, Site H3, Australia       | 2006-02-01T00:00:00 | 2599-12-31T23:59:59 |
| 12 |    IM   |  I04H4  | -34.594002 | 116.344582 |   154.7   |       Narrogin Infrasound Array, Site H4, Australia       | 2006-02-01T00:00:00 | 2599-12-31T23:59:59 |
| 13 |    IM   |  I04H5  | -34.594711 | 116.341087 |   147.7   |       Narrogin Infrasound Array, Site H5, Australia       | 2006-02-01T00:00:00 | 2599-12-31T23:59:59 |
| 14 |    IM   |  I04H6  | -34.593121 | 116.341042 |   146.8   |       Narrogin Infrasound Array, Site H6, Australia       | 2006-02-01T00:00:00 | 2599-12-31T23:59:59 |
| 15 |    IM   |  I04H7  | -34.591122 | 116.343163 |   143.2   |       Narrogin Infrasound Array, Site H7, Australia       | 2006-02-01T00:00:00 | 2599-12-31T23:59:59 |
| 16 |    IM   |  I04H8  | -34.592918 |  116.34639 |   148.7   |       Narrogin Infrasound Array, Site H8, Australia       | 2006-02-01T00:00:00 | 2599-12-31T23:59:59 |
| 17 |    IM   |  I52H1  |  -7.37779  |  72.484169 |    2.3    |   Diego Garcia infrasound array site H1, United Kingdom   | 2002-12-18T00:00:00 | 2599-12-31T23:59:59 |
| 18 |    IM   |  I52H2  |  -7.37058  |  72.482361 |   1000.0  | Diego Garcia Infrasonic Array, Site I52H2, United Kingdom | 2003-06-24T00:00:00 | 2599-12-31T23:59:59 |
| 19 |    IM   |  I52H3  |  -7.37962  |  72.490784 |   1000.0  | Diego Garcia Infrasonic Array, Site I52H3, United Kingdom | 2003-06-24T00:00:00 | 2599-12-31T23:59:59 |
| 20 |    IM   |  I52H4  |   -7.3871  |  72.478127 |   1000.0  | Diego Garcia Infrasonic Array, Site I52H4, United Kingdom | 2003-06-24T00:00:00 | 2599-12-31T23:59:59 |
| 21 |    IM   |  I52H5  |  -7.37559  |  72.483948 |    2.28   |   Diego Garcia infrasound array site H5, United Kingdom   | 2002-12-18T00:00:00 | 2599-12-31T23:59:59 |
| 22 |    IM   |  I52H6  |  -7.37593  |  72.485573 |   1000.0  | Diego Garcia Infrasonic Array, Site I52H6, United Kingdom | 2003-06-24T00:00:00 | 2599-12-31T23:59:59 |
| 23 |    IM   |  I52H7  |  -7.37865  |  72.485626 |   1000.0  | Diego Garcia Infrasonic Array, Site I52H7, United Kingdom | 2003-06-24T00:00:00 | 2599-12-31T23:59:59 |
| 24 |    IM   |  I07H1  | -19.934851 | 134.329544 |   385.5   |     Warramunga Infrasonic Array, Site I07H1, Australia    | 2000-05-12T00:00:00 | 2599-12-31T23:59:59 |
| 25 |    IM   |  I07H2  | -19.933411 | 134.330933 |   387.6   |     Warramunga Infrasonic Array, Site I07H2, Australia    | 2000-05-12T00:00:00 | 2599-12-31T23:59:59 |
| 26 |    IM   |  I07H3  |  -19.93659 | 134.330734 |   392.0   |     Warramunga Infrasonic Array, Site I07H3, Australia    | 2000-05-12T00:00:00 | 2599-12-31T23:59:59 |
| 27 |    IM   |  I07H4  |  -19.93502 | 134.327698 |   383.0   |     Warramunga Infrasonic Array, Site I07H4, Australia    | 2000-05-12T00:00:00 | 2599-12-31T23:59:59 |
| 28 |    IM   |  I07H5  | -19.924549 |  134.32579 |   387.3   |     Warramunga Infrasonic Array, Site I07H5, Australia    | 2013-05-23T00:00:00 | 2599-12-31T23:59:59 |
| 29 |    IM   |  I07H6  | -19.932699 | 134.338852 |   387.5   |     Warramunga Infrasonic Array, Site I07H6, Australia    | 2013-05-23T00:00:00 | 2599-12-31T23:59:59 |
| 30 |    IM   |  I07H7  |  -19.94022 |  134.33963 |   387.7   |     Warramunga Infrasonic Array, Site I07H7, Australia    | 2013-05-23T00:00:00 | 2599-12-31T23:59:59 |
| 31 |    IM   |  I07H8  | -19.943291 | 134.322906 |   388.4   |     Warramunga Infrasonic Array, Site I07H8, Australia    | 2013-05-23T00:00:00 | 2599-12-31T23:59:59 |

#### Data Window

The data window for each eruption spans two months before and two months after the eruption's start date. We will use the Volcanoes of the World (VOTW) web services to retrieve these dates.
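As a minimal sketch of how such a window could be computed, the snippet below shifts an eruption start date by whole calendar months. The helper names `shift_months` and `eruption_window` are hypothetical, and clamping the day to the end of shorter months is an assumption; the notebook itself does not specify how "two months" is measured.

```python
import calendar
from datetime import datetime

def shift_months(dt, months):
    """Shift dt by a whole number of calendar months, clamping the day if needed."""
    index = dt.year * 12 + (dt.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    # Clamp the day so that e.g. 31 December + 2 months lands on the last day of February
    day = min(dt.day, calendar.monthrange(year, month)[1])
    return dt.replace(year=year, month=month, day=day)

def eruption_window(start, months=2):
    """Return (window_start, window_end) centred on the eruption start date."""
    return shift_months(start, -months), shift_months(start, months)

# Example: the eruption that started on 2018-05-11
window_start, window_end = eruption_window(datetime(2018, 5, 11))
print(window_start, window_end)
```

If a fixed-length window is preferred, a `timedelta` of roughly 60 days would be a simpler alternative.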

In [31]:
import pandas as pd

# Load the CSV results from the VOTW server into a pandas DataFrame
server = 'https://webservices.volcano.si.edu/geoserver/GVP-VOTW/wms?'
query = 'service=WFS&version=2.0.0&request=GetFeature&typeName=GVP-VOTW:E3WebApp_Eruptions1960&outputFormat=csv'
df = pd.read_csv(server + query)

# Index by activity ID and drop the columns that are not needed
df = df.set_index('Activity_ID')
df = df.drop(columns=['FID', 'LatitudeDecimal', 'LongitudeDecimal', 'GeoLocation'])

# Keep only Merapi eruptions (volcano number 263250) that started after 2010;
# .copy() lets later cells modify the result without a SettingWithCopyWarning
df = df.query('VolcanoNumber == 263250 and StartDateYear > 2010').copy()

print(type(df))
df.head()

<class 'pandas.core.frame.DataFrame'>


| Activity_ID | VolcanoNumber | VolcanoName | ExplosivityIndexMax | StartDate | StartDateYear | StartDateMonth | StartDateDay | EndDate | EndDateYear | EndDateMonth | EndDateDay | ContinuingEruption |
|:-----------:|:-------------:|:-----------:|:-------------------:|:---------:|:-------------:|:--------------:|:------------:|:-------:|:-----------:|:------------:|:----------:|:------------------:|
| 22264 | 263250 | Merapi | 3 | 20180511.0 | 2018 | 5.0 | 11.0 | 20200621.0 | 2020.0 | 6.0 | 21.0 | False |
| 20842 | 263250 | Merapi | 2 | 20131118.0 | 2013 | 11.0 | 18.0 | 20131118.0 | 2013.0 | 11.0 | 18.0 | False |
| 20892 | 263250 | Merapi | 3 | 20140309.0 | 2014 | 3.0 | 9.0 | 20140420.0 | 2014.0 | 4.0 | 20.0 | False |
| 22381 | 263250 | Merapi | 1 | 20201231.0 | 2020 | 12.0 | 31.0 | 20230224.0 | 2023.0 | 2.0 | 24.0 | True |
| 22476 | 263250 | Merapi | 1 | 20130722.0 | 2013 | 7.0 | 22.0 | 20130722.0 | 2013.0 | 7.0 | 22.0 | False |


In [44]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 6 entries, 22264 to 22488
Data columns (total 12 columns):
 #   Column               Non-Null Count  Dtype  
---  ------               --------------  -----  
 0   VolcanoNumber        6 non-null      int64  
 1   VolcanoName          6 non-null      object 
 2   ExplosivityIndexMax  6 non-null      int64  
 3   StartDate            6 non-null      float64
 4   StartDateYear        6 non-null      int64  
 5   StartDateMonth       6 non-null      float64
 6   StartDateDay         6 non-null      float64
 7   EndDate              6 non-null      float64
 8   EndDateYear          6 non-null      float64
 9   EndDateMonth         6 non-null      float64
 10  EndDateDay           6 non-null      float64
 11  ContinuingEruption   6 non-null      bool   
dtypes: bool(1), float64(7), int64(3), object(1)
memory usage: 582.0+ bytes


In [46]:


from datetime import datetime

# Convert a YYYYMMDD date string to 'YYYY-MM-DD HH:MM:SS' (the time is set to midnight)
def convert_to_utc(date_str):
    year = int(date_str[0:4])
    month = int(date_str[4:6])
    day = int(date_str[6:8])
    date = datetime(year, month, day)
    return date.strftime('%Y-%m-%d %H:%M:%S')

# Example usage for the first row in the dataset
start_date = '20180511.0'
end_date = '20200621.0'
start_date_utc = convert_to_utc(start_date)
end_date_utc = convert_to_utc(end_date)
print(start_date_utc, end_date_utc)

2018-05-11 00:00:00 2020-06-21 00:00:00


In [92]:
import pandas as pd
from datetime import datetime

# Convert a YYYYMMDD date (stored as a float) to 'YYYY-MM-DD HH:MM:SS'
def convert_to_utc(date_value):
    # Already-converted strings are returned unchanged; without this guard,
    # re-running the cell on converted columns raises
    # ValueError: invalid literal for int() with base 10: '2018-05-11 00:00:00'
    if isinstance(date_value, str) and '-' in date_value:
        return date_value
    date_str = str(int(float(date_value)))
    year = int(date_str[0:4])
    month = int(date_str[4:6])
    day = int(date_str[6:8])
    date = datetime(year, month, day)
    return date.strftime('%Y-%m-%d %H:%M:%S')

# Apply the function to the StartDate and EndDate columns in the DataFrame
df['StartDate'] = df['StartDate'].apply(convert_to_utc)
df['EndDate'] = df['EndDate'].apply(convert_to_utc)

In [51]:
df

| Activity_ID | VolcanoNumber | VolcanoName | ExplosivityIndexMax | StartDate | StartDateYear | StartDateMonth | StartDateDay | EndDate | EndDateYear | EndDateMonth | EndDateDay | ContinuingEruption |
|:-----------:|:-------------:|:-----------:|:-------------------:|:---------:|:-------------:|:--------------:|:------------:|:-------:|:-----------:|:------------:|:----------:|:------------------:|
| 22264 | 263250 | Merapi | 3 | 2018-05-11 00:00:00 | 2018 | 5.0 | 11.0 | 2020-06-21 00:00:00 | 2020.0 | 6.0 | 21.0 | False |
| 20842 | 263250 | Merapi | 2 | 2013-11-18 00:00:00 | 2013 | 11.0 | 18.0 | 2013-11-18 00:00:00 | 2013.0 | 11.0 | 18.0 | False |
| 20892 | 263250 | Merapi | 3 | 2014-03-09 00:00:00 | 2014 | 3.0 | 9.0 | 2014-04-20 00:00:00 | 2014.0 | 4.0 | 20.0 | False |
| 22381 | 263250 | Merapi | 1 | 2020-12-31 00:00:00 | 2020 | 12.0 | 31.0 | 2023-02-24 00:00:00 | 2023.0 | 2.0 | 24.0 | True |
| 22476 | 263250 | Merapi | 1 | 2013-07-22 00:00:00 | 2013 | 7.0 | 22.0 | 2013-07-22 00:00:00 | 2013.0 | 7.0 | 22.0 | False |
| 22488 | 263250 | Merapi | 1 | 2011-03-25 00:00:00 | 2011 | 3.0 | 25.0 | 2011-09-08 00:00:00 | 2011.0 | 9.0 | 8.0 | False |


In [77]:
df['EndDate'].max()

'2023-02-24 00:00:00'

In [80]:
df['StartDate'].min()

'2011-03-25 00:00:00'
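
The minimum and maximum above are taken over formatted strings, which works only because the `YYYY-MM-DD HH:MM:SS` layout happens to sort chronologically. As a hedged alternative, the sketch below parses the raw `YYYYMMDD` values into timezone-aware datetimes so that comparisons are made on real dates. The helper name `parse_gvp_date` is hypothetical, and treating the dates as midnight UTC is an assumption, since the source data carry no time of day.

```python
from datetime import datetime, timezone

def parse_gvp_date(value):
    """Parse a YYYYMMDD value (float, int, or a string such as '20180511.0')
    into a timezone-aware datetime at midnight UTC."""
    digits = str(int(float(value)))
    return datetime.strptime(digits, "%Y%m%d").replace(tzinfo=timezone.utc)

# Start dates of the six Merapi eruptions listed in the DataFrame above
raw_starts = [20180511.0, 20131118.0, 20140309.0, 20201231.0, 20130722.0, 20110325.0]
starts = [parse_gvp_date(v) for v in raw_starts]

earliest = min(starts)
print(earliest.isoformat())
```

An object of this kind can be passed directly to ObsPy's `UTCDateTime` constructor when building waveform requests.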

In [27]:
from obspy import UTCDateTime
from obspy.clients.fdsn import Client

client = Client("IRIS")
t2 = UTCDateTime("2014-04-20 00:00:00")

# Define the bulk request: (network, station, location, channel, start time, end time)
bulk = [("IM", "I06H1", "*", "*", UTCDateTime("2014-04-03T00:18:09.250000Z"), t2),
        ("IM", "I52H1", "*", "*", UTCDateTime("2014-04-03T00:29:17.550000Z"), t2)]


# Retrieve the waveforms for all requested stations in a single bulk request
st = client.get_waveforms_bulk(bulk)

filename = "test.sac"
st.write(filename, format="SAC")
print(f"Data saved to {filename}")

KeyboardInterrupt: 

In [35]:
t2 = UTCDateTime("2014-04-20 00:00:00")
t3 = UTCDateTime("2014-04-03T00:13:00.000000Z")

print(t2-t3)

1468020.0


In [91]:
st = client.get_waveforms_bulk(bulk, attach_response=True)

FDSNTimeoutException: Timed Out
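
The interrupted request and the timeout above suggest that roughly 17 days of data on every channel may be too much for a single bulk request. One possible workaround, sketched here under that assumption, is to split the window into daily chunks and request each chunk separately. The helpers `split_window` and `build_bulk` are hypothetical names, and the one-day chunk size is an arbitrary choice rather than a tested value.

```python
from datetime import datetime, timedelta

def split_window(start, end, chunk=timedelta(days=1)):
    """Yield consecutive (chunk_start, chunk_end) pairs covering [start, end)."""
    current = start
    while current < end:
        chunk_end = min(current + chunk, end)
        yield current, chunk_end
        current = chunk_end

def build_bulk(stations, start, end, network="IM", location="*", channel="*"):
    """Build one bulk-request tuple per station for a single time chunk."""
    return [(network, station, location, channel, start, end) for station in stations]

# Same window as the t3..t2 difference computed above
chunks = list(split_window(datetime(2014, 4, 3, 0, 13), datetime(2014, 4, 20)))
first_request = build_bulk(["I06H1", "I52H1"], *chunks[0])

print(len(chunks))
print(first_request)
```

With ObsPy, the chunk boundaries would be wrapped in `UTCDateTime(...)` before each list is passed to `client.get_waveforms_bulk`, and the resulting streams could be written to disk chunk by chunk so that a failure does not discard earlier downloads.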