In [1]:
import obspy
import random
import string

# Creating a base database of all earthquakes
The NDK data file from the GlobalCMT project contains earthquakes from all over the world, here from January 1976 through December 2020. Attaching a unique event ID to each entry in the database makes later processing easier.


In [2]:
cat = obspy.read_events("jan76_dec20.ndk")

print(cat)

56832 Event(s) in Catalog:
1976-01-01T01:29:53.400000Z | -29.250, -176.960 | 7.25 Mwc
1976-01-05T02:31:44.700000Z | -13.420,  -75.140 | 5.65 Mwc
...
2020-12-31T19:50:21.800000Z |  -0.680, +146.830 | 5.18 Mwc
2020-12-31T23:12:39.300000Z |  -9.020, +119.060 | 4.96 Mwc
To see all events call 'print(CatalogObject.__str__(print_all=True))'


## Creating a unique event ID database
There are about 57 000 unique earthquake events in the catalog. For comparison, the USGS earthquake search catalog lists about 76 000 earthquakes with magnitude greater than 5 since 1976.

Each event in the database needs a unique ID. The IDs are random 8-character alphanumeric strings: a candidate string is generated and appended to the list only if it is not already in the list, and this repeats until the list holds n IDs (where n equals the number of earthquake events).


In [3]:
def generate_random_string(length):
    """Return a random string of ASCII letters and digits of the given length."""
    # source has 62 elements: 52 letters plus 10 digits
    source = string.ascii_letters + string.digits
    return ''.join(random.choice(source) for _ in range(length))

In [4]:
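# n is the number of earthquakes in cat
n = len(cat)

ID_list = []
seen = set()  # set lookup avoids rescanning the whole list for every new ID
while len(ID_list) < n:
    random_string = generate_random_string(8)
    if random_string not in seen:
        seen.add(random_string)
        ID_list.append(random_string)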
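
As a rough sanity check on the ID scheme, the chance that any two of the n random IDs collide can be estimated with the standard birthday approximation. This is only a sketch: the function name `collision_probability` is illustrative, and the numbers are the alphabet size (62), ID length (8) and event count (56832) used above.

```python
import math

ALPHABET_SIZE = 62   # 52 ASCII letters + 10 digits
ID_LENGTH = 8        # length passed to generate_random_string above
N_EVENTS = 56832     # number of events reported by print(cat)


def collision_probability(n_ids, alphabet_size, length):
    """Birthday approximation: P(collision) ~ 1 - exp(-n(n-1) / (2 * space))."""
    space = alphabet_size ** length
    expected_pairs = n_ids * (n_ids - 1) / (2 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x
    return -math.expm1(-expected_pairs)


p = collision_probability(N_EVENTS, ALPHABET_SIZE, ID_LENGTH)
print(f"ID space: {ALPHABET_SIZE ** ID_LENGTH:.3e}")
print(f"Approximate collision probability: {p:.2e}")
```

The probability is tiny, so the `not in` check in the loop is a safeguard that will almost never trigger, not the main source of uniqueness.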

## Converting the catalog into a dataframe
The function imported in the following cell converts an ObsPy catalog into a dataframe.

The dataframe contains normalized moment tensor information.

In [5]:
from dataframe_creation import create_dataframe_from_catalog

In [6]:
df = create_dataframe_from_catalog(cat)

## Attaching the ID list 
The list of random alphanumeric IDs generated previously is now attached to the dataframe as a new column.

The result is checked with the head method and then saved to the new base database file.

In [7]:
df["event_id"] = ID_list
df.head()

Unnamed: 0,longitude,latitude,depth,time,mag,mag_type,m_rr,m_tt,m_pp,m_rt,...,source_time_duration,source_time_function,gcmt_id,m_rr_norm,m_tt_norm,m_pp_norm,m_rt_norm,m_rp_norm,m_tp_norm,event_id
0,-176.96,-29.25,47800.0,1976-01-01T01:29:53.400000Z,7.25,Mwc,7.68e+19,9e+17,-7.77e+19,1.39e+19,...,18.8,box car,M010176A,0.988417,0.011583,-1.0,0.178893,0.581725,-0.419562,HTOOiGNR
1,-75.14,-13.42,85400.0,1976-01-05T02:31:44.700000Z,5.65,Mwc,-1.78e+17,-5.9e+16,2.37e+17,-1.28e+17,...,3.2,box car,C010576A,-0.613793,-0.203448,0.817241,-0.441379,0.67931,-1.0,AmPocKdR
2,159.5,51.45,15000.0,1976-01-06T21:08:25.100000Z,6.13,Mwc,1.1e+18,-3e+17,-8e+17,1.05e+18,...,5.6,box car,C010676A,0.887097,-0.241935,-0.645161,0.846774,1.0,-0.451613,SQXvPFxx
3,167.81,-15.97,173700.0,1976-01-09T23:54:40.100000Z,6.31,Mwc,-1.7e+18,2.29e+18,-5.9e+17,-2.33e+18,...,7.0,box car,C010976A,-0.729614,0.982833,-0.253219,-1.0,-0.527897,0.862661,vRfgO3yv
4,-16.29,66.33,15000.0,1976-01-13T13:29:24.900000Z,6.28,Mwc,-5.1e+17,-2.86e+18,3.37e+18,5e+16,...,6.8,box car,C011376A,-0.151335,-0.848665,1.0,0.014837,-0.231454,-0.255193,fXyl0d3q
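
Beyond looking at the first rows, the attachment can be checked programmatically. The sketch below uses a small toy dataframe and made-up IDs (both are illustrative stand-ins, not the real data) to show two checks: the ID list must have one entry per row, and the resulting column should contain no duplicates.

```python
import pandas as pd

# Toy stand-in for the real dataframe; values are illustrative only
toy = pd.DataFrame({"mag": [7.25, 5.65, 6.13]})
toy_ids = ["aaaaaaa1", "aaaaaaa2", "aaaaaaa3"]

# Assigning a list whose length differs from the number of rows raises an error,
# so comparing the lengths first gives a clearer message
lengths_match = len(toy_ids) == len(toy)

toy["event_id"] = toy_ids

# Series.is_unique is True when no value occurs more than once
ids_unique = toy["event_id"].is_unique
print(lengths_match, ids_unique)
```

On the real dataframe the same two checks would be `len(ID_list) == len(df)` and `df["event_id"].is_unique`.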


In [8]:
base_file = "base_database_frame"
df.to_pickle(base_file)
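
A frame saved with `to_pickle` can be loaded again with `pandas.read_pickle`. The following round trip is a minimal sketch on a toy dataframe written to a temporary directory; the file name `toy_frame` and the values are assumptions for illustration only.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for the base database frame; values are illustrative only
toy = pd.DataFrame({"event_id": ["aaaaaaa1", "aaaaaaa2"], "mag": [7.25, 5.65]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "toy_frame")
    toy.to_pickle(path)               # same call as used for the base database
    restored = pd.read_pickle(path)   # load the frame back from disk

# DataFrame.equals compares shape, dtypes and values
round_trip_ok = restored.equals(toy)
print(round_trip_ok)
```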

# Creating the Chile subset
In previous analyses I created a subset of the dataset containing only events along the coast of Chile.

The following latitude and longitude bounds define the subset.

In [9]:
minlatitude = -28.27
maxlatitude = -18.61
minlongitude = -73.5
maxlongitude = -68.5

In [10]:
# Select events inside the bounding box (bounds are inclusive)
cat_chile_coast = df.query(
    f"longitude >= {minlongitude} and longitude <= {maxlongitude} "
    f"and latitude >= {minlatitude} and latitude <= {maxlatitude}"
)
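
The bounding-box selection can also be written with boolean masks instead of a query string. The sketch below applies the same bounds to a toy dataframe (the coordinates are made up for illustration) using `Series.between`, which includes both endpoints by default and so matches the `>=` / `<=` comparisons.

```python
import pandas as pd

# Toy coordinates; only rows 0 and 2 fall inside the box
toy = pd.DataFrame({
    "longitude": [-70.0, -75.14, -68.5, 159.5],
    "latitude": [-20.0, -13.42, -28.27, 51.45],
})

min_lat, max_lat = -28.27, -18.61
min_lon, max_lon = -73.5, -68.5

# between() is inclusive on both ends by default
in_box = (
    toy["longitude"].between(min_lon, max_lon)
    & toy["latitude"].between(min_lat, max_lat)
)
subset = toy[in_box]
print(subset.index.tolist())
```

Row 2 sits exactly on the eastern and southern edges of the box and is still selected, which shows that the bounds are inclusive.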

In [11]:
# Pull out a single time value to inspect it before filtering by year
test_t = cat_chile_coast.iloc[0]["time"]

In [12]:
# Keep only events from the year 2000 onwards; each "time" value exposes a .year attribute
new_cat = cat_chile_coast[cat_chile_coast["time"].apply(lambda x: x.year) >= 2000]

In [13]:
frame_output_file = "chile_events_frame"
new_cat.to_pickle(frame_output_file)