# Module 6 - Final Module

Berkeley Region Probabilistic Seismic Hazard Analysis

**Before class reading:**
- Earthquake Outlook for the San Francisco Bay Region 2014–2043, USGS report
- Earthquake Hazards 101 – the Basics, Q&A from USGS https://earthquake.usgs.gov/hazards/learn/basics.php
- Bakun et al. (2005). Implications for prediction and hazard assessment from the 2004 Parkfield earthquake, Nature, 437, 969 – 974.

**Our goals for in-class:**
- Load a Bay Area seismic catalog
- Compute the distance and time interval between earthquakes, and use these to identify aftershocks
- Remove the aftershocks from the catalog (decluster)
- Fit the Gutenberg-Richter law to the Bay Area catalog by least squares
- Load peak ground acceleration observations from two notable M6 quakes in California
- Fit a GMPE
- Make a probabilistic seismic hazard analysis for a single M6.5 event
- Make a probabilistic seismic hazard analysis for 4 large earthquakes on major Bay Area faults

**Our goals for take-home:**
- Make a probabilistic seismic hazard analysis for 1000 uniformly distributed random earthquakes 
- Make a probabilistic seismic hazard analysis for 1000 random earthquakes with occurrences obeying the Gutenberg-Richter law

# Probabilistic Seismic Hazard Analysis Module

In this module we will combine the results from the previous two modules, on the Gutenberg-Richter statistics of earthquakes and on ground motion prediction equations (GMPE), to develop what is referred to as a probabilistic seismic hazard analysis (PSHA). PSHA is essentially cast as a statement of the probability that a certain ground motion level will be exceeded in a given time frame. Typically 50 years is assumed (sometimes 30 years is used). For example, the map below shows the peak acceleration level that has a 10% probability of being exceeded in 50 years. Equivalently, there is a 90% chance that this level will not be exceeded. Thus if you design to these levels then you are designing for ground motions that are likely to be larger than what the structure will experience in a 50 year lifetime. For more critical structures the design can be for 5% or 2% in 50 years, which correspond to larger acceleration levels. Note that it is not a 100% certainty. It is possible, though based on the statistics unlikely, that larger ground motions will be experienced in the targeted time frame.

<img src="./Figures/psha_map_california2.png">
USGS PSHA map showing acceleration levels for 10% probability of exceedence in 50 years. Units are ground acceleration in g.
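
The link between a "probability of exceedence in a given number of years" and an annual rate can be made concrete with a short calculation. The sketch below assumes that exceedences follow a Poisson process, which is the assumption behind the `-log(1-PP)/intervalTime` expression used later in this module. The function name `annual_rate_from_probability` is introduced here for illustration only.

```python
import math

def annual_rate_from_probability(prob, years):
    """Annual rate that gives probability `prob` of at least one
    exceedence in `years` years (Poisson assumption)."""
    return -math.log(1.0 - prob) / years

for prob in (0.10, 0.05, 0.02):
    rate = annual_rate_from_probability(prob, 50.0)
    print(f'{prob*100:.0f}% in 50 years: annual rate = {rate:.5f}, '
          f'return period = {1.0/rate:.0f} years')
```

Under this assumption the 10%, 5%, and 2% in 50 year levels correspond to return periods of roughly 475, 975, and 2475 years.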

The purpose of the module is to show how the Gutenberg-Richter and ground motion models and their uncertainties are used to characterize ground motion hazard in a probabilistic sense. You will put these ideas together to estimate the ground motion hazard for the Berkeley Campus.

## Setup

Run this cell as it is to set up your environment.

In [None]:
import math
import numpy as np
import pandas as pd
from scipy import stats
import datetime
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature


## Gutenberg Richter Earthquake Occurrence Statistics


The frequency of earthquake recurrence as a function of magnitude has been a focus of seismological research since Gutenberg and Richter's pioneering work (Gutenberg and Richter, 1949). The evidence shows that the logarithm of the number of earthquakes in a given time period decreases linearly with magnitude. To first order there are 10 times more magnitude 5 earthquakes than magnitude 6 events, and 10 times more magnitude 4 earthquakes than magnitude 5s.

Gutenberg and Richter found that when the logarithm of the number of earthquakes is plotted vs. magnitude, the distribution may be described by the line log(N)=A+Bm, where N is the number of earthquakes, m is the magnitude, and A and B are the intercept and slope of the line. For the example described above the B-value is equal to -1 (there are 10 times fewer earthquakes for an increase of one magnitude unit). An important point to keep in mind is that these parameters are based on a primary earthquake catalog from which aftershocks have been removed. The process of aftershock removal is called declustering.
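
As a small illustration of the relation log(N)=A+Bm, the sketch below evaluates it for a few magnitudes. It takes N as the annual number of events at or above magnitude m, as in the fitting cell later in this module. The function name `gr_annual_rate` and the coefficients `A_demo` and `B_demo` are hypothetical and chosen only to show the factor of 10 per magnitude unit when B=-1.

```python
import numpy as np

def gr_annual_rate(A, B, m):
    """Events per year with magnitude >= m for log10(N) = A + B*m."""
    return 10.0 ** (A + B * np.asarray(m, dtype=float))

# Hypothetical coefficients, not fitted to any catalog
A_demo, B_demo = 4.0, -1.0
rates = gr_annual_rate(A_demo, B_demo, [4.0, 5.0, 6.0])
ratios = rates[:-1] / rates[1:]   # ratio between successive magnitude units
print(rates)
print(ratios)
```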

Why is this important? The A- and B-values are often used to characterize the rates of earthquakes to identify regional variability. The B-value (slope parameter) is often used to distinguish between 'normal' and 'swarm-like' earthquake behavior. In geothermal areas it has been observed that the earthquake distribution is richer in small earthquakes, indicating a B-value significantly less than -1 (a steeper slope).

Gutenberg Richter is also used to characterize seismic hazard in a region by defining the annual rate of earthquake occurrence. In this module you will analyze an earthquake catalog downloaded from the Northern California Earthquake Data Center for a 100 km radius around the Berkeley Campus. You will learn how to decluster the seismicity catalog, estimate the Gutenberg Richter A- and B-values, and estimate the annual recurrence rates of large earthquakes in the region. Later in this module you will use the Gutenberg Richter coefficients and their uncertainty to estimate the strong ground shaking hazard for campus.

### Load  and Plot the Earthquake Catalog

Load the .csv data file of all the earthquakes 1900 - 2018 in the ANSS (Advanced National Seismic System) catalog from 100 km around Berkeley.

In [None]:
# read data
# This catalog is a M0+ search centered at Berkeley radius=100km. 
# A big enough radius to include Loma Prieta but exclude Geysers.
data=pd.read_csv('anss_catalog_1900to2018all.txt', sep=' ', delimiter=None, header=None,
                 names = ['Year','Month','Day','Hour','Min','Sec','Lat','Lon','Mag'])

#  create data arrays
year=data.Year.values
month=data.Month.values
day=data.Day.values
hour=data.Hour.values
mn=data.Min.values
sec=data.Sec.values
lat=data.Lat.values
lon=data.Lon.values
mag=data.Mag.values
nevt=len(year)        #number of events 

### Decluster

For each earthquake in the catalog with magnitude M, the subsequent earthquakes are determined to be aftershocks if they occur within a distance L(M) and time interval T(M). An example of aftershock windows from Gardner and Knopoff (1974) is shown below.

<img src="Figures/aftershock_windows.png" width=600>
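
The windows can also be written as simple functions of magnitude. The sketch below uses the same log-linear coefficients that appear in the declustering cell later in this section; the function name `aftershock_windows` is hypothetical.

```python
def aftershock_windows(M):
    """Return (L, T): distance window in km and time window in days
    for a mainshock of magnitude M."""
    L = 10.0 ** (0.1238 * M + 0.983)
    if M >= 6.5:
        T = 10.0 ** (0.032 * M + 2.7389)
    else:
        T = 10.0 ** (0.5409 * M - 0.547)
    return L, T

for M in (3.0, 5.0, 7.0):
    L, T = aftershock_windows(M)
    print(f'M{M:.1f}: L = {L:.1f} km, T = {T:.1f} days')
```

Both windows grow with magnitude, so a larger mainshock claims aftershocks over a wider area and for a longer time.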

To build our algorithm to identify aftershocks using these windows we need to convert the year-month-day format of the dates to a timeline in number of days. We'll do this using `datetime.date`, which for a given year, month, and day returns a `date` object; the difference between two such objects gives the time interval in days.

In [None]:
#Determine the number of days from the first event
days=np.zeros(nevt) # initialize the size of the array days
d0 = datetime.date(year[0], month[0], day[0]) # date of the first event, computed once outside the loop

for i in range(0,nevt,1):
    d1 = datetime.date(year[i], month[i], day[i])
    delta = d1 - d0
    days[i]=delta.days # fill days in with the number of days since the first event (7/1/1911)
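
For large catalogs the same conversion can be done without an explicit loop. The following is a minimal sketch, assuming pandas is available as in the setup cell; the small `demo` table is made up for illustration and is not taken from the ANSS file.

```python
import pandas as pd

# Small hypothetical set of event dates
demo = pd.DataFrame({'year': [1911, 1911, 1912],
                     'month': [7, 7, 1],
                     'day': [1, 4, 1]})

# pd.to_datetime assembles one timestamp per row from year/month/day columns
dates = pd.to_datetime(demo[['year', 'month', 'day']])
demo_days = (dates - dates.iloc[0]).dt.days.to_numpy().astype(float)
print(demo_days)
```

Applied to the catalog, the `Year`, `Month`, and `Day` columns would first need to be renamed to lowercase `year`, `month`, and `day`.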

In [None]:
# plot magnitude vs. time
fig, ax = plt.subplots(figsize=(10,10))
ax.plot(days, mag,'o',alpha=0.2,markersize=5)
ax.set(xlabel='Days', ylabel='Magnitude',
       title='Raw Event Catalog')
ax.grid()

fig.savefig("figure1.png")
plt.show()

print(f'Number={nevt:d} MinMag={min(mag):.2f} MaxMag={max(mag):.2f}')

We also need a function to compute the great circle distance in km between earthquakes. We'll use the haversine formula for the great circle distance, which is well conditioned for small distances.

<img src="Figures/great_circle_eqn.png" width=800 >


<img src="Figures/Illustration_of_great-circle_distance.svg" width=300 >
Great-circle distance shown in red between two points on a sphere, P and Q. 
Source: https://en.wikipedia.org/wiki/Great-circle_distance

In [None]:
#This function computes the spherical earth distance between two geographic points and is used in the
#declustering algorithm below
def haversine_np(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance in km between two points
    on the earth (specified in decimal degrees)

    Inputs may be scalars or arrays that broadcast together.
    
    The first pair can be a single point and the second pair arrays of points

    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2]) # convert degrees lat, lon to radians

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2  # great circle inside sqrt

    c = 2 * np.arcsin(np.sqrt(a))   # great circle angular separation
    km = 6371.0 * c   # great circle distance in km, earth radius = 6371.0 km
    return km
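
A quick sanity check of the haversine formula is that one degree of latitude should be about 111 km. The sketch below restates the same formula under the hypothetical name `great_circle_km` so that it runs by itself, and compares one reference point (the campus coordinates used in the map cell) against an array of points.

```python
import numpy as np

def great_circle_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Haversine distance in km; inputs in decimal degrees, NumPy broadcasting applies."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * radius_km * np.arcsin(np.sqrt(a))

# Same point, one degree north, one degree east
lons = np.array([-122.2727, -122.2727, -121.2727])
lats = np.array([37.8716, 38.8716, 37.8716])
d = great_circle_km(-122.2727, 37.8716, lons, lats)
print(d)
```

The eastward distance is shorter than the northward one because meridians converge away from the equator.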

__Declustering Algorithm__

We'll build our `for` loop for identifying aftershocks in the seismic catalog.

In [None]:
#Decluster the Catalog  Note: This cell may take a few minutes to complete
cnt=0 # initialize a counting variable
save=np.zeros((1,10000000),dtype=int) # preallocate storage for the indices of events flagged as aftershocks
for i in range(0,nevt,1):   # step through EQ catalog
    # logical if statements to incorporate definitions of Dtest and Ttest aftershock window bounds
    Dtest=np.power(10,0.1238*mag[i]+0.983)   # distance bounds
    if mag[i] >= 6.5:
        Ttest=np.power(10,0.032*mag[i]+2.7389)  # aftershock time bounds for M >= 6.5
    else:
        Ttest=np.power(10,0.5409*mag[i]-0.547)  # aftershock time bounds for M < 6.5
    
    a=days[i+1:nevt]-days[i]    # time interval in days to subsequent earthquakes in catalog
    m=mag[i+1:nevt]   # magnitudes of subsequent earthquakes in catalog
    b=haversine_np(lon[i],lat[i],lon[i+1:nevt],lat[i+1:nevt]) # distance in km to subsequent EQs in catalog
    
    icnt=np.count_nonzero(a <= Ttest)   # counts the number of potential aftershocks, 
                                        # the number of intervals <= Ttest bound
    if icnt > 0:  # if there are potential aftershocks
        itime=np.array(np.nonzero(a <= Ttest)) + (i+1) # indices of potential aftershocks <= Ttest bound
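        # the catalog is assumed to be sorted in time, so the potential aftershocks are the first icnt later events
        for j in range(0,icnt,1):   # loops over the potential aftershocks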
            if b[j] <= Dtest and m[j] < mag[i]: # test if the event is inside the distance window 
                                                # and that the event is smaller than the current main EQ
                save[0][cnt]=itime[0][j]  # index value of the aftershock
                cnt += 1 # increment the counting variable

                
af_ind=np.delete(np.unique(save),0)   # This is an array of indexes that will be used to delete events flagged 
                                      # as aftershocks    
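
The same window test can also be expressed with boolean masks instead of an inner loop and a preallocated index buffer. The following is a sketch of that alternative, not the notebook's method: the function names `haversine_km` and `flag_aftershocks` and the four-event catalog are hypothetical, and the catalog is assumed to be sorted in time.

```python
import numpy as np

def haversine_km(lon1, lat1, lon2, lat2):
    """Great circle distance in km for inputs in decimal degrees."""
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * 6371.0 * np.arcsin(np.sqrt(a))

def flag_aftershocks(days, mag, lon, lat):
    """Boolean array, True where an event lies inside the distance and time
    windows of an earlier, larger event."""
    n = len(days)
    is_aftershock = np.zeros(n, dtype=bool)
    for i in range(n - 1):
        d_win = 10.0 ** (0.1238 * mag[i] + 0.983)        # distance window, km
        if mag[i] >= 6.5:
            t_win = 10.0 ** (0.032 * mag[i] + 2.7389)    # time window, days
        else:
            t_win = 10.0 ** (0.5409 * mag[i] - 0.547)
        later = slice(i + 1, n)                          # all subsequent events
        dt = days[later] - days[i]
        dist = haversine_km(lon[i], lat[i], lon[later], lat[later])
        hit = (dt <= t_win) & (dist <= d_win) & (mag[later] < mag[i])
        is_aftershock[later] |= hit
    return is_aftershock

# Synthetic catalog: M6 mainshock, nearby M3 two days later,
# distant M3 three days later, nearby M3 ten years later
days_demo = np.array([0.0, 2.0, 3.0, 3650.0])
mag_demo = np.array([6.0, 3.0, 3.0, 3.0])
lon_demo = np.array([-122.0, -122.05, -119.0, -122.0])
lat_demo = np.array([37.5, 37.5, 37.5, 37.5])
flags = flag_aftershocks(days_demo, mag_demo, lon_demo, lat_demo)
print(flags)
```

Only the second event is flagged: the third is outside the distance window and the fourth is outside the time window. `np.nonzero(flags)[0]` gives an index array that can be passed to `np.delete`.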


Use `np.delete(array,indices_to_delete)` to delete the aftershock events.

In [None]:
# delete the aftershock events
declustered_days=np.delete(days,af_ind)  #The aftershocks are deleted from the days array 
declustered_mag=np.delete(mag,af_ind)    #The aftershocks are deleted from the mag array 
declustered_lon=np.delete(lon,af_ind)    #The aftershocks are deleted from the lon array 
declustered_lat=np.delete(lat,af_ind)    #The aftershocks are deleted from the lat array 
n=len(declustered_days)

In [None]:
#Plot DeClustered Catalog
fig, ax = plt.subplots(figsize=(10,10))
ax.plot(declustered_days, declustered_mag,'o',alpha=0.2,markersize=5)
ax.set(xlabel='days', ylabel='magnitude',
       title='Declustered Event Catalog')
ax.grid()

plt.show()

print(f'Number={n:d} MinMag={min(declustered_mag):.2f} MaxMag={max(declustered_mag):.2f}')

In [None]:
#Make a Map of Main shock events

#Set Corners of Map
lat0=36.75
lat1=39.0
lon0=-123.75
lon1=-121.0
tickstep=0.5 #for axes
latticks=np.arange(lat0,lat1+tickstep,tickstep)
lonticks=np.arange(lon0,lon1+tickstep,tickstep)

plt.figure(1,(10,10))
ax = plt.axes(projection=ccrs.PlateCarree())
ax.set_extent([lon0, lon1, lat0, lat1], crs=ccrs.PlateCarree())
ax.set_aspect('auto')
ax.coastlines(resolution='10m',linewidth=1) #downloaded 10m, 50m
ax.set_xticks(lonticks)
ax.set_yticks(latticks, crs=ccrs.PlateCarree())
ax.set(xlabel='Longitude', ylabel='Latitude',
       title='Declustered Catalog')


x=declustered_lon
y=declustered_lat
z=declustered_mag

#Sort ascending so the largest events are plotted last (on top)
indx=np.argsort(z)   #determine sort index
x=x[indx]            #apply sort index
y=y[indx]
z=np.exp(z[indx])    #exponent to scale size

c = plt.cm.plasma(z/max(z))
plt.scatter(x, y, s=(z/2), facecolors=c, alpha=0.4, edgecolors=c, marker='o', linewidth=2)
plt.plot(-122.2727,37.8716,'rs',markersize=8)


plt.show()

### Gutenberg Richter Model Fitting

In [None]:
# the observed log10 number of events per year as function of magnitude (data)
#Only magnitudes at or above min_mag (taken here as the magnitude of completeness) are considered
min_mag=1.5
max_mag=np.max(declustered_mag)
m_declustered=np.arange(min_mag,max_mag,0.1)
N_declustered=np.zeros(len(m_declustered))
numyr=(max(declustered_days)-min(declustered_days))/365
for i in range(0,len(m_declustered),1):
    N_declustered[i]=np.log10(np.count_nonzero(declustered_mag >= m_declustered[i])/numyr)    


In [None]:
# Solve for Model Parameters
# Declustered events
soln_declustered =np.polyfit(m_declustered,N_declustered,1)
GR_a_value = soln_declustered[1]
GR_b_value = soln_declustered[0]
x_declustered = m_declustered
y_declustered = np.polyval(soln_declustered,m_declustered)

print(GR_a_value ,GR_b_value)
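
`np.polyfit` can also return the covariance matrix of the fitted coefficients, which gives a rough estimate of the uncertainty in the A- and B-values. The sketch below demonstrates this on synthetic data; `A_true`, `B_true`, the noise level, and the variable names are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed so the sketch is repeatable
A_true, B_true = 3.3, -0.8                # hypothetical values used to build the data
m_demo = np.arange(1.5, 6.0, 0.1)
logN_demo = A_true + B_true * m_demo + rng.normal(0.0, 0.05, m_demo.size)

# coef = [slope, intercept]; cov is the 2x2 covariance matrix of coef
coef, cov = np.polyfit(m_demo, logN_demo, 1, cov=True)
B_fit, A_fit = coef
B_err, A_err = np.sqrt(np.diag(cov))      # one standard deviation
print(f'A = {A_fit:.3f} +/- {A_err:.3f}, B = {B_fit:.3f} +/- {B_err:.3f}')
```

Because cumulative counts at neighboring magnitudes are not independent, uncertainties obtained this way from a real catalog should be treated as indicative only.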

In [None]:
# plot the observed relationship between M and log10N
plt.figure(1,(10,10))
plt.plot(m_declustered,N_declustered,'o',color='green',label='Declustered Catalog');
plt.plot(x_declustered,y_declustered,'k-',label='Gutenberg Richter Model ');
plt.xlim(0, 7);
plt.ylim(-2.1, 3);
plt.xlabel('Magnitude', fontsize=16);
plt.ylabel('Number of earthquakes, $log_{10}$ N', fontsize=16);
plt.legend(fontsize=16)
plt.grid()

In [None]:
# compare the observed and predicted annual rates at the largest magnitude bin below 6.1
iref=np.max(np.nonzero(m_declustered < 6.1))
actual_rate=10**(N_declustered[iref])
mag_ref=m_declustered[iref]   # named mag_ref so the catalog array mag is not overwritten
predicted_rate=10**(GR_a_value + GR_b_value * mag_ref)

print(f'Predicted from Gutenberg Richter Model M={mag_ref:3.1f} events per year: {predicted_rate:4.3f}; Years per event: {1/predicted_rate:4.1f}')
print(f'Observed M={mag_ref:3.1f} events per year: {actual_rate:4.3f}; Years per event: {1/actual_rate:4.1f}')

# Analysis of Strong Ground Motion Data

Earthquakes are the sudden dislocation of rock on opposite sides of a fault due to applied stress. Seismic waves are generated by this process and propagate away from the fault affecting nearby communities. It is the strong shaking from earthquakes that we recognize as the earthquake. These motions can lead to landslides, liquefaction of the ground, and of course impact anything built within or on the ground. The motions generated by fault dislocation affect many aspects of modern society. Earthquake Engineering is a field that studies the ground motions generated by earthquakes and how they affect the built environment. To utilize ground motions for engineering applications requires studying the physics of seismic wave propagation, and the development of models that effectively describe it. Of particular importance is the need to accurately model and predict seismic wave amplitudes. Such studies generally focus on examining the peak acceleration and velocity as a function of distance from the source. The physics indicates that the ground motions generally decrease in amplitude with increasing distance.

In this module we will investigate peak ground acceleration observations from two notable M6 quakes in California, the 2004 Parkfield and the 2014 West Napa earthquakes. You will analyze the data by fitting a ground motion prediction equation (GMPE), i.e. an attenuation relationship that describes the rate at which ground motions decrease with increasing distance from the source. Such GMPE relationships are of primary importance in forecasting the effects of future earthquakes while taking into account the uncertainty, which is manifest as the observed variance in motions with respect to the median levels over many events. This information, coupled with the statistics of earthquake occurrence rates, notably Gutenberg-Richter statistics, provides the framework for characterizing future ground motion hazard.

### Load and Plot Peak Ground Acceleration Data

Make a plot showing the data in a loglog projection. Note the fact that in a single earthquake there can be significant gaps in coverage, but when considered together a more complete representation of strong ground motion attenuation may be obtained. Note that the data is also non-linear even in the log-log projection. The following is an example for the 2004 Parkfield earthquake.

<img src="./Figures/parkonly.png">

One 'g' is the gravitational acceleration at the surface of the Earth and has a value of 981 cm/$s^2$. Earthquake engineers commonly use peak ground acceleration in these units in their geotechnical and structural engineering analyses. At 0.1%g people can generally perceive shaking, at 2%g some people may be disoriented, and at 50%g the shaking is very violent: unengineered structures can suffer damage and collapse, while well engineered buildings can survive if the duration is short.

In [None]:
#Read Napa and Parkfield Earthquake Peak Ground Acceleration Data
ndist, npga=np.array(pd.read_table('napa_pga.txt')).transpose()
pdist, ppga=np.array(pd.read_table('park_pga.txt')).transpose()
dist=np.hstack((ndist,pdist))
pga =np.hstack((npga,ppga))

In [None]:
#Plot the two data sets
fig, ax = plt.subplots()
plt.loglog(ndist, npga,'r.',pdist,ppga,'b.')
ax.set(xlabel='Distance (km)', ylabel='Peak ground acceleration (g)',
       title='Peak Acceleration Data')
plt.legend(['Napa','Parkfield'],fontsize=12,loc=3)
plt.show()

### Fitting Strong Motion Data

In order to use the observations of peak ground acceleration (and other parameters like peak velocity, or spectral acceleration quantities) it is necessary to develop a model that accurately describes the behavior. From physics it is understood that in the far-field (large distance compared to the source dimension) ground motions decay as a power law with distance due to the spreading of wave energy in three dimensions as the wavefield travels outward from the earthquake source. This is called geometrical spreading. In addition, there is an inelastic attenuation term that accounts for dissipative energy loss due to material imperfections. Based on theory the following is a simple relationship that describes this behavior.

$pga=a*{\frac{1}{r^b}}*e^{cr}$

where $r=\sqrt{(dist^2 + h^2)}$ is the total distance from the source taking into account an average depth $h$, $a$ is a coefficient that depends on magnitude and scales the overall motions, $b$ is the exponent for the power-law geometrical spreading term, and $c$ is the coefficient for the inelastic term (important only at large distances). Taking the natural logarithm of this equation yields a relationship that is linear in the model coefficients $\mathrm{ln}(a)$, $b$, and $c$.

$\mathrm{ln}(pga)=\mathrm{ln}(a) - b*\mathrm{ln}(r) + c*r$

For this exercise we will fit the above equation to the data assuming that $c=0$. 

__Compute the ground motion prediction equation (GMPE) for the combined Parkfield and Napa earthquake data sets, plot the results, and print the best fitting solution parameters.__


The following is an example of an unweighted least squares inversion and the 95% confidence intervals for the combined data set.

<img src="./Figures/unweightedfit1.png">
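
One way to set up such a fit with $c=0$ is an ordinary least-squares line through ln(pga) versus ln(r). The sketch below works on synthetic data only: the depth `h`, the coefficients used to generate the data, and the scatter are all made-up values, and the real exercise should use the `dist` and `pga` arrays loaded above. With the power-law form given earlier, the slope of the fitted line is the negative of the spreading exponent and the intercept is the logarithm of the scale coefficient.

```python
import numpy as np

rng = np.random.default_rng(1)                 # fixed seed so the sketch is repeatable
h = 4.0                                        # assumed average depth in km (hypothetical)
dist_demo = np.logspace(0, 2, 60)              # 1 to 100 km
r_demo = np.sqrt(dist_demo**2 + h**2)
# Synthetic observations: power-law decay with exponent 1.3 plus lognormal scatter
ln_pga_demo = np.log(2.0) - 1.3 * np.log(r_demo) + rng.normal(0.0, 0.3, r_demo.size)

slope, intercept = np.polyfit(np.log(r_demo), ln_pga_demo, 1)
b_fit = -slope                                 # geometrical spreading exponent
a_fit = np.exp(intercept)                      # scale coefficient
resid = ln_pga_demo - (intercept + slope * np.log(r_demo))
sigma_fit = np.std(resid, ddof=2)              # scatter about the fitted line, ln units
print(f'a = {a_fit:.2f}, b = {b_fit:.2f}, sigma = {sigma_fit:.2f}')
```

The residual scatter `sigma_fit` plays the same role as the standard deviation reported by a published GMPE.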

### Abrahamson and Silva (2008) GMPE

The GMPE that you developed for the combined Parkfield and Napa data sets is actually quite good, but it is limited to only two M6 earthquakes. Abrahamson and Silva (2008, AS2008) developed a GMPE considering 2750 recordings from 140 earthquakes ranging in magnitude from 4.27 to 7.62. They report that the derived GMPE is applicable to M 5.0 to 8.5 earthquakes. The following shows the AS2008 relationship for a M7.5 earthquake.

<img src="./Figures/as2008.png">

In the following cell the AS2008 GMPE for hard rock is defined, considering distance, magnitude, and the depth to the top of the fault. __The function takes three input arguments: an array of distances, a magnitude, and the depth to the top of the fault. The output is the natural logarithm of the median peak ground acceleration from the AS2008 GMPE (lnpga), and the standard deviation (sigma).__ For M6.5+ events in California we can consider the top of the fault to be at zero depth. A M5 may be at 8 km in comparison.


In [None]:
def as2008(dist,M,Ztor):
    """
    This function takes an array of distances (only horizontal distance, h=0), a magnitude and the depth
    and returns the natural logarithm of median peak ground acceleration from the AS2008 GMPE, and the
    standard deviation (sigma).
    
    The function is not the complete AS2008 formulation. It is limited to the hard rock case Vs30=865, and only
    computes pga.
    
    """
    #Defined by A&S2008 DO NOT CHANGE
    c1=6.75;
    c4=4.5;
    a1=0.804;   #for PGA only this parameter is period dependent
    a2=-0.9679; #for PGA only this parameter is period dependent
    a3=0.265;
    a4=-0.231;
    a5=-0.398;
    a8=-0.0372; #for PGA only this parameter is period dependent
    a16=0.9000; #for PGA only
    VLIN=865.1; #for PGA only note for vs30=vlin f5==0
    #Defined by A&S2008 DO NOT CHANGE
    
    R=np.sqrt(dist*dist + c4*c4)      #compute total distance

    #Standard Deviation varies from 0.8 for M5 to 0.6 for M7 assume linear in ln
    if M < 5:
        raise ValueError('sigma is only defined in this function for M >= 5')
    elif M <= 7:
        sigma=0.8+(0.8-0.6)/(5-7)*(M-5)
    else:
        sigma=0.6

    #Base model
    if M <= c1:
        f1=a1+a4*(M-c1)+a8*(8.5-M)**2+(a2+a3*(M-c1))*np.log(R)

    if M > c1:
        f1=a1+a5*(M-c1)+a8*(8.5-M)**2+(a2+a3*(M-c1))*np.log(R)

    #Depth of fault
    if Ztor <= 10:
        f6=Ztor/10*a16

    if Ztor > 10:
        f6=a16


    lnpga=f1 + f6
    
    return lnpga, sigma

Use the `as2008` function defined above to compute the GMPE for a M6 earthquake with a top fault depth of 1.0km.

In [None]:
X=np.arange(0.1,100,0.1)  # distances
M=6.0 # Magnitude                
Top=1.0 # depth to the top of the fault
[y, z]=as2008(X,M,Top)


Plot this AS2008 model for a M6 earthquake at 1km depth along with the data.

In [None]:
#Plot Results
fig, ax = plt.subplots()
ax.loglog(dist, pga,'.',X, np.exp(y),'k-',
          X,np.exp(y+z),'r-',X,np.exp(y-z),'r-')
ax.set(xlabel='Distance (km)', ylabel='Peak ground acceleration (g)',
       title='AS2008 Peak Acceleration GMPE')
plt.legend(['Observations','AS2008',r'AS2008 1$\sigma$'],fontsize=12)
plt.show()

# Estimation of Ground Motion Hazard

First, we found that by compiling a declustered earthquake catalog it was possible to estimate the Gutenberg-Richter statistics for earthquake occurrence. Gutenberg-Richter can then be used to assign the annual rate of occurrence of earthquakes in the broad study region (Greater San Francisco Bay Area), and this annual rate can then be used to estimate the probability of earthquake occurrence for a specific interval of time (commonly 30 years). 

In the previous section we found that the strong ground motions from earthquakes generally decrease with distance and we devised an approach to model these ground motions to develop a predictive relationship. For the two earthquakes studied it was found that differences in the distance-distribution of observations do have an effect on the estimated model parameters, and that combining the two data sets provides better overall constraint of the relationship. The dispersion of the data indicates that there is substantial uncertainty in ground motions at a given distance from an earthquake, which can be due to differences in the source processes of earthquakes, lateral heterogeneity in Earth structure affecting propagation, and site effects (e.g. amplification at soft deep sediment sites compared to rock sites).

To communicate ground motion hazard it is necessary to combine the probability of occurrence of earthquakes of a given magnitude in a prescribed time frame (e.g. 30 years) with the probability that, when an earthquake of that magnitude occurs, the ground motion will exceed some value. Thus the fitted relationships and their respective uncertainties need to be combined. The probability of exceeding a ground motion level (say 10%g) is the product of the probability of occurrence of an earthquake with a given magnitude and the probability that the ground motion from an event of that magnitude, at the distance of the event from the location of interest, will exceed that level. For this exercise we will compute this hazard for the UC Berkeley campus using the Gutenberg-Richter statistics for the San Francisco Bay Area.

#### Hazard due to a single event

Write code to estimate the hazard of ground motion of levels from 0.01 to 3.0 g due to an event with magnitude 6.5 at a distance of 10 km and depth to fault top of 1.0 km. Use an interval of 0.01 g. The ground motion hazard curve for an individual magnitude is the product of the event occurrence rate (from Gutenberg-Richter) and the probability of exceeding a ground motion level (using the AS2008 GMPE). For the ground motion exceedence probability you can evaluate the cumulative normal distribution of the log acceleration, using the log-median and standard deviation returned by `as2008` for each magnitude.

In [None]:
A=3.322              #Gutenberg Richter A value, A95%=0.082 
B=-0.797             #Gutenberg Richter B value, B95%=0.018
lam=10**(A+B*6.5)    # annual Gutenberg-Richter recurrence rate for a M6.5 earthquake in the bay area

acclev=np.arange(0.01,3.0,0.01) # ground motion levels
# call as2008 to compute the ln peak ground acceleration for a given distance, magnitude, and fault depth
[mu, sigma]=as2008(10,6.5,1.0)
# Probability of not exceeding each log acceleration value, from the normal cumulative distribution with mean and std from as2008 
prob_lev = stats.norm.cdf(np.log(acclev),mu,sigma) 
# Take the complement to find prob of a greater value
prob_acc = 1 - prob_lev
    
hazard=prob_acc*lam;  #annual rate of exceedence
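
The complement of the cumulative distribution is also available directly in SciPy as the survival function `stats.norm.sf`, which avoids loss of precision in `1 - cdf` when the exceedence probability is very small. The sketch below uses a hypothetical median of 0.2 g and a hypothetical ln-standard deviation of 0.65 rather than values from `as2008`.

```python
import numpy as np
from scipy import stats

mu_demo, sigma_demo = np.log(0.2), 0.65      # hypothetical log-median and ln-standard deviation
levels = np.array([0.05, 0.2, 0.8])          # ground motion levels in g

# Survival function: probability that ln(pga) exceeds ln(level)
p_exceed = stats.norm.sf(np.log(levels), mu_demo, sigma_demo)
print(p_exceed)
```

At the median level the exceedence probability is 0.5 by construction, and it falls off for higher levels.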



Make a loglog plot of the hazard curve: annual rate of exceedence on the y-axis and acceleration level on the x-axis.

In [None]:
#Plot Results
fig, ax = plt.subplots()
ax.loglog(acclev,hazard,'k-')
ax.set(xlabel='Ground Acceleration, g', ylabel='Annual Rate of Exceedence',
       title='Hazard Curve, M6.5 at 10km')
plt.show()

Compute and print the acceleration levels that have `PP` = 10, 5, and 2 percent chances of being exceeded in an interval of `intervalTime` = 50 years, i.e. the first acceleration level where the annual rate of exceedence is less than or equal to -1*ln(1-`PP`)/`intervalTime` (natural logarithm, with `PP` expressed as a fraction).

In [None]:
print(f'Hazard Results for a single M6.5 event at 10km distance:')
# acceleration level that there is a 10% chance of exceeding in 50 years
intervalTime=50.0              #Years for Hazard reference level
PP=0.10                        #Hazard reference level
per10=np.nonzero(hazard <= -1.*np.log(1-PP)/intervalTime)
al10=acclev[per10[0][0]]
print(f'10 percent chance in {intervalTime:.0f} years of exceeding {al10:.2f} g')

# acceleration level that there is a 5% chance of exceeding in 50 years
per5=np.nonzero(hazard <= -1.*np.log(1-0.05)/intervalTime)
al5=acclev[per5[0][0]]
print(f' 5 percent chance in {intervalTime:.0f} years of exceeding {al5:.2f} g')

# acceleration level that there is a 2% chance of exceeding in 50 years
per2=np.nonzero(hazard <= -1.*np.log(1-0.02)/intervalTime)
al2=acclev[per2[0][0]]
print(f' 2 percent chance in {intervalTime:.0f} years of exceeding {al2:.2f} g')
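
The three lookups above repeat the same steps, so they can be wrapped in a small helper. This is a sketch only: the name `level_at_probability` and the exponential test curve are hypothetical. Unlike indexing with `[0][0]` directly, the helper returns `None` when the hazard curve never drops to the target rate within the range of levels considered.

```python
import numpy as np

def level_at_probability(acclev, hazard, prob, years):
    """First level whose annual exceedence rate is <= the target rate
    -ln(1-prob)/years, or None if the curve never reaches it."""
    target = -np.log(1.0 - prob) / years
    idx = np.nonzero(hazard <= target)[0]
    return acclev[idx[0]] if idx.size else None

# Hypothetical, smoothly decaying hazard curve used only to exercise the helper
acc_demo = np.arange(0.01, 3.0, 0.01)
haz_demo = 0.01 * np.exp(-acc_demo / 0.2)
for prob in (0.10, 0.05, 0.02):
    print(prob, level_at_probability(acc_demo, haz_demo, prob, 50.0))
```

Note that if the hazard is already below the target at the lowest level, the helper returns that lowest level, which should be read as an upper bound rather than an estimate.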

#### Hazard due to multiple events

Next estimate the hazard of ground motion of levels from 0.01 to 3.0 g due to large events on the Hayward, Rodgers Creek and San Andreas faults. Table 1 gives the magnitudes, distances and annual recurrence rates (based on paleoseismic data) to consider. The annual rate of exceeding a ground motion level for each magnitude/distance case is the product of the probability of exceeding and the rate of occurrence. The total probability is the sum of the exceedence curves for each of the events divided by the number of events considered.
<img src="./Figures/Table_1.png">

In [None]:
acclev=np.arange(0.01,3.0,0.01) #ground motion levels

#Create Magnitude , distance , and lam arrays from Table 1,  and 0.0 for Ztop
tab1_M=[8.0, 7.0, 7.0, 7.0] # magnitudes from Table 1
Ztop=np.zeros(4)     # Apply Ztop (top of fault) model 
lam=[0.005, 0.007, 0.008, 0.007] # annual recurrence rate of each event from Table 1 (paleoseismic)
tab1_R=[10.0, 1.0, 30.0, 20.0]   # distances from Table 1


In [None]:
#Compute Probabilistic Ground Motion Hazard
#Compute mean and standard deviation of ground motion for each event (magnitude, distance) using AS2008 

prob_acc=np.ones(len(acclev))
big1_hazard=np.ones((4,len(acclev)))

for j in range(0,4,1):
    # call as2008 to compute the acceleration for a given distance, magnitude, and fault depth
    [mu, sigma]=as2008(tab1_R[j],tab1_M[j],Ztop[j])

    #Create Normal Distribution on log values and find the probability of exceeding a value
    prob_lev = stats.norm.cdf(np.log(acclev),mu,sigma)
    #Take the complement to find prob of a greater value
    prob_acc=1 - prob_lev

    big1_hazard[j,:]=prob_acc*lam[j];  #annual rate of exceedence

#total probability is the sum of the exceedence curves for each of the events divided by the number of events considered
total_hazard=np.sum(big1_hazard,0)/4 #this will have same length as acclev

intervalTime=50.0              #Years for Hazard reference level
PP=0.10                        #Hazard reference level
#reference exceedence probability: PP (10) percent in intervalTime (50) years _____ g
x_ref_lev=np.ones(len(acclev))*(-1*np.log(1-PP)/intervalTime)

In [None]:
#Plot Results
fig, ax = plt.subplots()
ax.loglog(acclev,total_hazard,'k-',acclev,x_ref_lev,'r-')

ax.set(xlabel='Ground Acceleration,g', ylabel='Annual Rate of Exceedence',
       title='Hazard Curve, Large events on 4 closest faults')
ax.legend(['Annual Rate of Exceedence','10% chance of exceeding in 50 years'])
plt.show()

#print
per10=np.nonzero(total_hazard <= -1.*np.log(1-0.1)/intervalTime)
al10=acclev[per10[0][0]]
per5=np.nonzero(total_hazard <= -1.*np.log(1-0.05)/intervalTime)
al5=acclev[per5[0][0]]
per2=np.nonzero(total_hazard <= -1.*np.log(1-0.02)/intervalTime)
al2=acclev[per2[0][0]]
print(f'Specific Hazard Results:')
print(f'10 percent chance in {intervalTime:.0f} years of exceeding {al10:.2f} g')
print(f' 5 percent chance in {intervalTime:.0f} years of exceeding {al5:.2f} g')
print(f' 2 percent chance in {intervalTime:.0f} years of exceeding {al2:.2f} g')

Do the accelerations associated with the portion of the black annual rate of exceedence curve below the red reference exceedence probability (10% chance of exceeding in 50 years) have a greater or less than 10% chance of being exceeded? 

How do those acceleration levels measure on the peak ground acceleration scale? <img src="./Figures/peak_accel_scale.png"> <img src="./Figures/PGM_scale_legend.png">

# Start of Take-Home Portion

### Hazard Computations

Consider 1000 earthquakes with random magnitudes from M5 to M6.9 at random distances from 0.1 to 50 km. Use `np.random.uniform()` to generate the random magnitude and distance distributions. Estimate the hazard of ground motion of levels from 0.01 to 3.0 g due to these random events. Model the depth of the top of the fault as 5.0-3.3*(magnitude - 5.0), but do not allow negative depths: replace any with 0.0. This depth model approximates the relationship between magnitude and rupture depth, with the top depth of rupture for M5 events at 5.0 km and M6.5+ events rupturing to the surface. The annual rate of exceeding a ground motion level for each magnitude/distance case is the product of the probability of exceedence and the rate of occurrence. The total probability is the sum of the hazard curves for each of the events divided by the number of events considered.

`np.random.uniform()` has three inputs: the minimum and maximum values for the distribution, and the number of random draws.
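The sampling and depth steps described above can be sketched as follows. This is only a minimal illustration: the variable names and the fixed seed are assumptions made for the example, not part of the assignment.

```python
import numpy as np

np.random.seed(0)  # fixed seed (an assumption) so the sketch is repeatable

n_events = 1000
# np.random.uniform(low, high, size): minimum, maximum, number of draws
magnitudes = np.random.uniform(5.0, 6.9, n_events)
distances_km = np.random.uniform(0.1, 50.0, n_events)

# Depth to the top of the rupture in km; np.maximum replaces any
# negative depth with 0.0 (rupture reaches the surface)
top_depth_km = np.maximum(5.0 - 3.3 * (magnitudes - 5.0), 0.0)
```

`np.maximum` compares each depth with 0.0 element by element, so no loop or boolean mask is needed to remove the negative values.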

Plot and print your results for the acceleration levels that have 10%, 5%, and 2% chances of being exceeded in 50 years.

But we know that earthquakes are not uniformly distributed in magnitude; rather, they follow the Gutenberg-Richter law. Again consider 1000 earthquakes with random magnitudes from M5 to M6.9 at random distances from 0.1 to 50 km, but now use `np.random.choice()` to generate the random magnitudes from a Gutenberg-Richter distribution.

`np.random.choice()` takes four inputs, `numpy.random.choice(a, size=N, replace=True, p=y)`, where `a` is the array from which the random sample is drawn, `size=` sets the number of random samples, `replace=True` draws the samples with replacement, and `p=` sets the probabilities associated with each entry in `a` (https://docs.scipy.org/doc/numpy/reference/generated/numpy.random.choice.html). For our purpose you will use `np.random.choice(x, size=N, replace=True, p=(10**y/np.sum(10**y)))` where `x` is an array of magnitudes from M5.0 to M6.9, and `y` is log10(N), the log of the number of events per year at each magnitude. Use `np.polyval` to set `y` using the Gutenberg-Richter model parameters we estimated in-class.
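A minimal sketch of this sampling step is shown below. The coefficients in `gr_coeffs` are hypothetical placeholders, not the values estimated in-class; substitute your own fit. The 0.1 magnitude spacing of `x` and the fixed seed are also assumptions.

```python
import numpy as np

np.random.seed(1)  # fixed seed (an assumption) so the sketch is repeatable

# Hypothetical Gutenberg-Richter fit, log10(N) = a - b*M, stored in
# np.polyval order [slope, intercept]. Replace with the in-class estimates.
gr_coeffs = [-1.0, 4.0]

x = np.arange(50, 70) / 10.0   # magnitudes M5.0, M5.1, ..., M6.9
y = np.polyval(gr_coeffs, x)   # log10 of the number of events per year
p = 10**y / np.sum(10**y)      # normalize the rates so they sum to 1

magnitudes = np.random.choice(x, size=1000, replace=True, p=p)
```

Building `x` from integers and dividing by 10 avoids the floating-point end-point ambiguity of `np.arange(5.0, 7.0, 0.1)`.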

Use `np.random.uniform()` again to generate the distance distribution. Estimate the hazard for ground motion levels from 0.01 to 3.0 g due to these random events. Model the depth of the top of the fault as 5.0-3.3*(magnitude - 5.0), but do not allow negative depths: replace any with 0.0. This depth model approximates the relationship between magnitude and rupture depth, with the top of the rupture at 5.0 km depth for M5 events and M6.5+ events rupturing to the surface.

The annual rate of exceeding a ground motion level for each magnitude/distance case is the product of the probability of exceedence and the rate of occurrence. The total probability is the sum of the hazard curves for each of the events divided by the number of events considered.
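The summation described above might be organized as in the sketch below. Several pieces are assumptions made only so the example runs on its own: `toy_gmpe_ln_pga` is a made-up placeholder and is NOT the GMPE fit in-class, the Gutenberg-Richter coefficients used for the rate of occurrence are hypothetical, and the ground motion is assumed to be lognormally distributed about the GMPE mean. Swap in your own GMPE and rate model.

```python
import numpy as np
from math import erfc, sqrt

def toy_gmpe_ln_pga(mag, dist_km, top_depth_km):
    """Placeholder GMPE (hypothetical): mean ln(PGA in g) and its sigma."""
    r = np.sqrt(dist_km**2 + top_depth_km**2 + 6.0**2)
    return -1.6 + 0.9 * (mag - 6.0) - 1.2 * np.log(r / 10.0), 0.6

def prob_exceed(acclev, mean_ln, sigma):
    """P(PGA > acclev), assuming lognormally distributed ground motion."""
    z = (np.log(acclev) - mean_ln) / sigma
    return 0.5 * np.vectorize(erfc)(z / sqrt(2.0))

np.random.seed(2)  # fixed seed (an assumption) for repeatability
n_events = 1000
mags = np.random.uniform(5.0, 6.9, n_events)
dists = np.random.uniform(0.1, 50.0, n_events)
ztor = np.maximum(5.0 - 3.3 * (mags - 5.0), 0.0)

acclev = np.logspace(np.log10(0.01), np.log10(3.0), 50)  # levels in g
rates = 10 ** np.polyval([-1.0, 4.0], mags)  # hypothetical events per year

total_hazard = np.zeros_like(acclev)
for m, d, z, rate in zip(mags, dists, ztor, rates):
    mean_ln, sigma = toy_gmpe_ln_pga(m, d, z)
    # hazard curve for one event: probability of exceedence times rate
    total_hazard += prob_exceed(acclev, mean_ln, sigma) * rate
total_hazard /= n_events  # divide the sum by the number of events
```

Because every single-event curve decreases with acclev, the averaged `total_hazard` also decreases, which is a quick sanity check on any implementation.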

Plot a histogram of your random earthquake magnitudes, with a log y-axis, to show that the magnitudes follow the Gutenberg-Richter law.  Include appropriate labels.
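Before plotting, it can help to look at the bin counts that the histogram will display. The sketch below uses `np.histogram` and a straight-line fit to log10 of the counts as a rough check of the slope. The Gutenberg-Richter coefficients, the seed, and the 0.1 magnitude bin width are hypothetical choices for illustration.

```python
import numpy as np

np.random.seed(3)  # fixed seed (an assumption) for repeatability

x = np.arange(50, 70) / 10.0     # magnitudes M5.0 ... M6.9
y = np.polyval([-1.0, 4.0], x)   # hypothetical G-R coefficients
mags = np.random.choice(x, size=1000, replace=True, p=10**y / np.sum(10**y))

# Bin edges centered on each 0.1 magnitude step (4.95, 5.05, ..., 6.95)
edges = np.arange(49.5, 70.5, 1.0) / 10.0
counts, _ = np.histogram(mags, bins=edges)

# Fit a line to log10(counts), skipping empty bins where the log is undefined
filled = counts > 0
slope, intercept = np.polyfit(x[filled], np.log10(counts[filled]), 1)
```

On a log y-axis the counts should fall roughly along a straight line; the fitted `slope` is only approximate because the largest magnitudes have very few samples.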

Plot and print your results for the acceleration levels that have 10%, 5%, and 2% chances of being exceeded in 50 years.

The probability of exceeding a ground motion is $P = 1 - e^{-\lambda \Delta T}$, where $\lambda$ is the annual rate of exceedence (total hazard from above), and $\Delta T$ is the reference interval of time. Use $\Delta T = 50$ years to compute this probability as a function of acceleration level. Plot your result. Your plot will look something like: <img src="./Figures/prob_accel.png"> or <img src="./Figures/loglog_prob_accel.png"> depending on how you decide to make your axes.
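As a minimal sketch, the relation and its inverse can be written as two small functions. The function names are assumptions for this example; the inverse is the same expression, `-np.log(1-P)/intervalTime`, used in the in-class code to find the 10%, 5%, and 2% levels.

```python
import numpy as np

def prob_of_exceedance(annual_rate, interval_years=50.0):
    """P = 1 - exp(-lambda * dT): chance of at least one exceedance."""
    return 1.0 - np.exp(-np.asarray(annual_rate) * interval_years)

def rate_for_probability(prob, interval_years=50.0):
    """Inverse relation: lambda = -ln(1 - P) / dT."""
    return -np.log(1.0 - np.asarray(prob)) / interval_years

target_probs = np.array([0.10, 0.05, 0.02])
target_rates = rate_for_probability(target_probs)  # annual rates
round_trip = prob_of_exceedance(target_rates)      # recovers target_probs
return_periods = 1.0 / target_rates                # years between exceedances
```

For example, a 10% chance in 50 years corresponds to an annual rate of about 0.0021, or a return period of roughly 475 years. Passing your `total_hazard` array to `prob_of_exceedance` gives the probability curve to plot.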

### Visualization

Use `pd.read_table` to read in the longitude/latitude files giving the locations of the major Bay Area faults. The files are named: San_Andreas.txt, San_Gregorio.txt, Hayward.txt, Calaveras.txt, Hunting_Creek.txt, Rogers_Creek.txt, Concord.txt, Greenville.txt, Maacama.txt.
The first column is longitude and the second column is latitude.

In [None]:
a=pd.read_table('Hayward.txt') 
hay_lon=a.x.values
hay_lat=a.y.values

# load the rest of the faults

As we did during the in-class portion of this module, plot the main shock Bay Area earthquakes with their color and marker size scaled by the event magnitude. On top of the earthquakes, plot the locations of the major faults as lines and the location of the UC Berkeley campus. Include appropriate axes labels, a plot title, and a legend.

Make publication quality (clear, concise, and attractive) visualizations which you will use in your report to communicate to an audience of your peers 1) the Gutenberg-Richter distribution of Bay Area earthquake magnitudes, 2) the GMPE describing the relationship between ground acceleration, distance to the fault, and earthquake magnitude, and 3) the hazard curves you computed above. Make some well thought out decisions about whether to use linear or loglog axes: which is the better tool for communication (or are both useful)? For the hazard curves, make some well thought out decisions about whether to use the "Annual Rate of Exceedence" or the "Probability of Exceeding in 50 Years" vs. acceleration level curve. Which is EASIER to communicate? In either case include color-coded vertical reference lines communicating the perceived shaking levels corresponding to the acceleration levels on the hazard curve. For all of these plots include appropriate axes labels and plot titles. Use `plt.text()` to add text annotations if they will help your message.

Add to the plot lines that indicate the acceleration levels from the peak ground acceleration scale: <img src="./Figures/peak_accel_scale.png"> <img src="./Figures/PGM_scale_legend.png">

### Communication

Read "Public Education for Earthquake Hazards" by Sarah K. Nathe and answer the following questions.  A PDF is in the class bcourses file space.

Are your peers (undergraduate students at UC Berkeley) likely to heed information about hazards and do something to increase their safety? What arguments would you find persuasive to take action (make an emergency kit and plan, purchase earthquake insurance)?

_Write your answer here._

What are the six 'immutable laws' of effective public education programs according to Nathe (2000)? What do lay people prefer in public education programs?

How do you think the internet and social media, which were either not widely used or nonexistent in 2000 when this paper was written, can contribute to effective public education programs?

_Write your answer here._

What recent seismic event has opened a window of opportunity that you could exploit for your public education report? What was its magnitude and location? Were there compelling photographs or videos you could use?

_Write your answer here._

Another useful resource for facts about the potential hazards in the Bay Area is the USGS's HayWired Scenario report. We read Chapter C for Module 4b, but Chapter A also has a LOT of useful information. A PDF is in the class bcourses file space.