# (II) Detection and Picking
This notebook demonstrates how to use EQTransformer to detect earthquake signals and pick seismic phases (P and S) on continuous data. Once you have your seismic data, preferably in MiniSeed format and in individual subfolders for each station, you can perform the detection/picking using one of the following options:


### Option (I) on preprocessed (hdf5) files:

This option is recommended for smaller time periods (a few days to a month). It allows you to test the performance and explore the effects of different parameters, while the hdf5 file makes it easy to access the waveforms.

For this option you first need to convert the MiniSeed files for each station into a single hdf5 file and a CSV file containing the list of traces in the hdf5 file.
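
Before converting, it can help to confirm that the MiniSeed directory really has one subfolder per station. The following is a minimal sketch using only the standard library. The helper name `count_mseed_per_station` and the `.mseed` file extension are assumptions made for illustration and are not part of EQTransformer.

```python
import os


def count_mseed_per_station(mseed_dir, extension=".mseed"):
    """Return {station_subfolder: number of files ending in `extension`}."""
    counts = {}
    for entry in sorted(os.listdir(mseed_dir)):
        station_path = os.path.join(mseed_dir, entry)
        if not os.path.isdir(station_path):
            continue  # skip stray files at the top level
        counts[entry] = sum(
            1
            for name in os.listdir(station_path)
            if name.lower().endswith(extension)
        )
    return counts
```

Calling `count_mseed_per_station('downloads_mseeds')` should list every station folder. A station with a count of zero probably has files with a different extension or none at all.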

You can convert MiniSeed files to a hdf5 file using the following command:

In [5]:
import os
from EQTransformer.utils.hdf5_maker import preprocessor

json_basepath = os.path.join(os.getcwd(),"json/station_list.json")


In [None]:

preprocessor(preproc_dir="preproc",
             mseed_dir='downloads_mseeds', 
             stations_json=json_basepath, 
             overlap=0.3, 
             n_processor=2)

This will generate one "station_name.hdf5" and one "station_name.csv" file for each of your stations and put them into a directory named "mseed_dir+_processed_hdfs" (here, "downloads_mseeds_processed_hdfs"). You then pass the name of this directory, which contains your hdf5 and CSV files, together with a model to the predictor. You can use relatively low threshold values for the detection and picking since EQTransformer is very robust to false positives. Enabling uncertainty estimation, outputting probabilities, or plotting all the detected events will slow down the process.
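
Since each station needs both an hdf5 file and a CSV file, a quick check of the preprocessed directory can catch incomplete conversions before prediction. This is a small sketch under that assumption. The helper `stations_with_both_files` is hypothetical and only inspects file names.

```python
import os


def stations_with_both_files(hdf_dir):
    """Return (complete, incomplete) lists of station names found in `hdf_dir`.

    A station is complete when both "<name>.hdf5" and "<name>.csv" exist;
    it is incomplete when only one of the two is present.
    """
    hdf5_names, csv_names = set(), set()
    for filename in os.listdir(hdf_dir):
        stem, ext = os.path.splitext(filename)
        if ext == ".hdf5":
            hdf5_names.add(stem)
        elif ext == ".csv":
            csv_names.add(stem)
    complete = sorted(hdf5_names & csv_names)
    incomplete = sorted(hdf5_names ^ csv_names)
    return complete, incomplete
```

Any name in the second list points to a station whose conversion should be repeated.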

In [None]:
from EQTransformer.core.predictor import predictor
predictor(input_dir='downloads_mseeds_processed_hdfs',   
         input_model='../ModelsAndSampleData/EqT_model.h5',
         output_dir='detections1',
         estimate_uncertainty=False, 
         output_probabilities=False,
         number_of_sampling=5,
         loss_weights=[0.02, 0.40, 0.58],          
         detection_threshold=0.3,                
         P_threshold=0.1,
         S_threshold=0.1, 
         number_of_plots=10,
         plot_mode='time',
         batch_size=500,
         number_of_cpus=4,
         keepPS=False,
         spLimit=60) 
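
To make the role of `detection_threshold` more concrete, here is a deliberately simplified sketch of thresholding a probability sequence. It is not EQTransformer's actual post-processing. The function `spans_above_threshold` and the sample values are hypothetical.

```python
def spans_above_threshold(probabilities, threshold):
    """Return (start, end) index pairs of runs where probability >= threshold.

    `end` is exclusive, as in Python slicing.
    """
    spans = []
    start = None
    for i, p in enumerate(probabilities):
        if p >= threshold and start is None:
            start = i  # a run begins
        elif p < threshold and start is not None:
            spans.append((start, i))  # the run ended just before index i
            start = None
    if start is not None:
        spans.append((start, len(probabilities)))  # run reaches the end
    return spans


# Made-up probabilities, for illustration only
probs = [0.05, 0.2, 0.35, 0.6, 0.4, 0.1, 0.02, 0.31, 0.33]
print(spans_above_threshold(probs, 0.3))
print(spans_above_threshold(probs, 0.5))
```

With these sample values, a threshold of 0.3 yields two spans and a threshold of 0.5 yields one, which shows how a lower threshold admits more candidate detections.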

If you are using local MiniSeed files, you can generate a station_list.json by passing the absolute path of the directory containing the MiniSeed files and a station location dictionary to the stationListFromMseed function, as follows:

In [None]:
from EQTransformer.utils.hdf5_maker import stationListFromMseed

mseed_directory = '/Users/username/Downloads/EQTransformer/examples/downloads_mseeds'
station_locations = {"CA06": [35.59962, -117.49268, 796.4], "CA10": [35.56736, -117.667427, 835.9]}
stationListFromMseed(mseed_directory, station_locations)
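
A typo in the location dictionary is easy to miss, so a quick sanity check before calling `stationListFromMseed` may be worthwhile. The sketch below assumes each entry is ordered as `[latitude, longitude, elevation]`, which the sample values suggest but the notebook does not state. The helper `validate_station_locations` is hypothetical.

```python
def validate_station_locations(station_locations):
    """Return a list of problems; an empty list means every entry looks sane."""
    problems = []
    for station, location in station_locations.items():
        if len(location) != 3:
            problems.append(f"{station}: expected 3 values, got {len(location)}")
            continue
        # Assumed order: latitude, longitude, elevation
        latitude, longitude, _elevation = location
        if not -90.0 <= latitude <= 90.0:
            problems.append(f"{station}: latitude {latitude} is outside [-90, 90]")
        if not -180.0 <= longitude <= 180.0:
            problems.append(f"{station}: longitude {longitude} is outside [-180, 180]")
    return problems


station_locations = {
    "CA06": [35.59962, -117.49268, 796.4],
    "CA10": [35.56736, -117.667427, 835.9],
}
print(validate_station_locations(station_locations))
```

A swapped latitude and longitude is only caught when the swapped value falls outside the latitude range, so this check complements a visual inspection and does not replace one.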

### Option (II) directly on downloaded MiniSeed files:

You can perform the detection/picking directly on .mseed files. 
This saves both preprocessing time and the extra space needed for the hdf5 files. However, it can be more memory intensive, so it is recommended when the mseed files are one month long or shorter.
This option also does not allow you to estimate the uncertainties, write the prediction probabilities, or take advantage of the hdf5 files, which make it easy to access the raw event waveforms based on detection results.

In [None]:
from EQTransformer.core.mseed_predictor import mseed_predictor
mseed_predictor(input_dir='downloads_mseeds',   
         input_model='../ModelsAndSampleData/EqT_model2.h5',
         stations_json=json_basepath,
         output_dir='detections3',
         loss_weights=[0.02, 0.40, 0.58],          
         detection_threshold=0.3,                
         P_threshold=0.1,
         S_threshold=0.1, 
         number_of_plots=10,
         plot_mode='time_frequency',
         normalization_mode='std',
         batch_size=500,
         overlap=0.3,
         gpuid=0,
         gpu_limit=None) 
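
The `overlap` argument sets the fraction shared by consecutive windows. As a rough illustration, the sketch below computes window start times assuming 60-second windows. That window length is an assumption not stated in this notebook, and `window_starts` is a hypothetical helper, not the library's slicing code.

```python
def window_starts(total_seconds, window_seconds=60.0, overlap=0.3):
    """Return start times (in seconds) of sliding windows that fit in the record."""
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step = window_seconds * (1.0 - overlap)  # shift between consecutive windows
    starts = []
    k = 0
    while k * step + window_seconds <= total_seconds:
        starts.append(round(k * step, 6))
        k += 1
    return starts


# Three minutes of data, 60 s windows, 30 % overlap
print(window_starts(180.0))
```

Under these assumptions each window starts 42 seconds after the previous one, so a larger overlap means more windows to process for the same record.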

Prediction outputs for each station will be written to your output directory (e.g. 'detections1').

'X_report.txt' contains processing info on the input parameters used for the detection/picking and final results such as the running time and the total number of detected events (these are unique events; duplicates have already been removed).

'X_prediction_results.csv' contains the detection/picking results.

In the figures folder you can find the plots for the number of events that you specified in the above command.
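
The results CSV can be post-processed with the standard library. The sketch below keeps only rows above a chosen detection probability. The column names (`station`, `detection_probability`, `p_arrival_time`, `s_arrival_time`) are assumptions about the file layout and should be checked against your own output. The sample rows are made up.

```python
import csv
import io


def filter_detections(csv_file, min_detection_probability=0.5):
    """Yield rows whose detection probability is at least the given minimum."""
    for row in csv.DictReader(csv_file):
        try:
            probability = float(row["detection_probability"])
        except (ValueError, TypeError):
            continue  # skip rows with a missing or non-numeric probability
        if probability >= min_detection_probability:
            yield row


# Synthetic stand-in for an 'X_prediction_results.csv' file
sample = io.StringIO(
    "station,detection_probability,p_arrival_time,s_arrival_time\n"
    "CA06,0.92,2019-09-01 00:10:05.10,2019-09-01 00:10:09.40\n"
    "CA06,0.35,2019-09-01 03:22:41.00,\n"
)
for row in filter_detections(sample, 0.5):
    print(row["station"], row["p_arrival_time"])
```

For a real file, pass an open file handle, for example `open(path, newline="")`, instead of the `io.StringIO` sample.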

In [2]:
import os

from obspy import UTCDateTime
from EQTransformer.core.mseed_predictor import sds_predictor

json_basepath = os.path.join(os.getcwd(), "json/station_list_sds.json")

# Root of the waveform archive and the time window to process
sdsdir = '/home/wwj/Data/waves/waves'
t1 = UTCDateTime('2019-09-01T00:00:00')
t2 = UTCDateTime('2019-09-03T00:00:00')
sds_predictor(t1,t2,sds_dir=sdsdir,   
         input_model='../ModelsAndSampleData/EqT_model.h5',
         stations_json=json_basepath,
         output_dir='detections_sds3',
         loss_weights=[0.02, 0.40, 0.58],          
         detection_threshold=0.3,                
         P_threshold=0.1,
         S_threshold=0.1, 
         number_of_plots=0,
         plot_mode='time_frequency',
         normalization_mode='std',
         batch_size=500,
         overlap=0.3,
         gpuid=0,
         gpu_limit=None) 

05-06 12:59 [INFO] [EQTransformer] Running EqTransformer  0.1.61
05-06 12:59 [INFO] [EQTransformer] *** Loading the model ...


NotImplementedError: Cannot convert a symbolic Tensor (bidirectional_1/forward_lstm_1/strided_slice:0) to a numpy array. This error may indicate that you're trying to pass a Tensor to a NumPy call, which is not supported
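
This `NotImplementedError` is commonly reported when an older TensorFlow 2.x release runs alongside NumPy 1.20 or newer. Whether that is the cause here cannot be confirmed from the log alone. As a first diagnostic, the sketch below checks whether the installed NumPy is at or above that version. The helper `numpy_at_least` is hypothetical.

```python
def numpy_at_least(version, minimum=(1, 20)):
    """Return True if a 'major.minor[.patch]' version string is >= `minimum`."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= minimum


try:
    import numpy
    print("NumPy", numpy.__version__,
          "is 1.20 or newer:", numpy_at_least(numpy.__version__))
except ImportError:
    print("NumPy is not installed in this environment")
```

If the check returns `True`, comparing the installed NumPy and TensorFlow versions against the versions listed in the package requirements is a reasonable next step.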