# (II) Detection and Picking
This notebook demonstrates how to use EQTransformer to perform earthquake signal detection and seismic phase (P and S) picking on continuous data. Once you have your seismic data, preferably in MiniSeed format and in individual subfolders for each station, you can perform the detection/picking using one of the following options:


### Option (I) on preprocessed (hdf5) files:

This option is recommended for shorter time periods (a few days to a month). It lets you test the performance and explore the effects of different parameters, while the hdf5 file makes it easy to access the waveforms.

For this option, you first need to convert the MiniSeed files for each station into a single hdf5 file and a CSV file containing the list of traces in the hdf5 file.
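
Before converting, it can help to confirm that the data really are organised as one subfolder per station. The following is a minimal sketch using only the standard library; the helper name `list_station_files` is hypothetical and not part of EQTransformer, and it assumes the files carry a `.mseed` extension.

```python
import os

def list_station_files(mseed_dir):
    """Map each station subfolder name to its sorted list of .mseed file names.

    Hypothetical helper for illustration; not part of EQTransformer.
    """
    stations = {}
    for entry in sorted(os.listdir(mseed_dir)):
        station_path = os.path.join(mseed_dir, entry)
        if not os.path.isdir(station_path):
            continue  # skip stray files at the top level
        stations[entry] = sorted(
            name for name in os.listdir(station_path)
            if name.lower().endswith(".mseed")
        )
    return stations

# Only run the check if the example directory is present.
if os.path.isdir("downloads_mseeds"):
    for station, files in list_station_files("downloads_mseeds").items():
        print(station, len(files), "MiniSeed file(s)")
```

A station that reports zero files, or a station missing from the listing, is worth checking before running the conversion.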

You can convert MiniSeed files to a hdf5 file using the following command:

In [5]:
import os
from EQTransformer.utils.hdf5_maker import preprocessor

json_basepath = os.path.join(os.getcwd(),"json/station_list.json")

preprocessor(preproc_dir="preproc",
             mseed_dir='downloads_mseeds', 
             stations_json=json_basepath, 
             overlap=0.3, 
             n_processor=2)


  * SV08 (1) .. 20190901 --> 20190902 .. 3 components .. sampling rate: 100.0
  * B921 (1) .. 20190901 --> 20190902 .. 3 components .. sampling rate: 100.0
  * SV08 (2) .. 20190902 --> 20190903 .. 3 components .. sampling rate: 100.0
  * B921 (2) .. 20190902 --> 20190903 .. 3 components .. sampling rate: 100.0
 Station SV08 had 2 chuncks of data
4112 slices were written, 4114.0 were expected.
Number of 1-components: 0. Number of 2-components: 0. Number of 3-components: 2.
Original samplieng rate: 100.0.
  * CA06 (1) .. 20190901 --> 20190902 .. 3 components .. sampling rate: 100.0
 Station B921 had 2 chuncks of data
4112 slices were written, 4114.0 were expected.
Number of 1-components: 0. Number of 2-components: 0. Number of 3-components: 2.
Original samplieng rate: 100.0.
  * CA06 (2) .. 20190902 --> 20190903 .. 3 components .. sampling rate: 100.0
 Station CA06 had 2 chuncks of data
4112 slices were written, 4114.0 were expected.
Number of 1-components: 0. Number of 2-components: 0.
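
The `overlap` value and the slice counts in the log above can be related with a little arithmetic. The sketch below assumes 60-second windows (6000 samples at 100 Hz), which this notebook does not state; the helper name `count_full_windows` is hypothetical, and this is not claimed to be the library's own computation.

```python
# Assumption (not stated in this notebook): 60 s windows, i.e. 6000 samples at 100 Hz.
SAMPLING_RATE = 100      # Hz, as reported in the log above
WINDOW_SAMPLES = 6000    # assumed window length in samples
OVERLAP = 0.3            # as passed to preprocessor above

def count_full_windows(n_samples, window=WINDOW_SAMPLES, overlap=OVERLAP):
    """Number of complete windows that fit in n_samples with the given overlap."""
    # Distance between consecutive window starts (4200 samples with the defaults).
    step = round(window * (1.0 - overlap))
    if n_samples < window:
        return 0
    return (n_samples - window) // step + 1

day_samples = 24 * 60 * 60 * SAMPLING_RATE

# Two one-day chunks per station, windowed independently.
full_windows = 2 * count_full_windows(day_samples)

# Naive estimate: total duration divided by the step, ignoring chunk edges.
naive_estimate = int(2 * day_samples / round(WINDOW_SAMPLES * (1.0 - OVERLAP)))

print(full_windows, naive_estimate)
```

Under these assumptions the two numbers come out as 4112 and 4114, the same pair reported in the log. The small shortfall would then come from the last partial window of each daily chunk being dropped.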

The preprocessor call above will generate one "station_name.hdf5" and one "station_name.csv" file for each of your stations and put them into a directory named "mseed_dir+_processed_hdfs" (here, 'downloads_mseeds_processed_hdfs'). Then you need to pass the name of the directory containing your hdf5 and CSV files, together with a model, to the predictor. You can use relatively low threshold values for the detection and picking since EQTransformer is very robust to false positives. Enabling uncertainty estimation, outputting probabilities, or plotting all the detected events will slow down the process.

In [6]:
from EQTransformer.core.predictor import predictor
predictor(input_dir='downloads_mseeds_processed_hdfs',   
         input_model='../ModelsAndSampleData/EqT_original_model.h5',
         output_dir='detections1',
         estimate_uncertainty=False, 
         output_probabilities=False,
         number_of_sampling=5,
         loss_weights=[0.02, 0.40, 0.58],          
         detection_threshold=0.3,                
         P_threshold=0.3,
         S_threshold=0.3, 
         number_of_plots=10,
         plot_mode='time',
         batch_size=500,
         number_of_cpus=4,
         keepPS=False,
         spLimit=60) 

Running EqTransformer  0.1.58
 *** Loading the model ...
*** Loading is complete!
######### There are files for 3 stations in downloads_mseeds_processed_hdfs directory. #########



  0%|                                                                         | 0/9 [00:00<?, ?it/s]
 22%|██████████████▍                                                  | 2/9 [00:28<01:39, 14.16s/it]
 33%|█████████████████████▋                                           | 3/9 [00:33<01:07, 11.32s/it]
 44%|████████████████████████████▉                                    | 4/9 [00:37<00:46,  9.36s/it]
 56%|████████████████████████████████████                             | 5/9 [00:42<00:31,  7.99s/it]
 67%|███████████████████████████████████████████▎                     | 6/9 [00:47<00:21,  7.10s/it]
 78%|██████████████████████████████████████████████████▌              | 7/9 [00:52<00:13,  6.58s/it]
 89%|█████████████████████████████

If you are using local MiniSeed files, you can generate a station_list.json by passing the absolute path of the directory containing the MiniSeed files and a dictionary of station locations to the stationListFromMseed function, as follows:

In [None]:
from EQTransformer.utils.hdf5_maker import stationListFromMseed

mseed_directory = '/Users/username/Downloads/EQTransformer/examples/downloads_mseeds'
station_locations = {"CA06": [35.59962, -117.49268, 796.4], "CA10": [35.56736, -117.667427, 835.9]}
stationListFromMseed(mseed_directory, station_locations)

### Option (II) directly on downloaded MiniSeed files:

You can perform the detection/picking directly on .mseed files.
This saves both preprocessing time and the extra space needed for the hdf5 files. However, it can be more memory intensive, so it is recommended when the mseed files are one month long or shorter.
This option also does not allow you to estimate the uncertainties, write the prediction probabilities, or benefit from the hdf5 files, which make it easy to access the raw event waveforms based on the detection results.

In [7]:
from EQTransformer.core.mseed_predictor import mseed_predictor
mseed_predictor(input_dir='downloads_mseeds',   
         input_model='../ModelsAndSampleData/EqT_original_model.h5',
         stations_json=json_basepath,
         output_dir='detections2',
         loss_weights=[0.02, 0.40, 0.58],          
         detection_threshold=0.7,                
         P_threshold=0.3,
         S_threshold=0.3, 
         number_of_plots=10,
         plot_mode='time_frequency',
         normalization_mode='std',
         batch_size=500,
         overlap=0.9,
         gpuid=None,
         gpu_limit=None) 

01-29 22:39 [INFO] [EQTransformer] Running EqTransformer  0.1.58
01-29 22:39 [INFO] [EQTransformer] *** Loading the model ...
01-29 22:41 [INFO] [EQTransformer] *** Loading is complete!
01-29 22:41 [INFO] [EQTransformer] There are files for 3 stations in downloads_mseeds directory.
01-29 22:41 [INFO] [EQTransformer] Started working on B921, 1 out of 3 ...
01-29 22:41 [INFO] [EQTransformer] 20190901T000000Z__20190902T000000Z.mseed
01-29 22:41 [INFO] [EQTransformer] 20190902T000000Z__20190903T000000Z.mseed






01-29 22:42 [INFO] [EQTransformer] Finished the prediction in: 0 hours and 1 minutes and 0.64 seconds.
01-29 22:42 [INFO] [EQTransformer] *** Detected: 2852 events.
01-29 22:42 [INFO] [EQTransformer]  *** Wrote the results into --> " /Users/mostafamousavi/Desktop/EQTransformer/examples/detections2/B921_outputs "
01-29 22:42 [INFO] [EQTransformer] Started working on CA06, 2 out of 3 ...
01-29 22:42 [INFO] [EQTransformer] 20190901T000000Z__20190902T000000Z.mseed
01-29 22:42 [INFO] [EQTransformer] 20190902T000000Z__20190903T000000Z.mseed






01-29 22:42 [INFO] [EQTransformer] Finished the prediction in: 0 hours and 0 minutes and 36.11 seconds.
01-29 22:42 [INFO] [EQTransformer] *** Detected: 2799 events.
01-29 22:42 [INFO] [EQTransformer]  *** Wrote the results into --> " /Users/mostafamousavi/Desktop/EQTransformer/examples/detections2/CA06_outputs "
01-29 22:42 [INFO] [EQTransformer] Started working on SV08, 3 out of 3 ...
01-29 22:42 [INFO] [EQTransformer] 20190901T000000Z__20190902T000000Z.mseed
01-29 22:43 [INFO] [EQTransformer] 20190902T000000Z__20190903T000000Z.mseed






01-29 22:43 [INFO] [EQTransformer] Finished the prediction in: 0 hours and 0 minutes and 33.85 seconds.
01-29 22:43 [INFO] [EQTransformer] *** Detected: 1593 events.
01-29 22:43 [INFO] [EQTransformer]  *** Wrote the results into --> " /Users/mostafamousavi/Desktop/EQTransformer/examples/detections2/SV08_outputs "


Prediction outputs for each station will be written to your output directory (e.g. 'detections2').

'X_report.txt' contains processing information: the input parameters used for the detection/picking and final results such as the running time and the total number of detected events (these are unique events; duplicates have already been removed).

'X_prediction_results.csv' contains the detection/picking results. In the figures folder you can find the plots for the number of events that you specified in the command above (number_of_plots).
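
As a quick sanity check on the outputs, the per-station result files can be tallied with the standard library. This is a minimal sketch, assuming each station folder is named "STATION_outputs" (as in the log above) and that 'X_prediction_results.csv' has a single header row followed by one row per detected event; the helper name `count_detections` is hypothetical and not part of EQTransformer.

```python
import csv
import glob
import os

def count_detections(output_dir, results_name="X_prediction_results.csv"):
    """Return {station: number of data rows} for each station results file.

    Hypothetical helper; assumes one header row and one row per event.
    """
    counts = {}
    pattern = os.path.join(output_dir, "*_outputs", results_name)
    for path in sorted(glob.glob(pattern)):
        folder = os.path.basename(os.path.dirname(path))
        station = folder[: -len("_outputs")]
        with open(path, newline="") as f:
            n_rows = sum(1 for _ in csv.reader(f))
        counts[station] = max(n_rows - 1, 0)  # subtract the header row
    return counts

# Prints nothing if the directory does not exist or holds no results yet.
for station, n_events in count_detections("detections2").items():
    print(f"{station}: {n_events} events")
```

The counts can then be compared with the totals reported in the log or in 'X_report.txt'.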