# Detection and Picking
This notebook demonstrates how to use EQTransformer to detect earthquake signals and pick seismic phases (P and S) on continuous data. Once you have your seismic data, preferably in mseed format and in individual subfolders for each station, you can perform the detection/picking using one of the following options:


### Option (I) on preprocessed (hdf5) files:

This option is recommended for smaller time periods (a few days to a month). It allows you to test the performance and explore the effects of different parameters, while the hdf5 file makes it easy to access the waveforms.

For this option you first need to convert the MiniSeed files for each station into a single hdf5 file and generate a csv file containing the list of traces in the hdf5 file. You can do this using the following command:

In [2]:
from EQTransformer.utils.hdf5_maker import preprocessor
preprocessor(mseed_dir='downloads_mseeds', 
             stations_json='station_list.json', 
             overlap=0.3,
             n_processor=2)

 *** " downloads_mseeds " directory already exists!
 * --> Do you want to creat a new empty folder? Type (Yes or y) y
  * SV08 (1) .. 20190901 --> 20190902 .. 3 components .. sampling rate: 100.0
  * B921 (1) .. 20190901 --> 20190902 .. 3 components .. sampling rate: 100.0
  * SV08 (2) .. 20190902 --> 20190903 .. 3 components .. sampling rate: 100.0
  * B921 (2) .. 20190902 --> 20190903 .. 3 components .. sampling rate: 100.0
 Station SV08 had 2 chuncks of data
4112 slices were written, 4114.0 were expected.
Number of 1-components: 0. Number of 2-components: 0. Number of 3-components: 2.
Original samplieng rate: 100.0.
  * CA06 (1) .. 20190901 --> 20190902 .. 3 components .. sampling rate: 100.0
 Station B921 had 2 chuncks of data
4112 slices were written, 4114.0 were expected.
Number of 1-components: 0. Number of 2-components: 0. Number of 3-components: 2.
Original samplieng rate: 100.0.
  * CA06 (2) .. 20190902 --> 20190903 .. 3 components .. sampling rate: 100.0
 Station CA06 had 2 
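
The "slices ... were expected" figure in the log above can be roughly reproduced by hand. The following is a minimal sketch, not the library's own code: it assumes that the data are cut into 60-second windows (an assumption not stated in this notebook), that `overlap` is the fraction shared by consecutive windows, and that each one-day chunk is counted separately. The function name `expected_slices` and the `window_s` parameter are hypothetical.

```python
def expected_slices(chunk_durations_s, overlap, window_s=60):
    """Estimate how many overlapping windows fit into a set of data chunks.

    chunk_durations_s: duration of each continuous chunk, in seconds.
    overlap: fraction of each window shared with the next one (e.g. 0.3).
    window_s: window length in seconds (assumed to be 60 here).
    """
    # Consecutive windows start this many seconds apart.
    step_s = round(window_s * (1.0 - overlap))
    # Count whole steps per chunk, then add the chunks together.
    return sum(int(duration // step_s) for duration in chunk_durations_s)


# Two one-day chunks with 30% overlap, as in the example above.
print(expected_slices([86400, 86400], overlap=0.3))  # 4114
```

Under these assumptions the windows start 42 seconds apart, so each day contributes 2057 windows. The number of slices actually written can be slightly lower, as the log shows.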

This will generate one "station_name.hdf5" and one "station_name.csv" file for each of your stations and put them into a directory named "mseed_dir+_processed_hdfs" (here, 'downloads_mseeds_processed_hdfs'). Then you only need to pass the name of the directory containing your hdf5 and csv files and a model. You can use relatively low threshold values for the detection and picking since EQTransformer is very robust to false positives. Enabling uncertainty estimation, outputting probabilities, or plotting all the detected events will slow down the process.
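
Before running the predictor, it can be useful to confirm that every station has both of its files. The following is a minimal sketch that relies only on the file naming described above ("station_name.hdf5" and "station_name.csv" in one directory); the helper name `stations_with_both_files` is hypothetical and is not part of EQTransformer.

```python
from pathlib import Path


def stations_with_both_files(hdf_dir):
    """Return the sorted station names that have both an .hdf5 and a .csv file."""
    hdf_dir = Path(hdf_dir)
    # The file stem (name without extension) is taken as the station name.
    hdf5_names = {path.stem for path in hdf_dir.glob("*.hdf5")}
    csv_names = {path.stem for path in hdf_dir.glob("*.csv")}
    # Keep only the stations present in both sets.
    return sorted(hdf5_names & csv_names)


# Example call (the directory name is taken from this notebook):
# stations_with_both_files("downloads_mseeds_processed_hdfs")
```

A station that appears in only one of the two sets usually points to an interrupted preprocessing run.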

In [17]:
from EQTransformer.core.predictor import predictor
predictor(input_dir= 'downloads_mseeds_processed_hdfs',   
         input_model='sampleData&Model/EqT1D8pre_048.h5',
         output_dir='detections1',
         estimate_uncertainty=False, 
         output_probabilities=False,
         number_of_sampling=5,
         loss_weights=[0.02, 0.40, 0.58],          
         detection_threshold=0.30,                
         P_threshold=0.1,
         S_threshold=0.1, 
         number_of_plots=100,
         plot_mode = 'time',
         batch_size=500,
         number_of_cpus=4,
         keepPS=False,
         spLimit=60) 

Running EqTransformer  None
 *** Loading the model ...
*** Loading is complete!
 *** /Users/mostafamousavi/Downloads/shadow-master/doc/source/examples/detections1 already exists!
 --> Type (Yes or y) to create a new empty directory! otherwise it will overwrite!   y
######### There are files for 3 stations in downloads_mseeds_processed_hdfs directory. #########

  0%|                                                                         | 0/9 [00:00<?, ?it/s]
 22%|██████████████▍                                                  | 2/9 [01:00<03:32, 30.38s/it]
 33%|█████████████████████▋                                           | 3/9 [01:10<02:25, 24.24s/it]
 44%|████████████████████████████▉                                    | 4/9 [01:21<01:41, 20.28s/it]
 56%|████████████████████████████████████                             | 5/9 [01:30<01:07, 16.89s/it]
 67%|███████████████████████████████████████████▎                     | 6/9 [01:39<00:43, 14.40s/it]
 78%|███████

### Option (II) directly on downloaded MiniSeed files:

You can perform the detection/picking directly on .mseed files. 
This saves both preprocessing time and the extra space needed for the hdf5 file. However, it can be more memory intensive, so it is recommended when the mseed files are one month long or shorter.
This option also does not allow you to estimate the uncertainties, write the prediction probabilities, or take advantage of the hdf5 files, which make it easy to access the raw event waveforms based on detection results.

In [24]:
from EQTransformer.core.mseed_predictor import mseed_predictor
mseed_predictor(input_dir= 'downloads_mseeds',   
         input_model='sampleData&Model/EqT1D8pre_048.h5',
         stations_json='station_list.json',
         output_dir='detections2',
         loss_weights=[0.02, 0.40, 0.58],          
         detection_threshold=0.30,                
         P_threshold=0.1,
         S_threshold=0.1, 
         number_of_plots=100,
         plot_mode = 'time_frequency',
         normalization_mode='std',
         overlap = 0.3,
         gpuid=None,
         gpu_limit=None) 

Running EqTransformer  None
 *** Loading the model ...
*** Loading is complete!
######### There are files for 3 stations in downloads_mseeds directory. #########
20190901T000000Z__20190902T000000Z.mseed
20190902T000000Z__20190903T000000Z.mseed


 *** Finished the prediction in: 0 hours and 2 minutes and 32.36 seconds.
 *** Detected: 1459 events.
 *** Wrote the results into --> " /Users/mostafamousavi/Downloads/shadow-master/doc/source/examples/detections2/B921_outputs "
20190901T000000Z__20190902T000000Z.mseed
20190902T000000Z__20190903T000000Z.mseed


 *** Finished the prediction in: 0 hours and 2 minutes and 19.67 seconds.
 *** Detected: 1316 events.
 *** Wrote the results into --> " /Users/mostafamousavi/Downloads/shadow-master/doc/source/examples/detections2/CA06_outputs "
20190901T000000Z__20190902T000000Z.mseed
20190902T000000Z__20190903T000000Z.mseed


 *** Finished the prediction in: 0 hours and 2 minutes and 22.13 seconds.
 *** Detected: 892 events.
 *** Wrote the results into

Prediction outputs for each station will be written to a "station_name_outputs" folder inside your output directory (e.g. 'detections2/B921_outputs').

'X_report.txt' contains processing info on the input parameters used for the detection/picking and final results such as the running time and the total number of detected events (these are unique events; duplicates have already been removed).

'X_prediction_results.csv' contains the detection/picking results. In the figures folder you can find plots for the number of events that you specified in the above command (number_of_plots).
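
Because low thresholds are used above, you may want to keep only the more confident detections afterwards. The following is a minimal sketch using only the Python standard library. It assumes the results file has a header row with a column holding the detection probability; the column name `detection_probability` is an assumption, so check the header of your own file. The function name `filter_detections` is hypothetical.

```python
import csv


def filter_detections(csv_path, min_probability, column="detection_probability"):
    """Return the rows whose value in `column` is at least `min_probability`."""
    kept = []
    with open(csv_path, newline="") as handle:
        # DictReader uses the header row to name the fields of every row.
        for row in csv.DictReader(handle):
            try:
                probability = float(row.get(column, ""))
            except (TypeError, ValueError):
                # Skip rows where the value is missing or not a number.
                continue
            if probability >= min_probability:
                kept.append(row)
    return kept


# Example call (the path follows the output folders shown in the log above):
# confident = filter_detections(
#     "detections2/B921_outputs/X_prediction_results.csv", min_probability=0.5)
```

Each returned row is a dictionary keyed by the column names in the header, so the remaining fields stay available for further filtering.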

You can choose between two different modes for your plots:
1) 'time':

![B921_PB_EH_2019-09-01T00:28:00.008300Z.png](attachment:B921_PB_EH_2019-09-01T00:28:00.008300Z.png)

2) 'time_frequency':

![2019-09-01%2000:28:00.008300.png](attachment:2019-09-01%2000:28:00.008300.png)