# Downloading Continuous Data
This notebook demonstrates how to use EQTransformer to download continuous data and to build HDF5 files.


In [12]:
from EQTransformer.utils.downloader import makeStationList, downloadMseeds

You can use help() to learn about the input parameters of each function. For instance:

In [13]:
help(makeStationList)

Help on function makeStationList in module EQTransformer.utils.downloader:

makeStationList(json_path, client_list, min_lat, max_lat, min_lon, max_lon, start_time, end_time, channel_list=[], filter_network=[], filter_station=[], **kwargs)
    Uses fdsn to find available stations in a specific geographical location and time period.  
    
    Parameters
    ----------
    json_path: str
        Path of the json file that will be returned
    
    client_list: list
        List of client names e.g. ["IRIS", "SCEDC", "USGGS"].
                                
    min_lat: float
        Min latitude of the region.
        
    max_lat: float
        Max latitude of the region.
        
    min_lon: float
        Min longitude of the region.
        
    max_lon: float
        Max longitude of the region.
        
    start_time: str
        Start DateTime for the beginning of the period in "YYYY-MM-DDThh:mm:ss.f" format.
        
    end_time: str
        End DateTime for the beginning of 

### 1) Finding the available stations

Defining the network, station, location, channel and time period of interest:

In [14]:
NET = "CI"
STA = "BAK,ARV"
LOC = "*"
CHA ="BHZ"
STIME="2020-09-01 00:00:00.00"
ETIME="2020-09-02 00:00:00.00"
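Before sending a request, it can help to confirm that the two time strings parse and that the end time follows the start time. The snippet below is a minimal sketch using only the standard library; the helper name `window_seconds` and the format string are assumptions made for illustration and are not part of EQTransformer.

```python
from datetime import datetime

STIME = "2020-09-01 00:00:00.00"
ETIME = "2020-09-02 00:00:00.00"

# Format assumed from the strings above: date, space, time with fractional seconds.
FMT = "%Y-%m-%d %H:%M:%S.%f"

def window_seconds(start, end, fmt=FMT):
    """Return the length of the requested period in seconds.

    Raises ValueError if either string does not match fmt or if the
    period is empty or reversed.
    """
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(end, fmt)
    if t1 <= t0:
        raise ValueError("end time must be later than start time")
    return (t1 - t0).total_seconds()

duration = window_seconds(STIME, ETIME)
print(duration)  # 86400.0, i.e. one day
```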

This downloads information on the stations that match your search criteria. You can filter out networks or stations that you are not interested in. You can find the name of the appropriate client for your request here:

In [15]:
import os
json_basepath = os.path.join(os.getcwd(),"json/station_list.json")
makeStationList(json_path=json_basepath,
                  client_list=["IRIS"],
                  min_lat=None, max_lat=None, min_lon=None, max_lon=None, 
                  network=NET,
                  station=STA,
                  location=LOC,
                  channel=CHA,                      
                  start_time=STIME, 
                  end_time=ETIME,
                  filter_network=[])

CI--ARV
CI--BAK


A JSON file should have been created at the JSON path. It contains information for the available stations (2 stations in this case). Next, you can download the data for the available stations using the following function and script. This may take a few minutes.
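To check what was written, you can load the JSON file and list its entries. The sketch below assumes the file is a dictionary keyed by station code, with a "network" field in each entry; that layout is an assumption, so open your own file to confirm it. The helper `summarize_station_list` is hypothetical, and the stand-in file holds placeholder values so that the example runs without a download.

```python
import json
import os
import tempfile

def summarize_station_list(path):
    """Return sorted 'NET.STA' labels for every entry in a station-list JSON file."""
    with open(path) as f:
        stations = json.load(f)
    labels = []
    for name, info in stations.items():
        # "network" is an assumed key; fall back to "?" if it is absent.
        labels.append(f"{info.get('network', '?')}.{name}")
    return sorted(labels)

# Stand-in file with placeholder values (not real station metadata).
example = {
    "ARV": {"network": "CI", "channels": ["BHZ"]},
    "BAK": {"network": "CI", "channels": ["BHZ"]},
}
tmp_path = os.path.join(tempfile.mkdtemp(), "station_list.json")
with open(tmp_path, "w") as f:
    json.dump(example, f)

labels = summarize_station_list(tmp_path)
print(labels)  # ['CI.ARV', 'CI.BAK']
```

With the real file, pass `json_basepath` instead of `tmp_path`.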

### 2) Downloading the data

You can define multiple clients as the source:

In [22]:
downloadMseeds(client_list=["IRIS"], 
          stations_json=json_basepath, 
          output_dir="downloads_mseeds", 
          start_time=STIME, 
          end_time=ETIME, 
          min_lat=None, max_lat=None, min_lon=None, max_lon=None,
          chunk_size=1,
          channel_list=[],
          n_processor=1)

[2020-10-13 19:17:43,089] - obspy.clients.fdsn.mass_downloader - INFO: Initializing FDSN client(s) for IRIS.
[2020-10-13 19:17:43,093] - obspy.clients.fdsn.mass_downloader - INFO: Successfully initialized 1 client(s): IRIS.
[2020-10-13 19:17:43,096] - obspy.clients.fdsn.mass_downloader - INFO: Total acquired or preexisting stations: 0
[2020-10-13 19:17:43,097] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Requesting reliable availability.
####### There are 2 stations in the list. #######
[2020-10-13 19:17:44,003] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Successfully requested availability (0.91 seconds)
[2020-10-13 19:17:44,009] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Found 1 stations (3 channels).
[2020-10-13 19:17:44,010] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Will attempt to download data from 1 stations.
[2020-10-13 19:17:44,010] - obspy.clients.fdsn.mass_downloader - INFO: Client 'IRIS' - Status for 3 ti

The above downloads the continuous data (in either MiniSEED or SAC format) and saves them into individual folders for each station inside your defined output directory (i.e. downloads_mseeds).
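A quick way to confirm the download is to count the files in each station folder. The following is a minimal sketch; `files_per_station` is a hypothetical helper, and the demo tree with its `placeholder.mseed` files is a stand-in so that the example runs on its own. The real file names produced by the downloader will differ.

```python
import os
import tempfile

def files_per_station(output_dir):
    """Map each sub-folder of output_dir to the number of files directly inside it."""
    counts = {}
    for entry in sorted(os.listdir(output_dir)):
        folder = os.path.join(output_dir, entry)
        if os.path.isdir(folder):
            counts[entry] = sum(
                os.path.isfile(os.path.join(folder, name))
                for name in os.listdir(folder)
            )
    return counts

# Stand-in directory tree; station folders mirror the two stations used here.
demo_root = tempfile.mkdtemp()
for station in ("ARV", "BAK"):
    os.makedirs(os.path.join(demo_root, station))
    open(os.path.join(demo_root, station, "placeholder.mseed"), "w").close()

counts = files_per_station(demo_root)
print(counts)  # {'ARV': 1, 'BAK': 1}
```

To inspect the actual download, call `files_per_station("downloads_mseeds")`.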

# HDF5 maker

In [23]:
from EQTransformer.utils.hdf5_maker import preprocessor
help(preprocessor)


Help on function preprocessor in module EQTransformer.utils.hdf5_maker:

preprocessor(preproc_dir, mseed_dir, stations_json, overlap=0.3, n_processor=None)
    Performs preprocessing and partitions the continuous waveforms into 1-minute slices. 
    
    Parameters
    ----------
    preproc_dir: str
        Path of the directory where will be located the summary files generated by preprocessor step.
    
    mseed_dir: str
        Path of the directory where the mseed files are located. 
    
    stations_json: str
        Path to a JSON file containing station information.        
        
    overlap: float, default=0.3
        If set, detection, and picking are performed in overlapping windows.
           
    n_processor: int, default=None 
        The number of CPU processors for parallel downloading.         
    
    Returns
    ----------
    mseed_dir_processed_hdfs/station.csv: Phase information for the associated events in hypoInverse format. 
    
    mseed_dir_processed_h

In [25]:
preprocessor(preproc_dir="preproc",mseed_dir='downloads_mseeds', stations_json=json_basepath, overlap=0.3, n_processor=2)

 *** " downloads_mseeds " directory already exists!
  * ARV (1) .. 20200901 --> 20200902 .. 3 components .. sampling rate: 100.0
  * BAK (1) .. 20200901 --> 20200902 .. 3 components .. sampling rate: 100.0
 Station ARV had 1 chuncks of data
2056 slices were written, 2057.0 were expected.
Number of 1-components: 0. Number of 2-components: 0. Number of 3-components: 1.
Original samplieng rate: 100.0.
 Station BAK had 1 chuncks of data
2056 slices were written, 2057.0 were expected.
Number of 1-components: 0. Number of 2-components: 0. Number of 3-components: 1.
Original samplieng rate: 100.0.
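The slice count reported above can be checked with a little arithmetic. The sketch below assumes that consecutive 1-minute windows are shifted by 60 × (1 − overlap) = 42 seconds and that only windows lying fully inside the 24-hour record are kept; both points are assumptions about the slicing, not a description of the function's internals. Under those assumptions the count is 2056, which agrees with the number of slices written. How the function arrives at its "expected" figure is not covered here.

```python
WINDOW_S = 60          # slice length in seconds ("1-minute slices" in the help text)
OVERLAP = 0.3          # overlap fraction passed to preprocessor
RECORD_S = 24 * 3600   # one day of data, 2020-09-01 to 2020-09-02

def count_full_windows(record_s, window_s, overlap):
    """Count windows of window_s seconds that fit entirely within record_s seconds.

    Assumes each window starts round(window_s * (1 - overlap)) seconds
    after the previous one.
    """
    stride = round(window_s * (1 - overlap))
    if record_s < window_s:
        return 0
    return (record_s - window_s) // stride + 1

n_windows = count_full_windows(RECORD_S, WINDOW_S, OVERLAP)
print(n_windows)  # 2056
```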
