# Welcome to the NoisePy Colab Tutorial!

This tutorial will walk you through the basic steps of using NoisePy to compute ambient noise cross correlation functions.


First, we install the noisepy-seis package.

In [None]:
! pip install noisepy-seis --upgrade
# !pip uninstall noisepy-seis
# !conda install -n noisepy ipykernel --update-deps --force-reinstall
# !cd ..
# !pip install -e ".[dev]"
# !cd Jupyter_notebook

__Warning__: NoisePy uses ```obspy``` as a core Python module to manipulate seismic data. Restart the runtime now for proper installation of ```obspy``` on Colab.

Then we import the basic modules.

In [None]:
from noisepy.seis import download, cross_correlate, stack, plotting_modules
from noisepy.seis.asdfstore import ASDFRawDataStore, ASDFCCStore
from noisepy.seis.datatypes import ConfigParameters
import os
import glob

path = "/content/data" # for use in Colab
# path = "../../data" # for use locally
# print(path)

os.makedirs(path,exist_ok=True)
raw_data_path = os.path.join(path, "RAW_DATA")
cc_data_path = os.path.join(path, "CCF")
stack_data_path = os.path.join(path, "STACK")
config = ConfigParameters() # default config parameters which can be customized
config.inc_hours = 12
config.start_date = "2019_02_01_00_00_00"
config.end_date = "2019_02_02_00_00_00"

## Step 0: Download data

Use the function ```download``` with the following arguments:
* ```path```: where to put the data
* ```channel list```: list of the seismic channels to download; an example is shown below
* ```station list```: list of the seismic stations (we need to change this to net.sta.loc.chan); it can be "\*" (not "all")
* ```start time```: we need to change this to a datetime object
* ```end time```: we need to change this to a datetime object, or format it with a standard UTCDateTime
* ```inc_hours```: the increment in hours; this integer is used to split the original data (usually 1-day-long time series for broadband seismometers) into shorter time chunks, which helps manage memory for large arrays
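
To illustrate how ``inc_hours`` splits the requested time range, here is a minimal sketch that uses only the Python standard library. The helper ``time_chunks`` and the constant ``DATE_FMT`` are hypothetical names for this illustration and are not part of NoisePy; the date format is inferred from the date strings used in this tutorial.

```python
from datetime import datetime, timedelta

# Assumed format, inferred from strings such as "2019_02_01_00_00_00"
DATE_FMT = "%Y_%m_%d_%H_%M_%S"


def time_chunks(start_date, end_date, inc_hours):
    """Return (start, end) datetime pairs covering the range in inc_hours steps."""
    start = datetime.strptime(start_date, DATE_FMT)
    end = datetime.strptime(end_date, DATE_FMT)
    step = timedelta(hours=inc_hours)
    chunks = []
    while start < end:
        # The last chunk is clipped so it never runs past end_date
        chunks.append((start, min(start + step, end)))
        start += step
    return chunks


chunks = time_chunks("2019_02_01_00_00_00", "2019_02_02_00_00_00", 12)
for t0, t1 in chunks:
    # Label written in the same style as the .h5 file names used in this tutorial
    print(t0.strftime(DATE_FMT) + "T" + t1.strftime(DATE_FMT))
```

With one day of data and 12-hour increments, this yields two chunks, which is why the later cells open a file whose name spans the first 12 hours of the day.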


Some raw parameters are still hardcoded in the download function that calls S0A_download_ASDF_MPI.py.
I broadened the lat-long search; this could be a parameter in the config.


In [None]:
download(raw_data_path, ["BHE","BHN","BHZ"],["A*"], config)

List the files that were downloaded, just to make sure!

In [None]:
print(os.listdir(raw_data_path))

Plot the raw data to make sure it's noise!

In [None]:
file = os.path.join(raw_data_path, "2019_02_01_00_00_00T2019_02_01_12_00_00.h5")
# plot_waveform takes as input: filename, network, station, and freqmin, freqmax for a bandpass filter
plotting_modules.plot_waveform(file, 'CI', 'ADO', 0.01, 0.4)

## Step 1: Cross-correlation


In this step, we will compute the first cross-correlation functions with several configuration options. The default values are typical for regional seismic networks:

Window length:
* cc_len = 1800  #s, 30-min windows
* step = 450 #s, overlapping window
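
As a quick check on what these defaults imply, the sketch below counts how many windows fit in one 12-hour chunk. It assumes that ``step`` is the shift between the starts of consecutive windows; the helper ``count_windows`` is a hypothetical name for this illustration, not a NoisePy function.

```python
def count_windows(chunk_seconds, cc_len, step):
    """Number of full windows of length cc_len when the window start advances by step."""
    if chunk_seconds < cc_len:
        return 0
    return int((chunk_seconds - cc_len) // step) + 1


cc_len = 1800      # s, window length from the defaults above
step = 450         # s, assumed to be the shift between consecutive window starts
chunk = 12 * 3600  # s, one 12-hour chunk (inc_hours = 12 in this tutorial)

n_windows = count_windows(chunk, cc_len, step)
overlap = 1 - step / cc_len  # fraction of each window shared with the next one

print(n_windows, overlap)
```

Under that assumption, consecutive windows share three quarters of their samples, so shortening ``step`` increases the number of windows (and the compute cost) without adding independent data.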


Data Processing choices:
* **Temporal normalization**: an essential processing choice. NoisePy uses 3 types of normalization with the parameter ``time_norm``; these are entered as strings: 'no', 'rma', 'one_bit'. RMA runs a smoothing over the absolute amplitude to normalize the time series, with the argument ``smooth_N`` (in points).


* **Spectral normalization**: an essential processing choice. NoisePy uses 2 types of normalization here as well, entered as strings: 'rma' or not. One-bit whitening is missing.
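
To make the two temporal normalizations concrete, here is a minimal pure-Python sketch of one-bit and running-mean-absolute (RMA) normalization. It is an illustration under simple assumptions (a centered window of ``smooth_n`` points, truncated at the edges), not NoisePy's implementation; the function names ``one_bit`` and ``running_mean_abs`` are hypothetical.

```python
def one_bit(trace):
    """Keep only the sign of each sample (+1, -1, or 0)."""
    return [(x > 0) - (x < 0) for x in trace]


def running_mean_abs(trace, smooth_n):
    """Divide each sample by the mean absolute amplitude in a centered window."""
    half = smooth_n // 2
    out = []
    for i, x in enumerate(trace):
        # Window is truncated at the start and end of the trace
        window = trace[max(0, i - half): i + half + 1]
        weight = sum(abs(v) for v in window) / len(window)
        out.append(x / weight if weight > 0 else 0.0)
    return out


# Hypothetical trace: low-amplitude noise with a large transient in the middle
trace = [0.1, -0.2, 5.0, -4.0, 0.3, -0.1]

print(one_bit(trace))
print(running_mean_abs(trace, smooth_n=3))
```

Both approaches reduce the weight of large transients such as earthquakes: one-bit discards amplitude entirely, while RMA scales each sample by the local average amplitude.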


In [None]:
config.freq_norm = "rma"
raw_store = ASDFRawDataStore(raw_data_path) # Store for reading raw data
cc_store = ASDFCCStore(cc_data_path) # Store for writing CC data

# Print the configuration parameters. Some are chosen by default, but we can modify them.
print(config)

Perform the cross correlation

In [None]:
cross_correlate(raw_store, config, cc_store)
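
For intuition about what this step computes, the sketch below cross-correlates two short series directly in the time domain and recovers the delay between them. This is only a toy illustration and makes no claim about how NoisePy computes its cross-correlations internally; ``cross_correlate_lags`` and the sample values are hypothetical.

```python
def cross_correlate_lags(a, b, max_lag):
    """Return {lag: sum over t of a[t] * b[t + lag]} for lags in [-max_lag, max_lag]."""
    ccf = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(len(a)):
            # Skip samples where the shifted index falls outside b
            if 0 <= t + lag < len(b):
                total += a[t] * b[t + lag]
        ccf[lag] = total
    return ccf


# Hypothetical example: b is a copy of a delayed by 3 samples
a = [0.0, 1.0, -2.0, 3.0, -1.0, 0.5, 0.0, 0.0, 0.0, 0.0]
delay = 3
b = [0.0] * delay + a[:-delay]

ccf = cross_correlate_lags(a, b, max_lag=5)
best_lag = max(ccf, key=ccf.get)  # lag with the largest correlation value
print(best_lag)
```

The lag of the correlation peak matches the delay between the two series, which is the property that lets cross-correlations of ambient noise between two stations carry travel-time information.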

Plot a single set of cross-correlations.

In [None]:
file = os.path.join(cc_data_path, '2019_02_01_00_00_00T2019_02_01_12_00_00.h5')
plotting_modules.plot_substack_cc(file,0.1,1,200,False)

## Step 2: Stack the cross-correlations

Provide a path to where the data is.

In [None]:
stations = raw_store.get_station_list()
print(stations)
stack(stations, cc_data_path, stack_data_path, "linear")
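
As a rough picture of what a linear stack does, the sketch below averages several cross-correlation functions sample by sample. It assumes that "linear" means a plain mean over the individual functions; ``linear_stack`` and the sample values are hypothetical and not taken from NoisePy.

```python
def linear_stack(ccfs):
    """Sample-wise mean of equal-length cross-correlation functions."""
    n = len(ccfs)
    return [sum(samples) / n for samples in zip(*ccfs)]


# Hypothetical substacks: a common signal [0, 1, 0] plus small random-looking noise
substacks = [
    [0.2, 1.1, -0.1],
    [-0.2, 0.9, 0.1],
    [0.0, 1.0, 0.0],
]

stacked = linear_stack(substacks)
print(stacked)
```

Averaging keeps the part of the signal that is coherent across time windows and suppresses the incoherent part, which is why stacking more windows generally gives cleaner cross-correlations.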

List the cross-correlation and stack files, then plot the stacks.

In [None]:
print(os.listdir(cc_data_path))
print(os.listdir(stack_data_path))

In [None]:
files = glob.glob(os.path.join(stack_data_path, '**/*.h5'))
print(files)
plotting_modules.plot_all_moveout(files, 'Allstack_linear', 0.1, 0.2, 'ZZ', 1)