# Doppler Drift Searching with TurboSETI

By now, you should have:

1. Completed the three Voyager notebooks.
2. Completed the `setigen` notebooks, and created some interesting SETI signals. 
3. Found another person with whom to swap SETI signals.

Next, you and your partner will each try to detect the SETI signals that the other generated using `setigen`.


In [None]:
from blimpy import Waterfall

import turbo_seti.find_doppler.seti_event as turbo
import turbo_seti.find_event as find
from turbo_seti.find_doppler.find_doppler import FindDoppler
from turbo_seti.find_event.find_event_pipeline import find_event_pipeline
from turbo_seti.find_event.plot_event_pipeline import plot_event_pipeline

import os
import glob

%matplotlib inline

### Provide the input and output directories:

You will be searching the `setigen` output files generated by your partner, so set `data_subdirectory` to the subdirectory that contains your partner's files.

In [None]:
base_directory = '/scratch/bluse'
data_subdirectory = 'jupyter-dmacmahon/sg1'
input_directory = os.path.join(base_directory, data_subdirectory)
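# Fall back to USER if LOGNAME is unset, so os.path.join never receives None.
user_name = os.getenv('LOGNAME') or os.getenv('USER', 'unknown')
output_directory = os.path.join(base_directory, user_name, data_subdirectory)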
os.makedirs(output_directory, exist_ok=True)

### List the files:

TurboSETI expects the input files to be in the same directory as the output files. To satisfy this without copying any data, we create symlinks in the output directory that point to the original files in the input directory.

In [None]:
def generate_symlinks(input_directory, output_directory, extension):
    """Create symlinks in the output directory for all files in the
    input directory that have the given extension.
    """
    for src_path in glob.glob(os.path.join(input_directory, f'*.{extension}')):
        symlink_name = os.path.join(output_directory, os.path.basename(src_path))
        # lexists also detects an existing broken symlink, which exists() would miss.
        if not os.path.lexists(symlink_name):
            os.symlink(src_path, symlink_name)

generate_symlinks(input_directory, output_directory, 'h5')
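
To confirm that the links were created and resolve to real files, a quick check along the following lines may help. This is a minimal sketch that uses only the standard library; `check_symlinks` is a hypothetical helper (not part of TurboSETI), and the demonstration runs in a temporary directory with a made-up file name rather than on the real data.

```python
import glob
import os
import tempfile


def check_symlinks(directory, extension):
    """Return (link name, target, target exists) for each symlink found."""
    report = []
    for path in sorted(glob.glob(os.path.join(directory, f'*.{extension}'))):
        if os.path.islink(path):
            target = os.readlink(path)
            # os.path.exists follows the link, so it is False for a broken link.
            report.append((os.path.basename(path), target, os.path.exists(path)))
    return report


# Demonstration in a throwaway directory (file names are illustrative only).
with tempfile.TemporaryDirectory() as tmp:
    src_dir = os.path.join(tmp, 'input')
    dst_dir = os.path.join(tmp, 'output')
    os.makedirs(src_dir)
    os.makedirs(dst_dir)
    src = os.path.join(src_dir, 'example.h5')
    open(src, 'w').close()
    os.symlink(src, os.path.join(dst_dir, 'example.h5'))
    demo_report = check_symlinks(dst_dir, 'h5')
    for name, target, ok in demo_report:
        print(name, '->', target, 'OK' if ok else 'BROKEN')
```

On the real data, calling `check_symlinks(output_directory, 'h5')` would list each link created above.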

TurboSETI's event pipelines read their inputs from text files that list one path per line, with the extension `.lst`. Here, we generate such a file for the `.h5` symlinks we just created; it is passed to `plot_event_pipeline` later in this notebook:

In [None]:
def generate_list_file(output_directory, extension):
    """Generate a .lst file for files in the given output directory with the given 
    extension.
    """
    h5_list = sorted(glob.glob(os.path.join(output_directory, f'*.{extension}')))  
    with open(os.path.join(output_directory, f'{extension}_files.lst'), 'w') as f:
        for h5_path in h5_list:
            f.write(h5_path + '\n')
    return h5_list
            
h5_list = generate_list_file(output_directory, 'h5')

We can check the order of the files by printing them out:

In [None]:
# Check the order of files is correct:
with open(os.path.join(output_directory, 'h5_files.lst'), 'r') as f:
    print(f.read())
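
The check is worth doing because `generate_list_file` orders the files with `sorted`, which compares names character by character. As an illustration with hypothetical file names (not your partner's actual files), unpadded numeric suffixes sort in a surprising order, and a natural-sort key is one possible workaround:

```python
import re


def natural_key(name):
    """Split a name into text and integer parts so that 2 sorts before 10."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r'(\d+)', name)]


# Hypothetical file names, for illustration only.
names = ['obs_2.h5', 'obs_10.h5', 'obs_1.h5']

lexicographic = sorted(names)
natural = sorted(names, key=natural_key)

print(lexicographic)  # ['obs_1.h5', 'obs_10.h5', 'obs_2.h5']
print(natural)        # ['obs_1.h5', 'obs_2.h5', 'obs_10.h5']
```

The key assumes that all names share the same pattern of text and digits. Zero-padding the numbers when the files are generated avoids the issue altogether.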

### TurboSETI
We are ready to run TurboSETI's narrowband Doppler drift search! You can experiment with parameters such as `max_drift` (the maximum drift rate searched, in Hz/s) and `snr` (the minimum signal-to-noise ratio for a hit) when trying to detect your partner's signals.

In [None]:
for h5_file in h5_list:
    doppler = FindDoppler(
        h5_file,
        max_drift=4,  # Maximum drift rate searched = 4 Hz/second
        snr=8,  # Minimum signal-to-noise ratio = 8:1
        out_dir=output_directory,  # This is where the turboSETI output files will be stored.
    )
    doppler.search()
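
To get a feel for what a given `max_drift` means, it can help to work out how far a signal moves in frequency over one observation. The arithmetic below is a rough sketch: the 300-second observation length and 2.79 Hz channel width are assumed values chosen for illustration, so substitute the values from your partner's `setigen` frames. The helper names are hypothetical.

```python
def total_drift_hz(drift_rate_hz_per_s, duration_s):
    """Total frequency change of a linearly drifting signal over an observation."""
    return drift_rate_hz_per_s * duration_s


def channels_crossed(drift_rate_hz_per_s, duration_s, channel_width_hz):
    """Number of frequency channels the signal sweeps across."""
    return abs(total_drift_hz(drift_rate_hz_per_s, duration_s)) / channel_width_hz


# Assumed values, for illustration only.
obs_duration_s = 300.0
channel_width_hz = 2.79

for rate in (0.5, 1.0, 4.0):
    drift = total_drift_hz(rate, obs_duration_s)
    n_chan = channels_crossed(rate, obs_duration_s, channel_width_hz)
    print(f'{rate} Hz/s -> {drift:.0f} Hz total, about {n_chan:.0f} channels')
```

If your partner injected a signal whose drift rate exceeds `max_drift`, the search will not cover it, so raising `max_drift` is one thing to try when an expected signal is missing.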

Before proceeding, we need a second `.lst` file, this time for the `.dat` outputs of the drift search we just ran.

In [None]:
dat_list = generate_list_file(output_directory, 'dat')

# Check the order of files is correct:
with open(os.path.join(output_directory, 'dat_files.lst'), 'r') as f:
    print(f.read())

### find_event_pipeline:

Use `find_event_pipeline` to decide which events should be selected as candidates for a human to inspect.

Recall the filter thresholds available:

a) `filter_threshold=1`: Returns hits that meet the SNR and drift rate criteria. Does not consider any ON/OFF cadence checks. 

b) `filter_threshold=2`: Reports events that are present in at least one ON observation and absent in all of the OFF observations.

c) `filter_threshold=3`: Reports events that are present in all the ON observations and absent in all of the OFF observations.
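
The three thresholds can be read as nested conditions on where a hit appears within the cadence. The sketch below is a simplified illustration of that logic in plain Python, not TurboSETI's implementation. `highest_threshold_passed` and the boolean lists are hypothetical, and each hit is assumed to have already met the SNR and drift rate criteria.

```python
def highest_threshold_passed(present_in_on, present_in_off):
    """Return the highest filter threshold (1, 2 or 3) that a hit would pass.

    present_in_on:  one boolean per ON observation (True if the hit is seen).
    present_in_off: one boolean per OFF observation (True if the hit is seen).
    """
    level = 1  # Assumed to have met the SNR and drift rate criteria already.
    if any(present_in_on) and not any(present_in_off):
        level = 2  # Seen in at least one ON and in none of the OFFs.
        if all(present_in_on):
            level = 3  # Seen in every ON and in none of the OFFs.
    return level


# A three-ON, three-OFF cadence with made-up detections.
print(highest_threshold_passed([True, True, True], [False, False, False]))   # 3
print(highest_threshold_passed([True, False, True], [False, False, False]))  # 2
print(highest_threshold_passed([True, True, True], [False, True, False]))    # 1
```

A hit that also appears in an OFF observation stays at level 1, which is why `filter_threshold=1` is the most permissive choice when first checking whether your partner's signals were detected at all.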

In [None]:
# Select a filter threshold:
FILTER = 1

# Name for output .csv file
csv_path = os.path.join(output_directory, 'found_event_table.csv')

find_event_pipeline(os.path.join(output_directory, 'dat_files.lst'), 
                    filter_threshold = FILTER, 
                    number_in_cadence = len(dat_list), 
                    csv_name = csv_path, 
                    saving = True)

### Plotting the results: plot_event_pipeline
Finally, plot the events found by `find_event_pipeline`:

In [None]:
plot_event_pipeline(csv_path, # full path of the CSV file built by find_event_pipeline()
                    os.path.join(output_directory, 'h5_files.lst'), # full path of text file containing the list of .h5 files
                    filter_spec = FILTER, # filter threshold selected earlier
                    user_validation = False) # Non-interactive