# KBMOD Results and Filtering  
  
This notebook demonstrates the basic functionality for loading and filtering results. KBMOD provides the ability to load results into a ``ResultList`` data structure and then apply a sequence of filters to those results. New filters can be defined by inheriting from the ``Filter`` class.
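
As a rough illustration of what defining a new filter might look like, here is a minimal, self-contained sketch. It does not import KBMOD: the ``Filter`` class below is a stand-in, and its ``keep_row`` method is an assumption about the interface (only ``get_filter_name`` appears later in this notebook). ``FakeRow`` and ``MinFluxFilter`` are hypothetical names used only for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


class Filter(ABC):
    """Stand-in for KBMOD's Filter base class (assumed interface)."""

    @abstractmethod
    def get_filter_name(self):
        """Return a string label identifying this filter."""

    @abstractmethod
    def keep_row(self, row):
        """Return True if the row should be kept (assumed method name)."""


@dataclass
class FakeRow:
    """Hypothetical stand-in for a ResultRow with two attributes."""

    flux: float
    obs_count: int


class MinFluxFilter(Filter):
    """Hypothetical filter that keeps rows with flux >= min_flux."""

    def __init__(self, min_flux):
        self.min_flux = min_flux

    def get_filter_name(self):
        return f"MinFlux_{self.min_flux}"

    def keep_row(self, row):
        return row.flux >= self.min_flux


rows = [FakeRow(12.0, 8), FakeRow(3.5, 10), FakeRow(20.0, 5)]
flux_filter = MinFluxFilter(10.0)

# Keep only the rows that pass the filter's test.
kept = [row for row in rows if flux_filter.keep_row(row)]
print(flux_filter.get_filter_name(), len(kept))
```

Encoding the parameter in the filter name is a design choice for this sketch; it keeps two instances of the same filter with different thresholds distinguishable.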

# Setup
Before importing, make sure you have installed kbmod using `pip install .` in the root directory.  Also be sure you are running with python3 and using the correct notebook kernel.

In [None]:
# everything we will need for this demo
from kbmod.filters.stats_filters import LHFilter, NumObsFilter
from kbmod.result_list import load_result_list_from_files, ResultList
import matplotlib.pyplot as plt
import numpy as np

# Load the results

We use the fake result data provided in ``data/fake_results_noisy``, which was generated from 256 x 256 images with multiple fake objects inserted. KBMOD was run with wider-than-normal filter parameters so as to produce a noisy set of results.

In [None]:
results = load_result_list_from_files("../data/fake_results_noisy/", "DEMO")
print(f"Loaded {results.num_results()} results.")

# Turn on filtered result tracking.
results.track_filtered = True

# Show the first five results.
for i in range(5):
    print(results.results[i].trajectory)

# Sorting Results

We can sort the results by any of the attributes of a ``ResultRow`` in either increasing or decreasing order.

In [None]:
results.sort(key="obs_count")
print("Top 5 by observation count:")
for i in range(5):
    print(results.results[i].trajectory)

print("\nBottom 5 by flux:")
results.sort(key="flux", reverse=False)
for i in range(5):
    print(results.results[i].trajectory)

# Return to sorted by decreasing likelihood.
results.sort(key="final_likelihood")

# Filtering

First, we create a filter based on each result's likelihood and apply it to the result set.

In [None]:
# Filter out all results that have a likelihood < 40.0.
filter1 = LHFilter(40.0, None)
print(f"Applying {filter1.get_filter_name()}")
results.apply_filter(filter1)
print(f"{results.num_results()} results remaining.")

We can look at the rows that passed the filter. These are stored in the ``ResultList``'s ``results`` list. 

In [None]:
for i in range(5):
    print(results.results[i].trajectory)

Because we set ``results.track_filtered = True`` above, the ``ResultList`` also keeps each row that was rejected by one of the filters. These rows are indexed by the filter name, allowing the user to determine which rows were removed during which filtering stage. 
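
The bookkeeping described above can be pictured with a toy model. This is only a sketch of the idea, not KBMOD's implementation: ``TinyResultList`` is a hypothetical class, its rows are plain likelihood values, and its ``apply_filter`` takes a name and a predicate rather than a filter object.

```python
from collections import defaultdict


class TinyResultList:
    """Toy stand-in that mimics filtered-row tracking (not KBMOD's ResultList)."""

    def __init__(self, rows, track_filtered=False):
        self.results = list(rows)
        self.track_filtered = track_filtered
        # Maps a filter name to the rows that filter rejected.
        self.filtered = defaultdict(list)

    def apply_filter(self, name, keep_row):
        kept = []
        for row in self.results:
            if keep_row(row):
                kept.append(row)
            elif self.track_filtered:
                self.filtered[name].append(row)
        self.results = kept

    def get_filtered(self, name):
        # Return a copy so callers cannot modify the internal record.
        return list(self.filtered.get(name, []))


likelihoods = [55.0, 12.0, 41.5, 39.9, 80.2]
toy = TinyResultList(likelihoods, track_filtered=True)

# Apply the same kind of filter twice with different thresholds.
toy.apply_filter("LH_40", lambda lh: lh >= 40.0)
toy.apply_filter("LH_50", lambda lh: lh >= 50.0)

print("Remaining:", toy.results)
print("Rejected by LH_40:", toy.get_filtered("LH_40"))
print("Rejected by LH_50:", toy.get_filtered("LH_50"))
```

Because each rejected row is stored under the name of the filter that removed it, the two stages can be inspected separately.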

We can use the ``get_filtered`` function to retrieve all the filtered rows for a given filter name:

In [None]:
# Extract the rows that did not pass filter1.
filtered_list = results.get_filtered(filter1.get_filter_name())
for i in range(5):
    print(filtered_list[i].trajectory)

We can apply multiple filters to the ``ResultList`` to progressively rule out more and more candidate trajectories. We can even apply the same filter with different parameters.

Next we apply the ``NumObsFilter`` to filter out anything with fewer than 10 observations:

In [None]:
# Filter out all results that have fewer than 10 observations.
filter2 = NumObsFilter(10)
print(f"Applying {filter2.get_filter_name()}")
results.apply_filter(filter2)
print(f"{results.num_results()} results remaining.")

To visualize the effect of filtering, we can plot one of the unfiltered stamps and one of the filtered stamps. Note that we retrieve the rows rejected by the ``LHFilter`` by calling ``get_filtered`` with that filter's name.

In [None]:
fig, axs = plt.subplots(1, 2)

unfiltered_stamp = np.array(results.results[0].stamp).reshape([21, 21])
axs[0].imshow(unfiltered_stamp, cmap="gray")
axs[0].set_title("Unfiltered Stamp")

filtered_list2 = results.get_filtered(filter1.get_filter_name())
filtered_stamp = np.array(filtered_list2[0].stamp).reshape([21, 21])
axs[1].imshow(filtered_stamp, cmap="gray")
axs[1].set_title("Filtered Stamp")

### Reverting filters

As long as we have ``track_filtered`` turned on, we can undo any of the filtering steps. This appends the previously filtered results to the end of the list (and thus does not preserve ordering). However, we can always re-sort if needed.
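
To see why reverting does not preserve ordering, consider this small sketch with plain Python lists. The rows are hypothetical ``(likelihood, obs_count)`` tuples and the filter name ``"NumObs_10"`` is made up for the example; none of this uses KBMOD itself.

```python
# Rows as (likelihood, obs_count), initially sorted by decreasing likelihood.
original = [(80.2, 12), (55.0, 6), (41.5, 11), (39.9, 15), (12.0, 9)]

# Filter on observation count, tracking the rejected rows by filter name.
filter_name = "NumObs_10"
remaining = [row for row in original if row[1] >= 10]
filtered = {filter_name: [row for row in original if row[1] < 10]}

# Reverting appends the rejected rows to the end of the list.
remaining.extend(filtered.pop(filter_name))
print("After revert:", remaining)

# The list is no longer ordered by likelihood, so re-sort it.
remaining.sort(key=lambda row: row[0], reverse=True)
print("After re-sort:", remaining)
```

After the revert, the row with likelihood 55.0 sits behind the row with likelihood 39.9; the re-sort restores the original order.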

In [None]:
results.revert_filter(filter1.get_filter_name())
print(f"{results.num_results()} results remaining.")

# Outputting Results

In addition to the "many files" format provided in the original KBMOD (and used to load the demo files), we can output the results data to a single YAML string or an AstroPy Table.

### YAML String

When serializing to a YAML string we can either save the entire ``ResultList`` (including the filtered rows) or just the unfiltered rows. To save space, the default is to serialize just the unfiltered rows.

In [None]:
# Serialize the unfiltered results
yaml_str_a = results.to_yaml()
print(f"Unfiltered is serialized to a string of length {len(yaml_str_a)}")

# Serialize the entire data structure
yaml_str_b = results.to_yaml(serialize_filtered=True)
print(f"Full data structure is serialized to a string of length {len(yaml_str_b)}")

We can send this YAML string to another machine or save it for later analysis. We can reload the ``ResultList`` directly from the YAML string.

In [None]:
loaded_results = ResultList.from_yaml(yaml_str_b)
print("Results loaded:")
print(f" * {loaded_results.num_results()} unfiltered rows.")
print(f" * {len(loaded_results.get_filtered())} filtered rows.")

### AstroPy Tables

Users may want to interact with the results data in a more familiar table format. We support exporting the ``ResultList`` as an astropy ``Table``. Note that the table format will not enforce consistency across columns. For example, changing the psi and phi curves in a table will not update the likelihoods. For this reason, it is recommended that you only export the table when you have completed the per-row operations.

In [None]:
results.to_table()