## This notebook accompanies the trial_detection_Crab.py file included in this directory and adds further comments.

### Please note that running this whole analysis will produce ~2.5 TB of data. There are ways to reduce this amount, but the brute-force analysis conducted here does not attempt to save disk space. Also take note of the parallelized mosaic analysis: each parallel process can take ~10 GB of memory, so set the number of processes to a value that will not overload your computer.

We will go through the code to produce Figure 3 of the associated BatAnalysis paper. 

First, we need to import the relevant packages. 

In [1]:
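import glob
import os
import pickle  # used below to save and reload the results dictionaries
import sys
import batanalysis as ba
import matplotlib.pyplot as plt
import numpy as np
import astropy.units as u
from astropy.time import Time, TimeDelta
from astropy.io import fits
from pathlib import Path
import swiftbat
import swiftbat.swutil as sbu  # provides met2datetime, used in the final plot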





Now we need to set our data directory, which will contain all of our BAT survey data. 

In [2]:
newdir = Path("/Users/tparsota/Documents/CRAB_SURVEY_DATA")
ba.datadir(newdir, mkdir=True)

PosixPath('/Users/tparsota/Documents/CRAB_SURVEY_DATA')

Next, we query HEASARC for the observation IDs of pointings in which the coordinates of the Crab were in the BAT FOV during the dates when the data for the 22 month survey were accumulated. We also require that the exposure of the Crab to the BAT detector plane was $> 1000$ cm$^2$. 

In [2]:
object_name='Crab_Nebula_Pulsar'

#use swiftbat to create a bat source object
object_location = swiftbat.simbadlocation("Crab")
object_batsource = swiftbat.source(ra=object_location[0], dec=object_location[1], name=object_name)
table_everything, query = ba.from_heasarc(time_range=Time(["2004-12-15","2006-10-27"]), return_query=True)
minexposure = 1000     # cm^2 after cos adjust

#calculate the exposure with partial coding
exposures = u.Quantity(
    [object_batsource.exposure(ra=row["ra"],
                               dec=row["dec"],
                               roll=row["roll_angle"])[0]
        for row in table_everything
    ])

#select the observations that have greater than the minimum desired exposure
table_exposed = table_everything[exposures.value > minexposure]

To download the data, we would then do:
```
result = ba.download_swiftdata(table_exposed)
obs_ids=[i for i in table_exposed['obsid'] if result[i]['success']]
```

### OR

If we were continuing our analysis at some later point, we would do:
```
obs_ids=[i.name for i in sorted(ba.datadir().glob("*")) if i.name.isnumeric()]
```
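As a minimal sketch of what the line above selects, the following builds a throwaway directory and applies the same numeric-name filter. The helper name `find_obs_ids` and the directory entries are made up for illustration; only the `glob`/`isnumeric` logic comes from the notebook.

```python
import tempfile
from pathlib import Path


def find_obs_ids(data_dir):
    """Return the sorted names of entries in data_dir whose names are purely numeric."""
    return [p.name for p in sorted(Path(data_dir).glob("*")) if p.name.isnumeric()]


# Build a temporary directory that mimics a survey data directory.
# The observation ID names below are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["00035021001", "00030793012", "PATTERN_MAPS", "notes"]:
        (Path(tmp) / name).mkdir()
    demo_obs_ids = find_obs_ids(tmp)

print(demo_obs_ids)
```

Non-numeric entries such as a pattern map directory are skipped, which is why the notebook can keep other folders alongside the observation ID folders.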

With the data downloaded and the observation IDs obtained, we can now analyze the BAT survey data. We specify that we want the BAT images to be cleaned of bright sources that have been detected at SNR=6. We also specify that all sources in the BatAnalysis or custom catalogs that have the column value ```ALWAYS_CLEAN==T``` should be cleaned as well.

We then define the directory where all the pattern maps live on our system. Here, my pattern map directory lies outside of where my BAT survey data have been downloaded, and I want to include these pattern maps in my analyses for the computation of mosaic images later. Thus, I need to specify where these pattern noise maps live. If I do not, the pattern noise maps will not be included in the analyses and any mosaic images created later will suffer from this buildup of noise. 

In [None]:
input_dict=dict(cleansnr=6,cleanexpr='ALWAYS_CLEAN==T')
noise_map_dir=Path("/Users/tparsota/Documents/PATTERN_MAPS/")
batsurvey_obs=ba.parallel.batsurvey_analysis(obs_ids, input_dict=input_dict, patt_noise_dir=noise_map_dir, nprocs=20)

Now, after all of the survey data have been processed, we can do a batch calculation of the pha files, the drm files, and subsequently the spectral fitting for each observation and pointing ID. By default, the model that I am fitting to the BAT survey spectra is a ```cflux*po``` model from 14-195 keV. If the model parameters are not well constrained, or if the Crab is not detected at a level of 3$\sigma$ above the background noise level, then the function automatically tries to place 5$\sigma$ upper limits on the detection of the source. The photon index of the power law that is fitted to the spectrum to obtain the 5$\sigma$ upper limits is explicitly set to 2.15, since this is the spectral index we expect for the Crab. 

In the line below I set ```recalc=True```, which is useful when the user wants to completely redo an operation using different input parameters. For example, I can run the line below using the defaults described above, but if I want to change the model that is fitted, the level of detection necessary for automatically calculating upper limits, or anything else, I simply set ```recalc=True``` and pass in the appropriate values, and the ```batsurvey_obs``` objects will be updated accordingly. 

In [None]:
batsurvey_obs=ba.parallel.batspectrum_analysis(batsurvey_obs, object_name, ul_pl_index=2.15, recalc=True,nprocs=14)
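The fallback rule described above (keep the fit when the source is detected and the parameters are constrained, otherwise compute an upper limit) can be sketched in a few lines. This is only an illustration of the decision logic as described in the text: the function name, its arguments, and the returned labels are hypothetical and are not part of the BatAnalysis API.

```python
def classify_measurement(snr, params_constrained, detection_sigma=3.0):
    """Decide whether a spectral fit is kept or replaced by an upper limit.

    snr                -- detection significance of the source (hypothetical input)
    params_constrained -- True if the fitted model parameters are well constrained
    detection_sigma    -- detection threshold, 3 sigma in the text above
    """
    if snr >= detection_sigma and params_constrained:
        return "fit"
    # Either the source is too faint or the fit is unconstrained,
    # so an upper limit would be computed instead.
    return "upper_limit"


print(classify_measurement(12.0, True))
print(classify_measurement(2.1, True))
print(classify_measurement(8.0, False))
```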

To get a quick glimpse of our results, we can use a convenience function to plot the various values of interest for the Crab. 

The values that can be passed in are dictionary keys associated with the observation that can be accessed from the ```BatSurvey``` objects within the ```batsurvey_obs``` list. Some of these values are shown in the line below.

In [None]:
fig, axes=ba.plot_survey_lc(batsurvey_obs, id_list=object_name, time_unit="UTC", values=["rate","snr", "flux", "PhoIndex", "exposure"])

### Next we want to do the mosaicing analysis

In order to do this step, we first have to group together all the BAT survey observations that we want to include in this step in our analysis. In most cases this is going to be all of our BAT survey observations. 

In [None]:
outventory_file=ba.merge_outventory(batsurvey_obs)

If we wanted to continue our calculations from a point later on in our analysis, we can simply skip the above line and do:
```
outventory_file=Path("./path/to/outventory_all.fits")
```
since the above cell simply creates a fits file with all the BAT survey observations that we want included in the analysis and returns the full path to the file. 

Next, we need to define the time bins for which we will create mosaic images and analyze them. To test our code, we will use the same 1 month binning as the 22 month survey paper used. We specify the ```end_datetime``` explicitly but do not pass in a ```start_datetime``` value. This is because the ```start_datetime``` value is automatically set to the date of the first BAT survey observation rounded down to a whole ```timedelta``` value (i.e., the earliest BAT survey date is floored to the start of its month in this case).

In [None]:
time_bins=ba.group_outventory(outventory_file, np.timedelta64(1, "M"), end_datetime=Time("2006-10-27"))
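To make the flooring behavior concrete, here is a small standard-library sketch that floors a first observation date to the start of its month and steps forward one month at a time. It is an illustration under assumptions: the helper names are hypothetical, the first observation date is taken from the start of the query range used earlier, and how `group_outventory` treats the final bin edge is not something this sketch claims to reproduce.

```python
from datetime import date


def month_floor(d):
    """Floor a date to the first day of its month."""
    return d.replace(day=1)


def next_month(d):
    """Return the first day of the month following d."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)


def monthly_edges(first_obs, end):
    """Month-aligned bin edges from the floored first observation until end is covered."""
    edges = [month_floor(first_obs)]
    while edges[-1] < end:
        edges.append(next_month(edges[-1]))
    return edges


edges = monthly_edges(date(2004, 12, 15), date(2006, 10, 27))
print(edges[0], edges[-1], len(edges) - 1)
```

In this sketch the first edge lands on 2004-12-01 even though the assumed first observation is mid-month, which is the flooring behavior described above.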

Now we can do the mosaicing calculation itself with the call below, which returns a list of mosaics, one for each monthly time bin, along with the total "time-integrated" mosaic.

In [None]:
mosaic_list, total_mosaic=ba.parallel.batmosaic_analysis(batsurvey_obs, outventory_file, time_bins, nprocs=8)
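Since each parallel mosaic process can take ~10 GB of memory, it can help to derive `nprocs` from the memory that is actually available. The helper below is a rough sketch: its name, the memory reserve, and the example numbers are assumptions for illustration, and only the ~10 GB per process figure comes from the note at the top of this notebook.

```python
import os


def safe_nprocs(available_gb, per_process_gb=10.0, reserve_gb=4.0, cpu_count=None):
    """Estimate a process count that fits in memory.

    available_gb   -- memory you are willing to dedicate to the analysis
    per_process_gb -- approximate footprint of one mosaic process (~10 GB)
    reserve_gb     -- headroom left for the rest of the system (assumed value)
    cpu_count      -- number of cores to cap at; defaults to os.cpu_count()
    """
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    by_memory = int((available_gb - reserve_gb) // per_process_gb)
    # Always run at least one process, and never more than the core count.
    return max(1, min(cpus, by_memory))


print(safe_nprocs(96, cpu_count=16))
```

The result could then be passed as the `nprocs` argument in the call above.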

To analyze the mosaic images we simply use the same call as we did for the BAT Survey data.

In [None]:
mosaic_list=ba.parallel.batspectrum_analysis(mosaic_list, object_name, ul_pl_index=2.15, nprocs=11)
total_mosaic=ba.parallel.batspectrum_analysis(total_mosaic, object_name, ul_pl_index=2.15, use_cstat=False, nprocs=1)

And to plot our values of interest for each month, we would do:

In [None]:
fig, axes=ba.plot_survey_lc(mosaic_list, id_list=object_name, time_unit="UTC", values=["rate","snr", "flux", "PhoIndex", "exposure"])

If we wanted to see the BAT survey data alongside the mosaic data, we would then do:

In [None]:
fig, axes=ba.plot_survey_lc([batsurvey_obs,mosaic_list], id_list=object_name, time_unit="UTC", values=["rate","snr", "flux", "PhoIndex", "exposure"], same_figure=True)

To also do the weekly mosaic analysis, we would do:
```
outventory_file_weekly=ba.merge_outventory(batsurvey_obs, savedir=Path('./weekly_mosaiced_surveyresults/'))
time_bins_weekly=ba.group_outventory(outventory_file_weekly, np.timedelta64(1, "W"), start_datetime=Time("2004-12-01"), end_datetime=Time("2006-10-27"))
weekly_mosaic_list, weekly_total_mosaic=ba.parallel.batmosaic_analysis(batsurvey_obs, outventory_file_weekly, time_bins_weekly, nprocs=8)

weekly_mosaic_list=ba.parallel.batspectrum_analysis(weekly_mosaic_list, object_name, recalc=True, nprocs=11)
weekly_total_mosaic=ba.parallel.batspectrum_analysis(weekly_total_mosaic, object_name, recalc=True, use_cstat=False, nprocs=1)

```

To save disk space, it is possible to construct only the weekly mosaics and then combine them to produce the monthly mosaics that we obtained in the prior few cells. This operation is a bit more advanced and relies on the `merge_mosaics` function.


We can now save our survey/mosaic results by doing:

In [None]:
all_data=ba.concatenate_data(batsurvey_obs, object_name, ["met_time", "utc_time", "exposure", "rate","rate_err","snr", "flux", "PhoIndex"])
with open('all_data_dictionary.pkl', 'wb') as f:
    pickle.dump(all_data, f)

all_data_monthly=ba.concatenate_data(mosaic_list, object_name, ["user_timebin/met_time", "user_timebin/utc_time", "user_timebin/met_stop_time", "user_timebin/utc_stop_time", "rate","rate_err","snr", "flux", "PhoIndex"])
with open('monthly_mosaic_dictionary.pkl', 'wb') as f:
    pickle.dump(all_data_monthly, f)

#If the weekly analysis has been completed, the next few lines can be uncommented:
#all_data_weekly=ba.concatenate_data(weekly_mosaic_list, object_name, ["user_timebin/met_time", "user_timebin/utc_time", "user_timebin/met_stop_time", "user_timebin/utc_stop_time", "rate","rate_err","snr", "flux", "PhoIndex"])
#with open('weekly_mosaic_dictionary.pkl', 'wb') as f:
#    pickle.dump(all_data_weekly, f)
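The saved dictionaries can be read back in a later session with `pickle.load`. The sketch below round-trips a small stand-in dictionary through a temporary file; the numerical values are invented for illustration, and the nesting (object name first, then the requested keys) follows how the plotting cell later indexes the data.

```python
import pickle
import tempfile
from pathlib import Path

# A stand-in for the output of the concatenation step; the values are made up.
demo = {"Crab_Nebula_Pulsar": {"rate": [0.04, 0.05], "snr": [10.0, 12.0]}}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "demo_dictionary.pkl"

    # Save the dictionary in binary mode, as in the cell above.
    with open(path, "wb") as f:
        pickle.dump(demo, f)

    # Reload it, as the plotting cell does for each saved file.
    with open(path, "rb") as f:
        restored = pickle.load(f)

data = restored["Crab_Nebula_Pulsar"]
print(sorted(data.keys()))
```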


Now, we can create the main Crab Pulsar Nebula plot in the manuscript:

In [None]:
energy_range=None
time_unit="MET"
values=["rate","snr", "flux", "PhoIndex"]


survey_obsid_list=["all_data_dictionary","monthly_mosaic_dictionary"]

#if the weekly mosaic dictionary was saved then the following line can be uncommented
#survey_obsid_list=["all_data_dictionary","monthly_mosaic_dictionary", "weekly_mosaic_dictionary"]

obs_list_count=0
for observation_list in survey_obsid_list:

    with open(observation_list+".pkl", 'rb') as f:
        all_data=pickle.load(f)
        data=all_data[object_name]

    # get the time centers and errors
    if "mosaic" in observation_list:

        if "MET" in time_unit:
            t0 = TimeDelta(data["user_timebin/met_time"], format='sec')
            tf = TimeDelta(data["user_timebin/met_stop_time"], format='sec')
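        elif "MJD" in time_unit:
            # no MJD columns were saved above, so convert the saved UTC times (an assumption)
            t0 = Time(data["user_timebin/utc_time"])
            tf = Time(data["user_timebin/utc_stop_time"])
            t0.format = 'mjd'
            tf.format = 'mjd'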
        else:
            t0 = Time(data["user_timebin/utc_time"])
            tf = Time(data["user_timebin/utc_stop_time"])
    else:
        if "MET" in time_unit:
            t0 = TimeDelta(data["met_time"], format='sec')
        elif "MJD" in time_unit:
            # no MJD column was saved above, so convert the saved UTC times (an assumption)
            t0 = Time(data["utc_time"])
            t0.format = 'mjd'
        else:
            t0 = Time(data["utc_time"])
        tf = t0 + TimeDelta(data["exposure"], format='sec')

    dt = tf - t0

    if "MET" in time_unit:
        time_center = 0.5 * (tf + t0).value
        time_diff = 0.5 * (tf - t0).value
    elif "MJD" in time_unit:
        time_diff = 0.5 * (tf - t0)
        time_center = t0 + time_diff
        time_center = time_center.value
        time_diff = time_diff.value

    else:
        time_diff = TimeDelta(0.5 * dt)  # dt.to_value('datetime')
        time_center = t0 + time_diff

        time_center = np.array([i.to_value('datetime64') for i in time_center])
        time_diff = np.array([np.timedelta64(0.5 * i.to_datetime()) for i in dt])

    x = time_center
    xerr = time_diff

    if obs_list_count == 0:
        fig, axes = plt.subplots(len(values), sharex=True, figsize=(10,12))

    axes_queue = [i for i in range(len(values))]
    # plot_value=[i for i in values]

    e_range_str = f"{14}-{195} keV"
    #axes[0].set_title(object_name + '; survey data from ' + e_range_str)

    for i in values:
        ax = axes[axes_queue[0]]
        axes_queue.pop(0)

        y = data[i]
        yerr = np.zeros(x.size)
        y_upperlim = np.zeros(x.size)

        label = i

        if "rate" in i:
            yerr = data[i + "_err"]
            label = "Count rate (cts/s)"
        elif i + "_lolim" in data.keys():
            # get the errors
            lolim = data[i + "_lolim"]
            hilim = data[i + "_hilim"]

            yerr = np.array([lolim, hilim])
            y_upperlim = data[i + "_upperlim"]

            # find where we have upper limits and set the error to 10% of the value, since a nan
            # error value isn't compatible with upper limits
            yerr[:, y_upperlim] = 0.1 * y[y_upperlim]

        if "mosaic" in observation_list:
            if "weekly" in observation_list:
                zorder = 9
                c = "blue"
                m = "o"
                l="Weekly Mosaic"
                ms=5
                a=0.8
            else:
                zorder = 9
                c='green'
                m = "s"
                l = "Monthly Mosaic"
                ms=7
                a = 1
        else:
            zorder = 4
            c = "gray"
            m = "."
            l = "Survey Snapshot"
            ms=3
            a = 0.3

        ax.errorbar(x, y, xerr=xerr, yerr=yerr, uplims=y_upperlim, linestyle="None", marker=m, markersize=ms,
                    zorder=zorder, color=c, label=l, alpha=a)

        if ("flux" in i.lower()):
            ax.set_yscale('log')

        if ("snr" in i.lower()):
            ax.set_yscale('log')

        ax.set_ylabel(label)

    # if T0==0:
    if "MET" in time_unit:
        label_string = 'MET Time (s)'
    elif "MJD" in time_unit:
        label_string = 'MJD Time (days)'
    else:
        label_string = 'UTC Time'

    plt.gca().ticklabel_format(useMathText=True)
    axes[-1].set_xlabel(label_string)

    obs_list_count += 1

#add the UTC times as well
met_values=[126230399.334, 157766399.929]#[i.get_position()[0] for i in axes[-1].get_xticklabels()]
utc_values=[np.datetime64(sbu.met2datetime(i)) for i in met_values]

for i,j in zip(met_values, [2005, 2006]):
    for ax in axes:
        ax.axvline(i, 0, 1, ls='--', color='k')
        if ax==axes[0]:
            ax.text(i, ax.get_ylim()[1]*1.01, str(j), fontsize=12, ha='center')

axes[0].legend(loc="best")

axes[1].set_ylabel("SNR")
axes[2].set_ylabel(r"Flux (erg/s/cm$^2$)")
axes[3].set_ylabel(r"$\Gamma$")

for ax, l in zip(axes, ["a","b","c","d"]):
    ax.text(.99, .95, f"({l})", ha='right', va='top', transform=ax.transAxes,  fontsize=12)

axes[-1].axhline(2.15, 0, 1)

axes[-2].axhline(23342.70e-12, 0, 1)

fig.tight_layout()
plot_filename = object_name + '_survey_lc.pdf'
fig.savefig(plot_filename, bbox_inches="tight")
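The horizontal error bars in the plotting cell come from the center and half-width of each time bin. For plain MET values the calculation reduces to the sketch below; the function name and the example start and stop times are hypothetical.

```python
def bin_center_and_halfwidth(t_start, t_stop):
    """Return the bin centers and half-widths for paired start/stop times."""
    centers = [0.5 * (a + b) for a, b in zip(t_start, t_stop)]
    halfwidths = [0.5 * (b - a) for a, b in zip(t_start, t_stop)]
    return centers, halfwidths


# Two made-up bins, in seconds: [100, 200] and [200, 260].
centers, halfwidths = bin_center_and_halfwidth([100.0, 200.0], [200.0, 260.0])
print(centers, halfwidths)
```

Passing the centers as `x` and the half-widths as `xerr` to `errorbar` makes each horizontal bar span the full bin.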
