## Multiple plots

- This notebook plots the raw lightcurve alongside the MAD-corrected lightcurve to check whether the cleaning process has worked.
- We then revisit the MAD plots to check whether the effects at the edges of the lightcurve (anomalously high or low flux) were removed.
- We plot cadence against quality bit to inspect the flags that `eleanor` assigns to this lightcurve.
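
As a rough illustration of what the cadence vs bit plot summarises, the sketch below decodes integer quality values into the bit positions that are set and counts how often each bit occurs. It assumes the `quality` column holds an integer bitmask. The helper names `set_bits` and `count_flags` and the sample values are hypothetical and not part of `eleanor` or this repository.

```python
def set_bits(quality):
    """Return the 1-based positions of the bits set in an integer quality value."""
    quality = int(quality)
    return [pos + 1 for pos in range(quality.bit_length()) if (quality >> pos) & 1]


def count_flags(quality_column):
    """Count how many cadences have each bit set."""
    counts = {}
    for value in quality_column:
        for bit in set_bits(value):
            counts[bit] = counts.get(bit, 0) + 1
    return counts


# Made-up quality values, one per cadence (0 means no flag is set).
example_quality = [0, 0, 128, 2048, 2176, 0]
print(count_flags(example_quality))
```

A cadence with several flags (such as `2176 = 2048 + 128` here) contributes to more than one bit, which a scatter plot of the raw quality value does not show directly.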

In [1]:
cd ..

/home/astro/phrdhx/automated_exocomet_hunt


In [2]:
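import os
import glob

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from astropy.stats import sigma_clip,sigma_clipped_stats
from astropy.table import Table
from astropy.table import unique

# Project helpers used below (e.g. import_XRPlightcurve, normalise_lc)
from analysis_tools_cython import *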

In [3]:
def get_output(file_path):
    """Imports batch_analyse output file as pandas dataframe."""
    with open(file_path) as f:
        lines = f.readlines()
    # Flatten the file into whitespace-separated tokens, then regroup them ten per row
    lc_lists = [word for line in lines for word in line.split()]
    lc_lists = [lc_lists[i:i+10] for i in range(0, len(lc_lists), 10)]
    cols = ['file','signal','signal/noise','time','asym_score','width1','width2','duration','depth','transit_prob']
    df = pd.DataFrame(data=lc_lists,columns=cols)
    # Cast every column except the first ('file') and last ('transit_prob') to float
    df[cols[1:-1]] = df[cols[1:-1]].astype('float32')
    return df
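
The regrouping step in `get_output` can be sketched with the standard library alone. The example below is a minimal illustration, assuming each record consists of exactly ten whitespace-separated fields. The helper name `chunk_tokens` and the sample lines are invented for this sketch and are not real `batch_analyse` output.

```python
def chunk_tokens(lines, n_cols=10):
    """Flatten lines into whitespace-separated tokens and regroup them n_cols per row."""
    tokens = [word for line in lines for word in line.split()]
    return [tokens[i:i + n_cols] for i in range(0, len(tokens), n_cols)]


# Invented sample: the first record is split over two lines, the second is on one line.
sample_lines = [
    "lc_a.pkl -1.5 -3.2 1470.1 1.02 0.4\n",
    "0.5 0.9 0.001 label_a\n",
    "lc_b.pkl -0.7 -1.1 1480.3 0.98 0.2 0.3 0.5 0.002 label_b\n",
]

rows = chunk_tokens(sample_lines)
for row in rows:
    print(len(row), row[0], row[-1])
```

Because rows are rebuilt purely by counting tokens, a record with a missing field or a field containing a space would shift every later row, so it may be worth checking that the final row also has ten entries.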

In [4]:
data = get_output('output_s6_corr.txt')

In [5]:
mad_df = pd.read_json("./data/Sectors_MAD.json")
sec = 6
clip = 4

In [6]:
data.file

0          tesslc_234556256.pkl
1          tesslc_167974071.pkl
2          tesslc_255700743.pkl
3          tesslc_445939140.pkl
4          tesslc_176615627.pkl
                   ...         
2779720    tesslc_220211279.pkl
2779721    tesslc_206895034.pkl
2779722    tesslc_172778639.pkl
2779723     tesslc_79690705.pkl
2779724    tesslc_220479588.pkl
Name: file, Length: 2779725, dtype: object

In [7]:
path = '/storage/astro2/phrdhx/tesslcs'

In [None]:
for i in data.file:
    # Locate the lightcurve file for this target; skip it if nothing matches.
    matches = glob.glob(os.path.join(path,f'**/**/{i}'))
    if not matches:
        continue
    file_path = matches[0]

    # Cleaned lightcurve (bad points dropped) and raw lightcurve (bad points kept),
    # so that we can see whether the MAD cut has really worked.
    table, info = import_XRPlightcurve(file_path,sector=sec,clip=clip,drop_bad_points=True)
    raw_table, _ = import_XRPlightcurve(file_path,sector=sec,clip=clip,drop_bad_points=False)

    camera = info[4]
    tic = info[0]

    # MAD values for this sector and camera, with the sigma-clipped median and scatter
    mad_arr = mad_df.loc[:len(table)-1, f"{sec}-{camera}"]
    sig_clip = sigma_clip(mad_arr,sigma=clip,masked=False)
    med_sig_clip = np.nanmedian(sig_clip)
    rms_sig_clip = np.nanstd(sig_clip)

    fig,ax = plt.subplots(2,2,figsize=(15,8))

    # Top left: MAD per cadence with the candidate thresholds
    ax[0,0].scatter(range(0,len(table['time'])), mad_arr, s=2)
    ax[0,0].axhline(np.nanmedian(mad_arr), c='r',label='median line')
    ax[0,0].axhline(np.nanmedian(mad_arr)+10*np.std(mad_arr[900:950]),c='blue',label='visualised MAD') # 10 sigma threshold
    ax[0,0].axhline(med_sig_clip + clip*(rms_sig_clip), c='orange',label='sigma clipped MAD')
    ax[0,0].set_title(f"{sec}-{camera} at {clip} sigma")
    ax[0,0].set_ylim([0.5*np.nanmedian(mad_arr),4*np.nanmedian(mad_arr)])
    ax[0,0].legend()

    # Bottom left: quality flag of each cadence in the raw lightcurve
    ax[1,0].scatter(range(0,len(raw_table['time'])), raw_table['quality'], s=5)
    ax[1,0].set_title('Cadence vs bit')

    # Right column: raw lightcurve (top) and MAD-corrected lightcurve (bottom)
    ax[0,1].scatter(raw_table['time'],normalise_lc(raw_table['corrected flux']),s=0.4)
    ax[0,1].set_title(f'Raw lightcurve for TIC {tic}')
    ax[1,1].scatter(table['time'],normalise_lc(table['corrected flux']),s=0.4)
    ax[1,1].set_title(f'MAD corrected lightcurve for TIC {tic}')
    plt.show()

### Interpretations:

- The MAD cut excludes data as expected.
- However, some points that we expected to be removed still survive the processing, and this remains the case after tightening the sigma clipping. These may simply be anomalous points that this method cannot remove.
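
To make the thresholding in the loop above concrete, here is a small standard-library sketch of the same idea: sigma-clip the MAD values, then place the cut at the clipped median plus `clip` times the clipped standard deviation. The `sigma_clipped` function is a simplified stand-in for `astropy.stats.sigma_clip`, not its implementation, and the input values are synthetic. It shows one way a mildly elevated point can fall below the cut; whether this explains the surviving points in these lightcurves is not established here.

```python
import statistics


def sigma_clipped(values, sigma=4.0, max_iters=5):
    """Iteratively drop values more than `sigma` standard deviations from the median."""
    kept = list(values)
    for _ in range(max_iters):
        centre = statistics.median(kept)
        spread = statistics.pstdev(kept)
        survivors = [v for v in kept if abs(v - centre) <= sigma * spread]
        if len(survivors) == len(kept):
            break  # nothing was clipped on this pass, so the result has converged
        kept = survivors
    return kept


def upper_threshold(values, sigma=4.0):
    """Clipped median plus `sigma` times the clipped standard deviation."""
    kept = sigma_clipped(values, sigma)
    return statistics.median(kept) + sigma * statistics.pstdev(kept)


# Synthetic MAD values: a quiet baseline, one mildly high point, two large spikes.
baseline = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95] * 10
mad_values = baseline + [1.2, 5.0, 6.0]

threshold = upper_threshold(mad_values, sigma=4.0)
flagged = [idx for idx, v in enumerate(mad_values) if v > threshold]
print(round(threshold, 3), flagged)
```

In this toy case the two large spikes exceed the threshold while the point at 1.2 does not, because it is retained by the clipping and sits inside the resulting 4 sigma band.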