### Finding missing calibration files (wave_A)

Some TOIs fail with an error saying that calibration files are missing. I know that some of these files have been downloaded previously, but for some reason they aren't being put into the correct folders.

Plan:

1. identify the calib files needed.
2. see if I have them anywhere in the data directory.
3. copy them into the TOI folder.
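Steps 2 and 3 of the plan can be sketched with the standard library alone. This is a minimal sketch, not code that was run in this notebook: `copy_missing_calibs` is a hypothetical helper, and it assumes that calibration files can be matched by file name only and that the first copy found under the data directory is the right one.

```python
from pathlib import Path
import shutil

def copy_missing_calibs(needed, data_dir, toi_dir):
    """Copy calibration files that are missing from toi_dir (hypothetical helper).

    needed   : iterable of calibration file names, e.g. read from spectrum headers
    data_dir : root directory that is searched recursively
    toi_dir  : directory of the TOI that needs the files
    Returns (copied, not_found) as two sorted lists of file names.
    """
    data_dir, toi_dir = Path(data_dir), Path(toi_dir)
    copied, not_found = [], []
    for name in sorted(set(needed)):
        if (toi_dir / name).is_file():
            continue  # already in place, nothing to do
        # Assumption: file names are unique, so the first match is good enough.
        source = next((p for p in data_dir.rglob(name) if p.is_file()), None)
        if source is None:
            not_found.append(name)
        else:
            shutil.copy2(source, toi_dir / name)
            copied.append(name)
    return copied, not_found
```

Anything returned in `not_found` would still have to be downloaded, since no copy exists under the data directory.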

In [2]:
from pathlib import Path
from astropy.io import fits

In [5]:
DATA_DIR = '/srv/scratch/astro/z5345592/data/tess_toi'

In [6]:
TOI_LIST = '5094_01' #list of 1 at the moment to make sure it works.
TOI_DIR = Path(DATA_DIR) / TOI_LIST #just the directory for the toi - need to change when I have more than 1.

In [18]:
wave_file_list = [] #a list of all the wave files needed by this toi
spec_file_keyerrors = [] #sometimes get a keyerror when the wave_file header isn't there - list of those spec files

for file in TOI_DIR.rglob('*e2ds_A.fits'):
    with fits.open(file) as phdu:  # close each file once its header has been read
        try:
            wave_file = phdu[0].header['HIERARCH ESO DRS CAL TH FILE']
            wave_file_list.append(wave_file)
        except KeyError:
            spec_file_keyerrors.append(file)

In [24]:
len(spec_file_keyerrors)

120

In [25]:
phdu = fits.open(spec_file_keyerrors[0])

In [None]:
phdu[0].header

OK - at least some of the files that don't have a wave_A file in their headers seem to be 'calib' type. Going to check whether this is the case for all of them, and also what the type is for each of these files.

In [None]:
for file in spec_file_keyerrors:
    phdu = fits.open(file)
    print(phdu[0].header['HIERARCH ESO DPR CATG'])

OK, that makes sense. About half of the files for this one TOI are 'calib' files, and all of the files that raised a KeyError are calib files. I need to remove them.

In future I should remove them at the end of the download script and keep only the science files.
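A minimal sketch of what that clean-up step could look like, listing the files first instead of deleting them straight away. The names `read_category` and `non_science_files` are hypothetical and not part of the download script; the header reader is passed in as a parameter so that the selection logic can be tried without real FITS files. A file with no category keyword is treated as non-science here, which is an assumption.

```python
from pathlib import Path

HEADER_KEY = 'HIERARCH ESO DPR CATG'  # category keyword used in this notebook

def read_category(path):
    """Return the category in the primary header, or None if it is absent."""
    from astropy.io import fits  # imported here so the rest runs without astropy
    return fits.getheader(path).get(HEADER_KEY)

def non_science_files(directory, pattern='*e2ds_A.fits', get_category=read_category):
    """List files matching pattern whose category is not 'SCIENCE'."""
    return sorted(p for p in Path(directory).rglob(pattern)
                  if get_category(p) != 'SCIENCE')
```

The returned list can be checked by eye and then removed with `p.unlink()` for each entry, which keeps the destructive step separate from the selection.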

In [30]:
for file in spec_file_keyerrors:
    file.unlink()


In [31]:
count = 0
for file in TOI_DIR.rglob('*e2ds_A.fits'):
    count +=1
print(count)

125


OK, got rid of the ones that were causing the problem. Now going back to checking whether I have all the wave files.

In [32]:
wave_file_list = [] #a list of all the wave files needed by this toi
spec_file_keyerrors = [] #sometimes get a keyerror when the wave_file header isn't there - list of those spec files

for file in TOI_DIR.rglob('*e2ds_A.fits'):
    with fits.open(file) as phdu:  # close each file once its header has been read
        try:
            wave_file = phdu[0].header['HIERARCH ESO DRS CAL TH FILE']
            wave_file_list.append(wave_file)
        except KeyError:
            spec_file_keyerrors.append(file)

In [33]:
len(wave_file_list)

125

In [34]:
len(spec_file_keyerrors)

0

125 science files, no files that caused a key error

In [None]:
for wave_file in wave_file_list:
    if (TOI_DIR / wave_file).is_file():
        print('exists')
    else:
        print('missing')  # the print call was missing here, so nothing was reported

All the wave files for 5094.01 are there. Going to write this up as a series of functions and run on everything.

1. check each e2ds_A.fits file for calib in the header - delete if not science

checking to see if there are any types other than calib and science:

In [None]:
#this takes way too long to run - do it as a PBS job at some point.

cat_types = set()
for file in Path(DATA_DIR).rglob('*e2ds_A.fits'):
    with fits.open(file) as phdu:  # close each file after reading its header
        obs_cat = phdu[0].header['HIERARCH ESO DPR CATG']
    cat_types.add(obs_cat)
print(cat_types)

In [45]:
phdu = fits.open(Path(TOI_DIR) / 'HARPS.2018-04-09T00:03:14.996_ccf_M2_A.fits')

In [47]:
for file in TOI_DIR.rglob('*ccf_A.fits'):
    phdu = fits.open(file)
    print(phdu[0].header['HIERARCH ESO DPR CATG'])

In [51]:
for file in TOI_DIR.rglob('*ccf_??_A.fits'):
    phdu = fits.open(file)
    print(phdu[0].header['HIERARCH ESO DPR CATG'])

SCIENCE
... (125 lines in total, all SCIENCE)


I can probably add a check at the start of from_HARPS to make sure that the CCF is a science file. Ask Ben? For now, put it in analysis.

FYI added to line 52 of analysis

In [52]:
#a function that returns True if a ccf file is science cat

def check_ccf_science(ccf_file):
    # open the file passed in as the argument, not the global loop variable `file`
    with fits.open(ccf_file) as phdu:
        cat = phdu[0].header['HIERARCH ESO DPR CATG']
    return cat == 'SCIENCE'

In [55]:
check_ccf_science(TOI_DIR / 'HARPS.2012-03-13T02:00:09.037_ccf_M2_A.fits')

False

In [57]:
for file in TOI_DIR.rglob('*ccf_??_A.fits'):
    print(check_ccf_science(file))

True
... (125 lines of True in total)
False
... (62 lines of False in total, after which the output is cut off)
Fal

In [None]:
# not finding this file when running analysis.py for TOI 5094.01:

phdu = fits.open('/srv/scratch/astro/z5345592/data/tess_toi/5094_01/HARPS.2011-10-16T19:50:30.594_ccf_M2_A.fits')

Checking the CCF files in the directory: do they have a 'science' identifier? Does the number of CCF files match the number of e2ds files? Can I select the CCF files by their category?

In [64]:
for file in TOI_DIR.rglob('*ccf_??_A.fits'):
    phdu = fits.open(file)
    print(phdu[0].header['HIERARCH ESO DPR CATG'])


SCIENCE
... (125 lines in total, all SCIENCE)


I should be able to select the CCF files by the header 'HIERARCH ESO DPR CATG' and keep only the science ones. Then, hopefully, they will match up with the e2ds files in the directory.
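One way to check whether the two sets match up is to compare them by file name. This is a rough sketch under an assumption taken from the file names seen in this notebook: a CCF file and an e2ds file from the same exposure share everything before `_ccf_` or `_e2ds_`. `exposure_id` and `compare_products` are hypothetical helpers, and the category check is left out here.

```python
from pathlib import Path

def exposure_id(path):
    """Return the part of the file name before the product suffix,
    e.g. 'HARPS.2016-12-06T07:15:00.957' for '..._ccf_M2_A.fits'."""
    name = Path(path).name
    for marker in ('_ccf_', '_e2ds_'):
        if marker in name:
            return name.split(marker)[0]
    return None  # name does not follow the assumed pattern

def compare_products(directory):
    """Return (ccf_only, e2ds_only): exposure ids that have just one product."""
    directory = Path(directory)
    ccf = {exposure_id(p) for p in directory.rglob('*ccf_??_A.fits')}
    e2ds = {exposure_id(p) for p in directory.rglob('*e2ds_A.fits')}
    ccf.discard(None)
    e2ds.discard(None)
    return sorted(ccf - e2ds), sorted(e2ds - ccf)
```

An exposure listed in `e2ds_only` has a spectrum but no CCF file next to it, which is the kind of case that would lead to a file-not-found error when the CCF is opened.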

In [67]:
count = 0
for file in TOI_DIR.rglob('*ccf_??_A.fits'):
    if check_ccf_science(file):
        count += 1
print(count)

In [8]:
filename = 'HARPS.2016-12-06T07:15:00.957_ccf_M2_A.fits'
phdu = fits.open(Path(TOI_DIR)/filename)
phdu[0].header

SIMPLE  =                    T / file does conform to FITS standard             
BITPIX  =                  -32 / number of bits per data pixel                  
NAXIS   =                    2 / number of data axes                            
NAXIS1  =                  161 / length of data axis 1                          
NAXIS2  =                   73 / length of data axis 2                          
EXTEND  =                    T / FITS dataset may contain extensions            
COMMENT   FITS (Flexible Image Transport System) format is defined in 'Astronomy
COMMENT   and Astrophysics', volume 376, page 359; bibcode: 2001A&A...376..359H 
HIERARCH ESO DRS CCF RVC = 26.5214356409763 / Baryc RV (drift corrected) (km/s) 
HIERARCH ESO DRS CCF CONTRAST = 20.1874647818485 / Contrast of  CCF (%)         
HIERARCH ESO DRS CCF FWHM = 3.16236267913982 / FWHM of CCF (km/s)               
HIERARCH ESO DRS CCF RV = 26.5214356409763 / Baryc RV (no drift correction) (km/
CRVAL1  =                 6.