# Astrometry Notebook

In order to work with our data properly, we need to know what part of the sky each image shows. Figuring this out is called *plate solving*; the easiest way for us to do so is to upload each of our images to nova.astrometry.net.
Doing this by hand is possible but tedious, so this notebook does it automatically.

The general process is fairly simple:
1. Use our API key to let the server know we're allowed to send it a bunch of things and that it can trust us.
2. Make a list of every file that we've placed in a specific 'input' folder.
3. Upload each of those files to Astrometry, and save the special number it assigns that submission.
4. Wait a bit to make sure it has time to process. (When uploading lots of files this is rarely a problem, since the time the uploads take is usually enough for the earlier images to finish processing.)
5. Download a new, enhanced FITS file from Astrometry into the folder this notebook is in.
6. Move all those downloaded files to their final home: the 'output' folder.
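
Step 4 can also be made explicit rather than left to timing. The sketch below is one possible way to do that and is not part of this notebook's own code: it polls until a job ID appears or a time limit passes. `wait_for_job`, `fetch_jobs`, `interval` and `timeout` are hypothetical names; `fetch_jobs` stands in for whatever returns the `"jobs"` list of a submission, and the sketch assumes that list is empty or holds `None` while the image is still queued.

```python
import time

def wait_for_job(fetch_jobs, subid, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Poll until a submission has a job ID, or give up after `timeout` seconds.

    `fetch_jobs(subid)` is a hypothetical callable that returns the "jobs"
    list for a submission (assumed to be [] or [None] while still queued).
    """
    waited = 0.0
    while True:
        jobs = fetch_jobs(subid)
        ready = [j for j in jobs if j is not None]
        if ready:
            return ready[0]  # first job that exists
        if waited >= timeout:
            raise TimeoutError(f"Submission {subid} has no job after {timeout} s")
        sleep(interval)
        waited += interval

# Stand-in fetcher (no network): a job ID shows up on the third poll
responses = iter([[], [None], [4242]])
job = wait_for_job(lambda subid: next(responses), subid=1, sleep=lambda s: None)
print(job)  # 4242
```

Passing `sleep` in as a parameter keeps the example quick to try out; in real use the default `time.sleep` would apply.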

# Import Statements and Required Functions

In [None]:
import sys
#My folder is at: /Users/aidanmcclung/Desktop/Summer_Exoplanets

codeFilePath = '/Users/aidanmcclung/Desktop/Summer_Exoplanets'
sys.path.append(codeFilePath) #this lets python know to look here for import statements

In [None]:
import QAOP.client as client #This is a .py file that should be in the QAOP folder

import os #OS is essentially just the basic terminal command line interface tools in python. 
#We need it to move our files around.

#We need two packages in order to do our web operations:
import requests #The default python webserver interfacing package
import json #The python JavaScript Object Notation interfacing package

from IPython.display import clear_output #to save you from the world's longest output cells :)

from QAOP.QAOP_utils import readConfigFile
codeFilePath,dataFilePath = readConfigFile()
print(dataFilePath)

## Function Definitions

I created each of these functions, and have attempted to add enough comments to describe what they do.
Knowing those specifics isn't necessary though, and if you're not interested you can absolutely **run each of these cells once** and move on to the **Main Program**

#### (You can run each of these cells and move on to the "Main Program")

In [None]:
def findFileName(fn,debug=False):
    ''' This function checks whether a filename ends in .fit or .fits, 
    and returns the name with that extension removed (e.g. "009.fit" -> "009"). 
    This is important because the two extensions have different lengths, so where we cut changes between them.'''
    if fn.endswith(".fits"): 
        name = fn[:-5]
        if debug: print(".fits file detected. Saving filename as:",name)
    elif fn.endswith(".fit"): 
        name = fn[:-4]
        if debug: print(".fit file detected.  Saving filename as:",name)
    else:
        #Without this branch 'name' would never be set, and the return below would fail with a confusing error
        raise ValueError("Not a .fit or .fits file: "+fn)
    return name

def getSubID(fn,debug=False):
    """This function uploads a file at the given fn (filename) to astrometry.net, 
    and returns the submission ID for that upload."""
    #Now we need to upload the file to astrometry.net
    # we do this using a method in the client class: upload()
    #This then returns a json status, but we only care about the info it gives us about where it went,
    # ie the "Submission ID", which we take out of the result.
    SUBIDforThis = cAPI.upload(dataFilePath+'input/'+fn)["subid"]
    if debug: print(f"SUBID for {fn} is {SUBIDforThis}")
    return SUBIDforThis

def submitFile(fn,debug=False):
    """This function is the wrapper for uploading files; it uses the above getSubID to upload the image, 
    but it also finds and saves the corresponding name alongside the subID, so that we can get them later."""
    name = findFileName(fn,debug) #detect whether .fits or .fit so we can get the right characters for the name
    if debug: print("Submitting File:",name)
    subid = getSubID(fn,debug) #This function is where we upload the file
    nameSubs[name] = subid #We add what it was saved as to our own dictionary, so we know what's what.
    return name #We return the name and add it to a list with this call, 
                #which we then use later to iterate through all the things we uploaded

In [None]:
def getJobID(SubID,debug=False):
    """When an image is done processing, it will have a JobID attached to the SubID, which we use to get the results. 
    This function retrieves the JobId given a SubID."""
    #When Astrometry does its thing, we submit our image, but it hasn't been solved.
    #Once it has been solved, it's assigned a "jobID" which we need in order to access the results.
    R = requests.post('http://nova.astrometry.net/api/submissions/'+str(SubID))
    #R will be an object with the response to the web post we sent.
    #print(R.json()) #Uncomment to see this response
    JOBIDSforThis = R.json()["jobs"] #A submission *may* have multiple jobs. 
    if not JOBIDSforThis or JOBIDSforThis[0] is None: #No job yet; the image is probably still waiting its turn
        raise RuntimeError(f"No job ID yet for submission {SubID}; wait a bit and try again.")
    JOBIDforThis = JOBIDSforThis[0] #We only care about the first one that was done
    if debug: print(JOBIDforThis)
    return JOBIDforThis


from urllib.request import urlretrieve as saveFromWeb #self explanatory import
def saveNewFits(JOBIDforThis,name):
    """This function uses a JobID to download an enhanced fits file with the new data in the header,
    and saves it to the current location using the name that was provided."""
    saveFromWeb("https://nova.astrometry.net/new_fits_file/"+str(JOBIDforThis), name+".fits")
    return name+".fits"

def retrieveFile(name,debug=False):
    """This function does the lookups in order to facilitate saving a file, given a name. It first gets the SubID,
    and then uses that to get the JobID and lastly save the file."""
    #print("--------:",nameSubs)
    subid = nameSubs[name] #First retrieve the SubID for this file
    jobid = getJobID(subid,debug) #Next we acquire the JobID for its successful result
    newName = saveNewFits(jobid,name) #We use the JobID to get the file from astrometry, 
    #and then we need the name in order to save it as the right name
    return newName

# Main Program

The first step is to enter an API key. There should be a string of characters in a comment in the cell below; copy and paste it into the text box that appears when you run the cell.
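
If you would rather not paste the key each time, one option is to read it from an environment variable and only prompt when it is missing. This is a minimal sketch under stated assumptions, not how this notebook does it: `load_api_key` and the variable name `ASTROMETRY_API_KEY` are both made up for illustration. It uses `getpass` so the key is not echoed into the cell output.

```python
import os
from getpass import getpass

def load_api_key(env=None, prompt=getpass):
    """Return the API key from an environment variable, else ask for it.

    ASTROMETRY_API_KEY is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    key = env.get("ASTROMETRY_API_KEY", "").strip()
    if not key:
        # Fall back to asking; getpass hides what is typed
        key = prompt("Please enter your API key: ").strip()
    if not key:
        raise ValueError("No API key was provided.")
    return key
```

With this in place, `APIKEY = load_api_key()` could stand in for the `input(...)` call.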

In [None]:
#USERS NEED TO ENTER AN API KEY!!!!

APIKEY = input("Please enter your API Key: ")
#copy and paste the following string of characters into the textbox that appears when you run the cell
#     webboctjikkepcfj        #Aidan's API Key

cAPI = client.Client()
cAPI.login(APIKEY)

-----
Now that we're logged in, we next need to assemble a list of all the files we'd like to upload. 

These should have been put into the 'input' folder with the renaming section in the master notebook, so they should have already been filtered and only be images, but we double check because sometimes there's sneaky buggers.
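
As an illustration of the kind of double check meant here, the sketch below lists a folder and keeps only regular, non-hidden files with a FITS extension. `list_fits_files` is a hypothetical helper, and accepting both `.fit` and `.fits` (in any letter case) is an assumption of this sketch rather than something this notebook requires.

```python
from pathlib import Path

def list_fits_files(folder, suffixes=(".fit", ".fits")):
    """Return the sorted names of regular, non-hidden files in `folder`
    whose extension is one of `suffixes` (compared in lowercase)."""
    return sorted(
        p.name
        for p in Path(folder).iterdir()
        if p.is_file()                      # skip sub-folders
        and not p.name.startswith(".")      # skip hidden OS files such as .DS_Store
        and p.suffix.lower() in suffixes
    )
```

Sorting the result also makes the upload order predictable, which `os.listdir` on its own does not guarantee.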

In [None]:
inputFiles_all = os.listdir(dataFilePath+"input")
#print(inputFiles_all)
#The program will pull in ANY files that are there; 
#sometimes there are hidden os files or such we don't want, so we need to filter those out.
inputFiles = []
for file in inputFiles_all:
    if file.endswith(('.fit','.fits')): inputFiles.append(file) #findFileName accepts both, so we do too
#print(inputFiles)

numFiles = len(inputFiles)
print(numFiles,'files were identified:\n',inputFiles)

-----
Before we get going, we have a few dictionaries and lists that we will need, in addition to `inputFiles` which we just made, so we get those made now.

In [None]:
#This dictionary will contain the mapping between the names and what submission IDs they were assigned 
# when they were uploaded.
nameSubs = {}
#We define it in its own cell so that it doesn't ever get deleted/cleared accidentally 
# if we need to run a part of the program again.
names = []

In [None]:
#Set these two here in a separate cell to maintain progress upon hitting an error. 
# I wasn't fully able to test that it resumed under all circumstances, so I would advise not fully counting on it.
upload_progress = 0 
checkFiles = []

Now we need to upload all of our files! We simply loop through all of them and save the relevant bits as we go.

This WILL take a long time; long enough you can go do something else for a bit while it goes. 

Sorry if you left this for the last minute, but you're straight out of luck.

In [None]:
while upload_progress < numFiles:
    file = inputFiles[upload_progress]
    
    #Take care of some of the feedback stuff so we know how it's doing
    clear_output(wait=False)
    print('Upload Status:',upload_progress,'of',numFiles, "   Current File (Unordered):",file)
    print("")
    
    #submit file
    name = submitFile(file)
    #save name
    names.append(name)
    checkFiles.append(file)
    #print(file)
    upload_progress += 1
    
print("Uploading Complete.")
    
#And lastly we just run a check to make sure that all of them got uploaded. You should see no output.
for file in inputFiles:
    if not checkFiles.count(file): print(file,"Was not present in the redundancy list.")

-----
Now we need to download the files. First we create a few items again like we did before to track our progress, and then we loop through.

This part WILL stick them in whatever directory this notebook is in, and it might get a little bit messy. The next part will involve us moving the files to the proper output directory that was specified in the config file.

In [None]:
savedFiles = []
download_progress = 0
numDown = len(names)

In [None]:
while download_progress < numDown:
    name = names[download_progress]
    
    #Take care of some of the feedback stuff so we know how it's doing
    clear_output(wait=False)
    print('Download Status:',download_progress,'of',numDown,"  Current File (Unordered):",name)
    if savedFiles: print("Most Recent Download was:",savedFiles[-1]) #nothing to show yet on the first pass
    print("")
    
    if name+".fits" not in savedFiles: #savedFiles holds names WITH the .fits ending, so we compare against that
        downFile = retrieveFile(name)
        savedFiles.append(downFile)
        print("")
        print("Downloaded File:",savedFiles[-1])
    
    #Move on to the next name, whether or not we downloaded this one
    download_progress += 1
    

-----
These next two cells don't really have much purpose if things have gone right. They're left here to help maybe guide you if things go wrong, but I haven't explained them very much so they might not be very helpful after all.

In [None]:
#uncomment either of these if you need to check something. 
#You can copy the 7-digit number and paste it into the following link to look at the image in a web browser.
#  https://nova.astrometry.net/user_images/[subid]#annotated

#print(nameSubs)
#print(len(nameSubs))

In [None]:
#if you want to look at what it saved:
#print(savedFiles)

-----
Lastly, we need to move all of the files that we downloaded into the proper location. Due to the way it was coded, there is a possibility that the same file name ends up in our list twice, which would make the code try to move it twice, so we first remove any duplicates from the list.
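
To see why the duplicate removal is safe, here is a small standalone sketch of the idea. Dictionary keys are unique and keep their insertion order, so building a dict from the list and turning it back into a list drops repeats while leaving the first occurrence of each name in place. `dedupe_keep_order` and the file names are made up for this example.

```python
def dedupe_keep_order(items):
    """Drop repeated entries while keeping the first occurrence of each."""
    return list(dict.fromkeys(items))

saved = ["001.fits", "002.fits", "001.fits", "003.fits"]
print(dedupe_keep_order(saved))  # ['001.fits', '002.fits', '003.fits']
```

Converting to a `set` would also remove the repeats, but it would not preserve the order.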

In [None]:
#If we ran the loop multiple times and a file was downloaded again, the new copy would have overwritten the old one.
#However, the name could still get added to the list again. So, we want to take out any duplicate names, 
#  which we can do with a sneaky trick: casting to a dict and then back to a list:
savedFiles = list(dict.fromkeys(savedFiles)) #remove duplicates
#Note: this bug can be fixed by putting the savedFiles list in the same cell as its loop, but then we can't continue if interrupted

#The download step saves each file next to this notebook (saveNewFits is only given a bare file name).
# After we've downloaded all the files into our main folder, we want to put them into their own 
# folder, which is what this lil loop here does

#print(savedFiles) #troubleshooting
for fn in savedFiles:
    #if fn == '060.fits': continue #There was an error and 060.fits was corrupted. Replicate if you encounter problems
    os.rename(fn,dataFilePath+'output/'+fn)

#### We've now done everything that we need to with this notebook, and you can go back to the Master and continue.

# Before/After Comparison (Optional)

There are a few more things that may be interesting to you that we wanted to include for you to check out.

If you're interested in seeing what changed by doing this process, you can do so with the cells that follow. There are two versions, one which demonstrates the WCS info we added, and another one below that which prints the whole header for you to see.
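
For a quicker yes/no check before reading the full printouts, the sketch below looks for a handful of standard FITS WCS keywords in a header. `missing_wcs_keys` and `WCS_KEYS` are hypothetical names, the choice of these six keywords is an assumption about what counts as "has WCS info", and the header values are made-up placeholders. Plain dicts are used so that it runs without any FITS files; it only relies on the `in` test, which an astropy header should also support.

```python
# A basic set of WCS keywords (an assumption for this sketch, not an exhaustive list)
WCS_KEYS = ("CTYPE1", "CTYPE2", "CRVAL1", "CRVAL2", "CRPIX1", "CRPIX2")

def missing_wcs_keys(header):
    """Return the basic WCS keywords that are absent from a header-like mapping."""
    return [key for key in WCS_KEYS if key not in header]

# Placeholder headers standing in for an input file and a solved file
before = {"SIMPLE": True, "NAXIS": 2}
after = dict(before, CTYPE1="RA---TAN", CTYPE2="DEC--TAN",
             CRVAL1=10.0, CRVAL2=20.0, CRPIX1=512.0, CRPIX2=512.0)

print(missing_wcs_keys(before))  # all six keywords are missing
print(missing_wcs_keys(after))   # []
```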

In [None]:
#These two packages are only used to check on whether the program worked or not.
from astropy.io import fits #be able to interpret fits files
from astropy.wcs import WCS #convert/extract the information we added

In [None]:
inputFileName = "009.fit" #Change this to be what file you'd like to check

print("WCS Info Before Astrometry:")
print(" ")

with fits.open(dataFilePath+"input/"+inputFileName) as f:
    w = WCS(f[0].header)
    print(w)

print(" ")
print("----------------------------------------------------------------")
print("WCS Info After Astrometry:")
print(" ")

outputFileName = inputFileName + "s" #change this if the only change wasn't .fit -> .fits
with fits.open(dataFilePath+"output/"+outputFileName) as f:
    w = WCS(f[0].header)
    print(w)

In [None]:
inputFileName = "009.fit" #Change this to be what file you'd like to check

print("FITS Header Before Astrometry:")
print(" ")

with fits.open(dataFilePath+"input/"+inputFileName) as f:
    print(repr(f[0].header))

print(" ")
print("----------------------------------------------------------------")
print("FITS Header After Astrometry:")
print(" ")

outputFileName = inputFileName + "s" #change this if the only change wasn't .fit -> .fits
with fits.open(dataFilePath+"output/"+outputFileName) as f:
    print(repr(f[0].header))