# Database Searcher
## Find What You Want
### The First Step

This notebook searches the Cube Database stored locally on this machine and selects the cubes that meet a user-defined set of criteria. 

Currently, this notebook requires the Cube Database to be stored as .csv files, with a separate axes file alongside each data file. 
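
As a rough illustration of that layout, the sketch below builds the four file paths the sifter cell opens for one cube and reads an axes file into three lists. The helper names `companion_paths` and `read_axes` and the `root` argument are hypothetical. The naming pattern (`CM_<cube>.cub.csv`, `.cub.axes.csv`, `_ir_geo.cub.csv`) is taken from the sifter cell, and dropping the last field of each row assumes every row ends with a trailing comma, which is how that cell treats it.

```python
import csv
import ntpath  # Windows-style joins, so the sketch behaves the same on any OS


def companion_paths(root, flyby, cube_id):
    """Return the data, axes, and IR geo paths for one cube (hypothetical helper)."""
    stem = ntpath.join(root, flyby, "CM_" + cube_id)
    return {
        "cube": stem + ".cub.csv",
        "cube_axes": stem + ".cub.axes.csv",
        "geo_ir": stem + "_ir_geo.cub.csv",
        "geo_ir_axes": stem + "_ir_geo.cub.axes.csv",
    }


def read_axes(path):
    """Read the first three rows of an axes file as the x, y, and z axes."""
    axes = [[], [], []]
    with open(path, newline="") as csv_file:
        for line_count, row in enumerate(csv.reader(csv_file)):
            if line_count < 3:
                # Assumption: a trailing comma leaves an empty last field, so drop it.
                axes[line_count] = row[:-1]
    return axes
```

The lengths of the three returned lists give the cube dimensions, which is how the sifter cell sizes its data arrays before reading the cube itself.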

It can filter the database by location, time (via flyby number), and resolution. 
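
The flyby limiter works as a window over an ordered list rather than a numeric comparison: cubes are kept from the first row whose flyby matches the lower bound until a row matches the upper bound. A minimal sketch of that logic, assuming the cube list is ordered by flyby as the sifter cell assumes. `flyby_window` is a hypothetical name and the sample rows are made up for illustration.

```python
def flyby_window(cubes, lower, upper):
    """Keep (flyby, cube_id) rows from `lower` (inclusive) up to `upper` (exclusive)."""
    kept = []
    in_range = False
    for flyby, cube_id in cubes:
        if flyby == lower:
            in_range = True   # window opens at the lower bound
        elif flyby == upper:
            in_range = False  # window closes before the upper bound is kept
        if in_range:
            kept.append((flyby, cube_id))
    return kept


# Made-up rows, ordered by flyby.
rows = [("T0", "a"), ("TA", "b"), ("TA", "c"), ("TB", "d"), ("T3", "e")]
print(flyby_window(rows, "TA", "TB"))
```

With these rows only the two `TA` entries are kept. If the upper bound never appears in the list, the window stays open to the end.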

A standard run finds all cubes at a specific location or location range at a resolution of 25 km or lower. Location is usually the most important criterion. 
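
A minimal sketch of that standard test for a single cube, assuming each pixel has already been reduced to a `(latitude, longitude, resolution)` triple. `cube_matches` and the sample pixels are hypothetical. The rules mirror the sifter cell: location bounds are exclusive, zero resolutions are treated as errors, and a cube's resolution is its smallest nonzero pixel resolution.

```python
def cube_matches(pixels, lat_bounds, lon_bounds, res_bounds):
    """Return True if any pixel lies inside the location box and the cube's
    smallest nonzero resolution falls within res_bounds (inclusive)."""
    lat_lo, lat_hi = lat_bounds
    lon_lo, lon_hi = lon_bounds
    res_lo, res_hi = res_bounds

    # Exclusive bounds, so a pixel sitting exactly on an edge does not count.
    location_found = any(
        lat_lo < lat < lat_hi and lon_lo < lon < lon_hi
        for lat, lon, _ in pixels
    )
    nonzero = [res for _, _, res in pixels if res != 0.0]
    if not nonzero:
        return False  # no usable resolution value in this cube
    return location_found and res_lo <= min(nonzero) <= res_hi


# Made-up pixels: one near the north pole at 12 km, one equatorial at 40 km,
# and one all-zero pixel standing in for an error value.
pixels = [(75.0, 120.0, 12.0), (5.0, 30.0, 40.0), (0.0, 0.0, 0.0)]
print(cube_matches(pixels, (60.0, 90.0), (0.0, 360.0), (0.0, 25.0)))    # True
print(cube_matches(pixels, (-90.0, -60.0), (0.0, 360.0), (0.0, 25.0)))  # False
```

Note that the location and resolution tests are independent: the pixel that satisfies the location box does not have to be the pixel that supplies the smallest resolution.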

In [1]:
#PARAMETER SETTING
#This is where you tell the program what restrictions it should have on its observations.

outputName = "northPoleCubeList.csv" #Best to put ".csv" at the end of the name. 
#IF YOU DO NOT CHANGE THIS NAME, ANY EXISTING FILE WITH THIS NAME WILL BE OVERWRITTEN.

#Set values you want to check to True.
latLonCheck = True #Restricting to specific location ranges
flybyCheck = False #Temporal restriction by flyby number
resolutionCheck = True #Restrict to a certain resolution classification.
#If all of these are set to False you really don't need to run this notebook, as the AcceptableCubes.csv list will have what you want. 

#latLonCheck
#Bounds are not inclusive; this avoids edge-case error values. (A point exactly at "0" is likely not real.) 
latUpperBound = 90.0
latLowerBound = 60.0
lonUpperBound = 360.0
lonLowerBound = 0.0

#flybyCheck
flybyUpperBound = "TB" #This is NOT inclusive, the upper bound here will NOT be kept. 
flybyLowerBound = "TA" #This IS inclusive, the lower bound here WILL be kept. 
#Note: since this is the database search we can assume every cube exists, so we won't accidentally skip over something.

#resolutionCheck
#NOTE: cubes are given resolution values of the smallest nonzero resolution contained within their pixels
resUpperBound = 25.0 #25.0 is considered standard. 
resLowerBound = 0.0

In [2]:
#SIFTER

#Imports
import csv
import math
import numpy as np
import scipy.misc

#Counter for our sanity, making sure the code hasn't gotten stuck and is progressing. 
counterForSanity = 0.0

#First, read in the data into a matrix. 
cubeList = []
with open('AcceptableCubes.csv') as csv_file:
    #AcceptableCubes.csv holds all the cubes deemed acceptable. 
    #All noodles and clear visual errors have been removed. 
    #If a new AcceptableCubes.csv needs to be generated, check the CubeDatabaseSearcher in SCRATCH WORK.
    #Making an entirely new one is highly unlikely, though; at worst, some of the higher-numbered flybys will need to be added manually. 
    csv_reader = csv.reader(csv_file, delimiter=',')
    line_count = 0
    for row in csv_reader:
        cubeList.append([row[0], row[1]]) #Keep only the flyby (column 0) and cube ID (column 1).
        line_count += 1

newCubeList = [] #The sorted cubes will go in here. 
#Now, for every item in cubeList, we find the files it points to and read them in. 
#Then we examine those files against the criteria set in the first cell. 

flybyRangeAcceptable = False

for item in cubeList:
    err = 0 #No error by default.

    if (flybyCheck == True): #Are we in the right flyby range?
        if (flybyLowerBound == item[0]):
            flybyRangeAcceptable = True
        elif (flybyUpperBound == item[0]): #Note that upper flyby bound is not inclusive. 
            flybyRangeAcceptable = False
    else: 
        flybyRangeAcceptable = True #If we're not checking for time, all of them are fine. 


    if (flybyRangeAcceptable == True): #Don't bother reading anything in if we don't want this flyby. 
        
        #This code is adapted from the VIMS Cube Visualisation Interface Notebook. 
        #It is complicated.
        #First, find the file. 
        filepath = "C:\\Users\\deran\\Desktop\\CubeCSVDatabase\\" + item[0] + "\\CM_" + item[1] + ".cub.csv"
    
        #Now we extract the axes file as well...
        cubeAxesfp = filepath.removesuffix(".csv") + ".axes.csv"
        
        #And grab the geo files. 
        cubeGeofpIR = filepath.removesuffix(".cub.csv") + "_ir_geo.cub.csv"
        cubeGeofpIRaxes = filepath.removesuffix(".cub.csv") + "_ir_geo.cub.axes.csv"
        
        #Skeleton code nabbed from https://realpython.com/python-csv/
    
        #Step 1: use the axes to determine the size of what we're dealing with.
        xAxisCube = []
        yAxisCube = []
        zAxisCube = []
        
        xAxisGeoIR = []
        yAxisGeoIR = []
        zAxisGeoIR = []
    
        try:
            with open(cubeAxesfp) as csv_file:
                csv_reader = csv.reader(csv_file, delimiter=',')
                line_count = 0
                for row in csv_reader:
                    i = 0
                    L = len(row)
                    while (i < L-1): #The last field of each row is skipped.
                        if (line_count == 0):
                            xAxisCube.append(row[i])
                        elif (line_count == 1):
                            yAxisCube.append(row[i])
                        elif (line_count == 2):
                            zAxisCube.append(row[i])
                        i = i+1
                    line_count += 1
        except Exception: #Catch read errors, but still allow the run to be interrupted by hand.
            err = 1 #Missing or malformed cube axes file.
        try:
            with open(cubeGeofpIRaxes) as csv_file:
                csv_reader = csv.reader(csv_file, delimiter=',')
                line_count = 0
                for row in csv_reader:
                    i = 0
                    L = len(row)
                    while (i < L-1):
                        if (line_count == 0):
                            xAxisGeoIR.append(row[i])
                        elif (line_count == 1):
                            yAxisGeoIR.append(row[i])
                        elif (line_count == 2):
                            zAxisGeoIR.append(row[i])
                        i = i+1
                    line_count += 1
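        except Exception: #Catch read errors, but still allow the run to be interrupted by hand.
            err = 1 #Missing or malformed geo axes file.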
            
        #We now have an x, y, and z axis. x and y axes are just ordinal, but the z axis contains wavelength in microns.
        #The lengths of these arrays tell us how to extract the data.
        
        cubeData = [[[0 for x in range(len(zAxisCube))] for x in range(len(yAxisCube))] for x in range(len(xAxisCube))]
        geoIRData = [[[0 for x in range(len(zAxisGeoIR))] for x in range(len(yAxisGeoIR))] for x in range(len(xAxisGeoIR))]
        
        #The above holds the data of the cube itself. 
        try:
            with open(filepath) as csv_file:
                csv_reader = csv.reader(csv_file, delimiter=',')
                line_count = 0
                i, j, k = 0, 0, 0
                for row in csv_reader:
                    while (i < len(xAxisCube)):
                        cubeData[i][j][k] = float(row[i])
                        if (math.isnan(cubeData[i][j][k])):
                            cubeData[i][j][k] = 0 #We set nans to zero to allow plotting to take place, careful!
                        elif (cubeData[i][j][k] < 0):
                            cubeData[i][j][k] = 0 #Negative values are nonsense.
                        elif (cubeData[i][j][k] > 1):
                            cubeData[i][j][k] = 1 #Make saturation obvious? Keep it from overloading. 
                        i = i + 1
                    i = 0
                    j = j + 1
                    if (j >= len(yAxisCube)):
                        j = 0
                        k = k + 1
                    line_count += 1
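        except Exception: #Catch read errors, but still allow the run to be interrupted by hand.
            err = 1 #Missing or malformed cube data file.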
        try:
            with open(cubeGeofpIR) as csv_file:
                csv_reader = csv.reader(csv_file, delimiter=',')
                line_count = 0
                i, j, k = 0, 0, 0
                for row in csv_reader:
                    while (i < len(xAxisGeoIR)):
                        geoIRData[i][j][k] = float(row[i])
                        if (math.isnan(geoIRData[i][j][k])):
                            geoIRData[i][j][k] = 0 #We set nans to zero to allow plotting to take place, careful!
                        elif (geoIRData[i][j][k] < -1000):
                            geoIRData[i][j][k] = 0 #The default value is an extremely negative number. Scrub it.
                        i = i + 1
                    i = 0
                    j = j + 1
                    if (j >= len(yAxisGeoIR)):
                        j = 0
                        k = k + 1
                    line_count += 1
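        except Exception: #Catch read errors, but still allow the run to be interrupted by hand.
            err = 1 #Missing or malformed geo data file.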
            
        #The data is now read in. 
        #Now we can do stuff with it.
        
        if (err == 0):
            i,j = 0,0
            bufferVal = 0.0 #Stays at zero if we're not checking resolution, so it passes the check below when resLowerBound is 0.0.
            locationFound = True
            if (resolutionCheck == True):
                bufferVal = resUpperBound+1 #Larger than the upper bound, always. 
            if (latLonCheck == True):
                locationFound = False #Can only fail to find a location if we're looking for one. 
            if (resolutionCheck == True or latLonCheck == True): #only investigate pixels if necessary. 
                while (i < len(xAxisCube)):
                    j=0
                    while (j < len(yAxisCube)):
                        if (resolutionCheck == True): #Do we care about pixel resolution?
                            if (bufferVal > geoIRData[i][j][2] and geoIRData[i][j][2] != 0.0): #Resolutions of zero are errors, remove. 
                                bufferVal = geoIRData[i][j][2]
                        if (latLonCheck == True): #Do we care about pixel location?
                            if (geoIRData[i][j][0] > latLowerBound and geoIRData[i][j][0] < latUpperBound and geoIRData[i][j][1] > lonLowerBound and geoIRData[i][j][1] < lonUpperBound):
                                locationFound = True
                        j=j+1
                    i=i+1
            if (resolutionCheck == False or (bufferVal <= resUpperBound and bufferVal >= resLowerBound)): #Is resolution satisfactory, or not being checked at all?
                if (locationFound == True): #Is the location we want in here somewhere?
                    newCubeList.append([item[0],item[1]]) #Only add if all criteria are met. 
        else:
            print("Error on", item[0], item[1])
            
    if(counterForSanity%50 == 0): print(counterForSanity)
    counterForSanity = counterForSanity + 1

print(len(newCubeList))
print(newCubeList)

Error on TA 1477222875_1
0.0
50.0
100.0
150.0
200.0
250.0
300.0
350.0
400.0
450.0
500.0
550.0
600.0
650.0
700.0
750.0
800.0
850.0
900.0
950.0
1000.0
1050.0
1100.0
1150.0
1200.0
1250.0
1300.0
1350.0
1400.0
1450.0
1500.0
1550.0
1600.0
1650.0
1700.0
1750.0
1800.0
1850.0
Error on T20 1540340637_1
1900.0
1950.0
Error on T21 1544570826_1
2000.0
2050.0
Error on T23 1547388863_1
2100.0
2150.0
Error on T23 1547419585_1
Error on T23 1547430037_1
Error on T23 1547433362_1
Error on T23 1547477310_1
Error on T23 1547478038_1
Error on T23 1547479488_1
Error on T23 1547480216_1
Error on T23 1547481576_1
2200.0
2250.0
2300.0
Error on T24 1548774526_1
2350.0
Error on T24 1548816037_1
2400.0
2450.0
Error on T26 1552225784_1
2500.0
2550.0
2600.0
2650.0
2700.0
2750.0
2800.0
2850.0
2900.0
2950.0
Error on T30 1557663911_1
3000.0
3050.0
3100.0
Error on T31 1559033025_1
3150.0
Error on T32 1560399813_1
Error on T32 1560400507_1
Error on T32 1560401224_1
Error on T32 1560418744_1
Error on T32 1560419795_1
Erro

In [4]:
#This cell saves the list. Remember to set the name of the file in the first cell.
#WILL OVERWRITE ANY FILE WITH THE SAME outputName
with open(outputName, 'w') as dataEntry: #w for write
    for item in newCubeList:
        dataEntry.write(str(item[0]) + "," + str(item[1]) + "," + "\n") #One row per cube: flyby,cubeID,
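
Each saved row has the form `flyby,cubeID,` with a trailing comma. A minimal sketch of reading such a list back in a later notebook, assuming that row layout. `load_cube_list` is a hypothetical helper name, not part of this notebook.

```python
import csv


def load_cube_list(path):
    """Return [flyby, cube_id] pairs from a saved cube list, skipping blank rows."""
    pairs = []
    with open(path, newline="") as csv_file:
        for row in csv.reader(csv_file):
            if len(row) >= 2:  # the trailing comma adds an empty third field, which is ignored
                pairs.append([row[0], row[1]])
    return pairs
```

Calling `load_cube_list(outputName)` after the save cell should return the same pairs that were in `newCubeList`.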