This notebook illustrates the development of the JSON catalog function for VIIRS SDS files that are organised in per-overpass directories, following GINA's setup (at least as of 2015).

First, the imports:

In [1]:
%matplotlib inline
from __future__ import print_function, division
import sys, os
import json

In [2]:
from shapely.geometry import Polygon

In [3]:
from pygaarst import raster
import viirstools as vt
import viirsswathtools as vst

During development, I usually want to reload the external module regularly, so let's put a cell to do so here. 

In [41]:
reload(vt)

<module 'viirstools' from 'viirstools.py'>
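
A side note on portability: the bare `reload()` call above is a Python 2 builtin. If the notebook were run under Python 3, one possible approach is a guarded import, sketched below. The `json` module is only a stand-in for `viirstools` so that the snippet runs on its own.

```python
try:
    # Python 3.4+: reload lives in importlib
    from importlib import reload
except ImportError:
    # Python 2: reload() is already available as a builtin
    pass

import json  # stand-in for the viirstools helper module

# reload() returns the freshly re-imported module object
json = reload(json)
print(json.__name__)
```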

Then directories, and some helper definitions.

In [4]:
ALT1 = True
ALT2 = False 

basedir = '/Volumes/cwdata1/VIIRS/GINA/dds.gina.alaska.edu/NPP/viirs/'
outdir = '/Volumes/SCIENCE_mobile_Mac/Fire/DATA_BY_PROJECT/2015VIIRSMODIS/rasterout/'
if ALT1:
    basedir = '/Volumes/SCIENCE_mobile_Mac/Fire/DATA_BY_PROJECT/2015VIIRSMODIS/VIIRS/'
elif ALT2:
    basedir = '/Volumes/SCIENCE/Fire/DATA_BY_AREA/2015/VIIRS/'

if os.path.isdir(basedir):
    print(basedir, "exists")
else:
    print("Please check directory {}: cannot access it.".format(basedir))

/Volumes/SCIENCE_mobile_Mac/Fire/DATA_BY_PROJECT/2015VIIRSMODIS/VIIRS/ exists


In [5]:
overpasses = [
    u'2015_06_14_165_1148',
    u'2015_06_14_165_2144',
    u'2015_06_14_165_2325',
]

We can use the existing getfilesbygranule() function (slightly misnamed, since it actually retrieves files by overpass and then, one level lower, by granule) to get some test files.

In [6]:
myviirsfiles = vt.getfilesbygranule(basedir, overpasses)

In [7]:
allviirsfiles = vt.getfilesbygranule(basedir)

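Step 1: Define a function to check whether the set of files for a granule is complete at first glance, that is, whether all files of a given dataset are present (2 for the DNB, 6 for the I-bands and 17 for the M-bands, so 25 files per granule in total).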

In [8]:
BANDFILES = {
    u'dnb': [u'SVDNB', u'GDNBO'],
    u'iband': [u'SVI01', u'SVI02', u'SVI03', u'SVI04', u'SVI05', u'GITCO'],
    u'mband': [u'SVM01', u'SVM02', u'SVM03', u'SVM04', u'SVM05', 
               u'SVM06', u'SVM07', u'SVM08', u'SVM09', u'SVM10', 
               u'SVM11', u'SVM12', u'SVM13', u'SVM14', u'SVM15', 
               u'SVM16', u'GMTCO'],
}

def checkviirsganulecomplete(granuledict, dataset='iband'):
    """Return True if all files of the given dataset are listed for a granule."""
    dataset = dataset.lower()
    if dataset not in BANDFILES:
        print("Unknown band type '{}' for viirs granule. Valid values are: {}.".format(
            dataset, ', '.join(BANDFILES.keys())))
        return None
    for bandname in BANDFILES[dataset]:
        try:
            if not granuledict[bandname]:
                # key is present, but no file name is recorded for it
                print("detected missing band {}".format(bandname))
                return False
        except KeyError:
            print("detected missing key for band {}".format(bandname))
            return False
    return True

Here's how it is used:

In [16]:
checkviirsganulecomplete(myviirsfiles['2015_06_14_165_1148']['20150614_1152377'], 'mband')

True
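
The check above stops at the first missing band. When debugging an incomplete download it can help to see all gaps at once. Here is a minimal sketch of such a variant; `missingbands` is a hypothetical helper that is not part of `viirstools`, and the file names are placeholders rather than real product names.

```python
from __future__ import print_function

# I-band subset of the BANDFILES constant, repeated so the snippet is self-contained
IBAND_FILES = [u'SVI01', u'SVI02', u'SVI03', u'SVI04', u'SVI05', u'GITCO']

def missingbands(granuledict, required=IBAND_FILES):
    """Return the required band keys that are absent or have an empty value."""
    return [band for band in required if not granuledict.get(band)]

# Hypothetical granule record with placeholder file names
granule = {band: band + u'_placeholder.h5' for band in IBAND_FILES}
del granule[u'SVI03']      # simulate a missing key
granule[u'GITCO'] = u''    # simulate a key with no file recorded

print(missingbands(granule))
```

An empty list then plays the role of the `True` result returned by the function above.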

Step 2: Build a catalog keyed by granule: first run the existing function, then apply the completeness test, then add metadata and supplementary information:

 * Whether the I-band, M-band and DNB files are complete
 * The granule ID and orbit number
 * A WKT string for the edge of the I-band raster geolocation, in native WGS84. Note that this is unlikely to be a valid polygon, but it can be transformed into one by projecting. It's a useful string. The M-band and DNB locations won't be horribly different. We can add them at a later stage if we want.
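
To give an idea of what the WKT string in the last item looks like, here is a minimal pure-Python sketch. The catalog function below uses shapely's `Polygon(...).wkt` instead; `edgetowkt` is a hypothetical stand-in, its number formatting may differ from shapely's, and the coordinates are made up.

```python
from __future__ import print_function

def edgetowkt(lons, lats):
    """Format edge coordinates as a WKT polygon string with a closed ring."""
    points = list(zip(lons, lats))
    if points[0] != points[-1]:
        points.append(points[0])  # a WKT ring repeats its first vertex at the end
    ring = ', '.join('{} {}'.format(lon, lat) for lon, lat in points)
    return 'POLYGON (({}))'.format(ring)

# Made-up corner coordinates in degrees (longitude, latitude), not real swath data
lons = [-150.0, -140.0, -140.0, -150.0]
lats = [60.0, 60.0, 65.0, 65.0]

print(edgetowkt(lons, lats))
# POLYGON ((-150.0 60.0, -140.0 60.0, -140.0 65.0, -150.0 65.0, -150.0 60.0))
```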

In [31]:
def getgranulecatalog(basedir, overpassdirlist=None):
    intermediary = vt.getfilesbygranule(basedir, scenelist=overpassdirlist)
    catalog = {}
    for overpass in intermediary:
        for granule in intermediary[overpass]:
            if granule in ['dir', 'message']: continue
            print(granule)
            catalog[granule] = intermediary[overpass][granule]
            catalog[granule][u'dir'] = intermediary[overpass]['dir']
            for datasettype in BANDFILES:
                # pass the dataset type explicitly; otherwise only the default ('iband') is checked
                catalog[granule][datasettype + u'_complete'] = checkviirsganulecomplete(
                    catalog[granule], datasettype)
            if catalog[granule][u'iband_complete']:
                try:
                    viirs = raster.VIIRSHDF5(os.path.join(
                            catalog[granule][u'dir'], 
                            catalog[granule][u'SVI01']))
                except IOError:
                    print("cannot access data file for I-band in {}".format(granule))
                    catalog[granule][u'iband_complete'] = False
                    continue
                catalog[granule][u'granuleID'] = viirs.meta[u'Data_Product'][u'AggregateBeginningGranuleID']
                catalog[granule][u'orbitnumber'] = viirs.meta[u'Data_Product'][u'AggregateBeginningOrbitNumber']
                try:
                    catalog[granule][u'ascending_node'] = viirs.ascending_node
                    edgelons, edgelats = vt.getedge(viirs)
                except IOError:
                    print("cannot access geodata file for I-band in {}".format(granule))
                    catalog[granule][u'iband_complete'] = False
                    viirs.close()  # release the data file before skipping this granule
                    continue
                catalog[granule][u'edgepolygon_I'] = Polygon(zip(edgelons, edgelats)).wkt
                viirs.close()
    return catalog 

... and here is how it's used.

In [42]:
singlecata = getgranulecatalog(basedir, ['2015_06_14_165_1148'])

20150614_1152377
20150614_1151123
20150614_1155285
20150614_1156521
20150614_1149468
20150614_1158175
20150614_1148213
20150614_1154031


After adding the two functions and the constant to the helper file `viirstools.py`, we can call the catalog function from there:

In [33]:
cata = vt.getgranulecatalog(basedir)

20150527_2048058
20150527_2049312
20150527_2042242
20150527_2050566
20150527_2041002
20150527_2046404
20150527_2045150
20150527_2043496
20150527_2226152
20150527_2224498
20150527_2230314
20150527_2231568
20150527_2229060
20150527_2223242
20150527_2227406
20150528_1034340
20150528_1028524
20150528_1031432
20150528_1027270
20150528_1033086
20150528_1030178
20150528_2027447
20150528_2024539
20150528_2026193
20150528_2030355
20150528_2032010
20150528_2029101
20150528_2023302
20150528_2213011
20150528_2207195
20150528_2210103
20150528_2204287
20150528_2208449
20150528_2203032
20150528_2205541
20150528_2211357
20150528_2352359
20150528_2351105
20150528_2348197
20150528_2346543
20150528_2345289
20150528_2344033
20150528_2349451
20150529_2151147
20150529_2152401
20150529_2154055
20150529_2149493
20150529_2145330
20150529_2148239
20150529_2146585
20150529_2144093
20150529_2330495
20150529_2329240
20150529_2333403
20150529_2325077
20150529_2332149
20150529_2327586
20150529_2326332
20150614_10142

... and write out the result as JSON to a file. 

In [34]:
with open(os.path.join(basedir, 'viirsgranulecatalog.json'), 'w') as dest:
    json.dump(cata, dest, indent=2)
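
Reading the catalog back is the mirror operation. The sketch below does a round trip through a temporary file and then selects the granules flagged as I-band complete. The two-entry catalog is a hypothetical stand-in: the completeness flags and directories are invented for illustration, and real entries carry many more keys.

```python
from __future__ import print_function
import json
import os
import tempfile

# Hypothetical miniature catalog; flags and directories are invented
cata = {
    '20150614_1152377': {'iband_complete': True, 'dir': '/placeholder/a'},
    '20150614_1151123': {'iband_complete': False, 'dir': '/placeholder/b'},
}

# Write to a temporary directory rather than the real data directory
path = os.path.join(tempfile.mkdtemp(), 'viirsgranulecatalog.json')
with open(path, 'w') as dest:
    json.dump(cata, dest, indent=2)

with open(path) as src:
    loaded = json.load(src)

# Keep only the granules whose I-band files were all found
complete = sorted(gran for gran, rec in loaded.items() if rec['iband_complete'])
print(complete)
```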