## PVL
This notebook focuses on using the PVL library (https://github.com/planetarypy/pvl) to work with image labels.  

The link includes installation instructions; in short, `pip install pvl`.  Once installed, we are off to the races.

In [1]:
import pvl

The PVL library accesses the data set and extracts the PVL formatted header into a Python dictionary.  The Python dictionary, like the PVL, is nested.  Therefore, we define a helper function that recursively searches a dictionary for a key.

Note: This function returns the **first** instance of the key.

In [2]:
def find_in_dict(obj, key):
    """
    Recursively find an entry in a dictionary

    Parameters
    ----------
    obj : dict
          The dictionary to search
    key : str
          The key to find in the dictionary

    Returns
    -------
    item : obj
           The first value found for the key, or None if the key is absent
    """
    if key in obj:
        return obj[key]
    for k, v in obj.items():
        if isinstance(v, dict):
            # Search nested dictionaries depth-first
            item = find_in_dict(v, key)
            if item is not None:
                return item
    return None
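
Because the helper stops at the first match, a variant that collects every match can be handy when the same key appears in several nested groups. The following is a minimal sketch that uses a plain nested dictionary as a stand-in for a label; `find_all_in_dict` and `sample` are hypothetical names used for illustration and are not part of the PVL library.

```python
def find_all_in_dict(obj, key):
    """Yield every value stored under `key`, at any nesting depth."""
    for k, v in obj.items():
        if k == key:
            yield v
        if isinstance(v, dict):
            # Descend into nested dictionaries as well
            for item in find_all_in_dict(v, key):
                yield item


# A small nested dictionary standing in for a parsed label (made-up values)
sample = {
    'NAME': 'outer',
    'TABLE': {'NAME': 'inner', 'ROWS': 83},
}
print(sorted(find_all_in_dict(sample, 'NAME')))
```

Writing the helper as a generator keeps it lazy, so a caller can still stop at the first match with `next(find_all_in_dict(sample, 'NAME'))`.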

The PVL can be read in two ways (in line with other Python I/O modules, like json):

* `pvl.load` - This is the primary way to read a label from an image file.

* `pvl.loads` - Create a PVL object from a string.  This is nice if you are going to use the `with open()` syntax to manually parse out a subset of the label or you want to generate a custom label using Python and validate it via the PVL library.

The following example extracts the PVL header from a Kaguya Spectral Profiler (.spc) file. This is an 'image' file, in the sense that the header is followed by binary observation data.

In [4]:
label = pvl.load('data/SP_2C_02_07441_S136_E3584.spc')

The label is a PVL label object that also acts as a Python dictionary.  Therefore, values can be accessed by key.

In [5]:
label

Label([
  (u'PDS_VERSION_ID', u'PDS3')
  (u'RECORD_TYPE', u'UNDEFINED')
  (u'FILE_NAME', u'SP_2C_02_07441_S136_E3584.spc')
  (u'PRODUCT_ID', u'SP_2C_02_07441_S136_E3584')
  (u'DATA_FORMAT', u'PDS')
  (u'^ANCILLARY_AND_SUPPLEMENT_DATA', Units(value=24746, units=u'BYTES'))
  (u'^SP_SPECTRUM_WAV', Units(value=38524, units=u'BYTES'))
  (u'^SP_SPECTRUM_RAW', Units(value=39116, units=u'BYTES'))
  (u'^SP_SPECTRUM_REF2', Units(value=88252, units=u'BYTES'))
  (u'^SP_SPECTRUM_RAD', Units(value=137388, units=u'BYTES'))
  (u'^SP_SPECTRUM_REF1', Units(value=186524, units=u'BYTES'))
  (u'^SP_SPECTRUM_QA', Units(value=235660, units=u'BYTES'))
  (u'^L2D_RESULT_ARRAY', Units(value=284796, units=u'BYTES'))
  (u'SOFTWARE_NAME', u'RGC_SP')
  (u'SOFTWARE_VERSION', u'2.10.3')
  (u'PROCESS_VERSION_ID', u'L2C')
  (u'PRODUCT_CREATION_TIME',
   datetime.datetime(2012, 4, 21, 4, 29, 37, tzinfo=<UTC>))
  (u'PROGRAM_START_TIME',
   datetime.datetime(2012, 4, 21, 4, 26, 11, tzinfo=<UTC>))
  (u'PRODUCER_ID', u'LISM'

Here the `PDS_VERSION_ID` key is accessed and a Python unicode object is returned.

In [9]:
label['PDS_VERSION_ID']

u'PDS3'

Accessing a key with associated units returns a pvl `Units` object.

In [12]:
print(label['^ANCILLARY_AND_SUPPLEMENT_DATA'])
print(type(label['^ANCILLARY_AND_SUPPLEMENT_DATA']))

Units(value=24746, units=u'BYTES')
<class 'pvl._collections.Units'>


The unit object has two commonly used attributes: `.value` and `.units`.  These do exactly what you might expect: they return the value and the unit, respectively.

In [16]:
anc = label['^ANCILLARY_AND_SUPPLEMENT_DATA']
print(anc.value)
print(anc.units)

24746
BYTES
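
One practical use of `.value` is seeking to the start of a data object before reading it. The sketch below is not tied to the PVL library: it uses a `namedtuple` as a hypothetical stand-in for the unit object, an in-memory byte stream instead of the real file, and a hypothetical helper named `read_at_pointer`. It also assumes that byte pointers count from 1, as is the PDS3 convention, so check this against the data product documentation before relying on it.

```python
import io
from collections import namedtuple

# Hypothetical stand-in for the pvl unit object (same two attributes)
Units = namedtuple('Units', ['value', 'units'])


def read_at_pointer(stream, pointer, nbytes):
    """Read nbytes from stream, starting at a 1-based byte pointer."""
    if pointer.units != 'BYTES':
        raise ValueError('expected a pointer in BYTES, got %r' % (pointer.units,))
    # Assumption: byte pointers count from 1, so subtract 1 before seeking
    stream.seek(pointer.value - 1)
    return stream.read(nbytes)


# Fake file: 4 "header" bytes followed by 4 "data" bytes
fake_file = io.BytesIO(b'HEADDATA')
data = read_at_pointer(fake_file, Units(value=5, units='BYTES'), 4)
print(data)
```

With a real file, the stream would come from `open(path, 'rb')` and the pointer from the label.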


The '^ANCILLARY_AND_SUPPLEMENT_DATA' key returns offset information for binary reading.  The 'ANCILLARY_AND_SUPPLEMENT_DATA' key (without the caret) is the first nested key.  Accessing this key returns a LabelObject.

In [18]:
anc = label['ANCILLARY_AND_SUPPLEMENT_DATA']
print(type(anc))
print(anc)

<class 'pvl._collections.LabelObject'>
LabelObject([
  (u'INTERCHANGE_FORMAT', u'BINARY')
  (u'ROWS', 83)
  (u'COLUMNS', 43)
  (u'ROW_BYTES', 166)
  (u'COLUMN',
   LabelObject([
    (u'NAME', u'SPACECRAFT_CLOCK_COUNT')
    (u'DATA_TYPE', u'IEEE_REAL')
    (u'UNIT', u'sec')
    (u'START_BYTE', 1)
    (u'BYTES', 8)
  ]))
  (u'COLUMN',
   LabelObject([
    (u'NAME', u'VIS_FOCAL_PLANE_TEMPERATURE')
    (u'DATA_TYPE', u'IEEE_REAL')
    (u'UNIT', u'degC')
    (u'START_BYTE', 9)
    (u'BYTES', 4)
  ]))
  (u'COLUMN',
   LabelObject([
    (u'NAME', u'NIR1_FOCAL_PLANE_TEMPERATURE')
    (u'DATA_TYPE', u'IEEE_REAL')
    (u'UNIT', u'degC')
    (u'START_BYTE', 13)
    (u'BYTES', 4)
  ]))
  (u'COLUMN',
   LabelObject([
    (u'NAME', u'NIR2_FOCAL_PLANE_TEMPERATURE')
    (u'DATA_TYPE', u'IEEE_REAL')
    (u'UNIT', u'K')
    (u'START_BYTE', 17)
    (u'BYTES', 4)
  ]))
  (u'COLUMN',
   LabelObject([
    (u'NAME', u'SPECTROMETER_TEMPERATURE_1')
    (u'DATA_TYPE', u'IEEE_REAL')
    (u'UNIT', u'degC')
    (u

It is possible to access the 'INTERCHANGE_FORMAT' key using standard nested dictionary notation.

In [20]:
label['ANCILLARY_AND_SUPPLEMENT_DATA']['INTERCHANGE_FORMAT']

u'BINARY'
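
Chained indexing raises a `KeyError` as soon as one level is missing. When a label may or may not contain a given group, a small path-based lookup with a default can be more forgiving. This is a minimal sketch on a plain dictionary; `get_nested` is a hypothetical helper, not part of the PVL library, and `sample` is a made-up stand-in for the label.

```python
def get_nested(obj, path, default=None):
    """Follow a sequence of keys into nested dictionaries."""
    current = obj
    for key in path:
        try:
            current = current[key]
        except (KeyError, TypeError):
            # Missing key, or tried to index into a non-container value
            return default
    return current


# Plain dictionary standing in for the label
sample = {'ANCILLARY_AND_SUPPLEMENT_DATA': {'INTERCHANGE_FORMAT': 'BINARY'}}

print(get_nested(sample, ['ANCILLARY_AND_SUPPLEMENT_DATA', 'INTERCHANGE_FORMAT']))
print(get_nested(sample, ['ANCILLARY_AND_SUPPLEMENT_DATA', 'NO_SUCH_KEY'], default='n/a'))
```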

Alternatively, the `find_in_dict` function described above can be used.

In [24]:
value = find_in_dict(label, 'INTERCHANGE_FORMAT')
print(value)

BINARY
