## Opening & Accessing Kaguya Spectral Profiler Data

First, it is necessary to import the `libpysat` module.  This notebook also imports a helper function `get_path` that makes working with the sample data shipped with `libpysat` easier.

In [2]:
import libpysat as psat
from libpysat.examples import get_path



To open a Spectral Profiler 'image', we use the `psat.Spectra.from_spectral_profiler` call.  If you are working with your own data rather than the example data, replace `get_path(<my_file_path>)` with `<my_file_path>`, since our helper function does not know where your data is stored.

In [3]:
s = psat.Spectra.from_spectral_profiler(get_path('SP_2C_02_02358_S138_E3586.spc'))

The `s` object is based on a pandas data frame.  Therefore, anything that you might normally do with a pandas data frame can be applied to the `libpysat.Spectra` object.  In this notebook we demo a few of the operations that pandas provides.

## Viewing the data

To see the first or last rows in the `Spectra` object, one can use `head` or `tail`, respectively.

In [36]:
s.head(10)

minor,id,512.6,518.4,524.7,530.4,536.5,542.8000000000001,548.7,554.5,560.5,566.7,...,CALIBRATION,SP_PELTIER,TC_MI_STATUS,CLOCK_COUNT_ERR_FLAG,SPATIAL_RESOLUTION_FLAG,GEOMETRIC_INFO_RECAL_FLAG,SUPPORT_IMAGE_LINE_POSITION,SUPPORT_IMAGE_COLUMN_POSITION,THUMBNAIL_LINE_POSITION,THUMBNAIL_COLUMN_POSITION
RAW,0,5123.0,5887.0,6375.0,6806.0,7494.0,7585.0,7525.0,7860.0,8582.0,9196.0,...,0,1,1,0,65,67,27,480,13,228
REF1,0,0.0402,0.0487,0.0497,0.052,0.0532,0.0551,0.0559,0.0571,0.0594,0.0612,...,0,1,1,0,65,67,27,480,13,228
REF2,0,0.0397,0.0482,0.0492,0.0515,0.0527,0.0545,0.0553,0.0565,0.0588,0.0605,...,0,1,1,0,65,67,27,480,13,228
QA,0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,...,0,1,1,0,65,67,27,480,13,228
RAW,1,5164.0,5955.0,6461.0,6905.0,7614.0,7709.0,7642.0,7986.0,8730.0,9366.0,...,0,1,1,0,65,67,55,480,27,228
REF1,1,0.0414,0.0502,0.0513,0.0537,0.0549,0.0568,0.0576,0.0588,0.0612,0.0631,...,0,1,1,0,65,67,55,480,27,228
REF2,1,0.0409,0.0497,0.0507,0.0531,0.0544,0.0562,0.057,0.0582,0.0606,0.0624,...,0,1,1,0,65,67,55,480,27,228
QA,1,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,...,0,1,1,0,65,67,55,480,27,228
RAW,2,5080.0,5820.0,6296.0,6711.0,7375.0,7473.0,7416.0,7741.0,8445.0,9038.0,...,0,1,1,0,65,67,83,480,40,228
REF1,2,0.0391,0.0474,0.0484,0.0506,0.0518,0.0537,0.0545,0.0556,0.0579,0.0596,...,0,1,1,0,65,67,83,480,40,228


In [37]:
s.tail(5)

minor,id,512.6,518.4,524.7,530.4,536.5,542.8000000000001,548.7,554.5,560.5,566.7,...,CALIBRATION,SP_PELTIER,TC_MI_STATUS,CLOCK_COUNT_ERR_FLAG,SPATIAL_RESOLUTION_FLAG,GEOMETRIC_INFO_RECAL_FLAG,SUPPORT_IMAGE_LINE_POSITION,SUPPORT_IMAGE_COLUMN_POSITION,THUMBNAIL_LINE_POSITION,THUMBNAIL_COLUMN_POSITION
QA,36,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,...,0,1,1,0,65,67,1036,480,492,228
RAW,37,5046.0,5774.0,6233.0,6641.0,7289.0,7375.0,7325.0,7644.0,8335.0,8911.0,...,0,1,1,0,65,67,1064,480,505,228
REF1,37,0.0387,0.0468,0.0477,0.05,0.0511,0.0529,0.0538,0.0549,0.0571,0.0588,...,0,1,1,0,65,67,1064,480,505,228
REF2,37,0.0383,0.0464,0.0473,0.0495,0.0506,0.0524,0.0533,0.0544,0.0566,0.0582,...,0,1,1,0,65,67,1064,480,505,228
QA,37,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,...,0,1,1,0,65,67,1064,480,505,228


## Viewing the data at the observation level

Above, the data is viewed as if each row were a different observational unit.  In reality, each observation is composed of four rows:  (1) a team-provided quality assurance row (QA), (2) the raw observed spectra (RAW), (3) a mare-corrected continuum (REF1), and (4) a highlands-corrected continuum (REF2).  If we want to work with each observation, we can group by the `id` and then access or loop over the observations like so:

In [6]:
sgroups = s.groupby('id')

In [38]:
# How many observations do we have?
len(sgroups)

38

Now it is possible to access each group by key.  In the case of Spectral Profiler data, these keys are simply auto-incrementing integers (0, 1, 2, ..., n).  The cell above (`len(sgroups)`) shows that this file contains 38 observations, keyed 0 through 37.  Below, we access the first group.

In [39]:
obs0 = sgroups.get_group(0)
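
The grouping pattern above can be tried without any Spectral Profiler file. The following is a minimal sketch that uses plain pandas on a made-up frame (the name `toy` and all values are illustrative assumptions); only the `(minor, id)` row layout mirrors the real `Spectra` object. It also shows the loop over observations that the text mentions.

```python
import pandas as pd

# Synthetic stand-in for a Spectra object: two observations, four rows each.
# The values are invented; only the (minor, id) row layout mirrors the real data.
rows = [(minor, obs_id) for obs_id in (0, 1) for minor in ("RAW", "REF1", "REF2", "QA")]
index = pd.MultiIndex.from_tuples(rows, names=["minor", "id"])
toy = pd.DataFrame(
    {512.6: list(range(8)), 518.4: list(range(8, 16))},
    index=index,
)

# Group on the `id` index level so that each group holds one full observation.
groups = toy.groupby(level="id")

# Loop over the observations: `obs_id` is the key, `obs` is the four-row frame.
row_labels = {}
for obs_id, obs in groups:
    row_labels[obs_id] = list(obs.index.get_level_values("minor"))

# A single observation can also be pulled out by its key.
first = groups.get_group(0)
```

Here `row_labels` maps each observation key to the labels of its four rows, and `first` is the same kind of object that `obs0` is in the cell above, minus the `Spectra`-specific attributes.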

To see just the metadata for this observation, we can access the `meta` attribute like so:

In [34]:
obs0.meta.head(4)
# obs0.meta is the correct call - I am using `.head(4)` because of a bug.

minor,id,CALIBRATION,CENTER_LATITUDE,CENTER_LONGITUDE,CLOCK_COUNT_ERR_FLAG,DPU_TEMPERATURE,EMISSION_ANGLE,GEOMETRIC_INFO_RECAL_FLAG,HALOGEN_BULB_RADIANCE,HALOGEN_BULB_TEMPERATURE1,HALOGEN_BULB_TEMPERATURE2,...,SP_POWER_P5V,SP_TEMPERATURE,SUB_SPACECRAFT_LATITUDE,SUB_SPACECRAFT_LONGITUDE,SUPPORT_IMAGE_COLUMN_POSITION,SUPPORT_IMAGE_LINE_POSITION,TC_MI_STATUS,THUMBNAIL_COLUMN_POSITION,THUMBNAIL_LINE_POSITION,VIS_FOCAL_PLANE_TEMPERATURE
RAW,0,0,-13.488591,358.607848,0,14.2704,0.60772,67,4.759,10.36,10.36,...,4.9735,20.808901,-13.517322,358.600799,480,27,1,228,13,21.059999
REF1,0,0,-13.488591,358.607848,0,14.2704,0.60772,67,4.759,10.36,10.36,...,4.9735,20.808901,-13.517322,358.600799,480,27,1,228,13,21.059999
REF2,0,0,-13.488591,358.607848,0,14.2704,0.60772,67,4.759,10.36,10.36,...,4.9735,20.808901,-13.517322,358.600799,480,27,1,228,13,21.059999
QA,0,0,-13.488591,358.607848,0,14.2704,0.60772,67,4.759,10.36,10.36,...,4.9735,20.808901,-13.517322,358.600799,480,27,1,228,13,21.059999


Likewise, it is possible to access just the observed information:

In [35]:
obs0.spectra.head(4)
# obs0.spectra is the correct call - I am using `.head(4)` because of a bug.

minor,id,512.6,518.4,524.7,530.4,536.5,542.8000000000001,548.7,554.5,560.5,566.7,...,2516.1000000000004,2524.1000000000004,2532.1000000000004,2540.0,2548.0,2556.0,2564.0,2572.0,2579.9,2587.9
RAW,0,5123.0,5887.0,6375.0,6806.0,7494.0,7585.0,7525.0,7860.0,8582.0,9196.0,...,7388.0,11511.0,9068.0,10917.0,8617.0,6806.0,11534.0,7317.0,11412.0,6100.0
REF1,0,0.0402,0.0487,0.0497,0.052,0.0532,0.0551,0.0559,0.0571,0.0594,0.0612,...,0.0854,0.1308,0.0472,0.0,0.0222,0.0,0.0,0.0,0.0213,0.0
REF2,0,0.0397,0.0482,0.0492,0.0515,0.0527,0.0545,0.0553,0.0565,0.0588,0.0605,...,0.085,0.1302,0.047,0.0,0.0221,0.0,0.0,0.0,0.0212,0.0
QA,0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,288.0,...,288.0,288.0,288.0,296.0,288.0,296.0,296.0,296.0,288.0,296.0
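
One way to picture how the `meta` and `spectra` views relate to the full frame is as a split of the columns by label type: wavelength columns carry numeric labels, while metadata columns carry string labels. The sketch below illustrates that idea with plain pandas on a made-up frame. It is an assumption for illustration only, not a description of how `libpysat` implements these attributes.

```python
import pandas as pd

# Two wavelength columns and two metadata columns; the labels are copied from
# the outputs above, the layout is simplified to a flat row index.
toy = pd.DataFrame(
    [[5123.0, 5887.0, 0, 27], [0.0402, 0.0487, 0, 27]],
    columns=[512.6, 518.4, "CALIBRATION", "SUPPORT_IMAGE_LINE_POSITION"],
)

# Flag the columns whose label is a number (a wavelength).
is_wavelength = [isinstance(label, float) for label in toy.columns]

# Boolean masks select the two complementary column sets.
spectra_part = toy.loc[:, is_wavelength]
meta_part = toy.loc[:, [not flag for flag in is_wavelength]]
```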


## Querying for data of interest

Since the `Spectra` object is a pandas data frame, it is possible to perform SQL-style queries on any field. For example:

In [14]:
subs = s.query('INCIDENCE_ANGLE < 30 & CENTER_LATITUDE < -14')
len(subs)
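
As a rough illustration of what such a query selects, here is a sketch on a small made-up frame (the values are invented; only the two column names come from the cell above). It compares the string form of the query with the equivalent boolean mask.

```python
import pandas as pd

# Invented metadata values for four rows.
toy = pd.DataFrame({
    "INCIDENCE_ANGLE": [25.0, 35.0, 28.0, 29.5],
    "CENTER_LATITUDE": [-14.5, -15.0, -13.0, -14.2],
})

# String form, as in the cell above (`and` and `&` are interchangeable here).
by_query = toy.query("INCIDENCE_ANGLE < 30 and CENTER_LATITUDE < -14")

# Equivalent boolean-mask form; the parentheses are required with `&`.
mask = (toy["INCIDENCE_ANGLE"] < 30) & (toy["CENTER_LATITUDE"] < -14)
by_mask = toy[mask]
```

Both forms keep only the rows that satisfy both conditions; the string form tends to read better when several fields are involved.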

## Accessing a subset of the spectral data by label

The columns of the `Spectra` object are labeled by wavelength.  Notice how, in the outputs above, some of the wavelength labels carry many trailing digits (for example, `542.8000000000001`).  We are seeing floating point precision issues that would normally make label-based access a pain.  Who really wants to type all of those zeros?  For that reason, the `Spectra` object supports the idea of `tolerance`.  The user can supply a wavelength value within the `tolerance` and we round under the hood.

In [30]:
# What is the tolerance value?
s.tolerance

1

In [31]:
# Use .get to only get the rows labeled 'REF1' and then get the wavelength (if one exists) within the tolerance of 511.7
s.get['REF1'][511.7].head(5)

id,512.6
0,0.0402
1,0.0414
2,0.0391
3,0.0393
4,0.0393


In [33]:
s.tolerance = 0.1
# This should raise an error, because no wavelength lies within 0.1 of 511.7.
# The try/except block keeps a long stack trace out of the tutorial.
try:
    s.get['REF1'][511.7].head(5)
except KeyError:
    print('Key Error: 511.7 is not in the index')

Key Error: 511.7 is not in the index
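
The behaviour shown in the two cells above can be mimicked in a few lines of plain Python. This sketch uses a nearest-match lookup rather than rounding, and the function name `find_label` is hypothetical; it is meant to convey the idea of a tolerance, not to reproduce the `libpysat` implementation.

```python
def find_label(labels, wanted, tolerance):
    """Return the label closest to `wanted`; raise KeyError if it is farther than `tolerance`."""
    nearest = min(labels, key=lambda label: abs(label - wanted))
    if abs(nearest - wanted) > tolerance:
        raise KeyError(f"{wanted} is not within {tolerance} of any label")
    return nearest


# A few wavelength labels copied from the outputs above.
labels = [512.6, 518.4, 524.7, 530.4, 536.5, 542.8000000000001]

# 512.6 is about 0.9 away from 511.7, so a tolerance of 1 finds it.
hit = find_label(labels, 511.7, tolerance=1)

# A small tolerance is still enough to absorb floating point noise.
exact = find_label(labels, 542.8, tolerance=0.1)

# With a tolerance of 0.1 there is no label close enough to 511.7.
try:
    find_label(labels, 511.7, tolerance=0.1)
    missed = False
except KeyError:
    missed = True
```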


It is also possible to access a range of values in a similar manner.  For example, we may only want to work with data around the 1 µm absorption band.

The syntax for grabbing the subset is called a slice.  In the first position we have the label of the rows that we want to grab, e.g., `REF1`.  In the second position we use a `:` to indicate that we want to grab every `id`, and in the third position we use `start:stop` notation to indicate that all wavelengths between 700 and 1600 should be selected.

For example:

In [36]:
sub = s.get['REF1', :, 700:1600]
sub

minor,id,704.7,710.8000000000001,716.7,722.7,728.7,734.7,740.7,746.8000000000001,752.8000000000001,758.7,...,1523.8000000000002,1531.7,1539.7,1547.7,1555.5,1563.7,1571.7,1579.6000000000001,1587.7,1595.7
REF1,0,0.0809,0.0819,0.083,0.0832,0.0844,0.0842,0.0861,0.0858,0.0864,0.0872,...,0.1673,0.1681,0.1697,0.1692,0.171,0.1698,0.1699,0.1716,0.1759,0.1733
REF1,1,0.0833,0.0843,0.0855,0.0858,0.0869,0.0867,0.0887,0.0883,0.089,0.0898,...,0.1696,0.1705,0.1719,0.1714,0.1731,0.1721,0.1717,0.1736,0.1777,0.1751
REF1,2,0.0793,0.0803,0.0814,0.0817,0.0828,0.0825,0.0845,0.0841,0.0848,0.0856,...,0.1655,0.1662,0.1676,0.1672,0.1689,0.1679,0.1677,0.1693,0.1736,0.1712
REF1,3,0.0797,0.0806,0.0818,0.082,0.0832,0.083,0.0849,0.0846,0.0852,0.086,...,0.1648,0.1656,0.1672,0.1668,0.1685,0.1675,0.1672,0.169,0.1734,0.1708
REF1,4,0.0799,0.0809,0.0821,0.0822,0.0834,0.0833,0.0852,0.0849,0.0855,0.0862,...,0.1694,0.1702,0.172,0.1715,0.1731,0.1724,0.1721,0.1741,0.1785,0.1759
REF1,5,0.0791,0.08,0.0812,0.0814,0.0825,0.0823,0.0842,0.0839,0.0845,0.0852,...,0.1647,0.1654,0.1671,0.1665,0.1681,0.1672,0.1669,0.1685,0.1727,0.1701
REF1,6,0.0797,0.0806,0.0818,0.0821,0.0832,0.083,0.085,0.0846,0.0852,0.0861,...,0.1675,0.1684,0.1698,0.1695,0.1712,0.1702,0.1699,0.1715,0.176,0.1735
REF1,7,0.08,0.081,0.0821,0.0824,0.0835,0.0834,0.0853,0.085,0.0857,0.0865,...,0.1683,0.1692,0.1707,0.1704,0.172,0.171,0.1708,0.1725,0.1767,0.1742
REF1,8,0.0823,0.0833,0.0845,0.0847,0.086,0.0858,0.0878,0.0875,0.0881,0.089,...,0.1722,0.1732,0.1748,0.1744,0.176,0.1752,0.1747,0.1768,0.1811,0.1785
REF1,9,0.0804,0.0813,0.0825,0.0828,0.084,0.0838,0.0858,0.0854,0.0861,0.087,...,0.1703,0.1713,0.1729,0.1723,0.174,0.1731,0.1729,0.1747,0.179,0.1767
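
For comparison, a similar selection can be written with plain pandas indexing. The sketch below works on a made-up frame (the values are all zero and the name `toy` is illustrative; the wavelength labels are copied from outputs in this notebook). It assumes the wavelength columns are sorted, which label-based slicing with bounds that are not themselves labels requires.

```python
import pandas as pd

# Rows keyed by (minor, id), columns keyed by wavelength.
rows = [(minor, obs_id) for obs_id in (0, 1) for minor in ("RAW", "REF1")]
index = pd.MultiIndex.from_tuples(rows, names=["minor", "id"])
wavelengths = [698.6, 704.7, 710.8000000000001, 1587.7, 1595.7, 2516.1000000000004]
toy = pd.DataFrame(0.0, index=index, columns=wavelengths)

# First position: keep only the REF1 rows (this drops the `minor` level).
ref1 = toy.xs("REF1", level="minor")

# Second and third positions: every id, wavelengths from 700 to 1600.
# `.loc` slices by label, so neither bound needs to be an existing column.
window = ref1.loc[:, 700.0:1600.0]
```

Unlike `s.get`, this plain pandas form drops the `minor` level from the row index, and it offers no tolerance for single-label lookups.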


## Format Conversion
Finally, it is possible to convert a `libpysat` Spectra object into any of the formats supported by pandas.  For example, below, we convert the `.spc` file into a CSV file that can be opened and worked with in Excel.

In [25]:
s.to_csv('SP_2C_02_02358_S138_E3586.csv')
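
If the CSV is later read back into pandas, the two row-label columns and the numeric column labels need to be restored by hand. The following sketch shows one way to do that on a tiny made-up frame, writing to an in-memory buffer instead of a file. The result is a plain pandas data frame, not a `Spectra` object.

```python
import io

import pandas as pd

# Two rows of one observation; values are copied from the first output above.
index = pd.MultiIndex.from_tuples([("RAW", 0), ("REF1", 0)], names=["minor", "id"])
toy = pd.DataFrame(
    [[5123.0, 5887.0], [0.0402, 0.0487]],
    index=index,
    columns=[512.6, 518.4],
)

# Write the CSV into a buffer rather than onto disk.
buffer = io.StringIO()
toy.to_csv(buffer)
buffer.seek(0)

# Rebuild the (minor, id) row index while reading.
restored = pd.read_csv(buffer, index_col=["minor", "id"])

# Column labels come back as strings, so convert them to floats again.
restored.columns = restored.columns.astype(float)
```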

In [30]:
!head -n 5 SP_2C_02_02358_S138_E3586.csv 

minor,id,512.6,518.4,524.7,530.4,536.5,542.8000000000001,548.7,554.5,560.5,566.7,572.6,578.5,584.5,590.6,596.7,602.5,608.6,614.6,620.5,626.7,632.7,638.6,644.6,650.6,656.6,662.6,668.8000000000001,674.7,680.6,686.7,692.6,698.6,704.7,710.8000000000001,716.7,722.7,728.7,734.7,740.7,746.8000000000001,752.8000000000001,758.7,764.8000000000001,770.7,776.7,782.7,788.8000000000001,794.7,800.7,806.8000000000001,812.7,818.7,824.8000000000001,830.8000000000001,836.8000000000001,842.8000000000001,848.8000000000001,854.6,860.7,866.7,872.7,878.7,884.6,890.7,896.6,902.7,908.7,914.6,920.6,926.6,932.6,938.6,944.6,950.6,955.4000000000001,963.5,971.4000000000001,979.7,987.6,993.7,1013.1,1019.5,1027.7,1035.5,1043.6000000000001,1051.7,1059.7,1067.8,1075.8,1083.6000000000001,1091.8,1099.7,1107.7,1115.9,1123.8,1131.8,1139.7,1147.8,1155.7,1163.8,1171.8,1179.8,1187.8,1195.8,1203.9,1211.9,1219.8,1227.9,1235.9,1244.0,1252.0,1259.8000000000002,1267.8000000000002,1275.9,1284.2,1292.0,1299.8000000000002,1307.8000000