In [None]:
import hylite
import numpy as np

In [None]:
%matplotlib inline

### 1. Data types

Hyperspectral data can take many forms. *hylite* uses polymorphic data structures to make analyses as generic and as smooth as possible, so that most analyses written for one data type (e.g., images) can also be run on another (e.g., hyperclouds).

Generally speaking, hyperspectral data come in two forms: (1) spectral lists (e.g., spectral libraries, hyperclouds), or (2) spectral data cubes (e.g., hyperspectral images). *hylite* thus provides a generic `HyData` class containing basic functions that are independent of the dimensionality of the data. Child classes then extend it to add functionality specific to each hyperspectral data type. These are:

* HyImage - for hyperspectral image cubes.
* HyCloud - for hypercloud data (spectra associated with spatially located points).
* HyLibrary - for spectral library data.
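
The idea behind this kind of hierarchy can be illustrated with a small, self-contained sketch. The class names below (`SpectralData`, `SpectralList`, `SpectralCube`) are hypothetical stand-ins and are not hylite's actual implementation; they only show how keeping the band axis last lets one base-class method serve both layouts.

```python
import numpy as np

class SpectralData:
    """Hypothetical base class: spectra stored with bands on the last axis."""
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)

    def band_count(self):
        # bands are always the last axis, whatever the layout
        return self.data.shape[-1]

    def mean_spectrum(self):
        # average over every axis except the band axis
        spatial_axes = tuple(range(self.data.ndim - 1))
        return np.nanmean(self.data, axis=spatial_axes)

class SpectralList(SpectralData):
    """2-D (id, band) data, e.g. a spectral library or point cloud."""
    def point_count(self):
        return self.data.shape[0]

class SpectralCube(SpectralData):
    """3-D (x, y, band) data, e.g. an image."""
    def xdim(self):
        return self.data.shape[0]

    def ydim(self):
        return self.data.shape[1]

points = SpectralList(np.ones((5, 4)))   # 5 spectra, 4 bands
cube = SpectralCube(np.ones((3, 2, 4)))  # 3 x 2 pixels, 4 bands

# the same inherited methods work on both layouts
print(points.band_count(), cube.band_count())
print(points.mean_spectrum().shape, cube.mean_spectrum().shape)
```

Because the shared methods never assume a particular number of spatial axes, an analysis written against the base class runs unchanged on either subclass.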

Additionally, collections of hyperspectral and associated data (e.g., geometry, illumination properties etc.) often need to be grouped together. This is simplified in *hylite* using a directory-like data structure *HyCollection* and its inheriting classes. 

In the following examples we will explore these data types and simple ways to interact with them.

---- 

**Exercise:** *Explore the documentation for the different data objects using the `?` operator*

In [None]:
from hylite import HyData, HyImage, HyCloud, HyLibrary, HyHeader, HyCollection, HyScene

In [None]:
HyImage?

----

### 2. Loading data

Functions for loading and saving a variety of data types are implemented in the *hylite.io* package. To keep your code simple, these can be called via the generic *io.load* and *io.save* functions in most situations. File formats that can be read / written this way include:

* **images**: *ENVI, .jpg, .png, .bmp, .tiff*
* **point clouds**: *.ply, .laz* ( [CloudCompare](https://cloudcompare.org/) is a handy open-source tool for converting / manipulating point cloud data ). Hyperclouds with associated scalar data (e.g. spectra, geometric attributes etc.) can be stored in *.ply* format.
* **HyCollection** and **HyScene** directories: *.hyc, .hys*
* **header files**: *.hdr* (note that data associated with a *.hdr* file will also be loaded; use `load_header` to open only the header file)
* **spectral libraries**: *.csv, .sed*
* **numpy arrays**: *.npy*


-------------

 **Exercise:** Load some different file types.


In [None]:
# import IO functionality
from hylite import io

# open an ENVI image
image = io.load( 'test_data/image.hdr' )

# open a .ply point cloud
cloud = io.load( 'test_data/hypercloud.ply' )
cloud.decompress() # this was compressed from float to integer to save space; so we need to convert it back

----------------

### 3. Metadata and header files

In keeping with the standard ENVI file format, metadata associated with hyperspectral datasets (e.g. band widths and wavelengths, acquisition parameters, etc.) can be stored in a simple text header (.hdr) file. We have extended the amount of information that can be stored in headers to include e.g. calibration panel data (see *hylite.header.get_panel*) and camera pose information (see *hylite.header.get_camera*).

All *hylite* data objects have a `foo.header` attribute that provides access to header data. This inherits from Python's built-in dictionary type, so specific keywords can be accessed or set using `foo.header['some_param']`. Note that keys and values in this dictionary are both stored as text (at least when the header is written to disk), so data types such as arrays need to be parsed using the appropriate function (e.g. *header.get_list*).
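
To make the "everything is text" point concrete, here is a minimal sketch of a dictionary-based header. The `TextHeader` class and its parsing logic are hypothetical and written only for illustration (hylite's own header class may parse values differently); the sketch assumes list values use the brace-delimited, comma-separated form found in ENVI headers.

```python
import numpy as np

class TextHeader(dict):
    """Hypothetical dictionary-based header whose values are stored as text."""
    def get_list(self, key):
        # parse "{450.0, 550.0, 650.0}" (or the same without braces)
        # into a numpy array of floats
        text = str(self[key]).strip().strip('{}')
        return np.array([float(v) for v in text.split(',') if v.strip()])

header = TextHeader()
header['bands'] = '3'                              # stored as a string
header['wavelength'] = '{450.0, 550.0, 650.0}'     # a list, also stored as text

n_bands = int(header['bands'])       # scalars need an explicit cast
wav = header.get_list('wavelength')  # lists need to be parsed
print(n_bands, wav)
```

The same pattern applies to any value read from a header: cast scalars explicitly and parse lists before doing arithmetic with them.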

---- 

**Exercise:** View header file contents and add / remove keys.

In [None]:
cloud.header.print() # view the keys and associated data stored in the header file

In [None]:
#access data stored in the header file. Note that this is a string.
print(cloud.header['bands'], type(cloud.header['bands'])) 
assert int(cloud.header['bands']) == cloud.band_count() # check value in header matches number of bands in dataset

In [None]:
# get a list from the header file as a numpy array
wav = cloud.header.get_list('wavelength')
print("Hyperspectral bands range from: %s nm - %s nm" %( np.min(wav), np.max(wav)) )

In [None]:
# get camera pose information from the header file
cam = cloud.header.get_camera(0)
print("Sensor position is [%.1f,  %.1f,  %.1f] m" % tuple(cam.pos))

In [None]:
# Add some random information to the header
cloud.header['myname'] = 'Chuck'
cloud.header['wavenumber'] = 1. / (wav*1e-7)

In [None]:
cloud.header.print()

Note that these values will be saved on calling *io.save(...)*, providing an easy way to create persistent metadata.

----
### 4. Hyperspectral data arrays

Most hyperspectral analyses require some form of custom data munging, so the raw hyperspectral data arrays are left very exposed in *hylite*. Any `HyData` instance (including e.g. `HyImage` or `HyCloud` datasets) has a `foo.data` numpy array that contains the raw spectra. This provides easy access for processing, but also makes it easy to corrupt datasets, so use with care!

The shape of the data array will vary from 2-D (id, band) for spectral libraries and hyperclouds to 3-D (x,y,band) for image cubes.
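
A common consequence of these two layouts is that per-spectrum operations are easiest to write against the 2-D form. The sketch below uses a synthetic numpy array (not a hylite object) to show one way of flattening a cube into a spectral list, processing it, and restoring the cube shape; the scaling step is an arbitrary example operation.

```python
import numpy as np

rng = np.random.default_rng(0)
cube = rng.random((4, 3, 5))  # synthetic (x, y, band) image cube

# flatten the spatial axes so that each row is one spectrum
spectra = cube.reshape(-1, cube.shape[-1])  # shape (12, 5)

# example per-spectrum operation: scale each spectrum to a maximum of 1
scaled = spectra / spectra.max(axis=1, keepdims=True)

# restore the original (x, y, band) layout
scaled_cube = scaled.reshape(cube.shape)
print(spectra.shape, scaled_cube.shape)
```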

In [None]:
print(cloud.data.shape, "= (pointID, bandID)")

In [None]:
print(image.data.shape, "= (x,y,bandID)")

In [None]:
image.data[ np.isnan( image.data ) ] = 0 # data arrays can be directly modified. Do this with care!

------
**Important note:** Hyperspectral bands can be referenced based on either their index (e.g. band number 10), or by the wavelength specified in the image metadata (e.g. 1000 nm). 

To seamlessly distinguish these two methods, hylite treats integer values (e.g., 1, 2, 1000) as indices and floating point values (e.g. 450.0, 750.0, 1000.0) as wavelengths (in the units defined in the header file, which are normally nanometers). 

**TLDR: Integers represent indices, floating point values represent wavelengths.**

----

Wavelengths can be converted to band indices using the `get_band_index` function, which returns the closest band to the specified wavelength.

In [None]:
print("Index of 2200. nanometers in hypercloud:", cloud.get_band_index(2200.0))

In [None]:
print("Index of 2200. nanometers in image:", image.get_band_index(2200.0))

Similarly, band indices can be converted to wavelengths using the `get_wavelengths()` function, which returns an array containing the wavelength of each band.

In [None]:
print( image.get_wavelengths()[18])

Note that an error will be thrown if the difference between the requested wavelength and closest entry in the wavelength array is too large. This tolerance can be adjusted using the global *hylite.band_select_threshold* value.
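
The lookup logic described above (integers are indices, floats are wavelengths, with a tolerance on the nearest match) can be sketched in a few lines. The `band_index` helper, its `threshold` default of 10.0, and the synthetic wavelength array are all assumptions made for illustration; this is not hylite's implementation of `get_band_index`.

```python
import numpy as np

def band_index(wavelengths, band, threshold=10.0):
    """Hypothetical helper: ints are treated as indices, floats as wavelengths."""
    if isinstance(band, (int, np.integer)):
        return int(band)  # already a band index
    wavelengths = np.asarray(wavelengths, dtype=float)
    idx = int(np.argmin(np.abs(wavelengths - band)))  # closest band
    if abs(wavelengths[idx] - band) > threshold:
        raise ValueError("No band within %.1f of %.1f" % (threshold, band))
    return idx

wav = np.linspace(2000.0, 2400.0, 41)  # synthetic bands with 10 nm spacing

print(band_index(wav, 5))        # integer: returned unchanged
print(band_index(wav, 2203.0))   # float: index of the closest wavelength
print(band_index(wav, 2450.0, threshold=60.0))  # accepted with a looser tolerance
```

With the default tolerance, `band_index(wav, 2450.0)` raises a `ValueError` because the closest band (2400 nm) is 50 nm away.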

In [None]:
print( "Provided wavelengths must be within %.1f nm of existing bands." % hylite.band_select_threshold )
#image.get_band_index( 2410. ) # throws an error
hylite.band_select_threshold = 20
image.get_band_index( 2410. ) # does not throw an error

Hyperclouds have several additional data arrays containing geometric attributes and (if defined) point colours and normal vectors. These can be accessed using:
* `foo.xyz` (point positions)
* `foo.rgb` (point colours from e.g. SfM model, if defined)
* `foo.normals` (point normals, if defined)

In [None]:
print("Point #10 colour: ", cloud.rgb[10,:])
print("Point #10 normal: ", cloud.normals[10,:])
print("Point #10 position: ", cloud.xyz[10,:])

### 5. Quick plotting

Visualisation will be covered in detail in a subsequent notebook, but the `quick_plot(...)` function provides a good example
of the flexibility provided by these different indexing methods. 

-----

In [None]:
print( hylite.SWIR ) # some preset bands for false-colour visualisation with SWIR data. Note the floating point. 

In [None]:
fig,ax = image.quick_plot(hylite.SWIR)
fig.show()

In [None]:
fig,ax = image.quick_plot( 0, cmap='coolwarm' ) # plot with band index
fig.show()

In [None]:
fig,ax = image.quick_plot( 2340., cmap='coolwarm' ) # plot with wavelength
fig.show()

----

Unlike images, hyperclouds can have per-point colours that are independent of the hyperspectral data (e.g. based on the SfM pointcloud used to capture hypercloud geometry). These can be accessed by plotting `'rgb'`.

----

**Exercise**: *Experiment with 'rgb', 'klm' (normal vectors) and 'xyz' colour renders. Also try rearranging the orders of the letters to generate different colour mappings*

In [None]:
fig,ax = cloud.quick_plot('rgb', cloud.header.get_camera(0), fill_holes=True)
fig.show()

In [None]:
fig,ax = cloud.quick_plot('klm',cloud.header.get_camera(0), fill_holes=True)
fig.show()

Arbitrary visualisations of individual or ternary band combinations can also be easily created using *quick_plot*. Some convenient ternary combinations are included in `hylite.SWIR` and (if you have LWIR data) `hylite.LWIR`.
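
For readers curious about what a ternary (false-colour) composite involves, the sketch below builds one from a synthetic cube using plain numpy. The `false_colour` helper, the percentile stretch, and the three wavelengths are illustrative assumptions; they are not taken from hylite and need not match how *quick_plot* scales its output.

```python
import numpy as np

def false_colour(cube, wavelengths, bands, clip=(2, 98)):
    """Hypothetical helper: map three wavelengths to an RGB array in [0, 1]."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    # pick the band closest to each requested wavelength
    idx = [int(np.argmin(np.abs(wavelengths - b))) for b in bands]
    rgb = cube[..., idx].astype(float)
    # stretch each channel between its lower and upper percentile
    lo, hi = np.nanpercentile(rgb.reshape(-1, 3), clip, axis=0)
    return np.clip((rgb - lo) / (hi - lo), 0.0, 1.0)

rng = np.random.default_rng(1)
wav = np.linspace(2000.0, 2400.0, 41)   # synthetic wavelengths (nm)
cube = rng.random((20, 10, wav.size))   # synthetic (x, y, band) cube

rgb = false_colour(cube, wav, (2200.0, 2250.0, 2350.0))
print(rgb.shape)
```

The resulting array can be passed to any image display routine that accepts floating point RGB values between 0 and 1.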

In [None]:
fig,ax = cloud.quick_plot(hylite.SWIR, cloud.header.get_camera(0), fill_holes=True )
fig.show()


----

**Exercise:** *Use the data visualisation tools to see if you can guess where in the hypercloud scene the rock samples pictured above (in the `image` variable) come from.*



In [None]:
# your funky code here!

----

### 6. Spectral libraries and spectral caterpillars  🐛

It is often convenient to quickly summarise all of the spectra in a dataset. This can be done using the `foo.plot_spectra` function, which generates a *spectral caterpillar* defined by the median (black) and 5th, 25th, 75th and 95th percentiles of each band (grey envelopes). Specific point or pixel spectra can also be added using the `indices` argument.
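
The statistics behind such a plot are straightforward to compute. As a rough sketch on synthetic data (independent of hylite's own `plot_spectra` code), the per-band percentiles of an (id, band) array can be obtained with numpy as follows:

```python
import numpy as np

rng = np.random.default_rng(42)
# synthetic (id, band) array: 1000 spectra with 6 bands each
spectra = rng.normal(loc=0.5, scale=0.1, size=(1000, 6))

# per-band percentiles across all spectra; each result has shape (n_bands,)
p5, p25, p50, p75, p95 = np.nanpercentile(spectra, [5, 25, 50, 75, 95], axis=0)

print(p50)  # the median spectrum (the black line of the caterpillar)
```

The 5th to 95th and 25th to 75th percentile pairs define the outer and inner grey envelopes, which could be drawn with, for example, matplotlib's `fill_between`.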

In [None]:
# plot a basic spectral caterpillar
fig,ax = cloud.plot_spectra(indices=[108113,82475,326198], colours=['r','g','b'])
fig.show()

In [None]:
# plot image and associated spectra
pixels = [(50,30), (150,30), (230,30)]

import matplotlib.pyplot as plt
fig,ax = plt.subplots(1,2,figsize=(18,5))
image.quick_plot(hylite.SWIR, ax=ax[0], ticks=True) # plot image to existing axes object, and plot x- and y- coords
ax[0].scatter([p[0] for p in pixels], [p[1] for p in pixels], color=['r','g','b'])

# add a spectral caterpillar
image.plot_spectra(band_range=(2100.,2400.), indices=pixels, colours=['r','g','b'], ax=ax[1])
fig.show()

Spectral libraries can also be loaded and plotted for reference. These will be explored in detail in a subsequent notebook.

In [None]:
lib = io.load( 'test_data/library.csv' )
fig,ax = lib.quick_plot(band_range=(2000.,2500.))
fig.show()

### 7. Saving results

Finally, once processing is complete, data can be saved using the `io.save` function. This automatically determines the data type and generates the appropriate files.

In [None]:
io.save?

In [None]:
out = image.copy() # copy the dataset so that the original is not modified in place
out.data = 1.0 - image.data # apply some voodoo magic

# save our processed dataset
io.save( './outputs/rocks.hdr', out )

### 8. Organising data using *HyCollection*

It is also often desirable to organise multiple related datasets (hyperspectral and otherwise) into one data structure. This can be achieved using the *HyCollection* class, which provides an easy mapping between a directory storing data and variables in Python that you can load, manipulate and save.

In [None]:
from hylite import HyCollection

# initialise a collection
C = HyCollection("MyCollection", "./outputs" )

# put some data in it
C.image = image
C.image_adj = out
C.cloud = cloud
C.random_array = np.random.rand(100) # N.B. this will be stored as an .npy file
C.magicvalue = 42 # N.B. this will be stored in the HyCollection's header file
C.astring = 'foo' # And so will this

# save everything!
C.save() # n.b. you can also save a collection to a different folder using io.save('somepath', C)

Once created, data in a HyCollection can be easily reloaded using *io.load(...)*. Note that to save memory, each attribute of the HyCollection will not actually be loaded into memory until it is accessed the first time.
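
Lazy loading of this kind can be sketched with Python's `__getattr__` hook, which is only called when normal attribute lookup fails. The `LazyFolder` class below is a hypothetical, simplified stand-in that maps attribute names to `.npy` files in a temporary directory; it is not how *HyCollection* is implemented.

```python
import os
import tempfile
import numpy as np

class LazyFolder:
    """Hypothetical collection: attributes map to .npy files in a directory."""
    def __init__(self, root):
        self.root = root
        self.loaded = {}  # cache of attributes already read from disk

    def __getattr__(self, name):
        # only called when normal attribute lookup fails
        if name not in self.loaded:
            path = os.path.join(self.root, name + '.npy')
            if not os.path.exists(path):
                raise AttributeError(name)
            self.loaded[name] = np.load(path)  # read from disk on first access
        return self.loaded[name]

# create a directory containing one saved array
root = tempfile.mkdtemp()
np.save(os.path.join(root, 'random_array.npy'), np.arange(5))

C = LazyFolder(root)
before = list(C.loaded)   # [] as nothing has been read yet
values = C.random_array   # first access triggers the load
after = list(C.loaded)    # ['random_array']
print(before, after)
```

Deferring the read in this way keeps memory use low when a collection holds many large datasets but only a few are needed in a given session.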

In [None]:
C2 = io.load('./outputs/MyCollection.hdr')

In [None]:
C2.print() # see what is in this collection, and note that no data has actually been loaded yet

In [None]:
# access an attribute (and load it into memory)
fig,ax = C2.image_adj.quick_plot( hylite.SWIR )
fig.show()

In [None]:
C2.print() # the image has now been loaded into RAM