# An Introduction to HyperSpy:
## The multi-dimensional data analysis toolbox

### <br/>
### Josh Taillon and Andy Herzing
#### *April 5, 2018*

## A quick note first:

## This isn't your parents' PowerPoint...

## ...because everything is interactive!

In [None]:
import datetime
import time
datestring = datetime.datetime.now().strftime('%B %d, %Y')
for c in 'Today is {}!'.format(datestring):
    print(c, end='')
    time.sleep(.2)

## Made possible with:

* Jupyter notebook &mdash; https://jupyter.org/

* RISE (Reveal.js IPython/Jupyter Slideshow Extension) &mdash; https://github.com/damianavila/RISE

# Introduction

## What is HyperSpy?

* Open-source Python library for interactive data analysis of multi-dimensional datasets

* Makes it easy to operate on multi-dimensional arrays as you would a single spectrum (or image)

* Easy access to cutting-edge signal processing tools 

* Modular structure makes it easy to add custom features
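
To make the second point concrete: HyperSpy keeps its data in NumPy arrays, so the underlying idea can be sketched with NumPy alone. One expression processes every spectrum in a spectrum image at once. The array shape and the mean-subtraction step are made up for illustration, and this is plain NumPy rather than HyperSpy's own API.

```python
import numpy as np

# Hypothetical spectrum image: 4 x 5 scan positions, 100 channels each
rng = np.random.default_rng(seed=0)
spectrum_image = rng.random((4, 5, 100))

# Process a single spectrum: subtract its mean intensity
single = spectrum_image[0, 0]
single_corrected = single - single.mean()

# The same operation applied to all 20 spectra in one expression
all_corrected = spectrum_image - spectrum_image.mean(axis=-1, keepdims=True)

print(all_corrected.shape)
```

HyperSpy adds calibrated axes and metadata on top of this kind of array arithmetic.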

## History of HyperSpy

* Developed by [Francisco de la Peña](https://scholar.google.com/citations?user=5n2c_fYAAAAJ&hl=en) between 2007 and 2012 as part of his Ph.D. thesis

* Originally called EELSLab:

<center><img src="img/eelslab.png" width=500px></center>

* Open-sourced (on [GitHub](https://github.com/hyperspy/hyperspy)) in 2010

* Renamed to HyperSpy in 2011

* Now... over 100 citations, and rapidly growing!

## Design philosophy of HyperSpy

* HyperSpy is a Python library, rather than a standalone program
    * Part of the greater scientific Python ecosystem

* Builds on (and requires) the scientific Python stack (e.g. `numpy` and `scipy`)

* Data storage is in an open hierarchical format (HDF5)

* Analysis done via reproducible notebooks

* Feature development is completely open-source

## How we came to love HyperSpy

### Josh:

* Became interested in multivariate statistical analysis of EELS spectrum images

* No easy way to do that in commercial software

* The entire scientific Python ecosystem is available from HyperSpy &mdash; <br/> machine learning, clustering, signal separation, etc.

* Came for the data analysis, stayed because of the community

### Andy:

* 

* 

* 

* 

# Getting Started

## Installation

* Easiest method on Windows &mdash; HyperSpy bundle
  * http://hyperspy.org/download.html#windows-bundle-installers
  * Installs a Python distribution with HyperSpy included
  * Best method if you have no prior Python experience

* For more control (on Windows, Mac, and Linux) &mdash; Anaconda Python
  * https://www.anaconda.com/download/
  * After installing Anaconda, run `conda install hyperspy -c conda-forge` (the package is published on the conda-forge channel)
  * This method is preferred by the developers
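
After either install route, a quick way to confirm that the package is visible to the interpreter is to look it up before importing it. The sketch below uses only the standard library; the helper name `is_installed` is made up for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if `package_name` can be found by the current interpreter."""
    return importlib.util.find_spec(package_name) is not None


if is_installed("hyperspy"):
    import hyperspy
    # Fall back to "unknown" in case the attribute is absent
    print("HyperSpy version:", getattr(hyperspy, "__version__", "unknown"))
else:
    print("HyperSpy not found; see the installation options above.")
```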

## How can you use HyperSpy?

* Console/Command line

* Integrated development environment (IDE)

* **Jupyter Notebook**

* HyperSpyUI

## Console/Command line

# Supplementary information and setup code

#### Disable warnings for presentation:

In [58]:
import logging
hs_logger = logging.getLogger('hyperspy') 
hs_logger.setLevel(logging.ERROR)
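
Setting the level on the `'hyperspy'` logger works because a Python logger drops any record below its own level. A minimal standard-library sketch of that behaviour, using a stand-in logger name (`'demo_package'`, hypothetical) so it runs without HyperSpy installed:

```python
import logging

# Stand-in for a library logger such as logging.getLogger('hyperspy')
demo_logger = logging.getLogger('demo_package')
demo_logger.setLevel(logging.ERROR)

# Records below ERROR are filtered out; ERROR and above pass through
warning_enabled = demo_logger.isEnabledFor(logging.WARNING)
error_enabled = demo_logger.isEnabledFor(logging.ERROR)
print(warning_enabled, error_enabled)  # prints: False True
```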

#### Create and download the test data:

In [59]:
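import hyperspy.api as hs
from skimage.data import astronaut

# astronaut() is a (512, 512, 3) RGB array; as a Signal1D, the two image
# dimensions become navigation axes and the RGB channels the signal axis
s = hs.signals.Signal1D(astronaut())

# Calibrate the image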
s.axes_manager[0].name = "width"
s.axes_manager[0].scale = 0.13
s.axes_manager[0].offset = -29.2
s.axes_manager[0].units = "cm"

s.axes_manager[1].name = "height"
s.axes_manager[1].scale = 0.13
s.axes_manager[1].offset = -12.9
s.axes_manager[1].units = "cm"

s.axes_manager[2].name = "RGB"
s.to_signal2D().save("astronaut.hdf5")

Overwrite 'astronaut.hdf5' (y/n)?
y
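
The `scale` and `offset` values set in the calibration cell define a linear axis: the calibrated position of pixel `i` is `offset + scale * i`. The plain-Python sketch below reproduces that arithmetic for the width axis. The helper `calibrated_value` is illustrative only and not part of HyperSpy, and the last index of 511 assumes the 512-pixel width of the scikit-image astronaut picture.

```python
def calibrated_value(index, scale, offset):
    """Map a pixel index to a calibrated coordinate on a linear axis."""
    return offset + scale * index


# Width axis from the calibration cell: scale = 0.13 cm/pixel, offset = -29.2 cm
left_edge = calibrated_value(0, scale=0.13, offset=-29.2)
right_edge = calibrated_value(511, scale=0.13, offset=-29.2)
print(f"width axis spans {left_edge:.2f} cm to {right_edge:.2f} cm")
```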


In [60]:
from urllib.request import urlretrieve
from zipfile import ZipFile

# This line doesn't work at NIST, but we've packaged the files locally
# files = urlretrieve("https://www.dropbox.com/s/dt6bc3dtg373ahw/machine_learning.zip?raw=1", "./machine_learning.zip")

with ZipFile("../machine_learning.zip") as z:
    z.extractall()
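
Extraction raises a `FileNotFoundError` if the archive is not where the cell expects it. A slightly more defensive sketch follows; the function `extract_if_present` is hypothetical, and the demo builds a throwaway archive in a temporary directory so that it runs anywhere.

```python
import tempfile
from pathlib import Path
from zipfile import ZipFile


def extract_if_present(archive_path, target_dir):
    """Extract `archive_path` into `target_dir` and return the member names.

    Returns an empty list instead of raising if the archive is missing.
    """
    archive_path = Path(archive_path)
    if not archive_path.is_file():
        return []
    with ZipFile(archive_path) as z:
        z.extractall(target_dir)
        return z.namelist()


# Demo with a throwaway archive (placeholder contents, not the real data)
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "machine_learning.zip"
    with ZipFile(archive, "w") as z:
        z.writestr("example.txt", "placeholder data")
    names = extract_if_present(archive, Path(tmp) / "out")
    missing = extract_if_present(Path(tmp) / "absent.zip", Path(tmp) / "out")
    extracted_ok = (Path(tmp) / "out" / "example.txt").is_file()

print(names, missing, extracted_ok)
```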