# How to initialize a PeakTree object
This tutorial shows how to prepare the input data for `PeakTree(arg)`. The constructor takes a single argument, which must be an iterable of `(x, h)` points. (Think of `x` as "position" and `h` as "height".) These points must form an alternating sequence of local maxima and minima.

The sections below show how to produce such a sequence from a range of different data sources.
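
One way to read the alternation requirement is that consecutive heights must strictly go up, then down, then up again (or the reverse). The following is a minimal sketch of such a check. `is_alternating` is a hypothetical helper written for illustration, not part of `hierarchical_peaks`, and the library may apply a different or looser rule.

```python
def is_alternating(points):
    """Return True if the heights of consecutive (x, h) points zigzag strictly.

    Hypothetical helper for illustration; not part of hierarchical_peaks.
    """
    heights = [h for _, h in points]
    # Height change between each pair of neighbours
    steps = [b - a for a, b in zip(heights, heights[1:])]
    # A zero step means two neighbours share a height (a plateau)
    if any(step == 0 for step in steps):
        return False
    # Consecutive steps must point in opposite directions
    return all(s1 * s2 < 0 for s1, s2 in zip(steps, steps[1:]))


extrema = [(2006, 9.0), (2007, 5.0), (2008, 10.0), (2010, 3.0), (2012, 6.0)]
raw = [(2005, 8.5), (2006, 9.0), (2007, 5.0), (2008, 10.0), (2009, 7.5), (2010, 3.0)]

print(is_alternating(extrema))  # True
print(is_alternating(raw))      # False: 10.0 -> 7.5 -> 3.0 falls twice in a row
```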

## Discrete data sets
If you want to find the hierarchical peaks in a discrete sequence of data points (e.g. sampled points, discrete 1-D functions or time series), the easiest approach is to pass an iterable of your data to `filter_local_extrema()`, which produces a sequence of all local maxima and minima. (You are free to preprocess the data with another method that extracts the local maxima and minima.) The resulting sequence is then used to initialize a `PeakTree`.

This is demonstrated for these cases:
1. An x-vector and an h-vector
2. Non-numeric x-coordinates
3. A list of (x,h) tuples
4. A dict of x: h
5. An h-vector only
6. A function h(x)
7. A generator function yielding (x, h)
8. x-values that are tuples themselves

In [1]:
# import the module
import hierarchical_peaks as hip

### Example 1: An x-vector and an h-vector
If data point coordinates are stored in two lists (`x_vector` and `h_vector`), then a `zip()` of the two lists can be used as input argument.

In [2]:
# small example data set:
x_vector = list(range(2005, 2022))
h_vector = [8.5, 9.0, 5.0, 10.0, 7.5, 3.0, 6.0, 6.0, 2.0, 7.5, 7.5, 8.0, 6.0, 7.0, 4.0, 4.0, 5.0]


# initialize a PeakTree:
tree1 = hip.PeakTree(hip.filter_local_extrema(zip(x_vector, h_vector)))


You can double-check which data points `PeakTree` used to initialize this instance, because they are stored as a dict in the data attribute called `elevation`.

In [3]:
print(tree1.elevation)

{2006: 9.0, 2007: 5.0, 2008: 10.0, 2010: 3.0, 2012: 6.0, 2013: 2.0, 2016: 8.0, 2017: 6.0, 2018: 7.0, 2020: 4.0, 2021: 5.0}
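
The eleven stored points are the turning points of the input. To make that selection concrete, here is a minimal hand-rolled sketch of a turning-point filter. It is not the implementation of `filter_local_extrema()`: `turning_points` is a hypothetical name, and its choices (the last point of a plateau represents the plateau, and the final data point is always emitted) are assumptions picked so that it reproduces the keys listed above for this data set.

```python
def turning_points(points):
    """Yield the (x, h) points where the height changes direction.

    Illustrative sketch only; not the library's filter_local_extrema().
    """
    previous = None
    direction = 0  # +1 rising, -1 falling, 0 not yet known
    for x, h in points:
        if previous is not None:
            previous_h = previous[1]
            if h > previous_h:
                new_direction = 1
            elif h < previous_h:
                new_direction = -1
            else:
                new_direction = direction  # plateau: keep the current direction
            # A reversal means the previous point was a local extremum
            if direction != 0 and new_direction != direction:
                yield previous
            direction = new_direction
        previous = (x, h)
    # Assumption: the last data point always closes the sequence
    if previous is not None:
        yield previous


years = list(range(2005, 2022))
heights = [8.5, 9.0, 5.0, 10.0, 7.5, 3.0, 6.0, 6.0, 2.0,
           7.5, 7.5, 8.0, 6.0, 7.0, 4.0, 4.0, 5.0]

selected = list(turning_points(zip(years, heights)))
print([x for x, _ in selected])
# [2006, 2007, 2008, 2010, 2012, 2013, 2016, 2017, 2018, 2020, 2021]
```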


### Example 2: Non-numeric x-coordinates
The h-values must be numeric, but the x-values need not be. They can be objects of any type, as long as they can serve as keys in the `elevation` dict (that is, they must be unique and hashable).

In [4]:
x_nonnumeric = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

# zip() stops at the shorter input, so only the letters A to Q are paired
print(*zip(x_nonnumeric, h_vector))

('A', 8.5) ('B', 9.0) ('C', 5.0) ('D', 10.0) ('E', 7.5) ('F', 3.0) ('G', 6.0) ('H', 6.0) ('I', 2.0) ('J', 7.5) ('K', 7.5) ('L', 8.0) ('M', 6.0) ('N', 7.0) ('O', 4.0) ('P', 4.0) ('Q', 5.0)


In [5]:
# initialize a PeakTree:
tree2 = hip.PeakTree(hip.filter_local_extrema(zip(x_nonnumeric, h_vector)))

print(tree2.elevation)

{'B': 9.0, 'C': 5.0, 'D': 10.0, 'F': 3.0, 'H': 6.0, 'I': 2.0, 'L': 8.0, 'M': 6.0, 'N': 7.0, 'P': 4.0, 'Q': 5.0}
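
Before passing custom x-values, it may help to verify that they are unique and hashable. The sketch below does this with plain Python; `usable_as_keys` is a hypothetical helper, not a function of `hierarchical_peaks`.

```python
def usable_as_keys(x_values):
    """Return True if all x-values are hashable and no value occurs twice.

    Hypothetical helper for illustration; not part of hierarchical_peaks.
    """
    xs = list(x_values)
    try:
        # Building a set raises TypeError if any element is unhashable
        return len(set(xs)) == len(xs)
    except TypeError:
        return False


print(usable_as_keys("ABCDEFGHIJKLMNOPQ"))  # True: 17 distinct letters
print(usable_as_keys("AAB"))                # False: 'A' is repeated
print(usable_as_keys([[1], [2]]))           # False: lists are not hashable
```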


### Example 3: A list of (x,h) tuples
If data points are available as a sorted list of `(x, h)` tuples, then this list can be passed directly to `filter_local_extrema()`.

In [6]:
xh_list = list(zip(x_vector, h_vector))
print("xh_list = ", xh_list)

xh_list =  [(2005, 8.5), (2006, 9.0), (2007, 5.0), (2008, 10.0), (2009, 7.5), (2010, 3.0), (2011, 6.0), (2012, 6.0), (2013, 2.0), (2014, 7.5), (2015, 7.5), (2016, 8.0), (2017, 6.0), (2018, 7.0), (2019, 4.0), (2020, 4.0), (2021, 5.0)]


In [7]:
# initialize a PeakTree:
tree3 = hip.PeakTree(hip.filter_local_extrema(xh_list))

print(tree3.elevation)

{2006: 9.0, 2007: 5.0, 2008: 10.0, 2010: 3.0, 2012: 6.0, 2013: 2.0, 2016: 8.0, 2017: 6.0, 2018: 7.0, 2020: 4.0, 2021: 5.0}


### Example 4: A dict of x: h
If data points are stored in a dictionary, and they were inserted in sorted order, then pass the dict's `.items()` view as the input argument.

In [8]:
xh_dict = dict(zip(x_vector, h_vector))
print("xh_dict = ", xh_dict)

xh_dict =  {2005: 8.5, 2006: 9.0, 2007: 5.0, 2008: 10.0, 2009: 7.5, 2010: 3.0, 2011: 6.0, 2012: 6.0, 2013: 2.0, 2014: 7.5, 2015: 7.5, 2016: 8.0, 2017: 6.0, 2018: 7.0, 2019: 4.0, 2020: 4.0, 2021: 5.0}


In [9]:
# initialize a PeakTree:
tree4 = hip.PeakTree(hip.filter_local_extrema(xh_dict.items()))

print(tree4.elevation)

{2006: 9.0, 2007: 5.0, 2008: 10.0, 2010: 3.0, 2012: 6.0, 2013: 2.0, 2016: 8.0, 2017: 6.0, 2018: 7.0, 2020: 4.0, 2021: 5.0}
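
If the insertion order of a dict is not the positional order, the items can be sorted by key first. A small sketch with made-up data:

```python
# Keys inserted out of order (illustrative data only)
unordered = {2007: 5.0, 2005: 8.5, 2006: 9.0}

# sorted() on .items() orders the (x, h) tuples by x, since dict keys are unique
ordered_items = sorted(unordered.items())
print(ordered_items)  # [(2005, 8.5), (2006, 9.0), (2007, 5.0)]
```

The sorted list can then be passed to `filter_local_extrema()` in the same way as `xh_list` in Example 3.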


### Example 5: An h-vector only
If the data set is a list of h-values alone, then the index of the list is a natural x-coordinate, and an `enumerate()` of the list can be used as input argument.

In [10]:
print("h_vector = ", h_vector)

h_vector =  [8.5, 9.0, 5.0, 10.0, 7.5, 3.0, 6.0, 6.0, 2.0, 7.5, 7.5, 8.0, 6.0, 7.0, 4.0, 4.0, 5.0]


In [11]:
# initialize a PeakTree:
tree5 = hip.PeakTree(hip.filter_local_extrema(enumerate(h_vector)))

print(tree5.elevation)

{1: 9.0, 2: 5.0, 3: 10.0, 5: 3.0, 7: 6.0, 8: 2.0, 11: 8.0, 12: 6.0, 13: 7.0, 15: 4.0, 16: 5.0}
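
If the x-coordinates should start at a value other than zero, the built-in `enumerate()` accepts a `start` argument. A short sketch with the first three h-values:

```python
first_heights = [8.5, 9.0, 5.0]

# start=2005 makes the generated indices begin at 2005 instead of 0
pairs = list(enumerate(first_heights, start=2005))
print(pairs)  # [(2005, 8.5), (2006, 9.0), (2007, 5.0)]
```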


### Example 6: A function h(x)
If h-data are not stored in memory, but can be calculated with a function for a given range of x-values, then a generator expression like `((x, h(x)) for x in range(2005,2022))` can be used as input argument. 

In [12]:
def h(x):
    return h_vector[x - 2005]

# For example,
print("In 2005, h was", h(2005))
print("In 2021, h was", h(2021))

In 2005, h was 8.5
In 2021, h was 5.0


In [13]:
# initialize a PeakTree:
tree6 = hip.PeakTree(hip.filter_local_extrema(((x, h(x)) for x in range(2005,2022))))

print(tree6.elevation)

{2006: 9.0, 2007: 5.0, 2008: 10.0, 2010: 3.0, 2012: 6.0, 2013: 2.0, 2016: 8.0, 2017: 6.0, 2018: 7.0, 2020: 4.0, 2021: 5.0}


### Example 7: A generator function yielding (x, h)
Sometimes a generator function is more convenient than a generator expression. In that case, call the function and pass the resulting generator as the input argument.

In [14]:
def xh_generator():
    for x in range(2005,2022):
        yield x, h_vector[x - 2005]

In [15]:
# initialize a PeakTree:
tree7 = hip.PeakTree(hip.filter_local_extrema(xh_generator()))

print(tree7.elevation)

{2006: 9.0, 2007: 5.0, 2008: 10.0, 2010: 3.0, 2012: 6.0, 2013: 2.0, 2016: 8.0, 2017: 6.0, 2018: 7.0, 2020: 4.0, 2021: 5.0}
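
A generator function is handy when points are read lazily from a stream. The sketch below assumes a hypothetical CSV layout with one `x,h` pair per row and no header; `xh_from_csv` is an illustrative name, and an in-memory `io.StringIO` stands in for a real file.

```python
import csv
import io


def xh_from_csv(lines):
    """Yield (x, h) tuples from CSV rows of the assumed form 'x,h'."""
    for row in csv.reader(lines):
        # Assumption: integer positions in column 0, numeric heights in column 1
        yield int(row[0]), float(row[1])


# In-memory stand-in for an open file (illustrative data only)
demo = io.StringIO("2005,8.5\n2006,9.0\n2007,5.0\n")
rows = list(xh_from_csv(demo))
print(rows)  # [(2005, 8.5), (2006, 9.0), (2007, 5.0)]
```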


### Example 8: x-values that are tuples themselves
The x-values can be more complex objects. For example, they can be `(x, h)` coordinate pairs:

In [16]:
xh_pairs = zip(x_vector, h_vector)
print("xh_pairs = ", *xh_pairs)

xh_pairs =  (2005, 8.5) (2006, 9.0) (2007, 5.0) (2008, 10.0) (2009, 7.5) (2010, 3.0) (2011, 6.0) (2012, 6.0) (2013, 2.0) (2014, 7.5) (2015, 7.5) (2016, 8.0) (2017, 6.0) (2018, 7.0) (2019, 4.0) (2020, 4.0) (2021, 5.0)


In [17]:
# initialize a PeakTree:

# zip() returns a one-shot iterator that the print() above exhausted, so recreate it
xh_pairs = zip(x_vector, h_vector)

tree8 = hip.PeakTree(hip.filter_local_extrema(zip(xh_pairs, h_vector)))

print(tree8.elevation)

{(2006, 9.0): 9.0, (2007, 5.0): 5.0, (2008, 10.0): 10.0, (2010, 3.0): 3.0, (2012, 6.0): 6.0, (2013, 2.0): 2.0, (2016, 8.0): 8.0, (2017, 6.0): 6.0, (2018, 7.0): 7.0, (2020, 4.0): 4.0, (2021, 5.0): 5.0}
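
Note that `xh_pairs` is created twice in this example. A `zip` object is an iterator that can be consumed only once, as this short sketch shows:

```python
pairs = zip([2005, 2006], [8.5, 9.0])

first_pass = list(pairs)   # consumes the iterator
second_pass = list(pairs)  # nothing is left to iterate over

print(first_pass)   # [(2005, 8.5), (2006, 9.0)]
print(second_pass)  # []
```

Wrapping the `zip()` in `list()` once is an alternative if the same pairs are needed several times.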


So it is straightforward to adapt data from discrete data sources.

## Continuous functions
The function `filter_local_extrema()` cannot be applied directly to a continuous function. Some other preprocessing routine, possibly one that locates the zeros of the derivative (for example with SciPy tools), would be needed to find the local maxima and minima required by `PeakTree`.

However, if you sample the function with high enough density to capture all hills and valleys, you are back to the discrete case.
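
As a rough sketch of the sampling approach, the snippet below evaluates a continuous function on an evenly spaced grid and yields `(x, h)` pairs in the shape used throughout this tutorial. `sample` is a hypothetical helper, and the choice of `math.sin`, the interval, and the number of samples are illustrative assumptions; a suitable density depends on how quickly your function varies.

```python
import math


def sample(f, start, stop, n):
    """Yield n evenly spaced (x, f(x)) points on the closed interval [start, stop]."""
    step = (stop - start) / (n - 1)
    for i in range(n):
        x = start + i * step
        yield x, f(x)


# Two full periods of sin(x), sampled at 2001 points
points = list(sample(math.sin, 0.0, 4 * math.pi, 2001))

# Count interior samples where the slope changes sign
heights = [h for _, h in points]
turning = sum(
    1
    for a, b, c in zip(heights, heights[1:], heights[2:])
    if (b - a) * (c - b) < 0
)
print(len(points), turning)
```

The generator returned by `sample(...)` is an iterable of `(x, h)` pairs, like the generator inputs in Examples 6 and 7.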