In [None]:
import support as sp

import visualizations as vis
import plotly.graph_objects as go
from ipywidgets import interact, interact_manual, interactive

from starterkits.visualization import vis_plotly_widgets as vpw

%load_ext autoreload
%autoreload 2


# Starter Kit 3.4: Time-series preprocessing 




## Description

Time-series are characterised as a collection of data points obtained at successive times, most often with equal intervals between them. Since an increasing number of assets (e.g. machines, wearables) are instrumented with sensors, this type of data is omnipresent. Depending on the application domain, time-series data is exploited for various purposes, e.g. profiling energy consumption of households, predicting imminent failures of a production line, estimating the remaining useful lifetime of a machine, etc.  

Typically, time-series data cannot be used easily as-is by machine learning algorithms. This can be, for example, due to data quality issues such as missing values and sensor misreadings, or because some algorithms are not fit to deal with the continuous nature of this type of data or the associated (often high) frequency with which it is gathered.

## Business Goal

The goal of this Starter Kit is to present a number of typical **time-series preprocessing methods**. You will learn how these can be used to improve the quality of the dataset and to prepare it for further exploration, analysis and modelling purposes. This is a crucial step, as it will improve the completeness, reliability and accuracy of your results.


## Application context

Time-series preprocessing is useful to

- reduce the granularity of the data for easier interpretation, especially in case of high-frequency data
- impute missing values and remove noise and outliers, typically improving the quality of subsequent analysis steps
- divide the series into a collection of meaningful segments, e.g. segments corresponding to one full cycle in a process, etc.

## Starter kit outline

In this Starter Kit, we will use a publicly available real-world dataset that consists of wind turbine data, which is available via [opendata-renewables.engie.com](https://opendata-renewables.engie.com/explore/?sort=modified). It exhibits typical characteristics of industrial (time-series) datasets, i.e. it contains noise, outliers and missing values. Moreover, it contains seasonal patterns across multiple years, as the machine behaviour is influenced by the meteorological conditions. 

We will use this dataset to illustrate:
- how resampling and smoothing can be applied to gain a better understanding of the behaviour of the time-series,
- how the quality of the data can be improved through normalization and outlier detection, and
- how missing data can be imputed in various ways.

## Basic data understanding

The time-series dataset that we study in this Starter Kit is generated by the SCADA (supervisory control and data acquisition) system of a set of wind turbines. In modern turbines, such a data acquisition system can easily contain more than 100 sensors that keep track of temperature, pressure data, electrical quantities (e.g., currents, voltages), vibrations, etc. It is commonly used for performance monitoring and condition-based maintenance. 

The dataset originates from 4 wind turbines (type Repower MM82) located in the Northeast of France. It spans a period of 4 years with a sampling rate of 10 minutes. Although the original SCADA system records statistics for more than 30 sensors, we will focus our attention only on wind speed, wind direction and temperature to illustrate a variety of time-series preprocessing techniques. 

For more information on the characteristics of the wind farm, you can consult the official documentation via [thewindpower.net](https://www.thewindpower.net/windfarm_en_3354_la-haute-borne-vaudeville-le-haut.php).

The table below shows an excerpt of the data, with the following attributes:
- `Date_time`: the timestamp of the measurement, in 10 minute increments
- `Turbine`: the name of the turbine
- `Power`: the active power (effective produced power) measurement in kW
- `Temperature`: the outside temperature measurement in degrees Celsius
- `Wind speed`: the wind speed measurement in meters/second
- `Wind direction`: the wind direction measurement in degrees

The top rows show the average values for each turbine for the defined range. Note that in the case of wind direction, this is the _circular_ average, which takes into account the circular nature of the data (i.e. values are bounded between 0 and 360 degrees, where 0 degrees is identical to 360 degrees).
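To make the difference with the arithmetic mean concrete, here is a minimal sketch of a circular average. The helper `circular_mean_deg` is a hypothetical name introduced for illustration only; it is not necessarily how the `support` module computes the value shown in the table.

```python
import numpy as np

def circular_mean_deg(angles_deg):
    """Circular mean of angles given in degrees, returned in [0, 360)."""
    angles = np.deg2rad(np.asarray(angles_deg, dtype=float))
    # Average the unit vectors, then convert the mean vector back to an angle
    mean_angle = np.arctan2(np.sin(angles).mean(), np.cos(angles).mean())
    return np.rad2deg(mean_angle) % 360.0

directions = [350.0, 10.0]  # two readings on either side of north
arithmetic = np.mean(directions)          # 180.0, pointing the wrong way
circular = circular_mean_deg(directions)  # close to 0 (equivalently 360)
```

Averaging the sine and cosine components instead of the raw angles is what keeps readings on either side of the 0/360 boundary from cancelling out into a misleading value.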

You can define a range of values for a given attribute and see how the values in the remaining attributes change. For example, try to increase wind speed to more than 17m/sec and see how that affects power production. Can you already spot any unexpected values in the dataset? Indeed, turbine R80790 on 2016-02-08 at 13:40 was producing only 600kW, significantly below the values that are seen at other times.

Note: when changing the attribute and value range in the interactive panel below, it might take some time to rerender the table.

In [None]:
ds, dst, df_missing, missing_events, missing_events_sum, col_labels = sp.load_data()

In [None]:
subset_table, col, slider = vis.table_data_overview(ds)
interact(subset_table, col_name=col, window=slider);

The following table provides a statistical summary of the dataset and shows that:
- for each turbine we have 4 years of data available
- the number of values for the `temperature` and `wind_speed` attributes differs across turbines; in other words, some turbines have missing data.

In [None]:
ds.reset_index().\
    groupby('turbine').\
    agg({'datetime': ['min', 'max'], 'temperature': 'count', 'wind_speed': 'count', 'wind_direction': 'count', 'power': 'count'})

Unless mentioned otherwise, in the rest of the notebook we will focus on one of the turbines (R80721).

We can plot the values of a given variable over time to see what their long-term trend looks like. In the interactive graph below you can adjust the timeframe, such that you can also look at shorter-term fluctuations. You can select a subset of the turbines by clicking on their names on the right.

Hint: 
- Look at the seasonal trend of temperature and then look at power production: is there a seasonal component in power production as well? 
- Can you already spot any abnormal patterns in the attribute values of any of the turbines?

We will come back to these questions later in this Starter Kit and introduce techniques to detect such patterns automatically. 

In [None]:
plot_timeseries, var_selector, date_range_slider = vpw.timeseries_with_variable_selector(
    ds.groupby('turbine').resample('D').mean().reset_index().set_index('datetime').sort_index(),
    groupby='turbine', 
    select_multiple=False,
    kwargs_layout={'yaxis_title': col_labels})
    
interact(plot_timeseries,cols=var_selector, period=date_range_slider);

## Resampling

Visualizing time-series data with a high temporal granularity can make it difficult to interpret the data and detect underlying patterns. Resampling techniques allow us to reduce the temporal granularity, thereby revealing longer-term trends and hiding sharp, fast fluctuations in the signal.

In the interactive panel below you can adjust the amount of resampling by specifying the time unit and the number of time units to resample to (e.g. to resample the data to weekly values, set the time unit to "Week" and the # of time units to 1). Play with the two inputs and see how the level of detail of the time-series changes and how that influences the insights that you can derive from it.

First, try selecting a small resampling factor (e.g. daily resampling). You can observe that the visualization of the temperature shows a seasonal pattern in the data, even though it contains quite some noise. The visualization of the wind speed is barely interpretable due to this noise. 
The noise is actually due to the high(er) frequency patterns of the data, especially when visualizing over such a long period of time. Specifically, if we are interested in visualizing the long-term trend of the time-series, we need to reduce the sampling rate.

To this end, try a coarser resampling (e.g. weekly resampling). You will see how these high-frequency patterns disappear and it becomes possible to analyse some long-term fluctuations in the power and wind speed variables. You can notably observe the strong correlation between the patterns of wind speed and power. Note also that resampling the time-series too coarsely will discard most of the information it contains.

It is also important to understand that resampling aggregates the data for the specified period, so we need to specify how we want the data to be aggregated. For the wind speed and temperature data in our example, we can opt to aggregate the data using the median of the values within the selected period. This statistic is more robust against outliers (e.g. due to sensor misreadings) that could be present in our dataset. For other quantities, we might consider other statistics. For example, if we were considering energy production, it makes more sense to sum all the data samples rather than to average them. 

Try this by changing the aggregation function used. Note how the results look very similar. Indeed, since the resampling is done at regular intervals, the mean and the sum will offer similar results, albeit at a completely different scale (note the changes in the y-axis values!).

In [None]:
resample_plots, unit, slider, aggfun = vis.plot_resampling(dst, col_labels)
interact(resample_plots,u=unit, c=slider, a=aggfun);

As we can see, a simple resampling makes more details explicit and the degree of resampling allows drawing different insights:
- with a weekly sampling rate, the temperature plot still shows a clear seasonal pattern. However, we can also see that in each year the weekly temperature evolves in a different manner: the highest and lowest temperatures are located in different weeks.
- the wind speed also seems to follow a seasonal pattern, albeit a less explicit one. We can notice that the wind speed is generally higher in winter. Yet, its evolution is much more dynamic than the one of temperature.
- power closely follows the wind speed profile, consistent with the latter being its main driver.

Being aware of the evolution of the time-series (e.g. the seasonal pattern) is helpful to recognize possible anomalies (e.g. outliers). We will come back to this in one of the following sections.

Resampling can be applied in a straightforward manner via the [pandas](https://pandas.pydata.org) library (`pandas.DataFrame.resample`).
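As a minimal sketch of what such a call looks like, the example below resamples a synthetic 10-minute signal with different aggregation functions. The data is randomly generated for illustration, and the column names `wind_speed` and `power` are only assumed to mirror those of the dataset.

```python
import numpy as np
import pandas as pd

# Synthetic 10-minute signal covering two weeks (illustrative data only)
index = pd.date_range("2017-01-01", periods=14 * 24 * 6, freq="10min")
rng = np.random.default_rng(0)
df = pd.DataFrame(
    {
        "wind_speed": rng.gamma(2.0, 3.0, len(index)),
        "power": rng.uniform(0.0, 2000.0, len(index)),
    },
    index=index,
)

# Daily median: robust against occasional sensor misreadings
daily_median = df["wind_speed"].resample("D").median()

# Weekly energy: average power in kW over 10 minutes covers 1/6 hour, so
# dividing by 6 gives kWh per sample, which can then be summed per week
weekly_energy = (df["power"] / 6.0).resample("W").sum()

# Different aggregation functions per column in one call
daily = df.resample("D").agg({"wind_speed": "median", "power": "mean"})
```

The rule string passed to `resample` ("D", "W", ...) plays the role of the time unit in the interactive panel, and the method that follows it (`median`, `sum`, `mean`) is the aggregation function.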

## Smoothing

Besides the noise created by the high(er) frequency patterns of the data, another type of noise that can be observed in the plots above is due to small variations that might not be significant for what you want to detect. This type of noise can be real, e.g. sudden decreases in temperatures between summer nights, but can also be caused by inaccurate measurements of the sensor, for example.

If such details are not important to consider for your analysis, you can remove them by smoothing. Here we explore three different smoothing approaches:
- **Rolling window algorithm**: the algorithm defines a small fixed period (the window), runs over the data taking into account the consecutive time points covered by the window, and replaces each time point by an aggregate value computed within the window. Typically, this is the mean or the median. This has the effect of reducing short-term fluctuations while preserving long-term trends. 
- **Gaussian smoothing**: this method summarizes the values over a sliding window using a Gaussian function, called the *kernel*. The size of the sliding window is determined by the standard deviation of this kernel. With this method, the values closest to the center of the sliding window have a stronger impact on the smoothing, and the impact of the remaining values decreases with their distance to the window center.
- **Savgol filter**: the Savitzky-Golay filter is a popular filter in signal processing and is based on a convolution approach, where a low-degree polynomial is fitted to successive subsets of adjacent data points via the linear least squares method. Here we use a polynomial of degree 2.

A (simplified) version of the calculations and function calls involved for each of these methods is shown below:

```python
import numpy as np
import pandas as pd
from scipy.signal import savgol_filter
from scipy.ndimage import gaussian_filter1d

def get_time_period(x, time_unit):
    "Median sampling interval of x, expressed in the given time unit"
    return np.median(np.diff(x.index)) / np.timedelta64(1, time_unit)

def rolling(x, attribute, window, time_unit):
    "Rolling window smoothing (median over a time-based window)"
    return x[attribute].rolling(f'{window}{time_unit}').median()

def savgol(x, attribute, window, time_unit):
    "Savgol filter smoothing"
    # convert the window from time units to a number of rows
    window = int(round(window / get_time_period(x, time_unit)))
    # make the window length odd, if that is not yet the case
    if window % 2 == 0:
        window += 1
    return pd.Series(savgol_filter(x[attribute], window, 2),
                     index=x.index,
                     name=attribute)

def gaussian(x, attribute, window, time_unit):
    "Gaussian smoothing"
    # convert the window from time units to a number of rows
    window = window / get_time_period(x, time_unit)
    # width of filter according to scipy docs: width = 2*int(4*sigma + 0.5) + 1
    sigma = ((window - 1) / 2 - 0.5) / 4
    return pd.Series(gaussian_filter1d(x[attribute], sigma),
                     index=x.index,
                     name=attribute)
```

The panel below allows you to explore the three different methods on the different variables and adjust some method specific parameters, as well as the time frame. You can see how the smoothed trace follows the original trace.

Note: for speed purposes, only one month of data is shown.

In [None]:
ui, i = vis.plot_smoothing(dst)
display(ui, i)

## Seasonal patterns

We previously discussed the seasonal pattern that can be observed in certain variables such as temperature. One technique that allows us to analyse the seasonal pattern in a time-series signal and to _decompose_ the signal into different components is called __Seasonal Trend Decomposition__. This technique identifies cyclical patterns in the signal and decomposes it into:
- The **trend**, which summarizes the long-term trend of the time-series in the considered time frame
- The **seasonal component**: this is the part of the signal that can be attributed to repeating patterns in the signal
- The **residuals**: The "left-over" after subtracting the trend and the seasonal factors from the signal. This is the component of the time-series that cannot be attributed to either the long-term trend evolution of the signal or the seasonal patterns.
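To illustrate how the three components relate to each other, here is a minimal sketch of a classical additive decomposition based on moving averages. This is a simpler relative of Seasonal Trend Decomposition, not the algorithm used by `vis.plot_stl`; the function name `additive_decompose` and the synthetic hourly temperature signal are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def additive_decompose(series, period):
    """Split a regularly sampled series into trend, seasonal and residual parts."""
    # Trend: centred moving average spanning one full period (NaN at the edges)
    trend = series.rolling(window=period, center=True, min_periods=period).mean()
    detrended = series - trend
    # Seasonal: average detrended value at each position within the period
    phase = np.arange(len(series)) % period
    seasonal = detrended.groupby(phase).transform("mean")
    seasonal = seasonal - seasonal.mean()
    # Residual: whatever is explained by neither the trend nor the seasonal part
    residual = series - trend - seasonal
    return pd.DataFrame({"trend": trend, "seasonal": seasonal, "residual": residual})

# Synthetic hourly temperature: slow drift + 24-hour cycle + noise (illustrative)
index = pd.date_range("2017-06-01", periods=24 * 10, freq="h")
rng = np.random.default_rng(0)
hours = np.arange(len(index))
temperature = pd.Series(
    10 + 0.01 * hours + 5 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 0.5, len(index)),
    index=index,
)

parts = additive_decompose(temperature, period=24)
```

By construction, the three columns of `parts` add up to the original signal wherever the trend is defined, which is the property that later allows the residuals to be analysed separately from the seasonal behaviour.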

The interactive panel below allows you to analyse the seasonal trend decomposition over two periods: a seasonal (i.e. covering the four yearly seasons) and a daily period. 
If you select a _seasonal_ decomposition over a sufficiently large period, you can clearly observe the yearly temperature pattern, with higher temperatures in the summer months and lower ones in the winter months. If you test a daily decomposition over the same period of time, you will need to zoom in on a short time range (you can either use the mouse cursor to zoom in on the plot or the date selector) to see a similar pattern in the *seasonal* component, namely the 24-hour temperature cycle.

You can also test how the seasonal trend decomposition performs on a less cyclical signal, such as power production. Indeed, since power production is mainly driven by the amount of wind and wind speed, each showing a weak seasonal modulation, power production can be only poorly expressed in terms of its seasonal component.

In [None]:
stl_decompose, choice, selection_range_slider, col_selector = vis.plot_stl(dst)
interact(stl_decompose, col=col_selector, c=choice, d=selection_range_slider);

## Outlier detection

Different techniques for outlier detection in time-series exist (see for example this [survey](https://www.microsoft.com/en-us/research/wp-content/uploads/2014/01/gupta14_tkde.pdf)). Here, we will focus on _online_ outlier detection, i.e. the detection of an outlier as soon as it occurs, as opposed to _offline_ detection, which happens retrospectively. We present two different approaches for outlier detection using the temperature and wind speed variables.

### Approach 1: Interquartile Range-based outlier detection

A relatively simple and frequently used approach for outlier detection is based on the boxplot data distribution. For a given attribute, this method computes its interquartile range (IQR), which is the difference between the $75^{th}$ and $25^{th}$ percentiles. This value is then multiplied by a constant factor $\alpha$ which determines how stringent the outlier detection is. A typical value for $\alpha$ is 1.5, although this value can be adapted according to the level of stringency desired: larger values push the outlier boundaries further out (thereby reducing the number of detected outliers). The resulting value is subtracted from the $25^{th}$ and added to the $75^{th}$ percentile to obtain the lower and upper fences, respectively, which define the thresholds beyond which a given value is labeled as an outlier.
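Writing $Q_1$ and $Q_3$ for the $25^{th}$ and $75^{th}$ percentiles, the fences are $Q_1 - \alpha \cdot IQR$ and $Q_3 + \alpha \cdot IQR$. The following sketch computes them with NumPy; the helper name `iqr_fences` and the sample values are hypothetical and only serve as an illustration.

```python
import numpy as np

def iqr_fences(values, alpha=1.5):
    """Return the (lower, upper) boxplot fences for the given values."""
    q1, q3 = np.nanpercentile(values, [25, 75])
    iqr = q3 - q1
    return q1 - alpha * iqr, q3 + alpha * iqr

# Illustrative values with one suspicious reading at the end
values = np.array([4.0, 5.0, 5.5, 6.0, 6.5, 7.0, 8.0, 30.0])

lower, upper = iqr_fences(values, alpha=1.5)
is_outlier = (values < lower) | (values > upper)

# A larger alpha widens the fences, so fewer points are flagged
wide_lower, wide_upper = iqr_fences(values, alpha=5.0)
```

`np.nanpercentile` is used so that missing values, which are common in this kind of dataset, do not turn the fences into NaN.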

In [None]:
vis.plot_boxplot_outliers_demo(dst)

Considering the seasonal nature of the data, we should ensure that the outlier detection approach takes the impact of seasonality into account. It is known that the temperature has significant seasonal variation (e.g. it varies between day and night or between winter and summer) and the same temperature in winter and in summer can be considered an outlier in one case, but not in the other. Therefore, the seasonal trend decomposition described above is applied to the signal and the residuals (i.e. the remainder after removal of the trend and seasonal components of the signal) are used as input for the outlier detection. 

In the example below, we identify outlier _events_ (detected outliers that are consecutive in time) based on a given $\alpha$ value (i.e. the number of IQRs). An additional input parameter allows you to merge outlier events that are separated by less than a given amount of time. When using the most stringent $\alpha$ value (5), we detect a single outlier event in the dataset: turbine R80721 records a temperature of -273 degrees for a period of time.

You can change these values and see how this affects the detected outliers.
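The grouping of flagged points into events can be sketched as follows. This is an illustrative implementation, not the code behind `sp.summarize_outliers`: the function name `group_outlier_events`, its `max_gap` parameter and the output columns are assumptions, and here flags are merged when their gap does not exceed `max_gap`.

```python
import pandas as pd

def group_outlier_events(is_outlier, max_gap):
    """Group flagged timestamps into events, merging flags at most max_gap apart."""
    times = is_outlier.index[is_outlier.to_numpy()]
    if len(times) == 0:
        return pd.DataFrame(columns=["start", "end", "n_points"])
    flagged = pd.Series(times, index=times)
    # A new event starts whenever the gap to the previous flag exceeds max_gap
    event_id = (flagged.diff() > max_gap).cumsum()
    events = flagged.groupby(event_id).agg(start="min", end="max", n_points="count")
    return events.reset_index(drop=True)

# Illustrative boolean outlier mask on a 10-minute grid
index = pd.date_range("2017-01-01", periods=12, freq="10min")
is_outlier = pd.Series(False, index=index)
is_outlier.iloc[[2, 3, 5, 10]] = True

events = group_outlier_events(is_outlier, max_gap=pd.Timedelta("20min"))
```

With a 20-minute tolerance, the flags at positions 2, 3 and 5 end up in one event, while the isolated flag at position 10 forms its own event.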

In the panel below the outlier summary, you can visualize the time-series around the time of a given outlier event. The *Flank duration* parameter allows you to control the time window around the outlier for the visualization. The left figure shows the time-series with the outlier event highlighted in blue, while the figure on the right shows the distribution of all the temperature residual values using a boxplot. Again, the points in blue indicate the outlier event depicted on the left. When adapting the parameters in the top interactive panel, make sure to rerun the cell below to see the updated list of detected outliers.

In [None]:
# this function needs to be here, otherwise setting the global variable in the script will not make it visible 
# in the notebook environment
def print_outlier_summary(n_iqr, int_mins):
    global outlier_events, df_outliers, N_IQR
    N_IQR = n_iqr
    outlier_events, df_outliers = sp.summarize_outliers(df_stl, n_iqr, int_mins)
    if outlier_events is None:
        print('Warning, no outliers detected with these settings')
    else:
        print('%d outlier event%s detected' % (len(outlier_events), '' if len(outlier_events) == 1 else 's'))
        fig = go.Figure(data=[go.Table(
            header=dict(values=list(outlier_events.columns)),
            cells=dict(values=[outlier_events[c] for c in outlier_events.columns]))])
        fig.show()

df_stl, n_iqr, int_mins = vis.get_outlier_events(dst)
im = interact(print_outlier_summary, n_iqr=n_iqr, int_mins=int_mins)

In [None]:
visualize_outlier_event, e_id, flank = vis.plot_outlier_events(df_outliers, outlier_events)
interact(visualize_outlier_event,e=e_id, f=flank);

#### The influence of outliers on normalization

Normalization is a typical preprocessing step where the range of a variable is standardized (i.e., rescaled) in order to make different variables with different ranges comparable. It is an important preprocessing step before the data is presented to a machine learning algorithm, as it prevents variables with large numerical ranges from dominating those with small ranges.

Different normalization approaches exist. Examples are rescaling the values in the 0-1 range, known as **min-max normalization**, or removing the mean and scaling to unit variance, known as **z-score or standard score normalization**. Most of these approaches are sensitive to outliers, e.g. in min-max, the minimum value is mapped to 0 and the maximum to 1 so obviously extreme outliers will have a large impact. 
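The sensitivity to outliers is easy to reproduce on a handful of values. The sketch below applies both normalizations to a few illustrative temperature readings, once with and once without a -273 degree misreading like the one detected earlier; the helper names `min_max` and `z_score` are hypothetical.

```python
import numpy as np

def min_max(x):
    """Rescale values linearly to the 0-1 range."""
    return (x - np.nanmin(x)) / (np.nanmax(x) - np.nanmin(x))

def z_score(x):
    """Remove the mean and scale to unit variance."""
    return (x - np.nanmean(x)) / np.nanstd(x)

# Illustrative readings; the last value mimics a sensor misreading
temperature = np.array([2.0, 5.0, 8.0, 11.0, 14.0, -273.0])

with_outlier = min_max(temperature)
without_outlier = min_max(temperature[:-1])
standardized = z_score(temperature[:-1])
```

With the misreading included, all valid readings are squeezed into a narrow band near 1, whereas without it they are spread evenly over the full 0-1 range.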

The interactive panel below allows you to test these two normalization approaches on each of the attributes of the dataset. You can enable or disable the outlier removal in order to appreciate how this affects the normalization procedure. This effect is most striking when looking at the temperature attribute.

In [None]:
transform_plot, method_picker, controllers = vis.plot_normalization(dst, N_IQR)
interact(transform_plot, method=method_picker, column=controllers[0], remove_outliers=controllers[1]);

### Approach 2: Fleet-based outlier detection

For detecting outliers of the power attribute, we will use an alternative to the approach taken for temperature, even though in principle we could take the same approach. 

The approach we will use is based on exploiting the fleet aspect. At each point in time, we compute the median power recorded by the fleet and we consider any value that deviates _too much_ from that median value to be an outlier. In order to determine what constitutes _too much deviation_, we again consider the boxplot outlier definition. If a value is beyond 5 times the IQR from the $25^{th}$ or $75^{th}$ percentile, we consider the observation to be an outlier. 
To exclude periods when most of the turbines in the fleet were not operational, we only consider time points when at least 3 of the 4 turbines recorded a power production above 0.
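A minimal sketch of this idea is shown below. It is not the implementation behind `vis.plot_fleet_outliers`: the function name `fleet_outliers`, the turbine labels `T1` to `T4`, the synthetic data, and the choice to compute the fences on the pooled deviations from the fleet median are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd

def fleet_outliers(power, alpha=5.0, min_active=3):
    """Flag readings that deviate strongly from the fleet median at the same time.

    power: DataFrame with one row per timestamp and one column per turbine.
    """
    # Keep only timestamps where enough turbines are producing
    active = (power > 0).sum(axis=1) >= min_active
    deviation = power.sub(power.median(axis=1), axis=0)[active]
    # Boxplot fences on the pooled deviations (an assumption of this sketch)
    q1, q3 = np.nanpercentile(deviation.to_numpy().ravel(), [25, 75])
    iqr = q3 - q1
    flags = (deviation < q1 - alpha * iqr) | (deviation > q3 + alpha * iqr)
    return flags.reindex(power.index, fill_value=False)

# Synthetic fleet of 4 co-located turbines following the same wind conditions
index = pd.date_range("2017-01-01", periods=200, freq="10min")
rng = np.random.default_rng(0)
base = rng.uniform(500.0, 1500.0, len(index))
power = pd.DataFrame(
    {name: base + rng.normal(0, 30, len(index)) for name in ["T1", "T2", "T3", "T4"]},
    index=index,
)
power.iloc[50, power.columns.get_loc("T2")] = 0.0  # one turbine drops to zero

flags = fleet_outliers(power)
```

Because the comparison is made per timestamp, a fleet-wide change in wind conditions does not trigger any flag; only the turbine that departs from its neighbours does.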

The figure below allows you to explore all the detected outliers in the power attribute using this definition. You can change the Flank amount parameter in order to see a larger time window around the outlier.

It can be observed that the outliers often happen in periods of time when the power of a given turbine dropped to 0 without it being the case for the remaining turbines. There are, nonetheless, other instances when the produced power of a given turbine was (statistically speaking) above what would be expected given the behavior of the remaining turbines.

Note that the grey square highlights only the fleet outlier event in question. Other points in the visualization might also be labeled as outliers but they are not part of the same event.

In [None]:
plot_outlier, events, slider = vis.plot_fleet_outliers(ds)
interact(plot_outlier, evt=events, flank=slider);

The fleet-based approach has the advantage of being able to capture outliers at a specific moment in time only relying on sensor values captured at that time. On the other hand, this approach can only be applied if the dataset contains a fleet of co-located assets, i.e. exposed to similar conditions.

## Data Imputation

Real-world industrial datasets often suffer from missing data due to several reasons, e.g. sensor malfunctioning or communication errors. In addition, if we have removed outliers, these will also show up as missing values in the data.

In both cases, we can fill the missing values using imputation techniques. Multiple techniques exist and which one to choose depends on the data characteristics, e.g. presence of trends, the length of the missing period, etc. In the next sections, we will use 3 different imputation techniques:

- Linear interpolation
- Fleet-based interpolation
- Pattern-based imputation

To evaluate the accuracy of the different methods, we will create a set of synthetic missing data events with varying durations. The main advantage of this approach is that it allows us to compare the outcome of the imputation procedure with the _real_, _observed_ values. 

### Linear interpolation and fleet median

**Linear interpolation** is a simple technique that is frequently used in case a small number of points are missing within a particular time period. It simply connects the last point before and the first point after the missing data episode with a straight line.

**Fleet median**, on the other hand, exploits the fleet-based aspect of our asset for imputing periods of missing data. For the dataset under investigation, we know the assets are co-located and are, therefore, exposed to similar conditions (e.g. wind direction and speed). Hence, we can compute the median value of the wind speed for the turbines that do have data and use those values as an estimation of the missing values.

While linear interpolation is sensitive to the event duration (the longer the event, the less likely a linear interpolation will follow the _real_ values), fleet median interpolation may result in unexpected results if there are too few assets in the fleet (or too few assets with non-missing values at the time). The latter method is, furthermore, dependent on the different assets being co-located and exposed to similar conditions.

In the plot below you can see how these methods perform for missing data events of different durations. You can also change the number of turbines that are considered for the fleet median interpolation and see how that affects the accuracy of the prediction. In our dataset there are 4 turbines available, so you can choose between 1 and 3 turbines to use for the fleet median interpolation.

The red trace corresponds to actual observed values (remember that we are dealing with _synthetic_ missing data events) and, thus, the closer the blue (linear interpolation) and green (fleet median) lines are to the red line, the better the interpolation.

Try to experiment with different values, and note how the performance of the linear interpolation degrades with increasing event size and how the performance of the fleet median interpolation is dependent on a decent number of assets considered. Remember that the data is sampled at 10-minute intervals when considering the event duration.

In [None]:
imputate_missing, events, slider, turbine_selector = vis.plot_imputation(df_missing, missing_events)
interact(imputate_missing, evt=events, nt=turbine_selector, s=slider);

Linear interpolation can easily be performed in [pandas](https://pandas.pydata.org) (`pandas.Series.interpolate`).
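The sketch below compares both imputation strategies on a synthetic missing data event. The fleet, the turbine labels `T1` to `T4` and the signal itself are generated for illustration only; the actual implementation used in the plot above may differ.

```python
import numpy as np
import pandas as pd

# Synthetic wind speed for 4 co-located turbines on a 10-minute grid
index = pd.date_range("2017-01-01", periods=144, freq="10min")
rng = np.random.default_rng(1)
base = 8.0 + 3.0 * np.sin(np.linspace(0, 4 * np.pi, len(index)))
fleet = pd.DataFrame(
    {name: base + rng.normal(0, 0.3, len(index)) for name in ["T1", "T2", "T3", "T4"]},
    index=index,
)

observed = fleet["T1"].copy()
with_gap = observed.copy()
with_gap.iloc[40:70] = np.nan  # synthetic missing data event of 5 hours

# Linear interpolation: straight line between the points bordering the gap
linear = with_gap.interpolate(method="time")

# Fleet median: use the median of the other turbines at the same timestamps
fleet_median = with_gap.fillna(fleet[["T2", "T3", "T4"]].median(axis=1))

# Compare both reconstructions against the values that were hidden
gap = with_gap.isna()
mae_linear = (linear - observed)[gap].abs().mean()
mae_fleet = (fleet_median - observed)[gap].abs().mean()
```

On a gap of this length the straight line cannot follow the curvature of the signal, while the fleet median tracks it because the other turbines see the same conditions.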

### Pattern-based interpolation 

When linear interpolation or fleet-based data imputation techniques cannot be used, we can still use pattern-based imputation provided the time-series follow a predictable pattern. In this section we illustrate this method on two attributes, namely the time-series for temperature and wind speed.

Pattern-based interpolation performs well on signals that follow a _predictable_ pattern, as is the case for signals that show a strong seasonal modulation. You can appreciate this by comparing the interpolation of the temperature and wind speed signals. The former, as we have seen before, follows a daily pattern (night vs. day) and a seasonal pattern (winter vs. summer) and shows a high-quality interpolation. The latter, however, has a weaker seasonal modulation reflected in a less accurate interpolation.

We can use seasonal ARIMA, an extension of the ARIMA model, for forecasting seasonal time-series data. More specifically, in our case we can use seasonal ARIMA for forecasting the evolution of the temperature based on the data coming from a single asset. The interested reader can find more information about seasonal ARIMA in this [tutorial](https://otexts.org/fpp2/seasonal-arima.html).

Note how varying the duration of the missing data event affects the interpolation quality. You can also move the start of the (synthetic) missing data event to an earlier point in time and see if that affects the prediction. Try to shift it by 36 hours. Notice how the prediction fails to mimic the original values in the beginning. Indeed, this type of interpolation might be sensitive to the starting point for the forecasting. Here we used one year of data prior to the missing data event for the training of the ARIMA model. Using a longer period (including multiple seasonal patterns) would likely improve the forecasting and mitigate the aforementioned shortcomings.

In [None]:
_plot_pattern_imputation, features, slider, slider_evt_st= vis.plot_pattern_imputation(ds)
interact(_plot_pattern_imputation, v=features, d=slider, s=slider_evt_st);

The pattern imputation implementation in this notebook uses the [statsmodels module](https://www.statsmodels.org/stable/index.html) (notably the [SARIMAX](https://www.statsmodels.org/stable/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html) class).

## Conclusion

In this Starter Kit, we have demonstrated techniques that can be used to preprocess time-series data in order to improve its quality for further exploitation. We have used a real-world dataset containing wind turbine data that exhibits the typical characteristics of industrial datasets, i.e. it contains noise, outliers, missing values and seasonal patterns. In particular, we have illustrated:

* how resampling and smoothing can be applied to better understand the behaviour of the time-series;
* how the quality of the data can be improved through normalization and outlier detection;
* how missing data can be imputed via 3 different imputation techniques: linear interpolation, fleet-based imputation and pattern-based interpolation, and outlined the advantages and disadvantages of these techniques.

The use of these methods allows you to improve the quality of the data and to prepare it for further exploration, analysis and modelling purposes. This is a crucial step, as it will improve the completeness and reliability of the input data, and consequently the accuracy of your results.

## Additional information

Copyright © 2022 Sirris

This Starter Kit was developed in the context of the EluciDATA project (http://www.elucidata.be). For more information, please contact info@elucidata.be.

 
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Notebook"), to deal in the Notebook without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Notebook, and to permit persons to whom the Notebook is provided to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies of the Notebook and/or copies of substantial portions of the Notebook.

THE NOTEBOOK IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NON-INFRINGEMENT. IN NO EVENT SHALL SIRRIS, THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, DIRECT OR INDIRECT, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE NOTEBOOK OR THE USE OR OTHER DEALINGS IN THE NOTEBOOK.