<h1>TS format</h1>

The most typical use case is loading data from a locally stored .ts file. The .ts file format was created to represent problems in a standard format for use with sktime. Each file has two main parts:
* header information
* data

The header information records metadata about the structure of the problem, which keeps the representation of the data itself simple. A summary of each header is given in the table below.

<style>
table, th, td {
  border: 1px solid black;
  border-collapse: collapse;
}
</style>

<table style="width:90%">
    <tr>
        <th>Header</th>
        <th>Description</th> 
        <th>How to determine</th>
        <th>Syntax</th>
        <th>Example</th>
        <th>Data set example</th>
    </tr>
    <tr>
        <td>@problemName</td>
        <td>This header gives the name of the problem as a string</td>
        <td>Normally the data set name</td>
        <td>&#60;StringWithNoSpaces&#62;</td>
        <td>@problemName StandWalkJump</td>
        <td>Example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=StandWalkJump">StandWalkJump</a></td>
    </tr>
    <tr>
        <td>@timestamps</td>
        <td>This header describes if the dataset has timestamps in it</td>
        <td>True when the dataset contains timestamps and false when it does not</td>
        <td>&#60;true/false&#62;</td>
        <td>@timestamps true</td>
        <td>N/A</td>
    </tr>
    <tr>
        <td>@missing</td>
        <td>This header is used to determine if there are missing values in the dataset</td>
        <td>True if there are missing data points in series and false if no missing data points in series</td>
        <td>&#60;true/false&#62;</td>
        <td>@missing true</td>
        <td>True example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=DodgerLoopDay">DodgerLoopDay</a> <br>False example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=AtrialFibrillation">AtrialFibrillation</a></td>
    </tr>
    <tr>
        <td>@univariate</td>
        <td>This header describes if the dataset is univariate or multivariate</td>
        <td>True when the dataset is univariate (measurements are over 1 dimension) and false when the dataset is multivariate (measurements are over 2 or more dimensions)</td>
        <td>&#60;true/false&#62;</td>
        <td>@univariate true</td>
        <td>True example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=ACSF1">ACSF1</a> <br>False example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=ArticularyWordRecognition">ArticularyWordRecognition</a></td>
    </tr>
    <tr>
        <td>@equalLength</td>
        <td>This header describes if the dataset series are all of the same length</td>
        <td>True when all the series are of equal length and false when the series are not the same length.</td>
        <td>&#60;true/false&#62;</td>
        <td>@equalLength true</td>
        <td>True example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=ArticularyWordRecognition">ArticularyWordRecognition</a> <br>False example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=AllGestureWiimoteX">AllGestureWiimoteX</a></td>
    </tr>
    <tr>
        <td>@classLabel</td>
        <td>This header describes if the dataset is labelled and then what these labels are</td>
        <td>True if the dataset has labels and false if the dataset has no labels. If true, the class labels come after with spaces between them</td>
        <td>&#60;true/false&#62; &#60;classLabels&#62;</td>
        <td>@classLabel true 1 2 3 4 5 6 7 8 9 10</td>
        <td>True example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=ArticularyWordRecognition">ArticularyWordRecognition</a></td>
    </tr>
    <tr>
        <td>@seriesLength</td>
        <td>This header describes how long each series is (if the series are of unequal length, it is the length of the longest series)</td>
        <td>It is the number of data points that make up one dimension of the series</td>
        <td>&#60;integer&#62;</td>
        <td>@seriesLength 36</td>
        <td>Example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=ArticularyWordRecognition">ArticularyWordRecognition</a></td>
    </tr>
    <tr>
        <td>@dimensions</td>
        <td>This header describes (for multivariate series) how many dimensions the dataset has</td>
        <td>It is the number of dimensions recorded for each instance</td>
        <td>&#60;integer&#62;</td>
        <td>@dimensions 9</td>
        <td>Example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=ArticularyWordRecognition">ArticularyWordRecognition</a></td>
    </tr>
    <tr>
        <td>@data</td>
        <td>This header is used to mark the start of the raw data in the dataset</td>
        <td>The raw data starts on a new line below this header. Each instance is separated by a new line</td>
        <td>&#60;long, long, long....,&#62;</td>
        <td>@data <br> 152.472131,151.149948,133.185425,105.026618 <br> 0.962,0.962,1.0,1.038</td>
        <td>Example: <a href="http://www.timeseriesclassification.com/description.php?Dataset=ArticularyWordRecognition">ArticularyWordRecognition</a></td>
    </tr>
</table>
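
As an illustration of how the headers above could be consumed, the following minimal sketch collects the `@` lines of a .ts file into a Python dictionary. The function name `parse_ts_header` and the type conversions it applies are assumptions made for this example; this is not sktime's own loader.

```python
# Headers whose value is a single true/false flag or a single integer,
# taken from the table above (compared case-insensitively).
BOOLEAN_HEADERS = {"timestamps", "missing", "univariate", "equallength"}
INTEGER_HEADERS = {"serieslength", "dimensions"}


def parse_ts_header(lines):
    """Collect the '@' header lines that precede '@data' into a dict."""
    header = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between headers
        if not line.startswith("@"):
            raise ValueError(f"expected a header line, got: {line!r}")
        name, *values = line[1:].split()
        key = name.lower()
        if key == "data":
            break  # everything after '@data' is raw series data
        if key in BOOLEAN_HEADERS:
            header[name] = values[0].lower() == "true"
        elif key in INTEGER_HEADERS:
            header[name] = int(values[0])
        elif key == "classlabel":
            # '@classLabel true a b c' -> ['a', 'b', 'c']; 'false' -> []
            has_labels = values[0].lower() == "true"
            header[name] = values[1:] if has_labels else []
        else:
            header[name] = " ".join(values)
    return header


example_lines = [
    "@problemName ExampleMultivariateLabelledDataset",
    "@univariate false",
    "@dimensions 4",
    "@classLabel true n m",
    "@data",
    "1,2,3,4,5:6,7,8,9,10:11,12,13,14,15:16,17,18,19,20:n",
]
example_header = parse_ts_header(example_lines)
```

Keeping the class labels as a list of strings (empty when the dataset is unlabelled) means one dictionary entry answers both questions the `@classLabel` header encodes: whether labels exist and what they are.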


# Data examples
The data tag represents the start of the time series data. The time series data should begin on the next line after this tag. Each series should be a comma-separated list, and the index of each value is given by its position in the list (0, 1, ..., m). A simple example of an unlabelled dataset is given below:

    @problemName ExampleBasicDataSet
    @univariate true
    @classLabel false
    @data
    1,2,3,4,5
    6,7,8,9,10

For classification problems, the dataset is normally labelled. The .ts format supports class labels. To add one, end the series with a ':' character (which marks a new dimension) followed by the class label. This looks like:

    :<classLabel>

NOTE: When the dataset has class labels, this should be specified using the @classLabel header (see the header table above). An example univariate labelled dataset is given below:

    @problemName ExampleUnivariateDataSet
    @univariate true
    @classLabel true 1 2
    @data
    1,2,3,4,5:1
    6,7,8,9,10:2
    11,12,13,14,15:1

Notice how the class label is given at the end of the series by splitting the series using the ':' character, which marks a new dimension.
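
To make the role of the trailing ':' concrete, here is a minimal sketch of reading one univariate data line. The helper name `parse_univariate_line` is hypothetical, and the sketch assumes that values are numeric and that a class label never contains a ':' itself.

```python
def parse_univariate_line(line, has_class_label):
    """Split one data line into (values, label).

    The label is returned as a string, or None for unlabelled data.
    """
    line = line.strip()
    label = None
    if has_class_label:
        # The class label is whatever follows the last ':' on the line.
        series_part, label = line.rsplit(":", 1)
    else:
        series_part = line
    values = [float(v) for v in series_part.split(",")]
    return values, label


labelled = parse_univariate_line("1,2,3,4,5:1", has_class_label=True)
unlabelled = parse_univariate_line("6,7,8,9,10", has_class_label=False)
```

Whether a trailing field is a label cannot be decided from the line alone, which is why the sketch takes `has_class_label` as an argument; in practice that flag would come from the `@classLabel` header.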

## Multivariate data

In multivariate problems, each series is recorded over multiple dimensions. To represent this, the dimensions of a given series are separated using the ':' character, which marks the end of one dimension (left of the ':') and the beginning of a new dimension (right of the ':'). For example, if we assume the data below is recorded over 4 different dimensions:

    dimension1 = 1,2,3,4,5
    dimension2 = 6,7,8,9,10
    dimension3 = 11,12,13,14,15
    dimension4 = 16,17,18,19,20

we can put this in .ts format by combining all the dimensions into one line and separating them with ':'. Example:

    1,2,3,4,5:6,7,8,9,10:11,12,13,14,15:16,17,18,19,20

We would do this for each series in the dataset. A full example is given below:

    @problemName ExampleMultivariateDataset
    @univariate false
    @dimensions 4
    @classLabel false
    @data
    1,2,3,4,5:6,7,8,9,10:11,12,13,14,15:16,17,18,19,20
    21,22,23,24,25:26,27,28,29,30:31,32,33,34,35:36,37,38,39,40

NOTE: For multivariate series the @dimensions header (see the header table above for details) is used to specify the number of dimensions a dataset has.
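
The combining step can also be sketched in code. The function below is a hypothetical helper (not part of sktime) that joins the dimensions of one series into a single .ts data line; it assumes each dimension is a plain Python sequence of values.

```python
def format_ts_line(dimensions, label=None):
    """Join the dimensions of one series into a single .ts data line.

    Values within a dimension are comma-separated; dimensions are
    separated by ':'. An optional class label is appended as a final
    ':'-separated field.
    """
    parts = [",".join(str(v) for v in dim) for dim in dimensions]
    if label is not None:
        parts.append(str(label))
    return ":".join(parts)


series = [
    [1, 2, 3, 4, 5],       # dimension1
    [6, 7, 8, 9, 10],      # dimension2
    [11, 12, 13, 14, 15],  # dimension3
    [16, 17, 18, 19, 20],  # dimension4
]
ts_line = format_ts_line(series)
```

Calling the function once per series and writing each result on its own line after `@data` would produce the body of a file like the one shown above.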

We can add a class label in the same way as for a univariate series. At the end of the series we add a final ':', marking a new dimension, and put the class label after it. An example is given below:

    @problemName ExampleMultivariateLabelledDataset
    @univariate false
    @dimensions 4
    @classLabel true n m
    @data
    1,2,3,4,5:6,7,8,9,10:11,12,13,14,15:16,17,18,19,20:n
    21,22,23,24,25:26,27,28,29,30:31,32,33,34,35:36,37,38,39,40:m
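
Reading such a line back is the reverse operation. The sketch below uses a hypothetical helper, `parse_multivariate_line`, and assumes numeric values without timestamps; it checks the number of dimensions found against the value declared in `@dimensions`.

```python
def parse_multivariate_line(line, n_dimensions, has_class_label=False):
    """Split one multivariate data line into (dimensions, label)."""
    parts = line.strip().split(":")
    # With labels present, the final ':'-separated field is the class label.
    label = parts.pop() if has_class_label else None
    if len(parts) != n_dimensions:
        raise ValueError(
            f"expected {n_dimensions} dimensions, found {len(parts)}"
        )
    dimensions = [[float(v) for v in part.split(",")] for part in parts]
    return dimensions, label


dims, label = parse_multivariate_line(
    "1,2,3,4,5:6,7,8,9,10:11,12,13,14,15:16,17,18,19,20:n",
    n_dimensions=4,
    has_class_label=True,
)
```

The dimension count check is a design choice for this sketch: without it, a labelled line parsed as unlabelled would fail later with a less obvious error when the label is converted to a number.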

## Missing values data

Sometimes our dataset has missing values. The .ts format supports this by marking each missing value with a '?' in the position where the value should be. An example is given below for a univariate dataset; multivariate datasets mark missing values in the same way.

    @problemName ExampleMissingUnivariateDataSet
    @univariate true
    @missing true
    @classLabel true 1 2
    @data
    1,2,?,?,5:1
    6,7,8,9,?:2
    11,?,13,14,15:1
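
One way a reader might handle the '?' marker is to map it to a floating-point NaN, so that the series keeps its length and positions. This is a minimal sketch under that assumption; the helper names are hypothetical and NaN is only one possible choice of placeholder.

```python
import math


def parse_series(series):
    """Parse a comma-separated series, mapping '?' to NaN."""
    return [
        math.nan if token.strip() == "?" else float(token)
        for token in series.split(",")
    ]


def count_missing(values):
    """Count how many entries of a parsed series are missing."""
    return sum(math.isnan(v) for v in values)


# First data line of the example above, with the class label split off.
series_part, class_label = "1,2,?,?,5:1".rsplit(":", 1)
values = parse_series(series_part)
n_missing = count_missing(values)
```

Because NaN never compares equal to itself, missing entries should be detected with `math.isnan` rather than with `==`.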

## Timestamp data

Sometimes our dataset may contain timestamps for each data point. The .ts format supports this by representing the data with tuples in the form of (timestamp, value). The timestamp can be in any format (e.g. a datetime string such as 12/24/2018, 04:59:31, a time such as 04:59:31, or simply an integer value), but the classifier or algorithm you intend to use the dataset with may impose restrictions. Please check the relevant documentation for information on this. A simple example is given below:

    @problemName ExampleTimestampUnivariateDataSet
    @univariate true
    @missing true
    @timestamps true
    @classLabel true 1 2
    @data
    (0,1),(2,2),(2,?),(7,?),(22,5):1
    (3,6),(3,7),(8,8),(8,9),(8,?):2

NOTE: The series must be in the correct time order (i.e. the lowest timestamp first and the highest timestamp last).
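
The tuple form can be read with a small amount of extra parsing. The sketch below is an illustration only: the helper names are hypothetical, timestamps are kept as raw strings because their format is not fixed, and the value is assumed to be whatever follows the last comma inside each pair of parentheses (so that a timestamp such as `12/24/2018, 04:59:31` stays intact). The ordering check assumes timestamps that can be converted with the supplied `key` function, integers by default.

```python
import math
import re

# Matches the contents of each '(timestamp, value)' tuple.
TUPLE_PATTERN = re.compile(r"\(([^()]*)\)")


def parse_timestamped_series(series):
    """Parse '(t,v),(t,v),...' into a list of (timestamp, value) pairs."""
    points = []
    for body in TUPLE_PATTERN.findall(series):
        # Split on the LAST comma: the timestamp itself may contain commas.
        timestamp, value = body.rsplit(",", 1)
        value = value.strip()
        parsed = math.nan if value == "?" else float(value)
        points.append((timestamp.strip(), parsed))
    return points


def is_time_ordered(points, key=int):
    """Check that timestamps never decrease from one point to the next."""
    stamps = [key(timestamp) for timestamp, _ in points]
    return all(a <= b for a, b in zip(stamps, stamps[1:]))


# First data line of the example above, with the class label split off.
series_part, class_label = "(0,1),(2,2),(2,?),(7,?),(22,5):1".rsplit(":", 1)
points = parse_timestamped_series(series_part)
ordered = is_time_ordered(points)
```

Splitting the label off with `rsplit(":", 1)` only works here because the data is univariate; timestamps written as clock times contain ':' themselves, so a multivariate reader would need a more careful way to separate dimensions.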