# 2. Panel Tasks - Classification, Regression, Clustering & Pipelines

#### What are Panel tasks?

Panel tasks are learning problems defined on a panel of data, referred to simply as Panel Data.

Panel Data consists of multiple time series (entities/instances), where a single time series looks like:

<INSERT TIME-SERIES, Image-1>

Hence, Panel Data can be visualized as follows:

<INSERT PANEL-DATA, Image-2>

Depending on the kind of response variable and the goal of the task, we can define different tasks - all of them are analogous to their counterparts on time-independent (often called cross-sectional) data:
1. _Classification_: The response variable is a label (good / bad, or a rating from 0 to 5: 0, 1, 2, 3, 4, 5)
    <INSERT CLASSIFICATION-BOXES, Image-3>
2. _Regression_: The response variable is numeric (floating point or integer values)
    <INSERT REGRESSION-PLOT FROM `utils.load_experiments(variables="pressure")`, Image-4>
3. _Clustering_: There is no response variable here; the goal of this task is to group entities that are "similar" to each other.
    <INSERT CLUSTERING-PLOT, Image-5>
4. _Forecasting_: Given historical data, predict the (near future) values by capturing temporal dependencies and patterns within each panel.
    <TODO - Think of what image to add here, Image-6>
5. _TODO: Add more obscure tasks (causal inference, survival analysis) or something more relevant to the talk - distances and kernel based_
    <TODO - Think of what image to add here, Image-7>

## Table of Contents
    -- TODO --

## 2.1 Data Formats and Their Types
   Before experimenting with the different estimators available in `sktime` for a variety of time series tasks, it is important to understand how `sktime` handles the diverse range of data representations used in time series analysis.

   The following section covers data types and loaders in moderate depth, enough to help you get started. See the [In-memory data representations and data loading](https://www.sktime.net/en/latest/examples/AA_datatypes_and_datasets.html#In-memory-data-representations-and-data-loading) tutorial for a comprehensive treatment.

### 2.1.1 `scitype` vs `mtype` in `sktime`
    
- **_scitype_** is short for scientific type (the data-scientific type of an object). It is defined by the relational and statistical properties of the data being represented and the common operations that can be applied to it. Think of a scitype as an abstract object, free from machine-specific implementations. For instance, there is one scitype for time series data, another for Panel data, and so on.

- **_mtype_** is short for machine implementation type which, for a given scitype, specifies the exact Python type and the structure of the representation. For instance, a time series (the `"Series"` scitype) can be represented concretely as a `pandas.DataFrame` in `sktime`.

scitypes and mtypes are encoded by strings in `sktime`, and each estimator has `SUPPORTED_MTYPES` that is used as a tag to search and filter estimators - more on that later.

<INSERT SCITYPE-MTYPE EXEMPLAR, Image-8>

### 2.1.2 Different `scitype`s in `sktime` and their utility

1. `"Series"`: Representing Time Series Data
    - Single-level index representation in a `pandas.DataFrame`, 2D representation in numpy.
    - Can represent univariate or multivariate data **&ast;**
    - Can represent equally or unequally spaced data **&ast;**
    - Example: Classifying the state of a quantum system during its time-dependent evolution.

2. `"Panel"`: Representing Panel Data
    - Two-level index representation in a `pandas.DataFrame`, 3D representation in numpy.
    - Can represent univariate or multivariate data **&ast;**
    - Can represent equally or unequally spaced data **&ast;**
    - Example: Predicting the power output of solar energy systems.

3. `"Hierarchical"`: Representing Data that has pre-defined Hierarchy
    - Multi-level index representation in a `pandas.DataFrame`
    - Non-time outer index levels represent higher levels of the hierarchy
    - Example: C 
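The three layouts listed above can be sketched with plain `pandas` and `numpy`. This is only an illustration of the index structure: the variable names, level names (`"instance"`, `"region"`, `"site"`), and values are hypothetical, and the axis order `(instances, variables, time points)` for the 3D array is an assumption based on the `"numpy3D"` convention.

```python
import numpy as np
import pandas as pd

# "Series": one time series; rows are time points, columns are variables.
series = pd.DataFrame(
    {"temperature": [20.1, 20.4, 21.0], "pressure": [1.01, 1.02, 1.00]},
    index=pd.RangeIndex(3, name="time"),
)

# "Panel" as a 3D array: (instances, variables, time points).
panel_np = np.arange(2 * 2 * 3, dtype=float).reshape(2, 2, 3)

# "Panel" as a two-level DataFrame: index levels are (instance, time).
panel_index = pd.MultiIndex.from_product(
    [range(2), range(3)], names=["instance", "time"]
)
panel_df = pd.DataFrame(
    panel_np.transpose(0, 2, 1).reshape(-1, 2),  # rows ordered by instance, then time
    index=panel_index,
    columns=["var_0", "var_1"],
)

# "Hierarchical": additional non-time outer levels above the time level.
hier_index = pd.MultiIndex.from_product(
    [["north", "south"], ["site_a", "site_b"], range(3)],
    names=["region", "site", "time"],
)
hier_df = pd.DataFrame({"output": np.arange(12, dtype=float)}, index=hier_index)

print(series, panel_df, hier_df, sep="\n\n")
```

In every layout the innermost index level is time; each extra outer level adds one layer of grouping.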

In [None]:
# TODO: Show basic examples of each scitype followed by a practical example

### 2.1.3 Brief overview of the `"Series"` `scitype`
    -- TODO --

### 2.1.4 Introduction to `"Panel"` `scitype`
    -- TODO --

In [None]:
# TODO: Show practical examples of all major `mtype` and basic examples of minor ones

### 2.1.5 Data Loaders for Panel Data
    -- TODO --

In [None]:
# TODO-1: Load toy-dataset from sktime
# TODO-2: Load dataset from an external repo 

## 2.2 Classification Tasks

    -- TODO: A simple, motivating problem --

In [None]:
# TODO: Load the dataset and split

### 2.2.1 -- TODO: Decide Topic Name --: A small list of famous & simple estimators to 'solve' this problem

In [None]:
# TODO-1: Estimator-1
# TODO-2: Estimator-2
# TODO-3: Estimator-3

### 2.2.2 Evaluation Metrics for Time Series Classification

In [None]:
# TODO: `scikit-learn` compatibility showcase for evaluation, provide in-house evaluation functions too

## 2.3 Regression Tasks
    -- TODO: A simple, motivating problem, relevant to the first one --

In [None]:
# TODO: Load the dataset and split

### 2.3.1 -- TODO: Decide Topic Name --: A small list of famous & simple estimators to 'solve' this problem

In [None]:
# TODO-1: Estimator-1
# TODO-2: Estimator-2
# TODO-3: Estimator-3

### 2.3.2 Evaluation Metrics for Time Series Regression

In [None]:
# TODO: `scikit-learn` compatibility showcase for evaluation, provide in-house evaluation functions too

## 2.4 Clustering Tasks
    -- TODO: A simple, motivating problem, relevant to both problems --

In [None]:
# TODO: Load the dataset and split

### 2.4.1 -- TODO: Decide Topic Name --: A small list of famous & simple estimators to 'solve' this problem

In [None]:
# TODO-1: Estimator-1
# TODO-2: Estimator-2
# TODO-3: Estimator-3

### 2.4.2 Evaluation Metrics for Time Series Clustering

In [None]:
# TODO: `scikit-learn` compatibility showcase for evaluation, provide in-house evaluation functions too

## 2.5 Introduction to Pipelines in `sktime`
    -- TODO: Decide subtopic distribution --

## 2.6 Advanced Topics
    -- TODO : Choose 2 or 3 of the following
        - DL estimators
        - Extending both classical and DL estimators for different tasks
        - Combining pipelines with DL estimators
        - GridSearch Pipelines
        - Reduction Pipelines