# Gradual Pattern Labelling

We propose an approach for generating gradual pattern (GP) labels from the features of a data set. Data sets used in GP mining do not come with labels that correspond to their features, so classification models cannot be applied to them directly. As a result, most existing GP mining techniques rely on descriptive approaches. To make it possible to apply machine learning classification algorithms to the task of predicting GPs, the features of a data set must first be labelled with GP classes.

In this study, we propose an approach that produces GP labels for data set features. To test its effectiveness, we also demonstrate how these labels may be used to extract estimated GPs with acceptable accuracy. We measure the accuracy of the estimated GPs in two ways:

* verity of each estimated pattern
* error margin between their estimated support values and the *true* values

# Demonstration

## 1. Original Data Set

We show the first 5 records of one of our test data sets (*"breast_cancer.csv"* or *"c2k.csv"*).

| Age  | BMI         | Glucose | Insulin | HOMA        | Leptin  | Adiponectin | Resistin | MCP.1   | Classification |
| ---- | ----------- | ------- | ------- | ----------- | ------- | ----------- | -------- | ------- | -------------- |
| 48.0 | 23.5        | 70.0    | 2.707   | 0.467408667 | 8.8071  | 9.7024      | 7.99585  | 417.114 | 1.0            |
| 83.0 | 20.69049454 | 92.0    | 3.115   | 0.706897333 | 8.8438  | 5.429285    | 4.06405  | 468.786 | 1.0            |
| 82.0 | 23.12467037 | 91.0    | 4.498   | 1.009651067 | 17.9393 | 22.43204    | 9.27715  | 554.697 | 1.0            |
| 68.0 | 21.36752137 | 77.0    | 3.226   | 0.612724933 | 9.8827  | 7.16956     | 12.766   | 928.22  | 1.0            |
| 86.0 | 21.11111111 | 92.0    | 3.549   | 0.8053864   | 6.6994  | 4.81924     | 10.57635 | 773.92  | 1.0            |


## 2. GP Labelling

We show the modified data set with the generated GP labels.

| Age  | BMI         | Glucose | Insulin | HOMA        | Leptin  | Adiponectin | Resistin | MCP.1   | Classification | GP Label |
| ---- | ----------- | ------- | ------- | ----------- | ------- | ----------- | -------- | ------- | -------------- | -------- |
| 48.0 | 23.5        | 70.0    | 2.707   | 0.467408667 | 8.8071  | 9.7024      | 7.99585  | 417.114 | 1.0 | 1+2+3+4+5+6+7-8+ |
| 83.0 | 20.69049454 | 92.0    | 3.115   | 0.706897333 | 8.8438  | 5.429285    | 4.06405  | 468.786 | 1.0 | 1-2+4+5+6+7+8+ |
| 82.0 | 23.12467037 | 91.0    | 4.498   | 1.009651067 | 17.9393 | 22.43204    | 9.27715  | 554.697 | 1.0 | 1-2+4+5+7-8+ |
| 68.0 | 21.36752137 | 77.0    | 3.226   | 0.612724933 | 9.8827  | 7.16956     | 12.766   | 928.22  | 1.0 | 1-2+3+4+5+6+7+9- |
| 86.0 | 21.11111111 | 92.0    | 3.549   | 0.8053864   | 6.6994  | 4.81924     | 10.57635 | 773.92  | 1.0 | 1-2+4+5+6+7+9- |
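
The GP labels above are compact strings, so a small parser can make them easier to inspect. The sketch below assumes that each label is a concatenation of items, where an item is a column index followed by `+` or `-` (the usual GP notation for an increasing or decreasing attribute). The function name `parse_gp_label` is hypothetical and is not part of this repository.

```python
import re

# Assumed item format: a column index followed by '+' or '-'.
_GP_ITEM = re.compile(r"(\d+)([+-])")


def parse_gp_label(label):
    """Split a GP label string into (column_index, direction) pairs."""
    return [(int(index), sign) for index, sign in _GP_ITEM.findall(label)]


# Label of the third record in the table above.
items = parse_gp_label("1-2+4+5+7-8+")
print(items)
```

Under that assumption, the third record's label splits into six items, starting with column 1 decreasing and column 2 increasing.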

## 3. Estimate GPs using Labels

We present preliminary results showing the accuracy of the estimated GPs extracted from the labels.

| Gradual Pattern    | Estimated Support | True Support | Percentage Error | Standard Deviation |
| ------------------ | ----------------- | ------------ | ---------------- | ------------------ |
| ['2+', '8-']       | 0.216             | 0.871        | -75.201%         | 0.463              |
| ['0+', '2+']       | 0.216             | 0.836        | -74.163%         | 0.438              |
| ['7-', '8-']       | 0.5               | 0.931        | -46.294%         | 0.305              |
| ['5-', '8-']       | 0.224             | 0.966        | -76.812%         | 0.525              |
| ['7-', '5-']       | 0.207             | 0.922        | -77.549%         | 0.506              |
| ['7-', '4-']       | 0.414             | 0.94         | -55.957%         | 0.372              |
| ['5-', '4-']       | 0.509             | 0.966        | -47.308%         | 0.323              |
| ['7-', '3-']       | 0.216             | 0.905        | -76.133%         | 0.487              |
| ['5-', '3-']       | 0.5               | 0.845        | -40.828%         | 0.244              |
| ['4-', '3-']       | 0.759             | 0.897        | -15.385%         | 0.098              |
| ['7-', '4-', '3-'] | 0.207             | 0.862        | -75.986%         | 0.463              |
| ['5-', '4-', '3-'] | 0.483             | 0.802        | -39.776%         | 0.226              |
| ['1-', '8-']       | 0.466             | 0.862        | -45.94%          | 0.28               |
| ['7-', '1-']       | 0.414             | 0.836        | -50.478%         | 0.298              |
| ['5-', '1-']       | 0.569             | 0.888        | -35.923%         | 0.226              |
| ['4-', '1-']       | 0.457             | 0.905        | -49.503%         | 0.317              |
| ['3-', '1-']       | 0.466             | 0.81         | -42.469%         | 0.243              |
| ['4-', '3-', '1-'] | 0.431             | 0.733        | -41.201%         | 0.214              |
| ['2+', '4+']       | 0.25              | 0.897        | -72.129%         | 0.457              |
| ['2+', '3+', '4+'] | 0.224             | 0.828        | -72.947%         | 0.427              |
| ['8-', '6+']       | 0.44              | 0.957        | -54.023%         | 0.366              |
| ['7-', '6+']       | 0.448             | 0.914        | -50.985%         | 0.33               |
| ['1-', '6+']       | 0.422             | 0.879        | -51.991%         | 0.323              |
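
The last two columns appear consistent with a signed relative error, `(estimated - true) / true`, and with the sample standard deviation of the pair (estimated, true). This is inferred from the tabulated numbers rather than stated by the source, so the sketch below is only one plausible way to reproduce them; the function names are hypothetical.

```python
import statistics


def percentage_error(estimated, true):
    """Signed error of the estimate relative to the true support, in percent."""
    return (estimated - true) / true * 100.0


def support_std(estimated, true):
    """Sample standard deviation of the two support values (assumed definition)."""
    return statistics.stdev([estimated, true])


# First row of the table: pattern ['2+', '8-'].
estimated, true = 0.216, 0.871
print(f"{percentage_error(estimated, true):.3f}%")  # -75.201%
print(f"{support_std(estimated, true):.3f}")        # 0.463
```

A negative percentage error means the estimated support is lower than the true support, which is the case for every pattern in the table.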