In [62]:
import numpy as np
import pandas as pd

In [63]:
df = pd.DataFrame(data=[f"2022-10-{x}" for x in ("03 10:30", "03 12:40", "04 15:00", "06 18:00", "10 11:33", "10 18:34")], columns=["date"])
df["X1"] = np.random.normal(6, 1, size=6)
df["X2"] = np.random.normal(50, 10, size=6)

visits = pd.DataFrame(data=[["2022-10-06", "depression"]], columns=["date", "state"])

Our unlabelled sensor data

In [64]:
df

Unnamed: 0,date,X1,X2
0,2022-10-03 10:30,5.751966,44.846856
1,2022-10-03 12:40,6.989865,63.691903
2,2022-10-04 15:00,4.6409,37.858959
3,2022-10-06 18:00,6.30058,53.067628
4,2022-10-10 11:33,4.854741,58.22848
5,2022-10-10 18:34,6.768292,57.930207


Label information from visits

In [65]:
visits

Unnamed: 0,date,state
0,2022-10-06,depression
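As a minimal sketch of why the label has to be extrapolated at all, the snippet below joins `visits` onto the sensor rows by exact calendar day. The `day` helper column and the names `calls`, `labels` and `labelled` are introduced here for illustration only and are not part of the notebook.

```python
import pandas as pd

df = pd.DataFrame({"date": ["2022-10-03 10:30", "2022-10-03 12:40", "2022-10-04 15:00",
                            "2022-10-06 18:00", "2022-10-10 11:33", "2022-10-10 18:34"]})
visits = pd.DataFrame({"date": ["2022-10-06"], "state": ["depression"]})

# Hypothetical helper column: the calendar day of each call, time of day dropped.
calls = df.assign(day=pd.to_datetime(df["date"]).dt.normalize())
labels = visits.assign(day=pd.to_datetime(visits["date"]))[["day", "state"]]

# Left join keeps every call; calls without a same-day visit get NaN in `state`.
labelled = calls.merge(labels, on="day", how="left")
print(labelled[["date", "state"]])
```

With an exact-day match only the call made on the visit day receives a label, so the remaining calls need some rule that spreads the label over time.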


Two approaches to extrapolating info from `visits` onto `df`

<img src='images/conf1.png' width='400' />

In [66]:
df["conf1"] = [1. for _ in range(6)]

In [67]:
df

Unnamed: 0,date,X1,X2,conf1
0,2022-10-03 10:30,5.751966,44.846856,1.0
1,2022-10-03 12:40,6.989865,63.691903,1.0
2,2022-10-04 15:00,4.6409,37.858959,1.0
3,2022-10-06 18:00,6.30058,53.067628,1.0
4,2022-10-10 11:33,4.854741,58.22848,1.0
5,2022-10-10 18:34,6.768292,57.930207,1.0


<img src='images/conf2.png'>

Both `conf1` and `conf2` are functions of `time` only!

In [68]:
df["conf2"] = [0.5, 0.5, 1.0, 1.0, 0.5, 0.5]

In [69]:
df

Unnamed: 0,date,X1,X2,conf1,conf2
0,2022-10-03 10:30,5.751966,44.846856,1.0,0.5
1,2022-10-03 12:40,6.989865,63.691903,1.0,0.5
2,2022-10-04 15:00,4.6409,37.858959,1.0,1.0
3,2022-10-06 18:00,6.30058,53.067628,1.0,1.0
4,2022-10-10 11:33,4.854741,58.22848,1.0,0.5
5,2022-10-10 18:34,6.768292,57.930207,1.0,0.5
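The hard-coded `conf2` values can also be derived from the dates. The sketch below assumes a step function of the distance in days to the visit: 1.0 within 2 days, 0.5 otherwise. Both the window width and the two levels are assumptions chosen to reproduce the values above; the actual shape is the one shown in the figure.

```python
import numpy as np
import pandas as pd

dates = pd.to_datetime(pd.Series(["2022-10-03 10:30", "2022-10-03 12:40", "2022-10-04 15:00",
                                  "2022-10-06 18:00", "2022-10-10 11:33", "2022-10-10 18:34"]))
visit_day = pd.Timestamp("2022-10-06")

# Whole days between each call's calendar day and the visit day.
days_from_visit = (dates.dt.normalize() - visit_day).dt.days.abs()

# Assumed rule: full confidence within 2 days of the visit, 0.5 otherwise.
conf2 = np.where(days_from_visit <= 2, 1.0, 0.5)
print(conf2)
```

Because the distance is computed on calendar days, two calls made on the same day always get the same value.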


Note that calls from the same day are assigned the same `confidence factor`, because the key is `date`, not `datetime`. In fact, the data has two levels, days and calls within a day:

In [70]:
df.insert(0, 'level1_day', [1, 1, 2, 3, 4, 4])
df.insert(1, 'level2_hour', [1, 2, 1, 1, 1, 2])

In [71]:
df

Unnamed: 0,level1_day,level2_hour,date,X1,X2,conf1,conf2
0,1,1,2022-10-03 10:30,5.751966,44.846856,1.0,0.5
1,1,2,2022-10-03 12:40,6.989865,63.691903,1.0,0.5
2,2,1,2022-10-04 15:00,4.6409,37.858959,1.0,1.0
3,3,1,2022-10-06 18:00,6.30058,53.067628,1.0,1.0
4,4,1,2022-10-10 11:33,4.854741,58.22848,1.0,0.5
5,4,2,2022-10-10 18:34,6.768292,57.930207,1.0,0.5
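The two level columns do not have to be typed in by hand. A minimal sketch, assuming the rows are already sorted chronologically, derives them from the `date` column with a group-by on the calendar day (the `day` variable is a helper introduced here):

```python
import pandas as pd

df = pd.DataFrame({"date": ["2022-10-03 10:30", "2022-10-03 12:40", "2022-10-04 15:00",
                            "2022-10-06 18:00", "2022-10-10 11:33", "2022-10-10 18:34"]})
day = pd.to_datetime(df["date"]).dt.normalize()

# Level 1: consecutive number of the calendar day (1-based, chronological order).
level1_day = day.groupby(day).ngroup() + 1
# Level 2: running number of the call within its day (1-based, in row order).
level2_hour = day.groupby(day).cumcount() + 1

print(pd.DataFrame({"level1_day": level1_day, "level2_hour": level2_hour}))
```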


---

We use one of the `confidence factor` columns in a modelling procedure

```python
# Pseudocode sketch: the confidence factor passed in is the df["conf1"] column
model1 = SSFCM_with_CPR(data=df, confidence_factor=df["conf1"])
model1["estimated_conf"]
```

It yields an `adjusted confidence factor`, simulated here with random values:

In [78]:
s = np.round(np.random.uniform(0, 1, 6), 2)

In [79]:
s

array([0.47, 0.58, 0.05, 0.02, 0.44, 0.95])

In [80]:
df["conf1_adjusted"] = s

In [81]:
df

Unnamed: 0,level1_day,level2_hour,date,X1,X2,conf1,conf2,conf1_adjusted
0,1,1,2022-10-03 10:30,5.751966,44.846856,1.0,0.5,0.47
1,1,2,2022-10-03 12:40,6.989865,63.691903,1.0,0.5,0.58
2,2,1,2022-10-04 15:00,4.6409,37.858959,1.0,1.0,0.05
3,3,1,2022-10-06 18:00,6.30058,53.067628,1.0,1.0,0.02
4,4,1,2022-10-10 11:33,4.854741,58.22848,1.0,0.5,0.44
5,4,2,2022-10-10 18:34,6.768292,57.930207,1.0,0.5,0.95


Note that this is the first step of the CPR procedure: estimating the `adjusted confidence factor` from the data.

We may now want to use it in a final model.

```python
# Pseudocode sketch: the final model takes the df["conf1_adjusted"] column
model2 = SSFCM(data=df, confidence_factor=df["conf1_adjusted"])
```

In such a model, we are no longer interested in estimating the `adjusted confidence factor`: now the classifier and its accuracy are of interest.

We use the adjusted confidence factor only so that the (estimated) label uncertainty is reflected in the key quantity of interest, the accuracy of the SSFCM classifier.

Note that the `conf1_adjusted` values now live on `level2_hour`: there is a unique `adjusted confidence` value for each single call, not for each single day!

It no longer has anything (directly) to do with the `confidence factor`, but the mechanism of assigning label certainty should be the same, although discrete:

<img src="images/conf3.JPG" width="600">

Note: two calls from "2022-10-10" have different values of `confidence factor` fed to the algorithm!
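One way to check at which level a confidence column lives is to count its distinct values per day. The sketch below reuses the values from the tables above; `distinct_per_day` and `day_level` are names introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "level1_day": [1, 1, 2, 3, 4, 4],
    "conf2": [0.5, 0.5, 1.0, 1.0, 0.5, 0.5],
    "conf1_adjusted": [0.47, 0.58, 0.05, 0.02, 0.44, 0.95],
})

# Number of distinct values of each column within every day.
distinct_per_day = df.groupby("level1_day")[["conf2", "conf1_adjusted"]].nunique()
print(distinct_per_day)

# A column is day-level if it takes exactly one value on every day.
day_level = (distinct_per_day == 1).all()
print(day_level)
```

Here `conf2` is constant within each day, while `conf1_adjusted` takes two values on days 1 and 4, so it varies at the level of single calls.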

Technically, we want:

<img src="images/conf4.png" width="800">