<!-- WARNING: THIS FILE WAS AUTOGENERATED! DO NOT EDIT! -->

In [1]:
#|output: asis
#| echo: false
show_doc(HashableDataFrame)


---

[source](https://github.com/kjappelbaum/gpt3forchem/blob/main/gpt3forchem/helpers.py#L24){target="_blank" style="float:right; font-size:smaller"}

### HashableDataFrame

>      HashableDataFrame (obj)

Two-dimensional, size-mutable, potentially heterogeneous tabular data.

Data structure also contains labeled axes (rows and columns).
Arithmetic operations align on both row and column labels. Can be
thought of as a dict-like container for Series objects. The primary
pandas data structure.

In [None]:
from gpt3forchem.data import get_photoswitch_data
from functools import lru_cache

Let's define a cached function.

In [None]:
@lru_cache
def cached_function(df): 
    return df['Extinction'].sum()

In [None]:
data = get_photoswitch_data()

In [None]:
data = HashableDataFrame(data)

In [None]:
cached_function(data)

336213.03
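
`lru_cache` stores results in a dictionary keyed by the function arguments, so every argument has to be hashable, which a plain `pandas.DataFrame` is not. The following is a minimal sketch of how a hashable subclass could be written; it is not necessarily how `HashableDataFrame` is implemented. The class name `HashableFrameSketch`, the hashing strategy, and the toy values are assumptions made for illustration.

```python
from functools import lru_cache

import pandas as pd


class HashableFrameSketch(pd.DataFrame):
    """Hypothetical DataFrame subclass that can serve as a cache key."""

    def __hash__(self):
        # One uint64 hash per row (index included), folded into a single int.
        row_hashes = pd.util.hash_pandas_object(self, index=True)
        return hash(row_hashes.to_numpy().tobytes())

    def __eq__(self, other):
        # Dictionary lookups need a plain bool, not an element-wise frame.
        return isinstance(other, pd.DataFrame) and self.equals(other)


@lru_cache
def cached_sum(df):
    return df["Extinction"].sum()


# Toy values, not the photoswitch data.
frame = HashableFrameSketch(pd.DataFrame({"Extinction": [1.0, 2.5, 4.0]}))
first = cached_sum(frame)   # computed on the first call
second = cached_sum(frame)  # served from the cache on the second call
```

Overriding `__eq__` this way trades element-wise comparison for a single boolean, which is what the cache lookup needs; code that relies on `df == other` returning a frame would have to use a plain `DataFrame` instead.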

In [2]:
#|output: asis
#| echo: false
show_doc(picp)

---

[source](https://github.com/kjappelbaum/gpt3forchem/blob/main/gpt3forchem/helpers.py#L38){target="_blank" style="float:right; font-size:smaller"}

### picp

>      picp (y_true, y_lower, y_upper)

Prediction Interval Coverage Probability (PICP). Computes the fraction of samples for which the ground truth lies
within the predicted interval. Measures the prediction interval calibration for regression.

Args:
    y_true: Ground truth
    y_lower: predicted lower bound
    y_upper: predicted upper bound

Returns:
    float: the fraction of samples for which the ground truth lies within the predicted interval.

In [None]:
picp(np.array([1, 2, 3]), np.array([0, 1, 2]), np.array([2, 3, 4]))

1.0
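
To make the definition concrete, here is a minimal sketch of the coverage computation. The name `picp_sketch` is hypothetical, and treating the interval bounds as inclusive is an assumption; the library function may handle the edges differently.

```python
import numpy as np


def picp_sketch(y_true, y_lower, y_upper):
    """Fraction of targets inside [y_lower, y_upper] (bounds assumed inclusive)."""
    y_true = np.asarray(y_true, dtype=float)
    y_lower = np.asarray(y_lower, dtype=float)
    y_upper = np.asarray(y_upper, dtype=float)
    inside = (y_true >= y_lower) & (y_true <= y_upper)
    return float(inside.mean())


# Same inputs as the call above: every target is covered.
full_coverage = picp_sketch([1, 2, 3], [0, 1, 2], [2, 3, 4])
# Two of the four targets fall outside their interval.
partial_coverage = picp_sketch([1, 5, 3, 0], [0, 1, 2, 1], [2, 3, 4, 2])
```

A well-calibrated 90% prediction interval should give a coverage close to 0.9 on held-out data; values far above or below indicate intervals that are too wide or too narrow.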

If we use test-time augmentation for classification, we predict one class label per augmented example. A hand-wavy way of converting these votes into multiclass probabilities is to take the frequency with which each class occurs.

In [None]:
example_prediction_frame = pd.DataFrame(
    {
        'y_true': [1] * 10 + [2] * 10 + [3] * 10,
        'repr': ['a'] * 10 + ['b'] * 10 + ['c'] * 10,
        'y_pred': [1, 1, 2, 3, 1, 1, 1, 2, 4, 1] + [2, 2, 2, 2, 2, 2, 2, 2, 2, 2] + [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
    })

In [None]:
example_prediction_frame

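|    | y_true | repr | y_pred |
| -- | ------ | ---- | ------ |
| 0 | 1 | a | 1 |
| 1 | 1 | a | 1 |
| 2 | 1 | a | 2 |
| 3 | 1 | a | 3 |
| 4 | 1 | a | 1 |
| 5 | 1 | a | 1 |
| 6 | 1 | a | 1 |
| 7 | 1 | a | 2 |
| 8 | 1 | a | 4 |
| 9 | 1 | a | 1 |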


Let's convert the multiclass vote we create with test-time augmentation or ensemble models to "multiclass probabilities" by computing the frequency of every class.

In [3]:
#|output: asis
#| echo: false
show_doc(multiclass_vote_to_probabilities)

---

[source](https://github.com/kjappelbaum/gpt3forchem/blob/main/gpt3forchem/helpers.py#L55){target="_blank" style="float:right; font-size:smaller"}

### multiclass_vote_to_probabilities

>      multiclass_vote_to_probabilities
>                                        (prediction_frame:pandas.core.frame.DataFrame,
>                                        prediction_colum:str,
>                                        representation_column:str,
>                                        classes:Optional[Iterable]=None)

|    | **Type** | **Default** | **Details** |
| -- | -------- | ----------- | ----------- |
| prediction_frame | DataFrame |  | input dataframe with predictions and representations |
| prediction_colum | str |  | name of the column with predictions |
| representation_column | str |  | name of the column with representations |
| classes | typing.Optional[typing.Iterable] | None | names of all possible classes |
| **Returns** | **DataFrame** |  |  |

In [None]:
class_probabilities = multiclass_vote_to_probabilities(example_prediction_frame, 'y_pred', 'repr')
class_probabilities

|    | 0 | 1 | 2 | 3 | 4 | repr |
| -- | --- | --- | --- | --- | --- | ---- |
| 0 | 0.0 | 0.6 | 0.2 | 0.1 | 0.1 | a |
| 1 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | b |
| 2 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | c |


In [None]:
assert class_probabilities[class_probabilities['repr'] == 'b'][0].values[0] == 0
assert class_probabilities[class_probabilities['repr'] == 'b'][1].values[0] == 0
assert class_probabilities[class_probabilities['repr'] == 'b'][2].values[0] == 1
assert class_probabilities[class_probabilities['repr'] == 'b'][3].values[0] == 0
assert class_probabilities[class_probabilities['repr'] == 'b'][4].values[0] == 0
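
The frequency computation behind one row of this table can be written out by hand. The sketch below uses only the standard library; the helper name `vote_frequencies` is hypothetical and is not part of `gpt3forchem`. It takes the ten votes for representation `a` from the example frame.

```python
from collections import Counter


def vote_frequencies(votes, classes):
    """Relative frequency of each class among the votes for one representation."""
    counts = Counter(votes)
    total = len(votes)
    # Classes that never received a vote get a frequency of 0.0.
    return [counts[c] / total for c in classes]


# The ten predictions for representation 'a' in example_prediction_frame.
votes_a = [1, 1, 2, 3, 1, 1, 1, 2, 4, 1]
frequencies_a = vote_frequencies(votes_a, classes=range(5))
print(frequencies_a)  # [0.0, 0.6, 0.2, 0.1, 0.1]
```

Passing the full list of classes explicitly matters: without it, a class that never appears among the votes would be missing from the output instead of receiving probability zero.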

We can also extract a NumPy array in this way:

In [None]:
class_probabilities[np.arange(5)].values

array([[0. , 0.6, 0.2, 0.1, 0.1],
       [0. , 0. , 1. , 0. , 0. ],
       [0. , 0. , 0. , 1. , 0. ]])

We can then use this array to compute classification scores, e.g., the Brier score or the expected calibration error:

$$
\underset{\hat{P}}{\mathbb{E}}\left[\left|\mathbb{P}(\hat{Y}=Y \mid \hat{P}=p)-p\right|\right]
$$
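
With a finite number of samples this expectation cannot be evaluated directly, so it is usually approximated by sorting the $n$ predictions into $M$ confidence bins $B_1, \dots, B_M$ and comparing accuracy with mean confidence inside each bin. That `expected_calibration_error` uses exactly this estimator is an assumption, although its `num_bins` argument suggests it:

$$
\mathrm{ECE} \approx \sum_{m=1}^{M} \frac{|B_m|}{n} \left| \mathrm{acc}(B_m) - \mathrm{conf}(B_m) \right|
$$

Here $\mathrm{acc}(B_m)$ is the fraction of correctly classified samples in bin $m$ and $\mathrm{conf}(B_m)$ is the mean predicted confidence in that bin.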

In [4]:
#|output: asis
#| echo: false
show_doc(expected_calibration_error)

---

[source](https://github.com/kjappelbaum/gpt3forchem/blob/main/gpt3forchem/helpers.py#L304){target="_blank" style="float:right; font-size:smaller"}

### expected_calibration_error

>      expected_calibration_error (y_true, y_prob, y_pred=None, num_bins=10,
>                                  return_counts=False)

Computes the reliability curve and the expected calibration error [1]_.

References:
    .. [1] Chuan Guo, Geoff Pleiss, Yu Sun, Kilian Q. Weinberger; "On Calibration of Modern Neural Networks"; Proceedings of the 34th International Conference
        on Machine Learning, PMLR 70:1321-1330, 2017.

The expected calibration error is the difference in expectation between the confidence and accuracy. 

Args:
    y_true: array-like of shape (n_samples,)
        ground truth labels.
    y_prob: array-like of shape (n_samples, n_classes).
        Probability scores from the base model.
    y_pred: array-like of shape (n_samples,)
        predicted labels.
    num_bins: number of bins.
    return_counts: set to True to return counts also.

Returns:
    float or tuple:
        - ece (float): expected calibration error.
        - confidences_in_bins: average confidence in each bin (returned only if return_counts is True).
        - accuracies_in_bins: accuracy in each bin (returned only if return_counts is True).
        - frac_samples_in_bins: fraction of samples in each bin (returned only if return_counts is True).

In [5]:
#|output: asis
#| echo: false
show_doc(multiclass_brier_score)

---

[source](https://github.com/kjappelbaum/gpt3forchem/blob/main/gpt3forchem/helpers.py#L287){target="_blank" style="float:right; font-size:smaller"}

### multiclass_brier_score

>      multiclass_brier_score (y_true, y_prob)

Brier score for multi-class.

Args:
    y_true: array-like of shape (n_samples,)
        ground truth labels.
    y_prob: array-like of shape (n_samples, n_classes).
        Probability scores from the base model.
Returns:
    float: Brier score.

In [None]:
multiclass_brier_score(
    example_prediction_frame.groupby('repr').mean()['y_true'].values.astype(int),
    class_probabilities[np.arange(5)].values,
)

0.07333333333333335
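
This value can be reproduced by hand under one common definition of the multi-class Brier score: the squared difference between the predicted probabilities and the one-hot encoded label, summed over classes and averaged over samples. That `multiclass_brier_score` is implemented exactly like this is an assumption, and `brier_sketch` is a hypothetical name; the sketch does match the number printed above.

```python
import numpy as np


def brier_sketch(y_true, y_prob):
    """Mean over samples of the squared error summed over classes."""
    y_true = np.asarray(y_true, dtype=int)
    y_prob = np.asarray(y_prob, dtype=float)
    # One-hot encode the labels, using them as column indices.
    one_hot = np.zeros_like(y_prob)
    one_hot[np.arange(len(y_true)), y_true] = 1.0
    return float(np.mean(np.sum((y_prob - one_hot) ** 2, axis=1)))


# Class probabilities and true labels of the three representations a, b, c.
probabilities = [
    [0.0, 0.6, 0.2, 0.1, 0.1],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
score = brier_sketch([1, 2, 3], probabilities)  # approximately 0.0733
```

Only the first row contributes: its squared errors are 0.16 + 0.04 + 0.01 + 0.01 = 0.22, and dividing by the three samples gives roughly 0.0733.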

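0 is the best Brier score. The value above is consistent with summing the squared errors over classes and averaging over samples; under that definition the worst possible score is 2, reached when all probability mass sits on a single wrong class.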

In [None]:
expected_calibration_error(
    example_prediction_frame.groupby('repr').mean()['y_true'].values.astype(int),
    class_probabilities[np.arange(5)].values,
)

0.13333333333333333
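
The number above can be traced with a small binned estimator: take the largest class probability as the confidence, group samples into equal-width confidence bins, and sum the per-bin gaps between accuracy and mean confidence, weighted by the fraction of samples in each bin. The function name `ece_sketch` and the half-open bin edges are assumptions; the library function may bin differently, although this sketch reproduces the printed value.

```python
import numpy as np


def ece_sketch(y_true, y_prob, num_bins=10):
    """Binned estimate of the expected calibration error."""
    y_true = np.asarray(y_true)
    y_prob = np.asarray(y_prob, dtype=float)
    confidences = y_prob.max(axis=1)        # confidence of the predicted class
    predictions = y_prob.argmax(axis=1)     # predicted class index
    correct = (predictions == y_true).astype(float)

    edges = np.linspace(0.0, 1.0, num_bins + 1)
    ece = 0.0
    for lower, upper in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lower) & (confidences <= upper)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap      # weight by fraction of samples in bin
    return float(ece)


probabilities = [
    [0.0, 0.6, 0.2, 0.1, 0.1],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
]
ece = ece_sketch([1, 2, 3], probabilities)  # approximately 0.1333
```

All three predictions are correct, but the first one is made with a confidence of only 0.6. Its bin contributes (1/3) × |1 − 0.6| ≈ 0.1333, while the two fully confident predictions contribute nothing.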

Let's create a wrapper that orchestrates the whole process.

We will need to convert potential string labels into numerical ones and then compute both metrics. 
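
As an illustration of the label-conversion step, string labels can be mapped to integer indices with a plain dictionary. This is only a sketch: `encode_labels_sketch` is a hypothetical helper, the labels are made up, and the alphabetical ordering is an assumption that need not match what the library's `encode_categorical_value` does.

```python
def encode_labels_sketch(labels):
    """Map each distinct label to an integer index (alphabetical order)."""
    classes = sorted(set(labels))
    mapping = {label: index for index, label in enumerate(classes)}
    return [mapping[label] for label in labels], mapping


# Made-up string labels for illustration.
encoded, mapping = encode_labels_sketch(["low", "high", "low", "medium"])
print(encoded)  # [1, 0, 1, 2]
print(mapping)  # {'high': 0, 'low': 1, 'medium': 2}
```

For ordinal labels such as these, an explicit mapping that preserves the natural order is usually preferable to alphabetical sorting.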

In [6]:
#|output: asis
#| echo: false
show_doc(only_mode)

---

[source](https://github.com/kjappelbaum/gpt3forchem/blob/main/gpt3forchem/helpers.py#L371){target="_blank" style="float:right; font-size:smaller"}

### only_mode

>      only_mode (x)

In [7]:
#|output: asis
#| echo: false
show_doc(augmented_classification_scores)

---

[source](https://github.com/kjappelbaum/gpt3forchem/blob/main/gpt3forchem/helpers.py#L375){target="_blank" style="float:right; font-size:smaller"}

### augmented_classification_scores

>      augmented_classification_scores (repr, true, predictions,
>                                       cat_encode_func:Optional[Callable]=<function encode_categorical_value>,
>                                       class_names=array([0, 1, 2, 3, 4]))

In [None]:
augmented_classification_scores(
    example_prediction_frame['repr'],
    example_prediction_frame['y_true'],
    example_prediction_frame['y_pred'],
    cat_encode_func=None,
)


(pycm.ConfusionMatrix(classes: [1, 2, 3]),
 0.07333333333333335,
 0.13333333333333333)
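
The confusion matrix in this result needs a single predicted label per representation, which presumably comes from a majority vote over the augmented predictions (the role `only_mode` appears to play). The sketch below shows such a vote with the standard library only. The helper name `majority_vote_per_group` is hypothetical, and its tie-breaking (first label seen wins) is an assumption that may differ from the library's behaviour.

```python
from collections import Counter, defaultdict


def majority_vote_per_group(groups, predictions):
    """Most frequent prediction for each group (ties: first label seen wins)."""
    votes = defaultdict(list)
    for group, prediction in zip(groups, predictions):
        votes[group].append(prediction)
    return {group: Counter(v).most_common(1)[0][0] for group, v in votes.items()}


# Same representations and predictions as example_prediction_frame.
groups = ["a"] * 10 + ["b"] * 10 + ["c"] * 10
predictions = [1, 1, 2, 3, 1, 1, 1, 2, 4, 1] + [2] * 10 + [3] * 10
consensus = majority_vote_per_group(groups, predictions)
print(consensus)  # {'a': 1, 'b': 2, 'c': 3}
```

The consensus labels agree with the true labels for all three representations, which is consistent with the confusion matrix above listing only the classes 1, 2, and 3.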

In [8]:
#|output: asis
#| echo: false
show_doc(make_if_not_exists)

---

[source](https://github.com/kjappelbaum/gpt3forchem/blob/main/gpt3forchem/helpers.py#L404){target="_blank" style="float:right; font-size:smaller"}

### make_if_not_exists

>      make_if_not_exists (path)