# Usage in custom models

## Use as base class / Mixin

The most straightforward way to use `formulaic-contrasts` with a custom model is to use {class}`~formulaic_contrasts.FormulaicContrasts`
as a base class or mixin class.

As an example, let's wrap an Ordinary Least Squares ({class}`~statsmodels.regression.linear_model.OLS`) linear model in a custom class
for use with `formulaic-contrasts`. The aim is to build a model that takes a pandas DataFrame and a formulaic formula as input, e.g.

```python
model = StatsmodelsOLS(data, "~ treatment + response")
```

and allows fitting the model to a continuous variable from the DataFrame and performing a t-test:

```python
model.fit("variable")
model.t_test(model.contrast("variable", "baseline", "treatment"))
```

In [45]:
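from formulaic_contrasts import FormulaicContrasts, datasets
import numpy as np
import statsmodels.api as sm


class StatsmodelsOLS(FormulaicContrasts):
    # The base class stores the input DataFrame and the design built from the formula.
    def fit(self, variable: str):
        # Regress the chosen column on the design, then keep the fitted results.
        self.mod = sm.OLS(self.data[variable], self.design)
        self.mod = self.mod.fit()

    def t_test(self, contrast: np.ndarray):
        # Test a contrast vector against the fitted coefficients.
        return self.mod.t_test(contrast)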

In [28]:
df = datasets.treatment_response(80)
df

| Unnamed: 0 | treatment | response | biomarker |
|---|---|---|---|
| 0 | drugA | non_responder | 6.595490 |
| 1 | drugA | non_responder | 7.071509 |
| 2 | drugA | non_responder | 8.537421 |
| 3 | drugA | non_responder | 6.787991 |
| 4 | drugA | non_responder | 10.109717 |
| ... | ... | ... | ... |
| 75 | drugB | responder | 11.167627 |
| 76 | drugB | responder | 9.493773 |
| 77 | drugB | responder | 5.027817 |
| 78 | drugB | responder | 9.800762 |


In [33]:
model = StatsmodelsOLS(df, "~ treatment * response")
model.fit("biomarker")

In [34]:
model.t_test(
    model.cond(treatment="drugA", response="non_responder")
    - model.cond(treatment="drugA", response="responder")
)

<class 'statsmodels.stats.contrast.ContrastResults'>
                             Test for Constraints                             
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
c0             1.6492      0.935      1.764      0.082      -0.213       3.512
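
To make the contrast above more concrete, here is a minimal sketch of what the two `cond(...)` vectors and their difference look like. It does not call `formulaic-contrasts`. It assumes treatment coding with `drugA` and `non_responder` as reference levels, and the column names and order in `COLUMNS` are an assumption about how the design matrix is laid out. The helpers `cond_vector` and `subtract` are hypothetical names used only for illustration.

```python
# Assumed design-matrix columns for "~ treatment * response" under treatment
# coding (reference levels: drugA, non_responder). Illustration only.
COLUMNS = [
    "Intercept",
    "treatment[T.drugB]",
    "response[T.responder]",
    "treatment[T.drugB]:response[T.responder]",
]


def cond_vector(treatment: str, response: str) -> list[int]:
    """Design-matrix row for one combination of factor levels."""
    is_drug_b = int(treatment == "drugB")
    is_responder = int(response == "responder")
    # The interaction column is 1 only when both indicator columns are 1.
    return [1, is_drug_b, is_responder, is_drug_b * is_responder]


def subtract(a: list[int], b: list[int]) -> list[int]:
    """Element-wise difference of two rows, i.e. a contrast vector."""
    return [x - y for x, y in zip(a, b)]


contrast = subtract(
    cond_vector("drugA", "non_responder"),
    cond_vector("drugA", "responder"),
)
print(dict(zip(COLUMNS, contrast)))
```

Under these assumptions the only non-zero entry is a `-1` in the `response[T.responder]` column, so the t-test evaluates the negated responder coefficient within the reference treatment.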

In [None]:
from formulaic_contrasts import datasets, get_factor_storage_and_materializer
import pandas as pd
from pprint import pprint

## Use as an attribute

Alternatively, if you prefer to work without inheritance, you can store a `FormulaicContrasts` instance as an attribute of your model class and forward the methods you need:

In [None]:
class StatsmodelsOLS:
    def __init__(self, data: pd.DataFrame, design: str) -> None:
        self.data = data
        # Composition instead of inheritance: keep the helper as an attribute.
        self.formulaic_contrasts = FormulaicContrasts(data, design)

    def fit(self, variable: str):
        # Regress the chosen column on the design, then keep the fitted results.
        self.mod = sm.OLS(self.data[variable], self.formulaic_contrasts.design)
        self.mod = self.mod.fit()

    def cond(self, **kwargs):
        # Forward to the helper so callers can build contrasts from the model.
        return self.formulaic_contrasts.cond(**kwargs)

    def t_test(self, contrast: np.ndarray):
        return self.mod.t_test(contrast)


In [35]:
model = StatsmodelsOLS(df, "~ treatment * response")
model.fit("biomarker")

In [36]:
model.t_test(
    model.cond(treatment="drugA", response="non_responder")
    - model.cond(treatment="drugA", response="responder")
)

<class 'statsmodels.stats.contrast.ContrastResults'>
                             Test for Constraints                             
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
c0             1.6492      0.935      1.764      0.082      -0.213       3.512
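
With the attribute approach, every helper method you want to expose needs its own forwarding method. One possible alternative is to forward unknown attribute lookups via `__getattr__`. The sketch below illustrates the pattern with a hypothetical stand-in class (`ContrastBackend`) instead of the real `FormulaicContrasts`, so it runs without the library. The wrapper name and the stand-in's return values are made up for illustration.

```python
class ContrastBackend:
    """Hypothetical stand-in for a contrast helper; not the real library class."""

    def cond(self, **kwargs):
        return ("cond", kwargs)

    def contrast(self, column, baseline, group_to_compare):
        return ("contrast", column, baseline, group_to_compare)


class DelegatingModel:
    """Wrapper that forwards unknown public attributes to the helper."""

    def __init__(self, backend):
        self._contrasts = backend

    def __getattr__(self, name):
        # __getattr__ is only called when normal attribute lookup fails.
        # Refusing private names avoids infinite recursion if `_contrasts`
        # has not been set yet (e.g. during unpickling).
        if name.startswith("_"):
            raise AttributeError(name)
        return getattr(self._contrasts, name)


model_sketch = DelegatingModel(ContrastBackend())
print(model_sketch.cond(treatment="drugA"))
print(model_sketch.contrast("treatment", "drugA", "drugB"))
```

Explicit forwarding methods, as in the class above, keep the public interface visible to readers and type checkers. Implicit delegation is shorter but hides which methods are available.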

## Manual usage

You can also use the lower-level interface {func}`~formulaic_contrasts.get_factor_storage_and_materializer` to 
introspect formulaic models if the `FormulaicContrasts` class doesn't fit your needs. 

In [39]:
factor_storage, variables_to_factors, materializer_class = get_factor_storage_and_materializer()

Whenever a formula is materialized into a design matrix using `materializer_class`, `factor_storage` records metadata
for each *factor* in the formula, and `variables_to_factors` maps each *variable* to the factors that use it.

In [41]:
design_mat = materializer_class(df, record_factor_metadata=True).get_model_matrix("~ treatment * response")

In [46]:
pprint(factor_storage)

defaultdict(<class 'list'>,
            {'response': [FactorMetadata(name='response',
                                         reduced_rank=True,
                                         custom_encoder=False,
                                         categories=('non_responder',
                                                     'responder'),
                                         kind=<Kind.CATEGORICAL: 'categorical'>,
                                         drop_field='non_responder',
                                         column_names=('non_responder',
                                                       'responder'),
                                         colname_format='{name}[T.{field}]')],
             'treatment': [FactorMetadata(name='treatment',
                                          reduced_rank=True,
                                          custom_encoder=False,
                                          categories=('drugA', 'drugB'),
                          

In [43]:
variables_to_factors

defaultdict(set, {'treatment': {'treatment'}, 'response': {'response'}})
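
The `FactorMetadata` fields printed above are enough to reason about which design-matrix columns a categorical factor contributes. The following sketch shows one way to read them. It uses a hypothetical stand-in tuple (`FactorInfo`) that carries only a subset of the printed fields, and the helper `design_columns` is not part of `formulaic-contrasts`. It assumes that a reduced-rank factor omits the column for `drop_field`.

```python
from typing import NamedTuple


class FactorInfo(NamedTuple):
    """Hypothetical stand-in with a subset of the fields printed above."""

    name: str
    reduced_rank: bool
    categories: tuple
    drop_field: str
    colname_format: str


def design_columns(info: FactorInfo) -> list[str]:
    """Column names a categorical factor contributes to the design matrix."""
    kept = [
        category
        for category in info.categories
        # Assumption: a reduced-rank encoding drops the reference level.
        if not (info.reduced_rank and category == info.drop_field)
    ]
    return [info.colname_format.format(name=info.name, field=c) for c in kept]


response = FactorInfo(
    name="response",
    reduced_rank=True,
    categories=("non_responder", "responder"),
    drop_field="non_responder",
    colname_format="{name}[T.{field}]",
)
print(design_columns(response))
```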

In [52]:
factor_storage, variables_to_factors, materializer_class = get_factor_storage_and_materializer()
design_mat = materializer_class(df, record_factor_metadata=True).get_model_matrix("~ biomarker + np.log(biomarker) + C(treatment, contr.treatment(base='drugB'))")


In [54]:
pprint(factor_storage.keys())

dict_keys(['biomarker', 'np.log(biomarker)', "C(treatment, contr.treatment(base='drugB'))"])


In [56]:
pprint(variables_to_factors)

defaultdict(<class 'set'>,
            {'C': {"C(treatment, contr.treatment(base='drugB'))"},
             'biomarker': {'biomarker', 'np.log(biomarker)'},
             'contr.treatment': {"C(treatment, contr.treatment(base='drugB'))"},
             'np.log': {'np.log(biomarker)'},
             'treatment': {"C(treatment, contr.treatment(base='drugB'))"}})
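
The mapping above relates every name that appears in a factor expression to the factors that use it. As a rough illustration of that relationship, the sketch below rebuilds the same mapping with Python's `ast` module. This is not how `formulaic-contrasts` or formulaic compute it. The helper `dotted_names` is a hypothetical name, and the factor strings are copied from the output above.

```python
import ast
from collections import defaultdict


def dotted_names(expression: str) -> set[str]:
    """Collect names such as `biomarker` or `np.log` from a Python expression."""
    names: set[str] = set()

    class Collector(ast.NodeVisitor):
        def visit_Attribute(self, node: ast.Attribute) -> None:
            # Walk down `a.b.c` and record it as one dotted name.
            parts = []
            current = node
            while isinstance(current, ast.Attribute):
                parts.append(current.attr)
                current = current.value
            if isinstance(current, ast.Name):
                parts.append(current.id)
                names.add(".".join(reversed(parts)))
            else:
                # Something like `f(x).attr`: keep looking inside `f(x)`.
                self.visit(current)

        def visit_Name(self, node: ast.Name) -> None:
            names.add(node.id)

    Collector().visit(ast.parse(expression, mode="eval"))
    return names


factors = [
    "biomarker",
    "np.log(biomarker)",
    "C(treatment, contr.treatment(base='drugB'))",
]

# Invert the relationship: variable name -> set of factors that mention it.
sketch_variables_to_factors = defaultdict(set)
for factor in factors:
    for variable in dotted_names(factor):
        sketch_variables_to_factors[variable].add(factor)
```

A lookup such as `sketch_variables_to_factors["biomarker"]` then returns both the plain and the log-transformed factor, which is the kind of information needed to trace a variable through a transformed term.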
