# SciUnit Demo

K. Rosenfeld

Measles Team Meeting

12/13/2021

To create slides:
```
jupyter nbconvert demo-20211213.ipynb --to slides --TagRemovePreprocessor.remove_input_tags={\"to_remove\"} --post serve --SlidesExporter.reveal_transition=none --SlidesExporter.reveal_scroll=True 
```

- https://medium.com/learning-machine-learning/present-your-data-science-projects-with-jupyter-slides-75f20735eb0f
- https://www.michaelkam.id/data-visualisation/2020/06/28/creating-an-interactive-presentation-with-jupyter-notebook-and-plotly.html
- https://blog.kdheepak.com/jupyter-notebook-revealjs-and-github-pages.html
- https://nbconvert.readthedocs.io/en/latest/config_options.html

## Code vs. Model Review

### Code review
1. What functionality is a component expected to have?
2. What functionality has been adequately implemented? What remains to be done?
3. Does a candidate code contribution cause regressions in other parts of a program?

<span style="color:white">br</span>

### Model review
1. What is a model's scope and how is validity measured?
2. Which observations are already explained by existing models? What are the best models of a particular quantity? What data has yet to be explained?
3. What effect do new observations have on the validity of previously published models? Can new models explain previously published data?

<img src="images/sci-unit-square-small.png"
     width= "100"
     height= auto
     align="left"
     style="float: left; margin-right: 10px;" /> 
     
This presentation focuses on [SciUnit](https://scidash.org/sciunit.html), a Pythonic framework for data-driven unit testing. SciUnit is used to create a *domain-specific* model review package, which can then be applied across models. Each model is wrapped in a `sciunit.Model` that implements one or more `sciunit.Capability` classes, and a `sciunit.Test` judges the model through those capabilities.

```python
my_model = MyModel(**my_args) # Instantiate a class that wraps your model of interest.  
my_test = MyTest(**my_params) # Instantiate a test that you write.  
score = my_test.judge(my_model) # Runs the test on the model and returns a rich score containing test results and more.  
```

SciUnit contributions 2012-2021:
<img src="images/sciunit_history_12132021.png"
     width= "800"
     height= auto
     align="center"
     style="float: left; margin-right: 10px;" /> 
     
ICSE conference paper from 2014 and an active repository. Heavy users in neuroscience ([NeuronUnit](https://scidash.org/neuronunit.html)).

## Hypothetical example for orbital mechanics



## orbital mechanics: tests

*Test classes* are data-agnostic

```python
from cosmounit import PositionTest, VelocityTest, EccentricityTest # Cosmounit is an external library.

```

and *test instances* encode the data you want a model to recapitulate.
```python
from . import saturn_data # Hypothetical library containing Saturn data.  
position_test = PositionTest(observation=saturn_data.position)
velocity_test = VelocityTest(observation=saturn_data.velocity)
eccentricity_test = EccentricityTest(observation=saturn_data.eccentricity)
```

## orbital mechanics: models

Orbital models can predict the motion of any planet, but we are interested in Saturn:

```python
from cosmounit import PtolemyModel, CopernicusModel, KeplerModel, NewtonModel  
ptolemy_model = PtolemyModel(planet='Saturn')
copernicus_model = CopernicusModel(planet='Saturn')
kepler_model = KeplerModel(planet='Saturn')
newton_model = NewtonModel(planet='Saturn')
```

## orbital mechanics: test suite

```python
from saturn_suite.suites import saturn_motion_suite
saturn_motion_suite.judge([ptolemy_model, copernicus_model, kepler_model, newton_model])
```

<img src="images/cosmo_example.png"
     width= "800"
     height= auto
     align="center"
     style="float: left; margin-right: 10px;" />      

## Hypothetical example for measles / epi

## measles epi: tests

Tests can target a specific location or a theoretical result.

```python
from . import measles_data # Hypothetical library containing measles data.  
from epiunit import CCSTest, AgeAtInfTest, SeasonalityTest # epiunit is an external library.
ccs_test = CCSTest(observation=measles_data.ccs)
age_at_inf_test = AgeAtInfTest(observation=measles_data.Nigeria)
seas_test = SeasonalityTest(observation=measles_data.Nigeria)
```

These tests can then be packaged into a location-specific test suite:

```python
import sciunit
nigeria_epi_suite = sciunit.TestSuite([ccs_test, age_at_inf_test, seas_test])
```


## measles epi: models

Models could be differentiated by version, features, type, etc.

```python
from emod_package import EmodModel
from tsir_package import TSIRModel, DynaMICEModel 
emod_model = EmodModel(location='Nigeria')
tsir_model = TSIRModel(location='Nigeria')
mice_model = DynaMICEModel(location='Nigeria')
```

## measles epi: test suite

```python
from epi_suite.suites import nigeria_epi_suite
nigeria_epi_suite.judge([emod_model, tsir_model, mice_model])
```

A model does not need to support every test in the suite.
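
The idea can be sketched without SciUnit itself. In the dependency-free toy below, every class name, capability, and number is a hypothetical placeholder. A model that lacks a test's required capability gets "N/A" for that test instead of a failure, which is roughly the role SciUnit's `NAScore` plays.

```python
class ProducesCCS:
    """Hypothetical capability: report a critical community size."""
    def produce_ccs(self):
        raise NotImplementedError


class ProducesPeakMonth:
    """Hypothetical capability: report the month of peak incidence."""
    def produce_peak_month(self):
        raise NotImplementedError


class FullModel(ProducesCCS, ProducesPeakMonth):
    name = "full"

    def produce_ccs(self):
        return 300_000  # placeholder value, not real data

    def produce_peak_month(self):
        return 3  # placeholder value, not real data


class CCSOnlyModel(ProducesCCS):
    name = "ccs-only"

    def produce_ccs(self):
        return 250_000  # placeholder value, not real data


# Each toy test: (test name, required capability, pass/fail check).
TESTS = [
    ("ccs", ProducesCCS, lambda m: 200_000 <= m.produce_ccs() <= 400_000),
    ("seasonality", ProducesPeakMonth, lambda m: m.produce_peak_month() == 3),
]


def judge_suite(models, tests=TESTS):
    """Return {model name: {test name: 'Pass' | 'Fail' | 'N/A'}}."""
    results = {}
    for model in models:
        row = {}
        for test_name, capability, check in tests:
            if not isinstance(model, capability):
                row[test_name] = "N/A"  # model cannot take this test
            else:
                row[test_name] = "Pass" if check(model) else "Fail"
        results[model.name] = row
    return results


results = judge_suite([FullModel(), CCSOnlyModel()])
print(results)
```

The capability check happens before the test runs, so a missing capability is reported separately from a wrong prediction.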

## Example test suite: constant number generation

## Capabilities

Every model has a capability it aims to implement:

```python
class ProducesNumber(sciunit.Capability):
    """An example capability for producing some generic number."""

    def produce_number(self):
        """The implementation of this method should return a number."""
        raise NotImplementedError("Must implement produce_number.")
```

Each model may have a unique method for that particular capability:
    
```python
class ConstModel(sciunit.Model, ProducesNumber):
    """A model that always produces a constant number as output."""

    def __init__(self, constant, name=None):
        self.constant = constant 
        super(ConstModel, self).__init__(name=name, constant=constant)

    def produce_number(self):
        return self.constant
```

and create a model instance:
```python
const_model_37 = ConstModel(37, name='Constant Model 37')
```

## Tests

A `sciunit.Test` class must contain:
    
1. Required model capabilities and type of score
2. `generate_prediction` to get model prediction
3. `compute_score` to compute a `sciunit.Score`

```python
## Example test class
class EqualsTest(sciunit.Test):
    """Tests if the model predicts the same number as the observation."""   
    
    required_capabilities = (ProducesNumber,) # The one capability required for a model to take this test.  
    score_type = sciunit.scores.BooleanScore # This test's 'judge' method will return a BooleanScore.  
    
    def generate_prediction(self, model):
        return model.produce_number() # The model has this method if it inherits from the 'ProducesNumber' capability.
    
    def compute_score(self, observation, prediction):
        score = self.score_type(observation['value'] == prediction) # Returns a BooleanScore. 
        score.description = 'Passing score if the prediction equals the observation'
        return score
```       
```python
## Example test instance
# create test instances
equals_37_test = EqualsTest({'value': 37}, name='=37') # Test model output equals 37.
equals_1_test = EqualsTest({'value': 1}, name='=1') # Test model output equals 1.  

# create test suite
equals_suite = sciunit.TestSuite([equals_1_test, equals_37_test], name="Equals test suite")

# run suite
score_matrix = equals_suite.judge(const_model_37)
```

In [3]:
import sciunit
from sciunit.capabilities import ProducesNumber # One of many potential model capabilities.

class ConstModel(sciunit.Model, 
                 ProducesNumber):
    """A model that always produces a constant number as output."""
    
    def __init__(self, constant, name=None):
        self.constant = constant 
        super(ConstModel, self).__init__(name=name, constant=constant)

    def produce_number(self):
        return self.constant
    
const_model_37 = ConstModel(37, name="Constant Model 37")    

from sciunit.scores import BooleanScore # One of several SciUnit score types.  

class EqualsTest(sciunit.Test):
    """Tests if the model predicts 
    the same number as the observation."""   
    
    required_capabilities = (ProducesNumber,) # The one capability required for a model to take this test.  
    score_type = BooleanScore # This test's 'judge' method will return a BooleanScore.  
    
    def generate_prediction(self, model):
        return model.produce_number() # The model has this method if it inherits from the 'ProducesNumber' capability.
    
    def compute_score(self, observation, prediction):
        score = self.score_type(observation['value'] == prediction) # Returns a BooleanScore. 
        score.description = 'Passing score if the prediction equals the observation'
        return score
    
# create test instances
equals_37_test = EqualsTest({'value': 37}, name='=37') # Test model output equals 37.
equals_1_test = EqualsTest({'value': 1}, name='=1') # Test model output equals 1.  

# create test suite
equals_suite = sciunit.TestSuite([equals_1_test,  equals_37_test], name="Equals test suite")

# run suite
score_matrix = equals_suite.judge(const_model_37)

Score: Fail for Constant Model 37 on =1
Score: Pass for Constant Model 37 on =37


In [2]:
score_matrix

| | =1 | =37 |
|---|---|---|
| Constant Model 37 | Fail | Pass |


### Score types

Complete [score types](https://sciunit.readthedocs.io/en/latest/basics.html#score) in SciUnit are: 
1. [`BooleanScore`](https://sciunit.readthedocs.io/en/latest/sciunit.scores.html#sciunit.scores.complete.BooleanScore): true or false
2. [`ZScore`](https://sciunit.readthedocs.io/en/latest/sciunit.scores.html#sciunit.scores.complete.ZScore): standardized difference from the mean
3. [`CohenDScore`](https://sciunit.readthedocs.io/en/latest/sciunit.scores.html#sciunit.scores.complete.CohenDScore): normalized difference between two means
4. [`RatioScore`](https://sciunit.readthedocs.io/en/latest/sciunit.scores.html#sciunit.scores.complete.RatioScore): ratio of two numbers
5. [`PercentScore`](https://sciunit.readthedocs.io/en/latest/sciunit.scores.html#sciunit.scores.complete.PercentScore): float between 0 and 100
6. [`FloatScore`](https://sciunit.readthedocs.io/en/latest/sciunit.scores.html#sciunit.scores.complete.FloatScore): a float

Incomplete score types are also included (`NoneScore`, `TBDScore`, `NAScore`, `InsufficientDataScore`).
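
The arithmetic behind three of the complete score types can be sketched in plain Python. These are the textbook definitions with hypothetical function names and made-up inputs; SciUnit's own implementations may differ in detail (for example in how they pool variances or handle units).

```python
import math


def z_score(prediction, obs_mean, obs_std):
    """Standardized difference of a prediction from the observed mean."""
    return (prediction - obs_mean) / obs_std


def cohen_d(mean_a, std_a, n_a, mean_b, std_b, n_b):
    """Difference between two means, normalized by the pooled standard deviation."""
    pooled_var = ((n_a - 1) * std_a**2 + (n_b - 1) * std_b**2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var)


def ratio(prediction, observation):
    """Ratio of the predicted value to the observed value."""
    return prediction / observation


# Made-up numbers, for illustration only.
z = z_score(prediction=12.0, obs_mean=10.0, obs_std=2.0)
d = cohen_d(mean_a=12.0, std_a=2.0, n_a=20, mean_b=10.0, std_b=2.0, n_b=20)
r = ratio(prediction=12.0, observation=10.0)
print(z, d, r)
```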

SciUnit does not include statistical tests (e.g., Kolmogorov–Smirnov test, Cramér–von Mises test) for comparing distributions.
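
Such a comparison would have to be supplied by the test author. As a minimal sketch, the two-sample Kolmogorov–Smirnov statistic (the largest gap between two empirical CDFs) can be computed with the standard library alone; the function name is hypothetical, and in practice one would more likely call `scipy.stats.ks_2samp`, which also returns a p-value. The resulting number could presumably be returned from a test's `compute_score` wrapped in a `FloatScore`.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max over x of |ECDF_a(x) - ECDF_b(x)|."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    if not a or not b:
        raise ValueError("Both samples must be non-empty.")
    d_max = 0.0
    # The ECDFs are step functions, so the largest gap occurs at a data point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)  # fraction of a that is <= x
        cdf_b = bisect_right(b, x) / len(b)  # fraction of b that is <= x
        d_max = max(d_max, abs(cdf_a - cdf_b))
    return d_max


identical = ks_statistic([1, 2, 3], [1, 2, 3])
disjoint = ks_statistic([1, 2, 3], [4, 5, 6])
shifted = ks_statistic([1, 2, 3, 4], [3, 4, 5, 6])
print(identical, disjoint, shifted)
```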

SciUnit is only one way of tackling this problem. Writing the tests requires support and time, but they can be wrapped around the final model and then applied across models.

### Model validation:

- sciunit (python)
- scinunits (linux)
- idm-test (python)

*Paper reproduction*:

- [showyourwork](https://github.com/rodluger/showyourwork)

**References**:
- SciUnit [paper](https://github.com/cyrus-/papers/raw/master/sciunit-icse14/sciunit-icse14.pdf)
- SciUnit [docs and tutorials](https://sciunit.readthedocs.io/en/latest/)
