# <span style="background-color: yellow; color: black">Seismogram Title</span>

✨✨<span style="background-color: yellow; color: black">This is a short blurb on what model is being analyzed</span>✨✨

### Documentation
To find out more about ``seismometer``, see the [documentation](https://epic-open-source.github.io/seismometer/) on GitHub.

### Usage
Explore data from your organization's model including predictions, outcomes, interventions, and sensitive cohorts.  
Use ```sm.show_info()``` to explore what is available.

In [None]:
%matplotlib inline

import seismometer as sm
sm.run_startup(config_path='.')

In [None]:
sm.show_info(plot_help=True)

| Term                     | Definition                                                                                                                             |
|--------------------------|----------------------------------------------------------------------------------------------------------------------------------------|
| Sensitivity (Recall)     | Sensitivity is a model's true positive rate: The proportion of entities in the dataset that met the target criteria and were correctly scored above the threshold set for the model. |
| Specificity              | Specificity is a model's true negative rate: The proportion of entities in the dataset that did not meet the target criteria and were correctly scored below the threshold set for the model. |
| PPV (Precision)          | PPV is the positive predictive value of a model: How likely an entity above the selected threshold is to have met the target criteria.   |
| NPV                      | NPV is the negative predictive value of a model: How likely an entity below the selected threshold is to not have met the target criteria. |
| AUROC                    | AUROC is the area under the receiver operating characteristic curve, also known as the C-statistic. It is an aggregate measure of how well a model performs across all possible thresholds; it does not assess performance at a specific threshold. |
| Flag Rate                | The flag rate is the proportion of entities that would be identified as positive cases by the model at the selected threshold.           |
| Observed Rate            | The observed rate is the proportion of entities that met the target criteria.                                                          |
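
As a rough illustration of how the terms above relate to one another, the sketch below computes them from binary outcomes and scores in plain Python. The function names `threshold_metrics` and `auroc`, the example data, and the convention that a score at or above the threshold counts as flagged are all assumptions made for illustration; this is not how `seismometer` computes its metrics internally.

```python
def threshold_metrics(y_true, scores, threshold):
    """Compute confusion-matrix metrics at a single threshold (illustrative only)."""
    # Assumption: a score at or above the threshold counts as flagged.
    flagged = [s >= threshold for s in scores]
    tp = sum(1 for y, f in zip(y_true, flagged) if y == 1 and f)
    fp = sum(1 for y, f in zip(y_true, flagged) if y == 0 and f)
    fn = sum(1 for y, f in zip(y_true, flagged) if y == 1 and not f)
    tn = sum(1 for y, f in zip(y_true, flagged) if y == 0 and not f)
    n = tp + fp + fn + tn

    def ratio(num, den):
        # Avoid dividing by zero when a group is empty.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "flag_rate": ratio(tp + fp, n),
        "observed_rate": ratio(tp + fn, n),
    }


def auroc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        return float("nan")
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: 3 entities met the target criteria, 5 did not.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.3, 0.8, 0.4, 0.2, 0.1, 0.05]

metrics = threshold_metrics(y_true, scores, threshold=0.5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
print(f"auroc: {auroc(y_true, scores):.3f}")
```

Moving the threshold changes every metric except the observed rate and the AUROC, which is why the AUROC cannot substitute for a review at the configured threshold.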

## Overview

#### ℹ Info

✨✨<span style="background-color: yellow; color: black">**Model Overview:** A more thorough description of the model, its intended usage, and its expected output</span>✨✨

#### Selection

✨✨<span style="background-color: yellow; color: black">An explanation of cohorts and their definitions can be inserted here</span>✨✨

In [None]:
sm.cohort_list()

## Feature Monitor

### ℹ Info

✨✨<span style="background-color: yellow; color: black">Add guidance here on which features are most useful to always review and which might be acceptable even when they reach normal alerting levels. Rare labs or other documentation could be very indicative of the target but have unusually high missingness.</span>✨✨

**Tips**: 
 - See [feature monitoring](https://epic-open-source.github.io/seismometer/user_guide/index.html#feature-monitor) for more details.
 - This section provides insight into model inputs, demographics, and the set of interventions and outcomes. During early stages this will help validate configuration; afterwards it will assist with detecting feature and population drift. Read through the alerts identified for your data, dig deeper using the feature, demographic, and event summaries, or by comparing across targets or demographics.
 - **Other Warnings:** The variable profiles below will identify any concerning trends in feature distributions. Depending on the model, you may want to do additional configuration to silence these alerts until certain thresholds are met. 
 - Run the `sm.feature_summary()`/`sm.cohort_comparison_report()`/`sm.target_feature_summary()` functions in the cells below to get a report for the corresponding dataset.
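
To make the idea of drift in missingness concrete, here is a minimal sketch in plain Python that compares per-feature missing rates between two snapshots of data. The helper names `missing_rate` and `missingness_shift`, the feature names, the example rows, and the 0.10 tolerance are hypothetical; this is not the logic behind `sm.feature_alerts()`.

```python
import math


def missing_rate(rows, feature):
    """Fraction of rows where the feature is absent, None, or NaN."""
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    if not rows:
        return float("nan")
    return sum(is_missing(row.get(feature)) for row in rows) / len(rows)


def missingness_shift(baseline, current, features, tolerance=0.10):
    """Return features whose missing rate moved by more than `tolerance`."""
    shifted = {}
    for feature in features:
        before = missing_rate(baseline, feature)
        after = missing_rate(current, feature)
        if abs(after - before) > tolerance:
            shifted[feature] = (before, after)
    return shifted


# Hypothetical snapshots: an earlier period and a more recent one.
baseline = [
    {"age": 64, "lactate": 2.1},
    {"age": 71, "lactate": None},
    {"age": 58, "lactate": 1.4},
    {"age": 80, "lactate": 3.0},
]
current = [
    {"age": 66, "lactate": None},
    {"age": 49, "lactate": None},
    {"age": 77, "lactate": 1.9},
    {"age": 52, "lactate": None},
]

shifted = missingness_shift(baseline, current, ["age", "lactate"])
print(shifted)
```

A rarely ordered lab may legitimately sit at a high missing rate, so the change between periods is often more informative than the absolute level.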

### Reports

#### Feature Alerts
View automatically identified data quality issues for the model inputs in your dataset.

In [None]:
sm.feature_alerts()

#### Feature Summary Statistics and Plots
View the summary statistics and distributions for the model inputs in your dataset. 

In [None]:
sm.feature_summary()

#### Summarize Features by Cohort Subgroup
Run `sm.cohort_comparison_report()`, select two different groups to compare, and hit `Generate Report` to generate a comparative feature report.

In [None]:
sm.cohort_comparison_report()

#### Summarize Features by Target
Run `sm.target_feature_summary()` to get a link to a breakdown of your features stratified by the different target values.

✨✨<span style="background-color: yellow; color: black">Describe the default target being used for ground truth and any differences between it and how the model predictions might be used in practice.</span>✨✨

In [None]:
sm.target_feature_summary()

## Model Performance

### Overall

#### ℹ Info

**Model Performance Plots**

See [model performance plots](https://epic-open-source.github.io/seismometer/user_guide/index.html#model-performance) for more details.

**Tips:**
 - Thresholds configured for the model are highlighted on the graphs.
 - Use `sm.ExploreModelEvaluation()` to get model evaluation plots for your model.

✨✨<span style="background-color: yellow; color: black">Add reference points for what performance has been seen in literature or other baselines for comparison, and explain what counts as good or bad. The plots themselves have been described in a prior markdown cell.</span>✨✨

#### Visuals

In [None]:
sm.ExploreModelEvaluation()

### Fairness Overview

#### ℹ Info

See [fairness audit](https://epic-open-source.github.io/seismometer/user_guide/index.html#fairness-audit) for more details.

✨✨<span style="background-color: yellow; color: black">Call out any cohorts or metrics that are particularly important, as well as understood differences that are not expected to be tied to disparities.</span>✨✨

In [None]:
sm.ExploreFairnessAudit()

### Cohort Analysis 

#### ℹ Info

**Cohort Performance Plots**

See [cohort comparisons](https://epic-open-source.github.io/seismometer/user_guide/index.html#cohort-analysis) for more details.

**Tips:**
 - Thresholds configured for the model are highlighted on the graphs.
 - Use `sm.ExploreCohortEvaluation()` to get model evaluation plots for your model split by cohort subgroups.

✨✨<span style="background-color: yellow; color: black">Add reference points for what performance has been seen in literature. Similar to model performance but focused on a selected cohort. The plots themselves have been described in a prior markdown cell.</span>✨✨

#### Visuals

In [None]:
sm.show_cohort_summaries(by_target=False, by_score=False)

In [None]:
sm.ExploreCohortEvaluation()

## Outcomes

Success of integrating a predictive model depends on more than just the model's performance. Often, it can be determined by how well the model is integrated and how effectively (and equitably) interventions are applied. This section is intended to help analyze interventions and outcomes across sensitive groups or risk categories. See [analyzing outcomes](https://epic-open-source.github.io/seismometer/user_guide/index.html#outcomes) for more details.

### Trend comparison

✨✨<span style="background-color: yellow; color: black">Add information on the first configured intervention and outcome, what a successful workflow is expected to look like, and what would need further review.</span>✨✨

In [None]:
sm.ExploreCohortOutcomeInterventionTimes()

### Lead-time Analysis 

#### ℹ Info

Lead-time analysis reveals how much time a high prediction gives before an event of interest. These analyses implicitly restrict data to the positive cohort, as that is where the event is expected to occur.
The visualization uses violin plots, where each distribution of the subpopulation is represented as a vertical, mirrored density plot. The inner box within the violin plot highlights the interquartile range, while the central line indicates the median. When the distributions overlap significantly, it indicates that the model is providing equal opportunity for action to be taken based on the scores across the cohort groups.
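
The quantity behind each violin can be sketched as follows, using only the Python standard library. The record layout (`cohort`, `event_time`, `scores`), the function names, and the 0.5 threshold are assumptions for illustration and do not reflect how `sm.ExploreCohortLeadTime()` is implemented.

```python
from datetime import datetime
from statistics import median, quantiles


def lead_times_hours(records, threshold):
    """Hours from the first high score to the event, grouped by cohort."""
    by_cohort = {}
    for rec in records:
        event_time = rec["event_time"]
        if event_time is None:
            continue  # only the positive cohort has an event to measure against
        # Keep timestamps of scores at or above the threshold, up to the event.
        flagged = [t for t, score in rec["scores"] if score >= threshold and t <= event_time]
        if not flagged:
            continue  # never flagged before the event
        hours = (event_time - min(flagged)).total_seconds() / 3600
        by_cohort.setdefault(rec["cohort"], []).append(hours)
    return by_cohort


def summarize(hours):
    """Median and, when there are at least two values, the interquartile range."""
    summary = {"n": len(hours), "median": median(hours)}
    if len(hours) >= 2:
        q1, _, q3 = quantiles(hours, n=4, method="inclusive")
        summary["iqr"] = (q1, q3)
    return summary


# Hypothetical records: three with an event, one without.
records = [
    {"cohort": "A", "event_time": datetime(2024, 1, 1, 12),
     "scores": [(datetime(2024, 1, 1, 6), 0.2), (datetime(2024, 1, 1, 8), 0.7),
                (datetime(2024, 1, 1, 10), 0.9)]},
    {"cohort": "A", "event_time": datetime(2024, 1, 2, 12),
     "scores": [(datetime(2024, 1, 2, 10), 0.8)]},
    {"cohort": "B", "event_time": datetime(2024, 1, 3, 12),
     "scores": [(datetime(2024, 1, 3, 9), 0.6)]},
    {"cohort": "B", "event_time": None,
     "scores": [(datetime(2024, 1, 4, 9), 0.9)]},
]

lead = lead_times_hours(records, threshold=0.5)
for cohort, hours in sorted(lead.items()):
    print(cohort, summarize(hours))
```

Measuring from the first score above the threshold is one possible choice; measuring from the last score before the event would answer a different question.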

✨✨<span style="background-color: yellow; color: black">Add information on what lead time means in the context of your model; re-emphasize any windowing done for the target that filters the perspective.</span>✨✨

#### Visuals

In [None]:
sm.ExploreCohortLeadTime()