# Filtering application studies

This notebook filters the main dataset to include only empirical studies that apply the STRESS guidelines to document a computer simulation model.

## 1. Imports

### 1.1 Standard imports

In [1]:
import numpy as np
import pandas as pd

### 1.2 Review preprocessing imports

For convenience, the data pipeline described in [the pre-processing notebook](./01_cleaning.ipynb) can be imported from a local Python module, `data_pipeline.py`.

In [2]:
from data_pipeline import load_review_dataset

## 2. Constants

The constants in this notebook name the column and the value used to filter for application studies.

In [3]:
APP_FILTER_COLUMN_NAME = "used"
APP_FILTER_VALUE = "Yes"
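
As a hedged sketch, the same constants could also drive a quick sanity check before any filtering is done. The helper name `check_filter_column` and the toy frame are hypothetical and are not part of the original notebook or of `data_pipeline.py`; the constants are repeated only so that the block runs on its own.

```python
import pandas as pd

# Repeated from the constants cell so this block is self-contained.
APP_FILTER_COLUMN_NAME = "used"
APP_FILTER_VALUE = "Yes"


def check_filter_column(df: pd.DataFrame) -> None:
    """Hypothetical guard: raise if the filter column is missing or if the
    filter value never occurs in it."""
    if APP_FILTER_COLUMN_NAME not in df.columns:
        raise KeyError(f"Column '{APP_FILTER_COLUMN_NAME}' not found")
    if not (df[APP_FILTER_COLUMN_NAME] == APP_FILTER_VALUE).any():
        raise ValueError(
            f"No rows where '{APP_FILTER_COLUMN_NAME}' == '{APP_FILTER_VALUE}'"
        )


# Toy data for illustration only (not taken from the review dataset).
toy_df = pd.DataFrame({"used": ["Yes", "No", "Yes"]})
check_filter_column(toy_df)  # passes silently for this toy frame
```

A check of this kind makes the "already cleaned" assumption explicit, so a renamed column or a recoded category fails loudly instead of silently returning an empty frame.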

## 3. Function to filter to empirical application studies only

The main review dataset contains a Yes/No categorical column called `used`. The **Yes** category indicates that this was an empirical study where STRESS was applied to document a model.

In [4]:
def filter_to_application_studies(clean_df: pd.DataFrame) -> pd.DataFrame:
    """Filter the cleaned dataset down to studies that used STRESS to report
    a simulation study.

    Parameters
    ----------
    clean_df : pd.DataFrame
        Review dataframe. The main assumption is that this has already passed
        through the main cleaning pipeline.

    Returns
    -------
    pd.DataFrame
        The rows of `clean_df` where the `used` column equals "Yes".
    """
    # `used` is a Yes/No variable; keep only the "Yes" rows.
    filtered_df = clean_df[
        clean_df[APP_FILTER_COLUMN_NAME] == APP_FILTER_VALUE
    ]
    return filtered_df

## 4. Example use and number of returned studies

In [5]:
empirical_studies = (
    load_review_dataset()
    .pipe(filter_to_application_studies)
)
empirical_studies.shape

(73, 25)
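
The row count above depends on the review dataset itself. To see the filter's behaviour in isolation, here is a minimal sketch on a hypothetical three-row frame. The function body mirrors the one defined earlier in this notebook, and the toy values are made up for illustration.

```python
import pandas as pd

# Repeated from the constants cell so this block is self-contained.
APP_FILTER_COLUMN_NAME = "used"
APP_FILTER_VALUE = "Yes"


def filter_to_application_studies(clean_df: pd.DataFrame) -> pd.DataFrame:
    """Keep only rows where the `used` column equals "Yes"."""
    return clean_df[clean_df[APP_FILTER_COLUMN_NAME] == APP_FILTER_VALUE]


# Hypothetical toy data, not taken from the review dataset.
toy_df = pd.DataFrame(
    {
        "publication": ["Paper A", "Paper B", "Paper C"],
        "used": ["Yes", "No", "Yes"],
    }
)

toy_filtered = toy_df.pipe(filter_to_application_studies)
print(toy_filtered.shape)  # (2, 2)
```

The filtered frame keeps the original row labels (0 and 2 here); calling `.reset_index(drop=True)` gives a fresh index if one is needed.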

In [6]:
empirical_studies.head(2).T

No,5,8
publication,Estimation of Viral Aerosol Emissions From Sim...,Developing sustainable tourism destinations th...
authors,"Riediker Michael, Dai-Hua Tsai","Shafiee Sanaz, Saeed Jahanyan, Ali Rajabzadeh ..."
year,2020,2023
type_of_paper,Journal,Journal
journal,JAMA,Journal of Simulation
name_of_univerity,-,-
type_of_study,Empirical,Empirical
pre_prints,No,No
doi,https://jamanetwork.com/journals/jamanetworkop...,https://www.tandfonline.com/doi/pdf/10.1080/17...
used,Yes,Yes
