# Example usage

The examples below show how to use the HYPEHD package. The test datasets are open-source data from [https://github.com/insightsengineering/scda.2022](https://github.com/insightsengineering/scda.2022).

## Imports

In [1]:
from hypehd import visualization as vis
from hypehd import data_manipulation as da


## Read test data
`dm` is a dataset containing a set of essential standard variables (age, sex, race, ...) that describe each subject.
`vs` is a longitudinal dataset containing vital-sign records for each patient at each visit.

In [None]:
dm = da.read("csv", "data/demographic.csv")
vs = da.read("csv", "data/vital_signs.csv")
dm.head()

In [None]:
vs.head()

## Filter the dataset
Filter the `vs` dataset to keep only weight records, then merge it with the `dm` dataset, using the `data_selection()` function in `data_manipulation`.

In [None]:
test = da.data_selection(keep_col=["USUBJID", "PARAMCD", "AVAL", "AVISITN"], sort_by=["SEX", "AGE"], sort_asc=True,
                         input_data=vs, cond='PARAMCD=="WEIGHT"', merge_data=dm, merge_by="USUBJID",
                         merge_keep_col=["USUBJID", "ITTFL", "SEX", "AGE", "TRT01P"])
test.head()
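
The call above combines four steps: filtering rows, keeping columns, merging with a second dataset, and sorting. As a rough illustration of those steps only, here is a plain-Python sketch on toy records. It is not the HYPEHD implementation: the helper `select_and_merge`, the toy data, the inner-join behaviour, and the order of the steps are all assumptions.

```python
def select_and_merge(rows, keep, predicate, other, key, other_keep, sort_by):
    """Filter rows, keep some columns, attach columns from `other` by key, then sort.

    Hypothetical helper for illustration; not part of HYPEHD.
    """
    # Index the second dataset by the merge key, keeping only the requested columns.
    lookup = {r[key]: {c: r[c] for c in other_keep} for r in other}
    merged = []
    for row in rows:
        # Inner join is an assumption: rows without a match are dropped.
        if predicate(row) and row[key] in lookup:
            picked = {c: row[c] for c in keep}
            picked.update(lookup[row[key]])
            merged.append(picked)
    # Ascending sort on the requested columns.
    return sorted(merged, key=lambda r: tuple(r[c] for c in sort_by))


# Toy stand-ins for the vs and dm datasets (made-up values).
vs_rows = [
    {"USUBJID": "S1", "PARAMCD": "WEIGHT", "AVAL": 80.0, "AVISITN": 0},
    {"USUBJID": "S1", "PARAMCD": "PULSE", "AVAL": 62.0, "AVISITN": 0},
    {"USUBJID": "S2", "PARAMCD": "WEIGHT", "AVAL": 70.0, "AVISITN": 0},
]
dm_rows = [
    {"USUBJID": "S1", "SEX": "M", "AGE": 40},
    {"USUBJID": "S2", "SEX": "F", "AGE": 35},
]

selected = select_and_merge(
    vs_rows,
    keep=["USUBJID", "PARAMCD", "AVAL", "AVISITN"],
    predicate=lambda r: r["PARAMCD"] == "WEIGHT",
    other=dm_rows,
    key="USUBJID",
    other_keep=["USUBJID", "SEX", "AGE"],
    sort_by=["SEX", "AGE"],
)
```

In this toy run only the two weight records survive, and the record for `S2` comes first because the result is sorted by `SEX` and then `AGE`.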

## Derive baseline info
Use the `derive_baseline()` function in `data_manipulation` to calculate the change from baseline and the percent change from baseline of weight for each subject.

In [None]:
test = da.derive_baseline(input_data=test, by_vars=["USUBJID", "PARAMCD"], value="AVAL", chg=True, pchg=True,
                          base_visit='AVISITN==0')
test.head()
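
For readers unfamiliar with the derived quantities, the sketch below uses the conventional definitions: change = value minus baseline, and percent change = 100 × change / baseline. Whether `derive_baseline()` uses exactly these formulas is an assumption, since the source does not show them. The helper `add_baseline`, the `base` column name, and the toy data are hypothetical.

```python
def add_baseline(records, by_vars, value, visit_var, base_visit=0):
    """Return copies of records with base, chg and pchg added (illustrative only)."""
    # Find the baseline value for each group (e.g. subject + parameter).
    baseline = {}
    for rec in records:
        if rec[visit_var] == base_visit:
            baseline[tuple(rec[v] for v in by_vars)] = rec[value]

    out = []
    for rec in records:
        base = baseline.get(tuple(rec[v] for v in by_vars))
        new = dict(rec, base=base, chg=None, pchg=None)
        if base is not None:
            new["chg"] = rec[value] - base
            # Percent change is undefined when the baseline is zero.
            new["pchg"] = 100 * new["chg"] / base if base != 0 else None
        out.append(new)
    return out


# Toy weight records (made-up values); S2 has no baseline visit.
rows = [
    {"USUBJID": "S1", "PARAMCD": "WEIGHT", "AVISITN": 0, "AVAL": 80.0},
    {"USUBJID": "S1", "PARAMCD": "WEIGHT", "AVISITN": 1, "AVAL": 84.0},
    {"USUBJID": "S2", "PARAMCD": "WEIGHT", "AVISITN": 1, "AVAL": 70.0},
]
result = add_baseline(rows, ["USUBJID", "PARAMCD"], "AVAL", "AVISITN")
```

Here subject `S1` gains 4.0 from a baseline of 80.0, a 5.0 percent change, while `S2` has no baseline record, so its derived values stay empty.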

## Generate demographic plots
Use the `demo_graph()` function in `visualization` to plot AGE and SEX by treatment group.

In [None]:
vis.demo_graph(var=["AGE", "SEX"], input_data=dm, group='TRT01P')

## Generate line plots for longitudinal data
Use the `longitudinal_graph()` function in `visualization` to plot the change from baseline and the percent change from baseline by visit.

In [None]:
vis.longitudinal_graph(outcome=["chg", "pchg"], time="AVISITN", group="TRT01P", input_data=test)

## Derive extreme flags
Use the `derive_extreme_flag()` function to flag the last record and the maximum-value record for each patient and parameter.

In [None]:
df = da.derive_extreme_flag(input_data=vs, by_vars=['USUBJID', 'PARAMCD'], sort_var=['AVISITN'],
                            new_var="last_flag", mode="last", value_var="AVAL")
df = da.derive_extreme_flag(input_data=df, by_vars=['USUBJID', 'PARAMCD'], sort_var=['AVISITN'],
                            new_var="max_flag", mode="max", value_var="AVAL")
df.head(20)
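
The idea behind an extreme flag can be sketched in plain Python: group the records, order each group by the sort variable, and mark exactly one record per group. This is only an illustration, not the HYPEHD implementation. The helper `flag_extreme`, the `"Y"` flag value, and the tie-breaking rule (earliest visit wins) are assumptions.

```python
from collections import defaultdict


def flag_extreme(records, by_vars, sort_var, new_var, mode, value_var=None):
    """Set new_var = "Y" on one record per group; all other records get None."""
    # Collect the positions of the records belonging to each group.
    groups = defaultdict(list)
    for idx, rec in enumerate(records):
        groups[tuple(rec[v] for v in by_vars)].append(idx)

    out = [dict(rec, **{new_var: None}) for rec in records]
    for indices in groups.values():
        # Order the group's records by the sort variable (e.g. visit number).
        ordered = sorted(indices, key=lambda i: records[i][sort_var])
        if mode == "last":
            chosen = ordered[-1]
        elif mode == "max":
            # max() returns the first maximal element, so ties go to the earliest visit.
            chosen = max(ordered, key=lambda i: records[i][value_var])
        else:
            raise ValueError(f"unsupported mode: {mode}")
        out[chosen][new_var] = "Y"
    return out


# Toy weight records (made-up values).
rows = [
    {"USUBJID": "S1", "PARAMCD": "WEIGHT", "AVISITN": 0, "AVAL": 80.0},
    {"USUBJID": "S1", "PARAMCD": "WEIGHT", "AVISITN": 1, "AVAL": 85.0},
    {"USUBJID": "S1", "PARAMCD": "WEIGHT", "AVISITN": 2, "AVAL": 82.0},
    {"USUBJID": "S2", "PARAMCD": "WEIGHT", "AVISITN": 0, "AVAL": 70.0},
    {"USUBJID": "S2", "PARAMCD": "WEIGHT", "AVISITN": 1, "AVAL": 71.0},
]
flagged = flag_extreme(rows, ["USUBJID", "PARAMCD"], "AVISITN", "last_flag", "last")
flagged = flag_extreme(flagged, ["USUBJID", "PARAMCD"], "AVISITN", "max_flag", "max", "AVAL")
```

For `S1` the two flags land on different records (visit 2 is last, visit 1 has the maximum), which is why the example above derives them as separate columns.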

## Survival analysis
Use `time_to_event()` in `data_manipulation` to derive the time-to-event variable, then use `survival_analysis()` in `visualization` to generate the Kaplan-Meier (KM) plot.

In [None]:
dm2 = da.time_to_event(input_data=dm, start_date="TRTSDTM", end_date="DTHDT", censor_date="TRTEDTM",
                       new_var='time_to_death', unit='year')
dm2.head()
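
A time-to-event derivation typically works as follows: if the event date is present, the time runs from the start date to the event; otherwise the subject is censored at the censoring date. The sketch below illustrates that logic only. The helper `sketch_time_to_event`, the status coding (1 = event observed, 0 = censored), and the 365.25 days-per-year conversion are assumptions, and they may differ from what `time_to_event()` actually does.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed conversion factor


def sketch_time_to_event(start, end, censor, unit="year"):
    """Return (time, status); status 1 = event observed, 0 = censored (assumed coding)."""
    # Use the event date when it exists, otherwise fall back to the censoring date.
    stop, status = (end, 1) if end is not None else (censor, 0)
    days = (stop - start).days
    return (days / DAYS_PER_YEAR if unit == "year" else days), status


# Made-up dates: one subject with an event, one without.
died = sketch_time_to_event(date(2020, 1, 1), date(2021, 1, 1), date(2020, 6, 30), unit="day")
censored = sketch_time_to_event(date(2020, 1, 1), None, date(2020, 6, 30), unit="day")
died_years, _ = sketch_time_to_event(date(2020, 1, 1), date(2021, 1, 1), date(2020, 6, 30))
```

The first subject has an event after 366 days (2020 is a leap year); the second has no event date and is censored after 181 days.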

In [None]:
vis.survival_analysis(time="time_to_death", censor_status="censor_status", group="TRT01P", input_data=dm2)
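
A KM plot draws the Kaplan-Meier estimate of the survival function, which multiplies, over each distinct event time, the fraction of at-risk subjects who survive that time. The plain-Python sketch below computes the estimate for one group. It says nothing about how `survival_analysis()` is implemented; the helper `kaplan_meier`, the event coding (1 = event, 0 = censored), and the toy data are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time; events: 1 = event, 0 = censored."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Everyone observed at t (event or censored) leaves the risk set.
        at_risk -= leaving
    return curve


# Five made-up subjects: times in years, with two censored observations.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

With these toy inputs the estimate steps down to roughly 0.8, 0.6, and 0.3 at times 1, 2, and 3. The censored subjects do not cause a step, but they do shrink the risk set for later event times.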