# Generating output figures

**A note on Birth Death Skyline Models**
The results in this notebook are from a phylodynamics pipeline using Birth Death Skyline Models. Reading material on Birth Death Skyline Models can be found at:
* [Taming the BEAST Tutorial: Skylineplots](https://taming-the-beast.org/tutorials/Skyline-plots/) 
* [Stadler et al. 2012 PNAS](https://www.pnas.org/doi/full/10.1073/pnas.1207965110)


In [None]:
save_dir = 'all_runs/ex_data_for_Martin'

In [None]:
from beast_pype import (read_xml_set_logs_for_plotting, plot_comparative_box_violine,
                            plot_skyline, plot_comparative_origin, hdi_pivot)
from beast_pype import date_to_decimal, decimal_to_date
import pandas as pd
import os
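from IPython.core.getipython import get_ipython
# Markdown is used by the auto-generated cells at the end of the notebook.
from IPython.display import Markdown, display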

In [None]:
# This cell retrieves all the log files for the samples you selected.
# A sample directory is kept only if it holds merged.log or merged_log.csv;
# merged.log is preferred when both exist.
sample_dirs = {}
trace_path_dict = {}
for item in os.listdir(save_dir):
    directory = f'{save_dir}/{item}'
    log_file = f'{directory}/merged.log'
    csv_file = f'{directory}/merged_log.csv'
    if os.path.isfile(log_file):
        trace_path_dict[item] = log_file
    elif os.path.isfile(csv_file):
        trace_path_dict[item] = csv_file
    else:
        continue
    sample_dirs[item] = directory

# The most recent sampling date in each sample's metadata anchors its time axis.
youngest_tip_dates = {sample: pd.read_csv(f'{directory}/metadata.csv', parse_dates=['date'])['date'].max()
                      for sample, directory in sample_dirs.items()}


df, df_melted_for_seaborn = read_xml_set_logs_for_plotting(
    file_path_dict=trace_path_dict,
    convert_become_uninfectious_rate=True,
    youngest_tips_dict=youngest_tip_dates)

## Infection Period 

BD Skyline models estimate the rate of becoming uninfectious, whose inverse is the average infection period.
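
The inverse relationship can be illustrated with a minimal sketch. It assumes the rate is logged per year (common when BEAST trees are scaled in years) and converts the resulting period to days; `infection_period_days` and `DAYS_PER_YEAR` are hypothetical names for illustration and are not part of `beast_pype`, whose own conversion may differ.

```python
DAYS_PER_YEAR = 365.25  # assumption: the logged rate is per year


def infection_period_days(become_uninfectious_rate_per_year):
    """Return the mean infection period in days as the inverse of the rate."""
    if become_uninfectious_rate_per_year <= 0:
        raise ValueError("The rate of becoming uninfectious must be positive.")
    # 1 / rate gives the period in years; scale it to days.
    return DAYS_PER_YEAR / become_uninfectious_rate_per_year


# Illustrative value only: a rate of 36.525 per year is a period of about 10 days.
example_period = infection_period_days(36.525)
```

A higher rate of becoming uninfectious therefore means a shorter infection period.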

In [None]:
ax = plot_comparative_box_violine(df_melted_for_seaborn, 'Infection period (per day)')
display(ax)

In [None]:
infection_period_hdi_df = hdi_pivot(df, 'Infection period (per day)')
display(infection_period_hdi_df)

# $R_t$


## True Skyline

The effective reproductive number, Re, is estimated in successive time intervals for each variant. Note that, for computational speed, the resident variant is given less resolution prior to the arrival of the newly emerging lineages (this could be changed if of interest).
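
To make the interval structure concrete, here is a rough sketch of looking up a piecewise-constant Re at arbitrary times. It assumes equal-width intervals spanning the period from the origin to the youngest tip, with the first value applying to the oldest interval; that layout is an assumption about the model configuration, and `skyline_on_grid` is a hypothetical helper rather than what `plot_skyline` does internally.

```python
import numpy as np


def skyline_on_grid(re_values, origin_height, grid_heights):
    """Look up a piecewise-constant Re at heights (years before the youngest tip).

    Assumes equal-width intervals between the origin and the youngest tip,
    with re_values[0] covering the oldest interval.
    """
    re_values = np.asarray(re_values, dtype=float)
    grid_heights = np.asarray(grid_heights, dtype=float)
    n_intervals = re_values.size
    # Fraction of the way from the origin (0) to the youngest tip (1).
    fraction = 1.0 - grid_heights / origin_height
    index = np.clip(np.floor(fraction * n_intervals).astype(int), 0, n_intervals - 1)
    values = re_values[index]
    # Times older than the origin have no estimate.
    return np.where(grid_heights > origin_height, np.nan, values)


# Illustrative values only: two intervals over an origin height of one year.
example = skyline_on_grid([2.0, 1.0], origin_height=1.0, grid_heights=[0.9, 0.1, 1.5])
```

Because the origin is itself estimated, each posterior sample can place its interval boundaries at different calendar times, which is one reason to evaluate every sample on a shared grid before summarising.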

**Note** Lower values are 0.05 Highest Posterior Density (HPD), higher values are 0.95 HPD.
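
For readers unfamiliar with HPD intervals, the sketch below shows one common way to compute one from posterior samples: take the narrowest window that contains a chosen share of the sorted samples. The function `hdi` and its `mass` parameter are hypothetical and for illustration only; the method and interval mass that `beast_pype` uses are not confirmed here.

```python
import numpy as np


def hdi(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples."""
    sorted_samples = np.sort(np.asarray(samples, dtype=float))
    n_samples = sorted_samples.size
    n_inside = int(np.ceil(mass * n_samples))
    if n_inside < 1 or n_inside > n_samples:
        raise ValueError("mass must be in (0, 1] and samples must be non-empty.")
    # Width of every candidate window holding n_inside consecutive samples.
    widths = sorted_samples[n_inside - 1:] - sorted_samples[:n_samples - n_inside + 1]
    start = int(np.argmin(widths))
    return sorted_samples[start], sorted_samples[start + n_inside - 1]


# Illustrative samples only: 99 evenly spaced values plus one extreme outlier.
lower, upper = hdi(list(range(99)) + [1000])
```

Unlike an equal-tailed quantile interval, this window shifts towards the densest region of the posterior, so a single extreme sample does not stretch it.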

In [None]:
youngest_tip_year_decimals = {key: date_to_decimal(value) for key,value in youngest_tip_dates.items()}

In [None]:
fig, ax = plot_skyline(df,
                       youngest_tip_year_decimals,
                       parameter_start='reproductiveNumber',
                       ylabel='$R_t$',
                       grid_size=100,
                       x_tick_freq='yearly',
                       include_grid=True)


# Origin

The origin is the time at which the index case (the first Canadian case) became infected, which is slightly earlier than the time-to-the-most-recent-common-ancestor (tMRCA). This parameter is used to investigate the detection delay from emergence to first detection in Canada.
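
As a rough illustration of how an origin estimate maps to a calendar date, the sketch below converts between dates and decimal years and subtracts an origin height from the youngest tip date. It assumes the origin is expressed in years before the youngest tip; `to_decimal_year` and `from_decimal_year` are hypothetical stand-ins, and `beast_pype`'s `date_to_decimal` and `decimal_to_date` may behave differently.

```python
from datetime import datetime


def to_decimal_year(date):
    """Convert a datetime to a decimal year, e.g. mid-2022 to 2022.5."""
    year_start = datetime(date.year, 1, 1)
    year_end = datetime(date.year + 1, 1, 1)
    return date.year + (date - year_start) / (year_end - year_start)


def from_decimal_year(decimal_year):
    """Convert a decimal year back to a datetime."""
    year = int(decimal_year)
    year_start = datetime(year, 1, 1)
    year_end = datetime(year + 1, 1, 1)
    return year_start + (decimal_year - year) * (year_end - year_start)


# Illustrative values only: a youngest tip at mid-2022 and an origin half a year earlier.
youngest_tip = to_decimal_year(datetime(2022, 7, 2, 12))
origin_height = 0.5  # assumption: years before the youngest tip
origin_date = from_decimal_year(youngest_tip - origin_height)
```

Scaling by the length of the specific year keeps the conversion consistent across leap years.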

In [None]:
fig = plot_comparative_origin(df_melted_for_seaborn, tick_freq='quarterly', one_figure=True)


In [None]:
fig = plot_comparative_origin(df_melted_for_seaborn)

In [None]:
origin_hdi_df = hdi_pivot(df, 'Origin')
# Convert the decimal-year summaries to ISO-formatted calendar dates.
origin_hdi_df['Lower HDI Date'] = origin_hdi_df['Lower HDI'].map(decimal_to_date).dt.strftime("%Y-%m-%d")
origin_hdi_df['Median Date'] = origin_hdi_df['Median'].map(decimal_to_date).dt.strftime("%Y-%m-%d")
origin_hdi_df['Upper HDI Date'] = origin_hdi_df['Upper HDI'].map(decimal_to_date).dt.strftime("%Y-%m-%d")
origin_hdi_df

In [None]:
columns_not_to_plot = ['Strain_and_Sample_Size', 'Sample', 'origin_BDSKY_Serial', 'Origin',
                       'Rate of Becoming Uninfectious (per day)','becomeUninfectiousRate_BDSKY_Serial',
                       'Infection period (per day)']
columns_not_to_plot += df.columns[df.columns.str.startswith('reproductiveNumber')].tolist()
columns_not_to_plot += df.columns[df.columns.str.startswith('Unnamed')].tolist()

columns_to_plot = [column for column in df.columns if column not in columns_not_to_plot]

def create_new_cell(contents):
    shell = get_ipython()
    payload = dict(
        source='set_next_input',
        text=contents,
        replace=False,
    )
    shell.payload_manager.write_payload(payload, single=False)

def plot_next_parameter(columns_to_plot):
    if len(columns_to_plot) > 0:
        parameter = columns_to_plot[0]
        # Use the plotting function under the name it was imported with above.
        content = (
            f"ax = plot_comparative_box_violine(df_melted_for_seaborn, '{parameter}')\n"
            f"hdi_df = hdi_pivot(df, '{parameter}')\n"
            "plot_next_parameter(columns_to_plot)\n"
            f"display(Markdown('# {parameter}'), ax, hdi_df)"
        )
        create_new_cell(content)
        del columns_to_plot[0]

plot_next_parameter(columns_to_plot)