# Phase 5: Diagnosing Outputs from Running XML-sets and Generating a Report

This notebook attempts to replicate what the software [Tracer](https://beast.community/tracer) does, with some improvements.

## Instructions

Run the code cells of this Jupyter notebook sequentially with Shift+Enter. Several cells produce widgets that let you select the MCMC chains that have converged. Once you have made a selection, click on the cell below and press Shift+Enter.

## Suggested Reading

Up to and including "**x% HPD interval**" of:

Drummond, Alexei J., and Bouckaert, Remco R. ‘Ch 10: Posterior Analysis and Post Processing.’ In Bayesian Evolutionary Analysis with BEAST. Cambridge University Press, 2015. https://www.cambridge.org/core/books/bayesian-evolutionary-analysis-with-beast/81F5894F05E87F13C688ADB00178EE00.

The authors have been kind enough to make a draft copy of the book available at http://alexeidrummond.org/assets/publications/2015-drummond-bayesian.pdf.
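
Since the reading above ends at the "x% HPD interval", a small illustration of that idea may help. The following is a minimal sketch, not the method used by `beast_pype` or Tracer: it assumes a unimodal posterior and simply finds the narrowest window of sorted samples that contains the requested share of them. The function name `hpd_interval` and the exponential test draws are made up for illustration.

```python
import math
import random


def hpd_interval(samples, mass=0.95):
    """Return the narrowest interval containing `mass` of the samples.

    Illustrative helper (hypothetical name); assumes a unimodal sample.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < mass <= 1:
        raise ValueError("mass must be in (0, 1]")
    ordered = sorted(samples)
    n = len(ordered)
    # Number of samples each candidate interval must contain.
    n_in = max(1, math.ceil(mass * n))
    # Slide a window of n_in consecutive sorted samples; keep the narrowest.
    best_start = min(
        range(n - n_in + 1),
        key=lambda i: ordered[i + n_in - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + n_in - 1]


# Made-up, right-skewed "posterior" draws, only for demonstration.
rng = random.Random(1)
draws = [rng.expovariate(1.0) for _ in range(10_000)]
low, high = hpd_interval(draws, mass=0.95)
```

For a skewed sample such as this one, the HPD interval hugs the mode near zero, whereas an equal-tailed interval would cut off the lowest 2.5% of the draws.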

## Setup

In [None]:
save_dir = None
report_template = None
add_unreported_fields = True
collection_date_field = "date"

Import necessary packages.

In [None]:
from copy import deepcopy
import json
import papermill as pm
from beast_pype.mcmc_diagnostics import BEASTDiag
from beast_pype.report_gen import add_unreported_outputs
from beast_pype.workflow import get_slurm_job_stats
import warnings
import os
import importlib.resources as importlib_resources
# Suppress warnings raised from matplotlib modules (module is a regular expression).
warnings.filterwarnings("ignore", module=r"matplotlib.*")

In [None]:
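if report_template is None:
    # Default to the BDSKY report template shipped with beast_pype.
    report_template = importlib_resources.files('beast_pype') / 'report_templates' / 'BDSKY-Report.ipynb'

if save_dir is None:
    # Default to the current working directory.
    save_dir = os.getcwd()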


In [None]:
# The with-block closes the file, so no explicit close() is needed.
with open(os.path.join(save_dir, "pipeline_run_info.json"), "r") as file:
    pipeline_run_info = json.load(file)
pipeline_run_info["Chains Used"] = {}
pipeline_run_info["Burn-In"] = {}
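
The cell above expects a `pipeline_run_info.json` file in `save_dir`. As a minimal sketch of that step, the example below writes a throwaway file and loads it the same way. The helper name `load_run_info` and the file contents are invented for illustration; the only key taken from this notebook is `'slurm job IDs'`, and a real file written by the pipeline may contain other entries.

```python
import json
import tempfile
from pathlib import Path


def load_run_info(save_dir):
    """Load pipeline_run_info.json and add empty selection records.

    Hypothetical helper mirroring the notebook cell above.
    """
    path = Path(save_dir) / "pipeline_run_info.json"
    with path.open("r") as handle:
        run_info = json.load(handle)
    # Filled in later, once chains and burn-in have been selected.
    run_info["Chains Used"] = {}
    run_info["Burn-In"] = {}
    return run_info


# Demonstrate with a temporary directory and made-up job IDs.
with tempfile.TemporaryDirectory() as tmp:
    example = {"slurm job IDs": ["1001", "1002"]}
    (Path(tmp) / "pipeline_run_info.json").write_text(json.dumps(example))
    run_info = load_run_info(tmp)
```

Using `json.load` on the open handle avoids reading the file into a string first, and `pathlib` joins the path without relying on a hard-coded separator.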

## Get Information on the Pipeline Run
### Slurm Job Stats

In [None]:
try:
    # Summarise the Slurm accounting data for the pipeline's jobs into a table.
    slurm_job_stats = get_slurm_job_stats(pipeline_run_info['slurm job IDs'])
    slurm_job_stats.to_csv(f"{save_dir}/slurm_job_stats.csv", index=False)
    to_display = slurm_job_stats
except Exception:
    # Fall back to suggesting the equivalent sacct command for manual use.
    job_ids_request = ','.join([f"{entry}.batch" for entry in pipeline_run_info['slurm job IDs']])
    request = f"sacct --jobs={job_ids_request} --format=JobID,AllocTres,Elapsed,CPUTime,TotalCPU,MaxRSS -p --delimiter='/t'"
    to_display = ('The function for summarising Slurm job statistics into a table may not work properly with certain Slurm configurations (formatting issues).\n' +
                  'We suggest you run the following from the command line, in the terminal in which you ran this beast_pype workflow:\n' +
                  request)

display(to_display)
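
If the fallback message is shown, the output of `sacct` can still be tabulated by hand. The sketch below is not how `get_slurm_job_stats` works; it is a standard-library illustration that assumes the default `|` delimiter of `sacct -p`, a trailing delimiter on every line, and an `Elapsed` value of the form `[DD-]HH:MM:SS`. The sample text and the helper names `parse_sacct` and `elapsed_seconds` are made up.

```python
import csv
import io

# Made-up text in the shape produced by `sacct -p` (assumed, not real output).
sample_output = (
    "JobID|AllocTRES|Elapsed|CPUTime|TotalCPU|MaxRSS|\n"
    "1001.batch|cpu=4,mem=8G,node=1|01:02:03|04:08:12|03:59:10|2048000K|\n"
    "1002.batch|cpu=4,mem=8G,node=1|00:30:00|02:00:00|01:55:42|1024000K|\n"
)


def parse_sacct(text, delimiter="|"):
    """Turn delimited sacct text into a list of dicts, one per job step."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Drop the empty field created by the trailing delimiter on each line.
    rows = [row[:-1] if row[-1] == "" else row for row in reader if row]
    header, *records = rows
    return [dict(zip(header, record)) for record in records]


def elapsed_seconds(elapsed):
    """Convert an assumed [DD-]HH:MM:SS string to a number of seconds."""
    days, _, clock = elapsed.rpartition("-")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return (int(days) if days else 0) * 86400 + hours * 3600 + minutes * 60 + seconds


jobs = parse_sacct(sample_output)
run_times = {job["JobID"]: elapsed_seconds(job["Elapsed"]) for job in jobs}
```

A different `--delimiter` value can be passed through the `delimiter` argument, provided it is a single character, which is what the `csv` module requires.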