# Logging Reptilia Workflow

In [1]:
import pandas as pd

## MCMC Commands

File Name Notes:
- All Gamma (-mG) models have _G in the file name
- Gibbs models have Gibbs in the name

First date = when .pkl and sum.txt are output
Second date = when ex_rates, sp_rates, per_species_rates, mcmc are output
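
The naming notes above can be checked mechanically when sorting output files. The following is a minimal sketch, not part of PyRate: `describe_log_name` is a hypothetical helper, and it assumes that `_G` always appears as its own underscore-delimited token and that "Gibbs" may appear in any casing.

```python
from pathlib import Path

def describe_log_name(path):
    """Return naming-convention flags for a PyRate output file name (hypothetical helper)."""
    name = Path(path).name
    tokens = Path(path).stem.split("_")
    return {
        # Match "G" as a whole token so that a name containing "_Gibbs" is not
        # mistaken for a Gamma (-mG) run.
        "gamma": "G" in tokens,
        # Assumption: the casing of "Gibbs" in real file names may vary.
        "gibbs": "gibbs" in name.lower(),
    }

print(describe_log_name("Reptilia_cleaned_pyrate_input_1_G_BDS_BDNN_16_8Tc_mcmc.log"))
print(describe_log_name("Reptilia_cleaned_pyrate_input_1_BDS_BDNN_8_4Tc_mcmc.log"))
```

Matching on whole tokens rather than on the substring `_G` is the main design choice here; a plain substring test would also fire on any name that contains `_Gibbs`.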

CoVar Model (not BDNN): 8/15, 8/19
- python PyRate.py reptilia/Reptilia_cleaned_pyrate_input_PyRate.py -trait_file data/reptilia/Reptilia_species_traits.txt -mCov 5 -logT 1 -pC 0 -fixShift data/Time_bins_CrossStage.txt -qShift data/Time_bins_ByStages.txt -mG -A 0 -n 20000000 -s 2000
    - This is a CoVar BD model with fixed rate-shift times, log-transformed traits, a TPP + Gamma preservation model, and parameter-estimation MCMC

BDNN run 1: 8/15, 8/19
- python PyRate.py reptilia/Reptilia_cleaned_pyrate_input_PyRate.py -j 1 -fixShift data/Time_bins_CrossStage.txt -BDNNmodel 1 -trait_file data/reptilia/Reptilia_species_traits.txt -qShift data/Time_bins_ByStages.txt -mG -A 0 -n 20000000 -s 2000
    - The traits file needed to be a tab-separated .txt with normalized continuous variables, no nulls, and consistent data types
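
The traits-file requirements above could be checked before launching a long run. This is a rough sketch using pandas: the taxon names and trait columns are made-up toy data, `check_traits` and `min_max_normalize` are hypothetical helpers, and min-max scaling is only one possible reading of "normalized".

```python
import io
import pandas as pd

def min_max_normalize(df, cols):
    """Rescale the given continuous columns to [0, 1] (assumed normalization scheme)."""
    out = df.copy()
    for c in cols:
        lo, hi = out[c].min(), out[c].max()
        if hi > lo:  # leave constant columns untouched to avoid dividing by zero
            out[c] = (out[c] - lo) / (hi - lo)
    return out

def check_traits(df, name_col=0):
    """Return a list of problems found in a traits table (empty list = looks fine)."""
    problems = []
    traits = df.drop(columns=df.columns[name_col])
    if df.isna().any().any():
        problems.append("contains nulls")
    non_numeric = [c for c in traits.columns
                   if not pd.api.types.is_numeric_dtype(traits[c])]
    if non_numeric:
        problems.append(f"non-numeric columns: {non_numeric}")
    return problems

# Toy data standing in for Reptilia_species_traits.txt (values are invented).
toy = pd.DataFrame({
    "Taxon": ["taxon_a", "taxon_b", "taxon_c"],
    "body_mass": [2.0, 4.0, 6.0],
    "aquatic": [0, 1, 0],
})
normalized = min_max_normalize(toy, ["body_mass"])

# Round-trip through a tab-separated buffer, as the file would be written and read.
buffer = io.StringIO()
normalized.to_csv(buffer, sep="\t", index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer, sep="\t")
print(check_traits(reloaded))
```

For the real file, `normalized.to_csv("data/reptilia/Reptilia_species_traits.txt", sep="\t", index=False)` would write the same tab-separated layout.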

BDNN run 2: 8/27, 8/28
- python PyRate.py reptilia/Reptilia_cleaned_pyrate_input_PyRate.py -j 1 -fixShift data/Time_bins_CrossStage.txt -BDNNmodel 1 -trait_file data/reptilia/Reptilia_species_traits.txt -qShift data/Time_bins_CrossStage.txt -A 0 -n 20000000 -s 20000 -BDNNnodes 8 4 -BDNNupdate_f 0.05 0.05 0.25 -singleton 1
- python PyRate.py reptilia/Reptilia_cleaned_pyrate_input_PyRate.py -j 1 -fixShift data/Time_bins_ByStages.txt -BDNNmodel 1 -trait_file data/reptilia/Reptilia_species_traits.txt -qShift data/Time_bins_ByStages.txt -A 0 -n 20000000 -s 2000 -BDNNnodes 8 4 -BDNNupdate_f 0.05 0.05 0.25 -singleton 1
    - Removed -mG flag
    - Removed singletons using -singleton 1
    - Reduced network complexity:
        - -BDNNnodes 8 4
        - -BDNNupdate_f 0.05 0.05 0.25
    - Shifted dates towards the present to remove empty space from LAD to present day: -translate 175.0

BDNN run 3: Torsten Reduced Complexity + no -mG, 8/28, 8/29
- python ../PyRate/PyRate.py reptilia/Reptilia_cleaned_pyrate_input_PyRate.py -j 1 -fixShift data/Time_bins_CrossStage.txt -BDNNmodel 1 -trait_file data/reptilia/Reptilia_species_traits.txt -qShift data/Time_bins_CrossStage.txt -n 50000000 -s 50000 -BDNNnodes 8 4 -translate -175
    - *Result*: low ESS for prior and BD_lik; burn-in ~15%
- python ../PyRate/PyRate.py reptilia/Reptilia_cleaned_pyrate_input_PyRate.py -j 1 -fixShift data/Time_bins_ByStages.txt -BDNNmodel 1 -trait_file data/reptilia/Reptilia_species_traits.txt -qShift data/Time_bins_ByStages.txt -n 50000000 -s 50000 -BDNNnodes 8 4 -translate -175
    - Starting to use PyRate from PyRate repo, not Arielli repo
    - Removed -mG flag
    - Reduced network complexity: -BDNNnodes 8 4
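
The ESS and burn-in result noted for run 3 can be approximated directly from a log column. The sketch below is a rough autocorrelation-based estimate, not the method used to obtain the values in these notes; it runs on synthetic chains, and `drop_burnin` and `effective_sample_size` are hypothetical helper names.

```python
import numpy as np

def drop_burnin(values, burnin=0.15):
    """Discard the first `burnin` fraction of samples (0.15 mirrors -b 0.15)."""
    values = np.asarray(values, dtype=float)
    return values[int(len(values) * burnin):]

def effective_sample_size(values):
    """Rough ESS: n / (1 + 2 * sum of the leading positive autocorrelations)."""
    x = np.asarray(values, dtype=float)
    n = len(x)
    x = x - x.mean()
    var = np.dot(x, x) / n
    if n < 3 or var == 0:
        return float("nan")  # ESS is not meaningful for tiny or constant chains
    # Autocorrelation at lags 0..n-1; acf[0] is 1 by construction.
    acf = np.correlate(x, x, mode="full")[n - 1:] / (var * n)
    tau = 1.0
    for rho in acf[1:]:
        if rho <= 0:  # truncate at the first non-positive autocorrelation
            break
        tau += 2.0 * rho
    return n / tau

# Synthetic demo: independent draws versus a strongly autocorrelated AR(1) chain.
rng = np.random.default_rng(0)
noise = rng.normal(size=2000)
sticky = np.empty_like(noise)
sticky[0] = noise[0]
for i in range(1, len(noise)):
    sticky[i] = 0.9 * sticky[i - 1] + noise[i]

ess_independent = effective_sample_size(drop_burnin(noise))
ess_sticky = effective_sample_size(drop_burnin(sticky))
print(round(ess_independent), round(ess_sticky))
```

With a real log loaded through `pd.read_csv(..., sep='\t')`, the same call would be `effective_sample_size(drop_burnin(mcmc["BD_lik"]))`. Dedicated MCMC diagnostics tools may use a different truncation rule and report somewhat different numbers.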

BDNN run 4: Gibbs + no -mG
- python ../PyRate/PyRate.py reptilia/Reptilia_cleaned_pyrate_input_PyRate.py -j 1 -fixShift data/Time_bins_CrossStage.txt -BDNNmodel 1 -trait_file data/reptilia/Reptilia_species_traits.txt -qShift data/Time_bins_CrossStage.txt -n 50000000 -s 50000 -se_gibbs -translate -175
- python ../PyRate/PyRate.py reptilia/Reptilia_cleaned_pyrate_input_PyRate.py -j 1 -fixShift data/Time_bins_ByStages.txt -BDNNmodel 1 -trait_file data/reptilia/Reptilia_species_traits.txt -qShift data/Time_bins_ByStages.txt -n 50000000 -s 50000 -se_gibbs -translate -175
    - Removed -mG flag
    - Uses Gibbs sampler: -se_gibbs True
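
Runs 3 and 4 differ only in the time-bin file and a few flags, so the command lines could be generated rather than retyped. This is a sketch under that assumption: `bdnn_command` is a hypothetical helper, and every path and flag in it is copied from the commands above rather than from the PyRate documentation.

```python
import shlex

def bdnn_command(time_bins, extra_flags, pyrate="../PyRate/PyRate.py"):
    """Assemble a BDNN command as an argument list (hypothetical helper)."""
    return [
        "python", pyrate,
        "reptilia/Reptilia_cleaned_pyrate_input_PyRate.py",
        "-j", "1",
        "-fixShift", time_bins,
        "-BDNNmodel", "1",
        "-trait_file", "data/reptilia/Reptilia_species_traits.txt",
        "-qShift", time_bins,
        "-n", "50000000",
        "-s", "50000",
        *extra_flags,
        "-translate", "-175",
    ]

# Print the two run 4 (Gibbs) commands, one per time-bin file.
for bins in ("data/Time_bins_CrossStage.txt", "data/Time_bins_ByStages.txt"):
    cmd = bdnn_command(bins, ["-se_gibbs"])
    print(shlex.join(cmd))
    # To launch instead of printing: subprocess.run(cmd, check=True)
```

Passing `["-BDNNnodes", "8", "4"]` as `extra_flags` reproduces the run 3 commands in the same way.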

NOTE:
- Cross Stages from BDNN run 4.a and 3.b. are the only ones currently running as of 8/30!


In [2]:
mcmc = pd.read_csv('pyrate_mcmc_logs/Reptilia_cleaned_pyrate_input_1_G_BDS_BDNN_16_8Tc_mcmc.log', sep='\t')
mcmc.head()

| | it | posterior | prior | PP_lik | BD_lik | q_0 | q_1 | q_2 | q_3 | q_4 | ... | Yelaphomte_TE | Yimenosaurus_TE | Youngetta_TE | Youngina_TE | Youngosuchus_TE | Yunguisaurus_TE | Yunnanosaurus_TE | Zanclodon_TE | Zhongjiania_TE | Zupaysaurus_TE |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | -19488.475162 | -519.246613 | -11382.427246 | -7586.801303 | 0.283842 | 0.283842 | 0.283842 | 0.283842 | 0.283842 | ... | 210.773269 | 199.882408 | 247.366545 | 251.166559 | 247.083426 | 237.072325 | 199.3755 | 191.859012 | 255.134883 | 222.219523 |
| 1 | 2000 | -15455.251899 | -528.505992 | -10340.130915 | -4586.614992 | 0.517565 | 0.428381 | 0.416735 | 0.545225 | 0.394254 | ... | 210.461578 | 199.579867 | 247.366545 | 250.991207 | 247.083426 | 237.072325 | 199.3755 | 192.214232 | 255.248001 | 222.219523 |
| 2 | 4000 | -15151.722528 | -537.285448 | -10387.004305 | -4227.432776 | 0.264763 | 1.172631 | 0.658385 | 0.29891 | 0.709248 | ... | 210.461578 | 199.36223 | 247.281448 | 250.991207 | 246.858753 | 237.102069 | 198.961248 | 191.736214 | 255.628004 | 222.219523 |
| 3 | 6000 | -15143.049096 | -543.92294 | -10399.280736 | -4199.84542 | 0.599488 | 1.129137 | 0.600612 | 0.443968 | 0.798106 | ... | 210.461578 | 199.539161 | 247.400523 | 251.372468 | 246.858753 | 237.030775 | 199.055066 | 191.370345 | 255.615157 | 222.016264 |
| 4 | 8000 | -15161.341155 | -550.906728 | -10443.5752 | -4166.859227 | 0.56897 | 1.591494 | 0.669723 | 0.427113 | 0.798312 | ... | 210.281404 | 199.919074 | 247.350455 | 252.446115 | 246.858753 | 235.9386 | 199.055066 | 190.719868 | 255.615157 | 222.202844 |


In [4]:
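# Check whether any column holds Python list objects.
# Series.map is applied per column because DataFrame.applymap is deprecated in pandas >= 2.1.
list_columns = [col for col in mcmc.columns if mcmc[col].map(lambda x: isinstance(x, list)).any()]
list_columns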

## Post-Processing Commands

* = Done

### Marginal RTT Plot
BDNN run 3:
- *python ../PyRate/PyRate.py -plotBDNN reptilia/pyrate_mcmc_logs/bdnn3_cross/Reptilia_cleaned_pyrate_input_1_BDS_BDNN_8_4Tc_mcmc.log -b 0.15 -translate -175
- python ../PyRate/PyRate.py -plotBDNN reptilia/pyrate_mcmc_logs/bdnn3_by/  _mcmc.log -b 0.1 -translate -175

BDNN run 4 (Gibbs):
- python ../PyRate/PyRate.py -plotBDNN reptilia/pyrate_mcmc_logs/bdnn4_cross/  _mcmc.log -b 0.1 -translate -175

### Partial Dependence Plots (PDP)
BDNN run 3:
- *python ../PyRate/PyRate.py -plotBDNN_effects reptilia/pyrate_mcmc_logs/bdnn3_cross/Reptilia_cleaned_pyrate_input_1_BDS_BDNN_8_4Tc_mcmc.log -plotBDNN_transf_features data/reptilia/reptilia_backscale.txt -translate -175 -b 0.15 -resample 100
- python ../PyRate/PyRate.py -plotBDNN_effects reptilia/pyrate_mcmc_logs/bdnn3/    mcmc.log -translate -175 -b 0.15 -resample 100
    - Testing without backscale.txt to see difference

BDNN run 4:
- python ../PyRate/PyRate.py -plotBDNN_effects reptilia/pyrate_mcmc_logs/bdnn4      mcmc.log -plotBDNN_transf_features data/reptilia/reptilia_backscale.txt

### Partial Dependence Rates
BDNN run 3:
- *python ../PyRate/PyRate.py -BDNN_interaction reptilia/pyrate_mcmc_logs/bdnn3_cross/Reptilia_cleaned_pyrate_input_1_BDS_BDNN_8_4Tc_mcmc.log -plotBDNN_transf_features data/reptilia/reptilia_backscale.txt -b 0.15 -resample 100

BDNN run 4:
- python ../PyRate/PyRate.py -BDNN_interaction reptilia/pyrate_mcmc_logs/bdnn4      mcmc.log -plotBDNN_transf_features data/reptilia/reptilia_backscale.txt -b 0.1 -resample 100

### Predictor Importance
BDNN run 3:
- python ../PyRate/PyRate.py -BDNN_pred_importance reptilia/pyrate_mcmc_logs/bdnn3       _mcmc.log -plotBDNN_transf_features data/reptilia/reptilia_backscale.txt -b 0.15 -resample 100

BDNN run 4:
- python ../PyRate/PyRate.py -BDNN_pred_importance reptilia/pyrate_mcmc_logs/bdnn4       _mcmc.log -plotBDNN_transf_features data/reptilia/reptilia_backscale.txt -b 0.1 -resample 100