### Generate an environment
Currently, this just copies data/default_env.pickle into data/env.pickle.

**To do**: Add a DVC parameter such as `env_name`, and generate a new environment whenever it is not set to "default".
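
A minimal sketch of how that to-do might look inside the script, assuming the parameter value has already been read from the DVC parameters file. The function name `generate_env` and its arguments are hypothetical and not taken from `scripts/gen_env.py`; the non-default branch is deliberately left unimplemented.

```python
import shutil
from pathlib import Path


def generate_env(env_name: str, default_path: Path, out_path: Path) -> Path:
    """Copy the default environment, or fail loudly for any other name.

    Hypothetical helper: the real gen_env.py may be organised differently.
    """
    if env_name == "default":
        # Current behaviour: reuse the stored default environment unchanged.
        out_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(default_path, out_path)
        return out_path
    # Placeholder for the to-do: building a new environment is not implemented.
    raise NotImplementedError(f"no generator for env_name={env_name!r}")
```

Raising an error for unknown names, rather than silently copying the default, would make a mistyped parameter fail the stage instead of producing a misleading output file.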

In [None]:
!dvc run --force -n gen_env -w .. -d data/default_env.pickle -o data/env.pickle python scripts/gen_env.py

### Generate observations
Currently this checks the values of the parameters `target_area`, `true_norm`, `num_observations` and `obs_data_set`. If they are equal to the settings of the observations in data/default_observations.pickle (and `obs_data_set` has value 1), then that file is copied into data/observations.pickle.

**To do**: Generate a new set of observations if the parameter settings don't match the default one, or `obs_data_set` is not 1.
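
The check described above can be sketched as follows. This is an illustration only: the function name `matches_default` and the dictionary layout are assumptions, and how the default settings are stored alongside data/default_observations.pickle is not stated here.

```python
def matches_default(params: dict, default_settings: dict) -> bool:
    """Return True if the stored default observations can be reused.

    Hypothetical helper: `params` holds the current DVC parameter values and
    `default_settings` the values used to create the default observations.
    """
    # The default file corresponds to data set 1 only.
    if params.get("obs_data_set") != 1:
        return False
    keys = ("target_area", "true_norm", "num_observations")
    # Every setting must be present on both sides and equal.
    return all(
        k in params and k in default_settings and params[k] == default_settings[k]
        for k in keys
    )
```

When this returns `False`, the to-do would take over and generate a fresh set of observations instead of copying the default file.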


In [None]:
!dvc run --force -n gen_observations -w .. -d data/env.pickle -p target_area,true_norm,num_observations,obs_data_set -o data/observations.pickle python scripts/gen_observations.py

### Generate MCMC chains
Creates data/chains_and_log_likelihoods.pickle. The script also writes metrics/starts_info.txt, but that output is not currently recorded in the "dvc run" command below.

In [None]:
!dvc run --force --always-changed -n gen_mcmc_chains -w .. -d data/env.pickle -d data/observations.pickle -p n,m,rf,colour_specific,shape_specific,target_area -o data/chains_and_log_likelihoods.pickle python scripts/gen_mcmc_chains.py

### Analyse the chains
Generates metrics/chain_likelihoods.csv and appends more information to metrics/chain_info.txt (note: DVC may not handle a stage modifying an existing file very well; perhaps a separate file should be created).

**Temporary hack**: chain_likelihoods contains posterior probabilities now, even though the field name has not changed.

In [None]:
!dvc run --force --always-changed -n analyse_chains -w .. -d data/chains_and_log_likelihoods.pickle -d data/env.pickle -d data/observations.pickle -p n,m,rf,colour_specific,shape_specific,target_area -o metrics/chain_info.txt --plots metrics/chain_likelihoods.csv python scripts/analyse_chains.py

### Generate a plot showing norms ordered by likelihood (or now posterior probability?)
The file plots.html is generated in the root folder, based on the data in metrics/chain_likelihoods.csv.

In [None]:
!pushd ..; dvc plots show -t plots/norm_exp_histogram.json metrics/chain_likelihoods.csv

### Perform a convergence test to compute $\hat{R}$
The result is written to metrics/conv_test.txt. The posterior sample (obtained by discarding the initial warm-up segment of each chain and combining the remainder) is also written to data/posterior.pickle.
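
For reference, a minimal sketch of the standard Gelman-Rubin computation of $\hat{R}$. It assumes each chain is summarised as a list of scalar values (for example one log posterior per sample) and that the first half of each chain is treated as warm-up. Both are assumptions, and the role of `rhat_step_size` is not modelled; `scripts/conv_test.py` may do this differently.

```python
from math import sqrt
from statistics import mean, variance


def discard_warmup(chains, warmup_fraction=0.5):
    """Drop the first `warmup_fraction` of every chain (assumed default: half)."""
    return [chain[int(len(chain) * warmup_fraction):] for chain in chains]


def r_hat(chains):
    """Gelman-Rubin statistic for equal-length chains of scalar draws."""
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two equal-length chains of length >= 2")
    within = mean(variance(c) for c in chains)          # W: mean within-chain variance
    between = n * variance([mean(c) for c in chains])   # B: n times variance of chain means
    if within == 0:
        raise ValueError("within-chain variance is zero; R-hat is undefined")
    var_plus = (n - 1) / n * within + between / n       # pooled variance estimate
    return sqrt(var_plus / within)


# Usage: discard warm-up, test convergence, then pool the remaining draws.
chains = [[0.0, 1.0, 0.5, 0.2, 0.9, 0.4], [0.3, 0.8, 0.1, 0.6, 0.7, 0.2]]
kept = discard_warmup(chains)
print(r_hat(kept))
posterior = [x for chain in kept for x in chain]
```

Values of $\hat{R}$ close to 1 suggest the chains have mixed; values well above 1 suggest they are still exploring different regions.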


In [None]:
!dvc run --force --always-changed -n conv_test -w .. -d data/chains_and_log_likelihoods.pickle -p rhat_step_size -o data/posterior.pickle -o metrics/conv_test.txt python scripts/conv_test.py

### Extract the top norms and compute precision and recall
The results are written to metrics/precision_recall.txt.
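
As an illustration of the two metrics, here is a minimal set-based sketch. It assumes the extracted norms and the reference norms can be represented as hashable items; how `scripts/extract_top_norms.py` actually compares norms against `true_norm` is not described here, and the function name is hypothetical.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) for two collections of hashable items."""
    predicted, relevant = set(predicted), set(relevant)
    hits = len(predicted & relevant)
    # Precision: fraction of extracted items that are correct.
    precision = hits / len(predicted) if predicted else 0.0
    # Recall: fraction of correct items that were extracted.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

Returning 0.0 for an empty input is a convention chosen here to avoid division by zero; other conventions are possible.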


In [None]:
!dvc run --force --always-changed -n extract_top_norms -w .. -d data/posterior.pickle -d data/env.pickle -p colour_specific,shape_specific,target_area,true_norm -o metrics/precision_recall.txt python scripts/extract_top_norms.py

### The cells below are for testing and are not part of the workflow

In [None]:
!cd ..; python scripts/test_overdispersed_starts.py

In [None]:
!dvc repro

In [None]:
!rm ../.dvc/tmp/rwlock

In [None]:
!cd ..; python scripts/conv_test_no_warmup.py

In [None]:
!cd ..; python scripts/test_obs_likelihood.py

In [None]:
import os
os.getcwd()