# Parameter recovery and computation performance with smaller state step size

In `grid_search_benchmarks_outputs.ipynb` we looked at the performance of different parallelization schemes (over trials, over parameter combinations, and using either Base or Transducers).

Regardless of how long they took, none of the parallelization combinations recovered the true parameters. Looking at the trial-wise posteriors, we saw that the posterior for the true model never gained traction.

To see if we might be more successful in recovering the true parameters, we reduced the state step size.

We also tested different parallelization schemes to see whether there would be performance differences in this scenario, where we knew 10x more computations were going to be necessary.

To download the data, run:

```
rsync -av zenkavi@login.hpc.caltech.edu:/central/groups/rnl/zenkavi/ADDM.jl/performance/outputs/ ./performance/outputs/
```

In [1]:
library(tidyverse)

-- Attaching core tidyverse packages ------------------------ tidyverse 2.0.0 --
v dplyr     1.1.2     v readr     2.1.4
v forcats   1.0.0     v stringr   1.5.0
v ggplot2   3.4.2     v tibble    3.2.1
v lubridate 1.9.2     v tidyr     1.3.0
v purrr     1.0.2     
-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()
i Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors


## Parameter recovery

The dataset consisted of 1500 trials generated with parameters d = 0.007, sigma = 0.03, theta = 0.6. The parameter space consisted of 8000 combinations (20 values per parameter): d was sampled between .001 and .020 with a step size of .001, sigma between .01 and .20 with a step size of .01, and theta between .27 and .85 with a step size of .03.
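As a quick sanity check on the grid description, the sketch below rebuilds the parameter space in base R from the bounds and step sizes stated above. The variable names (`d_vals`, `param_grid`, `on_grid`) are made up for illustration and are not part of the benchmark code. Note that with a step of .03 starting at .27, the 20th theta value is .84, so .85 acts as an upper bound, not a grid point.

```r
# Rebuild the grid from the bounds and step sizes stated in the text.
d_vals     <- seq(0.001, 0.020, by = 0.001)
sigma_vals <- seq(0.01, 0.20, by = 0.01)
theta_vals <- seq(0.27, 0.85, by = 0.03)  # stops at 0.84, the last value below 0.85

# All combinations of the three parameters.
param_grid <- expand.grid(d = d_vals, sigma = sigma_vals, theta = theta_vals)
nrow(param_grid)  # 8000

# Check that the generating parameters are grid points
# (with a small tolerance for floating point error).
on_grid <- function(x, vals) any(abs(vals - x) < 1e-8)
c(d     = on_grid(0.007, d_vals),
  sigma = on_grid(0.03, sigma_vals),
  theta = on_grid(0.6, theta_vals))
```

Because the true values sit exactly on grid points, a perfect recovery is possible in principle, and any miss can be expressed as a whole number of grid steps.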

In [2]:
files_path = "../outputs/"
files_path

In [3]:
param_files = list.files(files_path, pattern = "small_stepsize.*best_pars.csv")
param_files

In [4]:
fn = param_files[1]
strsplit(fn, "_")[[1]]

In [5]:
best_pars = tibble()
for (fn in param_files) {
  cur_pars = read.csv(paste0(files_path, fn))
  # Add the likelihood_fn column if the file does not include it
  if (!("likelihood_fn" %in% names(cur_pars))) {
    cur_pars$likelihood_fn = "ADDM.aDDM_get_trial_likelihood"
  }
  # Keep the same columns, in the same order, for every file
  cur_pars = cur_pars %>% select(barrier, bias, d, decay, likelihood_fn, nonDecisionTime, sigma, theta)
  # The third underscore-separated field of the file name is the grid executor
  fn_info = strsplit(fn, "_")[[1]]
  cur_pars$grid_fn = "floop"
  cur_pars$grid_exec = fn_info[3]
  cur_pars$trials_exec = "thread"
  best_pars = bind_rows(best_pars, cur_pars)
}

Both versions recover the correct d and sigma and are one step off on theta (0.63 instead of 0.6).

In [6]:
best_pars

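| barrier | bias | d | decay | likelihood_fn | nonDecisionTime | sigma | theta | grid_fn | grid_exec | trials_exec |
|---|---|---|---|---|---|---|---|---|---|---|
| `<int>` | `<dbl>` | `<dbl>` | `<int>` | `<chr>` | `<int>` | `<dbl>` | `<dbl>` | `<chr>` | `<chr>` | `<chr>` |
| 1 | 0 | 0.007 | 0 | ADDM.aDDM_get_trial_likelihood | 100 | 0.03 | 0.63 | floop | seq | thread |
| 1 | 0 | 0.007 | 0 | ADDM.aDDM_get_trial_likelihood | 100 | 0.03 | 0.63 | floop | thread | thread |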
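One way to make the recovery check explicit is to express each estimate's distance from the generating value in grid steps. The sketch below does this in base R with the values copied by hand from the table above and the step sizes from the grid description. The object names (`true_pars`, `step_sizes`, `recovered`, `steps_off`) are illustrative only and are not part of the notebook.

```r
# True (generating) parameters and grid step sizes, as stated in the text.
true_pars  <- c(d = 0.007, sigma = 0.03, theta = 0.6)
step_sizes <- c(d = 0.001, sigma = 0.01, theta = 0.03)

# Best-fitting parameters copied from the table above (identical for both runs).
recovered <- data.frame(
  grid_exec = c("seq", "thread"),
  d         = c(0.007, 0.007),
  sigma     = c(0.03, 0.03),
  theta     = c(0.63, 0.63)
)

# Signed distance from the true value, in grid steps, for each parameter.
steps_off <- sweep(as.matrix(recovered[names(true_pars)]), 2, true_pars, "-")
steps_off <- round(sweep(steps_off, 2, step_sizes, "/"))

cbind(recovered["grid_exec"], steps_off)
```

A value of 0 means the estimate landed on the true grid point, and a value of 1 means it landed one step above it.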


## Computation time

We tried two parallelization schemes:

1. Parallelize over both trials and parameter combinations. Parallelization over parameter combinations was done using `ThreadedEx()` in FLoops.jl, part of the Transducers.jl ecosystem. Parallelization over trials used `Base.Threads`.
2. Parallelize over trials only. This used the same functions as the first scheme but with `SequentialEx()` in the `grid_search` function.

Previous results showed no gains in computation time when parallelizing over both trials and parameter combinations, which suggested that using Transducers was not setting up a hierarchical structure across threads. It also implied that giving the same resources to either setup should yield comparable computation times, provided that all threads are utilized efficiently.

Hypothetically, two aspects of the data would affect computation times:

- The number of likelihood computations: (RT / time step) for each trial, times 8000 parameter combinations. This should affect computation time.
- The number of states to compute at each time step: (barrier * 2) / state step. This should affect memory (more so than computation time).
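The two quantities above can be turned into a back-of-the-envelope calculation. In the sketch below, the number of trials, the number of parameter combinations, and the barrier come from this document. The mean RT, the time step, and both state step sizes are assumptions chosen only to illustrate the scaling. They are not the values used in the benchmarks.

```r
n_trials <- 1500  # from the dataset description
n_combos <- 8000  # from the parameter grid
barrier  <- 1     # from the best_pars table

# ASSUMPTIONS (illustrative, not taken from the benchmarks):
mean_rt_ms   <- 2000                          # hypothetical mean response time
time_step_ms <- 10                            # hypothetical time step
state_steps  <- c(coarse = 0.1, fine = 0.01)  # hypothetical state step sizes

# Time steps per trial, and trial likelihoods evaluated in one full grid search.
time_steps_per_trial <- mean_rt_ms / time_step_ms
n_likelihoods        <- n_trials * n_combos

# States tracked at each time step: barrier * 2 / state step.
n_states <- barrier * 2 / state_steps

data.frame(
  state_step           = state_steps,
  n_states             = n_states,
  time_steps_per_trial = time_steps_per_trial,
  n_likelihoods        = n_likelihoods
)

# Ratio of state counts between the fine and the coarse step.
n_states[["fine"]] / n_states[["coarse"]]
```

Under these assumed values, a tenfold smaller state step gives about ten times as many states per time step, while the number of likelihood evaluations stays the same. This is only a first-order picture, because the cost of each time step may grow faster than linearly in the number of states, depending on how the likelihood is implemented.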

Despite the previous results, we found that parallelizing over both trials and parameter combinations was about 5.5 hours faster than parallelizing over trials only.

## Memory

How does memory usage change with a decreased state step size?