# Adaptive Stress Testing: Walk1D Example
## Monte Carlo Tree Search (MCTS)
This is the main entry point for running AST with the MCTS algorithm.

In [None]:
include("../test/Walk1D.jl");

### Setup/create AST planner
* `planner` is used to play out the policy
* `mdp::ASTMDP` is the main MDP problem formulation object for AST (this holds reward metrics)
* `sim::Walk1DSim` is the main simulation object, holding all simulation information (e.g., the current x position, the simulation settings, etc.)

In [2]:
(planner, mdp, sim) = setup_ast();

### Run AST with MCTS

In [3]:
tree = playout(mdp, planner; return_tree=true);

Progress: 100%|█████████████████████████████████████████| Time: 0:00:02


State = 0xc586ef06d338b447	:	Q = 2.6037780639021992	:	Action = UInt32[0x96ee0e8e, 0xd13db9ef]
State = 0xb6f0ad4d4f9a2001	:	Q = 2.8611261857810737	:	Action = UInt32[0x9c630f51, 0x1818ba67]
State = 0xd0775b7d0dd9850a	:	Q = 3.839585264730356	:	Action = UInt32[0x88f9f496, 0x59c86b50]
State = 0xc7612f13a54c8ea3	:	Q = 3.3735641759786943	:	Action = UInt32[0x17fc3002, 0x42c20b36]
State = 0x651a1aa00f41d033	:	Q = 3.104968934031712	:	Action = UInt32[0x57a1dd61, 0xcb3ac5b0]
State = 0x7524203408f0b6d1	:	Q = 3.1642805530230294	:	Action = UInt32[0x5d4ea03b, 0xbb1d2866]
State = 0x6af952ea5b4c24e7	:	Q = 2.7173739635763248	:	Action = UInt32[0x14e3a101, 0x27b731ea]
State = 0xa93313af3cc5f982	:	Q = 2.5542173139167748	:	Action = UInt32[0x0914ec64, 0x46ee830c]
State = 0xa13a9172a7c06b10	:	Q = 2.2781302094893356	:	Action = UInt32[0x028e68a1, 0xefdb0b6a]
State = 0xa1b705a73163b908	:	Q = 2.4496423512924372	:	Action = UInt32[0x4897efd3, 0x1f0c3515]
State = 0x8028a13a8b20ac2	:	Q = 2.185288500924474	:	Action = U

### Visualize interactive MCTS tree (using D3.js)

In [4]:
d3tree = visualize(tree)

# Plots/Figures (MCTS)

### Episodic metric plots
##### (TODO: no KDE and more efficient running mean)
Plots the episodic metrics: the running mean of the miss distance, the minimum miss distance, and the cumulative number of failures, each as a function of the episode (i.e., iteration).
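As a rough illustration of the three quantities named above, the sketch below computes them in a single pass over per-episode data. It is not the implementation behind `episodic_figures`; the helper `episodic_summaries` and its two input vectors are hypothetical names introduced only for this example. The incremental mean update is one possible way to get the "more efficient running mean" mentioned in the TODO.

```julia
# Hypothetical helper (not part of the notebook's API): single-pass episodic
# summaries from per-episode miss distances and failure flags.
function episodic_summaries(miss_distances::Vector{Float64}, failures::Vector{Bool})
    n = length(miss_distances)
    n == length(failures) || throw(ArgumentError("inputs must have equal length"))

    running_mean = Vector{Float64}(undef, n)
    running_min = Vector{Float64}(undef, n)
    cumulative_failures = Vector{Int}(undef, n)

    mean_so_far = 0.0
    min_so_far = Inf
    failures_so_far = 0
    for i in 1:n
        # Incremental mean: O(1) work per episode, no re-summing of the history.
        mean_so_far += (miss_distances[i] - mean_so_far) / i
        min_so_far = min(min_so_far, miss_distances[i])
        failures_so_far += failures[i]

        running_mean[i] = mean_so_far
        running_min[i] = min_so_far
        cumulative_failures[i] = failures_so_far
    end
    return running_mean, running_min, cumulative_failures
end

# Made-up example data: four episodes, with a failure in the last one.
means, mins, fails = episodic_summaries([4.0, 2.0, 6.0, 0.0],
                                        [false, false, false, true])
```

For the made-up data above, `means` is `[4.0, 3.0, 4.0, 3.0]`, `mins` is `[4.0, 2.0, 2.0, 0.0]`, and `fails` is `[0, 0, 0, 1]`.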

In [None]:
episodic_figures(mdp.metrics)

### Distribution plots
Plots miss distance distribution and log-likelihood distribution.

In [None]:
distribution_figures(mdp.metrics)

# Playback

Functions to play back the policy and inspect failure trajectories.
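The actions printed in this notebook wrap `UInt32` seed vectors, which is presumably what makes playback possible: re-applying the same seeds reproduces the same trajectory. The sketch below illustrates that idea on a toy walk only. The function `toy_playback`, the unit-normal step, and the use of `MersenneTwister` are assumptions made for this example; the actual `Walk1DSim` dynamics are defined in `Walk1D.jl` and may differ.

```julia
using Random

# Hypothetical toy walk (not the notebook's Walk1DSim): each action is a seed,
# and the step taken is drawn from an RNG seeded with that action.
function toy_playback(x0::Float64, seeds::Vector{Vector{UInt32}})
    xs = [x0]
    for seed in seeds
        rng = MersenneTwister(seed)      # the seed fully determines the draw
        push!(xs, xs[end] + randn(rng))  # assumed unit-normal step
    end
    return xs
end

# Two seed vectors copied from the notebook output, used here only as inputs.
seeds = [UInt32[0x5b27463d, 0x1dc0edbc], UInt32[0xee889c22, 0x51eb69e6]]

first_run = toy_playback(1.0, seeds)
second_run = toy_playback(1.0, seeds)
```

Because every step is driven by its seed alone, `first_run` and `second_run` are identical. The numeric values will not match the notebook's simulator states, since the toy dynamics are an assumption.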

In [9]:
display(collect(values(mdp.top_paths)))

# Print each stored path's value and `length(A)+1`, then replay the path.
for A in keys(mdp.top_paths)
    println("———[", mdp.top_paths[A] ,"]|[", length(A)+1 ,"]———")
    AST.playback(mdp, A)
end

10-element Array{Float64,1}:
 6.756205719236479 
 6.780392755023052 
 6.7870874545683675
 6.867272424154319 
 6.867684600468296 
 6.954277237621274 
 7.224441637788982 
 7.2555630153005675
 7.457212501897868 
 8.0122983361149   

———[6.756205719236479]|[23]———
———[6.780392755023052]|[26]———
———[6.7870874545683675]|[25]———
———[6.867272424154319]|[26]———
———[6.867684600468296]|[26]———
———[6.954277237621274]|[26]———
———[7.224441637788982]|[26]———
———[7.2555630153005675]|[25]———
———[7.457212501897868]|[26]———
———[8.0122983361149]|[26]———


In [11]:
# Follow MCTS policy online.
online_actions = AST.online_path(mdp, planner, verbose=true);

Sim. state: 1.0 -> Action: UInt32[0x5b27463d, 0x1dc0edbc]
Sim. state: 2.364408607263219 -> Action: UInt32[0xee889c22, 0x51eb69e6]
Sim. state: 3.7119249392535227 -> Action: UInt32[0xad691a35, 0x8859b7f9]
Sim. state: 5.145283660136754 -> Action: UInt32[0xba950fb2, 0x38b6a8b3]
Sim. state: 5.838897488294437 -> Action: UInt32[0x0229001c, 0x5bf02620]
Sim. state: 7.543633479339865 -> Action: UInt32[0x913d37ff, 0xe57238b7]
Sim. state: 7.130242613901661 -> Action: UInt32[0xd45adaf5, 0x7e6adc8d]
Sim. state: 6.344868435275079 -> Action: UInt32[0xa5568360, 0x9ac2ad6f]
Sim. state: 6.640720867950724 -> Action: UInt32[0x6b82bc67, 0x42114d0d]
Sim. state: 5.663469807953609 -> Action: UInt32[0x5a937920, 0xe0ab64af]
Sim. state: 5.771283124657753 -> Action: UInt32[0xacc198e9, 0x5649ec6f]
Sim. state: 7.7959419199823525 -> Action: UInt32[0xbcea8aea, 0x7e14a8ac]
Sim. state: 7.96280943717775 -> Action: UInt32[0x8f89a88c, 0x0caf18de]
Sim. state: 8.009060794140947 -> Action: UInt32[0xd90c258c, 0xd2ec7b19]
Sim. 

In [12]:
final_state = AST.playback(mdp, online_actions);
# AST.playback(mdp, collect(keys(mdp.top_paths))[end]);
online_actions

25-element Array{ASTAction,1}:
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0x5b27463d, 0x1dc0edbc]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0xee889c22, 0x51eb69e6]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0xad691a35, 0x8859b7f9]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0xba950fb2, 0x38b6a8b3]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0x0229001c, 0x5bf02620]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0x913d37ff, 0xe57238b7]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0xd45adaf5, 0x7e6adc8d]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0xa5568360, 0x9ac2ad6f]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0x6b82bc67, 0x42114d0d]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(UInt32[0x5a937920, 0xe0ab64af]))
 ASTAction(POMDPStressTesting.AST.RandomSeedGenerator.RSG(U