# Filoc Examples

## Read (Analysis)

Imagine you ran a few simulations in order to fine-tune the hyperparameters `learning_rate` and `bellman_gamma`
of a reinforcement learning algorithm. You stored both the hyperparameters and the results in the following folder structure:

- `simulations/simid={simid}/hyperparameters.json`
- `simulations/simid={simid}/epid={epid}/result.json`

where `{simid}` and `{epid}` are integers identifying the simulation and the episode, respectively.
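
To make the naming scheme concrete, here is a minimal sketch that expands the two path templates with Python's built-in `str.format` and recovers the IDs from a concrete path with a regular expression. It is plain standard-library code for illustration, not filoc itself; the helper `parse_result_path` and the constant names are hypothetical.

```python
import re

HYP_TEMPLATE = "simulations/simid={simid}/hyperparameters.json"
RES_TEMPLATE = "simulations/simid={simid}/epid={epid}/result.json"

# Template -> concrete path
hyp_path = HYP_TEMPLATE.format(simid=3)
res_path = RES_TEMPLATE.format(simid=3, epid=1)

# Concrete path -> IDs (hypothetical helper, for illustration only)
RES_PATTERN = re.compile(
    r"simulations/simid=(?P<simid>\d+)/epid=(?P<epid>\d+)/result\.json$"
)


def parse_result_path(path):
    """Return the integer IDs encoded in a result path, or None if it does not match."""
    match = RES_PATTERN.search(path)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


ids = parse_result_path(res_path)
print(hyp_path)
print(res_path)
print(ids)
```

The same template therefore works in both directions: it tells you where a file for a given ID lives, and which IDs a given file belongs to.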

Here is how filoc can help you analyse the simulation data and make further decisions.

In [2]:
from IPython.display import display  # IPython.core.display.display is deprecated
from filoc import filoc

Let's see the hyperparameters:

In [3]:
loc_hyp = filoc('./examples/simulations/simid={simid:d}/hyperparameters.json')
df_hyp = loc_hyp.read_contents()
display(df_hyp)

|   | learning_rate | bellman_gamma | simid |
|---|---|---|---|
| 0 | 0.001 | 0.95 | 0 |
| 1 | 0.01 | 0.95 | 1 |
| 2 | 0.001 | 0.9 | 3 |


*filoc* scans the folder structure, reads every hyperparameters.json file and builds a dataframe with all values.
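
Conceptually, this scan can be pictured with the standard library alone. The sketch below is not how filoc is implemented; it only illustrates the idea under the assumption of the folder layout described above. It builds a throwaway folder tree in a temporary directory using the hyperparameter values shown in the table, and the function name `scan_hyperparameters` is hypothetical.

```python
import json
import re
import tempfile
from pathlib import Path

# Build a tiny throwaway folder tree with the hyperparameters shown above.
root = Path(tempfile.mkdtemp()) / "simulations"
samples = {
    0: {"learning_rate": 0.001, "bellman_gamma": 0.95},
    1: {"learning_rate": 0.01, "bellman_gamma": 0.95},
    3: {"learning_rate": 0.001, "bellman_gamma": 0.9},
}
for simid, params in samples.items():
    folder = root / f"simid={simid}"
    folder.mkdir(parents=True)
    (folder / "hyperparameters.json").write_text(json.dumps(params))


def scan_hyperparameters(base):
    """Collect one row per hyperparameters.json, adding the simid parsed from the folder name."""
    rows = []
    for path in base.glob("simid=*/hyperparameters.json"):
        match = re.fullmatch(r"simid=(\d+)", path.parent.name)
        if match is None:
            continue  # skip folders whose ID is not an integer
        row = json.loads(path.read_text())
        row["simid"] = int(match.group(1))
        rows.append(row)
    return sorted(rows, key=lambda row: row["simid"])


rows = scan_hyperparameters(root)
for row in rows:
    print(row)
```

Each row combines the file content with the keys extracted from its path, which is the shape of the table displayed above.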

Now let's see the result.json files:

In [4]:
loc_res = filoc('./examples/simulations/simid={simid:d}/epid={epid:d}/result.json')
df_res = loc_res.read_contents()
display(df_res)

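|   | score | duration | simid | epid |
|---|---|---|---|---|
| 0 | -10.45 | 0.7 | 0 | 0 |
| 1 | -4.56 | 30.0 | 0 | 1 |
| 2 | 1.15 | 1.5 | 0 | 2 |
| 3 | 30.7 | 12.45 | 0 | 3 |
| 4 | -0.1 | 12.45 | 1 | 0 |
| 5 | 100.45 | 12.45 | 1 | 1 |
| 6 | 100.45 | 12.45 | 1 | 2 |
| 7 | 100.45 | 12.45 | 1 | 3 |
| 8 | 100.45 | 12.45 | 3 | 0 |
| 9 | 100.45 | 12.45 | 3 | 1 |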


Now let's combine both sets of files:

In [5]:
loc_all = filoc(hyp=loc_hyp, res=loc_res)
df_all  = loc_all.read_contents()
display(df_all)

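|   | index.simid | index.epid | hyp.learning_rate | hyp.bellman_gamma | res.score | res.duration |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0.001 | 0.95 | 30.7 | 12.45 |
| 1 | 0 | 1 | 0.001 | 0.95 | 30.7 | 12.45 |
| 2 | 0 | 2 | 0.001 | 0.95 | 30.7 | 12.45 |
| 3 | 0 | 3 | 0.001 | 0.95 | 30.7 | 12.45 |
| 4 | 1 | 0 | 0.01 | 0.95 | 100.45 | 12.45 |
| 5 | 1 | 1 | 0.01 | 0.95 | 100.45 | 12.45 |
| 6 | 1 | 2 | 0.01 | 0.95 | 100.45 | 12.45 |
| 7 | 1 | 3 | 0.01 | 0.95 | 100.45 | 12.45 |
| 8 | 3 | 0 | 0.001 | 0.9 | 100.45 | 12.45 |
| 9 | 3 | 1 | 0.001 | 0.9 | 100.45 | 12.45 |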


This time, the column names are prefixed with the provided names `hyp` and `res`, and with `index` for
the keys that were used to join the two tables. By default, filoc uses the keys defined in the paths as
join keys. If a sub-filoc has fewer keys in its path, its table is joined on only the keys it provides.
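
The join rule can be illustrated with a small pure-Python sketch. This is not filoc's code, only an assumption-laden illustration of joining on the keys two tables share and prefixing the remaining columns. The function `join_on_shared_keys` and its parameters are hypothetical; the sample rows are taken from the tables above.

```python
def join_on_shared_keys(hyp_rows, res_rows, hyp_keys=("simid",), res_keys=("simid", "epid")):
    """Join two lists of dicts on the keys they have in common, prefixing column names."""
    shared = [key for key in res_keys if key in hyp_keys]
    # Index the table with fewer keys by the values of the shared keys.
    hyp_by_key = {tuple(row[k] for k in shared): row for row in hyp_rows}
    joined = []
    for res in res_rows:
        hyp = hyp_by_key.get(tuple(res[k] for k in shared), {})
        out = {f"index.{k}": res[k] for k in res_keys}
        out.update({f"hyp.{k}": v for k, v in hyp.items() if k not in hyp_keys})
        out.update({f"res.{k}": v for k, v in res.items() if k not in res_keys})
        joined.append(out)
    return joined


hyp_rows = [
    {"learning_rate": 0.001, "bellman_gamma": 0.95, "simid": 0},
    {"learning_rate": 0.01, "bellman_gamma": 0.95, "simid": 1},
    {"learning_rate": 0.001, "bellman_gamma": 0.9, "simid": 3},
]
res_rows = [
    {"score": 30.7, "duration": 12.45, "simid": 0, "epid": 3},
    {"score": 100.45, "duration": 12.45, "simid": 1, "epid": 1},
    {"score": 100.45, "duration": 12.45, "simid": 3, "epid": 0},
]

joined = join_on_shared_keys(hyp_rows, res_rows)
for row in joined:
    print(row)
```

Because the hyperparameter table only knows `simid`, each of its rows is repeated for every episode of the matching simulation.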

Alternatively, you could directly pass the path instead of a sub-filoc:

In [7]:
loc_all = filoc(
    hyp='./examples/simulations/simid={simid:d}/hyperparameters.json',
    res='./examples/simulations/simid={simid:d}/epid={epid:d}/result.json'
)
df_all  = loc_all.read_contents()
display(df_all)

### Front-end alternative to pandas

In some cases, the *pandas DataFrame* has issues that prevent you from properly visualizing the result.
For example, it converts an integer column to float as soon as a value is missing in a row. Therefore, filoc
is not built on pandas, but only uses it for visualization, as a kind of front-end.
There is currently another front-end based entirely on plain Python dicts and lists, called the "json" front-end:

In [9]:
loc_hyp = filoc('./examples/simulations/simid={simid:d}/hyperparameters.json', frontend='json')
json_hyp  = loc_hyp.read_contents()
display(json_hyp)

[{'learning_rate': 0.001, 'bellman_gamma': 0.95, 'simid': 0},
 {'learning_rate': 0.01, 'bellman_gamma': 0.95, 'simid': 1},
 {'learning_rate': 0.001, 'bellman_gamma': 0.9, 'simid': 3}]

You can then implement your own representation or pass this content to another framework for further analysis
or visualization. We nevertheless chose `'pandas'` as the default front-end, because it lets you explore and
visualize your data in only a few steps.
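
As a sketch of what "your own representation" might look like, the snippet below takes the list of dicts shown above and processes it with the standard library only. The helpers `group_simids_by` and `to_text_table` are hypothetical names introduced for this illustration and are not part of filoc.

```python
from collections import defaultdict

# Output of the "json" front-end shown above: a plain list of dicts.
json_hyp = [
    {"learning_rate": 0.001, "bellman_gamma": 0.95, "simid": 0},
    {"learning_rate": 0.01, "bellman_gamma": 0.95, "simid": 1},
    {"learning_rate": 0.001, "bellman_gamma": 0.9, "simid": 3},
]


def group_simids_by(rows, key):
    """Map each distinct value of `key` to the list of simulation IDs that use it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row["simid"])
    return dict(groups)


def to_text_table(rows):
    """Render the rows as a right-aligned plain-text table."""
    columns = list(rows[0])
    lines = ["  ".join(f"{column:>13}" for column in columns)]
    for row in rows:
        lines.append("  ".join(f"{row[column]:>13}" for column in columns))
    return "\n".join(lines)


by_gamma = group_simids_by(json_hyp, "bellman_gamma")
print(by_gamma)
print(to_text_table(json_hyp))
```

Since the rows are ordinary dicts, integer keys such as `simid` stay integers even when other rows lack a value.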

### Read binary files
Simulation results are not necessarily stored in a convenient JSON format. They often come in an optimized binary
format instead, such as TensorFlow summary files or NumPy arrays.

TODO (introduce caching here too)

## Writing
You can use filoc from the very beginning of your simulation campaign to
prepare the simulations and manage their execution state. Let's
see an example:

TODO


