# datreant

### persistent, Pythonic trees for heterogeneous data

**David L. Dotson**, Sean L. Seyler, Max Linke,  
Richard Gowers, Oliver Beckstein

## the problem

Scientific research often proceeds organically.

[Need an image of a directory tree, perhaps randomly generated]

Though portions are planned, the process is largely messy; this is especially true for simulation work.

## possible solutions?

* RDBMS?
* document databases?
* HDFS?

These are rarely a good fit for the data one needs to store, such as simulation parameters and system descriptions. Existing tools also tend to expect their own customary file formats.

## why not use the filesystem itself?

Cons:
* littered with irrelevant files
* hierarchical, but perhaps inconsistently structured

Pros:
* already stores anything we need (by definition)
* existing tools work with existing formats

**`datreant`** is an attempt to take advantage of the universality of the filesystem while minimizing its inconveniences.

## Treants: discoverable directories with metadata

A ``Treant`` is a directory with a special **state file**:

In [12]:
import datreant.core as dtr

t = dtr.Treant('maple')
t.draw()

maple/
 +-- Treant.7da6e0a9-64db-431d-9141-e997945e05a6.json


The state file:
1. serves as a bookmark marking the directory as a ``Treant``.
2. stores metadata elements, such as *tags* and *categories*.
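
To make the bookmark idea concrete, here is a minimal standard-library sketch of a directory marked by a JSON state file. It does not use `datreant` and should not be read as its implementation: the helper names `mark_directory` and `find_marked`, and the `tags`/`categories` JSON layout, are assumptions made for illustration.

```python
import json
import tempfile
import uuid
from pathlib import Path


def mark_directory(path, tags=(), categories=None):
    """Create `path` and drop a uniquely named JSON state file into it."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    state_file = directory / f"Treant.{uuid.uuid4()}.json"
    # Hypothetical layout: a sorted list of tags and a dict of categories.
    state = {"tags": sorted(tags), "categories": dict(categories or {})}
    state_file.write_text(json.dumps(state))
    return state_file


def find_marked(root):
    """Return every directory under `root` that contains a state file."""
    return sorted({p.parent for p in Path(root).rglob("Treant.*.json")})


root = Path(tempfile.mkdtemp())
state_file = mark_directory(root / "maple", tags={"deciduous"},
                            categories={"type": "tree"})
(root / "scratch").mkdir()  # an ordinary, unmarked directory

marked = find_marked(root)
state = json.loads(state_file.read_text())
print([d.name for d in marked])
print(state)
```

Only `maple` is reported, because discovery relies on nothing more than the presence of the state file; the metadata travels with the directory if it is moved or copied.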

## introspecting and manipulating a Treant's tree

We can use a `Treant` to create directory structures:

In [15]:
t['a/place/for/data/'].makedirs()
t['a/place/for/text/'].makedirs()

t.draw()

maple/
 +-- Treant.7da6e0a9-64db-431d-9141-e997945e05a6.json
 +-- a/
     +-- place/
         +-- for/
             +-- data/
             +-- text/
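
For readers curious what the calls above amount to on disk, the following standard-library sketch builds the same nested directories and prints a similar outline. The `draw` function here is a hypothetical stand-in written for this example, not `datreant`'s own, and its output format only approximates the one shown above.

```python
import tempfile
from pathlib import Path


def draw(directory, indent=""):
    """Return an indented outline of `directory`, one entry per line."""
    lines = []
    for entry in sorted(Path(directory).iterdir()):
        suffix = "/" if entry.is_dir() else ""
        lines.append(f"{indent} +-- {entry.name}{suffix}")
        if entry.is_dir():
            # Recurse with deeper indentation for nested directories.
            lines.extend(draw(entry, indent + "    "))
    return lines


root = Path(tempfile.mkdtemp()) / "maple"
for sub in ("a/place/for/data", "a/place/for/text"):
    # parents=True creates every missing intermediate directory.
    (root / sub).mkdir(parents=True, exist_ok=True)

outline = [root.name + "/"] + draw(root)
print("\n".join(outline))
```

There is no state file in this outline because the sketch creates plain directories only; nothing marks `maple` as a `Treant`.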


And we can manipulate directories and files with `Tree` and `Leaf` objects, respectively.

For example, we could store a `pandas` DataFrame somewhere in the tree for reference later:

In [22]:
import numpy as np
import pandas as pd
# pd.np was deprecated and later removed from pandas; use numpy directly
df = pd.DataFrame(np.random.randn(3, 2),
                  columns=['A', 'B'])

In [23]:
data = t['a/place/for/data/']
data

<Tree: 'maple/a/place/for/data/'>

In [24]:
df.to_csv(data['random_dataframe.csv'].abspath)
data.draw()

data/
 +-- random_dataframe.csv


And we can introspect the file directly:

In [25]:
csv = data['random_dataframe.csv']
csv

<Leaf: 'maple/a/place/for/data/random_dataframe.csv'>

In [26]:
print(csv.read())

,A,B
0,-0.574553718574,-0.516982117727
1,-2.26093891758,0.58054828901
2,-0.0669276516294,-0.956296412749
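
Because a `Leaf` is an ordinary file, any CSV reader can parse its contents. As a sketch, the text printed above (copied here as a string literal, so the example runs on its own) parses with the standard library's `csv` module. In the notebook itself one would more likely hand the leaf's `abspath` to a pandas reader.

```python
import csv
import io

# The text printed above, copied verbatim as a string literal.
text = """,A,B
0,-0.574553718574,-0.516982117727
1,-2.26093891758,0.58054828901
2,-0.0669276516294,-0.956296412749
"""

reader = csv.reader(io.StringIO(text))
header = next(reader)  # ['', 'A', 'B']; the unnamed first column is the index
# Map each index value to a {column name: float} dict.
rows = {int(r[0]): {name: float(v) for name, v in zip(header[1:], r[1:])}
        for r in reader}

print(rows[1]["B"])
```

Note that the notebook binds the name `csv` to a `Leaf`, which would shadow the `csv` module if both were used in the same session.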



## Aggregating and splitting on Treant metadata