Hammering on docs. Lots of little updates.
Finally have basic Tree and Leaf docs. Not much on Leaves yet. Also
starting work on View docs.
dotsdl committed Mar 16, 2016
1 parent 913c873 commit a9d1ea2
Showing 7 changed files with 110 additions and 81 deletions.
14 changes: 7 additions & 7 deletions docs/Tags-Categories.rst
@@ -13,14 +13,14 @@ Tags are individual strings that describe a Treant. Using our Treant
``sprout`` as an example, we can add many tags at once ::

>>> from datreant import Treant
->>> t = Treant('sprout')
->>> t.tags.add('elm', 'mirky', 'misty')
->>> t.tags
+>>> s = Treant('sprout')
+>>> s.tags.add('elm', 'mirky', 'misty')
+>>> s.tags
<Tags(['elm', 'mirky', 'misty'])>

They can be iterated through as well ::

->>> for tag in t.tags:
+>>> for tag in s.tags:
>>> print tag
elm
mirky
@@ -31,15 +31,15 @@ analysis code. For example, if we have Treants with different shades of bark
(say, "dark" and "light"), we can make a category that reflects this. In this
case, we categorize ``sprout`` as "dark" ::
->>> t.categories['roots'] = 'dark'
->>> t.categories
+>>> s.categories['roots'] = 'dark'
+>>> s.categories
<Categories({'roots': 'shallow'})>

Perhaps we've written some analysis code that will take both "dark" and "light"
Treants as input but needs to handle them differently. It can see what variety
of **Treant** it is working with using ::

->>> t.categories['roots']
+>>> s.categories['roots']
'shallow'

The keys for categories must be strings, but the values may be strings, numbers
6 changes: 3 additions & 3 deletions docs/Treants.rst
@@ -45,16 +45,16 @@ session (go ahead!) and regenerate this Treant immediately there ::

>>> # python session 2
>>> import datreant.core as dtr
->>> t = dtr.Treant('sprout')
+>>> s = dtr.Treant('sprout')

Making a modification to the Treant in one session, perhaps by adding a tag,
will be reflected in the Treant in the other session ::

>>> # python session 1
->>> t.tags.add('elm')
+>>> s.tags.add('elm')

>>> # python session 2
->>> t.tags
+>>> s.tags
<Tags(['elm'])>

This is because both objects pull their identifying information from the same
60 changes: 56 additions & 4 deletions docs/Trees.rst
@@ -2,8 +2,8 @@
Filesystem manipulation with Trees and Leaves
=============================================
A Treant functions as a specially marked directory, having a state file with
-identifying information that makes it a Treant at all. What's a Treant without
-a state file? It's just a **Tree**.
+identifying information. What's a Treant without a state file? It's just a
+**Tree**.

datreant gives pythonic access to the filesystem by way of **Trees** and
**Leaves** (directories and files, respectively). Say our current working
@@ -29,16 +29,68 @@ of a Tree or Leaf can point to the same place.

Working with paths
==================
+**Tree** objects can be used to introspect downward into their directory
+structure. Since a Tree is essentially a container for its own child Trees and
+Leaves, we can use getitem syntax to dig around ::
+
+>>> t = dtr.Tree('moe')
+>>> t['a/directory/']
+<Tree: 'moe/a/directory/'>
+
+>>> t['a/file']
+<Leaf: 'moe/a/file'>
+
+Paths that resolve as being inside a Tree give `True` for membership tests ::
+
+>>> t['a/file'] in t
+True
+
+Note that neither of these items need exist ::
+
-getitem, existence, making
+>>> t['a/file'].exists
+False
+
+in which case whether a Tree or Leaf is returned is dependent on an ending
+``/``. We can create directories and empty files easily enough, though ::
+
+>>> adir = t['a/directory/'].make()
+>>> adir.exists
+True
+
+>>> afile = t['a/file'].make()
+>>> afile.exists
+True
+
+.. note:: For accessing directories and files that exist, getitem syntax isn't
+sensitive to ending ``/`` separators to determine whether to give a
+Tree or a Leaf.
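
The trailing-separator rule added in this hunk can be sketched without datreant itself. The helper below is a hypothetical stand-in, not datreant's implementation: `classify` is an invented name, and it assumes only what the new docs state, namely that paths which exist are classified by what is on disk, while paths that do not exist are classified by whether they end in `/`.

```python
import os
import tempfile


def classify(base, path):
    """Return 'Tree' or 'Leaf' for `path` taken relative to `base`.

    Hypothetical helper for illustration; not part of datreant.
    """
    full = os.path.join(base, path)
    if os.path.isdir(full):
        return 'Tree'
    if os.path.isfile(full):
        return 'Leaf'
    # Nothing on disk: fall back to the trailing-separator convention.
    return 'Tree' if path.endswith('/') else 'Leaf'


with tempfile.TemporaryDirectory() as base:
    # Nothing exists yet, so the trailing separator decides.
    before = (classify(base, 'a/directory/'), classify(base, 'a/file'))

    os.makedirs(os.path.join(base, 'a', 'directory'))
    # The directory now exists, so the separator no longer matters.
    after = classify(base, 'a/directory')

print(before, after)
```

The second call to `classify` omits the trailing `/` on purpose, mirroring the note above about paths that already exist.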

A Treant is a Tree
==================
+The **Treant** object is a subclass of a Tree, so the above all applies to
+Treant behavior. Some methods of Trees are especially useful when working with
+Treants. One of these is ``draw`` ::
+
+>>> s = dtr.Treant('sprout')
+>>> s['a/new/file'].make()
+>>> s['a/.hidden/directory/'].make()
+>>> s.draw()
+sprout/
+ +-- Treant.839c7265-5331-4224-a8b6-c365f18b9997.json
+ +-- a/
+     +-- new/
+     |   +-- file
+     +-- .hidden/
+         +-- directory/
+
+which gives a nice ASCII-fied visual of the Tree. We can also obtain a
+collection of Trees and/or Leaves in the Tree with globbing ::
+
-globbing, drawing,
+>>> s.glob('a/*')
+<View([<Tree: 'sprout/a/.hidden/'>, <Tree: 'sprout/a/new/'>])>
+
+See :ref:`Views` for more about the **View** object, and how it can be used to
+manipulate many Trees and Leaves as a single logical unit.
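
To make the `depth` and `hidden` options of `draw` concrete, here is a rough sketch of a drawing routine that follows the semantics given in the docstring added to `trees.py` in this commit. It is not datreant's implementation: `render_tree` and `_shown` are invented names, the exact spacing of the output is an assumption, and the function returns lines instead of printing them so the result is easy to inspect.

```python
import os
import tempfile


def _shown(path, hidden):
    """Decide whether an entry is drawn (hypothetical helper)."""
    name = os.path.basename(path)
    if hidden or not name.startswith('.'):
        return True
    # Hidden files are skipped; a hidden directory is kept only if it
    # holds at least one non-hidden entry.
    return os.path.isdir(path) and any(
        not child.startswith('.') for child in os.listdir(path))


def render_tree(root, depth=None, hidden=False):
    """Return the lines of an ASCII drawing of the directory `root`."""
    lines = [os.path.basename(os.path.normpath(root)) + '/']

    def walk(path, prefix, level):
        # Stop descending once the requested depth is reached.
        if depth is not None and level >= depth:
            return
        names = sorted(n for n in os.listdir(path)
                       if _shown(os.path.join(path, n), hidden))
        for i, name in enumerate(names):
            full = os.path.join(path, name)
            is_dir = os.path.isdir(full)
            lines.append(prefix + '+-- ' + name + ('/' if is_dir else ''))
            if is_dir:
                last = i == len(names) - 1
                walk(full, prefix + ('    ' if last else '|   '), level + 1)

    walk(root, '', 0)
    return lines


with tempfile.TemporaryDirectory() as tmp:
    sprout = os.path.join(tmp, 'sprout')
    os.makedirs(os.path.join(sprout, 'a', 'new'))
    os.makedirs(os.path.join(sprout, 'a', '.hidden', 'directory'))
    open(os.path.join(sprout, 'a', 'new', 'file'), 'w').close()
    open(os.path.join(sprout, '.secret'), 'w').close()

    full_drawing = render_tree(sprout)
    shallow_drawing = render_tree(sprout, depth=1)

print('\n'.join(full_drawing))
```

In this sketch the hidden file `.secret` is left out, while `.hidden/` is still drawn because it contains a non-hidden directory; passing `depth=1` keeps only the first level below the root.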

Reference: Tree
===============
11 changes: 11 additions & 0 deletions docs/Views.rst
@@ -0,0 +1,11 @@
+======================================================
+Using Views to work with Trees and Leaves collectively
+======================================================
+A **View** makes it possible to work with arbitrary Trees and Leaves as a
+single logical unit.
+
+Reference: View
+===============
+.. autoclass:: datreant.core.View
+:members:
+:inherited-members:
2 changes: 1 addition & 1 deletion docs/conf.py
@@ -285,7 +285,7 @@
napoleon_google_docstring = False
napoleon_numpy_docstring = True
napoleon_include_private_with_doc = False
-napoleon_include_special_with_doc = True
+napoleon_include_special_with_doc = False
napoleon_use_admonition_for_examples = False
napoleon_use_admonition_for_notes = False
napoleon_use_admonition_for_references = False
84 changes: 21 additions & 63 deletions docs/faq.rst
@@ -10,43 +10,37 @@ Frequently Asked Questions
remote resources, where it might not be possible to set up one's own daemon
to talk to.

-*treants are portable.*
-Because they store their state in HDF5 (which itself is portable across
-systems, handling endianess, etc.), and because treants store their
+*Treants are portable.*
+Because they store their state in JSON, and because treants store their
data in the filesystem, they are easy to move around piecemeal. If you
-want to use a treant on a remote system, but don't want to drag all
-its stored datasets with it, you can copy only what you need.
+want to use a treant on a remote system, but don't want to drag all its
+stored datasets with it, you can copy only what you need.

Contrast this with many database solutions, in which you either copy the
whole database somehow, or slurp the pieces of data out that you want.
-Most database solutions can be rather slow to do this, to my knowledge.
+Most database solutions can be rather slow to do this.

-*treants are independent.*
-Although Groups are aware of their members, treants work
-independently from one another. If you want to use only basic
-treants or Sims, that works just fine. If you want to use Groups,
-that works, too. If you want to use the Coordinator (not yet
-implemented; think a thin database that treants share their info
-with so they can be quickly queried from one place in the filesystem),
-then you can, but you don't have to. You don't have to buy the whole
-farm to ride the horse, in a sense.
+*Treants are independent.*
+Although Groups are aware of their members, Treants work independently
+from one another. If you want to use only basic Treants, that works
+just fine. If you want to use Groups, that works, too.

-*treants have a structure in the filesystem.*
+*Treants have a structure in the filesystem.*
This means that all the shell tools we know and love are available to
work with their contents, which might include plaintext files, figures,
-topology files, trajectories, random pickles, ipython notebooks, html
-files, etc. Basically, treants are as versatile as the filesystem
-is, at least when it comes to storage.
+topology files, simulation trajectories, random pickles, ipython
+notebooks, html files, etc. Basically, treants are as versatile as the
+filesystem is, at least when it comes to storage.

2. What are some disadvantages of datreant's design?

-*treants could be anywhere in the filesystem.*
+*Treants could be anywhere in the filesystem.*
This is mostly a problem for Groups, which allow aggregation of other
-treants. If a member is moved, the Group has no way of knowing where
-it went; we've built machinery to help it find its members, but these
-will always be limited to filesystem search methods (some quite good,
-but still). If these objects lived in a single database, this wouldn't
-be an issue.
+Treants. If a member is moved, the Group has no way of knowing where it
+went; we've built machinery to help it find its members, but these will
+always be limited to filesystem search methods (some quite good, but
+still). If these objects lived in a single database, this wouldn't be
+an issue.

*Queries on object metadata will be slower than a central database.*
We want Groups and Bundles (in-memory Groups, basically) to be able to
@@ -55,49 +49,13 @@ Frequently Asked Questions
these objects and not against a single table somewhere, it will be
relatively slow.

-The Coordinator is an answer to this problem, albeit an imperfect one.
-The idea is that you can make a Coordinator, which is a small
-daemonless database (perhaps SQLite, but could be HDF5), and you can
-add treants for it to be aware of with something like::
-
-import mdsynthesis as mds
-
-co = mds.Coordinator('camelot')
-co.add('moe', 'larry', 'curly')
-
-# could also let the coordinator do a downward search and add all
-# treants it finds
-co.discover()
-This awareness is bi-directional: a Coordinator is aware of its
-members, and its members are aware of the Coordinator, and where it
-lives. The Coordinator will store tables of member attributes for fast
-queries, and these tables will be updated by members as they themselves
-are updated. So whenever we have::
-
-python
-
-c = mds.treant('moe')
-c.categories['bowlcut'] = True
-the treant updates both its state file and the Coordinator(s) it is
-affiliated with. This is in contrast to Groups, of which members are
-unaware. This is by design: the idea is that treants are likely to be
-members of many different Groups all over the filesystem; there would be
-comparatively fewer Coordinators in use, which have a performance hit to a
-treant for each affiliation.
-
-Obvious problem: there are probably a lot of ways for Coordinators and their
-members to get out-of-sync. A single database with everything inside avoids
-this entirely.

*File locking is less efficient for multiple-read/write under load than a smart daemon process/scheduler.*
-The assumption we make is that treants are primarily read, and only
+The assumption we make is that Treants are primarily read, and only
occasionally written to. This is assumed for their data and their
metadata. They are not designed to scale well if the same parts are
being written to and read at the same time by many processes.

-Having treants exist as separate files (state files and data all
+Having Treants exist as separate files (state files and data all
separate) does mitigate this potential for gridlock, which is one
reason we favor many files over few. But it's still something to be
aware of.
14 changes: 11 additions & 3 deletions src/datreant/core/trees.py
@@ -389,7 +389,15 @@ def glob(self, pattern):
         return View(out)

     def draw(self, depth=None, hidden=False):
-        """Print an asciified visual of the tree.
+        """Print an ASCII-fied visual of the tree.
+
+        Parameters
+        ----------
+        depth : int
+            Maximum directory depth to display. ``None`` indicates no limit.
+        hidden : bool
+            If False, do not show hidden files; hidden directories are still
+            shown if they contain non-hidden files or directories.

         """
         if not self.exists:
@@ -447,7 +455,7 @@ def make(self):

     @property
     def limbs(self):
-        """A list of this Tree's attached limbs.
+        """A set of this Tree's attached limbs.

         """
-        return list(self._classlimbs | self._limbs)
+        return self._classlimbs | self._limbs
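
The last hunk changes the return type of `limbs` from a list to a set, which matters to any caller that indexed the result. A minimal sketch with a hypothetical stand-in class follows; `FakeTree` and the limb names are invented for illustration, and only the `_classlimbs | _limbs` expression is taken from the diff.

```python
class FakeTree:
    """Hypothetical stand-in mirroring only the ``limbs`` property."""

    _classlimbs = {'tags'}            # limbs attached at the class level

    def __init__(self):
        self._limbs = {'categories'}  # limbs attached to this instance

    @property
    def limbs(self):
        # The union of two sets is itself a set: unordered, no duplicates.
        return self._classlimbs | self._limbs


limbs = FakeTree().limbs
print(sorted(limbs))
```

Membership tests such as `'tags' in limbs` behave the same with either return type, but code that relied on `limbs[0]` would need `sorted(limbs)` or `list(limbs)` after this change.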
