Still hammering on Bundle category docs. Various other fixes.
Also, Categories can now return values for multiple keys in the same way as
AggCategories. It was important that we get this congruency in
behavior.
dotsdl committed Mar 21, 2016
1 parent db75d9b commit 27880c0
Showing 9 changed files with 181 additions and 70 deletions.
3 changes: 0 additions & 3 deletions docs/api_bundle.rst
@@ -23,7 +23,6 @@ They can also be created directly from any number of Treants:
.. autoclass:: datreant.core.Bundle
:members:
:inherited-members:
:noindex:

AggTags
```````
@@ -33,7 +32,6 @@ Bundles to access their members' tags.
.. autoclass:: datreant.core.agglimbs.AggTags
:members:
:inherited-members:
:noindex:

AggCategories
`````````````
@@ -43,4 +41,3 @@ by Bundles to access their members' categories.
.. autoclass:: datreant.core.agglimbs.AggCategories
:members:
:inherited-members:
:noindex:
3 changes: 0 additions & 3 deletions docs/api_filesystem.rst
@@ -13,7 +13,6 @@ filesystem.
.. autoclass:: datreant.core.Tree
:members:
:inherited-members:
:noindex:

.. _Leaf_api:

@@ -25,7 +24,6 @@ filesystem.
.. autoclass:: datreant.core.Leaf
:members:
:inherited-members:
:noindex:

.. _View_api:

@@ -38,4 +36,3 @@ as providing mechanisms for filtering and subselection.
.. autoclass:: datreant.core.View
:members:
:inherited-members:
:noindex:
4 changes: 0 additions & 4 deletions docs/api_treants.rst
@@ -15,7 +15,6 @@ The class :class:`datreant.core.Treant` is the central object of ``datreant.core
.. autoclass:: datreant.core.Treant
:members:
:inherited-members:
:noindex:

.. _Tags_api:

@@ -27,7 +26,6 @@ access their tags.
.. autoclass:: datreant.core.limbs.Tags
:members:
:inherited-members:
:noindex:

.. _Categories_api:

@@ -39,7 +37,6 @@ Treants to access their categories.
.. autoclass:: datreant.core.limbs.Categories
:members:
:inherited-members:
:noindex:

.. _Group_api:

@@ -51,7 +48,6 @@ member locations as a persistent Bundle within its state file.
.. autoclass:: datreant.core.Group
:members:
:inherited-members:
:noindex:

Members
```````
99 changes: 99 additions & 0 deletions docs/bundles.rst
@@ -85,6 +85,10 @@ characteristics beyond just their path in the filesystem. Tags are one of these
distinguishing features, and Bundles can use them directly to filter their
members.

.. note:: For a refresher on using tags with individual Treants, see
:ref:`Tags_guide`. Everything that applies to using tags with
individual Treants applies to using them in aggregate with Bundles.

The aggregated tags for all members in a Bundle are accessible via
:attr:`datreant.core.Bundle.tags`. Just calling this property gives a view of
the tags present in every member Treant::
@@ -169,11 +173,106 @@ these tags earlier.

Fuzzy matching for tags
-----------------------
Over the course of a project spanning years, you might add several variations
of essentially the same tag to different Treants. For example, it looks like we
might have two different tags that mean the same thing among the Treants in our
Bundle::

>>> b.tags
{'building',
'firewood',
'for building',
'furniture',
'huge',
'paper',
'plant',
'shady',
'syrup'}

Chances are good we meant the same thing when we added 'building' and
'for building' to these Treants. How can we filter on these without explicitly
including each one in a tag expression?

We can use fuzzy matching::

>>> b.tags.fuzzy('building', scope='any')
('building', 'for building')

which we can use directly as an "or"-ing in a tag expression::

>>> b[b.tags[b.tags.fuzzy('building', scope='any')]]
<Bundle([<Treant: 'oak'>, <Treant: 'elm'>])>

The threshold for fuzzy matching can be set with the ``threshold`` parameter.
See the API reference for :meth:`datreant.core.agglimbs.AggTags.fuzzy` for more
details on how to use this method.
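
As a rough illustration of the idea only, the sketch below uses the standard library's ``difflib`` as a stand-in similarity measure; it is not the matcher datreant uses. The ``fuzzy_tags`` helper and its ``cutoff`` parameter are hypothetical names, and ``cutoff`` only loosely mirrors the ``threshold`` parameter mentioned above.

```python
from difflib import get_close_matches

# Tags copied from the ``b.tags`` listing above.
tags = ['building', 'firewood', 'for building', 'furniture', 'huge',
        'paper', 'plant', 'shady', 'syrup']


def fuzzy_tags(query, candidates, cutoff=0.7):
    """Return the candidates whose similarity to `query` is >= `cutoff`.

    Hypothetical helper: difflib's similarity ratio (between 0 and 1) is
    an assumption made for this sketch, not datreant's scoring.
    """
    # n=len(candidates) lifts difflib's default cap of three matches.
    return get_close_matches(query, candidates, n=len(candidates),
                             cutoff=cutoff)


matches = fuzzy_tags('building', tags)
print(matches)
```

With these tags, only the two "building" variants clear the cutoff, which is the pair we would then hand to a tag expression.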

Grouping with Treant categories
===============================
Besides tags, categories are another mechanism for distinguishing Treants from
each other. We can access these in aggregate with a Bundle, but we can also use
them to build groupings of members by category value.

.. note:: For a refresher on using categories with individual Treants, see
:ref:`Categories_guide`. Much of what applies to using categories
with individual Treants applies to using them in aggregate with
Bundles.

The aggregated categories for all members in a Bundle are accessible via
:attr:`datreant.core.Bundle.categories`. Just calling this property gives a
view of the categories with keys present in every member Treant::

>>> b.categories
<AggCategories({'age': ['adult', 'young', 'young', 'old'],
'type': ['evergreen', 'deciduous', 'deciduous', 'deciduous'],
'bark': ['fibrous', 'smooth', 'mossy', 'mossy']})>

We see that here, the values are lists, with each element of the list giving
the value for the corresponding member, in member order. This is how
categories behave when accessed from Bundles, since each member may have a
different value for a given key.

But just as with tags, our Treants probably have more than just the keys 'age',
'type', and 'bark' among their categories. We can get a dictionary of the
categories with each key present among at least one member with ::

>>> b.categories.any
{'age': ['adult', 'young', 'young', 'old'],
'bark': ['fibrous', 'smooth', 'mossy', 'mossy'],
'health': [None, None, 'good', 'poor'],
'nickname': ['redwood', None, None, None],
'type': ['evergreen', 'deciduous', 'deciduous', 'deciduous']}

Note that for members that lack a given key, the value returned in the
corresponding list is ``None``. Since ``None`` is not a valid value for a
category, this unambiguously marks the key as being absent for these members.

Likewise we have ::

>>> b.categories.all
{'age': ['adult', 'young', 'young', 'old'],
'bark': ['fibrous', 'smooth', 'mossy', 'mossy'],
'type': ['evergreen', 'deciduous', 'deciduous', 'deciduous']}

which we've already seen.
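
To make the difference between ``any`` and ``all`` concrete, here is a minimal sketch that uses plain dictionaries in place of member Treants. The helper names ``aggregate_any`` and ``aggregate_all`` are hypothetical, and which member carries which optional key is an assumption inferred from the output above.

```python
# Each dict stands in for one member's categories, in member order.
members = [
    {'age': 'adult', 'type': 'evergreen', 'bark': 'fibrous',
     'nickname': 'redwood'},
    {'age': 'young', 'type': 'deciduous', 'bark': 'smooth'},
    {'age': 'young', 'type': 'deciduous', 'bark': 'mossy',
     'health': 'good'},
    {'age': 'old', 'type': 'deciduous', 'bark': 'mossy',
     'health': 'poor'},
]


def aggregate_any(members):
    """Keys present in at least one member; missing values become None."""
    keys = set().union(*members)
    return {k: [m.get(k) for m in members] for k in sorted(keys)}


def aggregate_all(members):
    """Keys present in every member, so no value is ever None."""
    if not members:
        return {}
    keys = set(members[0]).intersection(*members[1:])
    return {k: [m[k] for m in members] for k in sorted(keys)}


print(aggregate_any(members)['health'])
print(sorted(aggregate_all(members)))
```

The only difference between the two helpers is whether the key set is a union or an intersection over the members.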

Accessing and setting values with keys
--------------------------------------
Consistent with the behavior shown above, when accessing category values in
aggregate with keys, what is returned is a list of values for each member, in
member order::

>>> b.categories['age']
['adult', 'young', 'young', 'old']

And if we access a category with a key that isn't present among all members,
``None`` is given for those members in which it's missing::

>>> b.categories['health']
[None, None, 'good', 'poor']
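
The ``AggCategories.__getitem__`` docstring changed in this commit also describes list and set keys. The sketch below mimics that dispatch with plain dictionaries standing in for members; ``agg_getitem`` is a hypothetical name, and the real method operates on Treants rather than dicts.

```python
members = [
    {'age': 'adult', 'bark': 'fibrous'},
    {'age': 'young', 'bark': 'smooth'},
    {'age': 'young', 'bark': 'mossy', 'health': 'good'},
    {'age': 'old', 'bark': 'mossy', 'health': 'poor'},
]


def agg_getitem(members, keys):
    """Return per-member values for a key, a list of keys, or a set of keys."""
    def values_for(key):
        # None marks members that lack the key.
        return [m.get(key) for m in members]

    if isinstance(keys, str):
        return values_for(keys)                   # list of values
    elif isinstance(keys, list):
        return [values_for(k) for k in keys]      # list of lists, in key order
    elif isinstance(keys, set):
        return {k: values_for(k) for k in keys}   # dict of lists
    raise TypeError("Key must be a string, list of strings, or set"
                    " of strings.")


print(agg_getitem(members, 'age'))
print(agg_getitem(members, ['age', 'health']))
print(agg_getitem(members, {'bark'}))
```

A list preserves the order of the requested keys, while a set gives a dictionary that can be looked up by key.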




API Reference: Bundle
=====================
1 change: 0 additions & 1 deletion docs/groups.rst
@@ -1,7 +1,6 @@
====================================
Leveraging Groups for aggregate data
====================================

A **Group** is a special type of Treant that can keep track of any number of
Treants it counts as members. Just as a normal Treant can be used to manage
data obtained from a single study, a Group is useful for managing data obtained
2 changes: 2 additions & 0 deletions docs/tags-categories.rst
@@ -10,6 +10,8 @@ that tailors its approach according to the Treant it encounters, we can use
tags and categories.


.. _Tags_guide:

Using tags
==========
Tags are individual strings that describe a Treant. Using our Treant
80 changes: 33 additions & 47 deletions src/datreant/core/agglimbs.py
@@ -297,40 +297,31 @@ def __str__(self):
def __getitem__(self, keys):
"""Get values for a given key, list of keys, or set of keys.
If *keys* is a string specifying one key (for a single category),
return a list of the values among all Treants (in this collection) that
have that category.
If *keys* is a list of keys, return a list of lists whose order
corresponds to the order of the elements in *keys*. Each element in
*keys* is a key specifying a category; each element in the output is
a list of the values among all Treants (in this collection) that have
the category specified by the respective key in *keys*.
If *keys* is a set of keys, return a dict of lists whose keys are the
same as those provided in *keys*; the value corresponding to each key
If `keys` is a string specifying one key (for a single category),
return a list of the values among all Treants (in this collection) for
that category.
If `keys` is a list of keys, return a list of lists whose order
corresponds to the order of the elements in `keys`. Each element in
`keys` is a key specifying a category; each element in the output is
a list of the values among all Treants (in this collection) for the
category specified by the respective key in `keys`.
If `keys` is a set of keys, return a dict of lists whose keys are the
same as those provided in `keys`; the value corresponding to each key
in the output is a list of values among all Treants (in this
collection) that have the category corresponding to that key.
collection) for the category corresponding to that key.
Parameters
----------
keys
keys : str, list, set
Valid key(s) of Categories in this collection.
Returns
-------
list, list of list, dict of list
Values for the (single) specified category when *keys* is str.
Groupings of values, each grouping a list, where the first grouping
contains the values for all members of the collection corresponding
to the first value in *keys*, the second grouping contains values
for all members of the collection corresponding the second value in
*keys*, etc. when *keys* is a list of str.
Values in the dict corresponding to each of the provided *keys*
is a grouping (list) of Treants that have the Category specified by
that key when *keys* is a set of str.
list, list of lists, dict of lists
Values for the (single) specified category when `keys` is str.
"""
if keys is None:
return None
@@ -349,47 +340,42 @@ def __getitem__(self, keys):
for m in members]
for k in keys}
else:
raise TypeError("Invalid argument; argument must be" +
" a string, list of strings, or set" +
raise TypeError("Key must be a string, list of strings, or set"
" of strings.")

def __setitem__(self, key, values):
"""Set the value of Categories for each Treant in the collection.
"""Set the value of categories for each Treant in the collection.
If *values* is not a sequence and is a valid category type (int,
If `values` is not a sequence and is a valid category type (int,
string_types, bool, float), then it is broadcasted over all members of
the collection for the category specified by *key*.
the collection for the category specified by `key`.
If *values* is a sequence, it must have the same length as the number
If `values` is a sequence, it must have the same length as the number
of members in the collection so that, for each member, the value
assigned to its category (specified by *key*) is the element in
*values* whose index matches the index of that member in the
assigned to its category (specified by `key`) is the element in
`values` whose index matches the index of that member in the
collection.
Parameters
----------
key
key : str
Valid key for the category whose value should be set to *values*.
values
Value(s) for the category specified by *key*.
values : str, int, float, bool, list, tuple
Value(s) for the category specified by `key`.
"""
if values is None:
return

members = self._collection
if isinstance(key, (int, float, string_types, bool)):
v = values
if isinstance(values, (int, float, string_types, bool)):
for m in members:
m.categories.add({key: v})
m.categories.add({key: values})
elif isinstance(values, (list, tuple)):
if len(values) != len(members):
raise ValueError("Invalid argument; values must be a list of" +
" the same length as the number of members" +
" in the collection.")
for m in members:
gen = (v for v in values if v is not None)
for v in gen:
m.categories.add({key: v})
raise ValueError("Values must be a list of the same length as"
" the number of members in the collection.")
for m, v in zip(members, values):
m.categories[key] = v

def __delitem__(self, category):
"""Remove *category* from each Treant in collection.
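
The revised ``__setitem__`` in the hunk above either broadcasts a scalar to every member or pairs a sequence with members by position. A minimal sketch of that logic follows, assuming plain dictionaries in place of each member's categories; ``agg_setitem`` is a hypothetical name, and ``str`` stands in for ``string_types``.

```python
def agg_setitem(members, key, values):
    """Set `key` on every member, broadcasting scalars and zipping sequences."""
    if values is None:
        return  # nothing to set, as in the hunk above

    if isinstance(values, (int, float, str, bool)):
        # Scalar: every member gets the same value.
        for m in members:
            m[key] = values
    elif isinstance(values, (list, tuple)):
        if len(values) != len(members):
            raise ValueError("Values must be a list of the same length as"
                             " the number of members in the collection.")
        # Sequence: the i-th value goes to the i-th member.
        for m, v in zip(members, values):
            m[key] = v


members = [{}, {}, {}]
agg_setitem(members, 'type', 'deciduous')
agg_setitem(members, 'bark', ['smooth', 'mossy', 'mossy'])
print(members)
```

Pairing with ``zip`` is what fixes the removed nested loop, which assigned every non-``None`` value to every member in turn, so each member ended up with the last one.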
48 changes: 36 additions & 12 deletions src/datreant/core/limbs.py
@@ -374,26 +374,50 @@ def __str__(self):
out = out + "'{}': '{}'\n".format(key, categories[key])
return out

def __getitem__(self, key):
"""Get value at given key.
def __getitem__(self, keys):
"""Get values for given `keys`.
:Arguments:
*key*
key of value to return
If `keys` is a string, the single value for that string is returned.
If `keys` is a list of keys, the values for each key are returned in a
list, in order by the given keys.
If `keys` is a set of keys, a dict with the keys as keys and values as
values is returned.
Parameters
----------
keys : str, list, set
Key(s) of value to return.
Returns
-------
values : str, int, float, bool, list, or dict
Value(s) corresponding to given key(s).
:Returns:
*value*
value corresponding to given key
"""
categories = self._dict()
return categories[key]

if isinstance(keys, (int, float, string_types, bool)):
return categories[keys]
elif isinstance(keys, list):
return [categories[key] for key in keys]
elif isinstance(keys, set):
return {key: categories[key] for key in keys}
else:
raise TypeError("Key must be a string, list of strings, or set"
" of strings.")

def __setitem__(self, key, value):
"""Set value at given key.
:Arguments:
*key*
key of value to set
Parameters
----------
key : str
Key of value to set.
value : str, int, float, bool
Value to set for given key.
"""
outdict = {key: value}
self.add(outdict)
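
For a single Treant, the new ``Categories.__getitem__`` applies the same str/list/set dispatch that ``AggCategories`` uses. A standalone sketch, assuming a plain dictionary in place of the Treant's stored categories; ``get_categories`` is a hypothetical name, and ``str`` stands in for ``string_types``.

```python
def get_categories(categories, keys):
    """Look up one key, a list of keys, or a set of keys in `categories`."""
    if isinstance(keys, (int, float, str, bool)):
        return categories[keys]                        # single value
    elif isinstance(keys, list):
        return [categories[key] for key in keys]       # values in key order
    elif isinstance(keys, set):
        return {key: categories[key] for key in keys}  # dict keyed by key
    else:
        raise TypeError("Key must be a string, list of strings, or set"
                        " of strings.")


categories = {'age': 'old', 'type': 'deciduous', 'bark': 'mossy'}
print(get_categories(categories, 'age'))
print(get_categories(categories, ['bark', 'age']))
print(get_categories(categories, {'type', 'age'}))
```

In this sketch a missing key raises ``KeyError``, because a plain dictionary lookup has no placeholder for absent values, unlike the aggregate case where ``None`` fills the gap.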