In [1]:
%load_ext autoreload
%autoreload 2

This notebook is meant to demo a few dev techniques when tackling a new problem. 

It involves: TDD, typing, scaffolding, protocols, facades, repository pattern,... 

## The problem

The original problem: **As a user, I can tag (recording) sessions, view my tags, and retrieve the sessions for a given tag.**

_(Note that tagging is equivalent to grouping, and is itself more general than hierarchical (i.e. tree-structured) grouping.)_

Questions:

- Where do I store the tag/sessions info?
- How (what data structure, what am I actually storing)?

Answers:

We shouldn't start with those questions, since they concern specifics. Of course, we'll need to make those choices eventually, but we'll produce a brittle codebase if we don't first think of abstractions.

Let's first think of a more general expression of what we're actually trying to do, and find a less specific vocabulary to describe it. We can (maybe even should) reintroduce the domain-specific vocabulary, as a layer on top of the more general mechanism, but we want to give ourselves a chance of knowing what the more abstract problem pattern is.

The transformed problem, more general and therefore more reusable: **Same as above but replace "sessions" with "objects".**

Then:

* First look at what makes sense in the domain/interface -- Express this with types and tests

* Then implement the interface, with a backend that corresponds to the current constraints

## A little digression on scaffolding

In [4]:
from typing import MutableMapping, Iterable, Any, NewType, Callable

Group = NewType('Group', str)
Item = NewType('Item', Any)


What functionality do we want around groups and their items?

Let's express this through type annotations of some functions (which we'll encapsulate in a `GroupsDacc` class).

In [5]:
class GroupsDacc:
    add_items_to_group: Callable[[Iterable[Item], Group], Any]
    list_groups: Callable[[], Iterable[Group]]
    items_for_group: Callable[[Group],Iterable[Item]]
    

Note that this should now be sufficient to generate a scaffold of what we need. 

We'll do it in two ways: by (dynamically) creating a `typing.Protocol` describing a concrete `Groups` object we could implement in the future, and by generating (the code string for) a concrete (but empty) such `Groups` class.

In [6]:
import i2

from meshed.scrap.annotations_to_meshes import (
    func_types_to_scaffold, 
    func_types_to_protocol
)

# GroupsDacc.__annotations__ is a {name: func_annotation, ...} dict
# We can make a protocol from that
GroupsProtocol = func_types_to_protocol(GroupsDacc.__annotations__)

# See that GroupsProtocol has methods for each item of the
# GroupsDacc.__annotations__ dict. Each method bears a signature compatible with
# the annotations.
i2.Sig(GroupsProtocol.add_items_to_group)
# <Sig (self, iterable: Iterable[__main__.Item], group: __main__.Group) -> Any>

<Sig (self, iterable: Iterable[__main__.Item], group: __main__.Group) -> Any>

In [7]:
# We can also generate the code for a concrete (but empty) class
print(func_types_to_scaffold(GroupsDacc.__annotations__))


class GeneratedClass:
    def add_items_to_group(self, iterable: Iterable, group: Group) -> Any:
    	pass

    def list_groups(self) -> Iterable:
    	pass

    def items_for_group(self, group: Group) -> Iterable:
    	pass
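
To give an idea of why the annotations alone are enough, here is a rough stdlib-only sketch of scaffold generation. It is not the `meshed` implementation: `signature_of_callable_type`, `scaffold_code`, `CounterSpec` and the `arg_0`, `arg_1` placeholder names are all made up for illustration.

```python
import inspect
from typing import Callable, get_args


def signature_of_callable_type(callable_type) -> inspect.Signature:
    """Build a method signature from a `Callable[[A, B, ...], R]` annotation."""
    arg_types, return_type = get_args(callable_type)
    kind = inspect.Parameter.POSITIONAL_OR_KEYWORD
    params = [inspect.Parameter('self', kind)]
    # Callable types carry no argument names, so placeholder names are invented here.
    params += [
        inspect.Parameter(f'arg_{i}', kind, annotation=arg_type)
        for i, arg_type in enumerate(arg_types)
    ]
    return inspect.Signature(params, return_annotation=return_type)


def scaffold_code(annotations: dict, class_name: str = 'Scaffold') -> str:
    """Return the source of an empty class with one method per annotation."""
    lines = [f'class {class_name}:']
    for name, callable_type in annotations.items():
        lines.append(f'    def {name}{signature_of_callable_type(callable_type)}:')
        lines.append('        pass')
        lines.append('')
    return '\n'.join(lines)


class CounterSpec:  # hypothetical spec, only builtin types to keep the output simple
    add: Callable[[int, int], int]
    reset: Callable[[], bool]


print(scaffold_code(CounterSpec.__annotations__))
```

The generated string is plain Python source, so it can be pasted into a module and filled in by hand.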



## TDD: Tests that describe the behavior we want

In [1]:
from typing import MutableMapping, Iterable, Any, NewType, Callable, Protocol, Optional

Tag = NewType('Tag', str)
Obj = NewType('Object', Any)  # or just object?


class TaggerProtocol(Protocol):
    def tag_objs(self, tag: Tag, *objs: Obj) -> Any:
        """Tag one or several objs"""

    def tags(self, obj: Optional[Obj] = None) -> Iterable[Tag]:
        """List tags of obj, or all tags if obj is None"""

    def objs(self, tag: Optional[Tag] = None) -> Iterable[Obj]:
        """List objs with tag, or all objs if tag is None"""
    
    
def test_tagger(tagger: TaggerProtocol):
    # The following assertion isn't part of the behavior we want. It's just a condition we
    # need in order to conduct our test: namely, that our collection of tags/objs is empty.

    assert list(tagger.tags()) == []  # make sure test is well setup (tagger is empty)

    tagger.tag_objs('tag_a', 'obj_1', 'obj_2')
    assert sorted(tagger.tags()) == ['tag_a']  # unfiltered tags() method

    tagger.tag_objs('tag_b', 'obj_3')
    assert sorted(tagger.tags()) == ['tag_a', 'tag_b']  
    assert sorted(tagger.objs('tag_a')) == ['obj_1', 'obj_2']  # filtered objs() method

    tagger.tag_objs('tag_c', 'obj_3')
    assert sorted(tagger.tags()) == ['tag_a', 'tag_b', 'tag_c']
    assert sorted(tagger.tags('obj_3')) == ['tag_b', 'tag_c']  # filtered tags() method

    # unfiltered objs() method
    assert sorted(tagger.objs()) == ['obj_1', 'obj_2', 'obj_3']


Now we'll implement two concrete `TagsDacc` classes, using a store (a `MutableMapping`) as a back-end so as to keep the persistence concern separate.

The idea is: As long as we provide our concrete persister with the right `MutableMapping` facade (with a minimum of specifics/semantics such as what the keys and values are meant to be), we should have a working object.

The two `TagsDacc` options will differ on the particulars of the store. 
- In the first, we'll assume the store has objs as keys and sets of tags as values. 
- In the second, we'll assume the tags are the keys, and the values are sets of objs carrying that tag.
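
To make the "facade" idea concrete, here is a minimal sketch, assuming a backend that can only hold lists (as a JSON file or a mongo doc would). `SetsAsListsStore` is a hypothetical name; the point is only that callers keep talking in sets while the backend sees lists.

```python
from collections.abc import MutableMapping


class SetsAsListsStore(MutableMapping):
    """Facade: callers read and write sets; the backend only ever sees lists."""

    def __init__(self, backend=None):
        self.backend = {} if backend is None else backend

    def __getitem__(self, k):
        return set(self.backend[k])  # list -> set on the way out

    def __setitem__(self, k, v):
        self.backend[k] = sorted(v)  # set -> (sorted) list on the way in

    def __delitem__(self, k):
        del self.backend[k]

    def __iter__(self):
        return iter(self.backend)

    def __len__(self):
        return len(self.backend)


backend = {}
facade = SetsAsListsStore(backend)
facade['tag_a'] = {'obj_2', 'obj_1'}
print(backend)          # what is actually stored
print(facade['tag_a'])  # what the caller sees
```

Anything written against `MutableMapping` can take `facade` in place of a `dict`, which is the property the taggers below rely on.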

## Concrete Tagger (option 1): ObjTagDacc

In [2]:
from typing import MutableMapping, Iterable, Any, NewType, Optional, Set
from dataclasses import dataclass

Tag = NewType('Tag', str)
Obj = NewType('Object', Any)  # or just object?
# Each obj maps to the set of tags it carries
ObjTagPairs = NewType('ObjTagPairs', MutableMapping[Obj, Set[Tag]])

def flatten_set(set_of_sets):
    return {obj for subset in set_of_sets for obj in subset}

@dataclass
class ObjTagDacc:
    store: ObjTagPairs
        
    def tag_objs(self, tag: Tag, *objs: Obj) -> Any:
        for obj in objs:
            if obj in self.store:
                # self.store[obj].add(tag)  # TODO: Make this work with value wrapper
                tags = self.store[obj]
                tags.add(tag)
                self.store[obj] = tags
            else:
                self.store[obj] = {tag}

    def tags(self, obj: Optional[Obj] = None) -> Iterable[Tag]:
        if obj is None:
            return flatten_set(self.store.values())
        else:
            return self.store[obj]

    def objs(self, tag: Optional[Tag] = None) -> Iterable[Obj]:
        # TODO: Express this filtering in such a way that will allow us to take advantage of DB specifics
        #  (e.g., passing on the filtering to the DB instead of filtering in python itself)
        if tag is None:
            return self.store.keys()
        else:
            return (obj for obj, tags in self.store.items() if tag in tags)


# Test it:
store = dict()  # make a store for ObjTagDacc to use
tagger = ObjTagDacc(store)  # make a tagger (that will use that store to "persist")
test_tagger(tagger)  # test the tagger

## Concrete Tagger (option 2): TagSetsDacc

In [3]:
from typing import MutableMapping, Iterable, Any, NewType, Optional, Set
from dataclasses import dataclass

Group = NewType('Group', str)
Item = NewType('Item', Any)
GroupSets = NewType('GroupSets', MutableMapping[Group, Set[Item]])

Tag = NewType('Tag', str)
Obj = NewType('Object', Any)  # or just object?
TagSets = NewType('TagSets', MutableMapping[Tag, Set[Obj]])

# TODO: Note that the tags and objs methods are essentially those of ObjTagDacc, swapped
#  Let's use that fact!
@dataclass
class TagSetsDacc:
    store: TagSets
        
    def tag_objs(self, tag: Tag, *objs: Obj) -> Any:
        if tag not in self.store:
            self.store[tag] = set()
        self.store[tag] |= set(objs)

    def tags(self, obj: Optional[Obj] = None) -> Iterable[Tag]:
        if obj is None:
            return set(self.store)
        else:
            return set(tag for tag, objs in self.store.items() if obj in objs)
            
    def objs(self, tag: Optional[Tag] = None) -> Iterable[Obj]:
        if tag is None:
            return flatten_set(self.store.values())
        else:
            return self.store[tag]
    
# Test it:
store = dict()  # make a store for TagSetsDacc to use
tagger = TagSetsDacc(store)  # make a tagger (that will use that store to "persist")
test_tagger(tagger)  # test the tagger
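
The TODO above notes that `tags` and `objs` are mirror images of each other across the two layouts. One way to see this is that each store layout is the inversion of the other. A small sketch, where `invert_set_mapping` is a hypothetical helper and not part of either class:

```python
from collections import defaultdict


def invert_set_mapping(mapping):
    """Turn {key: {values}} into {value: {keys}}."""
    inverted = defaultdict(set)
    for key, values in mapping.items():
        for value in values:
            inverted[value].add(key)
    return dict(inverted)


# The ObjTagDacc layout (obj -> tags) ...
obj_to_tags = {'obj_1': {'tag_a'}, 'obj_2': {'tag_a'}, 'obj_3': {'tag_b', 'tag_c'}}
# ... inverts to the TagSetsDacc layout (tag -> objs)
tag_to_objs = invert_set_mapping(obj_to_tags)
print(tag_to_objs)
```

Inverting twice gives back the original mapping (as long as no set is empty), so either layout can serve both queries; they differ only in which query is a direct lookup and which is a scan.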

## Actual persisting stores using mongo

In [4]:
# from mongodol.base import MongoClient

### ObjTagPairs (for ObjTagDacc)

Options for implementing:

```
s[obj] = tag
```

Option 1: But here we'd need to produce the ID on write

```
--> {'_id': ID, 'tags': tags, 'obj': obj}
```

Option 2: But we need to allow re-writes on `_id`

```
--> {'_id': obj, 'tags': tags}
```
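
As a sketch of the difference between the two options, here are the two doc shapes built by hand, with a plain dict keyed by `_id` standing in for the collection. The helper names are made up, and using `uuid` for the generated ID is just an assumption for the example.

```python
import uuid


def doc_with_generated_id(obj, tags):
    """Option 1: an `_id` is produced on write; `obj` is an ordinary field."""
    return {'_id': uuid.uuid4().hex, 'obj': obj, 'tags': sorted(tags)}


def doc_keyed_by_obj(obj, tags):
    """Option 2: `obj` is the `_id`, so re-tagging an obj must overwrite its doc."""
    return {'_id': obj, 'tags': sorted(tags)}


# A dict keyed by `_id` stands in for the collection.
collection = {}
for tags in ({'tag_b'}, {'tag_b', 'tag_c'}):
    doc = doc_keyed_by_obj('obj_3', tags)
    collection[doc['_id']] = doc  # second write replaces the first: a re-write on `_id`

print(collection)
```

With Option 1, the same two writes would produce two docs for `obj_3` unless the existing doc is looked up first, which is the price of generating IDs on write.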


In [5]:
# Option 2

from operator import itemgetter

from dol import wrap_kvs, Pipe
from mongodol.stores import MongoStore


# To be able to overwrite an existing (item, group) pair (by default MongoStore doesn't allow it)
def delete_if_exists(self, k, v):
    if k in self:
        del self[k]
    return v


trans = Pipe(
    wrap_kvs(
        key_of_id=itemgetter('_id'), 
        id_of_key=lambda x: {'_id': x}, 
        obj_of_data=lambda x: set(x['tags']),
        data_of_obj=lambda x: {'tags': list(x)}, 
        preset=delete_if_exists,
    )
)

@trans
class TagStore(MongoStore):
    """To group items"""
    def __init__(self,
        db_name='scrap',
        collection_name='tagged_objects',
        mongo_client_kwargs=None,
    ):
        super().__init__(
            db_name=db_name,
            collection_name=collection_name,
            key_fields=['_id'],
            data_fields=['tags'],
            mongo_client_kwargs=mongo_client_kwargs
        )
    
store = TagStore()

# empty the store
for k in store: 
    del store[k]

test_tagger(TagSetsDacc(store))


In [6]:
underlying_store = store.store
list(zip(underlying_store, underlying_store.values())) 

[({'_id': 'tag_a'}, {'tags': ['obj_2', 'obj_1']}),
 ({'_id': 'tag_b'}, {'tags': ['obj_3']}),
 ({'_id': 'tag_c'}, {'tags': ['obj_3']})]

In [7]:
base_store = store.store.store
list(zip(base_store, base_store.values())) 

[({'_id': 'tag_a'}, {'tags': ['obj_2', 'obj_1']}),
 ({'_id': 'tag_b'}, {'tags': ['obj_3']}),
 ({'_id': 'tag_c'}, {'tags': ['obj_3']})]

... to be continued

## Implementation that uses "metadata" collection

Say you already have a mongo collection that contains metadata on your items. 
That is, a collection of docs, one per item, each intended to record information about that item. 
The tags an item carries can then be just one more field of its doc. 

In [None]:
test_metadata_docs = [
    {
        "_id": "123",
        "ref": "absolute/reference/to/content",
        "some": "other metadata",  # just to show there can be other stuff
        "tags": {
            # instead of a list, we'll use an object (dict), whose fields are the group names
            # This is because mongoDB allows us to index fields, therefore automatically 
            # get the bidirectional mapping from groups to refs the group "contains"
            "tag1": True,
            "tag2": True,
        }
    },
    {
        "_id": "456",
        "ref": "absolute/reference/to/some/other/content",
        "tags": {
            "tag1": True,
            "tag2": True,
        }
    },
    {
        "_id": "789",
        "ref": "this/ref/is/necessary",
        "optional": "metadata",
        # and no tags here (but whenever someone/something adds a tag, it will be added here)
    }
]

# Make a collection with only those docs in it

from mongodol import MongoCollectionPersister

def prepare_test_collection():
    collection_uri = 'scrap/tagged_objects'  # hard-coded rather than taken as an argument ON PURPOSE!!
    s = MongoCollectionPersister(collection_uri, iter_projection=None)

    for doc in s:
        del s[doc]

    assert list(s) == []

    for doc in test_metadata_docs:
        s[doc] = doc

    assert list(s) == test_metadata_docs

    return s

s = prepare_test_collection()
list(s)
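
To illustrate why tags are stored as fields of a dict rather than as a list: finding the docs that carry a tag becomes an equality test on a (dotted) field path. Below is a sketch with the MongoDB-style filter next to an in-memory equivalent. Both helper names are hypothetical, and the docs are a trimmed variation of the ones above.

```python
def mongo_filter_for_tag(tag):
    """Filter doc selecting docs that carry `tag` (dot notation into `tags`)."""
    return {f'tags.{tag}': True}


def carries_tag(doc, tag):
    """In-memory equivalent of the filter above, for docs shaped like ours."""
    return doc.get('tags', {}).get(tag) is True


docs = [
    {'_id': '123', 'ref': 'a/ref', 'tags': {'tag1': True, 'tag2': True}},
    {'_id': '456', 'ref': 'another/ref', 'tags': {'tag1': True}},
    {'_id': '789', 'ref': 'untagged/ref'},  # no tags field at all
]

print(mongo_filter_for_tag('tag1'))
print([doc['ref'] for doc in docs if carries_tag(doc, 'tag2')])
```

The filter dict is what one would hand to a collection's `find`; the in-memory version is only there so the idea can be checked without a server.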


In [58]:
# Note: NOT WORKING YET!!

from operator import itemgetter

from dol import wrap_kvs, Pipe
# from mongodol.stores import MongoStore
from mongodol.base import MongoCollectionPersister


# # To be able to overwrite an existing (item, group) pair (by default MongoStore doesn't allow it)
# def delete_if_exists(self, k, v):
#     if k in self:
#         del self[k]        
#     return v

class UnicityError(ValueError):
    """When something should have been unique and wasn't"""

def there_is_more_in_the_cursor(cursor):
    return next(cursor, None) is not None

def first_item_of_cursor(cursor):
    v = next(cursor, None)
    if v is None:
        raise KeyError("No such key")  # TODO: Get the key from cursor
    elif there_is_more_in_the_cursor(cursor):
        raise UnicityError("There should be only one match, but there were several")
    else:
        return v
    
def obj_of_data(cursor):
    doc = first_item_of_cursor(cursor)
    return set(doc.get('tags', set()))

def data_of_obj(tags):
    print(f"data_of_obj({tags=})")
    return {"tags": {tag: True for tag in tags}}

trans = Pipe(
    wrap_kvs(
        key_of_id=itemgetter('_id'), 
        id_of_key=lambda x: {'_id': x}, 
        obj_of_data=first_item_of_cursor,
        data_of_obj=data_of_obj,
        # preset=delete_if_exists,
    )
)

@trans
class TagStore(MongoCollectionPersister):
    """To group items"""
    def __init__(self,
        mgc='scrap/tagged_objects',
        iter_projection=('_id',), 
        # iter_projection=None, #tuple({'ref': True, '_id': False}.items()),
        **mgc_find_kwargs,
    ):
        # if iter_projection is not None:
        #     iter_projection = dict(iter_projection)
        super().__init__(
            mgc=mgc,
            iter_projection=iter_projection,
            **mgc_find_kwargs
        )
    

# for k in m:
#     del m[k]

prepare_test_collection()

m = TagStore()
list(m)

list(m)
# m['123'] = {'tags': ['tag1', 'tag2']}

['123', '456', '789']

In [60]:
t = m['123']
print(f"{m['123']=}")
# m['123'] = ('apple', 'sauce')
m['123']

m['123']={'_id': '123', 'ref': 'absolute/reference/to/content', 'some': 'other metadata', 'tags': {'tag1': True, 'tag2': True}}


{'_id': '123',
 'ref': 'absolute/reference/to/content',
 'some': 'other metadata',
 'tags': {'tag1': True, 'tag2': True}}

In [42]:
list(m[{'_id': '123'}])

[{'_id': '123',
  'ref': 'absolute/reference/to/content',
  'some': 'other metadata',
  'tags': {'tag1': True, 'tag2': True}}]

In [36]:
m['123']

{'_id': '123',
 'ref': 'absolute/reference/to/content',
 'some': 'other metadata',
 'tags': {'tag1': True, 'tag2': True}}

We'd like to have `"ref"` be the key, not `"_id"` (but couldn't get uniqueness enforcement to work)...
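
One standard way to ask MongoDB itself to enforce uniqueness on a field is a unique index, which pymongo exposes through `create_index(..., unique=True)`. The sketch below is untested against the `MongoStore` wrappers used in this notebook. `ensure_unique_ref` and `RecordingCollection` are hypothetical names, and the latter is only a stand-in so the snippet runs without a server.

```python
def ensure_unique_ref(collection):
    """Ask the collection to reject a second document with the same `ref`."""
    # With a real pymongo collection, this creates an ascending unique index on `ref`.
    return collection.create_index([('ref', 1)], unique=True)


class RecordingCollection:
    """Stand-in that only records the call (NOT a real mongo collection)."""

    def __init__(self):
        self.calls = []

    def create_index(self, keys, **kwargs):
        self.calls.append((keys, kwargs))
        return 'ref_1'  # made-up return value for the stand-in


fake = RecordingCollection()
ensure_unique_ref(fake)
print(fake.calls)
```

Note that index creation is expected to fail if the collection already holds two docs with the same `ref`, so it would have to happen before any duplicates are written.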

In [8]:
test_metadata_docs = [
    {
        "_id": "123",
        "ref": "absolute/reference/to/content",
        "some": "other metadata",  # just to show there can be other stuff
        "tags": {
            # instead of a list, we'll use an object (dict), whose fields are the group names
            # This is because mongoDB allows us to index fields, therefore automatically 
            # get the bidirectional mapping from groups to refs the group "contains"
            "tag1": True,
            "tag2": True,
        }
    },
    {
        "_id": "456",
        "ref": "absolute/reference/to/some/other/content",
        "tags": {
            "tag1": True,
            "tag2": True,
        }
    },
    {
        "_id": "789",
        "ref": "this/ref/is/necessary",
        "optional": "metadata",
        # and no tags here (but whenever someone/something adds a tag, it will be added here)
    }
]

# Make a collection with only those docs in it

from mongodol import MongoCollectionPersister

def prepare_test_collection():
    collection_uri = 'scrap/tagged_objects'  # hard-coded rather than taken as an argument ON PURPOSE!!
    s = MongoCollectionPersister(collection_uri, iter_projection=None)

    for doc in s:
        del s[doc]

    assert list(s) == []

    for doc in test_metadata_docs:
        s[doc] = doc

    assert list(s) == test_metadata_docs

    return s

s = prepare_test_collection()
list(s)


[{'_id': '123',
  'ref': 'absolute/reference/to/content',
  'some': 'other metadata',
  'tags': {'tag1': True, 'tag2': True}},
 {'_id': '456',
  'ref': 'absolute/reference/to/some/other/content',
  'tags': {'tag1': True, 'tag2': True}},
 {'_id': '789', 'ref': 'this/ref/is/necessary', 'optional': 'metadata'}]

[{'_id': '123'}, {'_id': '456'}, {'_id': '789'}]

In [30]:
# m.mgc.insert_one({'ref': 'absolute/reference/to/content', 'apple': 'sauce'})
# list(m.mgc.find())

[{'_id': '123',
  'ref': 'absolute/reference/to/content',
  'some': 'other metadata',
  'tags': {'tag1': True, 'tag2': True}},
 {'_id': '456',
  'ref': 'absolute/reference/to/some/other/content',
  'tags': {'tag1': True, 'tag2': True}},
 {'_id': '789', 'ref': 'this/ref/is/necessary', 'optional': 'metadata'},
 {'_id': ObjectId('64e7b9fd37cf048a5852e786'),
  'ref': 'absolute/reference/to/content',
  'apple': 'sauce'}]

In [27]:
# Note: NOT WORKING YET!!

from operator import itemgetter

from dol import wrap_kvs, Pipe
from mongodol.stores import MongoStore


# # To be able to overwrite an existing (item, group) pair (by default MongoStore doesn't allow it)
# def delete_if_exists(self, k, v):
#     if k in self:
#         del self[k]        
#     return v

def data_of_obj(tags):
    print(f"data_of_obj({tags=})")
    return {tag: True for tag in tags}

trans = Pipe(
    wrap_kvs(
        key_of_id=itemgetter('ref'), 
        id_of_key=lambda x: {'ref': x}, 
        obj_of_data=lambda x: set(x.get('tags', set())),
        data_of_obj=data_of_obj,
        # preset=delete_if_exists,
    )
)

@trans
class TagStore(MongoStore):
    """To group items"""
    def __init__(self,
        db_name='scrap',
        collection_name='tagged_objects',
        mongo_client_kwargs=None,
    ):
        super().__init__(
            db_name=db_name,
            collection_name=collection_name,
            key_fields=['ref'],
            # data_fields=['tags'],
            mongo_client_kwargs=mongo_client_kwargs
        )
    


# for k in m:
#     del m[k]

prepare_test_collection()

m = TagStore()
list(m)

list(m)
# m['123'] = {'tags': ['tag1', 'tag2']}

['absolute/reference/to/content',
 'absolute/reference/to/some/other/content',
 'this/ref/is/necessary']

In [28]:
k = 'absolute/reference/to/content'
m[k] = {'apple', 'sauce'}
m[k]

data_of_obj(tags={'apple', 'sauce'})


{'tag1', 'tag2'}

In [26]:
mm = m.store
list(mm)

AttributeError: 'TagStore' object has no attribute 'store'

In [127]:
debug

> /Users/thorwhalen/Dropbox/py/proj/i/mongodol/mongodol/scrap/old01.py(363)__setitem__()
    361 
    362     def __setitem__(self, k, v):
--> 363         return self._mgc.insert_one(dict(k, **v))
    364 
    365     def __delitem__(self, k):

self = <mongodol.scrap.old01.OldMongoPersister object at 0x1223a9090>
k = {'ref': 'absolute/reference/to/content'}
v = ({'tag2': True, 'tag3': True},)
> /Users/thorwhalen/Dropbox/py/proj/i/dol/dol/

In [111]:
# m['this/ref/is/necessary'] = {'tags': 3}

In [109]:
m['this/ref/is/necessary']

set()

In [113]:
# m.store[{'ref': 'this/ref/is/necessary'}]
# m.store[{'ref': 'this/ref/is/necessary'}] = {'tags': 'adf'}
# m.store[{'ref': 'this/ref/is/necessary'}]
list(m.store)

[{'ref': 'absolute/reference/to/content'},
 {'ref': 'absolute/reference/to/some/other/content'},
 {'ref': 'this/ref/is/necessary'}]

In [114]:
list(m.store.store)

[{'ref': 'absolute/reference/to/content'},
 {'ref': 'absolute/reference/to/some/other/content'},
 {'ref': 'this/ref/is/necessary'}]

In [105]:
k = 'this/ref/is/necessary'
k = 'absolute/reference/to/some/other/content'

m.store.store[{'ref': k}]

d = m.store[{'ref': k}].get('tags', dict())
d.update(**{'tag3': True})
d
# m[k]




{'tag1': True, 'tag2': True, 'tag3': True}
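
The read-modify-write above could instead be expressed as a single update spec using MongoDB's `$set` operator with dotted paths, which would leave the other fields of the doc untouched. Here is a sketch of building such a spec, plus a tiny in-memory imitation of what `$set` does, so the effect can be checked without a server. Both function names are hypothetical, and tag names containing dots are not handled.

```python
import copy


def add_tags_update(tags):
    """Update spec that switches on each tag under the `tags` field."""
    return {'$set': {f'tags.{tag}': True for tag in tags}}


def apply_set_update(doc, update):
    """In-memory imitation of `$set` with dotted paths (illustration only)."""
    new_doc = copy.deepcopy(doc)
    for dotted_path, value in update['$set'].items():
        *parents, leaf = dotted_path.split('.')
        target = new_doc
        for name in parents:
            target = target.setdefault(name, {})  # create missing sub-docs
        target[leaf] = value
    return new_doc


doc = {'ref': 'absolute/reference/to/some/other/content',
       'tags': {'tag1': True, 'tag2': True}}
update = add_tags_update(['tag3'])
print(update)
print(apply_set_update(doc, update)['tags'])
```

With pymongo, the spec would be passed as the second argument of something like `update_one({'ref': k}, update)`. How to route that through the `wrap_kvs` layers is left open here.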

# Historical sections

The interface for grouping/tagging has since changed significantly; the original proposal is kept below for reference.

## A little digression on scaffolding

In [None]:
from typing import MutableMapping, Iterable, Any, NewType, Callable

Group = NewType('Group', str)
Item = NewType('Item', Any)


What functionality do we want around groups and their items?

Let's express this through type annotations of some functions (which we'll encapsulate in a `GroupsDacc` class).

In [None]:
class GroupsDacc:
    add_items_to_group: Callable[[Iterable[Item], Group], Any]
    list_groups: Callable[[], Iterable[Group]]
    items_for_group: Callable[[Group],Iterable[Item]]
    

Note that this should now be sufficient to generate a scaffold of what we need. 

We'll do it in two ways: by (dynamically) creating a `typing.Protocol` describing a concrete `Groups` object we could implement in the future, and by generating (the code string for) a concrete (but empty) such `Groups` class.

In [None]:
import i2

from meshed.scrap.annotations_to_meshes import (
    func_types_to_scaffold, 
    func_types_to_protocol
)

# GroupsDacc.__annotations__ is a {name: func_annotation, ...} dict
# We can make a protocol from that
GroupsProtocol = func_types_to_protocol(GroupsDacc.__annotations__)

# See that GroupsProtocol has methods for each item of the
# GroupsDacc.__annotations__ dict. Each method bears a signature compatible with
# the annotations.
i2.Sig(GroupsProtocol.add_items_to_group)
# <Sig (self, iterable: Iterable[__main__.Item], group: __main__.Group) -> Any>

<Sig (self, iterable: Iterable[__main__.Item], group: __main__.Group) -> Any>

In [None]:
# We can also generate the code for a concrete (but empty) class
print(func_types_to_scaffold(GroupsDacc.__annotations__))


class GeneratedClass:
    def add_items_to_group(self, iterable: Iterable, group: Group) -> Any:
    	pass

    def list_groups(self) -> Iterable:
    	pass

    def items_for_group(self, group: Group) -> Iterable:
    	pass



## TDD: Tests that describe the behavior we want

In [None]:
from typing import MutableMapping, Iterable, Any, NewType, Callable, Protocol

Group = NewType('Group', str)
Item = NewType('Item', Any)


class GroupsProtocol(Protocol):
    def add_items_to_group(self, group: Group, *items: Item) -> Any:
        """Add one or several items to a group"""

    def list_groups(self) -> Iterable:
        """List group names"""

    def items_for_group(self, group: Group) -> Iterable:
        """List the items in a group"""
    
    
def test_groups(groups: GroupsProtocol):
    # the following assertion isn't part of the behavior we want -- just a condition we'll 
    # need to be able to conduct our test: Namely, that our collection of groups/items is empty.
    assert list(groups.list_groups()) == []  # make sure test is well setup
    
    groups.add_items_to_group('group_a', 'item_1', 'item_2')
    assert sorted(groups.list_groups()) == ['group_a']
    
    groups.add_items_to_group('group_b', 'item_3')
    assert sorted(groups.list_groups()) == ['group_a', 'group_b']
    assert sorted(groups.items_for_group('group_a')) == ['item_1', 'item_2']
    

Now we'll implement two concrete `GroupsDacc` classes, using a store (a `MutableMapping`) as a back-end so as to keep the persistence concern separate.

The idea is: As long as we provide our concrete persister with the right `MutableMapping` facade (with a minimum of specifics/semantics such as what the keys and values are meant to be), we should have a working object.

The two `GroupsDacc` options will differ on the particulars of the store. 
- In the first, we'll assume the store has items as keys and groups as values. 
- In the second we'll assume the groups are the keys, and values are sets of items of that group.

## Concrete GroupsDacc (option 1): ItemGroupDacc

In [None]:
from typing import MutableMapping, Iterable, Any, NewType
from dataclasses import dataclass

Group = NewType('Group', str)
Item = NewType('Item', Any)
ItemGroupPairs = NewType('ItemGroupPairs', MutableMapping[Item, Group])

@dataclass
class ItemGroupDacc:
    store: ItemGroupPairs
        
    def add_items_to_group(self, group: Group, *items: Iterable[Item]) -> Any:
        for item in items:
            self.store[item] = group

    def list_groups(self) -> Iterable[Group]:
        return set(self.store.values())

    def items_for_group(self, group: Group) -> Iterable[Item]:
        # TODO: Express this filtering in such a way that will allow us to take advantage of DB specifics
        #  (e.g., passing on the filtering to the DB instead of filtering in python itself)
        return (item for item, group_ in self.store.items() if group_ == group)
    

In [None]:
store = dict()
test_groups(ItemGroupDacc(store))  # a dict works!

## Concrete GroupsDacc (option 2): GroupSetsDacc

In [None]:
from typing import MutableMapping, Iterable, Any, NewType, Set
from dataclasses import dataclass

Group = NewType('Group', str)
Item = NewType('Item', Any)
GroupSets = NewType('GroupSets', MutableMapping[Group, Set[Item]])

@dataclass
class GroupSetsDacc:
    store: GroupSets
        
    def add_items_to_group(self, group: Group, *items: Iterable[Item]) -> Any:
        self.store[group] |= set(items)

    def list_groups(self) -> Iterable[Group]:
        return set(self.store)
            
    def items_for_group(self, group: Group) -> Iterable[Item]:
        return self.store[group]
    

In [None]:
from collections import defaultdict

store = defaultdict(set)
test_groups(GroupSetsDacc(store))  # a defaultdict(set) works as a store!

## Actual persisting stores using mongo

In [None]:
# from mongodol.base import MongoClient

### ItemGroupPairs (for ItemGroupDacc)

Options for implementing:

```
s[item] = group
```

Option 1: But here we'd need to produce the ID on write

```
--> {'_id': ID, 'group': group, 'item': item}
```

Option 2: But we need to allow re-writes on `_id`

```
--> {'_id': item, 'group': group}
```


In [None]:
from operator import itemgetter

from dol import wrap_kvs, Pipe
from mongodol.stores import MongoStore


# To be able to overwrite an existing (item, group) pair (by default MongoStore doesn't allow it)
def delete_if_exists(self, k, v):
    if k in self:
        del self[k]
    return v


trans = Pipe(
    wrap_kvs(
        key_of_id=itemgetter('_id'), 
        id_of_key=lambda x: {'_id': x}, 
        obj_of_data=itemgetter('group'),
        data_of_obj=lambda x: {'group': x}, 
        preset=delete_if_exists,
    )
)

@trans
class GroupStore(MongoStore):
    """To group items"""
    def __init__(self,
        db_name='scrap',
        collection_name='group_items',
        mongo_client_kwargs=None,
    ):
        super().__init__(
            db_name=db_name,
            collection_name=collection_name,
            key_fields=['_id'],
            data_fields=['group'],
            mongo_client_kwargs=mongo_client_kwargs
        )
    
m = GroupStore()
list(m)

[]

In [None]:
store = GroupStore()

# empty the store
for k in store: 
    del store[k]

test_groups(ItemGroupDacc(store))

### GroupSets (for GroupSetsDacc)

Options for implementing:

```
s[group] |= items
```

Option 1: But we need to allow re-writes on `_id`

```
--> {'_id': group, 'items': items}
```

Option 2: But here we'd need to produce the ID on write

```
--> {'_id': ID, 'group': group, 'item': item}  # (group, items) -> (group, item_1), (group, item_2), ...
```


In [None]:
from operator import itemgetter

from dol import wrap_kvs, Pipe
from mongodol.stores import MongoStore


# To be able to overwrite an existing (item, group) pair (by default MongoStore doesn't allow it)
def delete_if_exists(self, k, v):
    if k in self:
        del self[k]
    return v


trans = Pipe(
    wrap_kvs(
        key_of_id=itemgetter('_id'), 
        id_of_key=lambda x: {'_id': x}, 
        obj_of_data=Pipe(itemgetter('items'), set),
        data_of_obj=lambda x: {'items': list(x) if not isinstance(x, str) else [x]}, 
#         preset=delete_if_exists,
    )
)

@trans
class ItemsStore(MongoStore):
    """To group items"""
    def __init__(self,
        db_name='scrap',
        collection_name='items_group',
        mongo_client_kwargs=None,
    ):
        super().__init__(
            db_name=db_name,
            collection_name=collection_name,
            key_fields=['_id'],
            data_fields=['items'],
            mongo_client_kwargs=mongo_client_kwargs
        )
        
    def __missing__(self, k):
        return {'items': []}
    

In [None]:
store = ItemsStore()

# empty the store
for k in store: 
    del store[k]

test_groups(GroupSetsDacc(store))

## Implementation that uses "metadata" collection

Say you already have a mongo collection that contains metadata on your items. 
That is, a collection of docs, one per item, each intended to record information about that item. 
The groups an item belongs to can then be just one more field of its doc. 

In [None]:
metadata_docs = [
    {
        "_id": "123",
        "ref": "absolute/reference/to/content",
        "some": "other metadata",  # just to show there can be other stuff
        "groups": {
            # instead of a list, we'll use an object (dict), whose fields are the group names
            # This is because mongoDB allows us to index fields, therefore automatically 
            # get the bidirectional mapping from groups to refs the group "contains"
            "group1": True,
            "group2": True,
        }
    },
    {
        "_id": "456",
        "ref": "absolute/reference/to/some/other/content",
        "groups": {
            "group1": True,
            "group3": True,
        }
    },
    {
        "_id": "789",
        "ref": "this/ref/is/necessary",
        "optional": "metadata",
        # and no groups here (but whenever someone/something adds a group, it will be added here)
    }
]