operations on ListField are not atomic #3

clayg · 2010-08-04T23:32:10Z

This may actually be a feature request. But, I had inferred from the documentation of Ohm that append and pop operations on a ListField would be immediately persisted and atomic (see "Persistence strategy" in http://github.com/soveran/ohm/blob/master/README.markdown) - but this doesn't seem to be the case...

with Book and Author models from tests.models.ListFieldTestCase.test_list_of_reference_fields

>>> b1 = Book.objects.create(title='Book1', date_published=date.today())
>>> b2 = Book.objects.create(title='Book2', date_published=date.today())
>>> a1 = Author.objects.create(name='Author1', books=[b1, b2])
>>> # then in another thread
... 
>>> a2 = Author.objects.get_by_id(1)
>>> b3 = Book.objects.create(title='Book3', date_published=date.today())
>>> a2.books.append(b3)
>>> a2.save()
True
>>> len(a2.books)
3
>>> # meanwhile back in the first request
... 
>>> a1.save()
True
>>> len(Author.objects.get_by_id(1).books)
2

Atomic operations on redis lists (and sets?) are one of the killer features, and I'd like redisco to abstract them for me - e.g. Counter! The "Last Write Wins" strategy makes a lot of sense in some places, but an atomic list field would probably be preferable in many cases.

I could see reason to disabled set on ListField entirely in favor of append and pop (lpush, rpop?) on the mode base a la Counters' incr and decr? But for backwards compat maybe it would better to create a new AtomicListField (don't love the name...)

More interesting ideas might include allowing get to return the counters.List instance directly instead of cast members, or at least some sort of proxy/wrapper around it - or for _redisco_model, maybe something similar to a ModelSet?

I wonder if you've considered this case, or what approach you would find more amiable?

iamteem · 2010-08-05T01:17:09Z

AFAIK Ohm's list doesn't allow assignment so you can only push values and pop.
Maybe you can use the containers.List class for this. It's the one used to save lists.

>>> List(a1.key('books')).rpush(b1.id)
>>> List(a1.key('books')).rpush(b2.id)

You'd have to reload a1 though. But then the other Author instance would just clear the list when it has loaded/cached the book ids from the list, and saving its own values.

Another idea is to add an atomic option in ListField class so that we can do:

class Author(models.Model):
    books = models.ListField(Book, atomic=True)

>>> author = Author.objects.create(books=[book1])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Exception, List assignment not allowed.
>>> author = Author.objects.create()
>>> author.books.append(book1)

And the append is atomic, and this lists won't be saved in the save method.

What do you think?

clayg · 2010-08-05T13:28:56Z

I agree with you about Ohm and that a containers.List instance is the correct way to expose atomic operations on a list.

I had not considered using a kwarg to specify the atomicity of the field... I like it!

clayg · 2010-08-06T16:13:59Z

For "author.books.append(book)" to be atomic, the object returned by author.books can not be a python list - any actions on that object would just modify the python list. I purpose that caching operations on a python list and trying to apply them atomically to redis during save is worse than overwrite. In order for atomic operations on author.books to be immediately persisted and atomic, it has to be an instance of a class that supports atomic operations on redis.

I thought about exposing redisco.containers.List directly - which was ok for strings, but of course doesn't work for a list of models...

>>> a
<Author:1 {'books': <List 'Author:1:book' []>, 'name': 'author1'}>
>>> b
<Book:1 {'date_published': datetime.date(2010, 8, 6), 'title': 'book1'}>
>>> a.books.append(b.id)
>>> a
<Author:1 {'books': <List 'Author:1:book' ['1']>, 'name': 'author1'}>
>>> a.books
<List 'Author:1:book' ['1']>

Ultimately, I feel that internally the option atomic=True should force ListField to set the value of the attribute name to an instance of a new redisco container - a TypedList.

basic gist of TypedList:
http://gist.github.com/511519

The new container is in my fork in it's own 'typedlist' branch:
http://github.com/clayg/redisco/tree/typedlist

^ if you want to check it out and run tests

Please give me a second set of eyes to help pick out the warts, if you think it might be worth pursuing I'll see about adding in the atomic option to ListField.

Thanks!

iamteem · 2010-08-06T18:10:46Z

Looks solid. :) I'm wondering if we can subclass TypedList from containers.List instead of object (and using the self.list attribute), just add the typecasting methods, and typecast values when reading/modifying the list. (But then again, all those stuff is accessible using self.list.)

Great work!

clayg · 2010-08-09T16:52:19Z

My original implementation tried to subclass List, but it looked kinda weird:

    def all(self):
        """Returns all items in the list."""
        v = super(TypedSubList, self).all()
        return self.typecast_iter(v)

Plus, I found out quickly that I'd have to be methodical in overrideing methods defined on List or risk breaking my subclass:

>>> from redisco.containers import TypedSubList
>>> a = TypedSubList('alpha', float)
>>> a
[0.0]
>>> a.all()
[0.0]
>>> a.members
['0']
>>> # ^ oops, forgot memebers!

I thought about trying to short-circut getattribute so I could dynamically typecast attributes, and wrap callables in a typecast decorator but it got messy pretty quickly:

    def typecast_call(self, f):
        def casting_call(*args, **kwargs):
            v = f(*args, **kwargs)
            if hasattr(v, 'append'):
                return self.typecast_iter(v)
            elif isinstance(v, basestring):
                return self.typecast_item(v)
            else:
                return v
        return casting_call

    def __getattribute__(self, att):
        try:
            v = getattr(super(TypedSubList, self), att)
        except AttributeError:
            # List class does not define this method directly
            try:
                # maybe I do?
                return object.__getattribute__(self, att)
            except AttributeError:
                # Nope, I don't have this method either! 
                # Maybe List will delegate?
                v = super(TypedSubList, self).__getattribute__(att)

        # this method is defined on List class
        if hasattr(v, '__call__'):
            return self.typecast_call(v)
        elif hasattr(v, 'append'):
            return self.typecast_iter(v)
        elif isinstance(v, basestring):
            return self.typecast_item(v)
        else:
            return v

Pretty gruesome... it may even violate import this:

In the face of ambiguity, refuse the temptation to guess.

It's certainly possible there's a better way to implement the above, and I agree with you that subclassing object duplicates work... so I'm not sure what's the most robust and maintainable solution here...

What do you think?

* fixes iamteem#3

xwssole pushed a commit to xwssole/redisco that referenced this issue Oct 10, 2012

Setting a reference field deletes the reference (and cached object)

6b05392

* fixes iamteem#3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

operations on ListField are not atomic #3

operations on ListField are not atomic #3

clayg commented Aug 4, 2010

iamteem commented Aug 5, 2010

clayg commented Aug 5, 2010

clayg commented Aug 6, 2010

iamteem commented Aug 6, 2010

clayg commented Aug 9, 2010

operations on ListField are not atomic #3

operations on ListField are not atomic #3

Comments

clayg commented Aug 4, 2010

iamteem commented Aug 5, 2010

clayg commented Aug 5, 2010

clayg commented Aug 6, 2010

iamteem commented Aug 6, 2010

clayg commented Aug 9, 2010