Fix for Issue 1634 #1636

Closed
wants to merge 4 commits into
from

Conversation

Projects
None yet
3 participants
Contributor

kux commented Feb 14, 2013

Proposed fix for #1634 (concurrency issues with mptt when creating root pages)

Note: edited by @ojii

kux added some commits Feb 13, 2013

@kux kux Fix mptt issues by making views that change the mptt fields mutually …
…exclusive

* mutual exclusivity is achieved by doing a select for update on the first Page object
* add_view and move_page are now mutually exclusive. Other will follow...
a97e437
@kux kux Make copy_page mutually exclusive as well 3c3524c
@kux kux Removed the old commit_on_success annotations f795d36

@ojii ojii commented on the diff Feb 14, 2013

cms/admin/pageadmin.py
@@ -143,6 +143,23 @@ def contribute_list_filter(cls):
setattr(cls, 'list_filter', list_filter)
+def mutually_exclusive(func):
+
+ @transaction.commit_manually
+ def wrap(*args, **kwargs):
+ transaction.commit()
+ try:
+ Page.objects.all().select_for_update().exists()
+ ret_value = func(*args, **kwargs)
+ transaction.commit()
+ return ret_value
+ except:
+ transaction.rollback()
+ raise
+
@ojii

ojii Feb 14, 2013

Collaborator

could you please use functools.update_wrapper on your wrap function to make sure that tracebacks don't get confusing because of this?

Collaborator

ojii commented Feb 14, 2013

upstream discussion: django-mptt/django-mptt#236

Collaborator

ojii commented Feb 14, 2013

how save is this? It looks like the upstream discussion is far from finished. What makes this solution work for us, but not upstream?

Contributor

kux commented Feb 14, 2013

django-mptt has some serious concurrency issues.
The main problem is that most of the tree updating algorithms go like:

  1. do some selects to figure out what needs to be updated and what values it needs to be updated with
  2. do the actual update

There's obviously a race condition there. Multiple processes might be doing step 1 concurrently.

There are multiple alternatives in fixing this:

  1. Try making all mptt tree structure updates atomic by having them execute in a single query. Something like collapsing step1 and step2 above. This would mean a massive change in mptt.
  2. Make all views that change mptt structures mutually exclusive. This is what this patch does. Adding similar logic in mptt is not desirable since it would mess up with the client code's transactions.
Contributor

kux commented Feb 14, 2013

Regarding safety, I did write the following unit tests:

class ConcurrentPageChangesTestCase(testcases.TransactionTestCase, TestHelper):

    def _add_page(self):
        page_data = self.get_new_page_data()
        return self.client.post(URL_CMS_PAGE_ADD, page_data)

    def _add_pages(self, num_pages):
        for _ in xrange(num_pages):
            self._add_page()

    def test_concurrently_create_pages(self):
        status_codes = Queue.Queue()

        class PageCreatingThread(threading.Thread):
            def run(self_):
                response = self._add_page()
                status_codes.put(response.status_code)

        superuser = self.get_superuser()
        num_concurrent_req = 5
        threads = []
        with self.login_user_context(superuser):
            self._add_page()
            for _ in xrange(num_concurrent_req):
                new_thread = PageCreatingThread()
                new_thread.start()
                threads.append(new_thread)
            for thread in threads:
                thread.join()
            # check that all page adds were successful and returned with a redirect
            while not status_codes.empty():
                self.assertEquals(status_codes.get(), 302)
            # commit any pending transaction so we make sure we see an up to date state of the db
            transaction.commit()
            # all previous created pages need to have different tree_ids
            self.assertEqual(len(set(p.tree_id for p in Page.objects.all())), num_concurrent_req + 1)

    def _concurrently_move_pages(self, movements):
        status_codes = Queue.Queue()

        class PageMove(threading.Thread):
            def run(self_):
                response = self.client.post("/admin/cms/page/%s/move-page/" % page_id, {"target": target_id, "position": position})
                status_codes.put(response.status_code)

        threads = []
        for page_id, target_id, position in movements:
            page_moving_thread = PageMove()
            page_moving_thread.start()
            threads.append(page_moving_thread)
        for thread in threads:
            thread.join()
        # check that all page moves were successful
        while not status_codes.empty():
            self.assertEquals(status_codes.get(), 200)

    def test_concurrently_move_pages_at_root_level(self):
        superuser = self.get_superuser()
        with self.login_user_context(superuser):
            page_count = 10
            self._add_pages(page_count)
            transaction.commit()
            self.assertEqual(len(set(p.tree_id for p in Page.objects.all())), page_count)
            movements = [(page_id, 10, 'left') for page_id in xrange(1, 5)]
            self._concurrently_move_pages(movements)
            # commit any pending transaction so we make sure we see an up to date state of the db
            transaction.commit()
            self.assertEqual(len(set(p.tree_id for p in Page.objects.all())), page_count)

I haven't included this in the pull request since I didn't manage to get them working with sqlite.
They do work ok with a mysql db.

Collaborator

ojii commented Feb 14, 2013

You just won't be able to get multithreaded sqlite to work (that's not how sqlite works, and is the reason you shouldn't use it in production).

The issue I have with threaded tests is that while they probably show the issues most of the time, couldn't you also just be "lucky" and the race condition doesn't occur, thus not making the test fail?

Contributor

kux commented Feb 14, 2013

The testing I did was with a mysql InnoDB using 'repeatable reads' transaction isolation level.
I did make the assumption that select for update behaves in a similar way across different RDMS.

Contributor

kux commented Feb 14, 2013

The issue I have with threaded tests is that while they probably show the issues most of the time, couldn't you also just be "lucky" and the race condition doesn't occur, thus not making the test fail?

If not using the mutually_exclusive decorator the tests always fail. Using it they never fail.

======================================================================
FAIL: test_concurrently_create_pages (cms.tests.page.ConcurrentPageChangesTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/kux/workspace/src/other/django-cms/cms/tests/page.py", line 73, in test_concurrently_create_pages
    self.assertEqual(len(set(p.tree_id for p in Page.objects.all())), num_concurrent_req + 1)
AssertionError: 2 != 6

======================================================================
FAIL: test_concurrently_move_pages_at_root_level (cms.tests.page.ConcurrentPageChangesTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/kux/workspace/src/other/django-cms/cms/tests/page.py", line 105, in test_concurrently_move_pages_at_root_level
    self.assertEqual(len(set(p.tree_id for p in Page.objects.all())), page_count)
AssertionError: 9 != 10

---------------------------------------------------------------------
Ran 2 tests in 7.328s
Contributor

kux commented Feb 15, 2013

select_for_update is only available starting with django 1.4 => this breaks for older versions

Member

digi604 commented Mar 26, 2013

I think older django versions fail.... and it would be nice to have this in 2.4 😄

Member

digi604 commented Mar 28, 2013

we do run tests on postgresql and mysql... we can exlude the test for sqlite. that said could you merge with current develop so i can have a look at it?

Contributor

kux commented Mar 28, 2013

@digi604 Sure. I'll close this and reopen something similar on develop.

kux closed this Mar 28, 2013

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment