New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detached instance when loading session model with invalid CSRF token #4398

Closed
chdorner opened this Issue Mar 2, 2017 · 4 comments

Comments

Projects
None yet
2 participants
@chdorner
Contributor

chdorner commented Mar 2, 2017

This is the issue for tracking the following error reports:

Steps to reproduce

I did not yet find a way to reproduce this.

Expected behaviour

No exception.

Actual behaviour

💥 Parent instance <User at 0x7f91ddb75d10> is not bound to a Session; lazy load operation of attribute 'groups' cannot proceed

Additional details

The errors are coming from at least the Chrome extension and via, I haven't seen embed yet but we can assume that this happens for all types of environments the clients is running.

Following the Sentry error report and the code it looks like this is happening:

  1. User submits the login in sidebar
  2. Pyramid thinks the CSRF token is wrong and throws BadCSRFToken
  3. h.views.accounts.bad_csrf_token_json is getting called
  4. The view renders the session model by calling h.sessions.model
  5. It tries to load the groups
  6. It succeeds to load the authority group
  7. It tries to load the user's groups - and this is the place where request.authenticated_user is not None, but a detached instance so request.authenticated_user.groups explodes.

I don't really know why the client is posting the login form when the user is already logged-in because we check the CSRF token before we handle the login request. So the user must already have a valid auth ticket. This is also the reason why I'm struggling to reproduce it.

The user itself is detached because any exception raised will rollback the transaction and detach any objects that are cached in the database session.

@chdorner

This comment has been minimized.

Show comment
Hide comment
@chdorner

chdorner Mar 2, 2017

Contributor

Pinging @fatbusinessman to see if we can do something about h.session.model when it's being called in an exception view.

And pinging @robertknight for some help regarding why the client could be sending a login form POST request when there already is an auth ticket/cookie.

Contributor

chdorner commented Mar 2, 2017

Pinging @fatbusinessman to see if we can do something about h.session.model when it's being called in an exception view.

And pinging @robertknight for some help regarding why the client could be sending a login form POST request when there already is an auth ticket/cookie.

@fatbusinessman

This comment has been minimized.

Show comment
Hide comment
@fatbusinessman

fatbusinessman Mar 3, 2017

Contributor

I’ve managed to reproduce this with a functional test:

diff --git a/tests/functional/test_accounts.py b/tests/functional/test_accounts.py
index 3d6d1a8..7312756 100644
--- a/tests/functional/test_accounts.py
+++ b/tests/functional/test_accounts.py
@@ -91,6 +91,9 @@ class TestAccountSettings(object):

         password_form.submit(xhr=True, status=400)

+    def test_something_something_csrf(self, app):
+        res = app.post_json('/app?__formid__=login', {})
+
     @pytest.fixture
     def user(self, db_session, factories):
         user = factories.User(password='pass')

Running this test¹ gives me the DetachedInstanceError.

¹ For my own later reference:

tox -e functional -- -k test_something_something_csrf tests/functional/test_accounts.py

Contributor

fatbusinessman commented Mar 3, 2017

I’ve managed to reproduce this with a functional test:

diff --git a/tests/functional/test_accounts.py b/tests/functional/test_accounts.py
index 3d6d1a8..7312756 100644
--- a/tests/functional/test_accounts.py
+++ b/tests/functional/test_accounts.py
@@ -91,6 +91,9 @@ class TestAccountSettings(object):

         password_form.submit(xhr=True, status=400)

+    def test_something_something_csrf(self, app):
+        res = app.post_json('/app?__formid__=login', {})
+
     @pytest.fixture
     def user(self, db_session, factories):
         user = factories.User(password='pass')

Running this test¹ gives me the DetachedInstanceError.

¹ For my own later reference:

tox -e functional -- -k test_something_something_csrf tests/functional/test_accounts.py

@chdorner

This comment has been minimized.

Show comment
Hide comment
@chdorner

chdorner Mar 3, 2017

Contributor

Hmm, the code takes a different path, but it results in the same error. It's now going through the error_validation exception view, instead of the bad_csrf_token_json view.

I presume that the fix would work for both cases. Depending on the fix this would get rid of both errors. The problem I see is that your functional test is that the session is already authenticated, and then you're sending a login request which will fail the form validation.

I did some more hunting for this, and it looks like the after_rollback event is not being sent, which is why the user service cache isn't cleared. Just checking why that is so.

Contributor

chdorner commented Mar 3, 2017

Hmm, the code takes a different path, but it results in the same error. It's now going through the error_validation exception view, instead of the bad_csrf_token_json view.

I presume that the fix would work for both cases. Depending on the fix this would get rid of both errors. The problem I see is that your functional test is that the session is already authenticated, and then you're sending a login request which will fail the form validation.

I did some more hunting for this, and it looks like the after_rollback event is not being sent, which is why the user service cache isn't cleared. Just checking why that is so.

@chdorner

This comment has been minimized.

Show comment
Hide comment
@chdorner

chdorner Mar 3, 2017

Contributor

Found the culprit. I will have to write some proper documentation and tests on Monday, but here is the fix:

diff --git a/h/services/user.py b/h/services/user.py
index 94a0a77ec..384d1e759 100644
--- a/h/services/user.py
+++ b/h/services/user.py
@@ -40,10 +40,15 @@ class UserService(object):
         self._cache = {}

         # But don't allow the cache to persist after the session is closed.
-        @sqlalchemy.event.listens_for(session, 'after_commit')
-        @sqlalchemy.event.listens_for(session, 'after_rollback')
-        def flush_cache(session):
-            self._cache = {}
+        # TODO: Add more documentation about:
+        #       - after_rollback is only called for explicit rollbacks
+        #       - in case of an exception, sqlalchemy is closing the session / connection, which implicitly closes the connection but doesn't emit after_rollback
+        #       - after_transaction_end is being called regardless of commit or rollback, we just need to check that the outer-most transaction is being ended,
+        #         to ensure that we don't clear the cache for sub-transactions.
+        @sqlalchemy.event.listens_for(session, 'after_transaction_end')
+        def flush_cache(session, transaction):
+            if transaction.parent is None:
+                self._cache = {}

     def fetch(self, userid_or_username, authority=None):
         """
Contributor

chdorner commented Mar 3, 2017

Found the culprit. I will have to write some proper documentation and tests on Monday, but here is the fix:

diff --git a/h/services/user.py b/h/services/user.py
index 94a0a77ec..384d1e759 100644
--- a/h/services/user.py
+++ b/h/services/user.py
@@ -40,10 +40,15 @@ class UserService(object):
         self._cache = {}

         # But don't allow the cache to persist after the session is closed.
-        @sqlalchemy.event.listens_for(session, 'after_commit')
-        @sqlalchemy.event.listens_for(session, 'after_rollback')
-        def flush_cache(session):
-            self._cache = {}
+        # TODO: Add more documentation about:
+        #       - after_rollback is only called for explicit rollbacks
+        #       - in case of an exception, sqlalchemy is closing the session / connection, which implicitly closes the connection but doesn't emit after_rollback
+        #       - after_transaction_end is being called regardless of commit or rollback, we just need to check that the outer-most transaction is being ended,
+        #         to ensure that we don't clear the cache for sub-transactions.
+        @sqlalchemy.event.listens_for(session, 'after_transaction_end')
+        def flush_cache(session, transaction):
+            if transaction.parent is None:
+                self._cache = {}

     def fetch(self, userid_or_username, authority=None):
         """

chdorner added a commit that referenced this issue Mar 7, 2017

Clear user service cache on transaction end.
We previously cleared the user service cache on `after_commit` and
`after_rollback` SQLAlchemy events. This caused issues where the view
was raising an exception, because SQLAlchemy only fires the
`after_rollback` event for an explicit rollback.

pyramid_tm and zope.sqlalchemy will close the database transaction when
the view raises an exception. A call to `SessionTransaction.close` does
implicitly send a `ROLLBACK` SQL query to the database, but it doesn't
emit the `after_rollback` event that we're relying on in order to clear
out the user service cache.
This results in a exception views potentially accessing detached
database objects which in turn throws another exception, which then
returns a generic 500 error back to the user.

For following all of this along at home:

1. pyramid_tm starts a transaction at the beginning of a request:
https://github.com/Pylons/pyramid_tm/blob/1.1.1/pyramid_tm/__init__.py#L86
2. pyramid_tm catches exceptions and if the error is not retryable,
aborts the transaction:
https://github.com/Pylons/pyramid_tm/blob/1.1.1/pyramid_tm/__init__.py#L116
3. The `abort` that pyramid_tm sends is finishing the transaction:
https://github.com/zopefoundation/zope.sqlalchemy/blob/0.7.7/src/zope/sqlalchemy/datamanager.py#L91
4. Finishing a transaction in turn will close it:
https://github.com/zopefoundation/zope.sqlalchemy/blob/0.7.7/src/zope/sqlalchemy/datamanager.py#L87

Whereas an explicit rollback works as we expected it to:
1. `sqlalchemy.orm.session.SessionTransaction.rollback` checks if it
needs to send a `ROLLBACK` SQL query to the database, and if it does...
(https://bitbucket.org/zzzeek/sqlalchemy/src/ce17fd4bf361e6cd08b6e698e49c118643abe724/lib/sqlalchemy/orm/session.py?at=rel_1_1_4&fileviewer=file-view-default#session.py-490)
2. calls `_rollback_impl` which emits the `after_rollback` event
(https://bitbucket.org/zzzeek/sqlalchemy/src/ce17fd4bf361e6cd08b6e698e49c118643abe724/lib/sqlalchemy/orm/session.py?at=rel_1_1_4&fileviewer=file-view-default#session.py-532).

Fixes #4398.

chdorner added a commit that referenced this issue Mar 8, 2017

Clear user service cache on transaction end.
We previously cleared the user service cache on `after_commit` and
`after_rollback` SQLAlchemy events. This caused issues where the view
was raising an exception, because SQLAlchemy only fires the
`after_rollback` event for an explicit rollback.

pyramid_tm and zope.sqlalchemy will close the database transaction when
the view raises an exception. A call to `SessionTransaction.close` does
implicitly send a `ROLLBACK` SQL query to the database, but it doesn't
emit the `after_rollback` event that we're relying on in order to clear
out the user service cache.
This results in a exception views potentially accessing detached
database objects which in turn throws another exception, which then
returns a generic 500 error back to the user.

For following all of this along at home:

1. pyramid_tm starts a transaction at the beginning of a request:
https://github.com/Pylons/pyramid_tm/blob/1.1.1/pyramid_tm/__init__.py#L86
2. pyramid_tm catches exceptions and if the error is not retryable,
aborts the transaction:
https://github.com/Pylons/pyramid_tm/blob/1.1.1/pyramid_tm/__init__.py#L116
3. The `abort` that pyramid_tm sends is finishing the transaction:
https://github.com/zopefoundation/zope.sqlalchemy/blob/0.7.7/src/zope/sqlalchemy/datamanager.py#L91
4. Finishing a transaction in turn will close it:
https://github.com/zopefoundation/zope.sqlalchemy/blob/0.7.7/src/zope/sqlalchemy/datamanager.py#L87

Whereas an explicit rollback works as we expected it to:
1. `sqlalchemy.orm.session.SessionTransaction.rollback` checks if it
needs to send a `ROLLBACK` SQL query to the database, and if it does...
(https://bitbucket.org/zzzeek/sqlalchemy/src/ce17fd4bf361e6cd08b6e698e49c118643abe724/lib/sqlalchemy/orm/session.py?at=rel_1_1_4&fileviewer=file-view-default#session.py-490)
2. calls `_rollback_impl` which emits the `after_rollback` event
(https://bitbucket.org/zzzeek/sqlalchemy/src/ce17fd4bf361e6cd08b6e698e49c118643abe724/lib/sqlalchemy/orm/session.py?at=rel_1_1_4&fileviewer=file-view-default#session.py-532).

Fixes #4398.

@chdorner chdorner closed this in #4409 Mar 8, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment