Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow the JSONSerializer to be used as a session serializer #435

Merged
merged 4 commits into from
Feb 9, 2023

Conversation

chris-allan
Copy link
Member

Outline

As of Django 4.1 1, the PickleSerializer has been deprecated and will be
removed in Django 5.0. The next Django LTS (4.2) which we would likely support
is scheduled to be released in April 2023. In order for the JSONSerializer to be used,
get ahead of this deprecation and prepare our userbase for the change we need
to avoid persisting objects, like the Connector, directly in the session.

This PR transitions use of the Connector to a persist and
rehydration strategy similar to how the Django authentication
middleware handles model objects. It is also consistent with how we
handle our own Server pseudo model object.

In addition, the SESSION_SERIALIZER Django setting has been moved
to an OMERO.web configuration property (omero.web.session_serializer) and
left defaulting to the PickleSerializer. When using Redis via the cached session
engine the PickleSerializer is used by default regardless of what
SESSION_SERIALIZER is set to.

Testing

OMERO.web in its default configuration (SESSION_ENGINE=omeroweb.filesessionstore
and SESSION_SERIALIZER=django.contrib.sessions.serializers.PickleSerializer)
functionality should remain completely unchanged. Particular attention should be
paid to the login, public user, and bsession workflows.

With SESSION_SERIALIZER=django.contrib.sessions.serializers.JSONSerializer
functionality should remain completely unchanged.

With SESSION_ENGINE=django.contrib.sessions.backends.cache
and SESSION_SERIALIZER=django.contrib.sessions.serializers.PickleSerializer again
functionality should remain completely unchanged.

With SESSION_ENGINE=django.contrib.sessions.backends.cache
and SESSION_SERIALIZER=django.contrib.sessions.serializers.JSONSerializer the
cache itself will need to be configured to use a JSON serializer separately.
For django-redis this can be achieved by configuration such as:

{
  "default": {
    "BACKEND": "django_redis.cache.RedisCache",
    "LOCATION": "redis://127.0.0.1:6379",
    "OPTIONS": {
      "SERIALIZER": "django_redis.serializers.json.JSONSerializer"
    }
  }
}

Footnotes

  1. https://docs.djangoproject.com/en/4.1/releases/4.1/#id2

As of Django 4.1, the PickleSerializer has been deprecated and will be
removed in Django 5.0.  In order for the JSONSerializer to be used, we
need to avoid persisting objects, like the Connector, directly in the
session.

This commit transitions use of the Connector to a persist and
rehydration strategy similar to how the Django authentication
middleware handles model objects.  It is also consistent with how we
handle our own Server pseudo model object.
@jburel jburel marked this pull request as ready for review January 17, 2023 12:32
Copy link
Member

@sbesson sbesson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a pure code review, the proposed changes make sense and replace the direct storage of Connector objects within the Django session by a dictionary representation and associated serialization/deserialization methods. This should allow the session to remain serializable both with the current PickleSerializer and the JSONSerializer.

From a consumer perspective, any downstream implementation which was directly assigning and/or retrieving request.session["connector"] would now need to update their codebase and use the new methods as done in this PR.
I did not find any explicit specification of the data stored in OMERO.web sessions either in https://omero.readthedocs.io/en/stable/ or in the docstrings. Additionally this change is motivated by the deprecation, and the upcoming removal, of the default session serialization mechanism in Django for security reasons. Similarly to how we have treated similar changes in the past e.g. the upgrade to Django 3.2, this might not justify to increment the major version of OMERO.web. However, we should definitely treat it as such in terms of documentation and be upfront about the implications for consumers.

I will test the PR in various configurations and report in the upcoming days.

omeroweb/decorators.py Outdated Show resolved Hide resolved
@chris-allan
Copy link
Member Author

From a consumer perspective, any downstream implementation which was directly assigning and/or retrieving request.session["connector"] would now need to update their codebase and use the new methods as done in this PR.

Yes, it's a breaking change.

Similarly to how we have treated similar changes in the past e.g. the upgrade to Django 3.2, this might not justify to increment the major version of OMERO.web.

The main reason I opened this now is that we also have #433 open which is going to require a major version update and an upgrade guide exactly like what we did for Django 3.2.

@kkoz
Copy link
Contributor

kkoz commented Jan 20, 2023

I'm getting the following error when trying to start omero web with omero.web.session_engine=omeroweb.filesessionstore and omero.web.session_serializer=django.contrib.sessions.serializers.JSONSerializer:

Stopping OMERO.web... [NOT STARTED]

0 static files copied to '/home/kevin/omero/OMERO.server/var/static', 533 unmodified, 2 post-processed.
Clearing expired sessions. This may take some time... Traceback (most recent call last):
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 121, in decode
    return signing.loads(session_data, salt=self.key_salt, serializer=self.serializer)
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/core/signing.py", line 119, in loads
    return TimestampSigner(key, salt=salt).unsign_object(s, serializer=serializer, max_age=max_age)
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/core/signing.py", line 198, in unsign_object
    return serializer().loads(data)
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/core/signing.py", line 90, in loads
    return json.loads(data.decode('latin-1'))
  File "/usr/lib/python3.8/json/__init__.py", line 357, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.8/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.8/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "manage.py", line 75, in <module>
    execute_from_command_line(sys.argv)
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
    utility.execute()
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/core/management/__init__.py", line 413, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/core/management/base.py", line 354, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/core/management/base.py", line 398, in execute
    output = self.handle(*args, **options)
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/contrib/sessions/management/commands/clearsessions.py", line 16, in handle
    engine.SessionStore.clear_expired()
  File "/home/kevin/code/omero-web/omeroweb/filesessionstore.py", line 230, in clear_expired
    session.load()
  File "/home/kevin/code/omero-web/omeroweb/filesessionstore.py", line 101, in load
    session_data = self.decode(file_data)
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 134, in decode
    return self._legacy_decode(session_data)
  File "/home/kevin/omero/omerovenv/lib/python3.8/site-packages/django/contrib/sessions/backends/base.py", line 144, in _legacy_decode
    encoded_data = base64.b64decode(session_data.encode('ascii'))
  File "/usr/lib/python3.8/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Incorrect padding

I'm on Ubuntu 20.04 and have Django==3.2.16. I'll look into it some more. The filesessionstore seems to work with no session serializer specified or the pickle serializer explicitly specified. I'm going to try to find the root cause.

@kkoz
Copy link
Contributor

kkoz commented Jan 20, 2023

It looks like my issue was stemming from existing sessions in the filesessionstore directory which were pickled. The JSONSerializer attempts to decode using json.loads(data.decode('latin-1')), whose character set is insufficient for pickled data. We might want to mention in the documentation/release notes somewhere that the existing sessions must be cleared from the store before switching from the PickleSerializer to the JSONSerializer.

@kkoz
Copy link
Contributor

kkoz commented Jan 20, 2023

Tested the following:

  • omero.web.session_engine=omeroweb.filesessionstore and omero.web.session_serializer=django_redis.serializers.json.JSONSerializer
    • Normal user login: success
    • Public user login: success
    • bsession: success
  • omero.web.session_engine=omeroweb.filesessionstore and omero.web.session_serializer=django.contrib.sessions.serializers.PickleSerializer
    • Normal user login: success
    • Public user login: success
    • bsession: success
  • omero.web.session_engine=django.contrib.sessions.backends.cache and omero.web.session_serializer=django.contrib.sessions.serializers.PickleSerializer
    • Normal user login: success
    • Public user login: success
    • bsession: success
  • omero.web.session_engine=django.contrib.sessions.backends.cache and omero.web.session_serializer=django_redis.serializers.json.JSONSerializer Works, but the sessions are still serialized with the PickleSerializer.

@chris-allan
Copy link
Member Author

Thanks, @kkoz. I had similar binascii.Error: Incorrect padding issues with I chalked up to switches I was doing in the Django version separately but it looks like we might need to do a little more. On startup, OMERO.web runs the Django clearsessions 1 management command but unfortunately this tries to hydrate all the current sessions to clear the expired ones when using the file session store.

We have our own version of django/contrib/sessions/backends/file.py (omeroweb/filesessionstore.py) so this special handling could be done in the clear_expired() implementation there. For django/contrib/sessions/backends/cache.py clear_expired() is a no-op.

Footnotes

  1. https://docs.djangoproject.com/en/4.1/ref/django-admin/#django-admin-clearsessions

@kkoz
Copy link
Contributor

kkoz commented Feb 6, 2023

I re-tested using omero.web.session_engine=django.contrib.sessions.backends.cache and omero.web.session_serializer=django_redis.serializers.json.JSONSerializer And successfully got JSON sessions. I must've had some issue with my omero config or something. My testing shows everything is working as described.

"django.contrib.sessions.serializers.PickleSerializer",
str,
(
"You can use the this setting to customize the session "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"You can use the this setting to customize the session "
"You can use this setting to customize the session "

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants