Introduce custom serializers to ActiveJob arguments #30941

Merged
merged 15 commits into from Feb 14, 2018

8 participants
@EPecherkin
Contributor

EPecherkin commented Oct 20, 2017

Summary

This PR reworks how ActiveJob serializes arguments.
It adds the ability to define custom serializers for almost any object. A developer only needs to implement a simple interface.

class MySpecialSerializer
  class << self
    # Check if this object should be serialized using this serializer
    def serialize?(object)
      object.is_a? MySpecialValueObject
    end

    # Convert an object to a simpler representation using supported object types.
    # The recommended representation is a Hash with a specific key. Keys can be of basic types only.
    def serialize(object)
      {
        key => ActiveJob::Serializers.serialize(object.value),
        'another_attribute' => ActiveJob::Serializers.serialize(object.another_attribute)
      }
    end

    # Check if this serialized value can be deserialized using this serializer
    def deserialize?(object)
      object.is_a?(Hash) && object.keys == [key, 'another_attribute']
    end

    # Convert serialized value into a proper object
    def deserialize(object)
      value = ActiveJob::Serializers.deserialize(object[key])
      another_attribute = ActiveJob::Serializers.deserialize(object['another_attribute'])
      MySpecialValueObject.new value, another_attribute
    end

    # Define this method if you are using a hash as the representation.
    # This key will be added to the list of restricted keys for hashes. Use basic types only.
    def key
      "_aj_custom_my_special_value_object"
    end
  end
end

Then add this serializer to the list:

ActiveJob::Base.add_serializers(MySpecialSerializer)

Testing

  1. Clone the repo
  2. Go to the activejob folder
  3. Download test.rb and place it there
  4. Launch irb -r ./lib/active_job.rb -r ./test.rb in a terminal
  5. The basic test is deserialize(serialize(ARGUMENT)) == ARGUMENT. ARGUMENT contains all possible objects for serialization, but you can experiment as you wish
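
The round-trip check in step 5 can be illustrated without loading ActiveJob. The following is a minimal plain-Ruby sketch, assuming a stand-in value object (`MySpecialValueObject` as a `Struct`) and a simplified serializer that stores the attributes directly instead of calling `ActiveJob::Serializers`. It only mirrors the shape of the interface described above and is not the PR's code.

```ruby
# Stand-in value object; hypothetical, used only for this illustration.
MySpecialValueObject = Struct.new(:value, :another_attribute)

# Simplified serializer following the interface from the PR description.
# It stores attributes as-is rather than delegating to ActiveJob::Serializers.
module MySpecialSerializer
  KEY = "_aj_custom_my_special_value_object"

  def self.serialize?(object)
    object.is_a?(MySpecialValueObject)
  end

  def self.serialize(object)
    { KEY => object.value, "another_attribute" => object.another_attribute }
  end

  def self.deserialize?(object)
    object.is_a?(Hash) && object.keys == [KEY, "another_attribute"]
  end

  def self.deserialize(hash)
    MySpecialValueObject.new(hash[KEY], hash["another_attribute"])
  end
end

# Round trip: deserialize(serialize(ARGUMENT)) should equal ARGUMENT.
argument = MySpecialValueObject.new(42, "note")
payload = MySpecialSerializer.serialize(argument)
round_tripped = MySpecialSerializer.deserialize(payload)
puts round_tripped == argument
```

Struct instances compare by member values, which is what makes the equality check meaningful here; a real value object would need its own `==`.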
@rails-bot

rails-bot commented Oct 20, 2017

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @kamipo (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@rafaelfranca

Member

rafaelfranca commented Oct 20, 2017

It is by design that we only serialize a small set of objects. I'm fine with allowing custom serializers to be defined, but this PR adds more default serializers than we had before. Could you keep only the current types we have?

@rafaelfranca rafaelfranca requested a review from matthewd Oct 20, 2017

@rafaelfranca rafaelfranca assigned rafaelfranca and unassigned kamipo Oct 20, 2017

@mpapis

Contributor

mpapis commented Oct 21, 2017

@rafaelfranca would it be OK if we keep the serializers but remove them from the default list?

@EPecherkin

Contributor

EPecherkin commented Oct 23, 2017

@mpapis It will be confusing I think. What if we create a separate gem with additional serializers?

@EPecherkin

Contributor

EPecherkin commented Oct 24, 2017

@rafaelfranca you can check it

@EPecherkin

Contributor

EPecherkin commented Nov 16, 2017

@kirs

Member

kirs commented Nov 16, 2017

@EPecherkin can you describe a good use case when a Rails app would use a custom serializer?

@mpapis

Contributor

mpapis commented Nov 16, 2017

@kirs In our Rails app we had a lot of boilerplate code to serialize parameters to basic types and then deserialize them in the job. At one point we changed the types, which led to more complicated code and even introduced bugs.

With automated serialization this would be a lot less painful: not only would it prevent bugs, it would also make the code better.

One of the classes was TimeWithZone. We had special code to serialize and deserialize it around every job that used it; with these serializers we define it once and it is handled automatically from that point on. TimeWithZone is just one example. Advanced applications (like ours) define more custom types that we want to pass to jobs without extra serialization/deserialization each time we use them. We even use it to pass ActiveData objects.

@EPecherkin

Contributor

EPecherkin commented Dec 4, 2017

activejob/lib/active_job/base.rb Outdated
@@ -1,6 +1,7 @@
# frozen_string_literal: true
require "active_job/core"
require "active_job/serializers"

@rafaelfranca

rafaelfranca Dec 15, 2017

Member

We can remove this require from here since we have autoload in place.

activejob/lib/active_job/serializers.rb Outdated
end
# :nodoc:
SERIALIZERS = [

@rafaelfranca

rafaelfranca Dec 15, 2017

Member

Instead of defining a private API constant, why don't we just use the add_serializers method here?

activejob/lib/active_job/serializers/base_serializer.rb Outdated
class << self
def serialize?(argument)
argument.is_a?(klass)
end

@rafaelfranca

rafaelfranca Dec 15, 2017

Member

This should implement klass, deserialize?, serialize and deserialize, and have each raise NotImplementedError.
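
One way to read this suggestion is as an abstract base class whose hooks raise until a subclass overrides them. The sketch below is plain Ruby with hypothetical names (`SketchBaseSerializer`, `SketchSymbolSerializer`); it is an illustration of the pattern, not the class that was merged.

```ruby
# Hypothetical abstract base: every hook raises until a subclass overrides it.
class SketchBaseSerializer
  class << self
    def serialize?(argument)
      argument.is_a?(klass)
    end

    def deserialize?(_argument)
      raise NotImplementedError
    end

    def serialize(_argument)
      raise NotImplementedError
    end

    def deserialize(_argument)
      raise NotImplementedError
    end

    private

    # The class this serializer handles; subclasses must define it.
    def klass
      raise NotImplementedError
    end
  end
end

# Hypothetical concrete serializer for Symbol, overriding every hook.
class SketchSymbolSerializer < SketchBaseSerializer
  class << self
    def deserialize?(argument)
      argument.is_a?(Hash) && argument.key?("symbol")
    end

    def serialize(argument)
      { "symbol" => argument.to_s }
    end

    def deserialize(argument)
      argument["symbol"].to_sym
    end

    private

    def klass
      Symbol
    end
  end
end

puts SketchSymbolSerializer.deserialize(SketchSymbolSerializer.serialize(:foo)).inspect
```

Note that `NotImplementedError` descends from `ScriptError`, so a bare `rescue` will not catch it; callers have to rescue it by name.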

activejob/lib/active_job/serializers/object_serializer.rb Outdated
def keys
[key]
end

@rafaelfranca

rafaelfranca Dec 15, 2017

Member

This should implement key and raise NotImplementedError.

activejob/lib/active_job/serializers/base_serializer.rb Outdated
module ActiveJob
module Serializers
class BaseSerializer

@rafaelfranca

rafaelfranca Dec 15, 2017

Member

We should add documentation for this class and also include the example you put in the guides.

activejob/lib/active_job/serializers/object_serializer.rb Outdated
module ActiveJob
module Serializers
class ObjectSerializer < BaseSerializer

@rafaelfranca

rafaelfranca Dec 15, 2017

Member

Missing documentation for this class too

@matthewd

Member

matthewd commented Dec 16, 2017

I like the idea of supporting custom serializers -- I think field use has confirmed that while there are advantages to preferring basic/universal types, it can be a pain to manually transform values on their way in & out.

I don't think our current custom-hash-key-per-serializer model scales very well... it was fine when there was only one, and the two others enhance something that is still fundamentally a hash... but I think we've reached the end of its useful life.

For a full-on registered serializer setup, I think we'd be better off defining a single new reserved key, probably named something like _aj_serialized, and storing some sort of registered serializer name in its value. The serializer then has full control over the remaining content of the hash.

Beyond avoiding occupying an ever-increasing [albeit obscure] part of the possible hash key space, it also means we don't need to try every deserializer in turn: we know exactly which one can handle the value.
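
A rough sketch of that idea, in plain Ruby: each serialized hash carries the serializer's name under a single reserved key, and deserialization becomes one hash lookup rather than a loop over every serializer. The registry, the `Point` type, and `PointSerializer` are hypothetical names chosen for illustration; only the `_aj_serialized` key name comes from the comment above.

```ruby
# Reserved key suggested in the discussion; everything else here is hypothetical.
RESERVED_KEY = "_aj_serialized"

Point = Struct.new(:x, :y)

module PointSerializer
  # Tag the hash with this serializer's name, then add its own content.
  def self.serialize(point)
    { RESERVED_KEY => name, "x" => point.x, "y" => point.y }
  end

  def self.deserialize(hash)
    Point.new(hash["x"], hash["y"])
  end
end

# Name-to-serializer registry, so lookup is a single hash access.
REGISTRY = { PointSerializer.name => PointSerializer }.freeze

def deserialize_hash(hash)
  serializer_name = hash[RESERVED_KEY]
  return hash unless serializer_name # plain hash, no custom serializer involved

  REGISTRY.fetch(serializer_name).deserialize(hash)
end

payload = PointSerializer.serialize(Point.new(1, 2))
puts deserialize_hash(payload).inspect
```

With this layout, only one key is reserved no matter how many serializers are registered, and an unknown serializer name fails loudly through `fetch` instead of silently falling through.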

We should probably retain the existing handling for the current reserved keys, for compatibility across upgrades and with any 3rd party / non-ruby code that's already learned how to handle them specially.

Overall I think I'm suggesting that we keep the current case/when block for the "intrinsic" types, and thus focus the new Serializer API only on the hash-transformation needed for new custom-type handlers.

As for adding new serializers by default, I think there are some that are worthwhile: symbol and duration as you previously had, and also Date, Time, DateTime, TimeWithZone.

@rafaelfranca

@matthewd

Member

matthewd commented Feb 10, 2018

@rafaelfranca that looks great!

I'm still not sure about the serializers.detect bit... seems like we could explicitly handle the simple cases with our existing case/when, and then use a hash lookup to find the right custom handler. All that looping feels like it could really slow down de/serialization of complex structures.

@rafaelfranca

Member

rafaelfranca commented Feb 12, 2018

Yeah, good point. I'll revert the changes to keep the old behavior as the case statement and only when the value is a Hash I'll use the new behavior.

@rafaelfranca

Member

rafaelfranca commented Feb 12, 2018

Updated the PR with the new code.

I was going to remove the detect from the serialize method as I did in the deserialize method, but having a direct mapping between the object class and the serializers to use a hash lookup removed the possibility of defining serializers for a superclass and reusing them in all subclasses.

@matthewd

Member

matthewd commented Feb 13, 2018

😍

I was going to remove the detect from the serialize method as I did in the deserialize method, but having a direct mapping between the object class and the serializers to use a hash lookup removed the possibility of defining serializers for a superclass and reusing them in all subclasses.

We could use a search over the to-be-serialized object's ancestors instead of a search over the serializers... I'm not sure whether that would be better. 🤷🏻‍♂️


I note your last change has restored the ability to deserialize a hash that has no special keys, which had [by my reading?] gone away inside HashSerializer. If I'm right about that, is it worth adding a test for that case?

@rafaelfranca

Member

rafaelfranca commented Feb 13, 2018

We could use a search over the to-be-serialized object's ancestors instead of a search over the serializers... I'm not sure whether that would be better. 🤷🏻‍♂️

Yeah, I feel it would be worse if the ancestor chain is big, and harder to optimize. When searching the serializers, we can change the order of the array and put the most used first.
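
To make the trade-off in this exchange concrete, here is a small plain-Ruby comparison of the three lookup strategies mentioned: a `detect` over the serializer list, an exact-class hash lookup, and a walk over the object's ancestors. All names (`Shape`, `Circle`, `ShapeSerializer`, the `find_by_*` helpers) are hypothetical and exist only for this sketch.

```ruby
# Hypothetical class hierarchy and serializer, for illustration only.
class Shape; end
class Circle < Shape; end

class ShapeSerializer
  def self.serialize?(object)
    object.is_a?(Shape)
  end
end

SERIALIZERS = [ShapeSerializer].freeze
BY_CLASS = { Shape => ShapeSerializer }.freeze

# Strategy 1: ask each serializer in turn. Subclasses match through is_a?.
def find_by_detect(object)
  SERIALIZERS.detect { |serializer| serializer.serialize?(object) }
end

# Strategy 2: exact-class hash lookup. Fast, but a subclass finds nothing.
def find_by_exact_class(object)
  BY_CLASS[object.class]
end

# Strategy 3: walk the object's ancestors and look each one up in the hash.
def find_by_ancestors(object)
  object.class.ancestors.each do |ancestor|
    serializer = BY_CLASS[ancestor]
    return serializer if serializer
  end
  nil
end

circle = Circle.new
puts find_by_detect(circle).inspect
puts find_by_exact_class(circle).inspect
puts find_by_ancestors(circle).inspect
```

The cost of strategy 1 grows with the number of registered serializers, while strategy 3 grows with the depth of the ancestor chain (which includes modules such as `Kernel`), which is the concern raised above.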

@rafaelfranca

Member

rafaelfranca commented Feb 13, 2018

I just added the tests

activejob/lib/active_job/serializers/time_serializer.rb Outdated
module Serializers
class TimeSerializer < ObjectSerializer # :nodoc:
def serialize(time)
super("value" => time.to_s)

@bdewater

bdewater Feb 13, 2018

Contributor

time.iso8601 here and Time.iso8601(hash["value"]) to deserialize?

@rafaelfranca

rafaelfranca Feb 14, 2018

Member

Yeah, it makes sense to use an ISO format.
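
The suggested round trip can be tried with Ruby's standard `time` library. This is a minimal sketch using a plain hash with a "value" key, as in the diff above; it does not involve the PR's TimeSerializer class.

```ruby
require "time" # provides Time#iso8601 and Time.iso8601

time = Time.utc(2018, 2, 14, 12, 30, 45)

# Serialize to an ISO 8601 string instead of Time#to_s.
payload = { "value" => time.iso8601 }

# Deserialize by parsing the same format strictly.
restored = Time.iso8601(payload["value"])

puts payload["value"]
puts restored == time
```

Without an argument, `Time#iso8601` drops fractional seconds, so a time with sub-second precision would need something like `time.iso8601(6)` to survive the round trip unchanged.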

EPecherkin and others added some commits Oct 17, 2017

Simplify the implementation of custom serializers
Right now it is only possible to define serializers globally so we don't
need to use a class attribute in the job class.
Only add one more custom key in the serialized hash
Now custom serializers can register themselves in the serialized hash using
the "_aj_serialized" key that contains the serializer name.

This way we can avoid polluting the hash with many reserved keys.

rafaelfranca added some commits Feb 9, 2018

Simplify the implementation of custom argument serializers
We can speed up things for the supported types by keeping the code in the
way it was.

We can also avoid looping through all serializers in the deserialization by
trying to access the class already in the Hash.

We could also speed up the custom serialization if we define the class
that is going to be serialized when registering the serializers, but
that will remove the possibility of defining a serializer for a
superclass and having the subclass serialized using it.
Add tests to serialize and deserialize individually
This will make it easier to be backwards compatible when changing the
serialization implementation.

@rafaelfranca rafaelfranca merged commit fa9e791 into rails:master Feb 14, 2018

2 checks passed

codeclimate All good!
continuous-integration/travis-ci/pr The Travis CI build passed

rafaelfranca added a commit that referenced this pull request Feb 20, 2018

Merge pull request #32026 from bogdanvlviv/improve-30941
Improve ActiveJob custom argument serializers #30941

albertoalmagro added a commit to albertoalmagro/rails that referenced this pull request Nov 9, 2018

Document missing supported types [ci skip]
This commit adds missing types to the supported types list, which
was extended in rails#30941

albertoalmagro added a commit to albertoalmagro/rails that referenced this pull request Nov 11, 2018

Document missing supported types [ci skip]
This commit adds missing types to the supported types list, which
was extended in rails#30941