Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

json.dumps to check for obj.__json__ before raising TypeError #71549

Open
DanielWard mannequin opened this issue Jun 21, 2016 · 11 comments
Open

json.dumps to check for obj.__json__ before raising TypeError #71549

DanielWard mannequin opened this issue Jun 21, 2016 · 11 comments
Assignees
Labels
3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@DanielWard
Copy link
Mannequin

DanielWard mannequin commented Jun 21, 2016

BPO 27362
Nosy @rhettinger, @etrepum, @stevendaprano, @bitdancer, @berkerpeksag, @serhiy-storchaka, @Vgr255
Files
  • json-customize.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/etrepum'
    closed_at = None
    created_at = <Date 2016-06-21.13:09:18.565>
    labels = ['3.7', 'type-feature', 'library']
    title = 'json.dumps to check for obj.__json__ before raising TypeError'
    updated_at = <Date 2018-02-01.16:33:45.801>
    user = 'https://bugs.python.org/DanielWard'

    bugs.python.org fields:

    activity = <Date 2018-02-01.16:33:45.801>
    actor = 'ulope'
    assignee = 'bob.ippolito'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2016-06-21.13:09:18.565>
    creator = 'Daniel Ward'
    dependencies = []
    files = ['46701']
    hgrepos = []
    issue_num = 27362
    keywords = ['patch']
    message_count = 11.0
    messages = ['268991', '268993', '268995', '268999', '269002', '269003', '269021', '269091', '289001', '289006', '289007']
    nosy_count = 10.0
    nosy_names = ['rhettinger', 'bob.ippolito', 'steven.daprano', 'r.david.murray', 'berker.peksag', 'serhiy.storchaka', 'abarry', 'ulope', 'Daniel Ward', 'Ollie Ford']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = 'needs patch'
    status = 'open'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue27362'
    versions = ['Python 3.7']

    @DanielWard
    Copy link
    Mannequin Author

    DanielWard mannequin commented Jun 21, 2016

    To help prevent retrospective JSONEncoder overrides when failing to serialize a given object, the intention of this issue is to propose that the JSON encoder checks if a given object has a __json__ attribute, using that rather than raising a TypeError.

    This will help in maintaining easier-to-follow code and keeps the responsibility of determining how an object should be represented in JSON objects firmly within the object itself.

    The obj.__json__ callable/attribute should behave in the same way as __repr__ or __str__, for example.

    I'm happy to look in to contributing this enhancement myself if that's preferred. Any pointers as to how I go about contributing are greatly appreciated.

    @DanielWard DanielWard mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jun 21, 2016
    @Vgr255
    Copy link
    Mannequin

    Vgr255 mannequin commented Jun 21, 2016

    I'm not too familiar with the json package, but what should __json__ return when called?

    @DanielWard
    Copy link
    Mannequin Author

    DanielWard mannequin commented Jun 21, 2016

    Sure, so for example:

    =========

    import json
    
    
    class ObjectCounter:
    
        def __init__(self, name, count):
            self.name = name
            self.count = count
    
        def __json__(self):
           return '[{name}] {count}'.format(name=self.name, count=self.count)
    
    
    object_counter = ObjectCounter('DC1', 3789)
    my_json_string = json.dumps({'success': True, 'counter': object_counter})

    ============

    In the above example, the value stored in my_json_string would be:

    '{"success": true, "counter": "[DC1] 3789"}'

    This is an untested and quick example, but I hope it explains what I'm aiming to achieve. Without the __json__ method, the json.dumps call would raise an exception along the lines of the below message, unless we create a new JSONEncoder object and call json.dumps(..., cls=MyJSONEncoder), which becomes difficult to manage and follow on larger projects.

    TypeError: <ObjectCounter instance at XXX> is not JSON serializable

    @Vgr255
    Copy link
    Mannequin

    Vgr255 mannequin commented Jun 21, 2016

    So __json__ returns a string meant to be serializable. I'm not too keen on using a dunder name (although my word doesn't weigh anything ;) and I'd personally prefer something like as_json_string(). I think the idea in general is good, though. Mind submitting a patch?

    @stevendaprano
    Copy link
    Member

    For starters, dunder names like __json__ are reserved for Python's own use, so you would have to get the core developers to officially bless this use.

    But... I'm not really sure that "the responsibility of determining how an object should be represented in JSON objects firmly within the object itself" is a good idea. For a general purpose protocol, I don't think you can trust any object to return valid JSON. What if my object.__json__ returned "}key='c" or some other invalid string? Whose responsibility is it to check that __json__ returns valid JSON?

    I don't think there is any need to make this an official protocol. You know your own objects, you know if you can trust them, and you can call any method you like. So your example becomes:

        my_json_string = json.dumps(
            {'success': True, 'counter': object_counter.to_json()})

    which is okay because that's clearly *your* responsibility to make sure that your object's to_json method returns a valid string. If you make it an official language wide protocol, it's unclear whose responsibility it is: the object (dangerous!), the caller (difficult), the Python interpreter (unlikely), json.dumps (unlikely).

    @DanielWard
    Copy link
    Mannequin Author

    DanielWard mannequin commented Jun 21, 2016

    I don't think I explained the response very well, effectively the __json__ call would return an object which is JSON-serializable. This would include dict objects containing JSON-serializable objects albeit natively-supporting JSON serialisation or by means of subsequent obj.__json__ calls.

    The reason I gave it __json__ is purely for easily-remembered implementation, separating it out from calls which may potentially clash with existing codebases, because let's face it, people don't often get to start again ;)

    I'm not adverse to changing the method name at all, but I do believe this is a progressive way to go regarding JSON-serialization.

    @berkerpeksag
    Copy link
    Member

    This was discussed on python-ideas before:

    I don't think there was an agreement on the idea so I suggest to send your proposal to python-ideas first.

    @bitdancer
    Copy link
    Member

    Pretty much any project that makes non-trivial use of json ends up implementing a jsonification protocol, usually by creating either a __json__ method or (more commonly, I think) a to_json method.

    But, yeah, this is python-ideas material and would get into the stdlib only as an officially blessed protocol, in which case using __json__ would make sense. So I'm going to close the issue pending a consensus on python-ideas. If it gets accepted the issue can be reopened.

    @serhiy-storchaka
    Copy link
    Member

    This could fix other issues:

    Currently the blessed way of JSON encoder customization is to implement the default method in JSONEncoder subclass or pass the default argument to dump(). But that requires changing every JSON serialization call and handling all non-standard types in one function. I think it would be handly to pick the type-specific serialization function from: 1) per-encoder dispatch table, 2) global dispatch type (registry), 3) __json__ method. This can be done after the default function fails or be included in the default default method.

    This will add JSON support of standard library types (e.g. collections other than list, tuple and dict, numbers other than int and float) and will help to implement task specific serialization of user classes.

    @serhiy-storchaka serhiy-storchaka added the 3.7 (EOL) end of life label Mar 5, 2017
    @rhettinger
    Copy link
    Contributor

    I concur with David Murray that this should be kicked around on python-dev or python-ideas first. Also, we should ask Bob Ippolito for his thoughts.

    @serhiy-storchaka
    Copy link
    Member

    This feature already was proposed for simplejson (simplejson/simplejson#52). Special __json__ method is used in wild in a number of projects for exactly this purpose. It looks to me the main disagreement in the past Python-Idea discussion (https://mail.python.org/pipermail/python-ideas/2010-July/007811.html) was about whether implement the customization as a special method or as a registry. I suggest to implement both. Special methods are good for standard collection and numeric classes, global registry is good for application-wide serialization, local dispatch table or the default method are good for more specific task-specific customization.

    Here is a draft implementation. It follows the design of pickle and copy modules.

    There are few design questions.

    1. What is the order of using different customization methods? Should registries and __json__ be checked before calling the default method, after calling the default method (if it fails), or inside the default implementation of the default method?

    2. For Decimal we need to customize raw JSON representation. In the past it was possible to implement an intermediate float or int subclass with __str__ or __repr__ returning raw JSON representation. But this hack no longer works. Needed to add explicit support of special JSON representation objects. Other way -- add yet one special method (raw_json or __json_str__).

    3. Do we need the json.registry() function for global registration, or it is enough to expose the json.dispatch_table mapping?

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life stdlib Python modules in the Lib dir type-feature A feature request or enhancement
    Projects
    Status: No status
    Development

    No branches or pull requests

    5 participants