Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

json C vs pure-python implementation difference #59091

Open
socketpair mannequin opened this issue May 23, 2012 · 10 comments
Open

json C vs pure-python implementation difference #59091

socketpair mannequin opened this issue May 23, 2012 · 10 comments
Labels
3.7 (EOL) end of life 3.8 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@socketpair
Copy link
Mannequin

socketpair mannequin commented May 23, 2012

BPO 14886
Nosy @pitrou, @scoder, @ezio-melotti, @socketpair, @matrixise

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2012-05-23.09:25:29.193>
labels = ['3.7', '3.8', 'type-bug', 'library']
title = 'json C vs pure-python implementation difference'
updated_at = <Date 2018-08-17.12:02:57.553>
user = 'https://github.com/socketpair'

bugs.python.org fields:

activity = <Date 2018-08-17.12:02:57.553>
actor = 'scoder'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2012-05-23.09:25:29.193>
creator = 'socketpair'
dependencies = []
files = []
hgrepos = []
issue_num = 14886
keywords = []
message_count = 10.0
messages = ['161395', '161399', '161474', '161496', '161536', '161538', '161540', '171409', '323649', '323654']
nosy_count = 8.0
nosy_names = ['pitrou', 'scoder', 'thomaslee', 'ezio.melotti', 'cvrebert', 'socketpair', 'alexey-smirnov', 'matrixise']
pr_nums = []
priority = 'normal'
resolution = None
stage = 'needs patch'
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue14886'
versions = ['Python 3.7', 'Python 3.8']

@socketpair
Copy link
Mannequin Author

socketpair mannequin commented May 23, 2012

Pure-python implementation:
if isinstance(o, (list, tuple)):

C implementation:
if (PyList_Check(obj) || PyTuple_Check(obj))

This make real difference (!) in my code.

So, please change pure-python implementation to:
if type(o) in (list, tuple):
Or, fix C implementation to: /* intentionally forgot (only for this example) to check if return value is -1 */
if (PyObject_IsInstance(obj, PyList_Type) || PyObject_IsInstance(obj, PyTuple_Type)

@socketpair socketpair mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels May 23, 2012
@pitrou
Copy link
Member

pitrou commented May 23, 2012

What difference does it make? Are you using __instancecheck__ perhaps?

@socketpair
Copy link
Mannequin Author

socketpair mannequin commented May 24, 2012

#!/usr/bin/python2.7

import json

class pseudo_list(object):
    __class__ = list # fake isinstance

    def __init__(self, iterator):
        self._saved_iterator = iterator

    def __iter__(self):
        return self._saved_iterator

class myenc(json.JSONEncoder):
    def default(self, o):
        try:
            return pseudo_list(iter(o))
        except TypeError:
            return super(myenc, self).default(o)

# works (pure-python implementation)
print json.dumps({1:xrange(10), 2:[5,6,7,8]}, cls=myenc, indent=1)

# does not work (C implementation)
print json.dumps({1:xrange(10), 2:[5,6,7,8]}, cls=myenc, indent=None)

@pitrou
Copy link
Member

pitrou commented May 24, 2012

class pseudo_list(object):
__class__ = list # fake isinstance

Why not inherit from list directly?
Setting __class__ to something else isn't widely supported in the Python code base. It may work or may not work, depending on the API, but it's not something we design or test for.

@socketpair
Copy link
Mannequin Author

socketpair mannequin commented May 25, 2012

Well, __class_ = list is my problem, but python's problem is that it uses different approaches in C and python implementation.

P.S.
I don't want to subclass list, as I don't want things like this:
x = pseudo_list(iter(xrange(10))
x.append('test')
print len(x)

@pitrou
Copy link
Member

pitrou commented May 25, 2012

Well, __class_ = list is my problem, but python's problem is that it
uses different approaches in C and python implementation.

Well, by construction a C accelerator will use the fastest method
available within what the API's specification allows. The json API
doesn't specify whether isinstance() or a more concrete type check is
used when dispatching over argument types, so I'd classify this as an
implementation detail.

@socketpair
Copy link
Mannequin Author

socketpair mannequin commented May 25, 2012

Inconsistency is bother me. If I specify indent in dumps(), I will have one semantics, else other ones.

Why not to fix pure-python implementation using "type(o) in (list, tuple)" ? This is faster too (as I think).

@thomaslee
Copy link
Mannequin

thomaslee mannequin commented Sep 28, 2012

FWIW, I think Mark's right here. I'm +1 on the implementations being consistent.

Seems like a potentially nasty surprise if you move from one implementation to the other and, lacking awareness of this quirk, design your algorithm around semantics. I think this was Mark's original point.

If the json API doesn't care how the type check is performed, then we get a (probably very small :)) win from the type(o) in (list, tuple) for the Python impl in addition to bringing consistency to the two implementations.

@matrixise
Copy link
Member

We have received a notification about this bug for 3.5

@matrixise matrixise added 3.7 (EOL) end of life 3.8 (EOL) end of life labels Aug 17, 2018
@scoder
Copy link
Contributor

scoder commented Aug 17, 2018

FWIW, the C implementation of the sequence encoder uses PySequence_Fast(), so adding a lower priority instance check that calls the same encoding function would solve this.

s_fast = PySequence_Fast(seq, "_iterencode_list needs a sequence");

Probably not something to fix in Py3.5/6 anymore, though.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.7 (EOL) end of life 3.8 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
Projects
Status: No status
Development

No branches or pull requests

3 participants