[ujson] Support for serializing arbitrary types #130

wants to merge 1 commit into from

5 participants

  • Previously, any Python object that wasn't a dict, int, string, etc. was treated as a dict/object. The library tried to call the object's toDict function to obtain its dict value. If no toDict function was found, the object was serialized as an empty JSON object ({}).
  • This change adds a fallback that calls the object's __unicode__ function. If an object defines __unicode__, the library should serialize the object as a UTF-8 string.
  • Added unit tests for regular and error handling cases
  • Ran guppy in tests/ and confirmed there are no memory leaks
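
The lookup order described in the bullets above can be sketched in plain Python. This is only an illustration built on the standard library's json module and a hypothetical `fallback` helper; it is not ujson's C implementation. It assumes Python 3, where `__unicode__` is an ordinary method name rather than a special method.

```python
import json


def fallback(obj):
    """Hypothetical hook mirroring the order: toDict, then __unicode__, then {}."""
    if hasattr(obj, "toDict"):
        return obj.toDict()
    if hasattr(obj, "__unicode__"):
        text = obj.__unicode__()
        # The patch's tests expect a TypeError for non-string results.
        if not isinstance(text, str):
            raise TypeError("__unicode__ must return a string")
        return text
    # Neither hook is defined: keep the previous behaviour.
    return {}


class Plain(object):
    pass


class WithDict(object):
    def toDict(self):
        return {"key": 31337}

    def __unicode__(self):
        return "not used, toDict wins"


class WithText(object):
    def __unicode__(self):
        return "this is the correct output"


plain_json = json.dumps(Plain(), default=fallback)
dict_json = json.dumps(WithDict(), default=fallback)
text_json = json.dumps({"key": WithText()}, default=fallback)
```

Here `json.dumps` only calls `fallback` for objects it cannot serialize itself, which roughly matches the position of the toDict/__unicode__ branch at the end of ujson's type dispatch.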

Thanks for this pull request. Is there a more specific use case where this would be useful?

@jskorpan closed this

Hi @jskorpan,

Sure, no problem. This pull request allows developers to serialize an arbitrary object to a string instead of an empty JSON object (currently the default). Developers often use classes/objects to wrap a string so they can add functionality, while the representation of the object is still a string.

One example is the Django lazy translation object: it wraps a string and represents the translated value of that string. Currently, the serialized value of a lazy translation instance is an empty JSON object ('{}'). To me, this looks like the library is silently failing to compute a JSON value for the object, which seems broken (especially since most objects don't have a toDict function).
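
To make the use case concrete, here is a small sketch of a string-wrapping object. `LazyText` and `encode_text_like` are hypothetical names invented for illustration; this is not Django's lazy translation class, and the standard library's json module stands in for ujson. The point is that the text is resolved only at serialization time.

```python
import json


class LazyText(object):
    """Hypothetical stand-in for a lazy translation: wraps a callable yielding text."""

    def __init__(self, func):
        self._func = func

    def __unicode__(self):
        # The wrapped value is computed only when it is asked for.
        return self._func()


def encode_text_like(obj):
    """Hypothetical hook: serialize objects that define __unicode__ as strings."""
    if hasattr(obj, "__unicode__"):
        return obj.__unicode__()
    raise TypeError("%r is not JSON serializable" % (obj,))


catalog = {"greeting": "Hello"}
label = LazyText(lambda: catalog["greeting"])

# Changing the catalog after the wrapper is created still affects the output.
catalog["greeting"] = "Bonjour"
encoded = json.dumps({"label": label}, default=encode_text_like)
```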

The long-term solution is to eventually support custom encoders, as #124 suggests; however, it seems that the feature isn't being worked on currently and is far from completion (please correct me if I'm wrong).

Anyway, the pull request adds functionality that uses an object's __unicode__ function to serialize the object to JSON if that function is defined. It supplements the toDict lookup that is already implemented in ujson. This solution allows developers to specify string values for their objects instead of just dict values, which is actually a pretty common use case (at least, it seems so from the issues I've looked through). If you have any other questions, let me know.


@jskorpan reopened this
@mvismonte referenced this pull request from a commit
Jonas Tarnstrom Minor performance optimization 510a8fc
@mvismonte mvismonte [ujson] Added support for seralizing objects to JSON
- This is accomplished by calling the __unicode__ function on an object.
- Added unit tests to test this change.  It also seems that this should not be
  leaking memory according to the test in tests/

@jskorpan: I just rebased and all unit tests are passing now. Thanks for reverting that last commit!


Should this be merged?


@jskorpan I need this for the Encoder/Decoder API (see #124).


This would be a huge help. I would really like to use ujson, but I need dates to be returned in .isoformat()
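
For comparison, the standard library's json module already covers the date case through its `default` hook. The sketch below shows that approach; `encode_default` is a hypothetical helper name, and nothing here is a ujson API.

```python
import datetime
import json


def encode_default(obj):
    """Hypothetical hook: render dates and datetimes as ISO 8601 strings."""
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    raise TypeError("%r is not JSON serializable" % (obj,))


payload = {"created": datetime.datetime(2014, 4, 17, 12, 30, 0)}
encoded = json.dumps(payload, default=encode_default)
# encoded is '{"created": "2014-04-17T12:30:00"}'
```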


Hm, that is interesting, but I would need something slightly different: a way for the object to return raw JSON. I have situations where I already have JSON from somewhere else and would like to include it directly, instead of having to deserialize it first just to serialize it again.
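
One way to approximate raw-JSON embedding with the standard library is a placeholder substitution. The sketch below is a workaround under stated assumptions, not something ujson or this pull request provides: `RawJSON` and `dumps_with_raw` are hypothetical names, and the approach assumes the random placeholder tokens never occur in the real data.

```python
import json
import uuid


class RawJSON(object):
    """Hypothetical wrapper marking a string as already-encoded JSON."""

    def __init__(self, text):
        self.text = text


def dumps_with_raw(value):
    """Serialize value, splicing RawJSON fragments in without re-parsing them."""
    fragments = {}

    def default(obj):
        if isinstance(obj, RawJSON):
            # Emit a unique placeholder string and remember the fragment.
            token = "raw-" + uuid.uuid4().hex
            fragments[token] = obj.text
            return token
        raise TypeError("%r is not JSON serializable" % (obj,))

    encoded = json.dumps(value, default=default)
    for token, text in fragments.items():
        # The placeholder appears as a quoted JSON string in the output.
        encoded = encoded.replace(json.dumps(token), text)
    return encoded


result = dumps_with_raw({"cached": RawJSON('{"b": [1, 2]}'), "count": 1})
```

The fragment is inserted verbatim, so the caller is responsible for making sure it is valid JSON.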

Commits on Apr 17, 2014
  1. @mvismonte

    [ujson] Added support for seralizing objects to JSON

    mvismonte authored
    - This is accomplished by calling the __unicode__ function on an object.
    - Added unit tests to test this change.  It also seems that this should not be
      leaking memory according to the test in tests/
Showing with 75 additions and 5 deletions.
  1. +30 −4 python/objToJSON.c
  2. +45 −1 tests/
34 python/objToJSON.c
@@ -67,6 +67,7 @@ typedef struct __TypeContext
PyObject *iterator;
JSINT64 longValue;
+ PyObject *unicodeValue;
} TypeContext;
#define GET_TC(__ptrtc) ((TypeContext *)((__ptrtc)->prv))
@@ -143,6 +144,14 @@ static void *PyUnicodeToUTF8(JSOBJ _obj, JSONTypeContext *tc, void *outValue, si
return PyString_AS_STRING(newObj);
+static void *PyObjToUTF8(JSOBJ _obj, JSONTypeContext *tc, void *outValue, size_t *_outLen)
+ PyObject *obj = GET_TC(tc)->unicodeValue;
+ void *retValue = PyUnicodeToUTF8(obj, tc, outValue, _outLen);
+ Py_DECREF(obj);
+ return retValue;
static void *PyDateTimeToINT64(JSOBJ _obj, JSONTypeContext *tc, void *outValue, size_t *_outLen)
PyObject *obj = (PyObject *) _obj;
@@ -482,7 +491,7 @@ void SetupDictIter(PyObject *dictObj, TypeContext *pc)
void Object_beginTypeContext (JSOBJ _obj, JSONTypeContext *tc)
- PyObject *obj, *exc, *toDictFunc, *iter;
+ PyObject *obj, *exc, *iter;
TypeContext *pc;
if (!_obj) {
@@ -509,6 +518,7 @@ void Object_beginTypeContext (JSOBJ _obj, JSONTypeContext *tc)
pc->index = 0;
pc->size = 0;
pc->longValue = 0;
+ pc->unicodeValue = NULL;
if (PyIter_Check(obj))
@@ -646,10 +656,9 @@ void Object_beginTypeContext (JSOBJ _obj, JSONTypeContext *tc)
- toDictFunc = PyObject_GetAttrString(obj, "toDict");
- if (toDictFunc)
+ if (PyObject_HasAttrString(obj, "toDict"))
+ PyObject* toDictFunc = PyObject_GetAttrString(obj, "toDict");
PyObject* tuple = PyTuple_New(0);
PyObject* toDictResult = PyObject_Call(toDictFunc, tuple, NULL);
@@ -673,6 +682,23 @@ void Object_beginTypeContext (JSOBJ _obj, JSONTypeContext *tc)
tc->type = JT_OBJECT;
SetupDictIter(toDictResult, pc);
+ } else if (PyObject_HasAttrString(obj, "__unicode__")) {
+ // Since __unicode__ will fall back to calling __repr__ if the object
+ // doesn't define it, we don't want to use it as the serialized
+ // value of the object unless the object explicitly defines it.
+ PyObject* unicodeResult = PyObject_Unicode(obj);
+ // If calling unicode throws an exception, we'll let python raise it.
+ if (unicodeResult == NULL)
+ {
+ goto INVALID;
+ }
+ // Set the JSON type and store the unicode result temporarily.
+ pc->PyTypeToJSON = PyObjToUTF8;
+ tc->type = JT_UTF8;
+ GET_TC(tc)->unicodeValue = unicodeResult;
+ return;
46 tests/
@@ -795,18 +795,62 @@ def test_decodeBigEscape(self):
input = quote + (base * 1024 * 1024 * 2) + quote
output = ujson.decode(input)
+ def test_object_default(self):
+ # An object without toDict or __unicode__ defined should be serialized
+ # as an empty dict.
+ class ObjectTest:
+ pass
+ output = ujson.encode(ObjectTest())
+ dec = ujson.decode(output)
+ self.assertEquals(dec, {})
def test_toDict(self):
d = {u"key": 31337}
class DictTest:
def toDict(self):
return d
+ def __unicode__(self):
+ return 'unicode defined' # Fallback and shouldn't be called.
o = DictTest()
output = ujson.encode(o)
dec = ujson.decode(output)
self.assertEquals(dec, d)
+ def test_object_with_unicode(self):
+ # If __unicode__ returns a unicode string, then that string
+ # will be the serialized value of the object.
+ output_text = 'this is the correct output'
+ class UnicodeTest:
+ def __unicode__(self):
+ return output_text
+ d = {u'key': UnicodeTest()}
+ output = ujson.encode(d)
+ dec = ujson.decode(output)
+ self.assertEquals(dec, {u'key': output_text})
+ def test_object_with_unicode_type_error(self):
+ # __unicode__ must return a string, otherwise it should raise an error.
+ for return_value in (None, 1234, 12.34, True):
+ class UnicodeTest:
+ def __unicode__(self):
+ return return_value
+ d = {u'key': UnicodeTest()}
+ self.assertRaises(TypeError, ujson.encode, d)
+ def test_object_with_unicode_attribute_error(self):
+ # If __unicode__ raises an error, make sure python actually raises it.
+ class UnicodeTest:
+ def __unicode__(self):
+ raise AttributeError
+ d = {u'key': UnicodeTest()}
+ self.assertRaises(AttributeError, ujson.encode, d)
def test_decodeArrayTrailingCommaFail(self):
input = "[31337,]"
@@ -1062,4 +1106,4 @@ def test_decodeStringUTF8(self):
heap = hp.heapu()
print heap