[RFC] set_class = OrderedSet by default? #1744
Comments
I'm not opposed to this, so long as there is no perf impact (I think your intuition on this is right).
Or since this is easily achieved in user code thanks to a base schema as shown above, we postpone this to marshmallow 4 as described in my last paragraph. Adding v4 milestone for now.
I believe that is caused by

OrderedDict Benchmark

```python
#!/usr/bin/env python3.9
from datetime import datetime
from collections import OrderedDict


def run(D):
    # Time five million build / insert / delete / update cycles on mapping type D.
    A = [('a', None), ('b', None), ('r', None), ('c', None), ('d', None)]
    B = [('1', None), ('2', None), ('3', None)]
    t0 = datetime.now()
    for i in range(5000000):
        s = D(A)
        s['e'] = None
        del s['e']
        s.update(B)
    t = datetime.now() - t0
    return t


a = run(dict)
b = run(OrderedDict)
m = min(a, b)
# Report each timing and its slowdown relative to the faster of the two.
print(f'dict\t\t{a}\t{100*(a/m-1):+.0f}%')
print(f'OrderedDict\t{b}\t{100*(b/m-1):+.0f}%')
```
OrderedSet Benchmark

```python
#!/usr/bin/env python3.9
from datetime import datetime
from marshmallow.orderedset import OrderedSet


def run(D, i=200000):
    # Time i build / add / remove / intersect cycles on set type D.
    A = ['a', 'b', 'c', 'd']
    B = D(['1', '2', '3'])
    t0 = datetime.now()
    for _ in range(i):
        s = D(A)
        s.add('e')
        s.remove('e')
        s &= B
    return datetime.now() - t0


a = run(set)
b = run(OrderedSet)
m = min(a, b)
# Report each timing and its slowdown relative to the faster of the two.
print(f'set\t\t{a}\t{100*(a/m-1):+.0f}%')
print(f'OrderedSet\t{b}\t{100*(b/m-1):+.0f}%')
```
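For repeated measurements, the same comparison could also be sketched with `time.perf_counter`, a clock intended for timing short durations, instead of `datetime.now()`. This is only a minimal sketch, not the benchmark that produced the numbers discussed in this thread: the `bench` helper, the iteration count, and taking the best of several repeats are assumptions chosen for illustration.

```python
from collections import OrderedDict
from time import perf_counter


def bench(mapping_type, iterations=100_000, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs.

    `bench` is a hypothetical helper. It repeats the build / insert /
    delete / update cycle from the OrderedDict benchmark above.
    """
    A = [('a', None), ('b', None), ('r', None), ('c', None), ('d', None)]
    B = [('1', None), ('2', None), ('3', None)]
    best = float('inf')
    for _ in range(repeats):
        t0 = perf_counter()
        for _ in range(iterations):
            s = mapping_type(A)
            s['e'] = None
            del s['e']
            s.update(B)
        # Keep the fastest run to reduce noise from other processes.
        best = min(best, perf_counter() - t0)
    return best


if __name__ == '__main__':
    results = {'dict': bench(dict), 'OrderedDict': bench(OrderedDict)}
    fastest = min(results.values())
    for name, seconds in results.items():
        print(f'{name:<12}{seconds:.3f}s\t{100 * (seconds / fastest - 1):+.0f}%')
```

Taking the minimum over a few repeats is one common way to damp scheduling noise. The absolute numbers will still depend on the machine and the Python version.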
If an application has schema instantiation in its hot loop I think the vast majority of the execution time is going to be consumed by memory allocation. +1 for dropping (or deprecating + ignoring)
Yes, this is exactly what I meant. Hence my proposal. In 4.0, I'd drop
@lafrech I was intrigued by #1891 (comment), so I took a stab at replacing
@deckar01 Nice. On the short run (marshmallow 3.x) we'll probably be keeping The question of whether we keep vendoring the Later on (marshmallow 4), we can either keep
+1 for this. It becomes a key issue when users generate a local OpenAPI spec file with apispec and the spec output is not deterministic (apiflask/apiflask#373).
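To illustrate why set-backed collections can produce unstable output: the iteration order of a `set` of strings is not guaranteed and can differ between interpreter runs because of string hash randomization, whereas a `dict` keeps insertion order on Python 3.7+. Below is a minimal sketch of order-preserving de-duplication. The helper `unique_in_order` and the field names are hypothetical and are not part of marshmallow or apispec.

```python
def unique_in_order(names):
    """De-duplicate `names` while keeping first-seen order.

    Hypothetical helper: dict keys are unique and, on Python 3.7+,
    iterate in insertion order, so this behaves like a small ordered set.
    """
    return list(dict.fromkeys(names))


declared = ['id', 'name', 'email', 'name', 'created_at']

# A plain set drops duplicates but makes no promise about iteration order.
unordered = set(declared)

# The dict-based variant yields the same order on every run.
ordered = unique_in_order(declared)
print(ordered)  # ['id', 'name', 'email', 'created_at']
```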
I assume the performance impact would be minimal. From a quick look, it would only affect schema instantiation.

The benefit is that when using Python 3.7+, all schemas would be ordered. They still wouldn't return `OrderedDict` instances unless the `ordered` Meta option is passed, but they would keep field declaration order in their output and in their schema list, so that order would be respected in API responses and apispec users would get their fields documented in declaration order.

Currently, users can achieve this in a base schema with

My gut feeling is that this change would have no functional downside and a negligible perf impact at field init (less than we care about when adding a new feature), while using `ordered=True` does have an impact on serialization performance (I didn't measure it) and functionality (equality test depends on order), hence my proposal. There would still be an interest in allowing the use of `OrderedDict`.

If we go this route, we could use `OrderedSet` by default in marshmallow 3.x, and in marshmallow 4 perhaps rework the `ordered` feature, because schemas would already be ordered, so it would be specifically for users really needing an `OrderedDict` instance. Perhaps drop the `ordered` Meta option and let the user specify a dict class of their choice another way. But that's another story.

See marshmallow-code/flask-smorest#228 (comment).
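The functional difference mentioned above (equality depends on order for `OrderedDict` but not for `dict`) can be checked with the standard library alone. Here is a small sketch with made-up keys and values. It does not involve marshmallow itself:

```python
from collections import OrderedDict

a = {'name': 'Ada', 'email': 'ada@example.com'}
b = {'email': 'ada@example.com', 'name': 'Ada'}

# Plain dicts keep insertion order (Python 3.7+) but compare by content only.
assert list(a) == ['name', 'email']
assert a == b

# Two OrderedDicts with the same items in a different order are not equal.
assert OrderedDict(a) != OrderedDict(b)

# Comparing an OrderedDict with a plain dict ignores order again.
assert OrderedDict(a) == b
```

So a schema that keeps declaration order while still returning plain `dict` objects would not change how results compare in existing equality checks, which is the point of separating ordering from the `OrderedDict` return type.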