New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Atoms iterator performance improvement #476
Atoms iterator performance improvement #476
Conversation
Awesome, thanks for the benchmark! I'll review it and merge ASAP. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, merging this! I'll update the returned tuple to match the docs (ie: keep the keys
as a filter of all fields), even thought the only use case is filtering the value for now.
|
||
atom_tuple = Atom(**dict((k, atom_dict.get(k)) for k in keys)) | ||
atom_tuple = Atom(name=field_name, field=field, value=value) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By always setting the keys, we loose the option to not return a field, but since there is no use case defined for that it is not a loss. Compare the output:
Before patch
list(iteration.atoms(schema, data, keys=['value']))
Out[3]:
[Atom(name=None, field=None, value='42'),
Atom(name=None, field=None, value=Undefined),
Atom(name=None, field=None, value=Undefined)]
After patch
list(iteration.atoms(schema, data, keys=['value']))
Out[10]:
[Atom(name='id', field=<IntType() instance as 'id'>, value='42'),
Atom(name='first_name', field=<StringType() instance as 'first_name'>, value=Undefined),
Atom(name='last_name', field=<StringType() instance as 'last_name'>, value=Undefined)]
Just FYI: I updated the development branch code to still keep the before-patch functionality and added test to document it, while keeping the speed improvement. Thanks! |
Awesome! Thanks for merging! I had lost from my view savings of initial At this moment I try to search any another things that can be more faster in Schematics. In my current project Schematics is bottleneck, that deliver clean models for business logic, but also bring slowness at marshalling level. It may be confusing, but, if it is interesting for you as developer, in details, I have customised ModelType that allows to lazy resolving of models at runtime and some limiting logic on export loop to avoid infinite recursion. I think about PR of this ModelType, but I don't have any time at this moment to offer flexible decision about type syntax to avoid setup full path anytime (e.g. P.S. Excuse me for changing this place to discuss. |
@vovanbo It would be great if you could provide test cases for slow performance. Schematics is definitely heavy on looping over nested models and is something we should try to optimize. https://github.com/schematics/schematics-benchmark is where we will be storing the benchmarks we want to keep track of. For the dynamic ModelTypes, I have a feature of |
Try to avoid unnecessary dict conversions in most frequently called method of Schematics.
Simple benchmark is here: https://gist.github.com/vovanbo/dbb3044b417bcd407a4411119aacf415