
Atoms iterator performance improvement #476

Merged
merged 1 commit into from Mar 20, 2017

Conversation


@vovanbo vovanbo commented Mar 17, 2017

This avoids an unnecessary dict conversion in one of the most frequently called methods of Schematics.
Simple benchmark is here: https://gist.github.com/vovanbo/dbb3044b417bcd407a4411119aacf415

@lkraider

Awesome, thanks for the benchmark! I'll review it and merge ASAP.


@lkraider lkraider left a comment


Thanks, merging this! I'll update the returned tuple to match the docs (i.e. keep the keys as a filter of all fields), even though the only use case is filtering the value for now.


-atom_tuple = Atom(**dict((k, atom_dict.get(k)) for k in keys))
+atom_tuple = Atom(name=field_name, field=field, value=value)
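As a rough, self-contained illustration of the change (this is not the linked gist, and plain strings stand in for Schematics field types), the cost difference comes from the per-atom dict allocation and `**` unpacking that the patched line skips:

```python
import timeit
from collections import namedtuple

# Stand-in for schematics.iteration.Atom
Atom = namedtuple('Atom', ('name', 'field', 'value'))

atom_dict = {'name': 'id', 'field': 'IntType', 'value': '42'}
keys = ('name', 'field', 'value')

# Old style: build an intermediate dict, then unpack it into the namedtuple.
via_dict = timeit.timeit(
    lambda: Atom(**dict((k, atom_dict.get(k)) for k in keys)),
    number=100_000)

# New style: construct the namedtuple directly from keyword arguments.
direct = timeit.timeit(
    lambda: Atom(name='id', field='IntType', value='42'),
    number=100_000)

# Both produce equal Atoms; direct construction is typically several
# times faster because it avoids the generator, dict, and unpacking.
```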

By always setting the keys, we lose the option to not return a field, but since there is no use case defined for that, it is not a loss. Compare the output:

Before patch

list(iteration.atoms(schema, data, keys=['value']))
Out[3]:
[Atom(name=None, field=None, value='42'),
 Atom(name=None, field=None, value=Undefined),
 Atom(name=None, field=None, value=Undefined)]

After patch

list(iteration.atoms(schema, data, keys=['value']))
Out[10]:
[Atom(name='id', field=<IntType() instance as 'id'>, value='42'),
 Atom(name='first_name', field=<StringType() instance as 'first_name'>, value=Undefined),
 Atom(name='last_name', field=<StringType() instance as 'last_name'>, value=Undefined)]
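A minimal, self-contained sketch of the compromise behavior described above (keys act as a filter of all fields, unselected slots are set to None, and the per-atom dict is still avoided); the plain `fields` dict and string field types are stand-ins, not the real Schematics schema API:

```python
from collections import namedtuple

# Stand-in for schematics.iteration.Atom
Atom = namedtuple('Atom', ('name', 'field', 'value'))

def atoms(fields, data, keys=('name', 'field', 'value')):
    """Yield one Atom per field, filling only the requested keys.

    Non-selected keys come back as None, preserving the pre-patch
    contract while skipping the per-atom dict construction.
    """
    has_name = 'name' in keys
    has_field = 'field' in keys
    has_value = 'value' in keys
    for field_name, field in fields.items():
        yield Atom(
            name=field_name if has_name else None,
            field=field if has_field else None,
            value=data.get(field_name) if has_value else None,
        )

fields = {'id': 'IntType', 'first_name': 'StringType'}
data = {'id': '42'}
result = list(atoms(fields, data, keys=['value']))
# Each atom carries only the 'value' slot; name and field stay None.
```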

@lkraider lkraider merged commit aebc7b0 into schematics:development Mar 20, 2017

lkraider commented Mar 20, 2017

Just FYI: I updated the development branch code to keep the before-patch functionality and added a test to document it, while keeping the speed improvement. Thanks!


vovanbo commented Mar 20, 2017

Awesome! Thanks for merging!

I had overlooked the preservation of the initial None values, which you covered in the tests. It looks like atoms is more useful than I initially thought. :)

At the moment I am looking for other parts of Schematics that could be made faster. In my current project Schematics is the bottleneck: it delivers clean models for the business logic, but it also brings slowness at the marshalling level.

It may be confusing, but in case it interests you as a developer, here are the details: I have a customised ModelType that allows lazy resolution of models at runtime, plus some limiting logic in the export loop to avoid infinite recursion. I am considering a PR for this ModelType, but right now I don't have time to design a flexible type syntax that avoids specifying the full path every time (e.g. ModelType('SomeModel') instead of ModelType('full.path.to.models.SomeModel')). I guess Schematics doesn't have this functionality out of the box because serialization of circularly wired models often depends on project requirements (and can even be impossible). In my case, marshalling of deeply nested models can take up to 90% of request-handling time, with many, many import/export loop calls.
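A hypothetical, self-contained sketch of the lazy-resolution idea described above; the registry, decorator, and `LazyModelType` names are invented for illustration and are not the Schematics API:

```python
# Registry mapping model names to classes, so a field can be declared as
# LazyModelType('SomeModel') before SomeModel is defined (or in a cycle).
_MODEL_REGISTRY = {}

def register_model(cls):
    """Class decorator that records a model under its class name."""
    _MODEL_REGISTRY[cls.__name__] = cls
    return cls

class LazyModelType:
    """Field type that defers model lookup until first use."""

    def __init__(self, model_spec):
        self._spec = model_spec      # a class, or a name to resolve later
        self._resolved = None

    @property
    def model_class(self):
        # Resolve on first access; circular references work as long as
        # both models are registered before the first import/export loop.
        if self._resolved is None:
            if isinstance(self._spec, type):
                self._resolved = self._spec
            else:
                self._resolved = _MODEL_REGISTRY[self._spec]
        return self._resolved

@register_model
class SomeModel:
    pass

field = LazyModelType('SomeModel')  # declared by name, resolved lazily
```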

P.S. Excuse me for changing this place to discuss.

@lkraider
Copy link
Contributor

@vovanbo It would be great if you could provide test cases for slow performance. Schematics is definitely heavy on looping over nested models and is something we should try to optimize.

https://github.com/schematics/schematics-benchmark is where we will be storing the benchmarks we want to keep track of.

For the dynamic ModelTypes, there is a Model.append_field(...) feature that allows extending a Model after declaration, but I would like to see your approach. Can you create an issue for this feature and propose your PR for it?
